Publications
* denotes equal contribution.
An up-to-date list is available on Google Scholar.
2024
- [XRDS] Harnessing Machine Learning and Generative AI: A New Era in Online Tutoring Systems. Robin Schmucker. XRDS: Crossroads, The ACM Magazine for Students (to appear), 2024.
- [AIED] Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System. Robin Schmucker, Meng Xia, Amos Azaria, and Tom Mitchell. In Proceedings of the 25th International Conference on Artificial Intelligence in Education, 2024.
Conversational tutoring systems (CTSs) offer learning experiences through interactions based on natural language. They are recognized for promoting cognitive engagement and improving learning outcomes, especially in reasoning tasks. Nonetheless, the cost associated with authoring CTS content is a major obstacle to widespread adoption and to research on effective instructional design. In this paper, we discuss and evaluate a novel type of CTS that leverages recent advances in large language models (LLMs) in two ways: First, the system enables AI-assisted content authoring by inducing an easily editable tutoring script automatically from a lesson text. Second, the system automates the script orchestration in a learning-by-teaching format via two LLM-based agents (Ruffle&Riley) acting as a student and a professor. The system allows for free-form conversations that follow the ITS-typical inner and outer loop structure. We evaluate Ruffle&Riley’s ability to support biology lessons in two between-subject online user studies (N = 200) comparing the system to simpler QA chatbots and a reading activity. Analyzing system usage patterns, pre/post-test scores, and user experience surveys, we find that Ruffle&Riley users report high levels of engagement and understanding and perceive the offered support as helpful. Even though Ruffle&Riley users require more time to complete the activity, we did not find significant differences in short-term learning gains over the reading activity. Our system architecture and user study provide various insights for designers of future CTSs. We further open-source our system to support ongoing research on effective instructional design of LLM-based learning technologies.
- [L@S] Gaining Insights into Group-Level Course Difficulty via Differential Course Functioning. Frederik Baucks*, Robin Schmucker*, Conrad Borchers, Zachary A. Pardos, and Laurenz Wiskott. In Proceedings of the 11th ACM Conference on Learning @ Scale, 2024.
Curriculum Analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. One desirable property of courses within curricula is that they are not unexpectedly more difficult for students of different backgrounds. While prior work points to likely variations in course difficulty across student groups, robust methodologies for capturing such variations are scarce, and existing approaches do not adequately decouple course-specific difficulty from students’ general performance levels. The present study introduces Differential Course Functioning (DCF) as an Item Response Theory (IRT)-based CA methodology. DCF controls for student performance levels and examines whether significant differences exist in how distinct student groups succeed in a given course. Leveraging data from over 20,000 students at a large public university, we demonstrate DCF’s ability to detect inequities in undergraduate course difficulty across student groups described by grade achievement. We compare major pairs with high co-enrollment and transfer students to their non-transfer peers. For the former, our findings suggest a link between DCF effect sizes and the alignment of course content to students’ home departments, motivating interventions targeted toward improving course preparedness. For the latter, results suggest minor variations in course-specific difficulty between transfer and non-transfer students. While this is desirable, it also suggests that interventions targeted toward mitigating grade achievement gaps in transfer students should encompass comprehensive support beyond enhancing preparedness for individual courses. By providing more nuanced and equitable assessments of academic performance and difficulties experienced by diverse student populations, DCF could support policymakers, course articulation officers, and student advisors.
- [L@S] Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions. Steven Moore, Tom Mitchell, Robin Schmucker, and John Stamper. In Proceedings of the 11th ACM Conference on Learning @ Scale (best dataset award), 2024.
Knowledge Components (KCs) linked to assessments enhance the measurement of student learning, enrich analytics, and facilitate adaptivity. However, generating and linking KCs to assessment items requires significant effort and domain-specific knowledge. To streamline this process for higher-education courses, we employed GPT-4 to generate KCs for multiple-choice questions (MCQs) in Chemistry and E-Learning. We analyzed discrepancies between the KCs generated by the Large Language Model (LLM) and those made by humans through evaluation from three domain experts in each subject area. This evaluation aimed to determine whether, in instances of non-matching KCs, evaluators showed a preference for the LLM-generated KCs over their human-created counterparts. We also developed an ontology induction algorithm to cluster questions that assess similar KCs based on their content. Our most effective LLM strategy accurately matched KCs for 56% of Chemistry and 35% of E-Learning MCQs, with even higher success when considering the top five KC suggestions. Human evaluators favored LLM-generated KCs, choosing them over human-assigned ones approximately two-thirds of the time, a preference that was statistically significant across both domains. Our clustering algorithm successfully grouped questions by their underlying KCs without needing explicit labels or contextual information. This research advances the automation of KC generation and classification for assessment items, alleviating the need for student data or predefined KC labels.
- [JEDM] The Knowledge Component Attribution Problem for Programming: Methods and Tradeoffs with Limited Labeled Data. Yang Shi, Robin Schmucker, Keith Tran, John Bacher, Kenneth Koedinger, Thomas Price, Min Chi, and Tiffany Barnes. Journal of Educational Data Mining, 2024.
- [LAK] Gaining Insights into Course Difficulty Variations Using Item Response Theory. Frederik Baucks*, Robin Schmucker*, and Laurenz Wiskott. In Proceedings of the 14th Learning Analytics and Knowledge Conference, 2024.
Curriculum analytics (CA) studies curriculum structure and student data to ensure the quality of educational programs. To gain statistical robustness, most existing CA techniques rely on the assumption of time-invariant course difficulty, preventing them from capturing variations that might occur over time. However, ensuring low temporal variation in course difficulty is crucial to warrant fairness in treating individual student cohorts and consistency in degree outcomes. We introduce item response theory (IRT) as a CA methodology that enables us to address the open problem of monitoring course difficulty variations over time. We use statistical criteria to quantify the degree to which course performance data meets IRT’s theoretical assumptions and verify validity and reliability of IRT-based course difficulty estimates. Using data from 664 Computer Science and 1,355 Mechanical Engineering undergraduate students, we show how IRT can yield valuable CA insights: First, by revealing temporal variations in course difficulty over several years, we find that course difficulty has systematically shifted downward during the COVID-19 pandemic. Second, time-dependent course difficulty and cohort performance variations confound conventional course pass rate measures. We introduce IRT-adjusted pass rates as an alternative to account for these factors. Our findings affect policymakers, student advisors, accreditation, and course articulation.
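The core of the IRT approach above is the one-parameter (Rasch) model, where the probability that a student passes a course is a logistic function of student ability minus course difficulty. As a minimal sketch only (the `rasch_fit` helper below is hypothetical, fit by plain stochastic gradient ascent; the paper's estimation procedure is more careful and includes validity and reliability checks):

```python
import math
import random

def rasch_fit(responses, n_students, n_courses, lr=0.05, epochs=300):
    """Fit a 1PL (Rasch) model: P(pass) = sigmoid(ability - difficulty).

    responses: list of (student_idx, course_idx, passed) triples with
    passed in {0, 1}. Returns (abilities, difficulties), both centered
    so that the mean ability is zero (the Rasch scale is otherwise
    only identified up to a shift).
    """
    theta = [0.0] * n_students   # student abilities
    beta = [0.0] * n_courses     # course difficulties
    for _ in range(epochs):
        for s, c, y in responses:
            p = 1.0 / (1.0 + math.exp(-(theta[s] - beta[c])))
            g = y - p            # gradient of the Bernoulli log-likelihood
            theta[s] += lr * g
            beta[c] -= lr * g
    shift = sum(theta) / n_students
    return [t - shift for t in theta], [b - shift for b in beta]
```

Estimating difficulties separately per semester cohort, as the paper does, is what exposes temporal drift: the same course receiving a systematically lower difficulty estimate during the pandemic semesters.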
- [AAAI] Learning to Compare Hints: Combining Insights from Student Logs and Large Language Models. Ted Zhang, Harshith Arun Kumar, Robin Schmucker, Amos Azaria, and Tom Mitchell. In AAAI 2024 Workshop on AI for Education, 2024.
We explore the general problem of learning to predict which teaching actions will result in the best learning outcomes for students in online courses. More specifically, we consider the problem of predicting which hint will most help a student who answers a practice question incorrectly, and who is about to make a second attempt to answer that question. In previous work (Schmucker et al., 2023) we showed that log data from thousands of previous students could be used to learn empirically which of several pre-defined hints produces the best learning outcome. However, while that study utilized data from thousands of students submitting millions of responses, it did not consider the actual text of the question, the hint, or the answer. In this paper, we ask the follow-on question “Can we train a machine-learned model to examine the text of the question, the answer, and the text of hints, to predict which hint will lead to better learning outcomes?” Our experimental results show that the answer is yes. This is important because the trained model can now be applied to new questions and hints covering related subject matter, to estimate which of the new hints will be most useful, even before testing it on students. Finally, we show that the pairs of hints for which the model makes the most accurate predictions are the hint pairs where choosing the right hint has the biggest payoff (i.e., hint pairs for which the difference in learning outcomes is greatest).
2023
- [NeurIPS] Ruffle&Riley: Towards the Automated Induction of Conversational Tutoring Systems. Robin Schmucker, Meng Xia, Amos Azaria, and Tom Mitchell. In NeurIPS 2023 Workshop on Generative AI for Education, 2023.
Conversational tutoring systems (CTSs) offer learning experiences driven by natural language interaction. They are known to promote high levels of cognitive engagement and benefit learning outcomes, particularly in reasoning tasks. Nonetheless, the time and cost required to author CTS content is a major obstacle to widespread adoption. In this paper, we introduce a novel type of CTS that leverages recent advances in large language models (LLMs) in two ways: First, the system induces a tutoring script automatically from a lesson text. Second, the system automates the script orchestration via two LLM-based agents (Ruffle&Riley) with the roles of a student and a professor in a learning-by-teaching format. The system allows a free-form conversation that follows the ITS-typical inner and outer loop structure. In an initial between-subject online user study (N = 100) comparing Ruffle&Riley to simpler QA chatbots and a reading activity, we found no significant differences in post-test scores. Nonetheless, in the learning experience survey, Ruffle&Riley users expressed higher ratings of understanding and remembering and further perceived the offered support as more helpful and the conversation as coherent. Our study provides insights for a new generation of scalable CTS technologies.
- [ECTEL] Learning to Give Useful Hints: Assistance Action Evaluation and Policy Improvements. Robin Schmucker, Nimish Pachapurkar, Shanmuga Bala, Miral Shah, and Tom Mitchell. In Proceedings of the 18th European Conference on Technology Enhanced Learning, 2023.
We describe a fielded online tutoring system that learns which of several candidate assistance actions (e.g., one of multiple hints) to provide to students when they answer a practice question incorrectly. The system learns, from large-scale data of prior students, which assistance action to give for each of thousands of questions, to maximize measures of student learning outcomes. Using data from over 190,000 students in an online Biology course, we quantify the impact of different assistance actions for each question on a variety of outcomes (e.g., response correctness, practice completion), framing the machine learning task as a multi-armed bandit problem. We study relationships among different measures of learning outcomes, leading us to design an algorithm that for each question decides on the most suitable assistance policy training objective to optimize central target measures. We evaluate the trained policy for providing assistance actions, comparing it to a randomized assistance policy in live use with over 20,000 students, showing significant improvements resulting from the system’s ability to learn to teach better based on data from earlier students in the course. We discuss our design process and challenges we faced when fielding data-driven technology, providing insights to designers of future learning systems.
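The paper frames per-question assistance selection as a multi-armed bandit: each hint is an arm, and student outcomes after the hint are the reward signal. The following is a toy epsilon-greedy sketch of that framing only; the `HintBandit` class and its parameters are illustrative and not the fielded system, which learns from large-scale offline logs and selects a training objective per question:

```python
import random

class HintBandit:
    """Epsilon-greedy bandit over candidate hints for a single question.

    choose() mostly exploits the hint with the best running mean reward
    (e.g., 1 if the student's next attempt is correct, else 0), while
    exploring a random hint with probability epsilon.
    """
    def __init__(self, n_hints, epsilon=0.1):
        self.eps = epsilon
        self.counts = [0] * n_hints
        self.values = [0.0] * n_hints   # running mean reward per hint

    def choose(self):
        if random.random() < self.eps:
            return random.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.values[a])

    def update(self, hint, reward):
        # Incremental mean update keeps memory constant per arm.
        self.counts[hint] += 1
        self.values[hint] += (reward - self.values[hint]) / self.counts[hint]
```

In use, one such bandit per question quickly concentrates traffic on the hint with the highest observed payoff while retaining a small exploration budget.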
- [EDM] KC-Finder: Automated Knowledge Component Discovery for Programming Problems. Yang Shi, Robin Schmucker, Min Chi, Tiffany Barnes, and Thomas Price. In Proceedings of the 16th International Conference on Educational Data Mining, 2023.
Knowledge components (KCs) have many applications. In computing education, however, identifying which KCs students demonstrate in their work has been challenging. This paper introduces an entirely data-driven approach for (i) discovering KCs and (ii) demonstrating KCs, using students’ actual code submissions. Our system is based on two expected properties of KCs: (i) they generate learning curves following the power law of practice, and (ii) they are predictive of response correctness. We train a neural architecture (named KC-Finder) that classifies the correctness of student code submissions and captures problem-KC relationships. Our evaluation on data from 351 students in an introductory Java course shows that the learned KCs can generate reasonable learning curves and predict code submission correctness. At the same time, some KCs can be interpreted to identify programming skills. We compare the learning curves described by our model to four baselines, showing that (i) identifying KCs with naive methods is a difficult task and (ii) our learning curves exhibit a substantially better curve fit. Our work represents a first step in solving the data-driven KC discovery problem in computing education.
- [AAAI] Tracing Changes in University Course Difficulty Using Item Response Theory. Frederik Baucks*, Robin Schmucker*, and Laurenz Wiskott. In AAAI 2023 Workshop on AI for Education, 2023.
Curriculum analytics (CA) studies educational program structure and student data to ensure the quality of courses inside a curriculum. Ensuring low variation in course difficulty over time is crucial to warrant equal treatment of individual student cohorts and consistent degree outcomes. Still, existing CA techniques (e.g., process mining/simulation and curriculum-based prediction) are unable to capture such temporal variations due to their central assumption of time-invariant course behavior. In this paper, we introduce item response theory (IRT) as a new methodology to the CA domain to address the open problem of tracing changes in course difficulty over time. We show the suitability of IRT to capture variance in course performance data and assess the validity and reliability of IRT-based difficulty estimates. Using data from 664 CS Bachelor students, we show how IRT can yield valuable insights by revealing variations in course difficulty over multiple years. Furthermore, we observe a systematic shift in course difficulty during the COVID-19 pandemic.
2022
- [ICCE] Transferable Student Performance Modeling for Intelligent Tutoring Systems. Robin Schmucker and Tom M. Mitchell. In Proceedings of the 30th International Conference on Computers in Education, 2022.
Millions of students worldwide are now using intelligent tutoring systems (ITSs). At their core, ITSs rely on student performance models (SPMs) to trace each student’s changing ability level over time, in order to provide personalized feedback and instruction. Crucially, SPMs are trained using interaction sequence data of previous students to analyze data generated by future students. This induces a cold-start problem when a new course is introduced, because no students have yet taken the course and hence there is no data to train the SPM. Here, we consider transfer learning techniques to train accurate SPMs for new courses by leveraging log data from existing courses. We study two settings: (i) In the naive transfer setting, we first train SPMs on existing course data and then apply these SPMs to new courses without modification. (ii) In the inductive transfer setting, we fine-tune these SPMs using a small amount of training data from the new course (e.g., collected during a pilot study). We evaluate the proposed techniques using student interaction sequence data from five different mathematics courses taken by over 47,000 students. The naive transfer models that use features provided by human domain experts (e.g., difficulty ratings for questions in the new course) but no student interaction training data for the new course achieve prediction accuracy on par with standard BKT and PFA models that use training data from thousands of students in the new course. In the inductive setting, our transfer approach yields more accurate predictions than conventional SPMs when only limited student interaction training data (<100 students) is available to both approaches.
- [JEDM] Assessing the Knowledge State of Online Students: New Data, New Approaches, Improved Accuracy. Robin Schmucker, Jingbo Wang, Shijia Hu, Tom Mitchell, et al. Journal of Educational Data Mining, 2022.
We consider the problem of assessing the changing performance levels of individual students as they go through online courses. This student performance modeling problem is a critical step for building adaptive online teaching systems. Specifically, we conduct a study of how to utilize various types and large amounts of log data from earlier students to train accurate machine learning models that predict the performance of future students. This study is the first to use four very large sets of student data made available recently from four distinct intelligent tutoring systems. Our results include a new machine learning approach that defines a new state of the art for logistic regression-based student performance modeling, improving over earlier methods in several ways: First, we achieve improved accuracy of student modeling by introducing new features that can be easily computed from conventional question-response logs (e.g., features such as the pattern in the student’s most recent answers). Second, we take advantage of features of the student history that go beyond question-response pairs (e.g., features such as which video segments the student watched, or skipped) as well as background information about prerequisite structure in the curriculum. Third, we train multiple specialized student performance models for different aspects of the curriculum (e.g., specializing in early versus later segments of the student history), then combine these specialized models to create a group prediction of the student performance. Taken together, these innovations yield an average AUC score across these four datasets of 0.808 compared to the previous best logistic regression approach score of 0.767, and also outperform state-of-the-art deep neural net approaches. Importantly, we observe consistent improvements from each of our three methodological innovations, in each diverse dataset, suggesting that our methods are of general utility and likely to produce improvements for other online tutoring systems as well.
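The first innovation above, count features plus a recent-answer pattern fed into logistic regression, can be sketched as follows. This is a minimal illustration of the feature idea only; the function names are hypothetical, and the paper's actual feature set and training pipeline are considerably richer:

```python
import math

def response_features(history, k=3):
    """Feature vector for predicting a student's next response from
    their prior correctness history (list of 0/1): a bias term,
    log-scaled counts of prior correct/incorrect answers, and the
    pattern of the k most recent answers (left-padded with 0)."""
    wins = sum(history)
    fails = len(history) - wins
    recent = history[-k:]
    recent = [0] * (k - len(recent)) + recent
    return [1.0, math.log1p(wins), math.log1p(fails)] + [float(r) for r in recent]

def predict(w, x):
    """Logistic regression probability of a correct next response."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def train(logs, dim, lr=0.1, epochs=20):
    """Plain SGD on the log loss. logs: (history, next_response) pairs."""
    w = [0.0] * dim
    for _ in range(epochs):
        for hist, y in logs:
            x = response_features(hist)
            g = y - predict(w, x)
            w = [wi + lr * g * xi for wi, xi in zip(w, x)]
    return w
```

A model trained this way assigns higher probability of success to students with stronger recent histories, which is the behavior the count and pattern features are designed to capture.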
2021
- [PLOS CompBio] Combination treatment optimization using a pan-cancer pathway model. Robin Schmucker, Gabriele Farina, James Faeder, Fabian Fröhlich, Ali Sinan Saglam, and Tuomas Sandholm. PLOS Computational Biology, 2021.
The design of efficient combination therapies is a key challenge in the treatment of complex diseases such as cancers. The large heterogeneity of cancers and the large number of available drugs render exhaustive in vivo or even in vitro investigation of possible treatments impractical. In recent years, sophisticated mechanistic, ordinary differential equation-based pathway models that can predict treatment responses at a molecular level have been developed. However, surprisingly little effort has been put into leveraging these models to find novel therapies. In this paper we use for the first time, to our knowledge, a large-scale state-of-the-art pan-cancer signaling pathway model to identify candidates for novel combination therapies to treat individual cancer cell lines from various tissues (e.g., minimizing proliferation while keeping dosage low to avoid adverse side effects) and populations of heterogeneous cancer cell lines (e.g., minimizing the maximum or average proliferation across the cell lines while keeping dosage low). We also show how our method can be used to optimize the drug combinations used in sequential treatment plans, that is, optimized sequences of potentially different drug combinations, providing additional benefits. In order to solve the treatment optimization problems, we combine the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm with a significantly more scalable sampling scheme for truncated Gaussian distributions, based on a Hamiltonian Monte-Carlo method. These optimization techniques are independent of the signaling pathway model and can thus be adapted to find treatment candidates for complex diseases other than cancer as well, as long as a suitable predictive model is available.
- [arXiv] Multi-objective Asynchronous Successive Halving. Robin Schmucker, Michele Donini, Muhammad Bilal Zafar, David Salinas, and Cédric Archambeau. arXiv preprint arXiv:2106.12639, 2021.
Hyperparameter optimization (HPO) is increasingly used to automatically tune the predictive performance (e.g., accuracy) of machine learning models. However, in a plethora of real-world applications, accuracy is only one of multiple, often conflicting, performance criteria, necessitating the adoption of a multi-objective (MO) perspective. While the literature on MO optimization is rich, few prior studies have focused on HPO. In this paper, we propose algorithms that extend asynchronous successive halving (ASHA) to the MO setting. Considering multiple evaluation metrics, we assess the performance of these methods on three real-world tasks: (i) neural architecture search, (ii) algorithmic fairness, and (iii) language model optimization. Our empirical analysis shows that MO ASHA enables MO HPO at scale. Further, we observe that taking the entire Pareto front into account for candidate selection consistently outperforms multi-fidelity HPO based on MO scalarization in terms of wall-clock time. Our algorithms (to be open-sourced) establish new baselines for future research in the area.
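The Pareto-front candidate selection mentioned above rests on non-dominance: a configuration survives if no other configuration is at least as good in every objective and strictly better in one. A minimal sketch of that test (assuming minimization in all objectives; the paper's actual promotion rule inside ASHA is more involved):

```python
def dominates(q, p):
    """q dominates p if q is no worse in all objectives and strictly
    better in at least one (minimization convention)."""
    return all(a <= b for a, b in zip(q, p)) and any(a < b for a, b in zip(q, p))

def pareto_front(points):
    """Indices of non-dominated points among a list of objective tuples."""
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]
```

For example, among the objective pairs (1, 4), (2, 2), (4, 1), and (3, 3), the last point is dominated by (2, 2) and is the only one pruned.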
- [AIES] Fair Bayesian Optimization. Valerio Perrone, Michele Donini, Muhammad Bilal Zafar, Robin Schmucker, Krishnaram Kenthapadi, and Cédric Archambeau. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, 2021.
Given the increasing importance of machine learning (ML) in our lives, several algorithmic fairness techniques have been proposed to mitigate biases in the outcomes of ML models. However, most of these techniques are specialized to cater to a single family of ML models and a specific definition of fairness, limiting their adaptability in practice. We introduce a general constrained Bayesian optimization (BO) framework to optimize the performance of any ML model while enforcing one or multiple fairness constraints. BO is a model-agnostic optimization method that has been successfully applied to automatically tune the hyperparameters of ML models. We apply BO with fairness constraints to a range of popular models, including random forests, gradient boosting, and neural networks, showing that we can obtain accurate and fair solutions by acting solely on the hyperparameters. We also show empirically that our approach is competitive with specialized techniques that enforce model-specific fairness constraints, and outperforms preprocessing methods that learn fair representations of the input data. Moreover, our method can be used in synergy with such specialized fairness techniques to tune their hyperparameters. Finally, we study the relationship between fairness and the hyperparameters selected by BO. We observe a correlation between regularization and unbiased models, explaining why acting on the hyperparameters leads to ML models that generalize well and are fair.
- [AAAI] Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games. Gabriele Farina, Robin Schmucker, and Tuomas Sandholm. In Proceedings of the AAAI Conference on Artificial Intelligence, 2021.
Tree-form sequential decision making (TFSDM) extends classical one-shot decision making by modeling tree-form interactions between an agent and a potentially adversarial environment. It captures the online decision-making problems that each player faces in an extensive-form game, as well as Markov decision processes and partially-observable Markov decision processes where the agent conditions on observed history. Over the past decade, considerable effort has been devoted to designing online optimization methods for TFSDM. Virtually all of that work has been in the full-feedback setting, where the agent has access to counterfactuals, that is, information on what would have happened had the agent chosen a different action at any decision node. Little is known about the bandit setting, where that assumption is reversed (no counterfactual information is available), despite this latter setting being well understood for almost 20 years in one-shot decision making. In this paper, we give the first algorithm for the bandit linear optimization problem for TFSDM that offers both (i) linear-time iterations (in the size of the decision tree) and (ii) O(sqrt(T)) cumulative regret in expectation compared to any fixed strategy, at all times T. This is made possible by new results that we derive, which may have independent uses as well: 1) geometry of the dilated entropy regularizer, 2) autocorrelation matrix of the natural sampling scheme for sequence-form strategies, 3) construction of an unbiased estimator for linear losses for sequence-form strategies, and 4) a refined regret analysis for mirror descent when using the dilated entropy regularizer.
2020
- [NeurIPS] Multi-Objective Multi-Fidelity Hyperparameter Optimization with Application to Fairness. Robin Schmucker, Michele Donini, Valerio Perrone, and Cédric Archambeau. In NeurIPS 2020 Workshop on Meta-learning, 2020.
In many real-world applications, the performance of machine learning models is evaluated not along a single objective, but across multiple, potentially competing ones. For instance, for a model deciding whether to grant or deny loans, it is critical to make sure decisions are fair and not only accurate. As it is often infeasible to find a single model performing best across all objectives, practitioners are forced to find a trade-off between the individual objectives. While several multi-objective optimization (MO) techniques have been proposed in the machine learning literature (and beyond), little effort has been put towards using MO for hyperparameter optimization (HPO) problems, a task that has gained immense relevance and adoption in recent years. In this paper, we evaluate the suitability of existing MO algorithms for HPO and propose a novel multi-fidelity method for this problem. We evaluate our approach on public datasets with a special emphasis on fairness-motivated applications, and report substantially lower wall-clock times when approximating Pareto frontiers compared to the state-of-the-art.
- [AAAI] Counterfactual-Free Regret Minimization for Sequential Decision Making and Extensive-Form Games. Gabriele Farina, Robin Schmucker, and Tuomas Sandholm. In AAAI 2020 Workshop on Reinforcement Learning in Games, 2020.
Sequential decision processes (SDPs) model the multi-stage online decision-making problems that each player faces in an extensive-form game, as well as MDPs and POMDPs where the agent conditions on observed history. Prior regret minimization approaches for sequential decision processes typically rely heavily on having access to counterfactuals, that is, information on what would have happened had the agent chosen a different action at any decision point. While this assumption is reasonable when regret minimization algorithms are used in self-play (for instance, as a way to converge to a Nash equilibrium in an extensive-form game), it is unrealistic in online decision-making settings, where the algorithm is deployed to learn strategies against an unknown environment. In this paper, we give the first efficient algorithm for the bandit linear optimization problem on SDPs—and therefore also extensive-form games—and show that it achieves O(√T) cumulative regret in expectation against any strategy.
2019
- [RoboCup] Multimodal Movement Activity Recognition Using a Robot’s Proprioceptive Sensors. Robin Schmucker, Chenghui Zhou, and Manuela Veloso. In RoboCup 2018: Robot World Cup XXII, 2019.
By recognizing patterns in streams of sensor readings, a robot can gain insight into the activities that are performed by its physical body. Research in Human Activity Recognition (HAR) has been thriving in recent years mainly because of the widespread use of wearable sensors such as smartphones and activity trackers. By introducing HAR approaches to the robotics domain, this work aims at creating agents that are capable of detecting their own body’s activities. An activity recognition pipeline is proposed that allows a robot to classify its actions by analyzing heterogeneous, asynchronous data streams provided by its inbuilt sensors. The approach is evaluated in two experiments featuring the service robot Pepper. In the first experiment, a set of base movements is recognized by analyzing data from various proprioceptive sensors. The findings indicate that a multimodal activity recognition approach can achieve more accurate classifications than single-sensor approaches. In the second experiment, a person interferes with the forward movement of the robot by pulling its base backward. This happens in a way that is not detected by Pepper’s inbuilt systems. The approach can detect the unexpected behavior and could be used to extend Pepper’s inbuilt capabilities. Through its generality, this work can be used to recognize activities of other robots with comparable sensing capabilities.
2018
- [AAMAS] Towards a Robust Interactive and Learning Social Robot. Michiel De Jong, Kevin Zhang, Aaron M. Roth, Travers Rhodes, Robin Schmucker, Chenghui Zhou, Sofia Ferreira, João Cartucho, and Manuela Veloso. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018.
Pepper is a humanoid robot, specifically designed for social interaction, that has been deployed in a variety of public environments. A programmable version of Pepper is also available, enabling our focused research on perception and behavior robustness and capabilities of an interactive social robot. We address Pepper perception by integrating state-of-the-art vision and speech recognition systems and experimentally analyzing their effectiveness. As we recognize limitations of the individual perceptual modalities, we introduce a multi-modality approach to increase the robustness of human social interaction with the robot. We combine vision, gesture, speech, and input from an onboard tablet, a remote mobile phone, and external microphones. Our approach includes the proactive seeking of input from a different modality, adding robustness to the failures of the separate components. We also introduce a learning algorithm to improve communication capabilities over time, updating speech recognition through social interactions. Finally, drawing on the robot’s rich body-sensory data, we introduce both a nearest-neighbor and a deep learning approach to enable Pepper to classify and verbally report a variety of its own body motions. We view the contributions of our work to be relevant both to Pepper specifically and to other general social robots.
2016
- [Nat. Commun.] A universal test for gravitational decoherence. Corsin Pfister, Jed Kaniewski, M. Tomamichel, A. Mantri, R. Schmucker, N. McMahon, G. Milburn, and Stephanie Wehner. Nature Communications, 2016.
Quantum mechanics and the theory of gravity are presently not compatible. A particular question is whether gravity causes decoherence. Several models for gravitational decoherence have been proposed, not all of which can be described quantum mechanically. Since quantum mechanics may need to be modified, one may question the use of quantum mechanics as a calculational tool to draw conclusions from the data of experiments concerning gravity. Here we propose a general method to estimate gravitational decoherence in an experiment that allows us to draw conclusions in any physical theory where the no-signalling principle holds, even if quantum mechanics needs to be modified. As an example, we propose a concrete experiment using optomechanics. Our work raises the interesting question whether other properties of nature could similarly be established from experimental observations alone—that is, without already having a rather well-formed theory of nature to make sense of experimental data.