Many contemporary systems in human-computer interaction (HCI), including mobile and ubiquitous computing, are based on some form of automated sensor data analysis. Prominent examples are innovative and more intuitive input modalities such as voice and gesture, or automated activity logging and analysis. It is fair to conclude that sensor data analysis is key to context-aware computing as a whole. Such prominence requires robust and reliable methods that can cope with the many challenges of real-world HCI applications and systems: noisy sensor readings; often ambiguous, sometimes erroneous ground truth annotation (labeling); small datasets available for method development; hard real-time constraints for analysis; etc.
For sensor data analysis in HCI (and beyond), many researchers have moved towards machine learning techniques as a key component, especially those for the automated analysis of time-series data as recorded by the multitude of sensors used in HCI. In recent years the field of machine learning has seen explosive growth, and very sophisticated methods now exist that are key enablers for a plethora of application areas. Most appealing to many practitioners is the availability of toolkits such as Matlab, Weka, scikit-learn, and the various deep learning frameworks, to name but a few, that nicely package machine learning methods. These toolkits effectively hide the complexity of machine learning methods, which, I argue, is both a blessing and a curse. Packaging away complex functionality is common practice in, for example, software engineering, where libraries with clear interface specifications provide higher-level functionality to practitioners. To some extent machine learning toolkits provide similar functionality and as such make these methods accessible to practitioners in the first place. Yet hiding the complexity of machine learning can be dangerous. Unless they carefully consider whether a method is appropriate for the specific problem, beyond the mere interface fit of the chosen toolkit, practitioners risk drawing unreliable conclusions.
In this talk I will make the case for the enormous potential that machine learning methods have for current and next generations of HCI applications and systems, specifically targeting time-series assessment, as it is most common for HCI-related sensor data analysis problems. In doing so I will focus on common pitfalls that a utilitarian use of machine learning methods inevitably brings, and will offer ways to avoid them.
Thomas Ploetz is a Computer Scientist with expertise and almost 15 years of experience in Pattern Recognition and Machine Learning research (PhD from Bielefeld University, Germany). His research agenda focuses on applied machine learning, that is, developing systems and innovative sensor data analysis methods for real-world applications. The primary application domain for his work is computational behavior analysis, where he develops methods for automated and objective behavior assessments in naturalistic environments. The main drivers of his work are "in the wild" deployments and, as such, the development of systems and methods that have a real impact on people's lives.
Thomas "recently" (February 2017) joined the School of Interactive Computing, where he works as an Associate Professor of Computing. Prior to this he was an academic at the School of Computing Science at Newcastle University in Newcastle upon Tyne, UK, where he was a Reader (Assoc. Prof.) for "Computational Behaviour Analysis" affiliated with Open Lab, Newcastle's centre for cross-disciplinary research in digital technologies.