This is us: Hannes Diemerling 

January 06, 2025

Hannes Diemerling is a predoctoral fellow at the International Max Planck Research School on the Life Course (IMPRS LIFE) and works at the Center for Lifespan Psychology. His research focuses on machine emotion recognition. He is developing software that can analyze emotions in audio and video recordings. This technology could one day support psychotherapy by closely following the emotional development of patients. 

You work in the Center for Lifespan Psychology on machine emotion recognition. What fascinates you about this area? 

Hannes Diemerling: I am fascinated by how machine learning can help us better understand and predict complex human emotions. The ability to analyze emotional states from video and audio material opens up new perspectives and applications in psychology. I'm particularly interested in the challenge of translating subjective experiences into quantifiable data that can be used for both research and practical applications. 

You recently published a study on this topic. Can you briefly explain your findings? 

Hannes Diemerling: In our study, we used short audio clips of actors deliberately expressing emotions such as sadness or happiness. After preprocessing the clips, we extracted data from them to train neural networks. These networks, inspired by how the brain works, were able to detect and classify emotions in the audio with an accuracy comparable to human judgments. Our results show that emotion recognition is possible even with very short audio segments: 1.5 seconds is enough, about the time it takes to say "hello". While there's still work to be done, the results have significant potential for applications in psychological research and therapy.
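
As an illustration of what cutting recordings into uniform 1.5-second segments can look like in practice, here is a minimal Python sketch. It is not the study's actual pipeline: the function name `segment_uniform`, the 16 kHz sample rate, and the choice to discard an incomplete final segment are all assumptions made for this example.

```python
import numpy as np


def segment_uniform(signal, sample_rate, segment_seconds=1.5):
    """Split a 1-D audio signal into non-overlapping, equal-length segments.

    Hypothetical helper for illustration only. Any trailing samples that do
    not fill a whole segment are dropped (an assumption of this sketch).
    """
    signal = np.asarray(signal)
    seg_len = int(round(sample_rate * segment_seconds))
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")
    n_segments = len(signal) // seg_len
    # Keep only the samples that fit into full segments, then reshape
    # into one row per segment.
    return signal[: n_segments * seg_len].reshape(n_segments, seg_len)


# Usage: 4 seconds of silence at an assumed 16 kHz sample rate
clip = np.zeros(4 * 16000)
segments = segment_uniform(clip, sample_rate=16000)
print(segments.shape)
```

Each row of the returned array is one fixed-length segment that could then be passed on to feature extraction and a classifier.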

Can you elaborate on the therapeutic applications? Where do you see the potential? 

Hannes Diemerling: Our research could significantly improve therapeutic settings. The goal is to provide therapists with an additional tool to objectively measure emotional development over the course of therapy. This can be particularly helpful in evaluating therapy success or documenting emotional states such as frustration, sadness, or joy. 

In addition, there are links between emotional expression and mental health disorders such as depression or anxiety. Our models could identify specific emotional patterns that correlate with these diagnoses, which would help therapists confirm diagnoses or spot potential developments early.

We are also working on models that can describe emotions in detail, not only categorically ("this person is sad"), but also more granularly, e.g. "this person has a downcast expression with slumped shoulders". These detailed insights can be invaluable for documenting therapy progress, analyzing emotional patterns, and studying their relationship to diagnoses or therapy outcomes. 

It's important to emphasize that our work is intended to complement, not replace, therapists. Our models are meant to be additional tools.

You're working on a new dataset from the Munich University Outpatient Clinic. What's the focus of your new study and why is this dataset special?

Hannes Diemerling: My new study focuses on building a dataset of real, natural emotions as a basis for training emotion recognition models. The goal is to create a high-quality dataset that reflects the diversity of human emotions and supports the development of effective machine learning models. 

The dataset from the university's outpatient clinic is particularly valuable because it contains authentic therapeutic sessions that reveal a wide range of emotional nuances. In collaboration with Professor Joachim Kruse, the clinic's director, and research associate Patricia Kulla, we curated a high-quality selection from thousands of hours of recordings. Such material is rare, as genuine emotional moments are often neither recorded nor available for research. 

Students then analyzed short video clips, each 0.5 seconds long, and coded visible emotions based on predefined categories. The results show that advanced models, particularly neural networks, are making impressive strides in emotion recognition. Although humans still outperform these models, the gap has narrowed significantly. With further research, it may be possible to develop systems that match human perception.

You have a background in psychology, but you work extensively with new technologies and machine learning. How did you get started?

Hannes Diemerling: While studying psychology, I quickly realized that I was particularly fascinated by complex data and its analysis. My interest in statistics was sparked by my professor, Timo von Oertzen. He not only supported me during my studies, but also fostered my enthusiasm for the possibilities of data analysis. Together we delved into methodological research in psychology, especially machine learning. 

What fascinated me most was the potential of machine learning to take on tasks that were previously the domain of humans, such as image recognition. Initially, this technology had an almost magical appeal to me: the idea that algorithms could detect overlapping patterns in large data sets and simulate human decision-making captivated me.

Machine learning offers a wide range of applications and almost unlimited potential for future research. The opportunity to contribute to the advancement of these technologies continues to motivate me to keep learning and exploring new approaches to push the boundaries of what machine learning can achieve.

How do you combine your background in psychology with your expertise in machine learning, especially emotion recognition?

Hannes Diemerling: My research bridges psychology and machine learning, with a focus on emotion recognition methods. With my background in psychology, I approach emotional concepts from a theoretical perspective and integrate them into the development of machine learning models. Often, emotions are approached from either a psychological or a purely data-driven perspective, neglecting the other. My goal is to develop an approach that combines both perspectives. 

Machine learning models can help us understand and study complex human processes such as emotions. These models can be trained to mimic human judgments and responses in emotion recognition. Specialized networks and large language models (LLMs) open new avenues for understanding human behavior and decision making. 

My goal is to develop models that not only perform well, but also capture the subtleties of human emotions. By combining psychological theories with modern machine learning methods, I aim to deepen our understanding of emotion recognition mechanisms and promote applications in research and healthcare. In this way, we can advance our knowledge of both the human psyche and artificial intelligence.

You are part of the graduate program of the International Max Planck Research School on the Life Course (LIFE). How does the program support your personal and scientific development?

Hannes Diemerling: The graduate program has been very supportive of my personal and scientific development and has opened many doors for me. For example, I was able to spend a semester in the US and meet Professor Steve M. Boker at the University of Virginia – an opportunity that wouldn't have been possible without LIFE. 

Through the program, I've been able to explore different topics and connect with international scientists. In addition, the resources and inspiring environment of the Max Planck Institute for Human Development and the program itself are incredibly valuable. The support from experienced researchers and the interaction with other PhD students create a motivating atmosphere in which I can continuously develop. 

More information about the IMPRS LIFE: https://www.imprs-life.mpg.de

Original publication:

Diemerling, H., Stresemann, L., Braun, T., & von Oertzen, T. (2024). Implementing machine learning techniques for continuous emotion prediction from uniformly segmented voice recordings. Frontiers in Psychology, 15, Article 1300996. https://doi.org/10.3389/fpsyg.2024.1300996
