How artificial intelligence is transforming psychology

Psychologist Dirk Wulff believes that artificial intelligence (AI) will not only make psychological research more efficient, but also challenge and redefine fundamental concepts. What does the use of large language models (LLMs) mean for psychology as a science? And might AI reveal more about the workings of the human mind than we previously thought? A conversation about opportunities, challenges, and the future of psychology, where AI is already proving indispensable.

Your recent work has focused on the use of artificial intelligence in psychology. What makes AI such a revolutionary tool for psychological research? 

Dirk Wulff: AI takes psychology to a new level. One of the biggest advantages is that it allows us to analyze huge amounts of data that would be almost impossible for us humans to process. And precisely because the field is becoming increasingly data-driven, we need tools that help us recognize patterns and make connections within this sea of information. 

Take scientific publications, for example. We’ve used LLMs to sift through published psychology papers and identify systematic patterns. The diversity of participants in psychological studies is an interesting example. For many decades, women and minorities were underrepresented, which limited the generalizability and validity of the conclusions drawn. By automatically evaluating thousands of studies, AI can show how this has changed over time and where imbalances still exist. This would simply not be possible with traditional methods.

But AI can also help us analyze research trends and see how scientific discussions are changing. For example, it can identify which psychological concepts are gaining traction and which are slowly disappearing from research. That gives us a more comprehensive picture of how the field is developing and the direction in which it’s moving. 

This type of analysis goes far beyond what traditional methods can achieve. AI allows us to examine the entire scientific corpus—automatically and in a fraction of the time. That not only means greater efficiency, but also better quality research, because it enables us to identify developments or systematic biases more quickly. For example, we can track how the current clampdown on diversity-related research in the US will affect the diversity of future studies in psychology. 

In a recent paper, you showed how AI can help to challenge and standardize existing psychological constructs. Can you expand on that? 

Dirk Wulff: Psychology faces a challenge: Many research findings cannot be generalized to other studies or situations. One reason for this is the lack of a clear mapping between psychological constructs and the scales used to measure them. In personality psychology, there is a lot of conceptual overlap—for example, between “sociability,” “friendliness,” and “extraversion.” While some studies see them as one and the same, others distinguish between them without providing clear definitions.

We used LLMs to systematically analyze the constructs and measures used in personality psychology and to determine how similar they were in terms of language and content. The analysis revealed that some constructs cover almost identical ground, despite being artificially separated in research. AI can help optimize the taxonomy of personality traits by showing which constructs really need to be distinguished and which are redundant.
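The comparison Wulff describes can be sketched in a few lines: represent each construct (or its defining items) as an embedding vector and compare vectors by cosine similarity, flagging near-parallel pairs as candidates for redundancy. The vectors below are hypothetical stand-ins, not the study’s data; in practice they would come from an LLM embedding model.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical embedding vectors for three construct labels.
embeddings = {
    "sociability":  [0.81, 0.55, 0.10],
    "extraversion": [0.78, 0.60, 0.12],
    "neuroticism":  [0.05, 0.20, 0.95],
}

# Pairwise similarities between all constructs.
pairs = {}
names = list(embeddings)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        pairs[(a, b)] = cosine(embeddings[a], embeddings[b])

# Near-parallel embeddings suggest redundant constructs;
# dissimilar ones are plausibly distinct.
for (a, b), sim in sorted(pairs.items(), key=lambda kv: -kv[1]):
    print(f"{a} ~ {b}: {sim:.2f}")
```

With these toy vectors, “sociability” and “extraversion” come out far more similar to each other than either is to “neuroticism,” which is the kind of pattern the taxonomic analysis looks for at scale.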

So how valid are the measures used in psychology? 

Dirk Wulff: There are shortcomings in the validity of the scales as well. In many cases, several different questionnaires measure the same construct—for example, “neuroticism.” However, our AI-supported analysis of thousands of questionnaire items revealed that these scales sometimes measure very different content. As a consequence, researchers can arrive at different results depending on the scale they use. AI can help by supporting the development of more uniform measures, which in turn enhance the comparability of studies.

What do you hope to achieve with this work?  

Dirk Wulff: Our goal is to use AI to create a clearer and more consistent scientific basis for psychological research. By organizing psychological constructs more systematically and identifying redundancies, we can help make psychological findings and theories more robust and more comparable across studies. 

In the long term, it could also have practical implications—for diagnostic procedures in clinical psychology, for instance, or for the development of personality tests based on more valid and accurate criteria. AI is a powerful tool for sharpening scientific insights and advancing psychology as a discipline.

Which other applications of AI in psychology do you see as particularly promising? 

Dirk Wulff: AI is opening up new perspectives in clinical psychology. For example, it is being used to analyze patient reports and identify early indicators of mental illness. In the context of personalized therapy, AI-supported systems can provide individual recommendations for treatment strategies by evaluating large amounts of data from comparable patients and identifying patterns. AI applications can also be effective as communication partners between therapy sessions. 

Another cutting-edge field of application is AI-supported persuasion and advocacy—especially in areas like elections or climate protection. We’re involved in projects where AI is used to persuade people to adopt evidence-based positions, encouraging participation in democracy or support for scientifically sound climate measures. By leveraging AI-driven conversation strategies, we aim to challenge scientifically unfounded beliefs and foster informed decision-making.  

AI is also being used to combat disinformation by analyzing patterns of argumentation and providing targeted counterarguments. This approach could be used in social networks—for example, to swiftly identify false information and respond with factual and evidence-based content.

That sounds a bit like manipulation and raises ethical questions. Is AI in psychology an opportunity or a threat? 

Dirk Wulff: Yes, that’s a very important point. AI can be an incredibly valuable tool, but it must be used responsibly. Transparency is crucial: People always need to know when they are interacting with AI. The question of which data these models use and how they were trained also plays a big role.

AI can help solve scientific and societal problems—but only if we have the right ethical guardrails in place.

There are different views in academia about the impact of artificial intelligence on science. What is your position? 

Dirk Wulff: We recently published a discussion paper highlighting various perspectives on this topic. The positions range from the view that working with AI is essentially no different from working with human collaborators to calls for humans to have exclusive control over scientific development. As a co-author, I take a moderate position: AI is an extremely powerful tool that can transform psychological research. But clear principles are needed for its use. Issues such as responsibility, fairness, and bias in AI models are significant. For example, the models primarily use data from Western societies and cannot easily be adapted to represent other communities. This creates biases in texts and citations that can further marginalize already disadvantaged researchers. We advocate for transparent disclosure and careful documentation of AI's use in science.

There’s been a lot of discussion about whether AI might in future co-author scientific papers. Do you think that’s realistic? 

Dirk Wulff: It’s an intriguing question. Some scientists predict that AI could serve as a co-author in the future. I think it depends on how things develop. At the moment, AI models generate content that appears convincing but may be inaccurate. As long as that’s the case, human oversight will remain essential. There’s also the philosophical question of whether we want to acknowledge AI as an author. After all, humans take responsibility for their research, while AI cannot.

What kind of AI models are needed in science?  

Dirk Wulff: For science, we need open AI models that are transparent, traceable, and freely accessible. Models such as GPT-4 are powerful, but their lack of transparency hinders the reproducibility of studies. Because the training data they are based on are undisclosed and their algorithms change, it is impossible to trace how they arrive at specific results. This makes it difficult to verify scientific work involving AI. We have written an article arguing that science should focus on open alternatives, which would also allow us to train models specifically for scientific purposes.

How will the role of AI in psychology evolve and what are the implications for training young scientists?  

Dirk Wulff: AI needs to become an integral part of psychology training. Its applications range from basic research to clinical psychology, where it’s already being tested for therapeutic use. In the future, psychology students will need to understand the basics of AI, in the same way as they now need to grasp the essentials of statistics. 

With that in mind, we’ve developed a tutorial in the field of behavioral science. It’s aimed at researchers who want to use AI models in their work. We present various applications, from personality research to predicting decision-making behavior. We also provide the code so that anyone interested can work with AI themselves.

Apart from using AI as a tool in psychological research, there’s the question of what these models can teach us about ourselves. Is AI a model for human cognition?  

Dirk Wulff: That’s a fascinating question! It’s exactly what we’re exploring right now. There’s an ongoing debate about whether language models are “just” probability models for text or whether they actually replicate a form of thinking and reasoning. AI models not only help in simulating human thought processes, they also prompt us to re-evaluate fundamental mechanisms of human cognition. For example, the models can formulate coherent arguments and solve complex problems without any conscious experience or deep understanding in the human sense. This suggests that some cognitive abilities we’ve always considered uniquely human can be explained by simple mechanisms like pattern recognition and probability calculations. 

It also raises profound questions about the nature of consciousness. We humans assume that our thoughts and feelings are inextricably linked to our subjective experience. But AI models demonstrate that intelligent behavior is possible without consciousness or emotions, prompting new hypotheses about the true nature of consciousness and how it differs from purely cognitive processes. If AI behaves like a conscious human being, on what basis do we conclude that we ourselves have consciousness?

About Dirk Wulff: 

Dirk Wulff is a Senior Research Scientist in the Center for Adaptive Rationality at the Max Planck Institute for Human Development, where he heads the Search and Learning group, which explores human decision-making under uncertainty: https://www.mpib-berlin.mpg.de/staff/dirk-wulff

Original publications

Using LLMs in personality psychology to standardize constructs and scales:  

Wulff, D. U., & Mata, R. (2025). Semantic embeddings reveal and address taxonomic incommensurability in psychological measurement. Nature Human Behaviour, 1–11. https://doi.org/10.1038/s41562-024-02089-y

Wulff, D. U., & Mata, R. (2025). Escaping the jingle–jangle jungle: Increasing conceptual clarity in psychology using large language models. PsyArXiv. https://osf.io/preprints/psyarxiv/ksuh8_v1   

Bentz, D., & Wulff, D. U. (2025). Mapping OCD symptom triggers with large language models. medRxiv. https://www.medrxiv.org/content/10.1101/2025.05.15.25327706v1

Discussion paper on the influence of LLMs on science: 

Binz, M., Alaniz, S., Roskies, A., Aczel, B., Bergstrom, C. T., Allen, C., Schad, D., Wulff, D. U., West, J. D., Zhang, Q., Shiffrin, R. M., Gershman, S. J., Popov, V., Bender, E. M., Marelli, M., Botvinick, M. M., Akata, Z., & Schulz, E. (2025). How should the advancement of large language models affect the practice of science? Proceedings of the National Academy of Sciences of the United States of America, 122(5), Article e2401227121. https://doi.org/10.1073/pnas.2401227121

Tutorial on using LLMs in behavioral science: 

Hussain, Z., Binz, M., Mata, R., & Wulff, D. U. (2024). A tutorial on open-source large language models for behavioral science. Behavior Research Methods, 56, 8214–8237. https://doi.org/10.3758/s13428-024-02455-8

Call for the use of open LLMs in science:  

Wulff, D. U., Hussain, Z., & Mata, R. (2024). The behavioral and social sciences need open LLMs. OSF Preprints, September 04, 2024. https://doi.org/10.31219/osf.io/ybvzs

LLMs as a model for human cognition: 

Hussain, Z., Mata, R., & Wulff, D. U. (2025, February 19). A rebuttal of two common deflationary stances against LLM cognition. https://doi.org/10.31219/osf.io/y34ur_v2

Centaur, a computational model that can predict and simulate human behavior in every experiment expressible in natural language:

Binz, M., Akata, E., Bethge, M., Brändle, F., Callaway, F., Coda-Forno, J., Dayan, P., Demircan, C., Eckstein, M., Éltető, N., Griffiths, T., Haridi, S., Jagadish, A., Ji-An, L., Kipnis, A., Kumar, S., Ludwig, T., Mathony, M., Mattar, M., & Schulz, E. (2024). Centaur: A foundation model of human cognition. https://doi.org/10.48550/arXiv.2410.20268
