Seminar: Decision Heuristics and Evaluation Sensitivity in Large Language Models: Parallels to Cognitive and Behavioral Phenomena

Date: May 5, 2026
Time: 02:00 PM - 03:15 PM (Local Time Germany)
Speaker: Sahar Abdelnabi, ELLIS Institute Tübingen
Location: Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin
Room: Room 316 (CHM)
Host: Center for Humans and Machines
Topic: Discussion and debate formats, lectures

Large language models (LLMs) have been primarily trained on next-token prediction, yet the behavioral properties that emerge from this objective extend well beyond text completion. In this talk, I present two lines of work examining emergent computational properties of LLMs that parallel well-studied phenomena in cognitive and behavioral science. First, I introduce a theory of response sampling in LLMs, showing that when these models generate outputs over large option spaces, the underlying sampling process exhibits a dual structure: a descriptive component reflecting statistical regularities in the training distribution, and a prescriptive component reflecting an implicit normative ideal. This results in systematic deviations of sampled outputs from empirical base rates toward idealized values, a pattern that mirrors the descriptive–prescriptive distinction in the literature on human judgment and decision-making. We validate this across diverse real-world domains and show that concept prototypes in LLMs are shaped by prescriptive norms, consistent with findings on prototype distortion in human cognition. Second, I present results on evaluation sensitivity in reasoning-capable LLMs. We find that these models produce systematically different outputs when contextual cues signal an evaluation setting, a behavioral shift structurally analogous to the Hawthorne effect observed in human and animal studies. This raises methodological questions about the reliability of static benchmarks and points to an underexplored form of context-dependent response modulation in trained neural networks. These findings suggest that optimization at scale over human-generated data gives rise to functional analogs of cognitive heuristics and context sensitivity without explicit design. I discuss implications for AI safety and alignment, and outline open questions at the interface of large-scale representation learning and the computational principles studied in neuroscience.

Sahar Abdelnabi is a Principal Investigator at the ELLIS Institute Tübingen and an Independent Research Group Leader at the Max Planck Institute for Intelligent Systems. She leads the COMPASS research group (COoperative Machine intelligence for People-Aligned Safe Systems). Her research focuses on AI security, safety, and alignment, with particular expertise in multi-agent systems, prompt injection attacks, privacy frameworks, and evaluation robustness. Prior to her current role, she worked at the Microsoft Security Response Center on AI security vulnerabilities and red-teaming. Sahar's contributions include pioneering work on indirect prompt injection in LLM-integrated applications, which has been widely adopted by NIST, MITRE, OWASP, and Microsoft. She holds a PhD from the CISPA Helmholtz Center for Information Security and Saarland University.

Alternatively, join online:

Microsoft Teams

Meeting ID: 378 283 009 672 8

Passcode: 5yK65cy7