Seminar: Fine-tuning language models to find agreement among humans with diverse preferences

  • Date: Apr 18, 2023
  • Time: 03:00 PM (Local Time Germany)
  • Speaker: Michiel Bakker, DeepMind
  • Location: Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin
  • Room: Small Conference Room
  • Host: Center for Humans and Machines

Recent work on large language models (LLMs) has used fine-tuning to align model outputs with the preferences of a prototypical user. This approach, which forms the basis of popular LLM assistants like ChatGPT, assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. In my work I challenge this assumption by embracing the heterogeneity of human preferences to consider a different challenge: how might a machine help people with diverse views find agreement? I fine-tune a 70-billion-parameter LLM to generate statements that maximize the expected approval of a group of people with potentially diverse opinions. Human participants provide written opinions on thousands of questions touching on moral and political issues (e.g., "should we raise taxes on the rich?"), and rate the candidate consensus statements generated by the LLM for agreement and quality. A reward model is then trained to predict individual preferences, enabling it to quantify and rank consensus statements in terms of their appeal to the overall group, defined according to different aggregation (social welfare) functions. The model produces consensus statements that human users prefer over those from prompted LLMs and over human-generated opinions. These methods represent a significant advancement towards scalable, efficient, and broadly applicable decision-making systems that leverage artificial intelligence to mediate human interaction.
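
The ranking step described in the abstract can be illustrated with a small sketch. It assumes a reward model has already produced one predicted approval score per participant for each candidate statement. The scores, the statement names, and the three welfare functions shown (utilitarian mean, Rawlsian minimum, Nash geometric mean) are illustrative assumptions based on common textbook choices, not necessarily the ones used in the talk.

```python
import math
from typing import Callable, Dict, List, Sequence


def utilitarian(scores: Sequence[float]) -> float:
    """Average predicted approval across the group."""
    return sum(scores) / len(scores)


def rawlsian(scores: Sequence[float]) -> float:
    """Predicted approval of the least satisfied participant."""
    return min(scores)


def nash(scores: Sequence[float]) -> float:
    """Geometric mean of predicted approvals (scores must be positive)."""
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def rank_statements(
    predicted: Dict[str, Sequence[float]],
    welfare: Callable[[Sequence[float]], float],
) -> List[str]:
    """Order candidate statements from highest to lowest group welfare."""
    return sorted(predicted, key=lambda name: welfare(predicted[name]), reverse=True)


# Hypothetical reward-model outputs: one score in (0, 1] per participant.
predicted = {
    "statement_a": [0.9, 0.9, 0.2],  # popular with two people, rejected by one
    "statement_b": [0.6, 0.6, 0.6],  # moderately acceptable to everyone
    "statement_c": [0.8, 0.5, 0.4],
}

print(rank_statements(predicted, utilitarian))  # statement_a ranks first
print(rank_statements(predicted, rawlsian))     # statement_b ranks first
print(rank_statements(predicted, nash))         # statement_b ranks first
```

In this toy example the choice of welfare function changes the winner: averaging favors the statement that two of three participants like strongly, while the minimum and the geometric mean favor the statement that nobody rejects.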

Michiel Bakker is a Senior Research Scientist at DeepMind, where his work focuses on large language models and human-AI interaction. Prior to joining DeepMind full-time in early 2021, he earned his master's and PhD degrees in computer science from MIT EECS, working under the supervision of Professor Alex Pentland at the MIT Media Lab. He also holds bachelor's and master's degrees in physics from TU Delft, where he gained experience in quantum computing through his work at QuTech and IBM Quantum. Before continuing his education at MIT, Michiel co-founded the flower subscription startup Bloomon in London.

Attend at the Max Planck Institute for Human Development or join online.
Webex Access

Meeting number: 2740 676 0855
Password: mqHF8YJkT77
