Seminar: Leveraging Voluntary Commitments for Advancing Cooperation and AI Safety: A Game Theoretical Approach

Date: Mar 19, 2024
Time: 03:00 PM (Local Time Germany)
Speaker: The Anh Han, Teesside University
Location: Max Planck Institute for Human Development, Lentzeallee 94, 14195 Berlin
Room: Small Conference Room
Host: Center for Humans and Machines

Conventional wisdom suggests that arranging a prior commitment or agreement before an interaction enhances the chance of reaching collective behaviour. Yet it is not clear which mechanisms are efficient at ensuring participation in and compliance with such a commitment, especially when participation is costly and deviating from the commitment is profitable. Here we develop a theory of participation and compliance with respect to an explicitly formed commitment under institutional incentives, where individuals first decide whether to join a cooperative agreement before playing a cooperation dilemma. Using evolutionary game theory, we determine when participating in a costly commitment and complying with it is an evolutionarily stable strategy and results in high levels of overall cooperation in a population. We show that, given a budget for providing incentives, rewarding commitment-compliant behaviours promotes cooperation better than punishing non-compliant ones. Moreover, by setting aside part of this budget to reward those who are willing to participate in a commitment, the overall frequency of cooperation can be significantly enhanced.

In the second part of the talk, starting from a baseline model that captures the fundamental dynamics of a competition for domain supremacy using AI technology, we demonstrate how socially unwanted outcomes may be produced when sanctioning is applied unconditionally to risk-taking, i.e. potentially unsafe, behaviours. We show how voluntary commitments, with sanctions imposed either by peers or by an institution, lead to socially beneficial outcomes across all envisaged scenarios of rapid AI development. These results are relevant for the design of governance and regulatory policies that aim to ensure an ethical and responsible AI technology development process.

The Anh Han is Professor of Computer Science and Director of the Center for Digital Innovation at the School of Computing, Engineering and Digital Technologies, Teesside University. He received a PhD in AI from UNL Lisbon (2010-2012) and was an FWO Postdoctoral Fellow at VUB Brussels (2012-2014). His current research spans several topics in AI and interdisciplinary research, including evolutionary game theory, incentive and behavioural modelling, agent-based simulations, and AI development/safety behaviour modelling. He has published over 120 peer-reviewed articles in top-tier Computer Science conferences and high-ranking scientific journals. He regularly serves on the programme committees of most top-tier AI conferences (e.g., IJCAI, AAAI, AAMAS) and is on the Editorial Boards of several international journals (e.g., PLOS ONE, Humanities and Social Sciences Communications, Adaptive Behavior). He has been awarded prestigious research fellowships and grants as Principal Investigator from the Future of Life Institute, EPSRC, Leverhulme Trust Foundation, and FWO Belgium.

Join at the Max Planck Institute for Human Development or online.
Webex Access
Meeting number: 2741 586 4269
Password: 6nbAVPHB4t5

Han, T. A. (2022). Institutional Incentives for the Evolution of Committed Cooperation: Ensuring Participation is as Important as Enhancing Compliance. Journal of the Royal Society Interface, 19(188). https://doi.org/10.1098/rsif.2022.0036

Han, T. A., Lenaerts, T., Santos, F. C., & Pereira, L. M. (2022). Voluntary safety commitments provide an escape from over-regulation in AI development. Technology in Society, 68, Article 101843. https://doi.org/10.1016/j.techsoc.2021.101843