A multi-player tournament benchmark that tests LLMs in social reasoning, strategy, and deception

Players engage in public and private conversations, form alliances, and vote to eliminate each othe…
