AutumnBench: World Model Learning in Humans and AI
We’re releasing a new version of Autumn with human baseline results, AI performance comparisons, and an interactive benchmark for world model discovery. This release includes the MARA protocol and provides a public platform for testing causal reasoning capabilities.