People

Dat Nguyen

Joint Postdoctoral Fellow, Basis and Harvard

Dat Nguyen is a joint postdoctoral fellow at Basis and Nada Amin's lab at Harvard SEAS. He works on probabilistic programming and verification, building neuro-symbolic systems that pair LLMs with SMT solvers and proof assistants for proof automation and scientific discovery.

Website Google Scholar GitHub LinkedIn Email

Affiliations

Basis Research Institute
Programming Languages and Formal Methods groups
Harvard SEAS (Nada Amin's lab)

Education

PhD, Computing and Information Systems
University of Melbourne, 2024

Previous Appointments

AI Research Engineer
Cinnamon AI Lab, 2016–2021

About

I am a joint postdoctoral fellow at Basis and Nada Amin’s lab at Harvard SEAS. I work on programming languages for AI, and on AI for programming languages.

I started with a weird combination of program synthesis, software analysis, and deep learning, with a particular interest in graph-based learning over code and documents. These days I work mostly on probabilistic programming and verification, building neuro-symbolic systems that pair LLMs with SMT solvers and proof assistants. At Harvard that means proof automation in Lean and causal modeling for drug repurposing; at Basis it means contributing to MARA and R-ADA.

I did my PhD at the University of Melbourne (2021–2024) on program repair and synthesis, and before that spent five years as an AI research engineer at Cinnamon AI Lab.

Projects

Current and recent Basis projects.

MARA

I worked on Autumn.cpp, the C++ interpreter for MARA’s modeling language, and on the MARA protocol for evaluating world-model learning. I’m now building world-modeling agents that combine LLMs with program synthesis.

R-ADA

I’m helping build the formal language R-ADA uses to describe robots.

Collaborations

Ways to work with Dat at Basis.

Collaborative research projects

I’m happy to talk to people working on proof automation or neuro-symbolic methods, especially anyone trying to point those tools at real scientific problems.

Awards

Recognition for research contributions.

Best Paper, ARC Prize 2024

For “Combining Induction and Transduction for Abstract Reasoning.”

Software

Tools and research code.

Autumn.cpp

A C++ implementation of the Autumn modeling language, with bindings for WASM, Python, and Julia. Used inside MARA.

VRDSynth

Synthesizes programs that link entities across visually-rich documents. Replication code for the ISSTA 2024 paper.

NeuroSymbolicDG

Neuro-symbolic domain generalization via compositional layout grammars.

Recent Publications

Benchmarking World-Model Learning with Environment-Level Queries

ICML 2026

A. Warrier, D. Nguyen, M. Naim, Moksh Jain, Y. Liang, K. Schroeder, Cambridge Yang, Joshua B. Tenenbaum, Sebastian J. Vollmer, Kevin Ellis, Z. Tavares

VirDA: Reusing Backbone for Unsupervised Domain Adaptation with Visual Reprogramming

TMLR 2025

Duy Nguyen, D. Nguyen

ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning

ICLR 2026

Y. Liang, D. Nguyen, Cambridge Yang, Tianyang Li, Joshua B. Tenenbaum, Carl Edward Rasmussen, Adrian Weller, Z. Tavares, Tom Silver, Kevin Ellis

Towards Reliable Evaluation of Neural Program Repair with Natural Robustness Testing

ACM Transactions on Software Engineering and Methodology 2025

Thanh Le-Cong, D. Nguyen, Xuan Bach D. Le, Toby Murray

Combining Induction and Transduction for Abstract Reasoning

OpenReview 2024

Wen-Ding Li, Keya Hu, Carter Larsen, Yuqing Wu, Simon Alford, Caleb Woo, Spencer M. Dunn, Hao Tang, M. Naim, D. Nguyen, Wei-Long Zheng, Z. Tavares, Yewen Pu, Kevin Ellis

Selected Publications

Earlier selected work.

VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction

ISSTA 2024

D. Nguyen, Tung Do-Viet, Hung Nguyen-Duy, Tuan-Hai Luu, Hung Le, Bach Le, Patanamon Thongtanunam

FFL: Fine-grained Fault Localization for Student Programs via Syntactic and Semantic Reasoning

ICSME 2022

D. Nguyen, Thanh Le-Cong, Duc M. Le, Xuan-Bach D. Le, Quyet-Thang Huynh

Articles

Basis essays and updates this person wrote or contributed to.

ExoPredicator: Abstracting Time and State for Robot Planning

April 22, 2026

We introduce ExoPredicator, a system that learns abstract world models for robot planning. By abstracting state, time, and both endogenous and exogenous causal processes, ExoPredicator enables robots to quickly learn how dynamic environments work and plan efficiently in them.

AutumnBench: World Model Learning in Humans and AI

July 17, 2025

We’re releasing a new version of Autumn with human baseline results, AI performance comparisons, and an interactive benchmark for world model discovery. This release includes the MARA protocol and provides a public platform for testing causal reasoning capabilities.