People

Dat Nguyen

Joint Postdoctoral Fellow, Basis and Harvard

Dat Nguyen is a joint postdoctoral fellow at Basis and Nada Amin's lab at Harvard SEAS. He works on probabilistic programming and verification, building neuro-symbolic systems that pair LLMs with SMT solvers and proof assistants for proof automation and scientific discovery.

Affiliations

  • Basis Research Institute
  • Programming Languages and Formal Methods groups
    Harvard SEAS (Nada Amin's lab)

Education

  • PhD, Computing and Information Systems
    University of Melbourne, 2024

Previous Appointments

  • AI Research Engineer
    Cinnamon AI Lab, 2016–2021

About

I am a joint postdoctoral fellow at Basis and Nada Amin’s lab at Harvard SEAS. I work on programming languages for AI, and on AI for programming languages.

I started with a weird combination of program synthesis, software analysis, and deep learning, with a particular interest in graph-based learning over code and documents. These days I work mostly on probabilistic programming and verification, building neuro-symbolic systems that pair LLMs with SMT solvers and proof assistants. At Harvard that means proof automation in Lean and causal modeling for drug repurposing; at Basis it means contributing to MARA and R-ADA.

I did my PhD at the University of Melbourne (2021–2024) on program repair and synthesis, and before that spent five years as an AI research engineer at Cinnamon AI Lab.

Projects

Current and recent Basis projects.

MARA

I worked on Autumn.cpp, the C++ interpreter for MARA’s modeling language, and on the MARA protocol for evaluating world-model learning. I’m now building world-modeling agents that combine LLMs with program synthesis.

R-ADA

I’m helping build the formal language R-ADA uses to describe robots.

Collaborations

Ways to work with Dat at Basis.

Collaborative research projects

I’m happy to talk to people working on proof automation or neuro-symbolic methods, especially anyone trying to point those tools at real scientific problems.

Awards

Recognition for research contributions.

Best Paper, ARC Prize 2024

For “Combining Induction and Transduction for Abstract Reasoning.”

Software

Tools and research code.

Autumn.cpp

A C++ implementation of the Autumn modeling language, with bindings for WASM, Python, and Julia. Used inside MARA.

VRDSynth

Synthesizes programs that link entities across visually-rich documents. Replication code for the ISSTA 2024 paper.

NeuroSymbolicDG

Neuro-symbolic domain generalization via compositional layout grammars.

Recent Publications

Recent papers and preprints.

Selected Publications

Earlier selected work.

VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction

ISSTA 2024

D. Nguyen, Tung Do-Viet, Hung Nguyen-Duy, Tuan-Hai Luu, Hung Le, Bach Le, Patanamon Thongtanunam

FFL: Fine-grained Fault Localization for Student Programs via Syntactic and Semantic Reasoning

ICSME 2022

D. Nguyen, Thanh Le-Cong, Duc M. Le, Xuan-Bach D. Le, Quyet-Thang Huynh

Articles

Basis essays and updates this person wrote or contributed to.

ExoPredicator: Abstracting Time and State for Robot Planning

April 22, 2026

We introduce ExoPredicator, a system that learns abstract world models for robot planning. By abstracting state, time, and both endogenous and exogenous causal processes, ExoPredicator enables robots to quickly learn how dynamic environments work and plan efficiently in them.

AutumnBench: World Model Learning in Humans and AI

July 17, 2025

We’re releasing a new version of Autumn with human baseline results, AI performance comparisons, and an interactive benchmark for world model discovery. This release includes the MARA protocol and provides a public platform for testing causal reasoning capabilities.