Hi! I am Thomas (Hanwen) Zhu. I am currently a first-year MS in Machine Learning student at CMU, working on applying language models to formal theorem proving. I am honored to be advised by Prof. Sean Welleck and Prof. Jeremy Avigad. My research interests more broadly include combining neural and symbolic reasoning methods and advancing reasoning abilities in machine learning models in a scalable, grounded, and robust way.

I had the honor to work on Seed-Prover, which solved 5/6 problems on IMO 2025 and set new state-of-the-art scores on all Lean benchmarks, with an amazing team at ByteDance Seed in 2025. I worked on the test-time compute scaling pipeline and the Lean verification infrastructure.

I received BA Mathematics and Computer Science from Oxford, where I graduated top first and received the highest CS Gibbs Prize. I had the privilege to work with Ruining Li and Tomas Jakab at VGG in applying diffusion models to 3D generation of human-object interactions. I also worked at OxAI in developing a benchmark for gender bias in large vision-language models.

I am always excited to hear about potential collaborations or ideas. Please contact me at [email protected].

Publications

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving
ByteDance Seed
Paper News GitHub

Premise Selection for a Lean Hammer
Thomas Zhu*, Joshua Clune*, Jeremy Avigad†, Albert Q. Jiang†, Sean Welleck†
Paper LeanHammer
In preprint

miniCTX: Neural Theorem Proving with (Long-)Contexts
Jiewen Hu, Thomas Zhu, Sean Welleck
Paper Dataset
ICLR 2025 Oral (Top 1.8%)

DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors
Thomas Zhu, Ruining Li*, Tomas Jakab*
Website Paper GitHub
ICCV HiGen Workshop 2025 Oral

VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Siobhan Mackenzie Hall, Fernanda Gonçalves Abrantes, Thomas Hanwen Zhu, Grace Sodunke, Aleksandar Shtedritski, Hannah Rose Kirk
Paper GitHub
NeurIPS Datasets and Benchmarks 2023

Blogs

May 4, 2025: DeepSeek-Prover-V2 Learned to Cheat.