TL;DR
Google has released a major upgrade to Gemini 3 Deep Think, its specialised reasoning mode. The update sets new records on academic benchmarks including Humanity’s Last Exam (48.4%) and ARC-AGI-2 (84.6%), and is available via the Gemini API for the first time, through an early access programme.
Built for Research Problems
Deep Think was developed in partnership with scientists and researchers to tackle problems that lack clear guardrails or single correct answers — the kind of challenges where data is often messy or incomplete.
Early testers are already applying it. Lisa Carbone, a mathematician at Rutgers University working on structures bridging gravity and quantum mechanics, used Deep Think to review a technical paper. The model identified a subtle logical flaw that had passed through human peer review unnoticed.
At Duke University, the Wang Lab used Deep Think to optimise fabrication methods for crystal growth, successfully designing a recipe for growing thin films larger than 100 micrometres — a target previous methods had struggled to hit.
Benchmark Performance
The updated model has reached new highs across rigorous academic benchmarks:
- Humanity’s Last Exam: 48.4% (without tools), setting a new standard on a benchmark designed to test the limits of frontier models
- ARC-AGI-2: 84.6%, verified by the ARC Prize Foundation
- Codeforces: Elo of 3,455 on competitive programming challenges
- International Mathematical Olympiad 2025: gold-medal-level performance
- International Physics and Chemistry Olympiads 2025: gold-medal-level performance on the written sections
Practical Engineering Applications
Beyond benchmarks, Deep Think can turn a hand-drawn sketch into a 3D-printable object: it analyses the drawing, models the shape, and generates a file for physical manufacturing.
The update is available to Google AI Ultra subscribers in the Gemini app. Scientists, engineers, and enterprises can apply for early API access.
Looking Forward
Deep Think positions Google as a direct competitor to OpenAI and Anthropic in specialised reasoning. For UK research institutions and engineering firms, API access to this calibre of reasoning tool could accelerate work in materials science, physics, and technical design.