Google Gemini Robotics-ER 1.6 pushes gauge-reading accuracy to 98%

TL;DR

  • Google DeepMind’s Gemini Robotics-ER 1.6, released 14 April, raises instrument-reading accuracy from 23% (version 1.5) to 98% when equipped with agentic vision
  • The capability is being deployed on Boston Dynamics’ Spot quadruped for industrial inspections, including in Hyundai automotive plants
  • For UK manufacturers and utilities relying on legacy analogue instrumentation, the jump could close a longstanding gap between autonomous inspection and human-level reliability, provided it holds in the field

Google DeepMind’s new robotics reasoning model lets four-legged robots read analogue pressure gauges, thermometers and sight glasses at a claimed 98% accuracy, more than four times the 23% achieved by the previous generation. The model, Gemini Robotics-ER 1.6, was announced on 14 April and is being deployed on Boston Dynamics’ Spot robot as part of a continuing collaboration between the two companies.

How the accuracy jump works

The model combines visual reasoning with code execution to create what DeepMind calls a “visual scratchpad”: an intermediate step in which the model points to tick marks, needle positions and text before producing an answer. On the same tasks, the Gemini 3.0 Flash base model, released in January 2026, reached 67%. Gemini Robotics-ER 1.6 scores 86% without the agentic vision layer and 98% with it.
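
The article does not describe what code the scratchpad actually runs. As a rough illustration only, once a needle tip and the two end ticks of a dial have been pointed to, a reading can be recovered by interpolating angles. The sketch below assumes a circular dial with a linear scale that increases clockwise; the function names and pixel coordinates are hypothetical and are not part of any DeepMind or Boston Dynamics API.

```python
import math

def angle_deg(center, point):
    """Angle of `point` around `center` in degrees, in [0, 360).

    Uses image coordinates (y grows downward), so the angle
    increases clockwise as seen on screen.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

def read_gauge(center, needle_tip, min_tick, max_tick,
               min_value, max_value, tol=1e-6):
    """Linearly interpolate a dial reading from three pointed-to pixels.

    Assumes (hypothetically) a linear scale running clockwise from
    `min_tick` to `max_tick`.
    """
    a_min = angle_deg(center, min_tick)
    sweep = (angle_deg(center, max_tick) - a_min) % 360.0
    offset = (angle_deg(center, needle_tip) - a_min) % 360.0
    if sweep < tol:
        raise ValueError("min and max ticks are at the same angle")
    if offset > 360.0 - tol:
        # Rounding put the needle just "before" the min tick.
        offset = 0.0
    if offset > sweep + tol:
        raise ValueError("needle lies outside the marked scale")
    fraction = min(offset / sweep, 1.0)
    return min_value + fraction * (max_value - min_value)

# Hypothetical 0-10 bar gauge: ticks at lower-left and lower-right,
# needle pointing straight up, i.e. halfway along a 270-degree sweep.
reading = read_gauge(center=(100, 100), needle_tip=(100, 20),
                     min_tick=(50, 150), max_tick=(150, 150),
                     min_value=0.0, max_value=10.0)
print(f"{reading:.2f} bar")
```

A real dial may have a non-linear scale, several needles or an off-axis camera view, none of which this sketch handles.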

Boston Dynamics has been testing Spot as an industrial inspector across Hyundai Motor Group’s automotive facilities. The tasks involve reading multiple instrument types under varying lighting and viewing angles, historically a weak point for computer vision systems, which have struggled with reflective glass, partial occlusion and non-standard dial layouts.

Why it matters for UK operators

A large share of UK manufacturing, water treatment and energy infrastructure still relies on analogue gauges installed decades ago. Retrofitting digital instrumentation is often impractical. Replacing human walk-round inspections with vision-based autonomous systems has been the obvious alternative, but it has repeatedly stumbled on real-world accuracy. A 98% accuracy figure, if it holds outside benchmark conditions, changes the economics: inspection tasks that previously required human presence in hazardous or remote environments become viable candidates for continuous autonomous monitoring.
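
To put the accuracy figures in operational terms, a back-of-envelope sketch can convert each one into expected misreads per inspection round. The accuracy values are those reported in this article; the round size of 200 gauges is a hypothetical number chosen for illustration, and treating benchmark accuracy as a per-reading error rate in the field is a simplifying assumption.

```python
def expected_misreads(accuracy: float, readings: int) -> float:
    """Expected number of wrong readings, treating `accuracy` as the
    per-reading probability of a correct answer (an assumption)."""
    return (1.0 - accuracy) * readings

# Accuracy figures as reported in the article.
REPORTED_ACCURACY = {
    "Gemini Robotics-ER 1.5": 0.23,
    "Gemini 3.0 Flash": 0.67,
    "Gemini Robotics-ER 1.6 (no agentic vision)": 0.86,
    "Gemini Robotics-ER 1.6 (agentic vision)": 0.98,
}

GAUGES_PER_ROUND = 200  # hypothetical site size, not from the article

for model, accuracy in REPORTED_ACCURACY.items():
    misreads = expected_misreads(accuracy, GAUGES_PER_ROUND)
    print(f"{model}: about {misreads:.0f} misreads per "
          f"{GAUGES_PER_ROUND} readings")
```

On these assumptions, a 200-gauge round falls from roughly 154 expected misreads at 23% accuracy to about 4 at 98%. Whether the remaining errors are acceptable depends on which instruments are misread and by how much.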

Looking forward

The commercial test is whether the gains hold at operational scale. Benchmark accuracy does not always survive contact with dusty, poorly lit or non-standard instruments, and Boston Dynamics has not disclosed failure modes or baselines comparing the system with human inspectors. Expect asset-heavy UK operators such as National Grid, Thames Water and Tata Steel to watch for field data before committing to procurement. The partnership also hints at Hyundai’s strategy of building robotics competence in-house through Boston Dynamics, rather than licensing external AI platforms.