Google DeepMind releases Gemini Robotics-ER 1.6; Spot robot can now autonomously read industrial gauges

ME News, April 14 (UTC+8). According to 1M AI News monitoring, Google DeepMind has released Gemini Robotics-ER 1.6, positioned as a high-level reasoning model for robots. Compared with its predecessor, ER 1.5, and with Gemini 3.0 Flash, it shows significant improvements in spatial reasoning and multi-view understanding. The model is available to developers via the Gemini API and Google AI Studio.

Core upgrades include three capabilities:

  1. Improved pointing accuracy: the model can be used for precise object detection, counting, spatial relationship reasoning (such as “point out all objects that can fit into the blue cup”), and motion trajectory planning, and it correctly declines to point at objects that are not present in the view.
  2. Multi-view task-completion detection: robots can now integrate multiple camera feeds to judge whether a task is complete, maintaining accuracy even in occluded or dynamic environments.
  3. New instrument reading capability: the model can interpret industrial instruments such as circular pressure gauges, vertical level indicators, and digital displays. Using agentic vision (visual reasoning plus code execution), it reasons step by step: first zooming in on the detail area, then using pointing and code to calculate ratios and scale intervals, and finally combining world knowledge to derive the reading.
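The ratio-and-interval arithmetic in that final step can be illustrated with a minimal sketch. This is not DeepMind's implementation; the function and the gauge parameters below are hypothetical. Once the needle angle has been localized (e.g. via pointing), a circular gauge reading is a linear interpolation between the dial's minimum and maximum marks:

```python
def gauge_reading(needle_deg, min_deg, max_deg, min_val, max_val):
    """Linearly interpolate a dial reading from a needle angle.

    All angles are in degrees, measured the same way around the dial
    (e.g. clockwise from the position of the minimum mark).
    """
    frac = (needle_deg - min_deg) / (max_deg - min_deg)
    return min_val + frac * (max_val - min_val)

# Hypothetical pressure gauge: a 0-10 bar scale swept over 270 degrees.
# A needle localized at 135 degrees sits halfway along the sweep.
print(gauge_reading(135, 0, 270, 0.0, 10.0))  # 5.0 bar
```

The same interpolation applies to vertical level indicators, with pixel heights in place of angles.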

The instrument reading capability stems from a collaboration between DeepMind and Boston Dynamics. On the same day, Boston Dynamics announced that it has integrated Gemini and Gemini Robotics-ER 1.6 into its Orbit AIVI-Learning product, which was rolled out to all AIVI-Learning customers on April 8. The integration adds gauge support: the quadruped robot Spot can now autonomously patrol industrial facilities and read instruments such as pressure gauges. Boston Dynamics says that with Gemini's reasoning ability, AIVI-Learning's baseline performance and accuracy on existing tasks such as visual inspection, pallet counting, and liquid leak detection have also improved.

DeepMind says ER 1.6 is its “safest robot model” to date. On adversarial spatial reasoning tasks, its compliance with safety instructions is significantly better than ER 1.5's. In safety risk identification tests based on real injury reports, the ER series models score 6% higher than Gemini 3.0 Flash in text scenarios and 10% higher in video scenarios.
(Source: BlockBeats)
