Google DeepMind releases Gemini Robotics-ER 1.6; Spot robot can now autonomously read gauges


ME News, April 14 (UTC+8). According to monitoring by 1M AI News, Google DeepMind has released Gemini Robotics-ER 1.6, positioning it as a high-level reasoning model for robots. Compared with its predecessor ER 1.5 and with Gemini 3.0 Flash, it shows significant improvements in spatial reasoning and multi-view understanding. The model is now available to developers via the Gemini API and Google AI Studio.

The release brings three key capability upgrades:

  1. Improved pointing accuracy: usable for precise object detection, counting, spatial-relationship reasoning (such as "point out all objects that can fit into the blue cup"), and motion-trajectory planning, and it can correctly refuse to point at objects that are not present in the scene.
  2. Multi-view success detection: robots can now integrate multiple camera feeds to verify whether a task has been completed, maintaining accuracy even in occluded or dynamic environments.
  3. New instrument-reading capability: the model can interpret a variety of industrial instruments, such as circular pressure gauges, vertical level indicators, and digital displays, reasoning step by step through agentic vision (visual reasoning plus code execution). It first zooms into detail regions, then uses pointing and code to compute ratios and scale intervals, and finally combines world knowledge to derive the reading.
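For the pointing capability above, Google's documentation for the earlier Robotics-ER models describes point outputs as JSON with coordinates normalized to a 0–1000 range in `[y, x]` order; whether 1.6 keeps this exact format is an assumption. A minimal sketch of converting such a response into pixel coordinates for a camera frame:

```python
import json

def parse_points(response_text: str, img_width: int, img_height: int) -> list:
    """Convert model point output to pixel coordinates.

    Assumes the format documented for Gemini Robotics-ER 1.5:
    [{"point": [y, x], "label": "..."}] with coordinates normalized to 0-1000.
    """
    points = json.loads(response_text)
    result = []
    for p in points:
        y, x = p["point"]
        result.append({
            "label": p["label"],
            "x": round(x / 1000 * img_width),   # scale to pixel column
            "y": round(y / 1000 * img_height),  # scale to pixel row
        })
    return result

# Hypothetical response to "point to the blue cup" on a 640x480 frame
raw = '[{"point": [500, 250], "label": "blue cup"}]'
print(parse_points(raw, 640, 480))  # [{'label': 'blue cup', 'x': 160, 'y': 240}]
```

This normalized format lets the same model output be mapped onto feeds of any resolution, which matters when, as in the multi-view case above, several cameras observe the same scene.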

The instrument-reading capability comes out of a collaboration between DeepMind and Boston Dynamics. On the same day, Boston Dynamics announced that it has integrated Gemini and Gemini Robotics-ER 1.6 into its Orbit AIVI-Learning product, which went live for all AIVI-Learning customers on April 8; the integration adds support for gauges. The quadruped robot Spot can now autonomously patrol industrial facilities and read instrument data such as pressure gauges. Boston Dynamics said that, thanks to Gemini's reasoning ability, AIVI-Learning's baseline performance and accuracy on existing tasks, such as visual inspection, pallet counting, and liquid-spill detection, have also improved.
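The ratio-and-interval approach the article describes for analog gauges reduces, in the simplest case, to linear interpolation between the scale's endpoints once the needle angle and the dial's sweep are known. A sketch with a hypothetical pressure gauge (all angles and ranges are illustrative, not from the article):

```python
def gauge_reading(needle_angle: float, angle_min: float, angle_max: float,
                  value_min: float, value_max: float) -> float:
    """Interpolate an analog gauge reading from the needle angle.

    The fraction of the dial's sweep covered by the needle maps linearly
    onto the value range: value = value_min + fraction * (value_max - value_min).
    Angles are in degrees, measured along the gauge's sweep direction.
    """
    fraction = (needle_angle - angle_min) / (angle_max - angle_min)
    return value_min + fraction * (value_max - value_min)

# Hypothetical 0-10 bar gauge with a 270-degree sweep from -45 to 225 degrees;
# a needle at 90 degrees sits exactly halfway along the sweep.
print(gauge_reading(90.0, -45.0, 225.0, 0.0, 10.0))  # 5.0
```

Nonlinear or unevenly divided scales would need the per-interval ratios the model derives by pointing at individual tick marks, rather than a single end-to-end interpolation.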

DeepMind said that ER 1.6 is its "safest robot model" to date. On adversarial spatial-reasoning tasks, its compliance with safety instructions is significantly better than ER 1.5's. In safety-risk identification tests based on real injury reports, the ER-series models score 6% higher than Gemini 3.0 Flash in text scenarios and 10% higher in video scenarios.

(Source: BlockBeats)
