Google DeepMind releases Gemini Robotics-ER 1.6; Spot robot can now read gauges autonomously


ME News, April 14 (UTC+8): According to monitoring by 1M AI News, Google DeepMind has released Gemini Robotics-ER 1.6, positioned as a high-level reasoning model for robots. Compared with its predecessor ER 1.5 and with Gemini 3.0 Flash, it shows significant improvements in spatial reasoning and multi-view understanding. The model is available to developers through the Gemini API and Google AI Studio.
Core upgrades include three capabilities:

  1. Improved pointing accuracy: pointing supports precise object detection, counting, spatial-relationship reasoning (such as “point out all objects that can fit into the blue cup”), and motion trajectory planning. The model also correctly declines to point at objects that are not present in the scene.
  2. Multi-view success detection: robots can now combine multiple camera views to determine whether a task has been completed, and remain accurate even in occluded or dynamic environments.
  3. New instrument-reading capability: the model can interpret a range of industrial instruments, including circular pressure gauges, vertical level indicators, and digital displays. It does this through agentic vision (visual reasoning plus code execution), reasoning step by step: it first zooms in on the relevant detail, then uses pointing and code to calculate ratios and intervals, and finally applies world knowledge to derive the reading.
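The "ratios and intervals" step for a circular gauge can be illustrated with a small calculation. The following is a minimal sketch, not DeepMind's implementation: it assumes pointing has already produced pixel coordinates for the dial center, the needle tip, and the minimum and maximum scale marks, and that the scale is linear. The function names and the example coordinates are hypothetical.

```python
import math


def needle_angle(center, point):
    """Angle of `point` around `center`, in degrees, measured clockwise
    from 12 o'clock. Coordinates are (x, y) pixels with y growing downward."""
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return math.degrees(math.atan2(dx, -dy)) % 360.0


def gauge_reading(center, tip, min_tick, max_tick, min_value, max_value):
    """Linearly interpolate a reading from the needle position.

    Assumes (hypothetically) a linear scale that runs clockwise from
    `min_tick` to `max_tick`. A needle resting in the dead zone between
    the two marks yields a value outside [min_value, max_value].
    """
    a_min = needle_angle(center, min_tick)
    a_max = needle_angle(center, max_tick)
    a_tip = needle_angle(center, tip)

    sweep = (a_max - a_min) % 360.0    # angular length of the whole scale
    offset = (a_tip - a_min) % 360.0   # how far the needle has travelled
    if sweep == 0.0:
        raise ValueError("min and max marks lie at the same angle")

    fraction = offset / sweep          # the ratio step
    return min_value + fraction * (max_value - min_value)


# Hypothetical 0-10 bar gauge: scale starts lower-left, ends lower-right,
# and the needle points straight up (the midpoint of a 270-degree sweep).
reading = gauge_reading(
    center=(100, 100), tip=(100, 40),
    min_tick=(50, 150), max_tick=(150, 150),
    min_value=0.0, max_value=10.0,
)
print(round(reading, 2))
```

Measuring angles clockwise from 12 o'clock keeps the arithmetic aligned with how most dial scales are drawn; a non-linear or counterclockwise scale would need a different mapping.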
The instrument-reading capability grew out of DeepMind’s collaboration with Boston Dynamics. On the same day, Boston Dynamics announced that Gemini and Gemini Robotics-ER 1.6 have been integrated into its Orbit AIVI-Learning product, in an update rolled out to all AIVI-Learning customers on April 8. The integration adds support for gauges, allowing quadruped robots such as Spot to inspect industrial facilities autonomously and read instruments such as pressure gauges.
Boston Dynamics states that Gemini’s reasoning ability has also improved AIVI-Learning’s baseline performance and accuracy in tasks such as visual inspection, pallet counting, and liquid detection.
DeepMind claims ER 1.6 is its “safest robot model.” In adversarial spatial reasoning tasks, its safety compliance significantly surpasses that of ER 1.5. In safety-risk identification tests based on real injury reports, ER series models outperform Gemini 3.0 Flash by 6% in text scenarios and 10% in video scenarios.
(Source: BlockBeats)