Research on the Disconnect Between Cognition and Action in Tool-Use Agents

AIMPACT news, May 17 (UTC+8): This interpretability paper studies tool-use agents. By probing hidden states, it finds that models often recognize when a tool call is needed yet fail to actually make the call, with a mismatch rate of 26% to 54%. The problem lies entirely in the transition from cognition to action, not in cognition itself. The internal detection direction can be decoded, but the final-token mechanism in the later layers rotates the signal until it is almost orthogonal to the generated action. The research aims to predict the effectiveness of interventions and points out that common attributions, such as insufficient prompting or training, may overlook the geometric structure of the later layers. This offers a plausible explanation for the performance ceiling seen in A/B tests of tool-use prompts. (Source: AiHot)
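
The geometric claim in the summary above can be illustrated with a toy 2-D sketch. Everything here is an assumption made for illustration: the vectors, the angles, and the names `action_dir`, `signal`, and `readout` are hypothetical and are not taken from the paper. The sketch only shows why a signal can stay fully decodable (its length is unchanged by rotation) while contributing almost nothing along the direction that drives the action.

```python
import math

def rotate(v, degrees):
    """Rotate a 2-D vector counterclockwise by the given angle in degrees."""
    t = math.radians(degrees)
    x, y = v
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    return dot(a, b) / (math.hypot(*a) * math.hypot(*b))

# Hypothetical directions (assumptions, not values from the paper).
action_dir = (1.0, 0.0)  # direction assumed to drive the "emit tool call" output
signal = (3.0, 0.0)      # strong "a tool is needed" signal, initially aligned

def readout(vec, degrees):
    """Projection of the rotated signal onto the action direction."""
    return dot(rotate(vec, degrees), action_dir)

for angle in (0, 45, 85, 90):
    rotated = rotate(signal, angle)
    print(angle,
          round(math.hypot(*rotated), 3),        # signal strength, unchanged by rotation
          round(cosine(rotated, action_dir), 3), # alignment with the action direction
          round(readout(signal, angle), 3))      # what actually reaches the action
```

In this toy setting, making the signal stronger scales the readout but does not change the angle, so near 90 degrees even a large signal projects to almost nothing.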
GateUser-cf218ace
· 6h ago
The finding that the later layers geometrically rotate the signal at the final token is crucial. Until now, everyone kept tweaking prompts back and forth, but the root cause turns out to be misaligned directions in the representation space.
FloatingTeacup
· 6h ago
The bottleneck in converting cognition into action: this framework could be applied to many AI safety issues.
QuietRugAlarm
· 7h ago
The word "orthogonal" is used brilliantly; signals and actions are almost perpendicular, and even the strongest cognition can't break through.
FarmingNoSleep
· 7h ago
Geometric structure > prompt engineering. This conclusion is hugely important for people building agents.
StardustUnderTheGlassDome
· 7h ago
Thinking about it, this would explain why the same tool can sometimes be called successfully just by changing the wording: the rotation angle has changed.
YieldBento
· 7h ago
The internal signal is decodable, but the later layers rotate it to be nearly orthogonal to the action. Is this orthogonality a bug or a feature?