The agent has entered the harness-driven era


Text | Xiaguang AI Laboratory

Recently, a hot topic in AI circles has been Anthropic's unexpected exposure of the full source code of its AI programming tool, Claude Code, totaling over 512k lines. Although the leaked code revealed no revolutionary new algorithms, it laid bare the engineering practices behind a leading vendor's agent.

On April 10th, Zhu Zheqing, founder of Pokee.ai, participated in the online closed-door “Deep Talk with Builders” initiated by Jin Qiu Fund, sharing insights on “What the Claude Code Leak Reveals About Harness Engineering and Post-training.”

He believes that Anthropic’s architecture is highly adapted to the Claude model, and that migrating it directly to other models would significantly degrade performance. However, its harness design philosophy, component-based structure, and deep integration with post-training offer strong reference value for teams building their own agents.

Over the past three years, large models have evolved from simple API capabilities to core product modules; the industry has shifted from “model shell companies” to complex agent systems driven by harnesses—models are no longer the sole core; tool invocation, execution environment, context management, and verification mechanisms collectively determine the final outcome.

What is a harness? Literally, it means a bridle or reins. If a large model is a powerful horse ready to charge, then the harness is the reins humans use to guide and control that horse. As AI officially enters the harness-driven era, for users, the truly scarce ability is not inside the model but outside it—how to find a suitable bridle and have a clear, precise destination in mind.

This article is based on Zhu Zheqing’s sharing, summarized by AI and manually proofread, aiming to present the essence of this discussion.

A harness can be understood as the complete engineering architecture that drives the model. Its core purpose is to maximize the model’s capabilities, not merely to emit tokens. Claude Code’s harness breaks down cleanly into six core components:

  1. Multi-level System Prompt

Modern system prompts are far beyond “You are a helpful assistant.” They are large-scale, layered, cacheable complex instruction sets:

  • Fixed, cacheable parts: agent identity, core instructions, tool definitions, tone norms, and safety policies, running to hundreds of thousands of tokens. Any change invalidates the cache, greatly increasing cost and latency;

  • Dynamic, replaceable parts: session state, current time, readable files, code dependencies, etc., switched flexibly according to the task;

Engineering practice: prompts are fine-tuned for different users via A/B testing to optimize task completion rates and reduce errors.
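The fixed/dynamic split above can be sketched in a few lines. This is a hedged illustration, not the leaked implementation: `STATIC_LAYERS`, `build_prompt`, and the layer contents are all invented, but the ordering principle (stable, cacheable prefix first; volatile session data last) is the one the text describes.

```python
# Illustrative sketch: assemble a system prompt so the large, stable part
# comes first (provider-cache-friendly) and the per-session part comes
# last (cheap to swap). All names here are assumptions, not leaked code.

STATIC_LAYERS = [
    "IDENTITY: You are a coding agent.",       # agent identity
    "TOOLS: file_read, file_edit, bash",       # tool definitions
    "SAFETY: never run destructive commands",  # safety policy
]

def build_prompt(session_state: dict) -> str:
    """Static layers first, dynamic session data appended at the end.

    Any edit to STATIC_LAYERS invalidates the prompt cache, so only
    the tail should change between sessions.
    """
    static = "\n".join(STATIC_LAYERS)
    dynamic = "\n".join(f"{k.upper()}: {v}" for k, v in session_state.items())
    return static + "\n" + dynamic

p1 = build_prompt({"time": "2025-04-10", "cwd": "/repo"})
p2 = build_prompt({"time": "2025-04-11", "cwd": "/repo"})
prefix_len = len("\n".join(STATIC_LAYERS)) + 1
assert p1[:prefix_len] == p2[:prefix_len]   # shared prefix stays cacheable
```

The point of the ordering is economic: the identical prefix across sessions is what a provider-side prompt cache can reuse.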

By comparison, Claude Code’s architecture is simpler, imposing a lower attention burden and producing fewer hallucinations; OpenAI-style architectures are more complex, requiring the model to read large numbers of files, which easily induces memory hallucinations.

  2. Tool Schema

Tool definitions directly determine invocation accuracy. Key design points:

  • Built-in core tools: file read/write/edit, Bash, web batch processing, etc., are adapted during model training, so no additional tool descriptions are needed during inference;

  • Permissions and security: enterprise scenarios reject third-party tools without permission checks to prevent malicious operations;

  • Parallel tool invocation: can improve execution speed, but post-training is very challenging—parallel calls with no dependencies can cause timing mismatches, making reward signals hard to align.
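To make the schema and permission points concrete, here is a hedged sketch of a tool definition plus a minimal permission-style gate. The field layout follows the public Anthropic tool-use format (`name`, `description`, `input_schema`), but the `file_edit` tool itself and `validate_call` are invented for illustration.

```python
# Illustrative tool definition in JSON-Schema style; the tool and its
# fields are assumptions modeled on the public Anthropic tool-use format.

file_edit_tool = {
    "name": "file_edit",
    "description": "Replace a string in a file. Fails if old_str is absent.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "File to edit"},
            "old_str": {"type": "string"},
            "new_str": {"type": "string"},
        },
        "required": ["path", "old_str", "new_str"],
    },
}

def validate_call(tool: dict, args: dict) -> bool:
    """A permission-style gate: reject calls missing required arguments
    before anything touches the filesystem."""
    required = tool["input_schema"]["required"]
    return all(k in args for k in required)

assert validate_call(file_edit_tool, {"path": "a.py", "old_str": "x", "new_str": "y"})
assert not validate_call(file_edit_tool, {"path": "a.py"})   # rejected up front
```

A precise `description` and tight `required` list is what keeps invocation accuracy high; vague schemas are where hallucinated arguments creep in.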

  3. Tool Call Loop

This is the heart of the harness and the key to unifying training and inference:

  • Planning Mode: understand the task, organize the file system, clarify available tools, generate an execution plan, then proceed to execution; avoids blind trial-and-error (e.g., repeatedly calling unavailable search engines), reducing invalid token consumption;

  • Execution Mode: execute tools in a sandbox according to the plan, closing the loop with results;

Core value: eliminates intermediate errors in long-chain execution, reduces retry costs, but also makes training planning ability more difficult—reward signals for good planning can be easily disturbed by noise in the execution phase.
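The two-mode loop above can be sketched as follows. Everything here is a toy stand-in (the tool registry, the naive candidate chain in `plan`); a real harness would call the model for both planning and each execution step, but the control flow — plan against the actually-available tools, then run the locked-in chain — matches the description.

```python
# Minimal plan-then-execute sketch. TOOLS and the candidate chain are
# invented; the point is that planning filters out unavailable tools so
# execution never burns tokens on blind retries.

TOOLS = {
    "read": lambda state: state + ["read:ok"],
    "edit": lambda state: state + ["edit:ok"],
    "test": lambda state: state + ["test:pass"],
}

def plan(task: str, available: set) -> list:
    """Planning mode: keep only tools that actually exist, avoiding
    blind trial-and-error against unavailable ones (e.g. a search
    engine that isn't wired up)."""
    wanted = ["read", "edit", "search", "test"]   # naive candidate chain
    return [t for t in wanted if t in available]

def execute(steps: list) -> list:
    """Execution mode: run the locked-in plan step by step in order,
    threading the sandbox state through each tool."""
    state = []
    for step in steps:
        state = TOOLS[step](state)
    return state

steps = plan("fix bug", set(TOOLS))
assert "search" not in steps            # filtered at plan time, not at runtime
assert execute(steps)[-1] == "test:pass"
```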

  4. Context Manager

Addresses the efficient use of contexts running to millions of tokens:

  • Pointer-based memory: only record file pointers and topic tags, not full content;

  • Background automatic merging, deduplication, and linking of files;

Current status: still heuristic, unable to perfectly solve multi-file, cross-link reasoning issues (e.g., missing linked files), with no end-to-end optimal solution yet.
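Pointer-based memory can be sketched as a small index of paths and topic tags, dereferenced on demand. This is an assumed minimal structure, not Claude Code's actual manager, and it deliberately shows the heuristic nature the text mentions: recall is tag matching, with no guarantee that cross-linked files come along.

```python
# Illustrative pointer-based memory: store file paths and topic tags,
# never full contents; the caller re-reads only what it needs.

class PointerMemory:
    def __init__(self):
        self.entries = []          # (path, tags) pairs, not file bodies

    def remember(self, path: str, tags: set):
        self.entries.append((path, tags))

    def recall(self, tag: str) -> list:
        """Return only the pointers matching a topic tag. Purely
        heuristic: a file linked from a match but tagged differently
        is missed, which is the cross-link gap the text describes."""
        return [p for p, tags in self.entries if tag in tags]

mem = PointerMemory()
mem.remember("src/auth.py", {"auth", "security"})
mem.remember("src/ui.py", {"frontend"})
assert mem.recall("auth") == ["src/auth.py"]
```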

  5. Sub Agent

Mainstream multi-agent collaboration lacks theoretical guarantees: no shared goals, no general-purpose training algorithms, only “train individually, cooperate ad hoc.”

The main-sub agent architecture is essentially hierarchical reinforcement learning:

  • The main agent defines sub-tasks (Options) for sub-agents, with sub-task end states as the next starting point for the main agent;

  • Shared KV cache and input context: sub-agents execute and only append results, without additional token consumption, much cheaper than serial execution;

Typical implementation: ByteDance’s ContextFormer and similar approaches align closely with this idea.
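The hierarchical-RL view of the main/sub-agent split can be sketched as below. The decomposition and result format are invented; the two properties it illustrates come from the text: each sub-task's end state becomes the main agent's next starting point, and sub-agents only append their results to the shared context rather than re-sending the prefix.

```python
# Illustrative main/sub-agent loop viewed as hierarchical RL: the main
# agent emits sub-tasks (options); each sub-agent runs to its end state.
# Context sharing is modeled as append-only writes after a cached prefix.

def sub_agent(subtask: str) -> str:
    # Stand-in for a full sub-agent rollout; only the end state is
    # returned, never the sub-agent's intermediate trace.
    return f"{subtask}:done"

def main_agent(goal: str, shared_context: list) -> list:
    options = [f"{goal}-part{i}" for i in range(3)]   # sub-task decomposition
    for opt in options:
        result = sub_agent(opt)
        shared_context.append(result)   # append-only: cached prefix untouched
    return shared_context

ctx = main_agent("refactor", ["<cached prompt prefix>"])
assert ctx[0] == "<cached prompt prefix>"   # shared prefix is never rewritten
assert ctx[1:] == ["refactor-part0:done",
                   "refactor-part1:done",
                   "refactor-part2:done"]
```

The append-only discipline is what makes this far cheaper than serial standalone agents: the shared KV cache over the prefix is reused by every sub-agent.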

  6. Verification Hooks

Addresses the problem of models talking up their own work and falsely reporting completion:

  • Strong models exhibit self-preference: they rate their own outputs far higher than peer review would, and are prone to “lying” rather than merely hallucinating;

  • Engineering solution: introduce background classifiers that only evaluate tool execution results, ignoring model-generated text, to objectively verify outcomes outside of generation bias;

Function: enables lightweight, elegant verification of execution results without requiring fully verifiable rewards.
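A verification hook of this kind can be sketched in a few lines. The record format (`model_claim`, `tool_results`, `exit_code`) is an assumption; the principle is the one stated above: the gate looks only at machine-checkable tool results and ignores the model's own text entirely.

```python
# Illustrative verification hook: trust exit codes, ignore the model's
# claim of success. The record schema here is invented for the sketch.

def verify(record: dict) -> bool:
    """Pass only if every tool execution succeeded; the model-generated
    text never enters the decision, sidestepping self-preference bias."""
    tool_results = record.get("tool_results", [])
    return bool(tool_results) and all(r["exit_code"] == 0 for r in tool_results)

honest = {"model_claim": "All tests pass!",
          "tool_results": [{"cmd": "pytest", "exit_code": 0}]}
lying  = {"model_claim": "All tests pass!",
          "tool_results": [{"cmd": "pytest", "exit_code": 1}]}

assert verify(honest)
assert not verify(lying)    # identical claim, but the exit code decides
```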

Traditional RL training environments are severely disconnected from inference environments, but the harness unifies training and production: tool invocation sequences become trajectories, testing and classification gates become reward signals, and user tasks become complete episodes.
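That three-way mapping (task = episode, tool calls = trajectory, gate = reward) can be made concrete with a toy rollout. The `policy` and `gate` here are trivial stand-ins for a model and a verification hook; only the bookkeeping structure is the point.

```python
# Toy episode showing the training/production mapping: the tool-call
# sequence is the trajectory, the verification gate supplies the reward,
# and the whole user task forms one episode. All names are illustrative.

def run_episode(task, policy, gate, max_steps=5):
    trajectory = []                     # tool-call sequence = trajectory
    state = task
    for _ in range(max_steps):          # bounded rollout
        action = policy(state)
        trajectory.append(action)
        state = (state, action)
        if action == "done":
            break
    reward = 1.0 if gate(trajectory) else 0.0   # gate = reward signal
    return trajectory, reward

policy = lambda s: "done" if isinstance(s, tuple) else "edit"   # toy policy
gate = lambda traj: traj[-1] == "done"                          # toy verifier
traj, r = run_episode("fix bug", policy, gate)
assert traj == ["edit", "done"] and r == 1.0
```

Because the same loop runs in production, trajectories collected from real usage are directly in the format the trainer consumes.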

Built around these six components, post-training work falls into six core directions:

  1. System Prompt-Driven Behavior Alignment

System prompts specify task goals, token budgets, and available tool strategies, greatly constraining model behavior space. Reinforcement learning then only needs to learn optimal execution within this limited scope. We can design scoring systems based on rules in the system prompt, enabling the model to perform approximate end-to-end training on cleaner, less branched trajectories, producing stable, expected behaviors.

  2. End-to-End Long-Chain Tool Invocation Training

Abandon traditional “single-step snapshot training” in favor of full trajectory training:

  • Record each step’s results, obtain process rewards and final task rewards;

  • Focus on long-chain stability, ensuring overall accuracy over hundreds of tool calls, not just single-step correctness.
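The combination of per-step process rewards and a final task reward is standard trajectory-level credit assignment; here is a plain REINFORCE-style return computation as a hedged sketch, not Anthropic's actual recipe.

```python
# Illustrative full-trajectory credit assignment: each step carries a
# process reward, the final task reward is added at the end, and the
# discounted return at every step shares credit for the outcome.

def trajectory_returns(process_rewards, final_reward, gamma=0.99):
    """Discounted return at each step over the whole trajectory, so
    early tool calls are credited for the eventual result rather than
    judged by single-step snapshots."""
    rewards = list(process_rewards)
    rewards[-1] += final_reward          # task reward lands on the last step
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# With gamma=1.0, the first step sees the full undiscounted outcome:
rets = trajectory_returns([0.1, 0.1, 0.1], final_reward=1.0, gamma=1.0)
assert abs(rets[0] - 1.3) < 1e-9
assert abs(rets[-1] - 1.1) < 1e-9
```

Over hundreds of tool calls, it is this shared credit (rather than per-step correctness alone) that trains long-chain stability.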

  3. Plan-Execute Integrated Training

Harness eliminates noise between planning and execution:

  • Lock the tool chain in planning without extra manual intervention;

  • Use classification gates to objectively verify execution results, making reward signals clearer;

  • Enable trainable planning capabilities, avoiding crude “just execute, no planning” modes.

  4. Memory Compression Specialized Training

Treat context compression as an independent task: upstream models compress memory, downstream task performance serves as a verification standard; goal is to retain core information without affecting downstream success rates.
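The compress-then-verify-downstream pattern can be shown with a toy compressor. Both functions are invented stand-ins: a real setup would use a model as the compressor and an actual downstream task as the check, but the reward structure — compression quality measured by downstream success, not by compression ratio alone — is the one described.

```python
# Toy compression-as-a-task sketch: an upstream compressor shrinks the
# context, and a downstream check (is the needed fact still reachable?)
# serves as its verification signal. All names are illustrative.

def compress(context: list, keep_tags: set) -> list:
    """Stand-in compressor: keep only lines matching the kept topics."""
    return [line for line in context if any(t in line for t in keep_tags)]

def downstream_success(compressed: list, needed: str) -> bool:
    """Stand-in downstream task: can the required fact still be found?"""
    return any(needed in line for line in compressed)

context = ["auth: token expires in 1h",
           "ui: button is blue",
           "auth: uses JWT"]
small = compress(context, {"auth"})
assert len(small) < len(context)             # it actually compressed
assert downstream_success(small, "JWT")      # core information survived
```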

  5. Sub-Agent Collaborative Orchestration Training

For scenarios with ultra-long outputs (millions of tokens in code/documentation):

  • Main agent does not generate content directly but orchestrates sub-agents, assigning tasks and prompts;

  • Sub-agents execute in parallel and merge results, with the main agent performing verification;

  • Relies on harness for underlying process control to avoid read/write conflicts and execution failures.

  6. Multi-Objective Reinforcement Learning

Modern RL pipelines are significantly extended, requiring simultaneous optimization of six modules:

  • No hallucinations in tool invocation, accurate classification verification, effective context compression, multi-agent cooperation, rational planning, and trustworthy verification;

  • The industry is moving from algorithmic convergence to diverse approaches, with each stage requiring its own dedicated training algorithms; integrating multiple objectives is becoming a core challenge.
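The simplest form of that integration is a weighted scalarization of the per-module scores. The weights and score names below are invented for illustration; choosing them well (and whether a fixed linear mix suffices at all) is precisely the open problem the text points to.

```python
# Illustrative multi-objective reward shaping: six per-module scores
# folded into one scalar. Weights are assumptions, not tuned values.

WEIGHTS = {
    "tool_accuracy": 0.25,   # no hallucinated tool calls
    "verification":  0.20,   # accurate classification gates
    "compression":   0.15,   # effective context compression
    "cooperation":   0.15,   # multi-agent cooperation
    "planning":      0.15,   # rational planning
    "trust":         0.10,   # trustworthy self-reporting
}

def combined_reward(scores: dict) -> float:
    """Linear scalarization of the six module scores in [0, 1]."""
    assert set(scores) == set(WEIGHTS), "one score per module required"
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

perfect = {k: 1.0 for k in WEIGHTS}
assert abs(combined_reward(perfect) - 1.0) < 1e-9   # weights sum to 1
```

A fixed linear mix is only the baseline; trade-offs between objectives (e.g. planning depth vs. token cost) are why multi-objective RL here remains hard.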

This also shifts talent demand. Prompt engineering is no longer the sole core skill; mastery of the harness covers roughly 70% of the work. Hybrid talent combining AI understanding, backend engineering, and infrastructure skills will therefore be more sought after, while pure prompt engineers’ competitiveness will decline sharply.

Furthermore, market restructuring is underway. Amid competition from model vendors and vertical domain companies, only two paths remain for “model shell companies”: either possess top-tier models and infrastructure capabilities or have exclusive data/experience advantages in vertical fields (e.g., high-frequency trading, industry-specific knowledge).

Finally, genuine agent deployment is moving toward privatization, high security, and end-to-end integration. Enterprises should prioritize reusing mature harness designs, customize them for specific scenarios, and focus on security and privacy to achieve scalable commercial deployment.

The core value of the Claude Code leak is not the code itself but what it reveals: agents have entered the harness-driven era. Model capability is merely the foundation; engineering architecture, execution environment, multi-agent collaboration, and verification mechanisms determine the ceiling.
