20B Small Model Matches GPT-5 and Opus in Search Capability: Chroma Releases Open Source Agent Search Model Context-1

According to monitoring by 1M AI News, Chroma, the company behind the open-source vector database of the same name, has released Context-1, a 20-billion-parameter agent search model built for multi-turn retrieval tasks. The model weights are released under the Apache 2.0 license, and the code for the synthetic data generation pipeline is also public. Context-1 is positioned as a retrieval subagent: rather than answering questions directly, it runs multi-turn searches and returns a set of supporting documents for a downstream reasoning model.

The core technique is ‘self-editing context’: during search, the model actively discards irrelevant document fragments, freeing space in its limited context window for subsequent searches and avoiding the performance degradation caused by context bloat.

Training proceeds in two phases. First, large models such as Kimi K2.5 generate SFT trajectories for a supervised fine-tuning warm-up; the model is then trained with reinforcement learning (based on the CISPO algorithm) on more than 8,000 synthetic tasks. The reward design uses a curriculum: it encourages broad exploration early in training and gradually shifts toward precision, promoting selective retention of documents.

The base model is gpt-oss-20b, adapted with LoRA; inference runs with MXFP4 quantization on a B200, reaching a throughput of 400-500 tokens per second.

On Chroma’s four self-built domain benchmarks (web, finance, law, email) and on public benchmarks (BrowseComp-Plus, SealQA, FRAMES, HotpotQA), the four-way parallel version of Context-1 matches or closely approaches the ‘final answer hit rate’ of frontier models such as GPT-5.2, Opus 4.5, and Sonnet 4.5. For instance, it scores 0.96 on BrowseComp-Plus, versus 0.87 for Opus 4.5 and 0.82 for GPT-5.2, at a fraction of their cost and latency.
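The self-editing idea can be illustrated with a minimal sketch: between search turns, the agent scores the fragments it has retrieved and keeps only the most relevant ones that fit a fixed token budget. All names here (`Fragment`, `prune_context`, the relevance scores) are illustrative assumptions, not Chroma's published API.

```python
# Hypothetical sketch of a "self-editing context" step: drop low-relevance
# retrieved fragments so the next search turn has room in the window.
from dataclasses import dataclass

@dataclass
class Fragment:
    text: str
    relevance: float  # assumed to be assigned by the model itself

def prune_context(fragments: list[Fragment], budget_tokens: int) -> list[Fragment]:
    """Keep the most relevant fragments that fit within the token budget."""
    kept, used = [], 0
    for frag in sorted(fragments, key=lambda f: f.relevance, reverse=True):
        cost = len(frag.text.split())  # crude stand-in for a tokenizer
        if used + cost <= budget_tokens:
            kept.append(frag)
            used += cost
    return kept
```

In a real agent loop this pruning would run after every retrieval turn, so the window holds only evidence still worth carrying forward.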
Notably, the model was trained only on web, legal, and financial data, yet it also showed significant gains in the email domain, which was excluded from training, indicating that its search capability transfers across domains.
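The curriculum reward described earlier can be sketched as a simple blend whose weight shifts from recall toward precision over training. The linear schedule and the precision/recall blend are assumptions for illustration; Chroma's actual reward function is not specified here.

```python
# Hypothetical curriculum reward: early steps weight recall (broad
# exploration), later steps weight precision (selective retention).
def curriculum_reward(precision: float, recall: float,
                      step: int, total_steps: int) -> float:
    """Blend recall and precision; alpha moves linearly from 0 to 1."""
    alpha = min(1.0, step / total_steps)
    return (1 - alpha) * recall + alpha * precision
```

Under this schedule an agent is first paid for finding as much relevant material as possible, and only later penalized for keeping documents that do not support the answer.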
