Microsoft releases Fara-7B, its first 7B-parameter computer-use agent model

AIMPACT News, May 16 (UTC+8): Microsoft has released Fara-7B, its first small language model built specifically for computer use, with 7 billion parameters. The model uses a multimodal decoder architecture that takes screenshot images and text context as input and directly predicts its chain of thought and parameterized actions. It is built on Qwen 2.5-VL (7B), supports a 128k context length, was trained for 2.5 days on 64 H100 GPUs, and was released under the MIT license on November 24, 2025. Fara-7B perceives the browser through screenshots and combines its internal reasoning with a record of past states to predict the next action and its parameters, such as click coordinates. Training relies on large-scale, fully synthetic datasets. The model can plan and execute high-level tasks such as booking restaurants, applying for jobs, and planning trips. For safety alignment, it uses robust fine-tuning methods, can recognize critical points, can refuse seven categories of policy-violating tasks, and pauses at critical points such as entering personal information or completing a purchase. Users can deploy and interact with it through the GitHub repository, vLLM, and the fara-cli tool; its main application is automating web tasks. (Source: InfoQ)
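The loop described above (screenshot plus history in, thought plus parameterized action out, with a pause at critical points) can be sketched roughly as follows. This is only an illustration under stated assumptions, not Fara-7B's actual interface: the names `Step`, `Observation`, `run_agent`, and `scripted_policy` are hypothetical, and a hard-coded script stands in for the model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One predicted step (hypothetical structure, not the model's real output format)."""
    thought: str                               # internal reasoning text
    action: str                                # e.g. "click", "type", "terminate"
    args: dict = field(default_factory=dict)   # e.g. {"x": 412, "y": 88}
    critical: bool = False                     # True if the step needs user approval


@dataclass
class Observation:
    screenshot: bytes      # current browser screenshot
    task: str              # the user's instruction
    history: List[Step]    # steps taken so far


def run_agent(task: str,
              policy: Callable[[Observation], Step],
              take_screenshot: Callable[[], bytes],
              execute: Callable[[Step], None],
              approve: Callable[[Step], bool],
              max_steps: int = 10) -> List[Step]:
    """Observe, predict, and act until termination, a declined critical step, or max_steps."""
    history: List[Step] = []
    for _ in range(max_steps):
        obs = Observation(take_screenshot(), task, list(history))
        step = policy(obs)
        if step.critical and not approve(step):
            break                  # pause and hand control back to the user
        history.append(step)
        if step.action == "terminate":
            break
        execute(step)
    return history


def scripted_policy(obs: Observation) -> Step:
    """Stand-in for the model: replays a fixed script based on how many steps are done."""
    script = [
        Step("The search box is near the top.", "click", {"x": 412, "y": 88}),
        Step("Enter the restaurant name.", "type", {"text": "Luigi's"}),
        Step("Confirming requires payment details.", "click",
             {"x": 640, "y": 512}, critical=True),
    ]
    if len(obs.history) < len(script):
        return script[len(obs.history)]
    return Step("Task complete.", "terminate")
```

Calling `run_agent("Book a table", scripted_policy, lambda: b"", print, lambda step: False)` executes the first two steps and then stops at the purchase step, because the `approve` callback declines it. In a real deployment the policy would call the model and `execute` would drive a browser; both are left abstract here.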
AirdropNightwatch
· 5h ago
When it comes to browser automation, it feels like it's going head-to-head with Browser-use and Computer-use.
MintCondition
· 6h ago
Web page task automation, finally no need to write a bunch of selectors.
SaveABitOnGasFees
· 6h ago
What proportion of data was used for post-training alignment? The paper will be released soon.
GateUser-83c80dd0
· 6h ago
7B parameters for agent planning is lightweight, but the capability boundaries still need to be tested empirically.
GateUser-bee672a5
· 6h ago
Still need to test the fara-cli deployment experience; hope it doesn't turn out like some projects with poor documentation.
Half-SectionSucculent
· 6h ago
Coordinate prediction + step-by-step reasoning, fine-grained control is far better than a pure text API
0xLateCoffee
· 6h ago
128k context + screenshot awareness, this combination is quite powerful
CandleChaser
· 6h ago
Kudos for the MIT license; a 7B model can run locally now.