Microsoft releases Fara-7B, its first 7B-parameter computer-use agent model

AIMPACT News, May 16 (UTC+8): Microsoft has released Fara-7B, its first 7-billion-parameter small language model designed specifically for computer use. The model uses a multimodal decoder architecture that takes screenshot images and text context as input and directly predicts a chain of thought and parameterized actions. It is built on Qwen 2.5-VL (7B), supports a 128k-token context length, was trained for 2.5 days on 64 H100 GPUs, and was released under the MIT license on November 24, 2025. Fara-7B perceives the browser through screenshots and combines its internal reasoning with a history of prior states to predict the next action and its parameters (such as click coordinates). Training relies on large-scale, fully synthetic datasets. The model can plan and carry out high-level tasks such as booking restaurants, applying for jobs, and planning trips. For safety alignment, it uses robust fine-tuning methods, can recognize critical points, can refuse seven categories of policy-violating tasks, and pauses at critical points such as entering personal information or completing a purchase. Users can deploy and interact with the model through its GitHub repository, vLLM, and the fara-cli tool; it is mainly intended for automating web tasks. (Source: InfoQ)
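The perceive, reason, act cycle described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Action` and `Step` types, the action names, the `CRITICAL_KINDS` set, and the `run_agent` and `demo` functions are hypothetical and are not Fara-7B's actual action space or API. The model call is replaced by a scripted stub so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical action kinds that should pause for user approval
# (illustrative only; not Fara-7B's real action names).
CRITICAL_KINDS = {"enter_personal_info", "complete_purchase"}


@dataclass
class Action:
    kind: str                                  # e.g. "click", "type", "done"
    args: dict = field(default_factory=dict)   # e.g. {"x": 120, "y": 48}


@dataclass
class Step:
    thought: str    # the model's reasoning for this step
    action: Action  # the predicted action with its parameters


def run_agent(
    task: str,
    predict: Callable[[str, bytes, List[Step]], Step],
    take_screenshot: Callable[[], bytes],
    execute: Callable[[Action], None],
    confirm: Callable[[Step], bool],
    max_steps: int = 10,
) -> List[Step]:
    """Loop: screenshot -> predict thought + action -> (maybe pause) -> execute."""
    history: List[Step] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        step = predict(task, screenshot, history)
        if step.action.kind == "done":
            history.append(step)
            break
        # Pause at a critical point and stop if the user does not approve.
        if step.action.kind in CRITICAL_KINDS and not confirm(step):
            break
        execute(step.action)
        history.append(step)
    return history


def demo(approve: bool):
    """Run the loop against a scripted stand-in for the model."""
    script = [
        Step("The search box is at the top.", Action("click", {"x": 120, "y": 48})),
        Step("Type the query.", Action("type", {"text": "table for two"})),
        Step("Checkout requires payment.", Action("complete_purchase")),
        Step("Task finished.", Action("done")),
    ]
    executed: List[Action] = []

    def predict(task: str, screenshot: bytes, history: List[Step]) -> Step:
        return script[len(history)]  # stub: replay the next scripted step

    history = run_agent(
        "book a restaurant",
        predict,
        take_screenshot=lambda: b"",   # stub: no real browser
        execute=executed.append,       # stub: record instead of acting
        confirm=lambda step: approve,
    )
    return history, executed


if __name__ == "__main__":
    history, executed = demo(approve=False)
    print([a.kind for a in executed])
```

With `approve=False` the loop stops before the purchase step, which mirrors the pause-at-critical-points behavior in spirit only. A real deployment would replace the stubs with a browser driver and a call to a served model.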
Comment
Pragmatists
· 3h ago
With only 7B parameters, inference costs are manageable, and small to medium teams can also participate.
ReflectionsOnTheStreetCorner
· 6h ago
A 7B multimodal agent; local-deployment enthusiasts are thrilled.
YieldTuningFork
· 6h ago
Microsoft's open-source game is on full display; the MIT license is really appealing.
OracleSkeptic
· 6h ago
Training on fully synthetic data is quite interesting; I’ve figured out how to close the data loop.
TheProphetOfToast
· 6h ago
Built on Qwen 2.5-VL—China-made base models are really delivering!