Microsoft has released Fara-7B, a 7-billion-parameter multimodal agent model designed specifically for computer use. It takes screenshots and text as input and directly predicts a chain of thought and parameterized actions. The model is built on Qwen 2.5-VL, has a 128k context window, was trained for 2.5 days on 64 H100 GPUs, and is released under the MIT license. It perceives the browser through screenshots and combines reasoning with the history of previous states to predict the next action and its parameters, such as coordinates. Training relies on large-scale, fully synthetic data. The model can plan and execute advanced tasks, and its post-training safety alignment lets it refuse inappropriate tasks and pause at critical points. It can be deployed and used through GitHub, vLLM, and fara-cli to automate web tasks.

MeNews

2026-05-27 04:06:37

AIMPACT News, May 16 (UTC+8): Microsoft has released Fara-7B, its first 7-billion-parameter small language model designed specifically for computer use. The model uses a multimodal decoder architecture that takes screenshot images and text context as input and directly predicts a chain of thought and parameterized actions. It is built on Qwen 2.5-VL (7B), supports a 128k context length, was trained over 2.5 days on 64 H100 GPUs, and was released under the MIT license on November 24, 2025. Fara-7B perceives the browser through screenshots and combines internal reasoning with a record of previous states to predict the next action and its parameters (such as click coordinates). Training relies on large-scale, fully synthetic datasets. The model can plan and execute advanced tasks such as booking restaurants, applying for jobs, and planning trips. For safety alignment, it uses robust fine-tuning methods, can recognize critical points, can refuse seven categories of policy-violating tasks, and pauses at critical stopping points such as entering personal information or completing a purchase. Users can deploy and interact with the model through its GitHub repository, vLLM, and the fara-cli tool, mainly for automating web tasks. (Source: InfoQ)
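The screenshot-to-action cycle described above can be sketched roughly as follows. This is an illustration only, not Fara-7B's actual interface: the `Step` fields, the action names in `CRITICAL_ACTIONS`, and the `run_agent` helper are all hypothetical, and a scripted stand-in replaces the model so the loop can run without a browser or GPU.

```python
from dataclasses import dataclass, field

# Hypothetical action names; the article does not give Fara-7B's real action schema.
CRITICAL_ACTIONS = {"enter_personal_info", "complete_purchase"}


@dataclass
class Step:
    thought: str                               # reasoning predicted for this step
    action: str                                # e.g. "click", "type", "done"
    args: dict = field(default_factory=dict)   # e.g. {"x": 412, "y": 230}


def run_agent(task, policy, take_screenshot, execute, confirm, max_steps=20):
    """Observe, predict, act; pause at critical points unless the user confirms."""
    history = []
    for _ in range(max_steps):
        screenshot = take_screenshot()            # perceive the browser
        step = policy(task, screenshot, history)  # predict thought + action + args
        if step.action == "done":
            history.append(step)
            break
        if step.action in CRITICAL_ACTIONS and not confirm(step):
            break                                 # stop and hand control back to the user
        execute(step)
        history.append(step)
    return history


def scripted_policy(task, screenshot, history):
    """Stand-in for the model: replays a fixed script, one step per call."""
    script = [
        Step("Open the search box", "click", {"x": 412, "y": 230}),
        Step("Checkout needs user approval", "complete_purchase"),
        Step("Task finished", "done"),
    ]
    return script[len(history)]


executed = []
history = run_agent(
    task="buy a book",
    policy=scripted_policy,
    take_screenshot=lambda: b"",      # placeholder screenshot bytes
    execute=executed.append,          # record actions instead of driving a browser
    confirm=lambda step: False,       # the user declines, so the agent pauses
)
print([s.action for s in history])
```

In a real deployment the `policy` callable would wrap a call to the served model and `execute` would drive a browser; both are left abstract here because the article does not describe those interfaces.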

GateUser-16838403

· 51m ago

A 2.5-day training cycle? Microsoft's efficiency is a bit terrifying.

GateUser-53a6e1a8

· 5h ago

Safety alignment that can refuse policy-violating tasks makes it more reliable than AutoGPT in this regard.

TheBluePeony'sProphecy

· 5h ago

Qwen 2.5-VL has a solid foundation, but the multi-modal Agent track is going crazy.

SeaSaltFlavorAirdrop

· 5h ago

In web automation, the Frankenstein patchwork of Playwright + LLMs is putting its creators out of a job.

GateUser-4bd1cc87

· 5h ago

The MIT license is welcome, and a 7B-parameter model can now run locally.

GlassCityAfterTheRain

· 5h ago

Is deploying fara-cli simple? Is there a Docker image available?

GateUser-8da82d63

· 5h ago

It was trained on fully synthetic data, so its generalization ability is questionable. Waiting for real-world testing.

LateAlphaCourier

· 5h ago

128k context should be enough for me to fit the entire webpage inside.

AirdropUnderTheNeonBridge

· 5h ago

Screenshot + text directly predict coordinates, browser automation is about to change.

CandleChaser

· 5h ago

Running 64 H100s for two and a half days... I can't even calculate the cost.

Microsoft releases Fara-7B, its first 7B-parameter computer-use agent model