Microsoft has released Fara-7B, a 7-billion-parameter multimodal agent model designed specifically for computer use. It takes screenshots and text as input and directly predicts its chain-of-thought reasoning and parameterized actions. The model is built on Qwen 2.5-VL, has a 128k context window, was trained on 64 H100 GPUs for 2.5 days, and is released under the MIT license. It perceives the browser through screenshots and combines reasoning with a history of prior states to predict the next action and its parameters, such as coordinates. Training relies on large-scale, fully synthetic data. The model can plan and execute higher-level tasks, and its post-training safety alignment lets it refuse inappropriate tasks and pause at critical points. It can be deployed and used through GitHub, vLLM, and fara-cli to automate web tasks.

MeNews

2026-05-27 02:42:22


AIMPACT News, May 16 (UTC+8): Microsoft has released Fara-7B, its first small language model built specifically for computer use, with 7 billion parameters. The model uses a multimodal decoder architecture that takes screenshot images and text context as input and directly predicts its chain-of-thought reasoning and parameterized actions. It is built on Qwen 2.5-VL (7B), supports a 128k context length, was trained for 2.5 days on 64 H100 GPUs, and was released under the MIT license on November 24, 2025. Fara-7B perceives the browser through screenshots and combines its internal reasoning with a history of prior states to predict the next action and its parameters (such as click coordinates). Training relies on a large-scale, fully synthetic dataset. The model can plan and execute higher-level tasks such as booking restaurants, applying for jobs, and planning trips. For safety alignment, it uses robust fine-tuning methods and is trained to recognize critical points: it can refuse seven categories of policy-violating tasks, and it pauses at critical stopping points such as entering personal information or completing a purchase. Users can deploy and interact with the model through the GitHub repository, vLLM, and the fara-cli tool, mainly to automate web tasks. (Source: InfoQ)
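The screenshot-in, action-out loop described above can be sketched roughly as follows. This is a minimal illustration, not Fara-7B's actual interface: `Action`, `Step`, `run_agent`, `scripted_policy`, and the `critical` flag are all hypothetical names, and the model call is replaced by a scripted stand-in so the control flow can be run without any model or browser.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One predicted step (hypothetical schema, not Fara-7B's real output format)."""
    name: str                                  # e.g. "click", "type", "terminate"
    args: dict = field(default_factory=dict)   # e.g. {"x": 10, "y": 20}
    thought: str = ""                          # the reasoning emitted with the action
    critical: bool = False                     # assumed flag: step needs user consent


@dataclass
class Step:
    screenshot: bytes
    action: Action


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    predict: Callable[[str, bytes, List[Step]], Action],
    execute: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 10,
) -> List[Step]:
    """Observe, predict, act; stop at a declined critical point or on 'terminate'."""
    history: List[Step] = []
    for _ in range(max_steps):
        shot = take_screenshot()                 # perceive the browser
        action = predict(task, shot, history)    # model sees task, screenshot, history
        if action.name == "terminate":
            history.append(Step(shot, action))
            break
        if action.critical and not confirm(action):
            # The user declined at a critical point: record the pause, do not act.
            history.append(Step(shot, Action("paused", thought=action.thought)))
            break
        execute(action)                          # e.g. click at the predicted coordinates
        history.append(Step(shot, action))
    return history


def scripted_policy(script: List[Action]) -> Callable[[str, bytes, List[Step]], Action]:
    """Stand-in for the model: replays a fixed list of actions, then terminates."""
    it = iter(script)

    def predict(task: str, screenshot: bytes, history: List[Step]) -> Action:
        return next(it, Action("terminate"))

    return predict
```

In a real deployment, `predict` would call the served model and `execute` would drive a browser; here both are injected as callables so the pause-at-critical-point behavior can be checked in isolation.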




BoredInBlockspace

· 4m ago

A 128k context length is indeed enough for web automation, and on long-running tasks you won't have to worry about the model forgetting earlier context.


MintConditionHuman

· 6h ago

The browser automation space is getting more competitive; after AutoGPT, here is another contender.


BlocktimeBarista

· 6h ago

Predicting coordinates matters a lot; many earlier models were embarrassingly inaccurate at locating elements.


RugCheckSkeptic

· 6h ago

Will models trained on fully synthetic data fail to generalize to real, complex pages?


QuietValidator

· 6h ago

Big thumbs up for the MIT license; finally, no commercial-restriction clauses to wade through.


ColdWalletUnderTheNeonLights

· 6h ago

How is the deployment experience with fara-cli? Can anyone who has tried it share what pitfalls they ran into?


LateBlockLarry

· 6h ago

Trained on 64 H100s in 2.5 days; that efficiency is something else. Synthetic data is doing a great job.


Microsoft releases the first 7B-parameter computer-controlled intelligent agent model Fara-7B

Trending Topics

StockTradingChallengeUpTo17000U

TrumpBacksCFTCAuthorityOverPredictionMarkets

GatePredictionMarketAddsSmartMoneyTracking

MicronMarketCapBreaks1Trillion

TradeCFDWinGold

Pinned