Cua open-sources a macOS background computer-use driver that reverse-engineers Apple’s private framework so agents can control applications without taking over the cursor

AIMPACT News, April 24 (UTC+8): according to monitoring by Dongcha Beating, the open-source computer-use infrastructure project Cua has released cua-driver, a native macOS driver that lets any agent control Mac applications in the background. While the agent clicks, types, or takes screenshots, the user's cursor does not move, focus does not change, and macOS does not switch between Spaces.

The core technique comes from reverse engineering Apple's private SkyLight framework. Conventional synthetic events posted with CGEventPost travel through the HID event stream and therefore move the cursor; CGEvent.postToPid can deliver events directly to a process, but Chromium's renderer process filters them out. cua-driver instead uses SkyLight's SLEventPostToPid to send events through the WindowServer's trusted channel, bypassing HID, so Chromium receives them as well. Window activation borrows from the window manager yabai: SLPSPostEventRecordTo flips only the AppKit activation state of the target application without raising its window, so macOS does not follow the window to another Space. For Electron apps (Slack, VS Code, Discord, and others), the driver uses the undocumented _AXObserverAddNotificationAndCheckRemote to keep the accessibility tree up to date even when the windows are obscured.

cua-driver provides three capture modes: ax mode returns only the accessibility tree and does not require screen recording permission; vision mode returns only screenshots; som mode (the default) returns both, so the agent can click either by element index or by pixel coordinates. The driver supports the MCP protocol and can be integrated with clients such as Claude Code and Cursor, or invoked from the command line.

There are two known limitations: right-clicking on Chromium web content does not work, and canvas-based applications (Blender, Unity, game engines) still require brief foreground activation. After OpenAI acquired Sky, the team formerly behind Apple Shortcuts, Codex was the first to ship background computer use, but that implementation was not open-sourced.
Cua's Francesco Bonacci said that a background computer-use driver should be general-purpose infrastructure rather than a feature exclusive to a single product. (Source: BlockBeats)
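
To make the three capture modes more concrete, here is a minimal Python sketch of how an agent-side client might model them and turn a click request into pixel coordinates. It is an illustration only: every name in it (AXElement, Capture, make_capture, resolve_click) is hypothetical and is not taken from cua-driver, whose actual interface the report does not describe. The assumption that an element-index click targets the center of the element's frame is also ours.

```python
from dataclasses import dataclass
from typing import Optional

# NOTE: all names below are hypothetical illustrations, not cua-driver's API.


@dataclass
class AXElement:
    index: int                          # position in the flattened accessibility tree
    role: str                           # e.g. "AXButton"
    label: str
    frame: tuple[int, int, int, int]    # x, y, width, height in pixels


@dataclass
class Capture:
    mode: str                               # "ax", "vision", or "som"
    elements: Optional[list[AXElement]]     # present in ax and som modes
    screenshot: Optional[bytes]             # present in vision and som modes


def make_capture(mode: str, elements: list[AXElement], screenshot: bytes) -> Capture:
    """Keep only the parts of a capture that the given mode returns."""
    if mode == "ax":
        return Capture(mode, elements, None)        # tree only
    if mode == "vision":
        return Capture(mode, None, screenshot)      # screenshot only
    if mode == "som":
        return Capture(mode, elements, screenshot)  # both
    raise ValueError(f"unknown capture mode: {mode}")


def resolve_click(capture: Capture, *, index: Optional[int] = None,
                  point: Optional[tuple[int, int]] = None) -> tuple[int, int]:
    """Return pixel coordinates for a click given an element index or a point."""
    if (index is None) == (point is None):
        raise ValueError("pass exactly one of index or point")
    if point is not None:
        return point  # pixel coordinates need no lookup
    if capture.elements is None:
        raise ValueError(f"{capture.mode} mode has no accessibility tree")
    for element in capture.elements:
        if element.index == index:
            x, y, width, height = element.frame
            # Assumption: click the center of the element's frame.
            return (x + width // 2, y + height // 2)
    raise KeyError(index)


if __name__ == "__main__":
    tree = [AXElement(0, "AXButton", "Send", (10, 20, 100, 40))]
    capture = make_capture("som", tree, b"fake-png-bytes")
    print(resolve_click(capture, index=0))       # (60, 40)
    print(resolve_click(capture, point=(5, 6)))  # (5, 6)
```

In this sketch an ax capture still supports index-based clicks because it carries the tree, while a vision capture accepts only pixel coordinates. That matches the trade-off in the report, where ax mode avoids the screen recording permission at the cost of having no image.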