Over the past two years, PC manufacturers promoting "AI PCs" have kept citing one spec: NPU compute. But whether it is Intel's Lunar Lake at 45 TOPS or AMD's Strix Point at 50 TOPS, the numbers have stayed modest. These chips can handle background blur, voice denoising, and some small on-device models, but that's about it.

On May 31st, NVIDIA unveiled the RTX Spark superchip at GTC 2026, pushing this number to 1 petaflop, or 1,000 TOPS. That is not a 30% or 50% increase but a jump of more than an order of magnitude.

Several other announcements came the same day. Microsoft, working with NVIDIA on RTX Spark, upgraded Windows' native security mechanisms and integrated NVIDIA's open-source sandbox runtime, OpenShell, into the Windows platform. Adobe announced a fundamental overhaul of Photoshop and Premiere to take advantage of RTX Spark's unified memory architecture. And the first six OEMs confirmed that they will launch thin-and-light laptops and compact desktops built on the chip this fall.

What NVIDIA did at GTC 2026 was more than release a new chip. It is trying to set a new hardware standard for the category of "personal AI computers."

When GPUs become the main focus of PCs

First, let's look at the chip itself. According to data released by NVIDIA at GTC, RTX Spark integrates a Blackwell-architecture GPU with 6,144 CUDA cores and a 20-core Arm-based Grace CPU co-designed with MediaTek, manufactured on TSMC's 3nm process. The key change lies in the memory architecture: up to 128GB of unified memory, with the CPU and GPU sharing the same memory pool, so data no longer has to be copied back and forth between them.

This runs against the logic of traditional PC architecture.

The basic structure of traditional PCs is "x86 CPU as the main processor, with an optional discrete GPU." Even with the recent rise of the AI PC concept, Intel and AMD's approach has been to embed an NPU within the CPU as an additional AI acceleration module, with computing power generally in the range of 40 to 50 TOPS. The GPU remains an external add-on.

RTX Spark reverses those roles. In this SoC the GPU is the lead and the CPU plays a supporting part. NVIDIA claims 1 petaflop of FP4 AI compute, equivalent to 1,000 TOPS, which is more than 20 times the NPU performance of previous-generation AI PCs. This is not a speed boost within the same category; it is the start of a different track altogether.
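The "more than 20 times" figure can be checked with a few lines of arithmetic. The sketch below only uses the numbers quoted in this article; note that the 1 petaflop claim is an FP4 figure, while NPU TOPS are usually quoted at a higher precision such as INT8, so this is a headline comparison rather than a like-for-like benchmark.

```python
# Headline comparison of NVIDIA's claimed 1 petaflop (FP4) against the
# NPU figures quoted in the article. Precisions differ, so treat the
# ratios as rough marketing-level numbers, not measured speedups.
PETA = 10**15
TERA = 10**12

# 1 petaflop expressed in tera-operations per second
rtx_spark_tops = 1 * PETA / TERA

npu_tops = {"Intel Lunar Lake": 45, "AMD Strix Point": 50}

# Ratio of the RTX Spark claim to each NPU figure
ratios = {name: rtx_spark_tops / tops for name, tops in npu_tops.items()}

for name, ratio in ratios.items():
    print(f"RTX Spark vs {name}: {ratio:.1f}x")
```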

The speed at which OEMs have followed supports this reading. According to NVIDIA's official announcement and subsequent reports from DIGITIMES, ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI will launch thin-and-light laptops and compact desktops with RTX Spark this fall, with Acer and GIGABYTE models following later. Nearly every mainstream Windows PC brand is on board.

RTX Spark did not come out of nowhere. In early 2025, a similar Blackwell-plus-Grace chip appeared under the names Project DIGITS and DGX Spark, positioned as a Linux desktop supercomputer for developers in a small desktop enclosure. A year later, the same architecture has been squeezed into the thermal envelope of a thin-and-light notebook, the OS has switched from Linux to Windows, and the target audience has expanded from AI developers to ordinary consumers and enterprise users. This is the most notable change in the GTC 2026 consumer launch: NVIDIA is no longer shipping just a developer toy; it is opening the door to the consumer market.

Can a 120B model run locally, and is that enough?

Ultimately, the question of computing power and memory boils down to: what can it do?

NVIDIA's answer at the launch was that RTX Spark can run large models with 120 billion parameters locally, with a context window of up to one million tokens. What does 120B mean? For reference, the mainstream way to run local models on consumer hardware today is a 24GB RTX 4090, which can handle models of roughly 30B to 40B parameters once they are quantized. Smaller models, around 9B, run quickly on ordinary consumer graphics cards. The jump from 9B to 120B redefines what "enough" means for on-device AI.
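To see why these model sizes map onto these memory sizes, here is a back-of-the-envelope sketch of weight storage alone. The function name and the choice of 16-, 8-, and 4-bit weights are illustrative assumptions; real deployments also need room for the KV cache, activations, and the operating system, so actual headroom is smaller than these numbers suggest.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes).

    One billion parameters at 8 bits per weight is one GB, so the size
    scales linearly with both parameter count and bit width.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory (ignores KV cache etc.)."""
    return weight_gb(params_billion, bits_per_weight) <= memory_gb


for params in (9, 35, 120):
    for bits in (16, 8, 4):
        size = weight_gb(params, bits)
        print(
            f"{params:>4}B @ {bits:>2}-bit: {size:6.1f} GB | "
            f"24GB card: {fits(params, bits, 24)!s:5} | "
            f"128GB pool: {fits(params, bits, 128)!s:5}"
        )
```

Under these assumptions a 35B model at 4 bits needs about 17.5GB and squeezes onto a 24GB card, while a 120B model at 4 bits needs about 60GB, which only a pool on the scale of 128GB can hold.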

The 128GB unified memory is the prerequisite for all this. In traditional PC architecture, the CPU has its own system memory, and the GPU has its own VRAM, with a physical boundary between them. A large model exceeding VRAM capacity either can't run at all or requires complex model partitioning and memory swapping, which drastically reduces speed. The unified memory architecture eliminates this bottleneck, allowing model data to be directly placed into a shared pool of 128GB accessible by both CPU and GPU. Apple Silicon first demonstrated the consumer feasibility of this approach, and now NVIDIA has brought it to the Windows ecosystem.

Beyond large model inference, NVIDIA lists use cases such as 12K video editing, rendering 3D scenes larger than 90GB, and ray-traced gaming at more than 100 fps at 1440p. What these scenarios share is that they process extremely large amounts of data at once, which on a traditional PC either takes several times as long or cannot be done at all.

There is still a gap between "can run" and "runs smoothly." NVIDIA has not disclosed the actual inference speed of a 120B model on RTX Spark, nor the time to first token with a million-token context. The key metric for long-context inference speed is memory bandwidth. For reference, the DGX Spark, which also uses GB10 cores, has a measured memory bandwidth of about 301GB/s. That level of bandwidth can support 120B models, but with a million tokens of context, users might need to wait several seconds to see the first output token. The notebook version of RTX Spark may have its bandwidth limited further by power constraints.
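A common rule of thumb makes the bandwidth point concrete: when generating one token at a time, the hardware has to stream the active weights from memory for every token, so decode speed is bounded above by bandwidth divided by the bytes read per token. The sketch below applies that rule to the 301GB/s figure quoted above. The 4-bit quantization and the mixture-of-experts example with 5B active parameters are hypothetical assumptions for illustration, not disclosed RTX Spark specifications.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, weights_read_gb: float) -> float:
    """Upper bound on decode speed if each token streams the given weights once.

    Ignores KV-cache reads, compute limits, and overheads, all of which
    push the real figure lower.
    """
    return bandwidth_gb_s / weights_read_gb


BANDWIDTH_GB_S = 301.0  # DGX Spark figure quoted in the article

# Dense 120B model at 4 bits per weight: 120 * 4 / 8 = 60 GB read per token
dense_gb = 120 * 4 / 8
dense_rate = decode_tokens_per_second(BANDWIDTH_GB_S, dense_gb)

# Hypothetical mixture-of-experts model with only 5B active parameters per token
moe_active_gb = 5 * 4 / 8
moe_rate = decode_tokens_per_second(BANDWIDTH_GB_S, moe_active_gb)

print(f"dense 120B, 4-bit: at most {dense_rate:.1f} tokens/s")
print(f"MoE, 5B active, 4-bit: at most {moe_rate:.1f} tokens/s")
```

By this estimate a dense 120B model would top out at around 5 tokens per second, which is why the model's architecture and the final bandwidth of the notebook parts matter as much as the headline parameter count.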

Adding a safety cage for AI agents

The other core announcement, beyond computing power, is NVIDIA's system-level collaboration with Microsoft. Of everything in the GTC 2026 consumer launch, this may be the part that is easiest to overlook and yet matters most for the industry.

Put a 120B model on a computer and hand it to an autonomous AI agent that can operate the desktop, click buttons, and read and write files, and the security risk is no longer just "data loss" but "the agent doing things you don't want." Until that is solved, enterprises will not deploy such devices to employees.

Microsoft and NVIDIA propose two lines of defense. First, Microsoft has upgraded Windows' native security mechanisms to monitor and constrain AI agent behavior at the OS level. Second, NVIDIA has officially integrated the OpenShell runtime into Windows. According to NVIDIA's official documentation, OpenShell is an open-source sandbox runtime that provides kernel-level isolation. It defines a controlled operating scope for AI agents: inside that scope they can carry out tasks autonomously, but strict permissions keep them away from core system files, network connections, and sensitive user data.
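The idea of a "controlled operating scope" can be illustrated with a toy allow-list check. This is not OpenShell's API or configuration format, neither of which is described here; the `ALLOWED_ROOTS` policy, the example paths, and the `is_allowed` helper are all hypothetical. A real sandbox enforces such rules in the kernel, where the agent cannot bypass them, rather than in application code like this.

```python
from pathlib import PureWindowsPath

# Hypothetical policy: the only directory tree the agent may read or write.
# This is an illustration, not OpenShell's real configuration.
ALLOWED_ROOTS = [PureWindowsPath(r"C:\Users\alice\AgentWorkspace")]


def is_allowed(path: str) -> bool:
    """Return True if `path` lies inside one of the allowed roots."""
    p = PureWindowsPath(path)
    # Reject relative paths and parent-directory hops; pure paths do not
    # resolve "..", so a path containing it could escape the allowed root.
    if not p.is_absolute() or ".." in p.parts:
        return False
    # PureWindowsPath comparisons are case-insensitive, matching Windows.
    return any(p == root or root in p.parents for root in ALLOWED_ROOTS)


for candidate in (
    r"C:\Users\alice\AgentWorkspace\notes.txt",
    r"C:\Windows\System32\config\SAM",
    r"C:\Users\alice\AgentWorkspace\..\Documents\tax.pdf",
):
    verdict = "allow" if is_allowed(candidate) else "deny"
    print(f"{verdict:5} {candidate}")
```

The design point is that the default is denial: anything not explicitly inside the granted scope is refused, which is what makes agent behavior auditable for an IT department.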

This combination has clear implications for enterprise procurement. Until now, "local AI agents" have been limited to technical demos. The hardware could run them, but no security framework existed, and no enterprise IT department would put such devices on a procurement list. By inserting a standardized isolation layer between hardware and applications, NVIDIA and Microsoft turn "usable" into "manageable."

The performance overhead of OpenShell itself remains to be seen. Sandboxing typically costs some performance, but NVIDIA has not disclosed how much it affects inference speed or system responsiveness. Practical questions such as deployment complexity for enterprise IT and compatibility with existing security policies can only be answered once OEM devices are available.

Why Adobe is willing to "rebuild from the bottom"

The degree of software vendor cooperation is often a key indicator of whether a new hardware platform can establish a foothold.

Adobe's announcements during GTC are the most significant software signals in this release. According to NVIDIA's official blog and Adobe executives, Adobe has begun a fundamental overhaul of Photoshop and Premiere, specifically to adapt to RTX Spark's unified memory architecture, claiming up to 2x improvements in AI and graphics processing performance.

"Rebuilding from the bottom" means more than adding plugins or adaptation layers. On a traditional PC, the CPU and GPU each have their own memory space, so processing a huge PSD file or an 8K video timeline involves repeatedly copying data between the two, which wastes performance. RTX Spark's unified memory lets the CPU and GPU share the same 128GB space directly, which has real value for professional creators. That Adobe is willing to modify core code suggests it sees this architecture as more than a marketing gimmick.

However, neither NVIDIA nor Adobe has disclosed the baseline for this "2x" claim. Is it measured against a same-generation x86 processor with a discrete GPU, or against the previous AI PC NPU approach? The results could be very different. Until the benchmark conditions are made public, the significance of the number remains uncertain.
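A rough sense of the copy cost on a discrete-GPU PC can be had from simple division. The sketch below assumes the baseline machine connects its GPU over PCIe 4.0 x16, whose theoretical throughput is about 31.5GB/s per direction; that baseline, the working-set sizes, and the `transfer_seconds` helper are illustrative assumptions, and real transfers are slower than the theoretical link rate.

```python
# Theoretical PCIe 4.0 x16 throughput per direction, in GB/s.
# Assumed here as the link between CPU memory and a discrete GPU.
PCIE4_X16_GB_S = 31.5


def transfer_seconds(size_gb: float, link_gb_s: float = PCIE4_X16_GB_S) -> float:
    """Best-case time to copy `size_gb` of data across the link once."""
    return size_gb / link_gb_s


# Illustrative working sets: a large project file, and the 90GB 3D scene
# size NVIDIA cites as a use case.
for size_gb in (10, 90):
    one_way = transfer_seconds(size_gb)
    print(f"{size_gb:>3} GB: {one_way:.2f} s one way, {2 * one_way:.2f} s round trip")
```

With unified memory this copy step disappears, because both processors address the same pool. Whether that translates into the claimed speedup depends on how often an application actually moves data across the boundary.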

Blackmagic Design, ComfyUI, llama.cpp, OTOY, and several game developers have also announced support. Notably, ComfyUI and llama.cpp are among the most active open-source tools in today's local AI workflows. Early community support often says more about a platform's ecosystem potential than promises from large corporations.

With CUDA and a unified memory architecture, NVIDIA is building on Windows an experience similar to Apple's integrated hardware-software approach. The difference is that Apple built its walled garden itself, while NVIDIA has to persuade Microsoft and ISVs to build it together. Adobe's willingness to rework its software from the bottom up at least indicates that the first brick of this wall has been laid.

Beyond the parameters on paper

Returning to a very practical question: will these devices actually be available for purchase, and what will the experience be like?

According to NVIDIA, the first RTX Spark devices are expected to launch this fall, including lightweight laptops and compact desktops from ASUS, Dell, HP, Lenovo, Microsoft Surface, and MSI. Acer and GIGABYTE models will follow later. Specific pricing and exact release dates haven't been announced.

More critical than pricing are several physical uncertainties. How will power consumption and cooling be balanced when fitting a 1 petaflop chip into a lightweight laptop? How will RTX Spark perform in everyday office tasks and battery life outside AI scenarios? Will the 128GB unified memory's actual bandwidth in a laptop form factor be significantly reduced due to power limits?

These questions are the real tests of turning the chip into a product. A chip's peak performance in engineering prototypes often differs greatly from its everyday performance in consumers' hands. NVIDIA emphasized RTX Spark's energy efficiency during the launch but did not provide specific TDP or battery life figures.

From the perspective of the PC industry, RTX Spark signals a new division of roles. For the past three decades, x86 processor vendors have controlled the PC's core silicon, with GPUs growing in importance but still treated as "add-ons." This time, NVIDIA presents a complete SoC that integrates the CPU, GPU, and memory controller, with an Arm-based CPU co-designed with MediaTek. The power structure of the PC supply chain is shifting from "x86 CPU plus optional GPU" to "GPU-centered SoC platform."

This shift won't happen overnight. OEM pricing strategies, the energy efficiency of actual products, software ecosystem adaptation, and enterprise procurement cycles will all influence whether RTX Spark becomes a new industry benchmark or another high-profile tech demo that opens strong and then fades. At the least, the answer should be clearer by this fall.
