NVIDIA Ships 88-Core Vera CPU to Power Agentic AI Data Centers


TL;DR

  • New Chip: NVIDIA announced the Vera CPU, an 88-core Arm processor purpose-built for orchestrating agentic AI workloads in data centers.
  • Key Specs: Vera delivers 2.4x the memory bandwidth of its predecessor Grace, with 1.5 TB of LPDDR5X memory and Spatial Multithreading across 176 threads.
  • Platform Integration: The chip anchors a six-component Vera Rubin platform powering the NVL72 rack, which NVIDIA rates at 14.4 exaFLOPS of FP4 performance.
  • Market Context: Bank of America forecasts the data center CPU market will more than double from $27 billion to $60 billion by 2030, with AMD and Intel offering competing architectures.
  • Availability: Vera-based systems from Dell, HPE, Lenovo, and Supermicro ship in the second half of 2026, with Meta, Oracle, and Alibaba among early adopters.

NVIDIA is shipping an 88-core CPU designed not as a general-purpose processor but as a dedicated orchestration engine for agentic AI workloads.

Announced at GTC 2026 on March 16, the Vera CPU is now in full production. According to Bank of America, the data center CPU market will more than double from $27 billion to $60 billion by 2030, driven by orchestration demands that traditional processors cannot meet.

Vera CPU Architecture

At the heart of the Vera CPU are 88 custom Olympus cores built on Arm v9.2. Each core uses a wide, deep microarchitecture with improved branch prediction and prefetching, and introduces Spatial Multithreading, a technique that runs two hardware threads per core by physically partitioning resources rather than time-slicing. Across all 88 cores, Vera provides 176 threads.
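The thread count follows directly from the core count. As a rough illustration only, the Python sketch below reproduces that arithmetic and contrasts a toy model of spatial partitioning with time-slicing. The `ISSUE_SLOTS` value and both helper functions are hypothetical and chosen for illustration; they do not describe NVIDIA's actual microarchitecture.

```python
CORES = 88             # core count stated in the article
THREADS_PER_CORE = 2   # Spatial Multithreading: two hardware threads per core


def total_threads(cores: int = CORES, threads_per_core: int = THREADS_PER_CORE) -> int:
    """Hardware threads exposed across the whole chip."""
    return cores * threads_per_core


# Hypothetical number of per-cycle execution resources in one core.
# This figure is an assumption for the toy model, not a published spec.
ISSUE_SLOTS = 8


def spatial_partition(slots: int = ISSUE_SLOTS, threads: int = THREADS_PER_CORE) -> list[int]:
    """Toy model: every thread owns a fixed share of the slots in every cycle."""
    share = slots // threads
    return [share] * threads


def time_slice(cycle: int, slots: int = ISSUE_SLOTS, threads: int = THREADS_PER_CORE) -> list[int]:
    """Toy model: one thread owns all slots in a given cycle, the rest get none."""
    owner = cycle % threads
    return [slots if t == owner else 0 for t in range(threads)]


if __name__ == "__main__":
    print(total_threads())       # 176
    print(spatial_partition())   # [4, 4]
    print(time_slice(0))         # [8, 0]
    print(time_slice(1))         # [0, 8]
```

In this simplified picture, both threads make progress in every cycle under spatial partitioning, while time-slicing gives each thread the full core only on alternating cycles.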

Spatial Multithreading also exposes a run-time tradeoff between performance and efficiency, which operators can tune dynamically in multi-tenant AI factory environments.

Rather than adopting a chiplet design, NVIDIA built Vera on a single monolithic compute die. Its second-generation Scalable Coherency Fabric connects all 88 cores to a shared L3 cache and memory subsystem, sustaining over 90% of peak memory bandwidth under load. By avoiding chiplet boundaries, Vera delivers consistent latency across its entire die, a design choice that contrasts with multi-chiplet competitors from AMD and Intel.
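To put the "over 90% of peak" figure in concrete terms, the short Python sketch below computes the implied floor on sustained bandwidth for a given peak. The article does not state Vera's peak bandwidth in this section, so `EXAMPLE_PEAK_GB_S` is a placeholder value, and the function name is hypothetical.

```python
def sustained_bandwidth_floor(peak_gb_s: float, efficiency: float = 0.90) -> float:
    """Lower bound on sustained bandwidth, given a peak and a sustained fraction."""
    if peak_gb_s <= 0:
        raise ValueError("peak_gb_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return peak_gb_s * efficiency


# Placeholder peak for illustration only; not a Vera specification.
EXAMPLE_PEAK_GB_S = 1000.0

if __name__ == "__main__":
    floor = sustained_bandwidth_floor(EXAMPLE_PEAK_GB_S)
    print(f"At a peak of {EXAMPLE_PEAK_GB_S:.0f} GB/s, sustained bandwidth is at least {floor:.0f} GB/s")
```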