NVIDIA GPU Roadmap 2026-2030

NVIDIA confirmed at GTC 2025 and reaffirmed across CES 2026, GTC 2026, and Computex 2026 that it has moved to an annual architecture cadence: Blackwell now, Rubin H2 2026, Rubin Ultra H2 2027, Feynman 2028, Rosa Feynman 2029-2030. This page is the full timeline by tier — datacenter, workstation pro, consumer, and PC platform — with confirmed dates from NVIDIA’s own announcements clearly separated from rumored timelines from industry leakers.

Reading This Roadmap

Every entry below carries one of three tags:

  • Confirmed — Officially announced by NVIDIA with a public timeline.
  • Rumored — Reported by industry leakers (kopite7kimi and others) and covered by Tom’s Hardware, VideoCardz, KitGuru, or similar; not confirmed by NVIDIA.
  • Projected — Based on historical NVIDIA product cycles; not announced or rumored, but a defensible estimate.

Current State: Mid-2026

As of June 2026, the NVIDIA lineup shipping in volume:

Tier | Architecture | Top Product
Datacenter (rack-scale) | Blackwell Ultra | GB300 NVL72: 1.1 EFLOPS FP4 per rack, 130 TB/s NVLink, 132-140 kW per rack
Datacenter (SXM) | Blackwell Ultra, Blackwell, Hopper | B300 (288GB HBM3e), B200 (192GB HBM3e), H200 (141GB HBM3e), H100 SXM5 (80GB HBM3)
Workstation Pro | Blackwell | RTX PRO 6000 Blackwell: 96GB GDDR7 ECC, 1.79 TB/s, 24,064 CUDA, 752 Tensor, 600W
Consumer Flagship | Blackwell | RTX 5090: 32GB GDDR7, 1.79 TB/s, 575W
RTX Spark (mini-PC/laptop) | Grace Blackwell | Launching Fall 2026: Grace Blackwell Spark with up to 128GB LPDDR5X unified memory

Blackwell Ultra B300 is the current production datacenter GPU, with 288GB HBM3e (50% more than B200), 8 TB/s memory bandwidth, 15 PFLOPS dense FP4 compute, and 1,400W TDP. DGX B300 systems are shipping with 8-12 week lead times. RTX PRO 6000 Blackwell is the current top professional workstation GPU. Hopper (H100, H200) remains in production for customers with existing Hopper infrastructure and software stacks.

H2 2026: Vera Rubin Launches [Confirmed]

Vera Rubin Platform (Datacenter)

At GTC Taipei on June 1, 2026, NVIDIA confirmed that Vera Rubin had entered full production. Partner availability begins H2 2026.

  • Rubin GPU: 336B transistors, TSMC N3, 288GB HBM4, 22 TB/s, 50 PFLOPS NVFP4 inference, 35 PFLOPS NVFP4 training, NVLink 6 at 3.6 TB/s
  • Vera CPU: 88 Olympus ARM cores, 176 threads, 227B transistors, NVLink-C2C 1.8 TB/s
  • Vera Rubin NVL72 rack: 72 Rubin GPUs, 36 Vera CPUs, 260 TB/s aggregate NVLink, 3.6 EFLOPS NVFP4 inference, 2.5 EFLOPS training, 100% liquid cooled
  • First cloud deployments: AWS, Google Cloud, Microsoft Azure, Oracle Cloud, CoreWeave, Lambda, Nebius, Nscale
  • HBM4 suppliers (NVIDIA-certified June 2026): SK Hynix, Samsung, Micron
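
The rack-level figures in the list above follow from the per-GPU numbers multiplied across 72 GPUs. A minimal sketch that reproduces them, assuming ideal linear scaling with no interconnect or utilization losses; the function and variable names are illustrative, not part of any NVIDIA tool:

```python
# Per-GPU Rubin figures as quoted in the list above.
GPUS_PER_RACK = 72
RUBIN_PER_GPU = {
    "nvfp4_inference_pflops": 50,
    "nvfp4_training_pflops": 35,
    "nvlink_tb_s": 3.6,
}

def rack_totals(per_gpu, gpu_count):
    """Scale per-GPU figures linearly to a rack (assumes ideal scaling)."""
    return {name: value * gpu_count for name, value in per_gpu.items()}

totals = rack_totals(RUBIN_PER_GPU, GPUS_PER_RACK)

# 1 EFLOPS = 1000 PFLOPS
inference_eflops = totals["nvfp4_inference_pflops"] / 1000
training_eflops = totals["nvfp4_training_pflops"] / 1000
nvlink_tb_s = totals["nvlink_tb_s"]

print(f"Inference: {inference_eflops:.2f} EFLOPS NVFP4")
print(f"Training:  {training_eflops:.2f} EFLOPS NVFP4")
print(f"NVLink:    {nvlink_tb_s:.1f} TB/s aggregate")
```

The results (3.60 EFLOPS, 2.52 EFLOPS, 259.2 TB/s) match the rounded rack figures of 3.6 EFLOPS, 2.5 EFLOPS, and 260 TB/s quoted above.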

For deep technical specifications, see NVIDIA Vera Rubin Architecture Explained.

Vera Rubin NVL144 CPX (Specialized Inference) [Confirmed]

NVIDIA Rubin CPX is a specialized GPU class with 128GB of GDDR7 memory, optimized for the compute-bound prefill phase of million-token context inference. It is paired with standard Rubin GPUs in the NVL144 CPX rack.

  • Per-rack compute: 8 exaflops NVFP4 (7.5x GB300 NVL72)
  • Per-rack memory: 100TB
  • Per-rack memory bandwidth: 1.7 PB/s
  • Target workloads: Million-token coding, generative video, agentic AI
  • Availability: End of 2026 per NVIDIA Newsroom
  • Early AI partners: Cursor, Runway, Magic

2026-2027 Window: RTX Spark Generation Transition [Confirmed (Roadmap)]

RTX Spark Platform

NVIDIA confirmed at Computex 2026 that its RTX Spark platform (Windows on ARM mini-PCs and laptops) will follow a multi-generation roadmap: Grace Blackwell Spark (Fall 2026), Vera Rubin Spark (2027-2028), Rosa Feynman Spark (2029-2030).

  • Grace Blackwell Spark (Fall 2026): 20 ARM CPU cores, RTX 5070-class graphics, up to 128GB LPDDR5X unified memory
  • Vera Rubin Spark (2027-2028): Vera CPU + Rubin GPU, LPDDR6 memory
  • Rosa Feynman Spark (2029-2030): Rosa CPU + Feynman GPU

DGX Station for Windows, based on the GB300 Grace Blackwell Ultra Desktop Superchip with 748GB coherent memory and up to 20 PFLOPS FP4 performance, is expected Q4 2026 from ASUS and other OEMs.

H2 2027: Rubin Ultra (Kyber Rack) [Confirmed]

Rubin Ultra Platform

NVIDIA showcased Rubin Ultra hardware at GTC 2026, where CEO Jensen Huang demonstrated a system featuring 1TB of HBM4e memory on the new Kyber rack, according to coverage from Tom’s Hardware, DCD, and VideoCardz.

  • Per-GPU package: Four reticle-sized compute dies (vs two on standard Rubin), ~100 PFLOPS NVFP4, 1TB HBM4e, ~32 TB/s memory bandwidth, 3.6 kW per package
  • Vera CPU: Continues (88 Olympus ARM cores)
  • NVLink generation: NVLink 7 (~10.8 TB/s per GPU, 3x the 3.6 TB/s of NVLink 6)
  • Kyber NVL576 rack: 144 quad-die Rubin Ultra packages = 576 GPU compute dies, ~600 kW per rack, 800V DC distribution, vertical-tray modular design
  • Per-rack performance: 15 exaflops FP4 inference, 5 exaflops FP8 training, 365TB total memory, 4.6 PB/s HBM4e bandwidth
  • Networking: ConnectX-9 with 115.2 TB/s between racks
  • Cooling: Direct liquid required at 600kW per rack

Rubin Ultra is designed for hyperscaler AI factory deployment. The Kyber rack architecture is incompatible with traditional Blackwell NVL72 infrastructure and requires new facility power and cooling design.
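
To see why the Kyber rack needs new facility design, it helps to work through the rack arithmetic from the per-package figures quoted above. This is a rough sketch: the constant names are illustrative, and the "remaining power" line is an inference from the quoted numbers, not an NVIDIA-published breakdown.

```python
# Kyber NVL576 figures as quoted in the list above (approximate values).
PACKAGES_PER_RACK = 144
DIES_PER_PACKAGE = 4
PACKAGE_POWER_KW = 3.6
PACKAGE_HBM_TB_S = 32
RACK_POWER_KW = 600

compute_dies = PACKAGES_PER_RACK * DIES_PER_PACKAGE

# Power drawn by GPU packages alone, and their share of the rack budget.
gpu_power_kw = PACKAGES_PER_RACK * PACKAGE_POWER_KW
gpu_power_share = gpu_power_kw / RACK_POWER_KW
other_power_kw = RACK_POWER_KW - gpu_power_kw

# Aggregate HBM4e bandwidth in PB/s (1 PB/s = 1000 TB/s).
hbm_bandwidth_pb_s = PACKAGES_PER_RACK * PACKAGE_HBM_TB_S / 1000

print(f"GPU compute dies:      {compute_dies}")
print(f"GPU package power:     {gpu_power_kw:.1f} kW ({gpu_power_share:.0%} of rack)")
print(f"Left for everything else: {other_power_kw:.1f} kW")
print(f"HBM4e bandwidth:       {hbm_bandwidth_pb_s:.2f} PB/s")
```

On these figures the GPU packages alone account for roughly 518 kW, which is several times the 132-140 kW of an entire GB300 NVL72 rack.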

2027: Consumer RTX 60-Series [Rumored]

RTX 60-Series (GR20x Family)

This is a rumor, not confirmed by NVIDIA. Leaker kopite7kimi reported in January 2026 that NVIDIA will use the Rubin architecture for consumer RTX 60-series cards under the GR20x die family naming convention, targeting an H2 2027 launch.

  • Source: kopite7kimi on X (January 2026), reported by Tom’s Hardware, KitGuru, VideoCardz, TechPowerUp
  • Rumored architecture: Rubin (TSMC N3)
  • Rumored top SKU: RTX 6090 with ~40% rendering uplift over RTX 5090
  • Rumored timeline: H2 2027, possibly slipping to late 2027 / early 2028
  • Cited drivers of delay: AI datacenter demand for TSMC N3 wafers, HBM supply constraints, RTX 50 SUPER refresh shelved

NVIDIA has not officially announced any consumer GPU launch beyond RTX 50-series. CES 2026 and GTC 2026 keynotes did not include consumer GPU announcements. Until NVIDIA confirms a product and timeline, RTX 60-series remains rumor.

2027-2028: RTX PRO Rubin Workstation Card [Projected]

RTX PRO Workstation Successor

NVIDIA has not announced an RTX PRO workstation card based on Rubin. The following is a projection based on historical NVIDIA product cycle patterns, not a rumor or a confirmation.

  • Historical pattern: NVIDIA’s RTX PRO workstation cards follow consumer flagship by 3 to 9 months (e.g., RTX 5090 launched January 2025, RTX PRO 6000 Blackwell launched early 2025)
  • If consumer Rubin (RTX 60-series) ships H2 2027 as rumored, a corresponding RTX PRO Rubin workstation card would realistically land late 2027 to mid-2028
  • Projected positioning: Likely 96GB+ HBM4 or GDDR7 ECC, datacenter-rated for sustained 24/7 operation, with full RTX PRO driver validation
  • Likely available channels: Workstation OEMs, custom system integrators including VRLA Tech
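
The projection above is simple date arithmetic: take the rumored consumer launch window and add the historical 3 to 9 month lag. A minimal sketch, assuming "H2 2027" means a launch month anywhere from July through December 2027; the helper names are hypothetical and the output is only as reliable as the rumor it starts from.

```python
from datetime import date

def add_months(d, months):
    """Return the first day of the month that is `months` after d's month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def follow_on_window(consumer_earliest, consumer_latest, min_lag=3, max_lag=9):
    """Earliest and latest workstation launch months given a consumer window."""
    return (
        add_months(consumer_earliest, min_lag),
        add_months(consumer_latest, max_lag),
    )

# Assumption: H2 2027 is read as July 2027 through December 2027.
earliest, latest = follow_on_window(date(2027, 7, 1), date(2027, 12, 1))
print(earliest.isoformat(), "to", latest.isoformat())
```

Under these assumptions the widest window runs from October 2027 to September 2028. The "late 2027 to mid-2028" estimate sits in the earlier part of that range, and any slip in the consumer launch pushes the whole window later.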

For current workstation procurement, the RTX PRO 6000 Blackwell remains the top option. VRLA Tech configures Threadripper PRO 9000WX workstations with RTX PRO 6000 Blackwell shipping today.

2028: Feynman [Confirmed]

Feynman Architecture

NVIDIA confirmed Feynman, named after physicist Richard Feynman, as the architecture following Rubin Ultra. It was announced at GTC 2025, with additional details shared at GTC 2026.

  • Successor to: Rubin Ultra
  • Paired CPU: Rosa (next-generation custom NVIDIA ARM CPU)
  • Confirmed technology directions: Advanced 3D stacking for higher die density, LP40 memory, BlueField-5 DPU, NVLink-8 interconnect, “custom HBM” (likely successor to HBM4e)
  • Release window: 2028 (per NVIDIA roadmap shown at GTC 2025 and GTC 2026)

Detailed specifications are not yet publicly disclosed. Feynman represents NVIDIA’s commitment to annual architectural cadence through 2028.

2029-2030: Rosa Feynman [Confirmed (Roadmap)]

Rosa Feynman Generation

Per VideoCardz and OC3D coverage of the NVIDIA Computex 2026 roadmap slide, the architecture following Feynman is Rosa Feynman, scheduled for 2029-2030.

  • Successor to: Feynman
  • Paired CPU: Rosa
  • Release window: 2029-2030
  • Public detail: Limited as of mid-2026; appears on roadmap slides but without architectural specifics

The Full Timeline at a Glance

Year | Datacenter | Workstation Pro | Consumer
2026 (current) | Blackwell Ultra (B300 shipping) | RTX PRO 6000 Blackwell (shipping) | RTX 5090 (shipping)
H2 2026 | Vera Rubin NVL72 (partner ramp) | RTX PRO 6000 Blackwell (continues) | RTX 50-series (continues)
End 2026 | Vera Rubin NVL144 CPX | |
H2 2027 | Rubin Ultra (Kyber NVL576) | RTX PRO Rubin (projected, not confirmed) | RTX 60-series (rumored)
2028 | Feynman | RTX PRO Feynman (projected) | Consumer Feynman (projected)
2029-2030 | Rosa Feynman | |

Capital Planning by Buyer Type

Enterprise on-prem AI infrastructure

For deployments through Q2 2027: Blackwell B300 or B200 SXM in EPYC GPU servers. VRLA Tech configures EPYC 4U servers with B300, B200, H200, or H100 SXM today. A 3 to 5 year operational lifespan carries hardware deployed in mid-2026 through 2029 to 2031.

For deployments planned H2 2027 and later: evaluate Rubin SXM in conventional rack form factors. Rubin Ultra Kyber racks (600kW) are hyperscaler-scale and unlikely to fit most enterprise facility power and cooling capacity.
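
A quick way to test the facility-fit claim is to divide available IT power by per-rack draw. The sketch below uses the rack power figures quoted earlier on this page; the 1 MW facility budget and the 20% headroom reserve are hypothetical inputs chosen for illustration, and power is only one constraint alongside liquid cooling and 800V DC distribution.

```python
import math

# Per-rack power as quoted earlier on this page.
RACK_POWER_KW = {
    "GB300 NVL72": 140,   # upper end of the 132-140 kW range
    "Kyber NVL576": 600,  # approximate Rubin Ultra figure
}

def racks_supported(facility_it_power_kw, rack_kw, headroom=0.2):
    """Whole racks that fit in the IT power budget after reserving headroom."""
    usable_kw = facility_it_power_kw * (1 - headroom)
    return math.floor(usable_kw / rack_kw)

# Hypothetical facility: 1 MW of IT power with 20% held in reserve.
FACILITY_KW = 1000
fit = {name: racks_supported(FACILITY_KW, kw) for name, kw in RACK_POWER_KW.items()}

for name, count in fit.items():
    print(f"{name}: {count} rack(s)")
```

With these inputs the same facility that holds five GB300 NVL72 racks has room for a single Kyber rack, which illustrates why Kyber is positioned as hyperscaler infrastructure.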

Workstation procurement (research, development, professional)

RTX PRO 6000 Blackwell now. No Rubin-based workstation card is announced. Waiting for one means waiting 18 to 24 months minimum for hardware that has not been confirmed to exist. VRLA Tech builds RTX PRO 6000 Blackwell workstations on Threadripper PRO 9000WX and EPYC 9005 today. Plan a 3 to 4 year operational lifespan, refreshing in 2029 or 2030 to whatever workstation card NVIDIA ships in that window.

AI startups and scale-up companies

Blackwell hardware shipping today serves 7B to 70B model workloads with substantial headroom. Time-to-revenue outweighs marginal performance gains from waiting. See the VRLA Tech AI ROI Calculator for cloud-vs-on-prem economics.
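
The headroom claim can be sanity-checked with a weights-only memory estimate: parameter count times bytes per parameter. This is a rough sketch under stated assumptions: it ignores KV cache, activations, and quantization scale factors, all of which add to the real footprint, and the function name is illustrative. GPU capacities are the ones quoted on this page.

```python
# Bytes per parameter for common weight precisions (weights only).
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

# Memory capacities in GB as quoted on this page.
GPU_MEMORY_GB = {"RTX PRO 6000 Blackwell": 96, "B200": 192, "B300": 288}

def weight_memory_gb(params_billion, precision):
    """Weights-only footprint in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]

def gpus_that_fit(params_billion, precision):
    """Single GPUs whose memory can hold the weights alone."""
    need = weight_memory_gb(params_billion, precision)
    return [gpu for gpu, cap in GPU_MEMORY_GB.items() if need <= cap]

for params in (7, 70):
    for precision in BYTES_PER_PARAM:
        need = weight_memory_gb(params, precision)
        fits = ", ".join(gpus_that_fit(params, precision))
        print(f"{params}B @ {precision}: {need:.1f} GB -> {fits}")
```

A 70B model at FP16 needs about 140 GB for weights alone, which exceeds a single 96GB workstation card but fits on one B200 or B300; at FP8 or FP4 it fits on all three.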

Research labs and universities

Procurement timelines align to grant cycles. For grants deploying H2 2026 to Q1 2027, Blackwell is the right specification. For grants deploying H2 2027 or 2028, Rubin or Rubin Ultra becomes appropriate. VRLA Tech serves federal and university clients on long-cycle procurements; see HPC servers for research labs.

Software Compatibility Across the Roadmap

NVIDIA’s CUDA Toolkit, cuDNN, NCCL, and TensorRT-LLM, together with frameworks such as vLLM, PyTorch, JAX, and TensorFlow, maintain backward compatibility across multiple GPU generations. Code written for Hopper runs on Blackwell, and code written for Blackwell is expected to run on Rubin. New architecture-specific optimizations land in subsequent CUDA versions, but baseline workloads continue running.

This means a Blackwell B300 system purchased in mid-2026 should continue running production AI workloads through its operational lifespan with regular CUDA driver and framework updates. The “stuck on outdated software” risk of buying current-generation hardware is low.
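
In practice, CUDA compatibility is keyed to each GPU's compute capability (for example 9.0 for H100 and 10.0 for B200), so a useful first step when planning a mixed-generation fleet is to inventory what is installed. The sketch below parses the CSV that `nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader` prints on recent drivers; the helper names are hypothetical, the sample text is illustrative rather than captured from a real system, and older drivers may not support the `compute_cap` field.

```python
import shutil
import subprocess

def parse_compute_caps(csv_text):
    """Parse 'name, compute_cap' rows into (name, (major, minor)) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so GPU names containing commas stay intact.
        name, cap = (field.strip() for field in line.rsplit(",", 1))
        major, minor = (int(part) for part in cap.split("."))
        gpus.append((name, (major, minor)))
    return gpus

def query_local_gpus():
    """Return installed GPUs, or an empty list if nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_compute_caps(result.stdout)

# Illustrative sample text, not output captured from a real machine.
sample = "NVIDIA H100 80GB HBM3, 9.0\nNVIDIA B200, 10.0\n"
parsed = parse_compute_caps(sample)
for name, cap in parsed:
    print(f"{name}: compute capability {cap[0]}.{cap[1]}")
```

Calling `query_local_gpus()` on a build host gives the list to compare against the architectures a given framework release was compiled for.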

VRLA Tech configures for the present and the next 5 years

For 2026-2027 deployments, VRLA Tech ships Blackwell B300, B200, H200, H100 SXM EPYC GPU servers and RTX PRO 6000 Blackwell workstations today.

For 2027-2028 deployments, VRLA Tech is positioned in the Rubin and Rubin Ultra channel allocation queue with manufacturer relationships. Allocation queues at new-generation launches will affect all customers.

Plan your NVIDIA hardware roadmap with VRLA Tech

VRLA Tech engineers will spec your current deployment with Blackwell shipping today plus a Rubin and Rubin Ultra timeline for the next refresh cycle.

Frequently Asked Questions

What is the current NVIDIA GPU lineup as of mid-2026?
Datacenter: Blackwell Ultra (B300), Blackwell (B200), and Hopper (H200, H100 SXM), all shipping. Workstation Pro: RTX PRO 6000 Blackwell (96GB, top-end), RTX PRO 5000/4500/4000 Blackwell, all shipping. Consumer: RTX 50-series. Vera Rubin entered full production June 1, 2026 with partner availability H2 2026.
When does NVIDIA Rubin ship?
NVIDIA Vera Rubin entered full production at GTC Taipei on June 1, 2026. Partner availability begins H2 2026. First cloud deployments confirmed from AWS, Google Cloud, Microsoft Azure, Oracle Cloud, CoreWeave, Lambda, Nebius, and Nscale. Channel allocation for enterprise on-prem buyers extends into 2027 due to allocation queues.
When does Rubin Ultra ship?
NVIDIA Rubin Ultra is officially confirmed for H2 2027 with the Kyber NVL576 rack architecture — 576 GPU compute dies per rack across 144 quad-die Rubin Ultra packages. Each rack consumes ~600kW with 800V DC distribution. Per-GPU: ~100 PFLOPS NVFP4, 1TB HBM4e, ~32 TB/s memory bandwidth.
When does Feynman ship?
NVIDIA Feynman is officially on the roadmap for 2028 as the successor to Rubin Ultra, paired with the next-generation Rosa CPU. Feynman introduces advanced 3D stacking, LP40 memory, BlueField-5, and NVLink-8. Detailed specifications are not yet publicly disclosed.
When will consumer Rubin RTX 60-series GPUs ship?
This is rumored, not officially confirmed by NVIDIA. Leaker kopite7kimi reported in January 2026 that the RTX 60-series will use the Rubin architecture (GR20x die family) and ship in H2 2027, with some sources suggesting slippage to late 2027 or early 2028.
When will an RTX PRO workstation card based on Rubin ship?
NVIDIA has not announced an RTX PRO workstation card based on Rubin as of mid-2026. Historical pattern suggests workstation pro cards follow consumer cards by 3 to 9 months. If consumer RTX 60-series ships H2 2027 as rumored, an RTX PRO Rubin card would realistically arrive late 2027 to mid-2028. This is speculation; NVIDIA has not confirmed any timeline.

Ready to buy?

What hardware does VRLA Tech currently ship?
VRLA Tech currently ships RTX PRO 6000 Blackwell workstations on AMD Threadripper PRO 9000WX and AMD EPYC 9005 platforms; EPYC GPU servers with H100, H200, B200, and B300 SXM in 1U, 2U, and 4U configurations; and multi-platform workstations on Intel Xeon W-3500 and Intel Core Ultra. Every system includes DDR5 ECC RDIMM, 48-hour burn-in, and ships in approximately 2 weeks. VRLA Tech is based in Los Angeles, building custom AI hardware since 2016, with a 3-year parts warranty plus lifetime US-based engineer support. Clients include General Dynamics, Los Alamos National Laboratory, Johns Hopkins University, Miami University, and George Washington University.
How should I plan capital expenditure across the NVIDIA roadmap?
For Q4 2026 or H1 2027 deployments, plan around Blackwell B300 or RTX PRO 6000 Blackwell shipping today. For H2 2027 or 2028 deployments, plan around Rubin SXM or Rubin Ultra Kyber racks. For workstation refreshes, plan a 3 to 4 year cycle on RTX PRO 6000 Blackwell purchased today; the next workstation pro generation will not arrive until late 2027 to 2028 at earliest. VRLA Tech provides multi-generation procurement planning at quote.
Will VRLA Tech configure Rubin Ultra Kyber rack systems?
Rubin Ultra Kyber racks (H2 2027) are hyperscaler-scale infrastructure at 600kW per rack with 800V DC distribution. Most VRLA Tech customers deploy at workstation, 4U server, or small multi-rack scale rather than Kyber-density racks. VRLA Tech configures Rubin SXM-based EPYC GPU servers in conventional rack form factors (1U, 2U, 4U) for enterprise, research, and federal customers. For Kyber-scale deployments, VRLA Tech consults on facility power and cooling requirements.
What is the lead time on a VRLA Tech build?
Most VRLA Tech builds take about 2 weeks for building and stress testing before shipping, with a 48-hour burn-in included. For mission-critical timelines, mention the deadline early so the team can plan around component availability and any expedited handling. New-generation GPU launches carry allocation queues regardless of vendor; VRLA Tech sets realistic timeline expectations at quote. Request a quote at vrlatech.com/contact.
Should I refresh my workstation now or wait for Rubin?
Refresh now. No Rubin-based RTX PRO workstation card has been announced by NVIDIA. The current top workstation GPU is the RTX PRO 6000 Blackwell with 96GB GDDR7 ECC. VRLA Tech configures Threadripper PRO 9000WX workstations with RTX PRO 6000 Blackwell shipping today. Waiting for a Rubin workstation card means waiting 18 to 24 months minimum for hardware that has not been confirmed to exist.
What does the NVIDIA roadmap mean for software compatibility?
NVIDIA’s CUDA Toolkit, cuDNN, NCCL, and TensorRT-LLM, together with frameworks such as vLLM, PyTorch, JAX, and TensorFlow, maintain backward compatibility across multiple GPU generations. Code written for Hopper and Blackwell is expected to keep running on future Rubin hardware with standard NVIDIA driver and runtime versioning. VRLA Tech validates the full software stack on every build before shipping.

VRLA Tech updates this page as NVIDIA releases new roadmap details, generation specifications, and confirmed ship dates. Last updated June 8, 2026.
