NVIDIA Plans to Launch Next-Generation Rubin Accelerators in September

kyojuro | Tuesday, June 10, 2025

In market news, NVIDIA is set to begin sampling its next-generation Rubin AI accelerator to customers this September, just six months after the launch of the Blackwell Ultra, indicating a remarkably swift development pace. Both the Rubin R100 GPUs and the new Vera CPUs are built on TSMC's 3nm process, incorporate HBM4 memory, and adopt a chiplet design. Together these changes deliver across-the-board upgrades in performance, power efficiency, and architecture.

Rubin R100 GPUs

The Rubin R100 GPU is NVIDIA's next AI accelerator after the Blackwell architecture, designed for the escalating computational demands of data centers. Built on TSMC's N3P (3nm performance-enhanced) process, the R100 achieves a 20% boost in transistor density, a 25-30% reduction in power consumption, and a 10-30% performance increase over the 4nm process used in the Blackwell B100. This progression significantly improves the R100's performance per watt, making it well suited for intensive AI training and inference. Notably, the R100 adopts a chiplet design, which improves manufacturing yields and architectural flexibility by integrating multiple smaller dies. Its 4x-reticle design enlarges the chip area relative to Blackwell's 3.3x reticle, making room for more compute units and memory interfaces.
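As a quick back-of-envelope sketch (illustrative only, using the article's quoted ranges rather than any official efficiency metric), the combined effect of the stated performance gain and power reduction on performance per watt can be computed directly:

```python
# Illustrative arithmetic on the article's N3P-vs-4nm figures:
# +10-30% performance, -25-30% power. Not an official NVIDIA/TSMC metric.
def relative_perf_per_watt(perf_gain: float, power_reduction: float) -> float:
    """Relative performance per watt versus the older node (baseline = 1.0)."""
    return (1.0 + perf_gain) / (1.0 - power_reduction)

# Bounding cases from the quoted ranges.
low = relative_perf_per_watt(0.10, 0.25)   # ≈ 1.47x
high = relative_perf_per_watt(0.30, 0.30)  # ≈ 1.86x
print(f"N3P perf/W vs 4nm: {low:.2f}x to {high:.2f}x")
```

Even the conservative end of the quoted ranges implies a roughly 1.5x node-level efficiency gain, which is what the article's "power efficiency ratio" claim rests on.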

Regarding memory, the R100 uses eight HBM4 stacks with a total capacity of 288GB and bandwidth of up to 13TB/s, a substantial improvement over the Blackwell B100's HBM3E at roughly 8TB/s. HBM4 employs 12- or 16-high stacking with per-die densities of 24Gb or 32Gb, providing the memory headroom that large language models and complex AI reasoning demand. Additionally, the R100 uses TSMC's CoWoS-L packaging, which accommodates 100x100mm substrates and up to 12 HBM4 stacks, laying the groundwork for the later Rubin Ultra expansion. Its I/O die is fabricated on the N5B (enhanced 5nm) process, further optimizing data-transfer efficiency.
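From those figures alone, a rough per-stack breakdown follows (a sketch based only on the article's totals; real HBM4 per-stack bandwidth depends on stack height and clocks):

```python
# Rough per-stack arithmetic from the article's quoted totals; illustrative only.
R100_STACKS = 8
R100_TOTAL_BW_TBPS = 13.0   # TB/s, article's figure for R100's HBM4
B100_TOTAL_BW_TBPS = 8.0    # TB/s, article's figure for B100's HBM3E

per_stack = R100_TOTAL_BW_TBPS / R100_STACKS       # ≈ 1.63 TB/s per HBM4 stack
uplift = R100_TOTAL_BW_TBPS / B100_TOTAL_BW_TBPS   # ≈ 1.63x over B100
print(f"~{per_stack:.2f} TB/s per stack, {uplift:.2f}x total bandwidth vs B100")
```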

Vera CPU and Rubin Integration

Accompanying the Rubin GPU, the Vera CPU is a complete overhaul of the Grace CPU, built on custom ARM Olympus cores with 88 cores and 176 threads, up from Grace's 72 cores and 144 threads. Vera's memory bandwidth of 1.8TB/s is 2.4 times Grace's, and its memory capacity grows 4.2-fold, significantly improving data-handling ability. Vera connects to the Rubin GPUs over the NVLink-C2C high-speed interconnect at 1.8TB/s for efficient chip-to-chip communication. Overall performance is roughly double Grace's, making Vera well suited to AI inference, data preprocessing, and multi-threaded workloads. NVIDIA has tuned the ARM instruction set and microarchitecture so that Vera serves the backend needs of AI workloads.
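Working backwards from the article's ratios gives a sense of the implied baselines (an illustrative sketch; the derived Grace figure follows from the stated multiplier, not from an official datasheet):

```python
# Implied baselines derived from the article's ratios; illustrative only.
VERA_BW_TBPS = 1.8     # Vera memory bandwidth (article)
BW_RATIO = 2.4         # stated as 2.4x Grace's bandwidth
VERA_THREADS = 176     # 88 cores, 2 threads each (article)
GRACE_THREADS = 144    # 72 cores, 144 threads (article)

grace_bw = VERA_BW_TBPS / BW_RATIO            # → 0.75 TB/s implied for Grace
thread_uplift = VERA_THREADS / GRACE_THREADS  # ≈ 1.22x more hardware threads
print(f"implied Grace bandwidth: {grace_bw:.2f} TB/s, "
      f"thread uplift: {thread_uplift:.2f}x")
```

Note that the roughly 2x overall performance claim outpaces the ~1.2x thread-count increase, so most of the gain must come from per-core improvements and the bandwidth uplift.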

Since announcing the Rubin architecture at Computex 2024, NVIDIA has been steadily advancing its product roadmap. Rubin R100 is expected to enter volume production in Q4 2025, with the related DGX and HGX systems rolling out in the first half of 2026. In the second half of 2026, NVIDIA will unveil the Vera Rubin NVL144 platform, integrating 144 Rubin GPUs and accompanying Vera CPUs in a liquid-cooled Oberon rack consuming 600kW. The platform is to deliver 3.6 exaFLOPS of FP4 inference and 1.2 exaFLOPS of FP8 training performance, a 3.3x improvement over the Blackwell GB300 NVL72. In 2027, the Rubin Ultra NVL576 platform will host 576 Rubin Ultra GPUs, each pairing 16 HBM4e stacks with up to 1TB of memory; it is projected to deliver 15 exaFLOPS of FP4 inference and 5 exaFLOPS of FP8 training performance, a 14x improvement over GB300. It will also adopt the NVLink 7 interconnect and the ConnectX-9 NIC (1.6Tbps), which together extend system scalability.
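A quick sanity pass over those platform numbers (marketing figures, not measurements; GPU counts follow the article's naming, which may count dies rather than packages):

```python
# Sanity arithmetic on the article's platform figures; illustrative only.
NVL144_FP4_EF, NVL144_GPUS = 3.6, 144
NVL576_FP4_EF, NVL576_GPUS = 15.0, 576

per_gpu_nvl144 = NVL144_FP4_EF * 1000 / NVL144_GPUS  # 25.0 PFLOPS FP4 per GPU
per_gpu_nvl576 = NVL576_FP4_EF * 1000 / NVL576_GPUS  # ≈ 26.0 PFLOPS FP4 per GPU
gen_uplift = NVL576_FP4_EF / NVL144_FP4_EF           # ≈ 4.17x platform uplift
print(f"{per_gpu_nvl144:.1f} vs {per_gpu_nvl576:.1f} PFLOPS FP4 per GPU, "
      f"{gen_uplift:.2f}x NVL576 over NVL144")
```

Per-GPU FP4 throughput barely changes between the two platforms, which suggests the 4x-plus platform uplift comes mostly from scaling the GPU count, not per-chip gains.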

To ensure Rubin's swift launch, NVIDIA has deepened collaboration with key supply-chain partners such as TSMC and SK Hynix. TSMC plans to raise CoWoS packaging capacity to 80,000 wafers per month by Q4 2025 to meet demand from Rubin and Apple's M5 SoC. SK Hynix completed HBM4 tape-out in October 2024 and has delivered 12-layer HBM4 samples to NVIDIA, with volume production expected in 2025. Initial pilot-production samples of Rubin GPUs and Vera CPUs were completed at TSMC in June 2025, with customer sampling commencing in September and mass production slated for early 2026.

The surging power demands of data centers have put energy efficiency at the center of design. The Rubin R100 saves power through the 3nm process and HBM4 memory while managing heat with liquid cooling and high-density racks. Although the Vera Rubin NVL144 platform can consume up to 600kW, its compute density delivers more performance per unit of power than earlier models. Market analysis forecasts that the global AI data center market will grow to $200 billion by 2025, with NVIDIA's Blackwell and Rubin technologies poised to lead it. Major technology firms such as Microsoft, Google, and Amazon have already reserved Blackwell chips through the end of 2025, and Rubin's early arrival further cements NVIDIA's market dominance.
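Putting the rack-level claim in concrete terms (a back-of-envelope sketch from the article's FP4 and power figures, not a measured benchmark):

```python
# Rack efficiency implied by the article's NVL144 figures; illustrative only.
FP4_EFLOPS = 3.6   # NVL144 FP4 inference performance (exaFLOPS)
RACK_KW = 600      # stated rack power draw (kW)

flops_per_watt = FP4_EFLOPS * 1e18 / (RACK_KW * 1e3)
print(f"{flops_per_watt / 1e12:.1f} TFLOPS of FP4 per watt")  # 6.0
```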

Future of AI with Feynman Architecture

Looking ahead, NVIDIA plans to unveil the Feynman architecture in 2028, perpetuating its tradition of naming chips after renowned scientists. The successful deployment of Rubin and Vera will bolster emerging applications like AI inference, training, and agentic AI, steering AI technology towards a more general-purpose framework. With sample deliveries scheduled for September 2025 and production deployments by 2026, NVIDIA is positioned to uphold its leadership within the global AI market, driving the evolution of data centers and AI applications.

© 2025 - TopCPU.net