Standing before a capacity crowd at a San Jose hockey arena, Nvidia CEO Jensen Huang declared that the “inference inflexion has arrived,” unveiling a roadmap that positions the semiconductor giant to capture a staggering $1 trillion in revenue opportunity through 2027.
The announcement, delivered during the keynote of the 2026 GTC developer conference, marks a pivotal moment in the AI revolution. After three years dominated by training (the process of building massive models like GPT-5), Nvidia is betting its future on inference: the real-time application of those models by hundreds of millions of users worldwide.
The $1 Trillion Token Economy
Huang’s updated forecast doubles the $500 billion estimate the company provided just one year ago. The driver of this surge is the rise of agentic AI: autonomous software agents that perform complex tasks without human intervention.
“In the age of AI, intelligence tokens are the new currency, and AI factories are the infrastructure that generates them,” Huang told the audience. He explained that as companies transition from experimenting with AI to deploying millions of autonomous agents, the demand for inference chips, those optimised for speed and cost-per-token, will dwarf the resources previously spent on model training.
Vera Rubin and the Groq Integration
At the heart of this strategy is the new Vera Rubin platform. Named after the pioneering American astronomer, the Rubin architecture is designed to handle the dual-stage process of modern inference.
Nvidia revealed that the Rubin chips will handle prefill, converting human language into data tokens, while a secondary stage called decode will utilise technology licensed from the startup Groq. Nvidia’s deal with Groq in late 2025 has already begun to bear fruit, allowing the company to offer hardware that delivers up to 10 times higher performance-per-watt than previous generations.
Crucially, Huang also introduced the Vera CPU, a standalone processor aimed directly at unseating Intel in the data center. “We are already seeing this become a multi-billion-dollar business,” Huang noted, signalling Nvidia’s intent to own the entire computing stack, from the GPU to the central processor.
Market Scepticism vs Tech Dominance
Despite the bold projections, Wall Street reacted with measured caution. Nvidia’s stock, which hit a historic $5 trillion valuation in late 2025, rose a modest 1.2% following the keynote. Analysts suggest that while the growth story remains intact, investors are increasingly focused on execution risks and the ‘white-knuckle’ competition from custom silicon developed by the likes of Google and Meta.
Furthermore, the agentic shift brings new hurdles. Huang addressed the security concerns surrounding OpenClaw, the viral open-source agent platform that has become the industry standard for autonomous workflows. To counter these risks, Nvidia launched NemoClaw, a suite of privacy and safety controls designed to make agentic AI palatable for enterprise and government use.
The Path to 2028: Project Feynman
Looking further ahead, Huang offered a glimpse of the Feynman architecture, slated for 2028. This next-generation platform is expected to integrate co-packaged optics, which use light instead of electricity to move data, to solve the power-consumption bottlenecks currently facing global data centers.
As the conference continues this week, the message from San Jose is clear: the AI boom is not peaking; it is simply changing shape. By pivoting toward a token-based inference economy, Nvidia aims to ensure that every AI-generated thought, email, or line of code runs on its silicon.




