Google is building a four-partner chip supply chain to challenge Nvidia in AI inference

Summary: Google is building the AI industry’s most diversified custom chip supply chain, with four design partners (Broadcom, MediaTek, Marvell, Intel) and a roadmap stretching from the Ironwood TPU now shipping in the millions to TPU v8 chips at TSMC 2nm in late 2027. The strategy, detailed ahead of Google Cloud Next, splits the next generation explicitly: Broadcom’s “Sunfish” handles training, MediaTek’s “Zebrafish” handles inference at 20-30% lower cost, and Marvell is in talks to add a memory processing unit and an additional inference TPU. Together, the programme positions Google’s custom silicon as the most direct challenge to Nvidia’s dominance in AI inference.

Google is assembling the most diversified custom chip supply chain in the AI industry, with four design partners, a fabrication relationship with TSMC, and a product roadmap that now stretches from the inference chips it is shipping today to the 2-nanometre processors it expects to deploy in late 2027. The strategy, detailed in a Bloomberg feature ahead of Google Cloud Next this week, positions Google’s silicon programme as the most direct challenge to Nvidia’s dominance in AI inference, the phase of computing where models serve users rather than learn from data.

The centrepiece is Ironwood, Google’s seventh-generation TPU and the first designed specifically for inference. It delivers ten times the peak performance of the TPU v5p, offers 192 gigabytes of HBM3E memory per chip with 7.2 terabytes per second of bandwidth, and scales to 9,216 liquid-cooled chips in a single superpod delivering 42.5 exaflops of FP8 compute. Ironwood is now generally available to Google Cloud customers. Google plans to produce millions of units this year, Anthropic has committed to up to one million TPUs, and Meta has agreed a rental arrangement for the chips.
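
A quick back-of-envelope check shows what those superpod figures imply per chip. This is derived arithmetic from the numbers above; the per-chip throughput is not a figure quoted in the announcement:

```python
# Back-of-envelope arithmetic from the Ironwood figures quoted above.
# The per-chip throughput is derived here, not an official per-chip number.

SUPERPOD_CHIPS = 9_216           # liquid-cooled chips per superpod
SUPERPOD_EXAFLOPS_FP8 = 42.5     # peak FP8 exaflops per superpod
HBM_PER_CHIP_GB = 192            # HBM3E capacity per chip, GB

# Peak FP8 throughput per chip, converted to petaflops.
per_chip_pflops = SUPERPOD_EXAFLOPS_FP8 * 1_000 / SUPERPOD_CHIPS
print(f"Per-chip FP8 peak: {per_chip_pflops:.2f} PFLOPS")  # ~4.61 PFLOPS

# Aggregate HBM across one superpod, in terabytes.
superpod_hbm_tb = SUPERPOD_CHIPS * HBM_PER_CHIP_GB / 1_000
print(f"Superpod HBM: {superpod_hbm_tb:,.0f} TB")          # ~1,769 TB
```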

The four-partner supply chain

Google’s chip programme now involves four distinct design partners, each handling different segments of the product line.

Broadcom, which signed a long-term agreement on 6 April to supply TPUs and networking components through 2031, handles the high-performance chip variants. It is also designing the next-generation TPU v8 training chip, codenamed “Sunfish,” targeted at TSMC’s 2-nanometre process node for late 2027. Broadcom commands more than 70% of the custom AI accelerator market and is projecting $100 billion in AI chip revenue by 2027.

MediaTek is designing the cost-optimised inference variant of the TPU v8, codenamed “Zebrafish,” also targeting TSMC 2nm in late 2027. MediaTek’s involvement began with the I/O modules and peripheral components on Ironwood, where its designs run 20 to 30% cheaper than alternatives. The TPU v8 strategy splits the product line explicitly: Broadcom builds the training chip, MediaTek builds the inference chip, and Google gains the negotiating leverage that comes from having each partner know the other exists.

Marvell Technology, which is in talks with Google to develop a memory processing unit and a new inference-focused TPU, would become the third design partner if those negotiations produce a contract. Google plans to produce nearly two million of the memory processing units, with design finalisation expected by next year. Marvell’s custom silicon business runs at a $1.5 billion annual rate across 18 cloud-provider design wins, and Nvidia invested $2 billion in the company in March.

Intel entered the picture on 9 April with a multi-year deal to supply Xeon processors and custom infrastructure processing units for Google’s AI data centre infrastructure. The arrangement covers the networking and general-purpose compute layers that surround the TPUs rather than the AI accelerators themselves.

TSMC fabricates all of Google’s custom silicon. The relationship is structural: every chip Google commissions, regardless of which partner designed it, runs through TSMC’s fabs.

Why inference changes the economics

The shift from training to inference as the dominant AI compute cost is the strategic premise behind Google’s entire chip programme. Training a frontier model is a singular, intensive event. Inference is continuous and scales with every user, every query, and every product that incorporates AI. Google serves billions of AI-augmented search queries, Gemini conversations, and Cloud AI API calls daily. At that scale, the cost per inference determines the economics of the entire AI business.
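
A rough sketch makes the point concrete. The query volume below is an illustrative assumption, and the per-query costs simply apply the 20-30% cost advantage cited later in this piece (at the 30% end); none of these are figures from Google:

```python
# Illustrative only: the query volume and per-query costs are assumed
# for the sake of the arithmetic, not figures disclosed by Google.

DAILY_QUERIES = 5e9            # assumed AI-augmented queries served per day
COST_PER_QUERY_GPU = 0.0010    # assumed cost per query on general-purpose GPUs, USD
COST_PER_QUERY_TPU = 0.0007    # assumed cost on custom inference silicon (30% lower), USD

annual_gpu_cost = DAILY_QUERIES * 365 * COST_PER_QUERY_GPU
annual_tpu_cost = DAILY_QUERIES * 365 * COST_PER_QUERY_TPU

print(f"Annual inference cost on GPUs: ${annual_gpu_cost / 1e9:.2f}B")  # $1.83B
print(f"Annual inference cost on TPUs: ${annual_tpu_cost / 1e9:.2f}B")  # $1.28B
print(f"Annual saving: ${(annual_gpu_cost - annual_tpu_cost) / 1e9:.2f}B")  # $0.55B
```

At billions of queries a day, a difference of a fraction of a cent per query compounds into hundreds of millions of dollars a year, which is the premise behind an inference-first chip programme.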

Nvidia’s GPUs remain dominant for training workloads, where their programmability and the CUDA software ecosystem create switching costs that custom chips cannot easily replicate. But inference workloads are more predictable, more repetitive, and more amenable to the kind of fixed-function optimisation that custom silicon excels at. A purpose-built inference chip that costs less per query than an Nvidia GPU, even if it cannot match the GPU’s versatility, wins on the metric that matters at Google’s scale.

This is why Google is investing in multiple inference chip paths simultaneously. Ironwood serves today’s workloads. MediaTek’s Zebrafish targets the next generation at lower cost. Marvell’s proposed chips would add yet another option. The redundancy is deliberate: Google is building optionality into a supply chain where dependence on any single partner creates pricing risk, capacity risk, and the strategic vulnerability of having its AI infrastructure controlled by someone else’s roadmap.

The numbers behind the ambition

Google’s TPU shipments are projected at 4.3 million units in 2026, scaling to more than 35 million by 2028. Anthropic’s commitment alone represents up to one million of those chips, with access to approximately 3.5 gigawatts of next-generation TPU-based compute starting in 2027. Mizuho estimates Broadcom’s AI revenue from its Google and Anthropic relationships at $21 billion in 2026, rising to $42 billion in 2027.
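
The ramp those projections describe is steep. A quick derivation of the implied growth rate, computed from the two shipment figures above:

```python
# Implied growth rate behind the shipment projections quoted above;
# the CAGR is derived here, not part of the projections themselves.

SHIPMENTS_2026 = 4.3e6   # projected TPU shipments in 2026
SHIPMENTS_2028 = 35e6    # projected TPU shipments by 2028 ("more than")

# Compound annual growth rate over the two-year span.
cagr = (SHIPMENTS_2028 / SHIPMENTS_2026) ** (1 / 2) - 1
print(f"Implied CAGR, 2026-2028: {cagr:.0%}")  # ~185% per year
```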

The custom ASIC market more broadly is growing faster than GPUs. TrendForce projects custom chip sales will increase 45% in 2026, compared with 16% growth in GPU shipments. The market is expected to reach $118 billion by 2033. Google is not the only hyperscaler building custom inference silicon: Amazon has Trainium and Inferentia, Microsoft has Maia, and Anthropic is exploring its own chip programme. But Google’s multi-partner, multi-generation approach is the most architecturally ambitious.

What to watch at Cloud Next

Google Cloud Next opens on Wednesday in Las Vegas with keynotes from Sundar Pichai and Thomas Kurian. The conference is expected to showcase the next-generation TPU architecture and the custom silicon roadmap that connects Ironwood to the v8 generation. The timing of the Bloomberg feature, one day after The Information broke the Marvell talks and two days before Cloud Next, suggests Google is using the conference to frame its chip programme as a coherent strategy rather than a series of individual partnerships.

The challenge Nvidia faces is not that any single Google chip will outperform its GPUs. It is that Google is building a system in which multiple custom chips, each optimised for a specific workload and cost point, collectively reduce the share of Google’s AI compute that runs on Nvidia hardware. Nvidia’s response has been to embed itself in the custom chip ecosystem rather than fight it: the $2 billion Marvell investment and the NVLink Fusion programme ensure Nvidia retains a position in racks where its GPUs are supplemented or replaced by ASICs.

For Google, the bet is that controlling its own silicon, across multiple partners and multiple generations, will produce a cost advantage in inference that compounds over time. The scale of Nvidia’s business means the incumbent will not be displaced quickly. But the economics of inference favour custom silicon over general-purpose GPUs, and no company has more inference volume than Google. The four-partner supply chain, the dual-track v8 roadmap, and the millions of Ironwood chips shipping this year are the infrastructure for a competitive position that Google expects to strengthen with every query it serves.
