Uber joins Amazon’s Trainium roster with AWS expansion deal

In short: Uber has expanded its AWS contract to run real-time ride-matching infrastructure on Amazon’s Graviton4 processor and is piloting AI model training on Trainium3, joining Anthropic, OpenAI, and Apple on a customer list that is becoming the clearest evidence yet that Amazon’s custom silicon strategy is working.

Uber’s infrastructure runs on milliseconds. Every time a rider opens the app, a system called Trip Serving Zones determines which drivers to consider, how to weight them, and how quickly to return a match, all before the user has finished watching the loading animation. At Uber’s scale, which reached more than 40 million trips a day in 2025 across 72 countries, the compute cost of that operation is substantial and the latency tolerance is essentially zero. On 7 April 2026, the company announced it is moving more of that workload to AWS, running Trip Serving Zones on Amazon’s Graviton4 processor and beginning a pilot to train AI models on Trainium3. It is the latest addition to a roster of significant technology companies choosing Amazon’s custom silicon over the default, and for Amazon’s chip programme, arguably the most operationally consequential customer yet.

What Uber is moving, and why

The announcement covers two distinct workloads. Trip Serving Zones, Uber’s real-time infrastructure for matching riders and drivers, will run on Graviton4, Amazon’s ARM-based processor designed for high-throughput, low-latency compute. The workload is not AI in any generative sense; it is infrastructure, and its demands are closer to telecommunications switching than to model inference. What it requires is responsiveness under load, particularly during demand spikes when ride volumes surge and the matching system must scale without introducing delay.

Separately, Uber is beginning a pilot to train AI models on Trainium3 using data from its accumulated trip history. The company has recorded 13.567 billion trips over its lifetime and serves more than 200 million monthly active users, generating a continuous stream of behavioural data on driver allocation, estimated arrival times, demand patterns, and route optimisation. Training AI on that dataset is a longer-term initiative, but the economics of Trainium3 make the pilot financially rational even before any performance case is made.

Kamran Zargahi, Uber’s vice-president of engineering, described the operational rationale plainly. “Uber operates at a scale where milliseconds matter. Moving more Trip Serving workloads to AWS gives us the flexibility to match riders and drivers faster and handle delivery demand spikes without disruption.” On the AI side, Zargahi said the company was “building a technology foundation that will make every Uber experience smarter, so we can keep our focus where it belongs: on the people who use Uber every day.” Rich Geraffo, vice-president and managing director for North America at AWS, framed the partnership in terms of Uber’s real-time demands: “Uber is one of the most demanding real-time applications in the world, and we’re proud to be an important part of the infrastructure powering their global operations.”

Uber’s complicated cloud journey

The AWS deal is the third major cloud relationship Uber has entered in the past three years. In 2023, the company signed two separate seven-year agreements, one with Oracle Cloud Infrastructure and one with Google Cloud, as part of an exit from its own data centres. That multicloud strategy was framed as a hedge against vendor lock-in and a way to match specific workloads to the clouds best suited to run them. Adding AWS completes a picture in which Uber is, effectively, a significant customer of three major cloud providers simultaneously.

The practical consequence of that structure is that Uber has unusual leverage in negotiations with each provider and unusual freedom to route workloads toward whichever platform offers the best performance-cost ratio for a given function. Moving Trip Serving Zones to Graviton4 is a statement about where AWS currently sits on that curve for high-frequency, latency-sensitive infrastructure. The Trainium3 pilot is a more tentative signal, a test of whether Amazon’s AI training economics can compete with the GPU-based infrastructure Uber already has access to through its existing cloud relationships.

The chip behind the deal

Trainium3 is Amazon’s third-generation AI training accelerator, and its specifications make the cost argument straightforward. Each chip delivers 2.517 petaflops in MXFP8 precision, with 144 GB of HBM3e memory and 4.9 terabytes per second of memory bandwidth. At scale, Trainium3 runs at roughly 30 to 50 per cent of the cost of comparable Nvidia H100 or H200 hardware. The UltraServer configuration allows up to 144 accelerators to be networked together, delivering approximately 362 MXFP8 petaflops, a cluster capable of training frontier-scale models.
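
For a sense of how those figures compose, here is a minimal back-of-the-envelope sketch in Python. The per-chip throughput, memory capacity, and UltraServer chip count are the numbers cited above; the script simply checks that the cluster figure follows from them.

```python
# Back-of-the-envelope check of the Trainium3 figures cited above.
# Per-chip numbers are those reported in this article, not an official AWS spec sheet.

PER_CHIP_PFLOPS_MXFP8 = 2.517   # petaflops per Trainium3 chip (MXFP8 precision)
HBM_PER_CHIP_GB = 144           # HBM3e capacity per chip, in gigabytes
ULTRASERVER_CHIPS = 144         # maximum accelerators in one UltraServer configuration

cluster_pflops = PER_CHIP_PFLOPS_MXFP8 * ULTRASERVER_CHIPS
cluster_hbm_tb = HBM_PER_CHIP_GB * ULTRASERVER_CHIPS / 1024  # GB -> TB (binary)

print(f"UltraServer compute: {cluster_pflops:.1f} MXFP8 petaflops")  # ~362.4
print(f"UltraServer memory:  {cluster_hbm_tb:.2f} TB of HBM3e")      # 20.25
```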

The cost differential is the headline, but the underlying argument is about workload fit. Training large models on proprietary trip data does not impose the same interoperability demands as inference in production environments, where software ecosystems, CUDA toolchains, and integration dependencies have historically made Nvidia hardware the default. In training contexts, where the workflow is more controlled and the cost per training run compounds across thousands of experiments, the case for custom silicon is easier to make. The AI chip acceleration that defined 2025 created the volume of Trainium deployments Amazon needed to mature its tooling, and Uber’s pilot arrives at a moment when that software ecosystem is meaningfully more capable than it was 18 months ago.
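
The compounding effect is easy to make concrete. The sketch below is illustrative only: the 30 to 50 per cent cost ratio is the range reported above, while the per-run cost and experiment count are hypothetical placeholders, not Uber or AWS figures.

```python
# Illustrative only: how a per-run cost advantage compounds across many experiments.
# The 0.3-0.5 ratio is the range reported above; the baseline per-run cost and the
# experiment count are hypothetical placeholders, not Uber or AWS figures.

GPU_COST_PER_RUN_USD = 10_000   # hypothetical cost of one training run on H100/H200
EXPERIMENTS = 5_000             # hypothetical number of runs over a project's lifetime

gpu_total = GPU_COST_PER_RUN_USD * EXPERIMENTS

for ratio in (0.3, 0.5):        # Trainium3 cost as a fraction of comparable GPU cost
    trainium_total = gpu_total * ratio
    print(f"at {ratio:.0%} of GPU cost: ${gpu_total:,} on GPUs vs "
          f"${trainium_total:,.0f} on Trainium3, saving ${gpu_total - trainium_total:,.0f}")
```

Savings on that scale, repeated across thousands of runs, are why the pilot can be financially rational before any performance case is made.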

A customer list Amazon has been building carefully

Uber joins a short but strategically significant group of Trainium customers. Anthropic has committed to using more than one million Trainium chips across Amazon’s Project Rainier cluster. OpenAI, despite its close relationship with Microsoft, included Trainium capacity as part of its $50 billion AWS commitment. Apple has publicly praised Trainium’s performance for its own training workloads. The pattern across those customers is consistent: they are all organisations with large, proprietary datasets, predictable training workflows, and sufficient scale to justify the engineering investment of moving off GPU-default infrastructure. The depth of capital flowing into AI infrastructure, illustrated by commitments on that scale, is also forcing every AI-dependent company to evaluate whether its compute costs are sustainable, a pressure that makes Trainium’s price advantage more compelling over time.

For Amazon, each addition to the Trainium customer roster performs a dual function: it validates the chip commercially and it builds the software tooling that makes the next adoption easier. Uber’s use case, training on proprietary operational data at scale, is different enough from Anthropic’s frontier model training to expand the range of workloads Amazon can credibly claim Trainium handles well. That breadth matters as Amazon competes for the next wave of enterprise AI infrastructure decisions. The AI infrastructure deals reshaping the industry’s capital structure are not being won solely on chip performance; they are being won on the combination of performance, cost, ecosystem maturity, and the confidence that comes from seeing who else is on the same platform.

The Nvidia question

Every Trainium announcement is, in some sense, an Nvidia story. Amazon’s custom silicon programme exists because the economics and strategic dependencies of GPU dominance have become uncomfortable for the companies that rely on it most. Uber’s pilot is a small data point in a larger pattern of enterprises exploring what alternatives to Nvidia’s stack look like in practice. The competitive response has not been passive: Nvidia’s NVLink Fusion strategy, which opens its high-speed interconnect to third-party silicon including Marvell’s custom AI accelerators, is a direct attempt to absorb the custom silicon movement into Nvidia’s ecosystem rather than compete with it head-on. The logic is that even if customers build or buy non-Nvidia training chips, they remain inside Nvidia’s networking fabric and software dependencies.

How much of Uber’s AI training ultimately migrates to Trainium will depend on the pilot results, and on whether Amazon’s tooling closes the remaining gaps with the CUDA ecosystem that has made Nvidia hardware the path of least resistance for most AI engineering teams. What the announcement does establish is that Uber is testing those gaps seriously rather than treating them as a given. For an industry that has spent three years talking about Nvidia alternatives without producing many at scale, a 40-million-trips-per-day test environment is as real-world a proof of concept as Amazon could ask for.
