OpenAI Diversifies Compute Stack with Google’s Tensor Processing Units
AI-summarised brief · reviewed before publication
OpenAI, the creator of ChatGPT, has reportedly started using Google's tensor processing units (TPUs) to power its products, a significant move to reduce its reliance on Nvidia hardware. According to a Reuters report, OpenAI is leasing the TPUs through Google Cloud, joining a growing list of external customers that includes Apple, Anthropic, and Safe Superintelligence. The move is seen as an effort to lower inference costs and to diversify beyond both Nvidia and Microsoft Azure.

OpenAI's inference workloads have grown significantly alongside usage of ChatGPT, which now serves over 100 million active users daily. Inference accounts for a substantial share of the company's estimated $40 billion annual compute budget, which makes cost-effective capacity essential. Google's v6e "Trillium" TPUs, designed for steady-state inference, offer high throughput at lower operating cost than top-end GPUs.

OpenAI continues to rely on Microsoft Azure for most of its deployment, but supply issues and pricing pressures around GPUs have exposed the risks of depending on a single vendor. Bringing Google into the mix improves OpenAI's ability to scale compute and aligns with the broader industry trend toward mixing hardware sources for flexibility and pricing leverage. There is no indication that OpenAI plans to abandon Nvidia altogether. Rather, it appears to be exploring alternatives to build a more diverse and cost-effective compute infrastructure.