NVIDIA and Google infrastructure cuts AI inference costs

artificialintelligence-news.com Apr 23, 2026


Curated by Shubham · TechCapsules · Apr 23, 2026

AI-summarised brief · reviewed before publication

Google and NVIDIA outlined their hardware roadmap to reduce AI inference costs at scale. The new A5X bare-metal instances, running on NVIDIA Vera Rubin NVL72 systems, aim to deliver a tenfold reduction in inference cost per token along with higher token throughput. The architecture pairs NVIDIA ConnectX-9 SuperNICs with Google Virgo networking technology, scaling to 80,000 NVIDIA Rubin GPUs within a single-site cluster for demanding AI workloads.

💡 Why It Matters

· By integrating NVIDIA's platforms with Google Cloud's infrastructure, customers can optimize AI performance, cost, and sustainability.
· This partnership enables enterprises to run demanding workloads while addressing data sovereignty and security requirements.


Nearly half of American adults now use AI, but concerns are also…

New research by the Pew Research Center reveals that nearly half of American adults (49%) now use AI chatbots, such as ChatGPT or Gemini, up from 33% last year. Americans are using AI for various tasks, including researching information, handling work tasks, and entertainment. ChatGPT is the most widely used chatbot in the survey, with 44% of respondents using the OpenAI product. However, concerns about AI's impact on society and data security remain high, with 40% believing AI will be more harmful than beneficial and 71% expecting AI to make their personal information less secure.

💡 Growing AI adoption raises questions about accountability and regulation, as Americans lack confidence in both the government and AI companies to develop and use these tools responsibly. The widespread skepticism about AI's impact on society highlights the need for more transparent and effective regulation to address concerns about data security and societal benefits.

appleinsider.com · Jun 18

Apple’s AI agents in Xcode 27 make vibe coding easier

Apple has unveiled Xcode 27, a developer tool that integrates artificial intelligence (AI) to simplify the coding process. The new version of Xcode includes a Core AI framework that lets developers use on-device AI models through a modern Swift API. Xcode 27 also supports third-party AI models and enables developers to have conversations with AI agents, making it easier to create and edit code. This marks a significant shift in Apple's approach, positioning AI as an extension of the coding process.

💡 Apple's emphasis on AI in Xcode 27 raises questions about the future of human interaction in app development. By portraying AI as a tool to augment human capabilities, rather than replace them, Apple may be trying to alleviate concerns about job displacement. However, the company's decision to showcase AI-generated apps and code may ultimately lead to a surge in "vibe-coded" apps, challenging the notion that human touch is essential in app development.

teslarati.com · Jun 18

Elon Musk predicts Grok will start to challenge Hollywood by the end…

Elon Musk predicts that xAI's model, Grok, will be capable of creating full movies by the end of 2026. A trailer for Homer's The Odyssey, created using Grok Imagine Video 1.5, showcases the AI's rapid strides in video generation. The 2-minute-plus trailer features 36 consistent shots, realistic special effects, and emotional depth, demonstrating Grok's capabilities in scale, physics, and cinematic workflow. Musk expects Grok-generated films to be "watchable" by the end of 2026 and "really good" by 2027.

💡 Hollywood's dominance may be challenged by AI-generated content, as Grok's capabilities raise questions about the future of cinematic authorship.

androidcentral.com · Jun 18

I barely use Gemini’s default chatbot after trying the new Gemini Live…

Google has released a major update for its Gemini app, introducing a new Neural Expressive user interface for Gemini Live. The revamped interface focuses on AI-generated content, displaying spoken words, images, and more on the screen as users interact with the chatbot. This upgrade enables new use cases, such as generating images and viewing the output immediately, and provides users with more control over the conversation, including the ability to copy, share, and export responses.

💡 The Gemini Live update showcases Google's commitment to refining its AI tools, making them more intuitive and user-friendly. By prioritizing visual content, the new interface opens up new possibilities for everyday use, such as improving productivity and creativity.

techtimes.com · Jun 18

Satellite AI Inference Clears Orbit: Gemma 3 Ran Aboard YAM-9 in April

Loft Orbital's YAM-9 spacecraft successfully demonstrated the first publicly disclosed execution of a vision-language model in orbit. Aboard the spacecraft, Google DeepMind's Gemma 3 answered natural language queries, classified Earth imagery, and produced plain-English summaries without transmitting raw pixels to the ground. This achievement marks a significant shift in the value layer of Earth observation, moving from ground-based processing to onboard inference. The model, running on an NVIDIA Jetson AGX Orin module, used a software harness developed by NASA's Jet Propulsion Laboratory to fit within the spacecraft's memory budget and processing constraints.

💡 This breakthrough enables Earth observation satellites to operate more autonomously, reducing reliance on ground-based processing and potentially leading to faster and more efficient analysis of satellite imagery.

interestingengineering.com · Jun 18

China’s Alibaba unveils AI brains designed to power the next generation of…

Chinese firm Alibaba has unveiled its first embodied AI model family, the Qwen-Robot suite, which links large language models with real-world robotic actions. The suite comprises three models focused on navigation, manipulation, and world modeling for robots operating in physical environments. Developed by Alibaba's Tongyi Lab, the Qwen-Robot suite enables machines to perceive, reason, and interact with the real world, joining a growing global push to advance embodied AI beyond traditional chatbot applications. The models are currently undergoing pilot testing with selected Alibaba Cloud enterprise clients.

💡 The Qwen-Robot suite's success could accelerate the development of robots capable of performing complex tasks in real-world environments, potentially transforming industries such as manufacturing, logistics, and healthcare.