Nvidia’s MRC: When ‘just Ethernet’ isn’t enough for gigascale AI
AI-summarised brief · reviewed before publication
Nvidia has unveiled Multipath Reliable Connection (MRC), a new networking innovation that supports the demands of artificial intelligence factories. MRC is a remote direct memory access transport protocol that allows a single connection to fan out across multiple network paths, ensuring high bandwidth utilization and fast failure recovery. The protocol has already been used by OpenAI to train large language models and is being deployed by Microsoft in its AI factories.
💡 Why It Matters
- · Gigascale AI requires a network that's an integral part of the compute pipeline, not just a generic plumbing choice.
- · MRC's ability to dynamically steer around congestion and failures makes it a crucial component for training frontier models across tens of thousands of GPUs.