Why AI Killed Copper: How the Bandwidth Demands of LLM Training Clusters Are Forcing Data Centres to Go Photonic

The data centre of the twentieth century was built on copper. Twisted-pair Ethernet cables, copper-trace circuit boards, electrical backplanes, and copper interconnects between chips, servers, and racks adequately served the computing infrastructure of the internet era, because the bandwidth requirements of web workloads, even at hyperscaler scale, stayed within the physical capabilities of well-engineered copper infrastructure.

The data centre of the AI era cannot be built on copper. The bandwidth requirements of frontier AI training clusters have exceeded the physical capabilities of copper interconnects at commercially viable power budgets. This is not a matter of engineering improvement or technology roadmap optimism — it is a statement about fundamental physics. The photonic data centre is not a choice that hyperscalers are making because it is cheaper or more convenient. It is a necessity that AI has imposed on them.

The Numbers That Killed Copper

To understand why AI made copper physically inadequate, consider the bandwidth requirements of a frontier AI training cluster at the scale currently deployed by Microsoft, Google, Meta, and Amazon. A single NVIDIA GB200 NVL72 rack, with 72 Blackwell B200 GPUs connected by NVLink, requires aggregate internal NVLink bandwidth of 57.6 Tbps. The rack itself consumes up to 120 kW of power. Now scale this to a 100,000-GPU training cluster, the scale of the largest publicly reported AI clusters and the baseline for next-generation frontier model training.

At this scale, aggregate inter-rack bandwidth requirements climb into the petabit-per-second range. Even connecting just 10% of the GPUs simultaneously for all-to-all communication during gradient synchronisation requires aggregate fabric bandwidth exceeding 1 Pbps. Copper DAC (Direct Attach Copper) cables, the traditional short-reach data centre interconnect, carry 400G over only a few metres; anything longer requires active retiming that consumes roughly 10-15 W per port. Deploying 400G copper at 100,000-GPU scale therefore pushes interconnect power budgets into the tens of megawatts, which is economically and thermally unacceptable.
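A rough back-of-envelope, sketched below, makes these orders of magnitude concrete. The port count per GPU, number of fabric tiers, and per-port wattage are illustrative assumptions chosen for round numbers, not figures from any specific deployment.

```python
# Back-of-envelope estimate of fabric bandwidth and copper interconnect power
# at 100,000-GPU scale. All figures below are illustrative assumptions.

GPUS = 100_000
LINK_RATE_GBPS = 400           # per-port line rate
PORTS_PER_GPU = 4              # assumed network rails per GPU (varies by design)
FABRIC_TIERS = 3               # assumed leaf/spine/super-spine hops
WATTS_PER_RETIMED_PORT = 12    # assumed active-copper / retimed port power

# Aggregate injection bandwidth if 10% of GPUs exchange gradients all-to-all
active_gpus = GPUS * 0.10
aggregate_bw_pbps = active_gpus * LINK_RATE_GBPS / 1e6   # Gbps -> Pbps
print(f"All-to-all aggregate bandwidth: {aggregate_bw_pbps:.1f} Pbps")

# Interconnect power if every port on every hop of the fabric needs retiming
total_ports = GPUS * PORTS_PER_GPU * FABRIC_TIERS * 2    # both ends of each hop
power_mw = total_ports * WATTS_PER_RETIMED_PORT / 1e6
print(f"Estimated copper interconnect power: {power_mw:.1f} MW")
```

Under these assumptions the exercise lands at roughly 4 Pbps of all-to-all bandwidth and close to 30 MW of interconnect power, which is why the copper budget breaks down long before the GPUs themselves run out of headroom.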

Optical fibre carries the same 400G signal over hundreds of metres with negligible degradation, at lower energy per bit than distance-equivalent copper, in a cable roughly a tenth the diameter of the equivalent copper bundle. The power savings alone justify the transition. The density and reach make it the only viable option at frontier AI scale.
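For intuition, the sketch below frames the comparison as energy per bit at a 400 Gbps port rate. The per-port wattages are round assumptions meant only to show the trend from retimed copper through pluggable modules toward co-packaged optics, not datasheet values.

```python
# Illustrative energy-per-bit comparison at a 400 Gbps port rate.
# Per-port wattages are rough assumptions for intuition only.

LINE_RATE_GBPS = 400

interconnects = {
    "retimed copper (AEC)":        12.0,  # assumed watts per port
    "pluggable optics (DR4/FR4)":  10.0,
    "linear-drive optics (LPO)":    5.0,
    "co-packaged optics (CPO)":     2.0,
}

for name, watts in interconnects.items():
    pj_per_bit = watts / (LINE_RATE_GBPS * 1e9) * 1e12    # W per (bit/s) -> pJ/bit
    print(f"{name:28s} ~{pj_per_bit:5.1f} pJ/bit")
```

Reach is what compounds the gap: an optical port holds that energy per bit over hundreds of metres, while copper at comparable distances would need cascaded retiming stages, multiplying its per-port figure.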

Silicon Photonics: The Manufacturing Breakthrough

The photonic data centre transition was enabled by a manufacturing breakthrough: the ability to fabricate photonic integrated circuits — waveguides, modulators, multiplexers, and detectors — on standard CMOS silicon fabrication processes at the same fabs that produce conventional microchips. Before silicon photonics, optical components were expensive compound semiconductor devices (indium phosphide, gallium arsenide) that could not be manufactured at CMOS scale or cost.

Silicon photonics changes this entirely. A silicon photonics chip can be fabricated at TSMC, GlobalFoundries, or Intel using the same CMOS toolsets and fabs that produce digital ASICs. The cost per transceiver has declined by more than 90% in a decade. Intel's silicon photonics business ships hundreds of millions of dollars of optical modules annually. Marvell's silicon photonics platform powers hyperscaler deployments at Microsoft and Amazon. The manufacturing infrastructure for silicon photonics is mature, high-volume, and cost-competitive with the copper systems it replaces.

"Copper is not losing to silicon photonics because photonics is better technology. Copper is losing because it is physically incapable of meeting the bandwidth requirements of AI infrastructure at commercially viable power budgets. PhotonicDC.com names the data centre that physics demands."

The Hyperscaler Deployment Wave

Every major hyperscaler is actively deploying silicon photonics infrastructure at scale. Google's Jupiter network fabric — the internal data centre network connecting hundreds of thousands of servers — has used silicon photonics transceivers since 2016 and is expanding optical switching capacity with each generation. Microsoft's Azure data centre network uses Marvell silicon photonics in its high-bandwidth spine layers. Meta's data centre fabric uses 400G optical modules at every spine-to-leaf connection. Amazon's AWS data centre network is transitioning spine layers to optical switching to support the bandwidth requirements of its AI accelerator clusters.

The common driver is the same in every case: AI training workloads requiring all-to-all GPU communication at bandwidths that copper infrastructure physically cannot support at viable power budgets. The photonic data centre is not coming — it is being built right now, at hyperscaler scale, by the world's largest technology infrastructure operators. PhotonicDC.com names the platform covering every dimension of this buildout.

Own the Photonic Data Centre Intelligence Domain

PhotonicDC.com — the definitive platform for the light-speed AI data centre transition. Available for acquisition now.

Acquire This Domain →