- Co-packaged optics (CPO) moved from lab research to volume production in 2025: Broadcom shipped 50,000+ Tomahawk 5-Bailly units and Meta completed one million link hours without a single link flap.
- CPO reduces per-port power consumption from 16–17 W to approximately 4–5 W — a 3.5× efficiency gain that becomes structurally necessary at the hundreds-of-thousands-of-GPU scale required for frontier AI training.
- Copper’s physical limits at 800 Gb/s and 1.6 Tb/s make optical integration an engineering inevitability — the bandwidth wall is now the primary constraint on AI cluster scaling, not GPU availability.
Key Claim: The race to build larger AI clusters has shifted the primary bottleneck from GPU compute to the physics of moving data between chips — and co-packaged optics is the first production-scale answer to that constraint.
The race to build larger AI clusters has pushed GPU counts into the hundreds of thousands. Raw compute has scaled remarkably — NVIDIA’s Hopper and Blackwell generations ship in volume, Google has iterated through multiple TPU generations, and Meta and Microsoft have deployed custom ASICs. As compute supply has grown, a different bottleneck has emerged — older, less glamorous, and harder to solve with transistor scaling: moving data between chips fast enough to keep those processors occupied.
This is the bandwidth wall. And the industry’s answer — co-packaged optics (CPO), which integrates photonic engines directly onto the same package as the switching chip or AI accelerator — has moved from lab research to volume production within the past 12 months.
Why Copper Is Running Out of Road
The data centre networking stack has two layers that matter for large-scale AI training. Scale-up networks connect accelerators within a coherent domain — NVLink within a node or rack. Scale-out networks — InfiniBand or RoCE Ethernet — carry node-to-node, rack-to-rack, and spine/leaf fabric traffic across the broader cluster. Both are under pressure from cluster sizes that have grown by two orders of magnitude in three years.
Traditional pluggable optical transceivers — the dominant form factor since the 40G generation — work by running a high-speed electrical signal from the switch ASIC to a transceiver cage at the edge of the board, where it is converted to light. At 400 Gb/s this was manageable. At 800 Gb/s, the electrical path between ASIC and cage consumes significant power and introduces latency. An 800G DR4 pluggable module consumes 16–17 W. At scale, that thermal load constrains how densely switches can be deployed.
At 1.6 Tb/s with 224G SerDes, the problem becomes structural. Passive direct-attach copper cables lose signal integrity beyond a few metres. The skin effect — high-frequency current crowding into a thin layer at the conductor’s surface — raises effective resistance, forcing cables to become thicker and shorter as speeds increase. Active electrical cables extend reach to 5–10 metres, but at increasing power cost. IEEE Spectrum has documented the physical constraints on copper at these speeds, noting that connecting enough GPUs in a dense, low-power, and reliable way is “the most important packaging problem of this era.”
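The scale of the skin effect is easy to quantify. A minimal sketch — assuming 224 Gb/s PAM4 signalling (112 GBaud, so a Nyquist frequency near 56 GHz) and textbook constants for annealed copper, neither of which is specified in any vendor document cited here — computes the skin depth, the surface layer in which most of the current flows:

```python
import math

def skin_depth_m(freq_hz: float,
                 resistivity_ohm_m: float = 1.68e-8,  # annealed copper (assumed)
                 mu_r: float = 1.0) -> float:
    """Skin depth: delta = sqrt(2 * rho / (omega * mu))."""
    mu = mu_r * 4e-7 * math.pi      # permeability (H/m)
    omega = 2 * math.pi * freq_hz   # angular frequency (rad/s)
    return math.sqrt(2 * resistivity_ohm_m / (omega * mu))

# 224 Gb/s PAM4 -> 112 GBaud -> Nyquist frequency ~56 GHz (assumption)
nyquist_hz = 56e9
delta = skin_depth_m(nyquist_hz)
print(f"skin depth at {nyquist_hz/1e9:.0f} GHz: {delta*1e6:.2f} um")
```

At 56 GHz the current is confined to roughly a quarter-micron layer, so a cable’s effective resistance is governed by its circumference rather than its cross-section — which is why higher lane rates force copper to get thicker and shorter at the same time.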
Co-packaged optics removes the problematic electrical boundary by bonding the optical engine directly to the switch or accelerator die. The signal stays optical from source to destination, without the conversion loss and power overhead of a pluggable transceiver cage. The concept has been discussed for more than a decade. Volume production arrived in 2025.
What Is Shipping Now
Broadcom Tomahawk 5-Bailly
In January 2026, Broadcom confirmed shipments of more than 50,000 Tomahawk 5-Bailly CPO switches during 2025 — the industry’s first volume-production CPO solution. The Bailly chip delivers 51.2 Tb/s of total switching capacity through eight 6.4 Tb/s optical engines, with no separate electrical interface between ASIC and optics. Per-port power at 800 Gb/s is 5.5 W, a 14.1% reduction versus the previous generation’s Humboldt chip.
Meta Platforms and Tencent have run Bailly in lab characterisation environments. In October 2025, Broadcom reported that Meta’s CPO installation had completed one million link hours in a high-temperature lab environment without a single link flap — the kind of reliability data that hyperscalers require before production-scale rollout.
NVIDIA Spectrum-X and Quantum-X Photonics
At GTC on 18 March 2025, NVIDIA announced two CPO switching platforms. Spectrum-X Photonics delivers 100 Tb/s (128 ports × 800 Gb/s) or 400 Tb/s (512 ports × 800 Gb/s) for Ethernet fabrics. Quantum-X Photonics provides 144 ports of 800 Gb/s InfiniBand with a liquid-cooled design.
The headline claim is a 3.5× improvement in power efficiency over traditional pluggable configurations, reducing per-port power consumption from ~16–17 W to ~4–5 W. NVIDIA also reported 63× improvement in signal integrity at scale, by its own internal benchmarks, attributable to the elimination of the electrical conversion path. NVIDIA had targeted Quantum-X availability for late 2025; production availability has not been independently confirmed as of this writing. Spectrum-X Ethernet switches are scheduled for 2026 from infrastructure partners.
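The headline figures are internally consistent, which is worth checking. A quick sketch using the midpoints of the quoted per-port ranges (an assumption — NVIDIA publishes the 3.5× figure, not the exact wattages behind it):

```python
# Midpoints of the per-port power ranges quoted above (assumed values)
pluggable_w = 16.5  # ~16-17 W per 800G pluggable port
cpo_w = 4.5         # ~4-5 W per 800G CPO port

ratio = pluggable_w / cpo_w
print(f"efficiency gain: {ratio:.2f}x")  # ~3.7x, consistent with the ~3.5x claim
```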
The Accelerator-Side Problem
Switches are the first deployment frontier for CPO, but the more consequential challenge is the accelerator itself. As GPU-to-GPU data rates scale, the I/O interface of the accelerator chip becomes the constraint — not the switch connecting them.
This is where the startup and acquisition activity is concentrated. In September 2025, Ayar Labs and Alchip demonstrated a prototype at TSMC’s OIP forum integrating two full-reticle AI accelerators with eight TeraPHY optical engine chiplets and eight HBM stacks on a common substrate. The result: more than 100 Tb/s of scale-up bandwidth per accelerator with 256+ optical ports per device.
Marvell’s acquisition of Celestial AI in December 2025, for up to $5.5 billion, is aimed at the same problem from a different architectural angle. Where conventional CPO brings optics to the edge of a processor package, Celestial AI’s Photonic Fabric routes optically encoded data to any location on the compute die. Marvell does not expect material revenue from the technology until late 2028.
The distinction matters for infrastructure architects: CPO at the switch level is available today and addresses the scale-out networking bandwidth wall. CPO at the accelerator die — where scale-up traffic lives — remains 24–36 months from production deployment at the leading edge.
The Foundry Infrastructure Question
Silicon photonics has historically been difficult to manufacture at yield because integrating photonic components with standard CMOS logic requires either a separate photonics wafer or a process modification that adds risk. TSMC’s COUPE (Compact Universal Photonic Engine) platform, detailed at SEMICON Taiwan 2025, is the foundry’s attempt to standardise CPO the way SoIC standardised chiplet stacking. TSMC plans mass production of COUPE-based CPO by 2026, and both Broadcom’s Tomahawk 6 and the Ayar Labs / Alchip prototype use it.
The trajectory suggests that photonic integration is following the same commoditisation path as logic fabrication: moving from proprietary in-house processes to a foundry-accessible model. This mirrors the dynamic already visible in custom silicon, where hyperscalers design their own ASICs against a common foundry baseline rather than buying merchant silicon.
What the Numbers Mean for Infrastructure Planning
For engineering teams designing or procuring AI cluster infrastructure in 2026, the practical implications fall across three tiers.
Switching tier: CPO switches are available from Broadcom now and from NVIDIA later in 2026. The power efficiency argument is compelling at scale. An 800G port consuming 4–5 W instead of 16–17 W matters when a large spine switch terminates hundreds of ports.
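A back-of-envelope sizing makes the switching-tier point concrete. This sketch assumes a 512-port spine switch, a 100-switch spine tier, and the midpoint per-port figures above — all illustrative assumptions, not vendor specifications:

```python
ports = 512                     # assumed port count for a large spine switch
pluggable_w, cpo_w = 16.5, 4.5  # assumed midpoints of the quoted per-port ranges

saving_per_switch_w = ports * (pluggable_w - cpo_w)
print(f"optics power saved per switch: {saving_per_switch_w/1000:.1f} kW")

# Across a fabric tier the delta compounds:
switches = 100                  # hypothetical spine tier size
print(f"per-tier saving: {switches * saving_per_switch_w / 1e6:.2f} MW")
```

At roughly 6 kW saved per switch, a modest spine tier recovers over half a megawatt of power and cooling budget — before counting the leaf layer.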
Scale-up fabric: The accelerator-side CPO products from Ayar Labs and Celestial AI/Marvell are not yet in production. Teams building multi-rack GPU clusters in 2026 will still use NVLink (proprietary, within the node or rack) for scale-up and InfiniBand or Ethernet — via pluggable or CPO switches — for traffic beyond it.
Supply chain: Broadcom has 50,000+ Bailly switches in the field, but CPO switch adoption remains concentrated among the largest hyperscalers; the broader enterprise and mid-tier cloud market is likely to remain on pluggable optics for at least two to three years. The AI semiconductor supply-chain constraints that affected GPU availability in 2024–2025 have not fully resolved, and CPO adoption adds a new layer of foundry dependency.
Implications / What to Watch
Near-term (2026): NVIDIA Spectrum-X Photonics Ethernet switch availability in the second half of 2026 will be the clearest signal of whether CPO can displace pluggable optics in new AI cluster builds.
Medium-term (2027–2028): TSMC COUPE entering mass production brings photonic integration into the standard chiplet ecosystem. Broadcom’s Tomahawk 6 Davisson (102.4 Tb/s, 200G per lane) will be the first test of third-generation CPO at volume.
Longer-term (2028–2029): Celestial AI’s Photonic Fabric and Ayar Labs’ TeraPHY products targeting the accelerator die represent the next constraint to solve. The $5.5 billion Marvell bet signals confidence that the scale-up optical market will be large enough to justify that investment.
The bandwidth wall is real. The solutions are in production or near production. The open question is not whether optical interconnects replace copper in AI infrastructure — it is at what layer, on what timeline, and who controls the foundry ecosystem that makes it possible.
This article was produced with AI assistance and reviewed by the editorial team.

