The AI Chip Shortage Never Ended. It Just Changed Shape.

Key Takeaways
  • The AI chip shortage didn’t end — it transformed. The bottleneck shifted from wafer supply to CoWoS advanced packaging, which TSMC has sold out through 2026.
  • NVIDIA locked in over 70% of TSMC’s CoWoS-L capacity; the remaining allocation is split among AMD, Broadcom, Marvell, and others — creating a structural constraint that compounds with HBM3E supply tightness.
  • Hyperscalers are bypassing the shortage with custom silicon (TPUs, Trainium, Maia), but this accelerates AI chip market fragmentation rather than resolving the underlying supply problem.

Key Claim: AI chip demand has outpaced supply consistently since 2022, but the constraint has shifted from foundry wafer capacity to advanced packaging and high-bandwidth memory supply.

The narrative that emerged in mid-2024 — that the AI chip shortage was easing — was accurate in the narrow sense. Wafer supply for AI accelerators had improved. It was wrong in the more important sense: the constraint had migrated downstream, from wafer fabrication to packaging, and from general GPU supply to a small number of highly specialised processes that the entire industry relies on.


Two years into the AI infrastructure buildout, the supply chain for high-end AI silicon has not become unconstrained. The constraint has changed shape.

The Packaging Chokepoint

The bottleneck that matters most in 2025 and 2026 is Chip-on-Wafer-on-Substrate (CoWoS) packaging — the process that allows High Bandwidth Memory (HBM) to be co-packaged with AI accelerators at the density that modern AI workloads require. Without CoWoS, a wafer is not a finished AI accelerator; it is a piece of silicon waiting for a process that is more constrained than the silicon itself.

TSMC’s CEO C.C. Wei stated publicly that CoWoS capacity was “sold out through 2025 and into 2026.” TrendForce projects TSMC’s CoWoS capacity reaching roughly 120,000 to 130,000 wafers per month by end of 2026 — a significant increase from 75,000 in 2025 — but analysts note the expansion is unlikely to close the gap with demand. NVIDIA reportedly secured over 70% of TSMC’s CoWoS-L capacity for 2025 to support its Blackwell GPU architecture. The remainder, split among AMD, Broadcom, Marvell, and others, has been correspondingly tight.
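To put those wafer figures in context, a rough back-of-envelope conversion from monthly CoWoS wafer starts to finished accelerator packages can be sketched as follows. Only the wafers-per-month numbers come from the figures above; the packages-per-wafer and yield values are illustrative assumptions, not sourced data.

```python
# Back-of-envelope: CoWoS wafer capacity -> finished accelerator packages.
# Wafer-start figures are from the article; per-wafer figures are hypothetical.

def packages_per_month(wafers_per_month, packages_per_wafer, yield_rate):
    """Estimate finished accelerator packages from monthly CoWoS wafer starts."""
    return round(wafers_per_month * packages_per_wafer * yield_rate)

# Article figures: ~75,000 wafers/month in 2025; 120,000-130,000 by end of 2026.
# Hypothetical: ~30 large CoWoS-L packages per 300mm wafer, 90% packaging yield.
capacity_2025 = packages_per_month(75_000, 30, 0.90)
capacity_2026 = packages_per_month(125_000, 30, 0.90)

print(f"2025: ~{capacity_2025:,} packages/month")
print(f"2026: ~{capacity_2026:,} packages/month")
```

Under these assumptions the expansion adds on the order of a million finished packages per month; the point of the sketch is that even a large wafer-count increase translates into a bounded number of shippable accelerators, which is why allocation fights persist.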

The HBM situation is structurally similar. SK Hynix, Micron, and Samsung have all confirmed in recent earnings calls that HBM3E supply is fully allocated through 2026, with early signals of continued tightness into 2027 as hyperscalers secure long-term contracts.

TSMC’s Numbers

Against this backdrop, TSMC’s financial results tell the demand story clearly. Revenue tied to AI chips soared 39% year-on-year in Q3 2025, and high-performance computing — which includes AI and data centre workloads — now accounts for 55% of TSMC’s total quarterly revenue. TSMC’s sales grew 30% year-on-year in the quarter ending March 2026, with AI hardware demand identified as the primary driver.

Capital expenditure tells the supply response story. TSMC’s CapEx is projected to rise to between $52 billion and $56 billion in 2026, up from $40.9 billion in 2025. TSMC began mass production of 2nm chips in Q4 2025, and its advanced packaging expansion, including a projected 58% year-on-year increase in CoWoS capacity, represents the most direct response to the packaging constraint.
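The growth rates above follow directly from the figures quoted in this article; a quick arithmetic check (dollar amounts in billions). Note that the article's 58% CoWoS projection is presumably computed from slightly different baseline values than the round capacity numbers quoted earlier.

```python
# Sanity-check the year-on-year growth figures from the article's own numbers.

def yoy_growth_pct(new, old):
    """Year-on-year growth, expressed as a percentage."""
    return (new - old) / old * 100

# CapEx: $52B-$56B projected for 2026, vs $40.9B in 2025.
capex_low = yoy_growth_pct(52, 40.9)
capex_high = yoy_growth_pct(56, 40.9)
print(f"CapEx growth: {capex_low:.0f}% to {capex_high:.0f}%")

# CoWoS capacity: 120,000-130,000 wafers/month by end of 2026, vs ~75,000 in 2025.
cowos_low = yoy_growth_pct(120_000, 75_000)
cowos_high = yoy_growth_pct(130_000, 75_000)
print(f"CoWoS growth: {cowos_low:.0f}% to {cowos_high:.0f}%")
```

The CapEx range works out to roughly 27–37% growth, and the capacity range to roughly 60–73% over the quoted endpoints, consistent with the order of magnitude of the cited 58% projection.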

Hyperscalers Respond With Custom Silicon

One structural response to supply concentration risk is already underway. Custom ASICs are projected to constitute nearly 45% of CoWoS-based accelerator shipments by 2026, up from 20–30% in 2024, as the major cloud providers develop their own silicon to secure long-term manufacturing allocations. Google’s TPUs, Amazon’s Trainium and Inferentia, Microsoft’s Maia, and Meta’s MTIA are no longer research projects — they are production-scale responses to the risk of depending on NVIDIA’s allocation priorities.

Combined CapEx from the top four cloud providers roughly doubled over two years, reaching approximately $600 billion annually. A substantial portion of that investment goes not just to purchasing NVIDIA chips but to securing foundry capacity and building the supply-chain independence that NVIDIA’s dominance of GPU allocation makes strategically necessary.

The Geopolitical Layer

The supply chain complexity is compounded by US export controls. Successive rounds of restrictions have progressively limited what NVIDIA and other US chip designers can sell to China, including the H20 — a downgraded chip designed specifically to comply with earlier control thresholds — which was subject to further restrictions in April 2025. The effect is bifurcated demand: Chinese AI developers have accelerated investment in domestic alternatives, including Huawei’s Ascend series and domestic memory suppliers, while simultaneously driving up demand for whatever export-eligible hardware remains available.

The broader AI semiconductor story that dominated 2025 — from DeepSeek’s cost-efficiency challenge to the race to build sovereign AI infrastructure — flows directly through the same constrained supply chain. Efficiency gains reduce per-inference compute costs; they do not reduce demand for the frontier training runs and infrastructure buildouts that require the most advanced silicon.

What to Watch

The critical signals for 2026 are CoWoS capacity utilisation at TSMC and its OSATs, HBM allocation confirmations from memory manufacturers’ Q2 earnings calls, and the pace of hyperscaler custom ASIC deployments. If TSMC’s packaging expansion runs ahead of schedule, supply pressure eases. If demand continues to accelerate beyond current projections — as it has for each of the past three years — the constraint persists in a new form.

The AI semiconductor supply chain is not broken. It is structurally tight in ways that favour those who secured capacity early and penalise those who did not.

This article was produced with AI assistance and reviewed by the editorial team.


Arjun Mehta, AI infrastructure and semiconductors correspondent at Next Waves Insight

About Arjun Mehta

Arjun Mehta covers AI compute infrastructure, semiconductor supply chains, and the hardware economics driving the next wave of AI. He has a background in electrical engineering and spent five years in process integration at a leading semiconductor foundry before moving into technology analysis. He tracks arXiv pre-prints, IEEE publications, and foundry filings to surface developments before they reach the mainstream press.


The NextWave Signal

Enjoyed this analysis?

One AI market analysis + one emerging-tech signal, every Tuesday and Friday — written for engineers, PMs, and CTOs tracking what shifts before it goes mainstream.
