- SK Hynix controls 62% of global HBM shipments and sold out its entire 2026 production by mid-2025, making High Bandwidth Memory — not GPU wafers — the binding constraint on AI infrastructure.
- Samsung’s HBM3e qualification failure with NVIDIA cost it roughly 18 months of competitive position; the 12-layer stacking process poses yield challenges fundamentally different from those of standard DRAM production.
- CoWoS advanced packaging is a parallel bottleneck — TSMC capacity is fully allocated, with lead times extending 12–18 months beyond standard wafer orders.
- Even with Micron’s HBM ramp, SK Hynix is projected to retain pricing power through at least 2027, making HBM contracts a key strategic variable for any AI hardware buyer.
Key Claim: The AI supply chain constraint is not the GPU — it is the High Bandwidth Memory that sits on it, and SK Hynix’s dominance over both production volume and NVIDIA qualification means that constraint will not resolve quickly.
Every NVIDIA Blackwell B200 GPU ships with 192 gigabytes of HBM3e memory spread across eight stacks, delivering 8.0 terabytes per second of memory bandwidth. That is 2.4 times the memory capacity of an H100 — packed into roughly the same silicon footprint. What makes this engineering achievement commercially significant is how difficult each one of those memory stacks is to produce, and how concentrated production capacity has become in the hands of a single Korean chipmaker. As of early 2026, SK Hynix controls approximately 62% of global HBM shipments, had sold out its entire 2026 production by mid-2025, and is charging NVIDIA roughly 50% more for HBM4 than it charges for HBM3e. The AI infrastructure build-out has a memory problem — and the memory problem has a vendor problem.
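To make those headline figures concrete, a quick back-of-envelope check, assuming the published totals divide evenly across the eight stacks (the per-stack numbers are derived from the totals, not taken from a vendor spec sheet):

```python
# Back-of-envelope check of the B200 memory figures cited above.
# Per-stack numbers assume the published totals divide evenly
# across eight stacks -- a derivation, not a vendor specification.

B200_STACKS = 8
B200_CAPACITY_GB = 192        # total HBM3e capacity
B200_BANDWIDTH_TBS = 8.0      # total memory bandwidth
H100_CAPACITY_GB = 80         # HBM3 capacity on the H100

per_stack_gb = B200_CAPACITY_GB / B200_STACKS         # 24 GB per stack
per_stack_tbs = B200_BANDWIDTH_TBS / B200_STACKS      # 1.0 TB/s per stack
capacity_ratio = B200_CAPACITY_GB / H100_CAPACITY_GB  # 2.4x

print(f"Per stack: {per_stack_gb:.0f} GB at {per_stack_tbs:.1f} TB/s")
print(f"B200 vs H100 capacity: {capacity_ratio:.1f}x")
```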
The Wafer-Capacity Multiplier That Makes HBM Scarcer Than It Looks
The standard framing of AI’s memory demand understates the manufacturing constraint. Market reports typically cite HBM in gigabytes or revenue share, but the relevant unit for production planning is DRAM wafer capacity consumed per gigabyte shipped. Micron has disclosed that each gigabyte of HBM3e requires roughly three times the wafer capacity of an equivalent gigabyte of DDR5. One set of estimates puts the figure closer to four times for standard DRAM comparisons. (Tom’s Hardware)
When TrendForce applies this multiplier to AI demand projections, AI workloads — including both HBM and GDDR7 demand — are expected to consume approximately 20% of global DRAM wafer capacity in 2026, despite representing a far smaller share of shipped gigabytes. (TrendForce, December 2025) The reallocation of wafer starts toward HBM constrains supply for server DDR5 and consumer DRAM, though the exact share shifted remains difficult to quantify publicly.
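To see how the multiplier inflates wafer demand, consider a rough sketch: take AI memory’s share of shipped DRAM gigabytes, scale it by the wafer-capacity multiplier, and renormalise. The bit-share input below is a hypothetical value chosen to land near TrendForce’s ~20% outcome, not a reported figure; the 3–4x multiplier range comes from the disclosures cited above:

```python
# Illustrative wafer-share arithmetic for the HBM capacity multiplier.
# ai_bit_share is a hypothetical input chosen so the output lands near
# TrendForce's ~20% figure; the 3-4x multiplier range reflects Micron's
# disclosure and the higher third-party estimate cited above.

ai_bit_share = 0.06  # hypothetical: AI memory as share of DRAM gigabytes

for multiplier in (3.0, 4.0):
    # Wafer share = bit share scaled by the multiplier, renormalised
    # so total wafer capacity still sums to 100%.
    ai_wafers = ai_bit_share * multiplier
    other_wafers = 1.0 - ai_bit_share
    wafer_share = ai_wafers / (ai_wafers + other_wafers)
    print(f"{multiplier:.0f}x multiplier -> {wafer_share:.0%} of wafer starts")
```

Under these assumptions, a ~6% bit share implies 16–20% of wafer starts, which is how a small fraction of shipped gigabytes comes to dominate fab allocation.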
The physical reason for this inefficiency is the manufacturing process itself. HBM stacks multiple ultra-thin dies — each 30–50 microns thick — vertically using Through-Silicon Vias. The JEDEC standard limits total stack height to 720 microns for HBM3, which means each additional layer forces manufacturers to use thinner dies. Thinner dies are more prone to warpage and breakage during handling, degrading yield at each step. A 12-layer stack running to the height limit leaves almost no margin. (Vikram Sekar, “Why is HBM so Hard to Manufacture”) HBM4 designs targeting 16-layer stacks — SK Hynix showcased a 48 GB prototype at CES 2026 (SK Hynix) — compound these dynamics further.
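A rough height budget shows how little slack 12 layers leave. The base-die and bond-line thicknesses below are illustrative assumptions in commonly cited ranges, not values from the JEDEC specification:

```python
# Illustrative height budget for a 12-high HBM3 stack against the
# 720-micron JEDEC limit cited above. Base-die and bond-line values
# are assumptions in commonly cited ranges, not spec figures.

LIMIT_UM = 720         # JEDEC HBM3 total stack height limit
LAYERS = 12
core_die_um = 40       # thinned DRAM die (within the 30-50 um range above)
bond_line_um = 10      # assumed bonding/underfill gap per interface
base_die_um = 50       # assumed logic base die

stack_um = base_die_um + LAYERS * (core_die_um + bond_line_um)
print(f"12-high stack: {stack_um} um, margin: {LIMIT_UM - stack_um} um")

# The same budget at 16 layers (the HBM4 target) overshoots unless
# dies get thinner or hybrid bonding shrinks the bond-line gap.
stack16_um = base_die_um + 16 * (core_die_um + bond_line_um)
print(f"16-high at same thicknesses: {stack16_um} um vs {LIMIT_UM} um limit")
```

At 40-micron dies the 12-high stack fits with roughly 70 microns to spare; at the thicker end of the cited die range it does not fit at all, which is the margin problem in a single number.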
Samsung’s Eighteen-Month Failure and What It Cost
The clearest evidence that HBM manufacturing difficulty is not merely theoretical is Samsung’s prolonged inability to meet NVIDIA’s qualification requirements for 12-layer HBM3e.
Samsung completed development of its 12-layer HBM3e by early 2024 but failed NVIDIA’s validation at least three times through June 2025, with TrendForce reporting a third stumble in June and a retest scheduled for September. (TrendForce, June 2025) The failure mode was thermal: Samsung’s packaged HBM3e chips generated excessive heat and drew more power than NVIDIA’s validation thresholds permitted. (Data Center Dynamics) An overheating memory stack inside a GPU creates reliability problems that cascade across the entire accelerator.
Samsung eventually cleared NVIDIA’s tests in September 2025 — 18 months after completing development — following a redesign of the DRAM core to address the thermal issues. (TrendForce, September 2025) Samsung began HBM3e shipments to NVIDIA in Q3 2025 but was not expected to supply NVIDIA in meaningful volume until 2026 — by which point SK Hynix and Micron had already allocated most of the available capacity.
The 18-month delay was not commercially neutral. Samsung’s memory business executive vice president Kim Jae-june acknowledged the shortfall plainly in October 2025: “We’ve significantly expanded our HBM production for next year compared with this year, yet customer demand has already outpaced supply.” (KED Global, October 2025) Samsung’s record Q3 2025 memory revenue of 26.7 trillion won came almost entirely from non-NVIDIA channels — notably Google, which sources more than 60% of its TPU HBM3e from Samsung and is set to retain Samsung as its primary HBM supplier in 2026. (TrendForce, December 2025) In practice, the HBM market is bifurcating: SK Hynix and Micron serve NVIDIA; Samsung serves Google and is working its way back into NVIDIA allocation.
Micron’s Entry and the Limits of a Third Supplier
Micron’s qualification as an HBM3e supplier to NVIDIA is a structural improvement in the supply chain’s resilience. By Q2 2025, Micron held approximately 21% of HBM shipments by volume compared to Samsung’s 17%, a notable result given that Micron entered volume HBM production later and has a smaller DRAM manufacturing base. (Astute Group) Micron’s reported yields of approximately 75% on 8-layer HBM3e and approximately 70% on 12-layer stand in sharp contrast to the qualification problems Samsung was managing during the same period.
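Those stack yields compound per layer, which is why 70% on a 12-layer part is a strong result. A minimal sketch of the implied per-layer yield, assuming each bonded layer succeeds independently at the same rate (a simplification for illustration):

```python
# Implied per-layer yield from the reported 12-layer stack yield.
# Assumes each bonded layer succeeds independently with the same
# probability -- a simplification for illustration.

stack_yield_12 = 0.70                    # Micron's reported 12-layer yield
per_layer = stack_yield_12 ** (1 / 12)   # ~97.1% per layer
print(f"Implied per-layer yield: {per_layer:.1%}")

# The same per-layer rate extended to 16 layers (the HBM4 target):
stack_yield_16 = per_layer ** 16         # ~62%
print(f"Projected 16-layer stack yield: {stack_yield_16:.1%}")
```

Even a roughly 97% per-layer success rate erodes to about 62% at 16 layers under this simple model, which is why each added layer is economically expensive, not just technically hard.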
The limits are financial and physical. Micron has raised its capital expenditure plan for fiscal year 2026 to approximately $20 billion, up from roughly $13.8 billion in FY2025 — much of it directed toward HBM capacity and hybrid bonding transitions. (TrendForce, December 2025) Despite the investment, Micron’s entire 2026 HBM supply is already fully booked. Micron is shipping HBM4 engineering samples with volume production planned for Q2 calendar 2026, offering over 60% more bandwidth than HBM3e. Yet even with Micron’s expansion, SK Hynix retains three times Micron’s market share — enough to maintain pricing leverage across the HBM3e generation and into the early HBM4 ramp.
How This Lands in the Data Centre
Each NVIDIA GPU generation requires substantially more HBM per unit, from 80 GB on the H100 to 192 GB on the B200, just as production capacity remains constrained. (Introl Blog) The result is a supply chain where the slowest component sets the pace for everything downstream, including TSMC’s CoWoS packaging line, which assembles the GPU die and HBM stacks into a single package.
Cloud pricing reflects the tightness. GPU cloud capacity across all GPU types was sold out as of early 2026, with all new capacity coming online through August–September 2026 already booked as of March 2026. (GPU Cloud List) According to the same source, H100 one-year rental contract pricing rose roughly 38% from a low of $1.70 per GPU-hour in October 2025 to $2.35 per GPU-hour by March 2026. HBM availability is one of several supply-side constraints, alongside CoWoS packaging and power infrastructure, setting the pace for GPU capacity expansion.
The pricing environment is also tightening at the vendor level. Samsung and SK Hynix have raised HBM3e supply prices approximately 20% for 2026; SK Hynix has reportedly secured a 50% price premium for HBM4 over HBM3e on NVIDIA contracts. (TrendForce, December 2025) BofA estimates the 2026 HBM total addressable market at $54.6 billion; Goldman Sachs is more conservative at $45 billion after revising down 13%. Both figures represent a market that did not exist at meaningful scale four years ago.
What to Watch
Samsung’s HBM4 qualification
Samsung’s 12-layer HBM3e approval in September 2025 was commercially late; the question now is whether Samsung can qualify HBM4 alongside SK Hynix and Micron, or whether the pattern of delayed qualification repeats. Samsung has sold out its 2026 HBM supply — but allocation commitments are not the same as qualification. TrendForce reported in September 2025 that Samsung’s HBM4 had reached the final qualification phase with NVIDIA.
SK Hynix’s Cheongju M15X fab
SK Hynix has invested over 20 trillion won in the M15X fab in Cheongju, with the first clean room targeted for completion in May 2026 and pilot operations to follow. (TrendForce, January 2026) This is the capacity that determines whether SK Hynix can actually expand supply to NVIDIA in the HBM4 generation, or whether concentration simply carries over from HBM3e to HBM4.
CXL as a partial pressure valve
CXL 4.0, released 18 November 2025, doubles interconnect bandwidth to 128 GT/s via PCIe 7.0, making memory disaggregation more viable for inference workloads. Samsung and SK Hynix are reportedly exploring next-generation AI memory architectures that could supplement HBM in certain use cases. (TrendForce, March 2026) CXL does not replace the in-package bandwidth that HBM delivers for training, but it could reduce the HBM-per-server requirement for inference-optimised deployments.
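The scale of the bandwidth gap explains why CXL is a pressure valve rather than a substitute. A rough comparison, assuming a single x16 CXL 4.0 link at the raw PCIe 7.0 signalling rate (ignoring encoding and protocol overhead) against the B200’s in-package HBM bandwidth:

```python
# Rough bandwidth comparison: one x16 CXL 4.0 link vs B200 in-package HBM.
# Uses the raw PCIe 7.0 signalling rate and ignores encoding and protocol
# overhead, so the CXL figure is an upper bound.

GT_PER_LANE = 128                   # CXL 4.0 / PCIe 7.0 signalling rate, GT/s
LANES = 16
link_gbs = GT_PER_LANE * LANES / 8  # ~256 GB/s per direction, raw
hbm_gbs = 8000                      # B200 HBM3e bandwidth, GB/s (8 TB/s)

print(f"x16 CXL 4.0 link: ~{link_gbs:.0f} GB/s per direction")
print(f"In-package HBM gap: ~{hbm_gbs / link_gbs:.0f}x")
```

A roughly 30x gap per link is workable for capacity-bound inference tiers, but not for the bandwidth-bound inner loop of training.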
Price correction risk
Goldman Sachs cut its 2026 HBM TAM estimate by 13%. TrendForce flagged double-digit HBM price drop risks in a July 2025 report as competition expands and capacity grows. (TrendForce, July 2025) The current seller’s market for HBM is not permanent. The structural question is whether Samsung can close the qualification gap fast enough to break SK Hynix’s pricing leverage before HBM4 locks in another generation of concentration.
Further Reading
- TrendForce — HBM3e pricing and supply 2026 outlook
- SemiAnalysis — Scaling the Memory Wall: The Rise and Roadmap of HBM
- Tom’s Hardware — HBM roadmaps for Micron, Samsung, and SK Hynix
- SK Hynix — 2026 Market Outlook
This article was produced with AI assistance and reviewed by the editorial team.