
Liquid and Immersion Cooling: The Plumbing Revolution Keeping AI Data Centers from Melting in 2026


  • Internet Pros Team
  • April 22, 2026
  • AI & Technology

For sixty years, the humming roar of air-conditioned server rooms was the soundtrack of computing. In 2026, that soundtrack is going quiet — replaced by the soft hum of pumps circulating coolant directly through the bellies of servers, and in some facilities by the eerie silence of GPUs submerged in tanks of clear dielectric fluid. The reason is physics: a single NVIDIA GB200 NVL72 rack can now draw 120 kilowatts, and the coming GB300 platform pushes that beyond 200 kW — an order of magnitude more heat than any air handler can reasonably remove. Liquid and immersion cooling have stopped being exotic options for crypto miners and HPC labs and have become the new default for every hyperscale AI data center being built this year.

Why Air Cooling Finally Broke

A standard data-center rack, cooled by raised-floor air plenums and perimeter CRAC units, can dissipate roughly 15 to 30 kilowatts before the fans scream and intake temperatures climb out of spec. That was plenty for a decade of x86 servers running web apps, databases, and virtualized workloads. But AI training and inference have rewritten the rules. An 8-GPU NVIDIA HGX H100 server already punched past 10 kW on its own. The GB200 NVL72, which connects 72 Blackwell GPUs with a copper NVLink spine, concentrates the workload of an entire AI supercomputer into a single 120-kW rack. Try to air-cool that and the fan power needed to move enough air would rival the compute load itself.

Liquid removes heat roughly a thousand times more efficiently than air by volume. A one-inch coolant hose can carry away more thermal energy than a server-room hallway of chilled air. That efficiency is what makes the modern AI rack economically viable — and what is reshaping the physical architecture of the data center from the chip upward.
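
To see why, it helps to put numbers on that claim. The sketch below uses textbook properties of water and air to estimate how much of each fluid would have to flow past a 120 kW rack to hold a 10°C temperature rise; the rack figure comes from above, while the 10°C rise is an assumed design point.

```python
# Back-of-envelope: how much fluid must flow past a 120 kW rack to hold a
# 10 degC coolant temperature rise. Q = rho * V * cp * dT, so V = Q / (rho * cp * dT).
RACK_HEAT_W = 120_000            # GB200 NVL72 class rack, per the article
DELTA_T_C = 10.0                 # assumed inlet-to-outlet temperature rise

# Textbook properties near room temperature
WATER = {"rho_kg_m3": 997.0, "cp_j_kgk": 4186.0}
AIR = {"rho_kg_m3": 1.2, "cp_j_kgk": 1005.0}

def flow_m3_per_s(fluid: dict) -> float:
    """Volumetric flow needed to absorb RACK_HEAT_W at DELTA_T_C."""
    return RACK_HEAT_W / (fluid["rho_kg_m3"] * fluid["cp_j_kgk"] * DELTA_T_C)

water_flow = flow_m3_per_s(WATER)      # ~0.003 m^3/s, i.e. about 3 L/s
air_flow = flow_m3_per_s(AIR)          # ~10 m^3/s of chilled air

print(f"Water: {water_flow * 1000:.1f} L/s")
print(f"Air:   {air_flow:.1f} m^3/s ({air_flow / water_flow:,.0f}x the volume)")
```

Run the numbers and the volumetric ratio lands in the low thousands, which is where the "roughly a thousand times" shorthand comes from.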

Higher Density

Liquid-cooled racks routinely operate at 100–250 kW, unlocking GPU clusters that would be physically impossible to air-cool in the same floor space.

Lower PUE

Power Usage Effectiveness drops from ~1.5 in legacy air-cooled halls to 1.05–1.10 on modern liquid sites, cutting cooling overhead by 80% or more and total facility energy by nearly 30% (a quick sketch after these callouts puts numbers on this).

Quieter, Safer Silicon

GPUs run 10–20°C cooler, boosting clock headroom, extending component life, and eliminating the thermal throttling that plagues dense air-cooled trays.
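
To put the Lower PUE figures above in concrete terms, here is a minimal sketch of what that drop means over a year for a hypothetical 20 MW IT load; the load figure is illustrative, not from any specific facility.

```python
# What the PUE drop means over a year for a hypothetical 20 MW IT load.
IT_LOAD_MW = 20.0
HOURS_PER_YEAR = 8760
PUE_AIR, PUE_LIQUID = 1.5, 1.08

def annual_facility_energy_mwh(pue: float) -> float:
    """Total facility energy for the year: IT load plus cooling and overhead."""
    return IT_LOAD_MW * pue * HOURS_PER_YEAR

air = annual_facility_energy_mwh(PUE_AIR)
liquid = annual_facility_energy_mwh(PUE_LIQUID)
overhead_cut = ((PUE_AIR - 1) - (PUE_LIQUID - 1)) / (PUE_AIR - 1)

print(f"Air-cooled hall:    {air:,.0f} MWh/yr")
print(f"Liquid-cooled site: {liquid:,.0f} MWh/yr")
print(f"Total energy saved: {air - liquid:,.0f} MWh/yr ({(air - liquid) / air:.0%})")
print(f"Cooling overhead cut: {overhead_cut:.0%}")
```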

The Three Flavors of Liquid Cooling

Not all liquid cooling is created equal. Operators in 2026 choose between three increasingly aggressive approaches, each with its own cost, retrofit story, and efficiency envelope.

| Approach | How It Works | Rack Density | Best For |
| --- | --- | --- | --- |
| Rear-Door Heat Exchangers | A water-cooled radiator replaces the back of the rack; server fans push hot air through it before it escapes the hall. | 30–60 kW | Retrofits of existing air-cooled facilities |
| Direct-to-Chip (D2C) | Cold plates bolt directly onto CPUs and GPUs; a closed loop carries coolant to a rack-level CDU and out to the facility loop. | 60–250 kW | AI training and inference (GB200, MI325X) |
| Full Immersion | Entire servers are submerged in tanks of non-conductive dielectric fluid, either single-phase (circulated) or two-phase (boiling). | 100–400 kW | Greenfield AI campuses, edge HPC, crypto |

Direct-to-Chip: The Hyperscaler Default

When NVIDIA shipped the GB200 NVL72 in volume in 2025, it did so with a reference design that is liquid-cooled only — the first time the world's most important AI compute product has come without an air-cooled SKU. Every Blackwell GPU sits under a cold plate connected to a manifold of quick-disconnect lines, and the entire rack plugs into a Coolant Distribution Unit that handles the heat hand-off to the facility water loop. Dell, Supermicro, HPE, Lenovo, and Foxconn all build variants, and the OCP (Open Compute Project) has standardized the fittings so racks from different vendors can share plumbing.
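
As a rough illustration of the hand-off the CDU performs, the sketch below sizes the facility-water flow needed to absorb one 120 kW rack, assuming supply and return temperatures of 32°C and 45°C; those temperatures are assumed warm-water design points, not published figures for any particular product.

```python
# Rough CDU sizing for a single 120 kW rack, assuming the facility loop
# supplies 32 degC water and returns it at 45 degC (assumed design points).
RACK_HEAT_W = 120_000
SUPPLY_C, RETURN_C = 32.0, 45.0
CP_WATER = 4186.0        # J/(kg*K)
RHO_WATER = 995.0        # kg/m^3 near 35 degC

mass_flow_kg_s = RACK_HEAT_W / (CP_WATER * (RETURN_C - SUPPLY_C))
liters_per_min = mass_flow_kg_s / RHO_WATER * 1000 * 60

print(f"Facility-water flow: {mass_flow_kg_s:.2f} kg/s (~{liters_per_min:.0f} L/min per rack)")
```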

Direct-to-chip wins the hyperscaler market because it is retrofit-friendly. Existing server chassis, existing racks, and existing data-center floors can be upgraded one hall at a time without rebuilding the facility. Microsoft has publicly committed that 100% of its new AI halls ship with D2C, and Google's most recent Iowa and Ohio campuses use facility-water loops designed for 2030-era densities that would have seemed impossible even three years ago.

"We crossed a line in 2024. Every new GPU platform we are designing to is water-cooled at the chip. Air cooling is now a legacy constraint, not a default."

Noelle Walsh, CVP, Microsoft Cloud Operations & Innovation

Immersion Cooling: The Next Step

Immersion cooling takes the idea one level further: instead of plumbing liquid into a server, the entire server is submerged in a bath of non-conductive dielectric fluid, either a mineral-oil-style hydrocarbon or an engineered fluorocarbon that conducts heat well but not electricity. In single-phase systems, pumps circulate the fluid through a heat exchanger. In two-phase systems, the fluid boils right off the chip at roughly 50°C, rises as vapor, condenses on a coil at the top of the tank, and rains back down: a heat pipe the size of a bathtub, with no moving parts inside.
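
A quick sketch of what two-phase operation means in practice: with a representative latent heat of vaporization of about 100 kJ/kg (an assumed round figure; exact values vary by fluid), a 100 kW tank boils off roughly a kilogram of fluid per second, all of which condenses and falls straight back in.

```python
# Two-phase immersion in numbers: fluid boiled off by a 100 kW tank, assuming
# a representative latent heat of ~100 kJ/kg (assumed round figure). The vapor
# condenses on the coil and falls back in, so nothing is consumed.
TANK_HEAT_W = 100_000
LATENT_HEAT_J_KG = 100_000
FLUID_DENSITY_KG_L = 1.6         # engineered dielectric fluids are dense

boil_rate_kg_s = TANK_HEAT_W / LATENT_HEAT_J_KG
boil_rate_l_min = boil_rate_kg_s / FLUID_DENSITY_KG_L * 60

print(f"Vaporization rate: {boil_rate_kg_s:.1f} kg/s (~{boil_rate_l_min:.0f} L/min, all recondensed)")
```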

Vendors like Submer, GRC (Green Revolution Cooling), LiquidStack, Iceotope, and Asperitas have spent a decade proving the model in crypto, HPC, and edge sites. In 2026, they are finally landing hyperscale deals: Meta confirmed immersion pods inside its Richland Parish, Louisiana campus; TikTok parent ByteDance standardized on single-phase immersion across new Asian builds; and a new wave of sovereign AI facilities in the UAE, Saudi Arabia, and India is deploying immersion from day one to survive desert ambient temperatures without burning water.

  • No fans: Eliminating server fans can cut a facility's total compute power draw by 5–10% on its own.
  • No dust, no humidity: Immersed hardware runs in a chemically clean environment, extending mean-time-to-failure dramatically.
  • Heat reuse is easy: Outlet coolant at 50–60°C is hot enough to feed district heating, greenhouses, or industrial processes directly (a rough sizing sketch follows this list).
  • Smaller buildings: Without raised floors and giant air plenums, immersion halls can be half the footprint of an equivalent air-cooled site.
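
On the heat-reuse point, a rough sizing sketch: assume a 1 MW immersion pod and that roughly 90% of its IT power comes out as recoverable 50–60°C heat. Both figures are assumptions for illustration only.

```python
# Rough heat-reuse sizing for a hypothetical 1 MW immersion pod, assuming
# ~90% of IT power is recoverable as low-grade heat (assumed figures).
IT_POWER_KW = 1_000
RECOVERABLE_FRACTION = 0.9
HOME_HEAT_DEMAND_MWH_YR = 10     # rough figure for one home's annual space heating

recoverable_kw = IT_POWER_KW * RECOVERABLE_FRACTION
annual_heat_mwh = recoverable_kw * 8760 / 1000
homes_heated = annual_heat_mwh / HOME_HEAT_DEMAND_MWH_YR

print(f"Recoverable heat: {recoverable_kw:.0f} kW continuous")
print(f"Annual output:    {annual_heat_mwh:,.0f} MWh (~{homes_heated:,.0f} homes' heating)")
```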

The Water Problem — and the WUE Answer

Liquid cooling does not mean water waste. In fact, modern closed-loop systems use far less water than the evaporative chillers and cooling towers they replace. A traditional hyperscale data center can consume millions of gallons of potable water per day; a well-designed D2C loop loses only what evaporates from the facility's dry coolers, which can be nearly zero in cool climates. Regulators have responded by elevating Water Usage Effectiveness (WUE) alongside PUE as a reporting metric, and hyperscalers now publish both — a change driven largely by the arid-region AI buildouts of the last two years.

Two-phase immersion takes the water story furthest: the fluid never leaves the tank, and the heat exchange to the outside world can be done entirely through dry coolers or a closed glycol loop. Data centers in Arizona and Nevada built on this model in 2025 reported WUE figures below 0.1 L/kWh — less than one-tenth of the industry average a decade ago.
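
To make those WUE figures tangible, the sketch below converts them into daily water consumption for a hypothetical 100 MW IT load, using roughly 1.8 L/kWh as the older industry-average baseline (a commonly cited figure, treated here as an assumption).

```python
# Daily water consumption implied by different WUE figures, for a hypothetical
# 100 MW IT load. The ~1.8 L/kWh baseline is an assumed legacy industry average.
IT_LOAD_MW = 100.0
KWH_PER_DAY = IT_LOAD_MW * 1000 * 24

def daily_water_megaliters(wue_l_per_kwh: float) -> float:
    return wue_l_per_kwh * KWH_PER_DAY / 1e6

print(f"Evaporative baseline (WUE 1.8):  {daily_water_megaliters(1.8):.1f} ML/day")
print(f"Closed-loop two-phase (WUE 0.1): {daily_water_megaliters(0.1):.2f} ML/day")
```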

What This Means for the Rest of Us

Most IT buyers will never see an immersion tank in person, but the second-order effects of this shift are already reaching their inboxes. Cloud prices for large AI instances are being set by hyperscalers whose cost-per-token is dominated by how efficiently they can cool a rack. Enterprises planning on-prem GPU installations are quickly learning that a traditional colocation cage is not ready for a GB200, and that liquid-ready colo space carries a meaningful premium. And sustainability officers are discovering that their scope-2 emissions estimates depend more on the PUE and WUE of their cloud vendor's cooling design than on any software efficiency work their developers can do.
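
A simple way to see why the vendor's cooling design dominates: scope-2 electricity emissions scale directly with PUE. The sketch below uses an illustrative annual GPU energy figure and an assumed grid carbon intensity; both numbers are placeholders, not measurements.

```python
# Scope-2 electricity emissions scale directly with the vendor's PUE.
# Annual GPU energy and grid carbon intensity below are illustrative placeholders.
IT_ENERGY_MWH = 5_000            # hypothetical annual GPU usage by one customer
GRID_KG_CO2_PER_MWH = 350        # assumed grid carbon intensity

def scope2_tonnes(pue: float) -> float:
    return IT_ENERGY_MWH * pue * GRID_KG_CO2_PER_MWH / 1000

print(f"Air-cooled region (PUE 1.5):     {scope2_tonnes(1.5):,.0f} t CO2e")
print(f"Liquid-cooled region (PUE 1.08): {scope2_tonnes(1.08):,.0f} t CO2e")
```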

Key Takeaways for 2026
  • Air cooling is done for new AI workloads. Every major GPU reference design now ships liquid-cooled by default.
  • Direct-to-chip is the volume winner for the next three years — retrofit-friendly, OCP-standardized, and available from every tier-one server vendor.
  • Immersion is the long game, especially in hot climates and greenfield sovereign AI sites where water is scarce and density needs to go higher still.
  • PUE below 1.1 is the new benchmark. Hyperscalers publishing higher numbers will face growing scrutiny from regulators and enterprise buyers alike.
  • Heat is now a product. Leading operators are signing district-heating, greenhouse, and industrial offtake deals that monetize waste heat instead of throwing it away.

The computing industry rarely has to change its physical plumbing. It did so in the 1960s with mainframe chilled-water loops, and again in the 1990s with raised-floor air conditioning. 2026 is the third such reset — driven not by better computers but by GPUs so hot that air physically cannot keep up. For hyperscalers, liquid and immersion cooling are the price of admission to the AI era. For everyone else, they are the invisible reason the next generation of AI features can be delivered at all.

Tags: AI & Technology Data Centers Cloud Computing Hardware Sustainability
