AI - The AI Supply Chain - Part 5 - Data Center Cooling
- brencronin
- May 6
- 8 min read
Cooling: The Second Pillar of Data Center Operations
While power is the foundation of data center functionality, cooling is the second most critical factor, especially in high-performance environments like AI data centers. Servers and networking equipment generate significant heat, and without effective thermal management, that heat can lead to system failures or reduced performance. As such, data center cooling is not just an operational concern but a vital link in the AI supply chain.
This article offers a simplified high-level overview of how data center cooling works, how cooling technologies have evolved, and why cooling is intrinsically connected to power consumption. In the first article in this series, we discussed a standard 10 kW rack as a reference point for data center power. That limit is largely a byproduct of older air-based cooling designs. But today, advanced cooling technologies, such as direct-to-chip liquid cooling (DLC), are allowing for higher power densities and tighter rack spacing, fundamentally changing what’s possible in modern data centers.
Traditional Data Center Cooling: Simple Explanation of How It Works
At the most basic level, servers are cooled by air. Cold air is drawn into the front of the server chassis, circulated by internal fans across critical components like CPUs and memory, and exhausted out the rear as hot air. This airflow cycle forms the foundation of air-based data center cooling.
To scale this up across rooms full of servers, data centers deploy Computer Room Air Conditioners (CRACs) and Computer Room Air Handlers (CRAHs). These systems generate and distribute cold air into the data halls while simultaneously extracting hot air. The core mechanism behind this heat exchange is relatively straightforward: hot air is passed over chilled liquid-filled coils, transferring the heat to the liquid.

This now-warmed liquid is then pumped to chillers or cooling towers, the large units often found on the rooftops or around the perimeters of data center buildings. These units remove the heat, commonly through a process known as adiabatic cooling, and return the now-cooled liquid to the CRAC/CRAH systems, completing the cycle.

Cooling and the Rack: Managing Heat at the Server Level
At the heart of server cooling is the challenge of managing heat generated by individual components, particularly high-performance chips like CPUs and GPUs. These components are equipped with heat sinks, specialized metal structures designed to draw heat away from the chip and disperse it into the surrounding air. This heated air is then expelled from the rear of the server by internal fans.
As chip performance increases, so does heat output. This is why GPU-based servers, which consume significantly more power than standard CPU-based servers, often require larger heat sinks and more airflow, making them physically larger and consuming more rack units (RUs) despite having a similar number of chips.

A key thermal management concept here is ΔT (Delta T), or the temperature differential between the cool air entering the front of the server and the hot air exiting the rear. This differential serves as a measure of cooling effectiveness. The lower the inlet air temperature, the less work the server fans have to do to move enough air to dissipate heat. Conversely, higher inlet temperatures require greater airflow to maintain safe operating conditions. Understanding and optimizing this Delta T is crucial for efficient cooling, especially as densely packed, high-performance racks become more common in AI and High-Performance Compute (HPC) environments.
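To make the Delta T relationship concrete, here is a minimal sketch, assuming nominal sea-level air properties, of how much airflow a server needs to carry away a given heat load at a given Delta T. The specific numbers are illustrative, not taken from this article:

```python
# Sensible-heat sketch: airflow required to remove a heat load at a given Delta T.
# Assumes nominal air properties (sea level, dry air); real values vary with
# altitude, humidity, and fan curves.

AIR_DENSITY = 1.2         # kg/m^3, nominal at sea level
AIR_SPECIFIC_HEAT = 1005  # J/(kg*K)

def required_airflow_cfm(heat_load_watts: float, delta_t_celsius: float) -> float:
    """Volumetric airflow (CFM) needed to carry heat_load_watts away at a given Delta T."""
    flow_m3_per_s = heat_load_watts / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_celsius)
    return flow_m3_per_s * 2118.88  # 1 m^3/s is roughly 2118.88 CFM

# A 1 kW server: a larger Delta T means less air must be moved for the same heat load.
print(round(required_airflow_cfm(1000, 10)))  # ~176 CFM at a 10 C rise
print(round(required_airflow_cfm(1000, 15)))  # ~117 CFM at a 15 C rise
```

The same relationship explains why cooler inlet air helps: it allows a larger temperature rise across the server before components hit their limits, so the fans can move less air for the same heat load.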

Controlling Cold and Hot Air in the Data Center
Computer Room Air Conditioners (CRACs) are central to data center cooling, pushing chilled air beneath a raised floor. Perforated floor tiles placed in front of server racks allow pressurized cold air to flow upward into server inlets. Server fans then pull this air across hot components and exhaust the heated air out the back. Some racks also include integrated fans to assist airflow. This cooling approach, illustrated earlier, historically limited power density to around 10 kW per rack, a constraint still common in many U.S. colocation data centers, despite some improvements in airflow design.
One foundational enhancement was the introduction of cold aisle/hot aisle orientation. By alternating the direction of racks, cold air is directed into server inlets from one aisle (cold aisle), and hot air is exhausted into the opposite aisle (hot aisle). This prevents hot exhaust from one server becoming the intake for another.

To further optimize this layout, cold aisle containment systems were introduced. These use doors and panels to enclose the cold aisle, trapping chilled air and preventing it from mixing with warmer ambient air, ensuring servers receive the coolest possible air.

Advanced designs also incorporate hot aisle containment, which encloses the hot aisle to isolate and remove warm exhaust more efficiently. However, hot aisle containment typically requires a custom ceiling plenum and is generally only feasible in newer data center builds.

Fan Walls
An alternative to traditional CRAC/CRAH systems that push cold air through raised floors is the fan wall cooling design. In this approach, wall-mounted air-handling units (AHUs) are integrated into the perimeter or interior walls of the data hall. These AHUs deliver cool air directly into the cold aisles, effectively "flooding" them with chilled air.

After passing through the server racks and absorbing heat, the hot exhaust air is isolated in the hot aisles and either recirculated back to the cooling coils or vented out of the facility, depending on the system configuration.
Fan wall designs are particularly well-suited for slab floor data centers, which are becoming more common in modern builds. Unlike raised floor environments, slab floors offer several advantages:
Improved durability: They support heavier and newer IT equipment without structural concerns.
Lower construction complexity and cost: No need for underfloor cabling or airflow management systems.
Enhanced safety in seismic zones: Slab foundations are inherently more stable during earthquakes.
Rising Server Power Demands Are Reshaping Data Center Cooling
As servers have grown more powerful, driven by denser chips and high-performance workloads like AI training, the demand for cooling has increased dramatically. Modern AI servers often run at full power continuously, generating immense amounts of heat. Today, it's not uncommon for a single server to draw 10 kW or more, pushing traditional data center designs to their limits.
To manage the heat output, data centers have started adopting new rack configurations, such as larger server form factors (from 1U up to 4U) and additional spacing between racks to improve airflow. These changes, while necessary for thermal management, introduce several downstream challenges:
Reduced server density: More space per server means fewer servers per data hall footprint.
Increased latency: As GPUs are spaced farther apart, GPU-to-GPU communication takes longer, potentially degrading performance for tightly coupled workloads.
Fiber limitations: Maintaining high-speed interconnects over longer distances requires more expensive fiber types optimized for longer-range and high-bandwidth performance.
The diagram below illustrates this concept. Consider a medium-power AI server equipped with two 3.3 kW power supplies, drawing a total of 6.6 kW. In a data center where each rack is limited to 10 kW, only one such server can be installed per rack, as shown on the left. Under this rack power limitation, deploying four of these servers means spreading them across four separate racks, wasting floor space, increasing cabling costs, and potentially introducing performance delays due to longer interconnect distances. This example highlights how increasing rack density drives greater power demands and, in turn, more heat that the data center must be able to handle.
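As a minimal sketch of the arithmetic above, the snippet below computes how many servers fit under a per-rack power cap and how many racks a deployment needs. The 30 kW rack in the last line is a hypothetical higher-density comparison, not a figure from this article:

```python
import math

RACK_POWER_LIMIT_KW = 10.0   # per-rack cap used in the example above
SERVER_POWER_KW = 6.6        # two 3.3 kW power supplies

def servers_per_rack(rack_limit_kw: float, server_kw: float) -> int:
    """How many servers fit under the rack's power cap."""
    return int(rack_limit_kw // server_kw)

def racks_needed(total_servers: int, rack_limit_kw: float, server_kw: float) -> int:
    """How many racks a deployment needs, given the per-rack cap."""
    return math.ceil(total_servers / servers_per_rack(rack_limit_kw, server_kw))

print(servers_per_rack(RACK_POWER_LIMIT_KW, SERVER_POWER_KW))  # 1 server per 10 kW rack
print(racks_needed(4, RACK_POWER_LIMIT_KW, SERVER_POWER_KW))   # 4 racks for 4 servers
print(racks_needed(4, 30.0, SERVER_POWER_KW))                  # 1 rack at a hypothetical 30 kW cap
```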

Rear-Door Heat Exchanger (RDHx): Efficient Close-Coupled Cooling
One of the most effective innovations in data center cooling is the Rear-Door Heat Exchanger (RDHx). This solution places a radiator-style heat exchanger directly on the back of each server rack, where it absorbs the hot exhaust air as it leaves the servers. RDHx systems can operate with chilled water or other coolants, removing heat right at the source and eliminating the inefficiencies of traditional CRAC/CRAH systems that rely on distant cooling units and complex ducting.

Key advantages of RDHx systems include:
High thermal efficiency: Cooling is applied exactly where it's needed, improving energy use and reducing loss.
Neutral room temperature: by returning exhaust air at close to ambient temperature, RDHx systems often eliminate the need for cold or hot aisle containment.
Support for higher rack power densities: RDHx systems can easily cool racks consuming 30–40 kW, and with the addition of rear-door fans, they can exceed 50 kW per rack.
Direct-to-Chip Liquid Cooling (DLC): Unlocking Ultra-High Rack Power Densities
Direct-to-Chip Liquid Cooling (DLC) is a cutting-edge thermal management technology enabling data centers to support rack power densities of 100 kW or more. Unlike traditional air-based cooling, DLC uses cold plates (metal plates in direct contact with high-heat components like CPUs and GPUs) to draw heat away through a liquid coolant.
The process works as follows:
Coolant flows through the cold plate, absorbing heat directly from the chip.
The now-heated liquid is routed to a Coolant Distribution Unit (CDU).
At the CDU, a heat exchanger transfers the heat to a secondary medium (air or another liquid) for external heat rejection.
There are two primary types of CDUs:
Liquid-to-Air (L2A) CDU: Transfers heat from the liquid to air, which is then expelled.
Liquid-to-Liquid (L2L) CDU: Transfers heat to another liquid loop, often tied to a facility’s chilled water system.
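As a rough illustration of the liquid side of this loop, the sketch below, assuming water-like coolant properties, estimates the coolant flow a DLC loop would need to absorb a given heat load. Real systems often use treated water or glycol mixes, whose properties differ, so treat the numbers as illustrative:

```python
# Sensible-heat sketch for the coolant loop: flow required to absorb a heat load
# at a given coolant temperature rise across the cold plates.

COOLANT_SPECIFIC_HEAT = 4186   # J/(kg*K), approximately water
COOLANT_DENSITY = 1000         # kg/m^3, approximately water

def coolant_flow_lpm(heat_load_watts: float, delta_t_celsius: float) -> float:
    """Liters per minute of coolant needed to absorb heat_load_watts
    with the given temperature rise across the cold plates."""
    mass_flow_kg_s = heat_load_watts / (COOLANT_SPECIFIC_HEAT * delta_t_celsius)
    return mass_flow_kg_s / COOLANT_DENSITY * 1000 * 60  # kg/s -> L/min

# Example: a 100 kW rack with a 10 C coolant temperature rise.
print(round(coolant_flow_lpm(100_000, 10), 1))  # ~143.3 L/min
```

Comparing these flow figures with the airflow numbers earlier in the article helps explain why liquid cooling supports much higher rack densities: water carries far more heat per unit volume than air.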

The example below shows Supermicro liquid-cooled racks used in the xAI data center. Each rack houses eight 4U servers, with each server containing eight NVIDIA H100 GPUs—for a total of 64 GPUs per rack. Each server draws approximately 10 kW, meaning the entire rack consumes around 80 kW. Thanks to liquid cooling, this high power density is achieved without relying on traditional cooling methods such as raised floors or hot/cold aisle containment.
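The per-rack figures quoted above follow from simple multiplication; as a quick check using only the numbers in this example:

```python
# Figures quoted above for the Supermicro liquid-cooled racks at xAI.
num_servers = 8        # 4U servers per rack
gpus_per_server = 8    # NVIDIA H100 GPUs per server
server_kw = 10         # approximate draw per server

gpus_per_rack = num_servers * gpus_per_server  # 64 GPUs per rack
rack_power_kw = num_servers * server_kw        # ~80 kW per rack
print(gpus_per_rack, rack_power_kw)
```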

Rack Cooling Summary by Power Draw
The diagram below, sourced from Vertiv’s article 'Understanding Direct-to-Chip Cooling in HPC Infrastructure: A Deep Dive into Liquid Cooling', provides a clear overview of various cooling technologies and the power levels they support. The top x-axis represents rack power draw, while the horizontal bars in the center categorize the different cooling methods. Each method is color-coded: black shows the typical supported range, orange indicates extended capabilities with modifications, and purple represents the upper operational limits of the technology.
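As a rough rule of thumb, the approximate ranges discussed in this article (traditional air around 10 kW, RDHx roughly 30 to 50+ kW, DLC 100 kW and beyond) can be collapsed into a simple lookup. The exact crossover points vary by vendor and facility design, so the thresholds below are illustrative assumptions:

```python
def suggest_cooling(rack_power_kw: float) -> str:
    """Rough cooling-method lookup based on the ranges discussed in this article."""
    if rack_power_kw <= 10:
        return "traditional air cooling (CRAC/CRAH, hot/cold aisles)"
    if rack_power_kw <= 50:
        return "rear-door heat exchanger (RDHx), possibly with rear-door fans"
    return "direct-to-chip liquid cooling (DLC)"

for kw in (6, 35, 80, 120):
    print(f"{kw:>3} kW rack -> {suggest_cooling(kw)}")
```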

The Data Center as a Chip — Cooling as a Key Enabler
In 1965, Gordon Moore published a landmark article in Electronics magazine titled "Cramming More Components onto Integrated Circuits," where he predicted that the number of transistors on a chip would double approximately every two years, a concept now famously known as Moore's Law. For decades, this held true as transistors shrank and performance scaled, thanks to shorter distances for electrons to travel.
However, as we approach the physical and economic limits of transistor miniaturization, performance improvements are now being driven by clustering multiple chips together, first within servers, then racks, and now across entire data centers. This evolution effectively turns the data center into a single, integrated computational unit. To continue the trajectory of Moore's Law at the system level, the industry has adopted a strategy called System Technology Co-Optimization (STCO).
STCO relies heavily on tightly integrated, high-density compute clusters, which require significantly more power and generate far more heat. This makes advanced cooling technologies essential: without the ability to support higher rack power densities, these performance gains are simply not achievable. In short, modern cooling is no longer just a facility concern; it's a fundamental enabler of next-generation compute performance.
References
Adiabatic cooling
The Case for Air-Cooled Data Centers
The 4 Delta T’s of Data Center Cooling: What You’re Missing
CoolShield Containment: Aisle Containment Systems
Move to a Hot Aisle/Cold Aisle Layout
Report to Congress on Server and Data Center Energy Efficiency
Differential air pressure in your data centre
Hot aisle containment (HAC) solutions used to maximize efficiency by cooling and removing the heat produced by data storage and processing equipment
A numerical investigation of fan wall cooling system for modular air-cooled data center
Rear door vs. traditional cooling for data centers
Data Center Cold Wars - Part 4. Rear Door Heat Exchanger
Direct-to-Chip Cooling: The Future Of The Data Center
Understanding direct-to-chip cooling in HPC infrastructure: A deep dive into liquid cooling
Inside the 100K GPU xAI Colossus Cluster that Supermicro Helped Build for Elon Musk
Supermicro 4U Universal GPU System for Liquid Cooled NVIDIA HGX H100 and HGX H200
Fabricated Knowledge: The Data Center is the New Compute Unit: Nvidia's Vision for System-Level Scaling
Cramming More Components onto Integrated Circuits: Moore's Law
Microsoft data centers sustainability
Energy demand from AI
Semianalysis: Multi-Datacenter Training: OpenAI’s Ambitious Plan To Beat Google’s Infrastructure
Microsoft: Modern data center cooling
Data Center Cooling Continues to Evolve for Efficiency and Density
Direct-to-Chip Cooling: Everything Data Center Operators Should Know
Microfluidics: Cooling inside the chip
How Data Centers Use Water, and How We’re Working to Use Water Responsibly