The rapid rise of artificial intelligence (AI), driven by advances in generative AI such as ChatGPT, has placed unprecedented demands on data centers. In a recent article titled "How Data Centers Can Keep Up with the AI Boom," Ben Selier, Vice President of Secure Power for Anglophone Africa at Schneider Electric, highlights the challenges posed by AI's immense power and resource needs and outlines actionable strategies for data centers to stay ahead.
Power
AI workloads are significantly more power-intensive than traditional data center operations. A single ChatGPT query, for instance, requires nearly 10 times the electricity of a Google search, and data center power demand is projected to grow 130% by 2030. Meeting these demands necessitates upgrades to power distribution systems.
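To put the per-query figure in perspective, here is a minimal back-of-the-envelope sketch in Python. The baseline of roughly 0.3 Wh per conventional web search and the daily query volume are illustrative assumptions, not figures from the article; only the 10x multiplier comes from the source.

```python
# Back-of-the-envelope energy comparison (illustrative assumptions only).
SEARCH_WH = 0.3          # assumed energy per conventional web search, in Wh
AI_FACTOR = 10           # per the article: an AI query uses ~10x the electricity
AI_QUERY_WH = SEARCH_WH * AI_FACTOR

queries_per_day = 100_000_000  # hypothetical daily query volume
daily_kwh = queries_per_day * AI_QUERY_WH / 1000

print(f"Energy per AI query: {AI_QUERY_WH} Wh")
print(f"Daily energy at {queries_per_day:,} queries: {daily_kwh:,.0f} kWh")
```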
“Off-the-shelf rack power distribution units (rPDUs) offer cost-effective solutions but are limited to 43.5 kW per unit,” explains Selier. High-density AI clusters often require custom solutions or multiple PDUs to meet their capacity needs. Liquid-cooled racks, which outperform air-cooled options, can handle up to 87 kW of density but require careful design considerations to optimize power delivery.
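As a rough illustration of the sizing problem Selier describes, the sketch below estimates how many off-the-shelf rPDUs a rack needs at a given density, using the 43.5 kW per-unit limit from the article. The A/B redundancy factor is an assumption for illustration, not a Schneider Electric specification.

```python
import math

RPDU_CAPACITY_KW = 43.5  # per-unit limit of off-the-shelf rPDUs (from the article)

def rpdus_needed(rack_density_kw: float, redundancy: int = 2) -> int:
    """Estimate rPDU count for a rack, with an assumed A/B redundancy factor."""
    per_feed = math.ceil(rack_density_kw / RPDU_CAPACITY_KW)
    return per_feed * redundancy

# An 87 kW liquid-cooled rack (the article's upper figure) needs multiple units:
print(rpdus_needed(87))   # -> 4 (2 per feed, doubled for A/B redundancy)
```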
Cooling
The intense computational processes behind AI workloads generate significant heat, making advanced cooling systems essential. While air cooling can manage smaller AI clusters with densities up to 20 kW, liquid cooling is increasingly necessary for workloads exceeding this threshold.
“Liquid cooling provides multiple benefits, including enhanced processor reliability, energy efficiency, and reduced water consumption,” Selier notes. However, implementing liquid cooling systems can be complex, particularly for operators unfamiliar with the technology. Expert guidance is crucial for designing hybrid solutions that balance liquid and air cooling to meet specific operational requirements.
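A minimal sketch of the decision logic implied here, using the article's roughly 20 kW air-cooling threshold; the hybrid split shown is an illustrative assumption, not a Schneider Electric design rule.

```python
AIR_COOLING_LIMIT_KW = 20.0  # article's threshold for air-cooled AI clusters

def cooling_plan(rack_density_kw: float) -> str:
    """Suggest a cooling approach for a rack based on its power density."""
    if rack_density_kw <= AIR_COOLING_LIMIT_KW:
        return "air cooling"
    # Above the threshold, liquid carries the bulk of the load; in a hybrid
    # design some residual heat (an assumption here) is still removed by air.
    return "liquid cooling, with air handling residual heat"

for density in (10, 20, 45, 87):
    print(f"{density:>3} kW rack -> {cooling_plan(density)}")
```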
Racks
AI workloads demand robust racks capable of handling greater power, weight, and thermal requirements. The latest AI server platforms require larger, heavier racks with advanced features such as liquid cooling manifolds and high-speed fiber connections.
“To accommodate these needs, racks should have dimensions of at least 750 mm wide, 1,200 mm deep, and 48U high, with weight capacities exceeding 1,800 kg,” says Selier. Redesigning whitespace to support higher-density and heavier racks is critical for maximizing operational efficiency in limited spaces.
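To make the specification concrete, here is a small validation sketch built around the minimums Selier lists; the RackSpec structure and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass

# Minimums quoted in the article for AI-ready racks.
MIN_WIDTH_MM, MIN_DEPTH_MM, MIN_HEIGHT_U, MIN_LOAD_KG = 750, 1200, 48, 1800

@dataclass
class RackSpec:  # hypothetical data structure for illustration
    width_mm: int
    depth_mm: int
    height_u: int
    load_kg: int

def is_ai_ready(rack: RackSpec) -> bool:
    """Check a rack against the article's minimum dimensions and load rating."""
    return (rack.width_mm >= MIN_WIDTH_MM and rack.depth_mm >= MIN_DEPTH_MM
            and rack.height_u >= MIN_HEIGHT_U and rack.load_kg >= MIN_LOAD_KG)

print(is_ai_ready(RackSpec(600, 1070, 42, 1000)))   # typical legacy rack -> False
print(is_ai_ready(RackSpec(750, 1200, 48, 1800)))   # AI-ready rack -> True
```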
Software Management
The complexity of managing AI clusters and hybrid cooling systems underscores the importance of advanced software tools. Data Center Infrastructure Management (DCIM), Electrical Power Management Systems (EPMS), and Building Management Systems (BMS) are indispensable for optimizing operations and minimizing disruptions.
“Software tools can create a ‘digital twin’ of the data center, enabling operators to identify power and cooling constraints and make informed layout decisions,” Selier further explains. As AI power consumption is projected to rise 316% by 2028, these tools are essential for ensuring seamless integration and operation of high-density clusters.
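At its simplest, the "digital twin" idea reduces to modelling a hall's power and cooling constraints and checking proposed layouts against them. The sketch below is a toy illustration of that workflow, not DCIM, EPMS, or BMS software; all names and budget figures are assumptions.

```python
# Toy "digital twin" capacity check: all figures and names are illustrative.
ROW_POWER_BUDGET_KW = 300.0    # assumed power available to one row
ROW_COOLING_BUDGET_KW = 300.0  # assumed heat-rejection capacity for the row

def fits_in_row(rack_loads_kw: list[float]) -> bool:
    """Check whether a proposed rack layout stays within row constraints."""
    total = sum(rack_loads_kw)
    return total <= ROW_POWER_BUDGET_KW and total <= ROW_COOLING_BUDGET_KW

proposed_row = [87, 87, 87, 20]  # three liquid-cooled AI racks plus one air-cooled
print(fits_in_row(proposed_row))  # -> True (281 kW, within the assumed budgets)
```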
To meet these challenges, Schneider Electric has partnered with NVIDIA to provide reference designs for deploying high-density AI clusters like DGX SuperPODs. “These designs offer clear direction and technical specifications for retrofitting or building purpose-designed systems, empowering data centers to thrive in the AI era,” the Schneider Electric executive concludes.
With AI reshaping the digital landscape, proactive investment in power, cooling, racks, and software management is key to ensuring data centers remain resilient and efficient in meeting future demands.