The Outsized Influence of AI on the Data Center Industry
In November 2022, OpenAI sent shockwaves through the IT and business communities with the release of ChatGPT, one of the first generative AI solutions available to the general public. With more than 100 million weekly users in November 2023, ChatGPT has become one of the fastest-growing applications in history.
What’s driving this massive interest in generative AI tools like ChatGPT? The ecosystem of use cases and benefits for this technology is vast: it has the power to redefine how work is done, streamline workflows, and improve processes and operations for organizations across virtually every industry. This promise of immense benefit and deep disruption is poised to have a significant impact on the data center industry.
To better understand how AI workloads and applications could drive changes in physical data centers, we spoke with Steve Conner, the chief technology officer of North America at Vantage Data Centers.
Data Centers Today (DCT): What requirements do AI applications have that traditional workloads don’t? How have these requirements influenced the data center?
Steve Conner: AI workloads have numerous unique requirements that affect the racks, rows and even the networking within the data center. First, AI workloads need very low-latency networks, often necessitating InfiniBand, so the distributed algorithms behind these applications can exchange data quickly enough.
In addition to network constraints, the GPUs often used for AI applications significantly increase power density, pushing each rack to approximately 50kW or more.
Currently, AI racks using modern GPUs require between 27kW and 48kW. This power density is likely to increase in the future, which will have a major impact on how we organize and cool the data module. We can continue to cool a standard data module used for AI workloads at these densities with air, but only if the layout undergoes significant changes.
Combining these dense racks with the tight, low-latency networks that AI applications require, alongside the limitations of air cooling, makes the AI data module quite different from traditional cloud data modules. Instead of 20 rows of 24 racks in a 4MW room for traditional cloud workloads, AI layouts might have as few as nine rows of eight to 10 racks, all grouped together.
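To make the contrast concrete, here is a rough back-of-the-envelope calculation based on the figures quoted above. Spreading the 4MW room budget evenly across every rack is a simplifying assumption for illustration, not a Vantage design rule.

```python
# Illustrative arithmetic only: the 4 MW room, the 20-row-by-24-rack traditional
# layout and the 9-row-by-10-rack AI layout are the figures quoted above; spreading
# the room's power evenly across racks is a simplifying assumption.
TRADITIONAL = {"rows": 20, "racks_per_row": 24}
AI_LAYOUT = {"rows": 9, "racks_per_row": 10}
ROOM_POWER_KW = 4000  # 4 MW data module

def implied_density_kw(layout: dict, room_power_kw: float) -> float:
    """Average kW per rack if the room's power budget is spread evenly."""
    racks = layout["rows"] * layout["racks_per_row"]
    return room_power_kw / racks

for name, layout in (("Traditional cloud", TRADITIONAL), ("AI", AI_LAYOUT)):
    racks = layout["rows"] * layout["racks_per_row"]
    print(f"{name}: {racks} racks, ~{implied_density_kw(layout, ROOM_POWER_KW):.1f} kW per rack")
```

Under those assumptions the traditional layout works out to roughly 8kW per rack, while the AI layout lands in the mid-40s kW per rack, consistent with the 27kW to 48kW range Conner cites.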
DCT: Are traditional “production” data centers suitable for running AI applications? Will the design of data centers change as AI evolves and grows in adoption?
Steve Conner: Data centers designed for hyperscale customers can meet the needs of today’s AI applications and workloads at current rack densities. But as rack densities continue to rise, these production data centers will need modifications. Assuming AI continues to evolve as expected, it would be logical to design and build data centers specifically for AI.
There are several factors that data center designers should consider for AI-specific applications and workloads. Cooling is the primary consideration.
Inside the data center, designers must determine how to incorporate liquid cooling. Liquid cooling introduces new devices, changing the geometry of the data module. Additionally, data center operators must consider aspects like distributed plumbing, fluid dynamics in the data modules, potential leaks and their overall impact on operations.
Electrically, we might need to shrink the rooms while increasing their density. As liquid cooling becomes the main cooling mechanism, we can abandon row designs that optimize airflow in favor of arrangements that prioritize AI-specific requirements, such as network latency.
Finally, data center designers will need to account for denser racks and for the increased loads from the advanced network cabling these AI applications and workloads require.
These factors are just the beginning. AI workloads are distinctive and will require data center operators to innovate, offering an exciting opportunity to collaborate closely with customers to evolve the standard data center model.
DCT: You mention liquid cooling. Will that become the norm? If so, what kind of liquid cooling is likely to be most widely adopted?
Steve Conner: Air cooling is still effective in today’s data centers, even for AI applications and workloads. However, as discussed, power densities climbing to 60kW, 70kW or more per rack will necessitate the adoption of liquid cooling. Many large cloud companies and hyperscalers are actively designing liquid cooling solutions.
The current trend, and likely future direction, for liquid cooling in data centers is rack-based liquid cooling, specifically using Coolant Distribution Units (CDUs) with liquid-to-chip technology. This deployment method is becoming more common and seems to be the preferred choice for future liquid cooling implementations.
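As a rough illustration of what liquid-to-chip sizing involves, the sketch below estimates the coolant flow a CDU loop would need per rack from the basic heat-balance relation Q = m_dot x cp x delta_T. The water properties, 10K temperature rise and 50kW rack figure are assumed example values, not Vantage or vendor specifications.

```python
# Back-of-the-envelope coolant flow for liquid-to-chip cooling: Q = m_dot * cp * dT.
# Assumes plain water properties and an example 10 K coolant temperature rise;
# real CDU loops often run water-glycol mixes with different properties.
CP_WATER = 4186.0   # J/(kg*K), specific heat of water
RHO_WATER = 1000.0  # kg/m^3, density of water

def coolant_flow_lpm(rack_heat_kw: float, delta_t_k: float) -> float:
    """Litres per minute of coolant needed to carry rack_heat_kw at a delta_t_k rise."""
    mass_flow_kg_s = rack_heat_kw * 1000.0 / (CP_WATER * delta_t_k)
    volume_flow_m3_s = mass_flow_kg_s / RHO_WATER
    return volume_flow_m3_s * 1000.0 * 60.0  # m^3/s -> L/min

print(f"50 kW rack, 10 K rise: ~{coolant_flow_lpm(50, 10):.0f} L/min per rack")
```

Roughly 70 litres per minute per 50kW rack under these assumptions, which is why the distributed plumbing and fluid-dynamics questions mentioned earlier become first-order design concerns.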
The shift to liquid-cooled environments will not completely eliminate the need for air. Some customer architecture components are not suitable for, or cannot accommodate, liquid cooling. Data center designers will be tasked with finding the right air-to-liquid ratio and integrating that into their cooling strategies.
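A simple way to picture the air-to-liquid ratio question is to ask how much heat is left for the room’s air handling once cold plates capture a given share of a rack’s load. The capture fractions and the 80kW rack below are assumptions for illustration only.

```python
# Illustration of the air-to-liquid split: cold plates capture only part of a rack's
# heat, so the room still needs air cooling for the remainder. The 80 kW rack and the
# capture fractions are assumed example values, not measured figures.
def residual_air_load_kw(rack_kw: float, liquid_capture_fraction: float) -> float:
    """Heat (kW) that still has to be removed by room air handling."""
    return rack_kw * (1.0 - liquid_capture_fraction)

for capture in (0.70, 0.80, 0.90):
    print(f"80 kW rack, {capture:.0%} to liquid -> "
          f"{residual_air_load_kw(80, capture):.0f} kW left to air")
```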
DCT: What challenges will this pose for data center providers like Vantage? Are Vantage’s data centers already capable of supporting liquid cooling? If not, what changes are needed to enable this?
Steve Conner: Vantage has long prepared for the shift toward liquid cooling and the higher cooling demands of AI applications and workloads. We are equipped to support liquid-to-the-rack deployments, whether via CDUs or rear door heat exchangers, and we are poised to support immersion cooling.
The primary design changes we’re implementing in our data centers include integrating CDUs and adjusting to the operational impacts associated with these systems. As mentioned, these devices introduce design considerations that differ from traditional air-cooled solutions, and Vantage is actively engaging with the community to develop an optimal design that can adapt as AI power densities grow.
DCT: How does adopting AI affect the overall power consumption of the data center? Does the increased power demand in the data module present challenges? How can data center operators meet this growing power requirement?
Steve Conner: Power is a finite resource in data centers. Operators can only supply a limited amount to each data module. AI applications must operate within the power constraints of the room’s design.
To support increasing densities while maximizing space, customers will inevitably push the boundaries. However, there are strategies to extend these limits.
For instance, the community might consider transitioning to some form of liquid cooling. Data center owners may need to reevaluate the building’s geometry to allow for more generators, supporting higher densities in a single data module. Additionally, both data center operators and customers may need to rework service level agreements (SLAs) around cooling to accommodate the higher densities.
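As a rough illustration of why power is the binding constraint, the sketch below shows how the number of racks a fixed power envelope can host shrinks as density rises. The 4MW budget and the density points are assumptions for illustration, not actual module specifications.

```python
# Why power is the binding constraint: with a fixed data-module budget, every step up
# in rack density cuts the number of racks the room can host. The 4 MW budget and the
# density points are illustrative assumptions.
MODULE_POWER_KW = 4000  # example data-module power envelope

def max_racks(module_kw: float, rack_kw: float) -> int:
    """How many racks of a given density fit inside the module's power budget."""
    return int(module_kw // rack_kw)

for density in (10, 30, 50, 70, 100):
    print(f"{density:>3} kW/rack -> up to {max_racks(MODULE_POWER_KW, density)} racks")
```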
DCT: What is the future of data centers in the era of AI? Will there be dedicated AI data centers separate from “production” data centers, or will these solutions coexist within the same facilities being built today?
Steve Conner: The future is not set in stone—it depends.
If AI rack demands top out at 100kW, it’s possible that data centers currently serving cloud applications could also accommodate AI applications. However, as densities increase, combining both workloads in the same space becomes more challenging.
Should AI application demands require data center operators to significantly increase overall density per square foot, the feasibility of supporting both workloads in the same facility diminishes, necessitating different designs.
From an evolutionary perspective, I expect we will see a move toward specialized AI facilities within the next three years, particularly if AI companies can successfully monetize their solutions. The extent of this specialization will depend on the requirements for densification and the commercial success of AI applications.
To learn more about Vantage’s data centers located across North America, EMEA and APAC, click here.