The Nvidia GTC Fall 2021 virtual event I attended last week highlighted some exciting developments in the field of AI and machine learning, most notably, in new applications for the metaverse. A metaverse is a digital universe created by the convergence of the real world and a virtual world abstracted from virtual reality, augmented reality, and other 3D visual projections.

Several leading Cloud service providers recently laid out their visions of the metaverse. Facebook, which changed its name to Meta to align its focus on the metaverse, envisions people working, traveling, and socializing in virtual worlds. Microsoft already offers holograms and mixed-reality on its Microsoft Mesh platform and announced plans to bring holograms and virtual avatars to Microsoft Teams next year. Tencent recently shared its metaverse plan to leverage its strengths in multiplayer gaming on its social media platform.

In order to recreate an accurate virtual representation of the real world, massive amounts of AI training data would need to be acquired, captured, and processed. This would stretch the limits of the compute infrastructure. During GTC, Nvidia highlighted various solutions in three areas that could help pave the way for the proliferation of the metaverse in the near future:

  • Compute Architecture: During the Q&A session, I asked Nvidia CEO Jensen Huang how the data center would need to evolve to meet the needs of the metaverse. Jensen emphasized that computer vision, graphics, and physics simulation would need to converge in a coherent architecture and be scaled out to millions of people. In a sense, this would be a new type of computer, a fusion of various disciplines with the data center as the new unit of computing. In my view, such an architecture would be composed of a large cluster of accelerated servers with multiple GPUs within a network of tightly coupled, general-purpose servers. The servers would run applications and store massive amounts of data. Memory-coherent interfaces, such as CXL, NVLink, or their future iterations, offered on x86- and ARM-based platforms, would enable memory sharing across racks and pods. These interfaces would also improve connectivity between CPUs and GPUs, reducing system bottlenecks.
  • Network Architecture: As the unit of computing continues to scale, new network architectures will need to be developed. During GTC, Nvidia introduced Quantum-2, a networking solution composed of 400 Gbps InfiniBand and a Bluefield-3 DPU (data processing unit) Smart NIC. This combination will enable high-throughput, low-latency networking in a dense and tightly coupled cluster scaling up to the one million nodes needed for metaverse applications. 400 Gbps is the fastest server access speed available today. It could double to 800 Gbps in several years. The ARM processor in the Bluefield DPU could directly access the network interface, bypassing the CPU and benefiting time-sensitive AI workloads. Furthermore, we can expect that these scaled-out computing clusters would be shared across multiple users. With a Smart NIC, such as the Bluefield DPU, layer isolation could be provided among users, thereby enhancing security.
  • Omniverse: The compute and network infrastructure could only be effectively utilized with a solid software development platform and ecosystem in place. Nvidia’s Omniverse provides the platform to enable developers and enterprises to create and connect virtual worlds for various use cases. During GTC, Jensen described how the Omniverse could be applied to build a digital twin of an automotive factory, with the manufacturing process simulated and optimized by AI. This twin would later serve as the blueprint for the physical construct. Potential applications range from education to healthcare, retail, and beyond.

We are still in the initial developmental stages of the metaverse; the technology building blocks and ecosystem are still coming together. Furthermore, as we have seen recently with certain social media platforms and the gaming industry, new regulations could emerge to reset the boundaries between the real and virtual worlds. Nevertheless, I believe that the metaverse has the potential to unlock new use cases for both consumers and enterprises and drive investments in data center infrastructure in the Cloud and Enterprise. To access the full Data Center Capex report, please contact us at dgsales@delloro.com.

Dell’Oro Group projects that spending on accelerated compute servers targeted at artificial intelligence (AI) workloads will grow at a double-digit rate over the next five years, outpacing other data center infrastructure. An accelerated compute server equipped with accelerators such as a GPU, FPGA, or custom ASIC can generally handle AI workloads with much greater efficiency than a general-purpose server (one without accelerators). Numerically speaking, deployment of these servers still represents only a fraction of Cloud service providers’ overall server footprint. Yet, at ten or more times the cost of a general-purpose server, accelerated compute servers are becoming a substantial portion of data center capex.

Tier 1 Cloud service providers are increasing their spending on new infrastructure tailored for AI workloads. In Facebook’s 3Q21 earnings call, the company announced its plans to increase capex by more than 50% in 2022. Investments will be driven by AI and machine learning to improve ranking and recommendations across Facebook’s platform. In the longer term, as the company shifts its business model to the metaverse, capex investments will be driven by video and compute-intensive applications such as AR and VR. At the same time, other Tier 1 Cloud service providers, such as Amazon, Google, and Microsoft, also aim to increase spending on AI-focused infrastructure to enable their enterprise customers to deploy applications with enhanced intelligence and automation.

It has been a year since my last blog on AI data center infrastructure. Since that time, new architectures and solutions have emerged that could pave the way for the further proliferation of AI in the data center. Following are three innovations I’ll be watching closely:

New CPU Architectures

Intel is scheduled to launch its next-generation Sapphire Rapids processor next year. With its AMX (Advanced Matrix Extensions) instruction set, Sapphire Rapids is optimized for AI and ML workloads. CXL, which will be offered with Sapphire Rapids for the first time, will establish a memory-coherent, high-speed link over the PCIe Gen 5 interface between the host CPU and accelerators. This, in turn, will reduce system bottlenecks by enabling lower latencies and more efficient sharing of resources across devices. AMD will likely follow on the heels of Intel and offer CXL on EPYC Genoa. For ARM, competing coherent interfaces will also be offered, such as CCIX with Ampere’s Altra processor and NVLink on Nvidia’s upcoming Grace processor.

Faster Networks and Server Connectivity

AI applications are bandwidth hungry. For this reason, the fastest networks available would need to be deployed to connect host servers to accelerated servers to facilitate the movement of large volumes of unstructured data and training models (a) between the host CPU and accelerators, and (b) among accelerators in a high-performance computing cluster. Some Tier 1 Cloud service providers are deploying 400 Gbps Ethernet networks and beyond. The network interface card (NIC) must also evolve to ensure that server connectivity is not inhibited as data sets become larger. 100 Gbps NICs have been the standard server access speed for most accelerated compute servers. Most recently, however, 200 Gbps NICs are increasingly used with these high-end workloads, especially by Tier 1 Cloud service providers. Some vendors have added an additional layer of performance by integrating accelerated compute servers with Smart NICs or Data Processing Units (DPUs). For instance, Nvidia’s DGX system could be configured with two Bluefield-2 DPUs to facilitate packet processing of large datasets and provide multi-tenant isolation.

Rack Infrastructure

Accelerated compute servers, generally equipped with four or more GPUs, tend to be power hungry. For example, an Nvidia DGX system with 8 A100 GPUs has a maximum system power usage rated at 6.5kW. Extra consideration would be needed to ensure efficient thermal management. Today, air-based, thermal management infrastructure is predominantly used. However, as rack power densities are on the rise to support accelerated computing hardware, air-cooling efficiencies and limits are being reached. Novel liquid-based, thermal management solutions, including immersion cooling, are under development to further enhance the thermal efficiencies of accelerated compute servers.
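To make the rack-density point concrete, the sketch below estimates how many accelerated systems fit within a given rack power budget. Only the 6.5 kW figure comes from the text above; the rack budgets of 10, 20, and 40 kW and the function name `systems_per_rack` are hypothetical values chosen for illustration, and the calculation ignores switches, cooling overhead, and power headroom.

```python
import math

# Maximum system power for an 8-GPU DGX A100, as quoted in the text (kW).
DGX_A100_MAX_KW = 6.5


def systems_per_rack(rack_budget_kw: float, system_kw: float = DGX_A100_MAX_KW) -> int:
    """Return how many whole systems fit within a rack's power budget."""
    return math.floor(rack_budget_kw / system_kw)


# Hypothetical rack power budgets (kW), not taken from the article.
for budget_kw in (10, 20, 40):
    count = systems_per_rack(budget_kw)
    print(f"{budget_kw} kW rack -> {count} system(s), {count * DGX_A100_MAX_KW:.1f} kW drawn")
```

Under these assumptions, a rack provisioned for 10 kW holds a single system, which is one way to see why higher rack power densities and the cooling to match them become a limiting factor.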

These technology trends will continue to evolve and drive the commercialization of specialized hardware for AI applications. Please stay tuned for more updates from the upcoming Data Center Capex reports.

The data center industry is estimated to have consumed 205 terawatt-hours (TWh) or ~1% of the world’s energy consumption in 2018. Other industry estimates peg that rate higher at up to ~2%. Despite these different estimates, one thing is clear: the decade-old fear of runaway growth in data center energy consumption has proved to be unfounded. Hyperscale cloud service providers (CSPs) have largely managed that concern, with the help of industry vendors, through IT virtualization and higher utilization of power and cooling infrastructure. At the same time, enterprise data center operations, while historically less efficient, have transitioned to CSPs.

However, these estimates were calculated before the global COVID-19 pandemic, which saw the world embrace virtual collaboration, remote learning, and accelerated automation through artificial intelligence (AI) and machine learning (ML). While these trends materialized throughout 2020, rendering the industry (barely) able to meet demand, questions resurfaced about managing future energy consumption. For this reason, data center sustainability has become the most pressing issue in the data center industry, one in which data center physical infrastructure vendors believe they can play a critical role.

As part of Dell’Oro Group’s upcoming Data Center Physical Infrastructure program, we will focus on technologies that enable sustainable data center growth. That’s why data center thermal management, which consumes 30% to 40% of a data center’s annual energy consumption, second only to compute, is the logical starting place. Today, air-based, thermal management infrastructure is predominantly used. However, as rack power densities are on the rise to support accelerated computing hardware (such as GPUs and FPGAs), air-cooling efficiencies and limits are being reached. Liquids are a much more effective and efficient medium for transferring heat. For this reason, the data center industry is exploring different ways to safely bring liquids into the data center.

That’s why, when I had an opportunity to see CGG’s High Performance Compute Center, I experienced a level of nervousness and excitement that I haven’t felt in some time prior to touring a data center. This was the first time I have been inside a liquid immersion-cooled facility, supported by Green Revolution Cooling’s (GRC) infrastructure. GRC is a known leader in immersion-cooling technology, in addition to Asperitas, Submer, and other vendors. Visiting my first immersion-cooled facility felt more like a trip to Mars than the type of data center I’ve spent my entire career getting to know.

Although the data center industry treats liquid cooling as though its use for computing is new, it has actually been around for decades. It dates back to the 1990s, when it was used to cool IBM mainframes. Immersion cooling seeks to solve a similar problem today – removing heat directly at the source – but through a different method. A coolant distribution unit (CDU) is used to pump a liquid – usually some kind of mineral oil – to a rack manifold, where it fills the rack (sometimes referred to as a vat or tank) and circulates through it. Servers, which require some modification, are then vertically immersed in the liquid to capture and remove 100% of the generated heat. Right now, the big question being asked by the data center industry is: how different does immersion cooling make my data center?

CGG Doubles Compute Capacity with Immersion Cooling

Walking into the CGG High Performance Compute Center, any notion that I was headed to Mars was quickly dispelled. It looked like a conventional data center with a raised floor and traditional infrastructure, from the UPS down to the rack power distribution units (rPDU). The big difference was the horizontal immersion racks as opposed to vertical ones. As I observed the room, I quickly noticed how quiet it was. CDU pumps produced the only noise. Things were quiet enough to have a conversation with the person standing next to me. The horizontal immersion racks created an open feeling, allowing me to see around the entire room.

However, a friendlier operating environment isn’t what drove CGG to adopt immersion cooling. The company had reached its limits of space, power, and cooling. In order to expand computing capacity, CGG needed more space and power or a new thermal-management solution. And the new thermal management solution – immersion cooling – did not disappoint. In the same floor space and power footprint, CGG was able to double its computing capacity. Additionally, a significant portion of the existing infrastructure was utilized, while deploying immersion racks in scalable, 100 kW cooling-capacity increments. As a result, CGG had no downtime and only limited capital expenditures (CAPEX) during the transition to immersion cooling.

These benefits aren’t unique to CGG’s deployment of immersion cooling. In fact, they can be achieved by many players in the data center industry struggling with space, power, or cooling constraints. To quantify the benefits, CAPEX for construction of a new immersion-cooled data center relative to a traditional air-cooled build can be reduced by 20%. This is the result of eliminating certain infrastructure, such as chillers or air handlers, in addition to smaller-sized electrical infrastructure, such as UPSs, switch gears, and power distribution.

The case for immersion cooling becomes even more compelling when considering operational expenditures (OPEX). Immersion-cooling systems use less power as a result of removing server fans, air handling units, and chilled water systems. Lower-power consumption for thermal management means reduced annual energy costs. Additionally, with fewer moving parts in an immersion-cooling solution, maintenance costs are also reduced. In total, immersion cooling OPEX costs can decrease by up to 33% compared to traditional air-cooled data center builds. From a total cost of ownership (TCO) perspective over the 10-year life of a data center, it’s achievable for an immersion-cooled data center to cost half as much as a traditional air-cooled build.

Immersion Cooling Brings Small Changes to Data Center Operations

So, what’s the catch? The human element of operations in the mission-critical data center industry can’t be overlooked. Data center uptime is measured by the number of nines (e.g., 99.9% vs. 99.9999% uptime), as downtime can translate into hundreds of thousands of dollars – or even millions – in lost revenue. Historically, this has led to slow adoption of new technologies. Early adopters are often driven by need, as is the case with liquid cooling for HPC. But, with increased adoption of accelerated compute, many other companies are already struggling or are expected to struggle with the limits of air-cooling in the near future.
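The gap between "three nines" and "six nines" is easier to appreciate as allowed downtime per year. The following is a minimal sketch of that conversion; the function name `annual_downtime_minutes` and the 365-day year are assumptions made for illustration.

```python
# Minutes in a year, assuming a 365-day year (an illustrative simplification).
MINUTES_PER_YEAR = 365 * 24 * 60


def annual_downtime_minutes(uptime_percent: float) -> float:
    """Return the minutes of downtime per year permitted at a given uptime percentage."""
    unavailable_fraction = 1.0 - uptime_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


for uptime in (99.9, 99.99, 99.999, 99.9999):
    print(f"{uptime}% uptime -> {annual_downtime_minutes(uptime):.2f} minutes/year")
```

At 99.9% uptime this works out to roughly 8.8 hours of downtime per year, while 99.9999% leaves only about half a minute, which helps explain why operators are cautious about any change to proven infrastructure.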

In my visit to CGG’s High Performance Compute Center, I was most eager to learn about the “quirks” of immersion cooling. The biggest difference from air-cooled builds is in server maintenance. Servers have to be pulled out of the oil by hand or using a small, overhead lift. They can then be laid across the tank while work is performed, either immediately or after a short period of drip drying. After maintenance is complete, the server is simply immersed back into the rack.

Other operational differences that data center owners and operators must consider are:

  • Containment of the oil in which servers are immersed is top of mind. For CGG, this didn’t appear to be a problem. Different combinations of rack and row and room containment are used to manage any dripping when removing servers. It’s definitely handy to keep a roll of oil-absorbent towels around but no major spills have occurred.
  • Stickers imprinted with a server’s serial number can come loose during immersion. This seemed to be the biggest potential headache. If a sticker comes loose, it doesn’t cause any damage to the immersion cooling system due to the filtration system. However, it’s possible for a missing sticker to impact asset management. Some immersion-ready servers already utilize a pull-tag system. This eliminates the issue. Development of oil-resistant stickers is also being explored.
  • Cable management isn’t more complex for immersion cooling, just different. CGG utilizes multiple generations of GRC immersion racks, which reflect the evolution of rPDU and network switch placement. They have moved between dry space in the rack and mounted on the back of the tank. GRC’s latest immersion-cooling product, the ICEraQ 10, utilizes dry space in the top-rear of the rack for rPDUs with networking switches mounted on the front behind a panel.
  • Lastly, beware of crickets. It turns out that crickets have a taste for the particular immersion oil GRC uses, so an open bay door may lead to an extra visitor. Just like a loose serial number sticker, there is no threat of damage – just an unexpected find when opening the rack lid.

Immersion Cooling Answers the Call for Sustainable Data Centers of the Future

The engineered benefits of immersion cooling can’t be denied – higher utilization of space and power, while achieving lower CAPEX and OPEX relative to a traditional air-cooled facility. However, I didn’t need to visit an immersion-cooled facility to understand the cost savings. My biggest takeaway was the correction of my misconception that an immersion-cooled data center would be dramatically different from an air-cooled facility. It was familiar, like other data centers I have toured. The only difference in physical infrastructure was the rack itself. IT infrastructure is mounted vertically, as opposed to horizontally. Immersion-ready servers are available today, with expanding partnerships between chip, server, and immersion vendors working on the next generation of compute. A few operational differences do need to be planned for but, to my surprise, the necessary adjustments are relatively minor. So can immersion cooling be a part of the solution that supports sustainable data centers of the future? After my visit to CGG’s High Performance Compute Center, I believe it just might be.

This November, Dell’Oro Group will launch a new Data Center Physical Infrastructure subscription program. As the program’s lead analyst, I will dig deeper into the market outlook, growth drivers, and the competitive landscape of the data center physical infrastructure market. I will quantify industry trends and developments, providing a timely, accurate, and detailed analysis. To learn more about Dell’Oro Group’s new Data Center Physical Infrastructure program, please contact us at dgsales@delloro.com.

Dell’Oro published an update to the Ethernet Controller & Adapter 5-Year Forecast report, July 2021. Revenue for the worldwide Ethernet controller and adapter market is projected to increase at a 4% compound annual growth rate (CAGR) from 2020 to 2025, reaching nearly $3.2 billion. The increase is partly driven by the migration to server access speed of 100 Gbps and higher.
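As a rough check on what a 4% CAGR implies, the sketch below back-solves the 2020 starting point from the 2025 figure. It treats "nearly $3.2 billion" as exactly 3.2, so the implied base is an illustrative approximation and not a number from the report; the function names are likewise hypothetical.

```python
def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Back out the starting value implied by an end value and a compound annual growth rate."""
    return end_value / (1.0 + cagr) ** years


def project(start_value: float, cagr: float, years: int) -> list[float]:
    """Return the compounded value for each year from year 0 through the final year."""
    return [start_value * (1.0 + cagr) ** year for year in range(years + 1)]


# Inputs from the text: about $3.2 billion in 2025 at a 4% CAGR over 2020-2025.
base_2020 = implied_start_value(end_value=3.2, cagr=0.04, years=5)
print(f"Implied 2020 revenue: ${base_2020:.2f} billion")

for year, value in zip(range(2020, 2026), project(base_2020, 0.04, 5)):
    print(f"{year}: ${value:.2f} billion")
```

Under that assumption the implied 2020 base is a little over $2.6 billion, meaning the market adds roughly $0.6 billion over the forecast period.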

The ramp of 25 Gbps port shipments has been strong since the availability of 28 Gbps SerDes in 2016. 25 Gbps has already displaced 10 Gbps to become the dominant speed in revenue, as 25 Gbps gains broad adoption across Cloud service providers (SPs) and high-end enterprises. However, we project that 100 and 200 Gbps ports will overtake 25 Gbps ports in revenue as early as 2023.

We identify the market and technology drivers below that are likely to drive the adoption of next-generation server connectivity based on 100 Gbps and beyond:

  • 50 Gbps ports, based on two 28 Gbps SerDes lanes, have entered mainstream deployment at some of the major Cloud SPs. However, with the exponential growth of network traffic and proliferation of cloud computing, the Top 4 US Cloud SPs are demanding even higher server access speeds than the rest of the market. The availability of 56 Gbps SerDes since late 2018 has prompted some of the Top 4 US Cloud SPs to upgrade their networks to 400 Gbps, with upgrades in server network connectivity to 100 Gbps for general-purpose computing in progress.
  • Higher server access speeds of up to 200 Gbps, based on two lanes of 112 Gbps SerDes, could begin to ramp for general-purpose computing for the Top 4 US Cloud SPs following network upgrades to 800 Gbps as early as 2022.
  • The increase in demand for bandwidth-hungry AI applications will continue to push the boundaries of server connectivity. Today, 100 Gbps is commonly used to interconnect accelerated servers, while general-purpose servers are connected at 25 or 50 Gbps. As 100 Gbps becomes the standard connection for general-purpose servers at the major Cloud SPs in several years, accelerated servers may be connected at twice that data rate, at 200 Gbps.
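The relationship between SerDes generation, lane count, and port speed in the points above can be sketched as a simple lookup. The mapping of 28, 56, and 112 Gbps SerDes to nominal 25, 50, and 100 Gbps lanes reflects the usual industry pairing and is stated here as an assumption; the names `NOMINAL_LANE_GBPS` and `port_speed_gbps` are hypothetical.

```python
# Nominal Ethernet data rate carried per lane for each SerDes class, in Gbps.
# Assumption: the common pairing of 28G/56G/112G-class SerDes with 25/50/100 Gbps lanes;
# the difference between the two figures is signaling and encoding overhead.
NOMINAL_LANE_GBPS = {28: 25, 56: 50, 112: 100}


def port_speed_gbps(serdes_gbps: int, lanes: int) -> int:
    """Return the nominal port speed for a number of lanes of a given SerDes class."""
    return NOMINAL_LANE_GBPS[serdes_gbps] * lanes


# Configurations mentioned in the text, plus a two-lane 56G case for comparison.
for serdes, lanes in ((28, 1), (28, 2), (56, 2), (112, 2)):
    print(f"{lanes} x {serdes}G SerDes -> {port_speed_gbps(serdes, lanes)} Gbps port")
```

Seen this way, each SerDes generation doubles the per-lane rate, so the same two-lane adapter design moves from 50 to 100 to 200 Gbps as the underlying SerDes advances.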

To learn more about the Ethernet Controller and Adapter market, or if you need to access the full report, please contact us at dgsales@delloro.com.

About the Report

The Dell’Oro Group Ethernet Controller and Adapter 5-Year Forecast Report provides a complete, in-depth analysis of the market with tables covering manufacturers’ revenue; average selling prices; and unit and port shipments by speed (1 Gbps, 10 Gbps, 25 Gbps, 40 Gbps, 50 Gbps, and 100 Gbps) for Ethernet and Fibre Channel Over Ethernet (FCoE) controllers and adapters. The report also covers Smart NIC and InfiniBand controllers and adapters. To purchase this report, please contact us at dgsales@delloro.com.

Dell’Oro published an update to the Data Center Capex 5-Year Forecast report in July 2021. Server spending is forecast to grow at a compound annual growth rate of 11 percent over five years, comprising nearly half of data center capex by 2025.

The pandemic resulted in strong demand for computing and digital technologies due to a shift in enterprise and consumer behaviors. Current semiconductor foundry capacity is not adequate to meet the recent surge in global demand. The cost of servers and other data center equipment is projected to rise sharply in the near term partly due to the global semiconductor shortages. An increase of server average selling prices (ASPs) could approach the double-digit level that was observed in 2018, which was another period of tight supply and high demand. However, in the longer term, we anticipate that supply and demand dynamics could reach equilibrium and that technology transitions could drive market growth. We identify the following technology trends that shape our five-year forecast:

  • CPU Refresh Cycles: Intel and AMD both have an aggressive roadmap to introduce new platform refreshes as the processor race heats up. Both the Intel Sapphire Rapids and AMD EPYC Genoa, expected in 2022, will pack more processor cores and memory channels, and support the latest interfaces such as CXL, DDR5, and PCIe Gen 5 that could enable denser server form-factors and new architectures.
  • Accelerated Computing: A new class of accelerated servers densely packed with co-processors that are optimized for application-specific workloads, such as artificial intelligence and machine learning, is emerging. Some Cloud service providers, such as Amazon and Google, have deployed accelerated servers using internally developed AI chips, while other Cloud service providers and enterprises have commonly deployed solutions based on GPUs and FPGAs. We estimate that the attach rate of servers with accelerators will grow to 13 percent by 2025.
  • Edge Computing: Certain applications—such as cloud gaming, autonomous driving, and industrial automation—are latency-sensitive, requiring Multi-Access Edge Compute, or MEC, nodes to be situated at the network edge, where sensors are located. Unlike cloud computing, which has been replacing enterprise data centers, edge computing creates new market opportunities for novel use cases.

With the evolution of CPU platforms along with the proliferation of accelerated computing, we anticipate data centers will be better optimized to process application-specific workloads with fewer, but more powerful and denser servers, increasing the total available market through higher server ASPs. Edge computing, on the other hand, will increase the available market with greenfield deployment of servers distributed across edge locations. To access the full Data Center Capex report, please contact us at dgsales@delloro.com.

About the Report

Dell’Oro Group’s Data Center Capex 5-Year Forecast Report details the data center infrastructure capital expenditures of each of the ten largest Cloud service providers, as well as the Rest-of-Cloud, Telco, and Enterprise customer segments. Allocation of the data center infrastructure capex for servers, storage systems, and other auxiliary data center equipment is provided. The report also discusses the market and technology trends that can shape the forecast.