[wp_tech_share]

At this year’s NVIDIA GTC, the narrative has moved decisively beyond the initial shift to accelerated computing. What stood out in 2026 is not just the continuation of that trend, but the expansion of AI infrastructure into a heterogeneous, domain-specific ecosystem.

As an analyst covering data center compute, the key takeaway is clear: the industry is entering its next phase—where optimization, not just scale, becomes the defining battleground.

 

From Retrieval to Generative—and Now to Reasoning Infrastructure

Hyperscaler workloads have evolved rapidly from retrieval-based systems toward generative AI, and now increasingly toward reasoning-driven architectures. Internal workloads such as search are being fundamentally re-architected around AI models, signaling a structural shift in how compute is deployed.

This transition continues to drive strong demand for accelerated computing. At Dell’Oro Group, we project global data center capex to exceed $1.7 trillion by 2030. These estimates could prove conservative given the scale of investment being signaled by hyperscalers, including multi-hundred-billion-dollar capex trajectories and long-term, large-scale infrastructure commitments.

 

The Emergence of LPUs: A Potential Inflection Point

LPUs (language processing units), particularly through NVIDIA’s partnership with Groq, represent one of the more strategically important developments. Their SRAM-based architecture is optimized for low latency and strong performance per watt, enabling lower cost per token for inference and reasoning workloads.

This introduces greater flexibility in infrastructure design. Different service tiers can be optimized independently, with throughput-oriented configurations for lower-cost services and latency-sensitive deployments for premium offerings. LPUs provide a mechanism to fine-tune this balance in ways that GPUs alone cannot fully achieve.
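To make that tiering flexibility concrete, the back-of-the-envelope sketch below compares cost per million tokens for a hypothetical throughput-oriented GPU node and a hypothetical latency-optimized LPU slice. Every input (power draw, token throughput, energy price, amortized capex) is an illustrative assumption, not a measured or vendor-published figure.

```python
# Back-of-the-envelope cost per million tokens for two service tiers.
# Every input (power draw, throughput, energy price, amortized capex)
# is an illustrative assumption, not a measured or published figure.

def cost_per_million_tokens(power_kw, tokens_per_sec,
                            price_per_kwh=0.08, capex_per_hour=0.0):
    """Energy plus amortized hardware cost to generate one million tokens."""
    cost_per_hour = power_kw * price_per_kwh + capex_per_hour
    tokens_per_hour = tokens_per_sec * 3600
    return cost_per_hour / tokens_per_hour * 1e6

# Hypothetical throughput-oriented GPU node vs. latency-optimized LPU slice:
gpu_node = cost_per_million_tokens(power_kw=10.0, tokens_per_sec=20_000,
                                   capex_per_hour=8.0)
lpu_slice = cost_per_million_tokens(power_kw=6.0, tokens_per_sec=15_000,
                                    capex_per_hour=5.0)
print(f"GPU node:  ${gpu_node:.3f} per 1M tokens")
print(f"LPU slice: ${lpu_slice:.3f} per 1M tokens")
```

The point is not the specific numbers but the mechanism: operators can tune each service tier’s economics independently by choosing the silicon whose power and throughput profile fits the tier.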

Early deployments suggest LPUs can be configured at meaningful density. For example, a single Groq LPU rack can integrate hundreds of processors, highlighting the degree of parallelism available for inference and reasoning workloads. In practice, such systems are likely to be deployed alongside GPU clusters, with the ratio depending on workload mix and service requirements.

If adoption reaches even modest levels, LPUs could expand the silicon TAM for domain-specific accelerators. At the same time, it remains unclear whether LPUs will primarily complement GPUs or displace portions of certain workloads as operators optimize for overall system efficiency. More broadly, LPUs underscore the growing importance of architectural specialization tailored to specific workload requirements.

 

GPU Roadmap: Density and Scale Continue to Accelerate

NVIDIA continues to push aggressively on GPU density and system integration. Platforms such as Vera Rubin Ultra demonstrate this trajectory, with multi-die architectures, massive HBM capacity reaching the terabyte scale per package, and highly dense, liquid-cooled rack designs.

Future platforms such as Feynman are expected to push these limits further, increasing both compute density and system complexity. However, this rapid scaling introduces new constraints around power, cooling, and system balance. As a result, complementary architectures and more specialized components will play a growing role in maintaining overall efficiency. With compute costs remaining elevated and data center capex scaling into the hundreds of billions annually, operators will need to strategically align infrastructure with domain-specific workloads to maximize efficiency and reduce total cost of ownership.

 

Interconnects: Balancing Standards and Proprietary Innovation

Interconnect strategy remains central to NVIDIA’s roadmap. The company continues to balance proprietary innovation with industry standards, investing in both InfiniBand and Ethernet for scale-out connectivity while advancing NVLink as the backbone of scale-up architectures.

As scale-up domains expand, NVLink will increasingly need to extend beyond the rack and, over time, into the optical domain. This evolution is necessary to support larger, more tightly coupled compute fabrics, but also introduces new technical challenges.

The expansion of scale-up capabilities naturally raises the question of whether they could displace portions of traditional scale-out networking. In practice, both architectures will need to evolve in parallel. Scale-up enables higher performance within tightly coupled systems, while scale-out remains essential for resilience, workload distribution, and efficient utilization across clusters. This is increasingly true not only for training but also for inference, where distributed workloads and service-level requirements demand flexibility.

NVIDIA is also reducing reliance on PCIe-based x86 systems. With initiatives such as NVLink Fusion and the development of its own CPU roadmap, the company is positioning NVLink as a broader system fabric that could extend beyond GPUs.

 

Connectivity, Networking, and System-Level Optimization

Connectivity is rapidly emerging as one of the primary constraints in next-generation AI infrastructure. Current systems are largely built on 200 Gbps SerDes, but the industry is already looking ahead to 400 Gbps SerDes. However, the transition to 400 Gbps presents significant challenges in signal integrity, power consumption, and packaging complexity, making the timeline aggressive and execution uncertain.
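The arithmetic behind the transition is straightforward, as the short sketch below illustrates: doubling the SerDes lane rate either doubles an ASIC’s aggregate bandwidth at the same lane count, or halves the lanes each front-panel port consumes. The 512-lane figure is a common switch configuration used here for illustration, not a reference to any specific product.

```python
# SerDes lane math: an ASIC's aggregate bandwidth is lane rate x lane
# count, and each front-panel port bundles lanes. The 512-lane count is
# a common switch configuration, not any specific product's spec.

def asic_capacity_tbps(lane_gbps: int, lane_count: int) -> float:
    return lane_gbps * lane_count / 1000

def lanes_per_port(port_gbps: int, lane_gbps: int) -> int:
    return port_gbps // lane_gbps

for lane_rate in (200, 400):
    print(f"{lane_rate}G SerDes x 512 lanes -> "
          f"{asic_capacity_tbps(lane_rate, 512):.1f} Tbps ASIC; "
          f"a 1.6T port needs {lanes_per_port(1600, lane_rate)} lanes")
# 200G: 102.4 Tbps ASIC, 8 lanes per 1.6T port
# 400G: 204.8 Tbps ASIC, 4 lanes per 1.6T port
```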

In this context, NVIDIA’s vertically integrated approach provides a meaningful advantage. Its control over InfiniBand technology, including SerDes development, allows it to move ahead of standard Ethernet ecosystems when necessary, particularly when industry standards lag behind system requirements.

At the same time, networking is no longer just about bandwidth. Smart NICs and DPUs, particularly NVIDIA’s BlueField platform, are becoming increasingly central to system architecture, with the market projected to grow at a 30% CAGR over the next five years. DPUs are expanding into broader roles within AI infrastructure, managing data movement between compute, storage, and CPU domains while offloading networking and orchestration tasks from primary processors.

Taken together, these trends point toward a broader shift to system-level optimization, where performance is increasingly determined by how effectively compute, networking, and storage are integrated across the entire infrastructure stack.

 

Expanding the Platform: Beyond GPUs to Full-Stack Infrastructure

While GPUs remain the foundation of AI infrastructure, NVIDIA is clearly extending its reach across the full data center stack. Beyond its focus on domain-specific accelerators, GTC 2026 also highlighted the dense Vera CPU platform optimized for orchestrating agentic AI workloads, as well as the STX platform designed for KV cache-based context memory. A central theme underpinning this expansion is the increasing importance of co-design—bringing together compute, networking, and storage disciplines into a unified, system-level architecture rather than optimizing each component in isolation.

Taken together, these developments signal a clear expansion of NVIDIA’s total addressable market—from GPUs alone to a broader, full-stack infrastructure opportunity spanning compute, networking, and storage.

 

From Scale to Optimization: The Path Forward

NVIDIA’s rapid innovation cadence raises important questions around long-term economics, particularly as systems become more complex and capital-intensive. Maintaining a strong return on investment will depend not only on hardware performance, but on how effectively these systems can be utilized over time.

Here, NVIDIA’s software ecosystem remains a key advantage. CUDA provides continuity across generations, allowing developers to extract incremental performance improvements and enabling mixed-generation deployments that improve overall total cost of ownership.
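As a minimal illustration of what mixed-generation fleets look like in practice, the sketch below uses the nvidia-ml-py (pynvml) bindings to inventory a node’s GPUs and bucket them by CUDA compute capability, the property CUDA uses to gate features across hardware generations. The pooling policy at the end is a simplification for illustration, not a production scheduler.

```python
# Minimal sketch: inventory a node in a mixed-generation GPU fleet and
# bucket devices by CUDA compute capability. Requires the nvidia-ml-py
# package and an NVIDIA driver; the pooling policy is illustrative only.
from collections import defaultdict

import pynvml

pynvml.nvmlInit()
pools = defaultdict(list)
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    major, minor = pynvml.nvmlDeviceGetCudaComputeCapability(handle)
    pools[(major, minor)].append((i, name))
pynvml.nvmlShutdown()

# One simple policy: the newest pool serves latency-sensitive inference,
# while older pools absorb batch and fine-tuning jobs.
for (major, minor), devices in sorted(pools.items(), reverse=True):
    print(f"compute capability {major}.{minor}: {devices}")
```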

More broadly, GTC 2026 makes it clear that the industry is moving beyond the initial phase of scaling AI infrastructure and into one defined by optimization and specialization. The shift toward heterogeneous architectures, combined with a growing focus on efficiency and workload-specific design, is reshaping how data centers are built and operated.

[wp_tech_share]

NVIDIA’s annual developer conference (San Jose, March 16–19) has become a bellwether for data center physical infrastructure (DCPI). This year was no exception. NVIDIA DSX took center stage — a full-stack platform for designing, building, and operating AI factories that now counts over 200 partners in its ecosystem. Several major DCPI vendors—including ABB, Eaton, Mitsubishi Electric, Schneider Electric, Siemens, Trane Technologies, and Vertiv—unveiled co-designed solutions in a tightly choreographed wave of announcements. It was a concrete expression of what CEO Jensen Huang declared in his keynote: “this conference is going to cover every single layer of the five-layer cake of artificial intelligence, from land, power, and shell, the infrastructure, to chips, to the platforms, the models, and, of course, the most important, and ultimately what’s going to get this industry to take off, is all of the applications.”

NVIDIA’s five-layer cake of AI

 

A Factory for Designing Factories

Among the DSX components, what particularly stood out was the Omniverse DSX Blueprint—a now generally available platform for modeling data center layouts, power topologies, and thermal behavior, using simulation-ready 3D models contributed by infrastructure partners in OpenUSD format. It is an ambitious vision at a time when the reality on the ground is that most data center design still relies on traditional CAD and BIM applications, and digital twin adoption is still in its infancy. This is NVIDIA being characteristically visionary—anticipating what will eventually become a necessity, even if today it can look like overkill.

The industry is moving from adding capacity in the teens of gigawatts a year to potentially 100GW+ in a decade or less. At that scale, without AI-assisted tools in design, construction, and commissioning, it is hard to see how projects come online at the pace required—particularly given well-known skilled labor shortages. Just as semiconductor design has become fundamentally dependent on AI tools, data center design at gigawatt scale may have no choice but to follow the same path. The Omniverse Blueprint is NVIDIA’s bet on removing the barriers to building AI factories at scale.

But while the Omniverse Blueprint captures the imagination, the conversations dominating the show floor among DCPI vendors were far more immediate. Five topics in particular stood out: the growing heterogeneity of inferencing cluster racks, the fast-approaching 800 VDC transition, the ramp-up of liquid cooling designs, the potential commoditization within the MGX ecosystem, and—since no data center discussion can avoid it—power availability.

NVIDIA’s Vera Rubin DSX AI Factory Reference Design

 

The End of the One-Rack Era

For the past two NVIDIA generations, data center designers could plan around a single workhorse rack. The Hopper and then Blackwell platforms offered a largely homogeneous building block: one compute rack architecture, scaled across rows and halls, with relatively uniform power and cooling profiles. GTC 2026 broke that pattern decisively.

NVIDIA introduced not one but several rack configurations under the Vera Rubin umbrella. The NVL72 remains the flagship—72 Rubin GPUs and 36 Vera CPUs in a fully liquid-cooled, fanless, cableless enclosure exceeding 200 kW per rack. Alongside it, a CPX rack adds Rubin CPX accelerators to the Vera Rubin superchip trays, optimized for inference performance. A Vera CPU-only rack targets inference and data preprocessing without GPU acceleration. And the LPX rack with Groq’s LPUs debuts third-party silicon within NVIDIA’s own reference design.

This is a big departure. And it is also entirely expected. A single architecture serving every workload was only tenable while AI infrastructure was synonymous with large-scale training. As workloads diversify across fine-tuning, inference, and agentic AI applications, infrastructure must follow suit. Henry Ford was able to offer the Model T alone for only so long.

For DCPI vendors, the implications are immediate. Heterogeneous clusters mean managing mixed rack densities, uneven heat loads, and varying liquid cooling requirements coexisting on the same row. This is a design and operational challenge that will demand far more flexibility from infrastructure solutions than the relatively uniform AI halls of the Hopper and Blackwell era.

 

High Voltage, High Stakes

For what may be the biggest disruption in data center power architecture in decades, 800 VDC power distribution received remarkably little attention in NVIDIA’s official channels. Absent from Jensen’s keynote and with no significant announcements since the technical blog and whitepaper released alongside last year’s OCP Global Summit—an event we covered in a previous blog—NVIDIA’s messaging on the architecture has been sparse.

Among vendors, however, the level of attention could not have been more different. 800 VDC was the talk of the town. Multiple vendors showcased equipment and prototypes, and many dedicated sessions explored everything from semiconductor building blocks to rack-level power delivery and facility integration. Vendors like Delta Electronics, Texas Instruments, and STMicroelectronics focused their marquee March 16 announcements squarely on 800 VDC developments—an unusual departure from the lockstep of similarly themed announcements that has become the norm at GTC.

Schneider Electric’s Jim Simonelli session at GTC draws interest from audience

 

Such advancements are important and necessary, but many questions about the 800 VDC topology remain open. In his GTC session entitled “A Safe, Efficient, and Scalable Approach to 800 VDC Architecture,” Eaton’s J.P. Buzzell referenced an OCP white paper expected in the coming weeks. The draft should bring more clarity to the architecture, but there is still a long way to go before engineers can fully specify an 800 VDC data hall. And even once the specification matures, supply chains for components will need to be stood up and safety guidelines codified before broad deployment can begin.

 

45 Degrees of Separation

Much like 800 VDC, another infrastructure shift that made waves in an earlier NVIDIA keynote received little airtime at GTC. At CES in January, Jensen highlighted the move toward 45°C warm-water inlet temperatures—a significant departure from the designs more commonly deployed today. Beyond Jensen’s brief nod to Vera Rubin’s 45°C specification, the topic received little attention at GTC.

NVIDIA remains committed to 45°C, but there is no sign of it doubling down or rushing to get there. The convergence toward 45°C architectures will take longer to play out. Facility-side infrastructure needs to be adapted, but operators might remain reluctant to optimize the cooling system if doing so carries any risk of reducing accelerator performance. In an age of highly constrained compute, every token counts. And the imperative to maximize throughput trumps facility-level efficiency optimization.

The water temperature debate, however, was far from the only liquid cooling story at GTC. On the show floor, the direction of travel for CDU capacity was unmistakable. As pod architectures scale and per-rack thermal loads climb, vendors responded with a new class of multi-megawatt CDUs. These are a step change from capacities that dominated the market just a year ago, and we expect this upward trend to continue as next-generation pod architectures push thermal envelopes further still.
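The thermal math behind that step change is worth a quick illustration: the coolant flow a CDU must deliver scales linearly with heat load for a given water temperature rise, per Q = ṁ·c_p·ΔT. The rack and CDU figures in the sketch below are illustrative assumptions, not any vendor’s specification.

```python
# Warm-water cooling arithmetic: the coolant flow a CDU must deliver
# follows Q = m_dot * c_p * dT. All rack and CDU figures are illustrative.

def required_flow_lpm(heat_kw, delta_t_c, cp_kj_per_kg_k=4.186, kg_per_l=0.997):
    """Liters per minute of water needed to absorb heat_kw at a delta_t_c rise."""
    kg_per_s = heat_kw / (cp_kj_per_kg_k * delta_t_c)  # Q[kW] = m_dot * c_p * dT
    return kg_per_s / kg_per_l * 60

# A ~200 kW rack with a 10 C water temperature rise:
print(f"Rack: {required_flow_lpm(200, 10):.0f} L/min")   # ~288 L/min
# A 2 MW CDU serving a pod at the same delta-T:
print(f"CDU:  {required_flow_lpm(2000, 10):.0f} L/min")  # ~2,875 L/min
```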

Delta Electronics’ 800VDC CDU

 

An interesting product found on the exhibition floor was a direct-current CDU that can connect straight to the 800 VDC bus. It is a thoughtful choice that adds flexibility for operators designing next-generation whitespace, even if we expect most large units to be housed in mechanical galleries in the grey space—where traditional AC power distribution is likely to remain the standard for the foreseeable future. Either way, the convergence of power and cooling design choices is becoming impossible to ignore.

 

MGX and the March Toward Standardization

The growing specificity of NVIDIA’s reference architectures—from rack dimensions and cooling requirements to power topologies and simulation-ready digital models—raises an uncomfortable question for DCPI vendors: as NVIDIA defines more of the design, what room is left for differentiation?

The “MGX wall” on the show floor—displaying components from dozens of vendors side by side within the standardized MGX ecosystem—made this tension visible. By standardizing interfaces, form factors, and performance specifications across the infrastructure stack, MGX makes it easier for operators to mix and match components from multiple suppliers. That is a win for deployment speed and supply chain resilience. But it also compresses the space in which vendors can compete on anything other than price and availability—the classic hallmarks of a commoditizing market.

Quick disconnects from multiple vendors showcased at the “MGX wall”

 

Not all vendors will be affected equally. Those with deep system integration expertise, intelligent controls, service capabilities, or engineering and quality differentiation in mission-critical components will find ways to stay above the commoditization line. But for vendors whose value proposition rests primarily on the physical product itself, the tightening of NVIDIA’s specifications around their equipment is a trend worth watching closely.

 

Unlocking the Grid

Perhaps the most consequential launch at GTC came not from the chip announcements but from DSX Flex—NVIDIA’s software layer for connecting AI factories to grid services and orchestrating dynamic power adjustment. With NVIDIA’s order book continuing to grow, the math is simple: the gap between the power needed to energize forecast chip shipments and the pace of grid upgrades is too large to ignore. And the only near-term path to more power is not launching data centers into space, but tapping into existing grid capacity when it is not being used.

This was a point I raised directly with Jensen during the event. His response was unequivocal: data centers must change their relationship with the grid and be willing to accept less stringent SLAs in exchange for faster access to capacity. AI workloads will need to flex around supply constraints rather than demanding always-on, fully firm power. In a world where tokens per watt is becoming the defining metric for AI factory economics, accessing those watts and squeezing the most out of them becomes make-or-break. Startups like Emerald AI and Phaidra are building the technology to support this, but unlocking it at scale requires more than engineering ingenuity. It depends on the willpower and aligned incentives of the primary gatekeepers involved—utilities, grid operators, and their regulators.
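What such flexing might look like at the node level can be sketched with NVML’s power-management controls, as below. The grid-headroom feed is a hypothetical stand-in (DSX Flex’s actual interfaces are not public at this level of detail), and the linear capping policy is purely illustrative; changing power limits via NVML also requires administrative privileges.

```python
# Hypothetical sketch of a grid-aware power-flex loop: when grid headroom
# tightens, lower each GPU's power cap; when it returns, raise the cap.
# get_grid_headroom_kw() is a stand-in for whatever telemetry a platform
# like DSX Flex exposes -- not a real API -- and the linear capping policy
# is illustrative. Changing NVML power limits requires admin privileges.
import time

import pynvml

def get_grid_headroom_kw() -> float:
    return 500.0  # placeholder: substitute a real utility/orchestrator feed

pynvml.nvmlInit()
handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
           for i in range(pynvml.nvmlDeviceGetCount())]

while True:
    headroom = get_grid_headroom_kw()
    # Assume 1 MW of flexible load maps headroom onto each device's cap range.
    fraction = min(headroom / 1000.0, 1.0)
    for h in handles:
        lo, hi = pynvml.nvmlDeviceGetPowerManagementLimitConstraints(h)
        pynvml.nvmlDeviceSetPowerManagementLimit(h, int(lo + (hi - lo) * fraction))
    time.sleep(60)
```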

 

What This Means for the DCPI Market

Dell’Oro Group’s latest DCPI market update, released during GTC week, showed the market reached $10.9 billion in 4Q 2025—up 20% year-over-year—with synchronized backlog surges across vendors in power and cooling. The AI supercycle continues to drive record investment, and GTC 2026 did nothing to dampen expectations. The tone was one of confident optimism—about the trajectory of AI, the scale of compute still to be built, and the opportunities ahead for data center vendors.

Regardless of whether that optimism proves fully warranted, GTC 2026 left little doubt: the DCPI market is entering its most consequential chapter yet. Stay tuned as we continue to track these shifts—and connect with us at Dell’Oro Group to discuss these trends as they unfold.

 


Vendor Press Releases

Accelsius, Delta Electronics, Eaton, Foxconn, Flex, Hitachi, LiteOn, Schneider Electric, STMicroelectronics, Texas Instruments, Trane Technologies, Vertiv

 

 

[wp_tech_share]

A few months after Upscale AI introduced SkyHammer—its clean-slate, open-standards scale-up platform designed to make XPUs “behave like a single coherent machine”—the firm is now extending its vision for open AI networking infrastructure into the scale-out domain, where clusters expand horizontally across multiple racks and, increasingly, across multiple data centers. To that end, Upscale AI is announcing a strategic partnership with NVIDIA aimed at accelerating the deployment of open, scale-out AI networking infrastructure for next-generation data centers.

The collaboration brings together NVIDIA’s Spectrum-X Ethernet switch silicon and Upscale AI’s AI-optimized, SONiC-based networking software to deliver interoperable, high-performance Ethernet fabrics designed for large-scale AI workloads.

As enterprises and neocloud providers expand AI clusters, networking has emerged as a critical bottleneck. The partnership focuses on enabling these customers to deploy scalable, low-latency networking systems that support heterogeneous environments spanning compute, accelerators, memory, and storage.

Open Infrastructure for Heterogeneous AI Environments

As part of the initiative, Upscale AI has joined the NVIDIA Partner Network. The partnership is intended to give customers greater flexibility in how they design and procure AI infrastructure, including deploying Ethernet switching powered by NVIDIA Spectrum silicon in heterogeneous, multi-vendor environments. This collaboration reflects a step toward more interoperable Ethernet infrastructure for AI deployments, while maintaining operational consistency at scale.

Focus on AI-Optimized SONiC

A core element of Upscale AI’s approach is its AI-optimized implementation of SONiC, the open-source network operating system widely used in hyperscale environments.

At Dell’Oro Group, we expect SONiC adoption in AI back-end networks to accelerate much faster than we have historically observed in front-end networks, driven by tailwinds on both the demand and supply sides.

On the demand side, a growing number of AI model builders and neocloud providers are evaluating SONiC to diversify vendors, reduce platform lock-in, and gain greater control over their network infrastructure. Vendor diversification also helps mitigate risk, especially as supply availability tightens.

On the supply side, an expanding set of established vendors and new entrants is supporting the SONiC ecosystem. We expect SONiC-based switch sales in AI scale-out networks to grow at more than a 50% CAGR from 2025 to 2030, exceeding $10 billion by 2030.
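For context on what that forecast implies, compounding works backwards as well as forwards: a market exceeding $10 billion in 2030 after five years of 50% annual growth implies a base of roughly $1.3 billion in 2025. The sketch below simply restates the forecast’s own arithmetic and introduces no new data.

```python
# The compounding behind the forecast: >50% CAGR reaching $10 B in 2030
# implies a base of roughly $1.3 B in 2025. This restates the forecast's
# own arithmetic; it introduces no new data.

def implied_base(final_value_b, cagr, years):
    return final_value_b / (1 + cagr) ** years

def compound(base_b, cagr, years):
    return base_b * (1 + cagr) ** years

base_2025 = implied_base(10.0, 0.50, 5)
print(f"Implied 2025 base: ${base_2025:.2f} B")                      # ~$1.32 B
print(f"Check, 2030 value: ${compound(base_2025, 0.50, 5):.1f} B")   # $10.0 B
```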

 

Addressing a Critical Gap with Fully Integrated AI Infrastructure for Enterprise and Neocloud Customers

Historically, SONiC adoption has been spearheaded by hyperscalers. However, deploying and operating an open-source network operating system like SONiC demands substantial in-house engineering expertise and integration effort—capabilities many smaller cloud providers and enterprises lack. In addition, SONiC’s broader ecosystem support—such as turnkey distributions, enterprise-grade tooling, and vendor-backed support—has lagged proprietary network operating system offerings, limiting SONiC adoption beyond hyperscale environments.

Upscale AI plans to bridge this gap by delivering fully integrated solutions that combine hardware, software, and lifecycle services targeted at organizations building medium and large-scale AI environments.

While the first wave of AI has been driven primarily by large AI model builders—namely hyperscalers—the second wave is expected to be led by other cloud providers, including neocloud providers, as well as large enterprises. Together, these customer segments are projected to account for the majority of Ethernet data center switch sales in scale-out networks by 2030.

Stitching Together an Open Fabric for AI

SkyHammer was step one. Scale-out is step two. Upscale AI is stitching together an open networking story—from the scale-up interconnect that makes XPUs act like one system, to the Ethernet fabric that lets AI environments grow horizontally while preserving multi-vendor flexibility. The NVIDIA partnership helps validate that direction and accelerates the scale-out side of the roadmap, reinforcing Upscale AI’s broader goal: open, interoperable AI networking infrastructure from pod to cluster.

[wp_tech_share]

As 2025 comes to a close, we reflect on several remarkable milestones achieved by the data center switching market this year, and what 2026 may have in store for us.

Looking back at 2025, several clear inflection points reshaped the market:

  • Ethernet overtakes InfiniBand in AI back-end networking: Supported by strong tailwinds on both the supply and demand sides, 2025 marked a decisive turning point for AI back-end networks, as Ethernet surpassed InfiniBand in market adoption. This shift is particularly striking given that just two years ago, InfiniBand accounted for nearly 80% of data center switch sales in AI back-end networks.

Dell'Oro Group Predictions for 2026 - Data Center Switch market

  • Overall Ethernet Data Center Switch sales nearly doubled compared with 2022: The rapid adoption of Ethernet in AI back-end deployments propelled total Ethernet data center switch sales to an all-time high in 2025, nearly doubling annual revenues compared with 2022 levels.
  • 800 Gbps surpassed 20 M ports within just three years of shipments: By comparison, it took 400 Gbps six to seven years to reach the same milestone.
  • The vendor landscape shifted meaningfully toward AI-exposed players: Vendors with greater exposure to AI back-end networking significantly outperformed the broader market in 2025. Companies such as Accton, Celestica and NVIDIA were among the primary beneficiaries of this shift, reflecting how AI-driven demand is reshaping competitive dynamics. Arista maintained the leading position in the Total Ethernet Data Center Switching market.

Dell'Oro Group Predictions 2026 - Data Center Switch Front-end Networks and AI Back-end Networks

Looking ahead to 2026, questions are emerging around whether the pace of investment can be sustained after such an extraordinary year. While skepticism around AI returns on investment is growing, we believe the industry is still in the early innings of a multi-year AI investment cycle. Based on the latest capital expenditure outlooks from the large hyperscalers (Google, Amazon, Microsoft, Meta, Oracle and others), we expect another strong year of AI-related investment in 2026, which should continue to drive robust spending across the networking portion of the infrastructure stack.

Networking is becoming increasingly critical, as it plays a central role in addressing some of the most challenging scaling bottlenecks in AI deployments—including power availability and compute demand. Below are some of the inflection points expected for 2026:

  • Demand remains exceptionally strong in AI back-end networking. We continue to expect strong double-digit growth in AI networking spending, driven by ongoing scale-out of AI clusters. The integration of co-packaged optics could further accelerate market growth, as optics would easily add several billion dollars to the market size.
  • Supply constraints remain the primary risk to our forecast. We expect demand to continue to outpace supply, with shortages in chips, memory, and other critical components representing the main caveats to our outlook. As a result, the market remains supply-constrained rather than demand-constrained—a challenging dynamic, but ultimately a more favorable one than the reverse.
  • Scale-up emerges as a new battlefield for Ethernet. After securing a leading position in the scale-out segment of AI back-end networks, Ethernet is now expanding into scale-up, where NVLink has historically dominated. In this space, Ethernet will compete not only with NVLink but also with UALink, another alternative to NVLink. We anticipate 2026 will be a year full of vendor announcements targeting both Ethernet and UALink opportunities in scale-up. Scale-up represents what could be the largest total addressable market expansion the industry has ever seen.
  • 1.6 Tbps switches expected to ship in volume in 2026. 2026 will mark the first year of volume deployments of 1.6 Tbps switches, driven by the insatiable demand for high bandwidth in AI clusters. The 1.6 Tbps ramp is expected to be even faster than that of 800 Gbps, surpassing 5 M ports within one to two years of shipments.
  • Co-packaged optics (CPO) expected to ramp on both InfiniBand and Ethernet switches. After many years of development and debate, 2026 is expected to see the initial volume ramp of CPO on both InfiniBand and Ethernet switches. On the demand side, major hyperscalers are actively trialing the technology. On the supply side, while NVIDIA is leading the way, we expect other vendors to follow shortly.
  • Vendor diversity set to increase in 2026. As AI clusters continue to scale, vendor diversity, spanning both incumbent vendors and new entrants, will become increasingly important to ensure risk mitigation and supply availability. We believe that no single vendor can meet the full demand for AI infrastructure. As a result, we expect SONiC adoption to accelerate in both scale-up and scale-out deployments, as it will be critical in enabling this broader vendor ecosystem.

In summary, as we look ahead to 2026, the AI-driven data center landscape is set to continue its rapid evolution. From Ethernet’s rise in AI back-end networks and the emergence of scale-up as a new battlefield, to the adoption of 1.6 Tbps switches, co-packaged optics, and a more diverse vendor ecosystem, the infrastructure supporting AI is expanding in both scale and complexity. While supply constraints and ROI questions remain challenges, the industry is clearly in the early innings of a multi-year AI journey. Networking, in particular, will play a pivotal role in enabling the next phase of AI growth, making 2026 an exciting year for both innovation and investment.

[wp_tech_share]
The hyperscale AI infrastructure buildout is entering a more mature phase. After several years of rapid regional expansion driven by resilience, redundancy, and data sovereignty, hyperscalers are now focused on scaling AI compute and supporting infrastructure efficiently. As we move into 2026, the cycle is increasingly defined by capex discipline and execution risk, even as absolute investment levels remain historically high.

Accelerated Servers Remain the Core Spending Driver

Spending on high-end accelerated servers rose sharply in 2025 and continues to anchor AI infrastructure investment heading into 2026. These platforms pull through demand for GPUs and custom accelerators, HBM, high-capacity SSDs, and high-speed NICs and networks used in large AI clusters. While frontier model training remains important, a growing share of deployments is now driven by inference workloads, as hyperscalers scale AI services to millions of users globally.

This shift meaningfully expands infrastructure requirements, as inference workloads require higher availability, geographic distribution, and tighter latency guarantees than centralized training clusters.

 

GPUs Continue to Dominate Component Revenue

High-end GPUs will remain the largest contributor to component market revenue growth in 2026, even as hyperscalers deploy more custom accelerators to optimize cost, power efficiency, and workload-specific performance at scale. NVIDIA is expected to begin shipping the Vera Rubin platform in 2H26, which increases system complexity through higher compute and networking density and optional Rubin CPX inference GPU configurations, materially boosting component attach rates.

AMD is positioning to gain share with its MI400 rack-scale platform, supported by recently announced wins at OpenAI and Oracle. Despite growing competition, GPUs continue to command outsized revenue due to higher ASPs and broader ecosystem support.

 

Near-Edge Infrastructure Becomes Critical for Inference

As AI inference demand accelerates, hyperscalers will need to increase investment in near-edge data centers to meet latency, reliability, and regulatory requirements. These facilities—located closer to population centers than centralized hyperscale regions—are essential for real-time, user-facing AI services such as copilots, search, recommendation engines, and enterprise applications.

Near-edge deployments typically favor smaller but highly dense accelerated clusters, with strong requirements for high-speed networking, local storage, and redundancy. While these sites do not approach the power scale of centralized AI campuses, their sheer number and geographic dispersion represent a meaningful incremental capex requirement heading into 2026. In contrast, far-edge deployments remain more use-case dependent and are unlikely to see material growth until ecosystems and application demand further mature.
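The physics underlying the latency argument is simple: fiber propagation alone puts a floor under round-trip time, independent of any compute or queuing delay. The sketch below works through that floor for a few illustrative distances; the distances themselves are assumptions chosen to contrast metro near-edge sites with remote campuses.

```python
# Fiber propagation sets a floor on round-trip time regardless of compute:
# RTT >= 2 * distance / (c / n), with n ~ 1.47 for optical fiber.
# Distances below are illustrative.

C_KM_PER_MS = 299_792.458 / 1000   # speed of light in vacuum, km per ms
FIBER_SPEED = C_KM_PER_MS / 1.47   # light travels ~68% of c inside glass

def fiber_rtt_ms(distance_km: float) -> float:
    return 2 * distance_km / FIBER_SPEED

for d in (50, 500, 2000):  # metro near-edge vs. regional vs. distant campus
    print(f"{d:>5} km -> {fiber_rtt_ms(d):.2f} ms RTT floor")
# 50 km ~0.49 ms; 500 km ~4.90 ms; 2000 km ~19.61 ms
```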

 

Networking and CPUs Transition Unevenly

The x86 CPU and NIC markets tied to general-purpose servers are expected to decelerate in 2026 following short-term inventory digestion.  In contrast, demand for high-speed networking remains tightly linked to accelerated compute growth. Even as inference workloads outpace training, inference accelerators continue to rely on scale-out fabrics to support utilization, redundancy, and ultra-low latency.

 

Supply Chains Tighten as Component Costs Rise

AI infrastructure supply chains are becoming increasingly constrained heading into 2026. Memory vendors are prioritizing production of higher-margin HBM, limiting capacity for conventional DRAM and NAND used in AI servers. As a result, memory and storage prices are rising sharply, increasing system-level costs for accelerated platforms.

Beyond memory, longer lead times for advanced substrates, optics, and high-speed networking components are adding further volatility to the supply chain. In parallel, tariff uncertainty and evolving trade policy introduce additional supply-chain risk and could elevate component pricing over the medium term.

 

Capex Remains Elevated, but ROI Scrutiny Intensifies

US hyperscale cloud service providers continue to raise capex guidance, reinforcing the continuity of the multi-year AI investment cycle into 2026. Accelerated computing, greenfield data center builds, near-edge expansion, and competitive pressures remain strong tailwinds. Changes in depreciation treatment provide levers to optimize cash flow and support near-term investment levels.

However, infrastructure investment has outpaced revenue growth, increasing scrutiny around capex intensity, depreciation, and long-term returns. While cash flow timing can be managed, underlying ROI depends on successful AI monetization, increasing the risk of margin pressure if revenue growth lags infrastructure deployment.