What is an HGX Platform? NVIDIA AI Supercomputing Architecture

May 13, 2026 · Technical Deep Dives
Reviewed by NTS AI Infrastructure Engineer · Technical accuracy verified for enterprise & federal deployment
NTS Elite APEX 8U HGX-B200 Dual Xeon6 AI Server

Quick Summary

  • Definition: NVIDIA reference architecture for multi-GPU AI supercomputing
  • Components: 8x GPU baseboard, NVLink Switch, reference motherboard design
  • OEM Partners: Supermicro, Dell, HPE, Lenovo build HGX-based systems
  • Current Gen: HGX B200 (192GB HBM3e), with HGX H100 (80GB HBM3) and HGX H200 (141GB HBM3e) widely deployed
  • Federal: TAA-compliant HGX configurations available for US government

The NVIDIA HGX platform is the de facto standard architecture for enterprise AI computing. HGX is NVIDIA's reference design for multi-GPU servers, integrating 4 or 8 high-performance GPUs with NVLink interconnect, optimized power delivery, and validated thermal management into a standardized platform adopted by every major server OEM. This guide explains the HGX architecture, compares platform generations, and shows how to select the right HGX configuration for your AI workloads.

HGX Platform Architecture

The HGX platform is designed as a modular "baseboard" that hosts multiple NVIDIA GPUs with integrated NVLink switching, power regulation, and thermal monitoring. Server OEMs integrate the HGX baseboard into their chassis designs, adding CPUs, memory, storage, and networking to create complete AI server solutions.

HGX baseboard components: The baseboard contains GPU sockets (SXM form factor), NVLink Switch ASICs for GPU-to-GPU communication (on A100/H100/H200/B200 generations), voltage regulator modules (VRMs) delivering 500-2000A of GPU power, temperature sensors and fan control logic, and management firmware interfaces for out-of-band monitoring.
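In practice, the telemetry the baseboard exposes out-of-band (typically through the server's BMC via Redfish or IPMI) can also be read in-band through NVML. The sketch below, assuming the nvidia-ml-py (pynvml) package is installed on an HGX node, prints per-GPU power draw and temperature; it is illustrative only and not a substitute for the OEM's management tooling.

```python
# Minimal in-band health-check sketch using NVIDIA's NVML bindings (pip install nvidia-ml-py).
# Reads per-GPU power draw and temperature on a multi-GPU node such as an HGX baseboard.
import pynvml

pynvml.nvmlInit()
try:
    count = pynvml.nvmlDeviceGetCount()  # 4 or 8 on an HGX baseboard
    for i in range(count):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        power_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # NVML reports milliwatts
        temp_c = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
        print(f"GPU {i} ({name}): {power_w:.0f} W, {temp_c} C")
finally:
    pynvml.nvmlShutdown()
```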

SXM GPU form factor: HGX uses NVIDIA's SXM mezzanine form factor, distinct from standard PCIe slot-mounted GPUs. SXM provides roughly twice the power delivery capacity (700W for H100 SXM and up to 1,000W for B200 SXM, vs 300-450W for PCIe cards), direct NVLink connections that avoid routing GPU-to-GPU traffic over PCIe, and an optimized thermal interface for heatsink or liquid-cooling cold plate attachment. SXM GPUs cannot be installed in standard PCIe slots; they require a compatible HGX baseboard.
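One quick way to confirm that SXM GPUs are communicating over NVLink rather than PCIe is to query link state through NVML. The sketch below (again assuming nvidia-ml-py) counts active NVLink links per GPU; a PCIe card without NVLink bridges reports zero. The command-line equivalent is `nvidia-smi topo -m`, which prints the full GPU-to-GPU topology.

```python
# Count active NVLink links per GPU via NVML (nvidia-ml-py).
# SXM/HGX GPUs expose multiple active links; link counts vary by generation.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        active = 0
        for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
            try:
                if pynvml.nvmlDeviceGetNvLinkState(handle, link) == pynvml.NVML_FEATURE_ENABLED:
                    active += 1
            except pynvml.NVMLError:
                break  # no more links on this device
        print(f"GPU {i}: {active} active NVLink links")
finally:
    pynvml.nvmlShutdown()
```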

HGX Generations Compared

Generation | GPU | GPU Count | Interconnect | GPU Memory (per GPU) | FP8 Compute (8-GPU, sparse) | Max Baseboard Power
HGX A100 | A100 80GB | 4 or 8 | NVLink 3.0 + NVSwitch | 80GB HBM2e | N/A (FP16: ~2.5 PFLOPS) | 6.5 kW
HGX H100 | H100 80GB | 4 or 8 | NVLink 4.0 + NVSwitch 3 | 80GB HBM3 | 32 PFLOPS | 7.0 kW
HGX H200 | H200 141GB | 4 or 8 | NVLink 4.0 + NVSwitch 3 | 141GB HBM3e | 32 PFLOPS | 7.0 kW
HGX B200 | B200 192GB | 8 | NVLink 5.0 + NVSwitch 4 | 192GB HBM3e | 72 PFLOPS | 8.5 kW
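The per-GPU memory figures above translate into per-baseboard capacity, which is usually what determines whether a model fits on a single node. The sketch below aggregates those figures for an 8-GPU configuration; the per-GPU TDPs are nominal assumptions (400W A100 SXM, 700W H100/H200, 1,000W B200), not measured system power.

```python
# Back-of-the-envelope aggregation of the per-GPU figures from the table above into
# per-baseboard totals for an 8-GPU configuration. TDP values are nominal assumptions.
HGX_GENERATIONS = {
    "HGX A100": {"hbm_gb": 80,  "gpu_watts": 400},
    "HGX H100": {"hbm_gb": 80,  "gpu_watts": 700},
    "HGX H200": {"hbm_gb": 141, "gpu_watts": 700},
    "HGX B200": {"hbm_gb": 192, "gpu_watts": 1000},
}

GPUS_PER_BASEBOARD = 8

for name, spec in HGX_GENERATIONS.items():
    total_hbm_gb = spec["hbm_gb"] * GPUS_PER_BASEBOARD
    gpu_power_kw = spec["gpu_watts"] * GPUS_PER_BASEBOARD / 1000
    print(f"{name}: {total_hbm_gb} GB aggregate HBM, "
          f"~{gpu_power_kw:.1f} kW GPU power (excluding NVSwitch, CPUs, fans)")
```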

HGX Server Ecosystem

The HGX platform is manufactured through NVIDIA's OEM partner program, with major server vendors offering HGX-based AI servers:

Supermicro: The most popular HGX partner for enterprise deployments. Supermicro offers the 8U HGX H100 server (8x H100, 2x Intel Xeon/AMD EPYC, 2TB RAM, 8x 400Gb NICs), the 4U HGX H100 server (4x H100, single CPU, 1TB RAM, 4x 200Gb NICs), and the 10U HGX B200 server. Supermicro is frequently chosen for competitive price-performance and short lead times on HGX systems.

Dell PowerEdge: The XE9680 is Dell's 6U HGX H100 server, offering 8x H100 GPUs with integrated OpenManage enterprise management. Dell provides superior service and support contracts for enterprise customers, including ProSupport Plus with 4-hour mission-critical response.

HPE: The HPE Cray XD670 is HPE's HGX H100 server for AI training, featuring 8x H100 GPUs with optional HPE Slingshot interconnect for multi-node clustering. HPE targets government and research customers through its HPC division.

Lenovo: The ThinkSystem SR675 V3 is Lenovo's 3U GPU server, offered with a 4-GPU HGX H100 configuration, featuring Neptune liquid cooling integration and Lenovo's ThinkShield security features. Lenovo's HGX servers are popular in university and research deployments.

NTS Integration: NTS provides specialized HGX server configurations with enhanced features for government and enterprise deployments, including FISMA-compliant security configurations, custom liquid cooling integration, MIL-STD-810 ruggedization, and TAA-compliant supply chain documentation.

Selecting the Right HGX Configuration

4-GPU HGX: Best for AI inference, model fine-tuning, and small-scale training. Provides cost-effective AI computing for organizations deploying models up to 70B parameters. Recommended GPU: H100 for performance, H200 for larger model capacity.

8-GPU HGX: The standard for production AI training. Supports models up to 405B parameters with tensor parallelism. NVLink Switch provides full-bandwidth connectivity between all 8 GPUs. Recommended for any organization with serious AI training requirements.

HGX + NVLink Switch System: Extends the NVLink domain beyond a single baseboard, up to 256 GPUs, for ultra-scale AI training. Primarily used for pre-training large foundation models (70B+ parameters) where inter-GPU bandwidth dominates; it is an infrastructure investment for established AI leaders rather than new entrants.
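As a rough way to choose between these tiers, estimate the memory footprint of the model's weights plus a working overhead and divide by per-GPU HBM. The heuristic below is a sketch only: the overhead factor, bytes per parameter, and the H200 capacity default are assumptions, and real requirements depend on batch size, sequence length, KV cache, and (for training) optimizer state.

```python
# Rough sizing heuristic: how many GPUs are needed just to hold a model for inference.
# All constants are illustrative defaults, not vendor-specified requirements.
def min_gpus_for_inference(params_billion: float,
                           bytes_per_param: float = 2.0,    # FP16/BF16 weights
                           overhead_factor: float = 1.3,    # activations, KV cache, runtime
                           hbm_per_gpu_gb: float = 141.0):  # H200, per the table above
    weights_gb = params_billion * bytes_per_param            # ~1 GB per billion params per byte
    needed_gb = weights_gb * overhead_factor
    gpus = max(1, -(-needed_gb // hbm_per_gpu_gb))            # ceiling division
    return int(gpus), needed_gb

for model_b in (70, 180, 405):
    gpus, gb = min_gpus_for_inference(model_b)
    print(f"{model_b}B params: ~{gb:.0f} GB -> at least {gpus} x H200 for FP16 inference")
```

Run against the table above, this gives roughly 2 GPUs for a 70B model and 8 for a 405B model in FP16, which matches the 4-GPU vs 8-GPU guidance in this section; training the same models requires substantially more memory and typically multiple nodes.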


Frequently Asked Questions

Can HGX GPUs be upgraded independently?

No. HGX baseboards are designed for a specific GPU generation. Moving from HGX H100 to HGX B200 means replacing the entire baseboard, and in practice server OEMs treat this as a forklift upgrade requiring a new chassis, power supplies, and cooling system.

What is the difference between HGX and DGX?

DGX is NVIDIA's first-party integrated AI system, combining HGX hardware with NVIDIA-engineered software and support, while HGX is the platform NVIDIA supplies to OEMs to build their own systems. DGX systems add integrated NVIDIA networking (ConnectX NICs and BlueField DPUs), Base Command management software, and the NVIDIA AI Enterprise software suite, and typically carry a 30-50% price premium over comparable HGX-based OEM systems.

Does HGX support AMD CPUs?

Yes. The host CPU is chosen by the server OEM, not dictated by the HGX baseboard. Current HGX H100/H200/B200 systems are offered with 4th/5th Gen Intel Xeon Scalable or AMD EPYC 9004/9005 series processors, while earlier HGX A100 systems paired with EPYC 7002/7003 and 3rd Gen Xeon Scalable. AMD EPYC configurations offer more PCIe 5.0 lanes per socket (128 vs 80) and more memory channels (12 vs 8), which can improve CPU-GPU data transfer and host memory bandwidth.