The Full-Stack SuperClusters Include Air-
and Liquid-Cooled Training and Cloud-Scale Inference Rack
Configurations with the Latest NVIDIA Tensor Core GPUs, Networking,
and NVIDIA AI Enterprise Software
SAN JOSE, Calif., March 18, 2024 /PRNewswire/ -- Supermicro,
Inc. (NASDAQ: SMCI), a Total IT Solution Provider for AI,
Cloud, Storage, and 5G/Edge, is announcing its latest portfolio to
accelerate the deployment of generative AI. The Supermicro
SuperCluster solutions provide foundational building blocks for the
present and the future of large language model (LLM)
infrastructure.
Three powerful Supermicro SuperCluster solutions are now
available for generative AI workloads. The 4U liquid-cooled and 8U
air-cooled systems are purpose-built for powerful LLM training
performance as well as large-batch, high-volume LLM
inference. A third SuperCluster, with 1U air-cooled
Supermicro NVIDIA MGX™ systems, is optimized for
cloud-scale inference.
"In the era of AI, the unit of compute is now measured by
clusters, not just the number of servers, and with our expanded
global manufacturing capacity of 5,000 racks/month, we can deliver
complete generative AI clusters to our customers faster than ever
before," said Charles Liang, president and CEO of Supermicro.
"A 64-node cluster enables 512 NVIDIA HGX H200 GPUs with 72TB
of HBM3e through a couple of our scalable cluster building blocks
with 400Gb/s NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet
networking. Supermicro's SuperCluster solutions combined with
NVIDIA AI Enterprise software are ideal for enterprise and cloud
infrastructures to train today's LLMs with up to trillions of
parameters. The interconnected GPUs, CPUs, memory, storage, and
networking, when deployed across multiple nodes in racks, construct
the foundation of today's AI. Supermicro's SuperCluster solutions
provide foundational building blocks for rapidly evolving
generative AI and LLMs."
To learn more about the Supermicro AI SuperClusters, visit:
www.supermicro.com/ai-supercluster
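The cluster figures in Charles Liang's quote above can be sanity-checked with a few lines of arithmetic, assuming 8 GPUs per HGX H200 node and 141 GB of HBM3e per H200 GPU (NVIDIA's published spec for the H200 SXM):

```python
# Sanity-check the quoted SuperCluster figures.
# Assumptions: 8 GPUs per HGX H200 node, 141 GB of HBM3e per H200 GPU.
nodes = 64
gpus_per_node = 8
hbm_per_gpu_gb = 141

total_gpus = nodes * gpus_per_node
total_hbm_tb = total_gpus * hbm_per_gpu_gb / 1000  # decimal TB

print(total_gpus)    # 512 GPUs
print(total_hbm_tb)  # ~72 TB of HBM3e
```

The 64-node cluster thus yields 512 GPUs and roughly 72 TB of aggregate HBM3e, matching the quoted figures.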
"NVIDIA's latest GPU, CPU, networking and software technologies
enable systems makers to accelerate a range of next-generation AI
workloads for global markets," said Kaustubh Sanghani, vice president of GPU Product
Management at NVIDIA. "By leveraging the NVIDIA accelerated
computing platform with Blackwell architecture-based products,
Supermicro is providing customers with the cutting-edge server
systems they need that can easily be deployed in data centers."
Supermicro 4U NVIDIA HGX H100/H200 8-GPU systems double the
density of the 8U air-cooled system by using liquid cooling,
reducing energy consumption and lowering data center TCO. These
systems are designed to support the next-generation NVIDIA
Blackwell architecture-based GPUs. The Supermicro cooling
distribution unit (CDU) and manifold (CDM) are the main arteries
for distributing cooled liquid to Supermicro's custom
direct-to-chip (D2C) cold plates, keeping GPUs and CPUs at optimal
temperature, resulting in maximum performance. This cooling
technology enables up to a 40% reduction in electricity costs for
the entire data center and saves data center real estate space.
Learn more about Supermicro Liquid Cooling technology:
https://www.supermicro.com/en/solutions/liquid-cooling
The NVIDIA HGX H100/H200 8-GPU systems are ideal
for training generative AI. High-speed GPU interconnects
through NVIDIA® NVLink®, together with high GPU memory
bandwidth and capacity, are key to running LLMs
cost-effectively. The Supermicro SuperCluster creates a massive pool
of GPU resources acting as a single AI supercomputer.
Whether training an enormous foundation model from scratch on
a dataset with trillions of tokens or building a
cloud-scale LLM inference infrastructure, the spine-leaf
network topology with non-blocking 400Gb/s fabrics allows seamless
scaling from 32 nodes to thousands of nodes. With fully
integrated liquid cooling, Supermicro's proven testing processes
thoroughly validate operational effectiveness and efficiency
before shipping.
Supermicro's NVIDIA MGX™ system designs featuring the
NVIDIA GH200 Grace Hopper Superchip will create a blueprint for
future AI clusters that address a crucial bottleneck in generative
AI: the GPU memory bandwidth and capacity to run large language
models (LLMs) with high inference batch sizes to lower operational
costs. The 256-node cluster enables a cloud-scale, high-volume
inference powerhouse that is easily deployable and scalable.
SuperCluster with 4U Liquid-cooled System in 5 Racks or 8U
Air-cooled System in 9 Racks
- 256 NVIDIA H100/H200 Tensor Core GPUs in one scalable unit
- Liquid cooling enabling 512 GPUs (64 nodes) in the same
footprint as the air-cooled 256-GPU (32-node) solution
- 20TB of HBM3 with NVIDIA H100 or 36TB of HBM3e with NVIDIA H200
in one scalable unit
- 1:1 networking delivers up to 400 Gbps to each GPU to enable
GPUDirect RDMA and Storage for training large language models with
up to trillions of parameters
- 400G InfiniBand or 400GbE Ethernet switch fabrics with highly
scalable spine-leaf network topology, including NVIDIA Quantum-2
InfiniBand and NVIDIA Spectrum-X Ethernet Platform
- Customizable AI data pipeline storage fabric with
industry-leading parallel file system options
- NVIDIA AI Enterprise 5.0 software, which brings support for new
NVIDIA NIM inference microservices that accelerate the deployment
of AI models at scale
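The per-scalable-unit HBM totals listed above follow directly from the GPU counts, assuming 80 GB of HBM3 per H100 SXM and 141 GB of HBM3e per H200 (NVIDIA's published specs):

```python
# Reproduce the per-scalable-unit HBM figures from the spec list.
# Assumptions: 80 GB HBM3 per H100 SXM, 141 GB HBM3e per H200.
gpus = 256
h100_tb = gpus * 80 / 1000   # -> "20TB of HBM3" with NVIDIA H100
h200_tb = gpus * 141 / 1000  # -> "36TB of HBM3e" with NVIDIA H200
print(h100_tb, h200_tb)
```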
SuperCluster with 1U Air-cooled NVIDIA MGX System in 9
Racks
- 256 GH200 Grace Hopper Superchips in one scalable unit
- Up to 144GB of HBM3e + 480GB of LPDDR5X unified memory suitable
for cloud-scale, high-volume, low-latency, high-batch-size
inference, able to fit a 70B+ parameter model in one node
- 400G InfiniBand or 400GbE Ethernet switch fabrics with highly
scalable spine-leaf network topology
- Up to 8 built-in E1.S NVMe storage devices per node
- Customizable AI data pipeline storage fabric with NVIDIA
BlueField®-3 DPUs and industry-leading parallel file
system options to deliver high-throughput and low-latency storage
access to each GPU
- NVIDIA AI Enterprise 5.0 software
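The "70B+ parameter model in one node" claim above can be checked with a rough memory-footprint estimate. This is a simplified sketch that assumes FP16/BF16 weights at 2 bytes per parameter and ignores KV-cache and activation memory, which the unified LPDDR5X pool can absorb:

```python
# Rough check: do 70B FP16 weights fit in one GH200 node's memory?
# Assumptions: 2 bytes per parameter (FP16/BF16); KV-cache and
# activations not counted here.
params = 70e9
bytes_per_param = 2
weights_gb = params * bytes_per_param / 1e9  # 140 GB of weights

hbm3e_gb = 144            # GH200 HBM3e
unified_gb = 144 + 480    # HBM3e + LPDDR5X unified memory
print(weights_gb <= hbm3e_gb)    # weights alone fit in HBM3e
print(weights_gb <= unified_gb)  # ample headroom in unified memory
```

At 2 bytes per parameter, 70B parameters need 140 GB, which just fits within the 144 GB of HBM3e and leaves the 480 GB of LPDDR5X for caches and batching headroom.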
With the highest network performance achievable for GPU-to-GPU
connectivity, Supermicro's SuperCluster solutions are optimized for
LLM training, deep learning, and high-volume, high-batch-size
inference. Supermicro's L11 and L12 validation testing, combined
with its on-site deployment service, provides customers with a
seamless experience. Customers receive plug-and-play scalable units
for easy deployment in a data center and faster time to
results.
About Super Micro Computer, Inc.
Supermicro (NASDAQ: SMCI) is a global leader in
Application-Optimized Total IT Solutions. Founded and operating in
San Jose, California, Supermicro
is committed to delivering first-to-market innovation for
Enterprise, Cloud, AI, and 5G Telco/Edge IT Infrastructure. We are
a Total IT Solutions manufacturer with server, AI, storage, IoT,
switch systems, software, and support services. Supermicro's
motherboard, power, and chassis design expertise underpins our
development and production, enabling next-generation innovation
from cloud to edge for our global customers. Our products are
designed and manufactured in-house (in the US, Taiwan, and the
Netherlands), leveraging global operations for scale and
efficiency and optimized to improve TCO and reduce environmental
impact (Green Computing). The award-winning portfolio of Server
Building Block Solutions® allows customers to optimize
for their exact workload and application by selecting from a broad
family of systems built from our flexible and reusable building
blocks that support a comprehensive set of form factors,
processors, memory, GPUs, storage, networking, power, and cooling
solutions (air-conditioned, free air cooling or liquid
cooling).
Supermicro, Server Building Block Solutions, and We Keep IT
Green are trademarks and/or registered trademarks of Super Micro
Computer, Inc.
All other brands, names, and trademarks are the property of
their respective owners.
Photo
- https://mma.prnewswire.com/media/2365342/Super_Micro_Computer_Inc.jpg
Logo -
https://mma.prnewswire.com/media/1443241/Supermicro_Logo.jpg
View original
content: https://www.prnewswire.co.uk/news-releases/supermicro-launches-three-nvidia-based-full-stack-ready-to-deploy-generative-ai-superclusters-that-scale-from-enterprise-to-large-llm-infrastructures-302092055.html