
      NVIDIA HGX-2

      Powered by NVIDIA Tesla V100 GPUs and NVSwitch


      World’s Most Powerful Accelerated Server Platform for Deep Learning, Machine Learning, and HPC

      We’re at the dawn of a new age of intelligence, where deep learning, machine learning, and high-performance computing (HPC) are transforming our world. From autonomous vehicles and retail logistics optimization to global climate simulations, new challenges are emerging whose solutions demand enormous computing resources.

      NVIDIA HGX-2 is the world’s most powerful accelerated scale-up server platform. Designed with mixed-precision computing, it accelerates every workload to solve these massive challenges. The HGX-2 platform was used to set records on MLPerf, the first industry-wide AI benchmark, delivering the highest single-node performance and validating it as the world’s most powerful, versatile, and scalable computing platform.

      Enables “the World’s Largest GPU”

      Accelerated by 16 NVIDIA® Tesla® V100 GPUs and NVIDIA NVSwitch™, HGX-2 has the unprecedented compute power, bandwidth, and memory topology to train massive models, analyze datasets, and solve simulations faster and more efficiently. The 16 Tesla V100 GPUs work as a single unified 2-petaFLOP accelerator with half a terabyte (TB) of total GPU memory, allowing it to handle the most computationally intensive workloads and enable "the world’s largest GPU."
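As a back-of-the-envelope check, the headline figures follow from the per-GPU specs. The per-GPU values below are inferred from the HGX-1 row of the spec table later on this page (8 GPUs delivering 1 petaFLOP of tensor operations and 256GB of memory), not stated explicitly in this paragraph:

```python
# Aggregate HGX-2 figures from per-GPU Tesla V100 specs.
# Per-GPU values inferred from the HGX-1 row of the spec table
# (8 GPUs -> 1 petaFLOP tensor operations, 256GB total memory).
NUM_GPUS = 16
TENSOR_TFLOPS_PER_GPU = 125      # teraFLOPS of tensor operations per V100
HBM2_GB_PER_GPU = 32             # GB of HBM2 per V100

total_tensor_pflops = NUM_GPUS * TENSOR_TFLOPS_PER_GPU / 1000
total_memory_gb = NUM_GPUS * HBM2_GB_PER_GPU

print(total_tensor_pflops)  # 2.0 petaFLOPS
print(total_memory_gb)      # 512 GB, i.e. half a terabyte
```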


      AI Training: HGX-2 Replaces 300 CPU-Only Server Nodes

      Workload: ResNet-50, 90 epochs to solution | CPU Server: Dual-Socket Intel Xeon Gold 6140 | Dataset: ImageNet2012

      Driving Next-Generation AI Deep Learning Performance

      AI deep learning models are exploding in complexity and require large memory capacity, multiple GPUs, and an extremely fast connection between those GPUs. With NVSwitch connecting all GPUs and unified memory, HGX-2 provides the power to handle these new models for faster training of advanced AI. A single HGX-2 replaces 300 CPU-powered servers, saving significant cost, space, and energy in the data center.

      Machine Learning: HGX-2 Delivers a 544X Speedup Over CPU-Only Server Nodes

      GPU measurements completed on DGX-2 | CPU: 20-node CPU cluster; comparison prorated to one CPU node (61GB of memory, 8 vCPUs, 64-bit platform) running Apache Spark | US mortgage data from Fannie Mae and Freddie Mac, 2006–2017 | 146M mortgages | 200GB CSV benchmark dataset | Data preparation includes joins and variable transformations

      Driving Next-Generation AI Machine Learning Performance

      AI machine learning models require loading, transforming, and processing extremely large datasets to glean insights. With 0.5TB of unified memory accessible at a bandwidth of 16TB/s, and all-to-all GPU communication through NVSwitch, HGX-2 has the power to load and perform calculations on enormous datasets to derive actionable insights quickly. With the RAPIDS open-source machine learning software, a single HGX-2 replaces 544 CPU-based servers, generating significant cost and space savings.
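The data-preparation workload in the benchmark footnote above (joins plus variable transformations) follows a familiar pattern. A minimal plain-Python sketch of those two steps is below; the records and field names are made-up toy data, not the actual Fannie Mae/Freddie Mac schema, and on HGX-2 the same steps would run on GPU dataframes via RAPIDS:

```python
# Toy sketch of the benchmark's data-prep steps: a join between loan
# records and performance records, then a derived-variable transformation.
# All records and field names here are illustrative, not the real schema.
loans = [
    {"loan_id": 1, "orig_balance": 200_000},
    {"loan_id": 2, "orig_balance": 350_000},
]
performance = [
    {"loan_id": 1, "current_balance": 150_000},
    {"loan_id": 2, "current_balance": 340_000},
]

# Join on loan_id (a hash join, as a dataframe engine would perform it).
by_id = {row["loan_id"]: row for row in loans}
joined = [{**by_id[p["loan_id"]], **p} for p in performance]

# Variable transformation: derive the fraction of principal repaid.
for row in joined:
    row["paid_off_frac"] = 1 - row["current_balance"] / row["orig_balance"]

print(joined[0]["paid_off_frac"])  # 0.25
```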

      HPC: HGX-2 Replaces up to 135 CPU-Only Server Nodes

      Application (Dataset): MILC (APEX Medium) and Chroma (szscl21_24_128) | CPU Server: Dual-Socket Intel Xeon Platinum 8280 (Cascade Lake)

      The Highest-Performing HPC Supernode

      HPC applications require powerful server nodes with the computing power to perform a massive number of calculations per second. Increasing the compute density of each node dramatically reduces the number of servers required, resulting in huge savings in cost, power, and space consumed in the data center. For HPC simulations, high-dimension matrix multiplication requires a processor to fetch data from many neighbors to facilitate computation, making GPUs connected by NVSwitch ideal. A single HGX-2 server replaces up to 135 CPU-based servers in scientific applications.
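The neighbor-fetch pattern described above shows up even in a simple blocked matrix multiply: each output tile needs tiles from an entire row of A and an entire column of B, so a processor that owns one tile must pull data held by many peers. A toy plain-Python sketch (the tile contents are illustrative; B is the identity so the result is easy to verify):

```python
# Blocked matmul over a 2x2 grid of tiles: output tile C[bi][bj] needs
# every A[bi][bk] and B[bk][bj], i.e. tiles owned by many neighboring
# processors -- the access pattern that rewards all-to-all NVSwitch links.
N, T = 4, 2                      # 4x4 matrices split into 2x2 tiles
A = [[i * N + j + 1 for j in range(N)] for i in range(N)]
B = [[1 if i == j else 0 for j in range(N)] for i in range(N)]  # identity

C = [[0] * N for _ in range(N)]
for bi in range(0, N, T):            # tile row of C
    for bj in range(0, N, T):        # tile column of C
        for bk in range(0, N, T):    # "fetch" the next pair of neighbor tiles
            for i in range(bi, bi + T):
                for j in range(bj, bj + T):
                    for k in range(bk, bk + T):
                        C[i][j] += A[i][k] * B[k][j]

print(C == A)  # True: multiplying by the identity returns A
```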

      NVSwitch for Full-Bandwidth Computing

      NVSwitch enables every GPU to communicate with every other GPU at a full bisection bandwidth of 2.4TB/s to solve the largest AI and HPC problems. Every GPU has full access to 0.5TB of aggregate HBM2 memory at 16TB/s of bandwidth to handle the most massive datasets. By enabling a unified server node, NVSwitch dramatically accelerates complex AI deep learning, AI machine learning, and HPC applications.
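The 2.4TB/s figure is consistent with the per-GPU NVLink budget. A quick sanity check, assuming six NVLink 2.0 links per V100 at 50GB/s bidirectional each (per-link numbers taken from NVIDIA's NVLink specifications, not from this page):

```python
# Sanity check on the 2.4TB/s bisection bandwidth figure.
# Assumption: 6 NVLink 2.0 links per V100 at 50GB/s bidirectional each
# (from NVIDIA's NVLink specs, not stated on this page).
LINKS_PER_GPU = 6
GB_PER_LINK = 50                 # bidirectional GB/s per NVLink 2.0 link
GPUS_PER_BASEBOARD = 8           # GPUs on each side of the bisection

per_gpu_bw = LINKS_PER_GPU * GB_PER_LINK            # 300 GB/s per GPU
bisection_tb = GPUS_PER_BASEBOARD * per_gpu_bw / 1000

print(per_gpu_bw)    # 300 GB/s
print(bisection_tb)  # 2.4 TB/s
```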



      HGX-1 vs. HGX-2

      Performance
        HGX-1: 1 petaFLOP tensor operations | 125 teraFLOPS single-precision | 62 teraFLOPS double-precision
        HGX-2: 2 petaFLOPS tensor operations | 250 teraFLOPS single-precision | 125 teraFLOPS double-precision

      GPUs
        HGX-1: 8x NVIDIA Tesla V100
        HGX-2: 16x NVIDIA Tesla V100

      GPU Memory
        HGX-1: 256GB total, 7.2TB/s bandwidth
        HGX-2: 512GB total, 16TB/s bandwidth

      NVIDIA CUDA® Cores
        HGX-1: 40,960
        HGX-2: 81,920

      NVIDIA Tensor Cores
        HGX-1: 5,120
        HGX-2: 10,240

      Communication Channel
        HGX-1: Hybrid cube mesh powered by NVLink, 300GB/s bisection bandwidth
        HGX-2: NVSwitch powered by NVLink, 2.4TB/s bisection bandwidth

      HGX-1 Reference Architecture

      Powered by NVIDIA Tesla GPUs and NVLink

      NVIDIA HGX-1 is a reference architecture that standardized the design of data centers accelerating AI in the cloud. Based on eight NVIDIA Tesla V100 GPUs in the SXM2 form factor, a hybrid cube mesh topology for scalability, and 1 petaFLOP of compute power, its modular design works seamlessly in hyperscale data centers and delivers a quick, simple path to AI.

      Empowering the Data Center Ecosystem

      NVIDIA partners with the world’s leading manufacturers to rapidly advance AI cloud computing. NVIDIA provides HGX-2 GPU baseboards, design guidelines, and early access to GPU computing technologies for partners to integrate into servers and deliver at scale to their data center ecosystems.