Multi-Instance GPU

MIG

NVIDIA Multi-Instance GPU (MIG) is a feature introduced with the NVIDIA Ampere architecture. It allows a single GPU to be securely partitioned into up to seven separate GPU instances. Each instance has its own high-bandwidth memory, cache, and compute cores. This gives multiple users separate GPU resources and improves overall utilization.

MIG is particularly beneficial for workloads that do not fully saturate the GPU’s compute capacity, because several such workloads can run in parallel on one GPU to maximize utilization.

Each instance’s processors have separate and isolated paths through the entire memory system. This ensures that an individual user’s workload can run with predictable throughput and latency. MIG can partition available GPU compute resources to provide a defined quality of service (QoS) with fault isolation for different clients.

MIG Profile

MIG uses the notation Xg.Ygb to name a profile, where X is the number of compute slices (GPU cores) and Y is the memory size in GB. For example, 2g.20gb is an instance with 2 compute slices and 20 GB of memory.
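The notation is regular enough to parse mechanically. The following is a minimal sketch, not part of any NVIDIA tool; the names `MigProfile` and `parse_profile` are hypothetical and chosen only for illustration.

```python
import re
from typing import NamedTuple


class MigProfile(NamedTuple):
    """Hypothetical container for the two numbers encoded in a profile name."""
    compute_slices: int  # the X in Xg.Ygb
    memory_gb: int       # the Y in Xg.Ygb


# Matches names such as "2g.20gb": digits, "g.", digits, "gb".
_PROFILE_RE = re.compile(r"(\d+)g\.(\d+)gb")


def parse_profile(name: str) -> MigProfile:
    """Split a profile name like '2g.20gb' into slices and memory size."""
    match = _PROFILE_RE.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a MIG profile name: {name!r}")
    return MigProfile(int(match.group(1)), int(match.group(2)))


print(parse_profile("2g.20gb"))
# MigProfile(compute_slices=2, memory_gb=20)
```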

According to the A800 specification, each GPU has 7 compute slices (cores) and 80 GB of memory. The following profiles can be assigned to it:

  • 1g.10gb
  • 1g.20gb
  • 2g.20gb
  • 3g.40gb
  • 4g.40gb
  • 7g.80gb
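As a rough sanity check, the sketch below tests whether a combination of these profiles fits within the 7 slices and 80 GB of a single GPU. The helper name `fits_budget` is hypothetical. The check only sums the two budgets; the driver applies additional placement rules, so passing it does not guarantee that a layout can actually be created.

```python
# Profiles listed above for an 80 GB A800.
ALLOWED_PROFILES = {
    "1g.10gb", "1g.20gb", "2g.20gb", "3g.40gb", "4g.40gb", "7g.80gb",
}
TOTAL_SLICES = 7       # compute slices per GPU
TOTAL_MEMORY_GB = 80   # memory per GPU


def fits_budget(profiles):
    """Return True if the profiles fit in one GPU's slice and memory budget.

    This is a necessary condition only (an assumption of this sketch):
    real MIG placement has further constraints that are not modelled here.
    """
    slices = 0
    memory = 0
    for name in profiles:
        if name not in ALLOWED_PROFILES:
            raise ValueError(f"unknown profile: {name!r}")
        x, y = name.split("g.")   # "2g.20gb" -> "2", "20gb"
        slices += int(x)
        memory += int(y[:-2])     # drop the trailing "gb"
    return slices <= TOTAL_SLICES and memory <= TOTAL_MEMORY_GB


print(fits_budget(["3g.40gb", "4g.40gb"]))  # True: 7 slices, 80 GB
print(fits_budget(["4g.40gb", "4g.40gb"]))  # False: 8 slices exceed 7
```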

The following table shows the MIG profiles currently assigned to the GPUs on every DGX node.

GPU     MIG Profiles
GPU0    2g.20gb  2g.20gb  2g.20gb  1g.20gb
GPU1    2g.20gb  2g.20gb  2g.20gb  1g.20gb
GPU2    2g.20gb  2g.20gb  2g.20gb  1g.20gb
GPU3    2g.20gb  2g.20gb  2g.20gb  1g.20gb
GPU4    2g.20gb  2g.20gb  2g.20gb  1g.20gb
GPU5    2g.20gb  2g.20gb  2g.20gb  1g.20gb
GPU6    3g.40gb  4g.40gb
GPU7    7g.80gb
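To see how many instances of each size a node offers, the table can be tallied. The sketch below simply transcribes the table into a dictionary (the variable names are illustrative) and counts the profiles; it does not query the GPUs.

```python
from collections import Counter

# Transcription of the table above: GPU0 to GPU5 share one layout.
layout = {f"GPU{i}": ["2g.20gb"] * 3 + ["1g.20gb"] for i in range(6)}
layout["GPU6"] = ["3g.40gb", "4g.40gb"]
layout["GPU7"] = ["7g.80gb"]

# Count how often each profile appears across the whole node.
counts = Counter(p for profiles in layout.values() for p in profiles)
total = sum(counts.values())

for profile, n in sorted(counts.items()):
    print(f"{profile}: {n}")
print(f"total instances per node: {total}")
# 1g.20gb: 6
# 2g.20gb: 18
# 3g.40gb: 1
# 4g.40gb: 1
# 7g.80gb: 1
# total instances per node: 27
```

With this layout, each node exposes 27 GPU instances, most of them small 2g.20gb slices, plus one full 7g.80gb GPU for jobs that need an entire device.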