Multi-Instance GPU

NVIDIA Multi-Instance GPU (MIG) is a feature introduced with the NVIDIA Ampere architecture. It allows a single GPU to be securely partitioned into up to seven separate GPU instances. Each instance has its own high-bandwidth memory, cache, and compute cores. This gives multiple users separate GPU resources and improves overall utilization.

MIG is particularly beneficial for workloads that do not fully saturate the GPU’s compute capacity. In that case, users can run different workloads in parallel on separate instances to maximize utilization.

Each instance’s processors have separate and isolated paths through the entire memory system. This ensures that an individual user’s workload can run with predictable throughput and latency. MIG can partition available GPU compute resources to provide a defined quality of service (QoS) with fault isolation for different clients.

MIG Profile

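MIG uses the notation Xg.Ygb for a profile to indicate its compute and memory allocation, where X is the number of compute slices (GPU slices) and Y is the memory size in GB. For example, `2g.20gb` is an instance with 2 compute slices and 20 GB of memory.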
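The notation can be split mechanically into its two numbers. The following is a minimal sketch, not part of any NVIDIA tool: the names `MigProfile` and `parse_mig_profile` are hypothetical, and the pattern only accepts the plain `Xg.Ygb` form used on this page (profile names with extra suffixes are rejected).

```python
import re
from typing import NamedTuple


class MigProfile(NamedTuple):
    """Hypothetical container for the two numbers in an Xg.Ygb name."""
    compute_slices: int  # X in Xg.Ygb
    memory_gb: int       # Y in Xg.Ygb


# Matches only the plain form, e.g. "2g.20gb".
_PROFILE_RE = re.compile(r"(\d+)g\.(\d+)gb")


def parse_mig_profile(name: str) -> MigProfile:
    """Split a profile name such as '2g.20gb' into slices and memory."""
    match = _PROFILE_RE.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a MIG profile name: {name!r}")
    return MigProfile(int(match.group(1)), int(match.group(2)))


print(parse_mig_profile("2g.20gb"))
# MigProfile(compute_slices=2, memory_gb=20)
```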

According to the A800 specification, each GPU has 7 compute slices and 80 GB of memory. The following profile types can be assigned to it:

1g.10gb
1g.20gb
2g.20gb
3g.40gb
4g.40gb
7g.80gb

The following table shows the MIG profiles currently assigned to the GPUs on every DGX node.

GPU	MIG Profile
GPU0	`2g.20gb` `2g.20gb` `2g.20gb` `1g.20gb`
GPU1	`2g.20gb` `2g.20gb` `2g.20gb` `1g.20gb`
GPU2	`2g.20gb` `2g.20gb` `2g.20gb` `1g.20gb`
GPU3	`2g.20gb` `2g.20gb` `2g.20gb` `1g.20gb`
GPU4	`2g.20gb` `2g.20gb` `2g.20gb` `1g.20gb`
GPU5	`2g.20gb` `2g.20gb` `2g.20gb` `1g.20gb`
GPU6	`3g.40gb` `4g.40gb`
GPU7	`7g.80gb`
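As a sanity check on the table, the compute slices and memory of each GPU's profiles can be added up and compared with the 7 slices and 80 GB available per GPU. The sketch below is only an arithmetic check under that assumption: `LAYOUT` is transcribed from the table, `layout_usage` is a hypothetical helper, and the placement rules that the driver enforces when instances are actually created are not modelled.

```python
import re

TOTAL_SLICES = 7      # compute slices per GPU, as stated above
TOTAL_MEMORY_GB = 80  # memory per GPU, as stated above

# Transcribed from the table; GPU0 to GPU5 share the same layout.
LAYOUT = {f"GPU{i}": ["2g.20gb", "2g.20gb", "2g.20gb", "1g.20gb"] for i in range(6)}
LAYOUT["GPU6"] = ["3g.40gb", "4g.40gb"]
LAYOUT["GPU7"] = ["7g.80gb"]


def layout_usage(profiles):
    """Return (total compute slices, total memory in GB) for a list of profiles."""
    slices = memory = 0
    for name in profiles:
        match = re.fullmatch(r"(\d+)g\.(\d+)gb", name)
        if match is None:
            raise ValueError(f"not a MIG profile name: {name!r}")
        slices += int(match.group(1))
        memory += int(match.group(2))
    return slices, memory


for gpu, profiles in LAYOUT.items():
    slices, memory = layout_usage(profiles)
    fits = slices <= TOTAL_SLICES and memory <= TOTAL_MEMORY_GB
    print(f"{gpu}: {slices}/{TOTAL_SLICES} slices, "
          f"{memory}/{TOTAL_MEMORY_GB} GB, fits={fits}")
```

Every row in the table adds up to exactly 7 slices and 80 GB, so each GPU is fully allocated.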

https://www.nvidia.com/en-us/technologies/multi-instance-gpu/