
How to use GPUs

If you need a GPU, you can request one by doing two things:

  • Add a resource limit for nvidia.com/gpu with the number of GPUs you need, for example 1 (this is constrained by the number of GPUs available and by your quota).
  • Add a toleration for nvidia.com/gpu. This is required because we taint the GPU nodes to keep non-GPU workloads off them.

For example:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-test
spec:
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/cuda:9.0-devel
      resources:
        limits:
          nvidia.com/gpu: 1 # Request 1 NVIDIA GPU
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
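
To check that the GPU is actually visible inside the container, you can run a one-shot pod that executes nvidia-smi and exits. This is a sketch: the pod name is arbitrary, and nvidia-smi is assumed to be made available inside GPU containers by the runtime, as is typical on NVIDIA-enabled clusters.

apiVersion: v1
kind: Pod
metadata:
  name: gpu-smi-test
spec:
  restartPolicy: Never
  containers:
    - name: nvidia-smi
      image: nvcr.io/nvidia/cuda:9.0-devel
      command: ["nvidia-smi"] # Print the driver version and visible GPUs, then exit
      resources:
        limits:
          nvidia.com/gpu: 1 # Request 1 NVIDIA GPU
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule

Once the pod completes, kubectl logs gpu-smi-test should show the GPU that was allocated.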

If you are using a Job, put the resource limit and toleration in the pod template:

apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-test
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: cuda-container
          image: nvcr.io/nvidia/cuda:9.0-devel
          resources:
            limits:
              nvidia.com/gpu: 1 # Request 1 NVIDIA GPU
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule

List of available GPUs

  • hsrn-edbse-12wvpl
    • NVIDIA Quadro RTX 8000 (48 GB)
  • hsrn-ed8a-2mt
    • NVIDIA RTX A6000 (48 GB)
    • NVIDIA Quadro RTX 8000 (48 GB)
  • hsrn-ed2a-60fifthave
    • NVIDIA Quadro RTX 8000 (48 GB)
  • hsrn-ed2n-rcdc
    • NVIDIA RTX A6000 (48 GB)
    • NVIDIA Quadro RTX 8000 (48 GB)
  • hsrn-ed10d-7e12
    • NVIDIA A100 (80 GB)
    • NVIDIA A100 (80 GB)
  • hsrn-ed2d-wwh
    • NVIDIA A100 (40 GB)
    • NVIDIA A100 (40 GB)
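
If you need a particular model from the list above, you can pin your pod to the node that has it with a nodeSelector on the standard kubernetes.io/hostname label. This is a sketch that assumes the names above are node hostnames; adjust it to whatever labels the cluster actually exposes:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test-a100
spec:
  nodeSelector:
    kubernetes.io/hostname: hsrn-ed10d-7e12 # Assumed hostname of an A100 node from the list above
  containers:
    - name: cuda-container
      image: nvcr.io/nvidia/cuda:9.0-devel
      resources:
        limits:
          nvidia.com/gpu: 1 # Request 1 NVIDIA GPU
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule

Keep in mind that pinning to a single node means the pod will stay Pending whenever that node's GPUs are busy or the node is down.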