How to enable NVIDIA GPU workloads on a k3s cluster

GPU workloads on k3s

K3s is a lightweight, highly available, certified Kubernetes distribution designed for production workloads. It can also run GPU-accelerated AI workloads.

By default, k3s nodes do not recognize GPUs. In this article, we will enable k3s to work with a GPU.

Step 1 : Install NVIDIA drivers

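# Ubuntu / Debian
sudo apt-get update
sudo apt-get install -y nvidia-driver-535
# nvidia-container-toolkit comes from NVIDIA's apt repository, which may need to be added first
sudo apt-get install -y nvidia-container-toolkit
# Only applies if a standalone containerd service exists; k3s ships its own embedded containerd
sudo systemctl restart containerd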

Verify that the NVIDIA container library can see the driver and the GPU:

nvidia-container-cli info
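
Before moving on, it can help to confirm that both NVIDIA tools are actually on the PATH. The following is a minimal sketch, not part of any NVIDIA tooling; require_cmds is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: succeeds only if every named command is found on PATH.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the GPU node:
#   require_cmds nvidia-smi nvidia-container-cli && nvidia-smi
```

If either command is reported missing, the driver or toolkit installation most likely did not complete.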

Step 2 : Install k3s

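k3s ships with its own embedded containerd, so the NVIDIA runtime must be enabled there. k3s detects the NVIDIA container runtime at startup if it is already installed on the node.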

Create the k3s config file:

sudo mkdir -p /etc/rancher/k3s
sudo nano /etc/rancher/k3s/config.yaml

Add the following to the config file. The empty container-runtime-endpoint keeps k3s on its embedded containerd, which is the default:

write-kubeconfig-mode: "0644"
container-runtime-endpoint: ""

Now install k3s. The command below can be customized further; see the k3s installation page for details:

curl -sfL https://get.k3s.io | sh -

Restart k3s so that it picks up the NVIDIA runtime:

sudo systemctl restart k3s
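
To check whether k3s picked up the NVIDIA runtime, one option is to look for it in the containerd config that k3s generates. This is a sketch under the assumption that k3s uses its default data directory; has_nvidia_runtime is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the given containerd config mentions an nvidia runtime.
# The default path assumes the standard k3s data directory.
has_nvidia_runtime() {
  config="${1:-/var/lib/rancher/k3s/agent/etc/containerd/config.toml}"
  grep -q 'nvidia' "$config" 2>/dev/null
}

# Usage (root is usually needed to read the file):
#   sudo grep nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml
```

If nothing matches, the toolkit was probably not installed when k3s last started, and another restart of k3s should regenerate the file.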

Step 3 : Install Helm

Install Helm using the official installer script:

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Point Helm explicitly to the k3s kubeconfig:

export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
sudo chmod 644 /etc/rancher/k3s/k3s.yaml
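
If the kubeconfig is missing or unreadable, Helm and kubectl fail with fairly unhelpful connection errors. A small guard like the one below can catch this early; use_kubeconfig is a hypothetical helper, and the default path is the one used in this article.

```shell
# Hypothetical helper: export KUBECONFIG only if the file is readable by the current user.
use_kubeconfig() {
  path="${1:-/etc/rancher/k3s/k3s.yaml}"
  if [ -r "$path" ]; then
    export KUBECONFIG="$path"
  else
    echo "cannot read kubeconfig: $path" >&2
    return 1
  fi
}

# Usage:
#   use_kubeconfig && kubectl get nodes
```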

Step 4 : Install NVIDIA GPU Operator

helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator \
  --namespace gpu-operator \
  --create-namespace \
  --set driver.enabled=false
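
The operator needs a few minutes to roll out its pods. The sketch below is one way to spot pods that are not ready yet, assuming the gpu-operator namespace from the install command; pending_pods is a hypothetical helper, not a kubectl feature.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin and
# prints the names of pods whose STATUS column is neither Running nor Completed.
pending_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage:
#   kubectl -n gpu-operator get pods --no-headers | pending_pods
```

An empty result suggests the rollout has finished; any names printed are pods still initializing or failing.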

Once the GPU Operator pods are up, create a test manifest called cuda-test.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: cuda-test
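spec:
  restartPolicy: Never
  # RuntimeClass created by the GPU Operator; needed when nvidia is not containerd's default runtime
  runtimeClassName: nvidia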
  containers:
    - name: cuda
      image: nvidia/cuda:12.2.0-base-ubuntu22.04
      resources:
        limits:
          nvidia.com/gpu: 1
      command:
        - bash
        - -c
        - |
          echo "=== NVIDIA-SMI OUTPUT ==="
          nvidia-smi
          echo "========================="
          sleep 3600

Use the commands below to create the test pod and check its status:

kubectl apply -f cuda-test.yaml
kubectl get pods
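
If the pod stays in Pending, it is worth checking whether the node advertises the nvidia.com/gpu resource at all. The sketch below extracts that value from kubectl describe node output; gpu_capacity is a hypothetical helper, and it assumes the usual "nvidia.com/gpu: N" line under Capacity.

```shell
# Hypothetical helper: reads `kubectl describe node` output on stdin and prints
# the first nvidia.com/gpu count it finds (the Capacity entry). Prints nothing if absent.
gpu_capacity() {
  awk '$1 == "nvidia.com/gpu:" { print $2; exit }'
}

# Usage:
#   kubectl describe node | gpu_capacity
```

No output means the device plugin has not registered the GPU yet, so the scheduler cannot place the test pod.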

If everything is successful, the pod reaches the Running state, and kubectl logs cuda-test shows the nvidia-smi table listing your GPU.

You can now use the node for GPU workloads.