Architecture Template: GPU Node Pool Setup
Complete YAML and shell scripts for a multi-tier GPU cluster with taints, tolerations, affinity, quotas, and priority classes. Copy, configure, deploy.
When to use this template:
Setting up GPU node isolation for the first time
Adding a new GPU tier to an existing cluster
Configuring per-team GPU quotas
Setting up priority classes for GPU workloads
File 1: gpu-node-taints.sh
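Apply taints to GPU nodes. The script taints every node that already carries a gpu-tier label, so run it once per cluster and rerun it when nodes are added, or set the taint at the node pool level so that new nodes receive it automatically.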
#!/bin/bash
# gpu-node-taints.sh
# Apply taints to GPU nodes for workload isolation
set -euo pipefail
echo "=== Tainting Production GPU Nodes (Tier 1) ==="
for node in $(kubectl get nodes -l gpu-tier=production -o jsonpath='{.items[*].metadata.name}'); do
  kubectl taint nodes "$node" nvidia.com/gpu=present:NoSchedule --overwrite
  kubectl label nodes "$node" gpu-tier=production --overwrite
  echo "  Tainted: $node"
done
echo ""
echo "=== Tainting Development GPU Nodes (Tier 2) ==="
for node in $(kubectl get nodes -l gpu-tier=development -o jsonpath='{.items[*].metadata.name}'); do
  kubectl taint nodes "$node" nvidia.com/gpu=present:NoSchedule --overwrite
  kubectl label nodes "$node" gpu-tier=development --overwrite
  echo "  Tainted: $node"
done
echo ""
echo "=== Verification ==="
echo "Production GPU nodes:"
kubectl get nodes -l gpu-tier=production -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
echo ""
echo "Development GPU nodes:"
kubectl get nodes -l gpu-tier=development -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
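Once the nodes are tainted, a workload is scheduled onto them only if it tolerates nvidia.com/gpu=present:NoSchedule. Pairing that toleration with a nodeSelector on gpu-tier pins the workload to one tier. The following is a minimal sketch rather than part of the template: the helper name render_gpu_pod, the pod name, and the container image are hypothetical placeholders, and the nvidia.com/gpu resource limit assumes the NVIDIA device plugin is installed on the GPU nodes.

```shell
#!/bin/bash
# Sketch only: render_gpu_pod is a hypothetical helper, not part of the template.
# It prints a Pod manifest that tolerates the taint applied by gpu-node-taints.sh
# and selects nodes by the gpu-tier label.
render_gpu_pod() {
  local name="$1"   # pod name (placeholder)
  local tier="$2"   # gpu-tier label value: production or development
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  nodeSelector:
    gpu-tier: ${tier}
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: present
      effect: NoSchedule
  containers:
    - name: main
      image: example.com/gpu-workload:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}

# Print a manifest for a pod pinned to the production tier.
render_gpu_pod inference-demo production
```

Piping the output to kubectl apply --dry-run=client -f - should validate the manifest without creating anything; dropping the --dry-run flag would create the pod.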
File 2: priority-classes.yaml
# priority-classes.yaml
# Three tiers of GPU workload priority
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-production
value: 1000000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: |
  Production GPU inference workloads.
  Highest priority. Will preempt development and batch workloads.
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-development
value: 100000
globalDefault: false
preemptionPolicy: PreemptLowerPriority
description: |
  Development GPU workloads (notebooks, experiments).
  Preempted by production. Will preempt batch workloads.
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-batch
value: 10000
globalDefault: false
preemptionPolicy: Never
description: |
  Batch GPU jobs (training, data processing).
  Lowest priority. Will NOT preempt other workloads.
  Waits for available GPUs.
Apply the priority classes:
kubectl apply -f priority-classes.yaml
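A priority class has no effect until a workload names it in spec.priorityClassName. As a hedged illustration, the sketch below renders a Pod manifest that references one of the three classes defined in this file and rejects any other name. The helper render_priority_pod, the pod name, and the image are hypothetical and not part of the template; the toleration is included on the assumption that the target nodes carry the nvidia.com/gpu=present:NoSchedule taint.

```shell
#!/bin/bash
# Sketch only: render_priority_pod is a hypothetical helper, not part of the template.
# It prints a Pod manifest that uses one of the three GPU priority classes.
render_priority_pod() {
  local name="$1"    # pod name (placeholder)
  local class="$2"   # gpu-production, gpu-development, or gpu-batch

  # Reject class names that priority-classes.yaml does not define.
  case "$class" in
    gpu-production|gpu-development|gpu-batch) ;;
    *) echo "unknown priority class: $class" >&2; return 1 ;;
  esac

  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  priorityClassName: ${class}
  restartPolicy: Never
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: present
      effect: NoSchedule
  containers:
    - name: main
      image: example.com/gpu-workload:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
}

# Print a manifest for a batch job pod that waits for free GPUs instead of preempting.
render_priority_pod training-demo gpu-batch
```

With gpu-batch the pod stays Pending until a GPU is free, because that class sets preemptionPolicy: Never; with gpu-production the scheduler may evict lower-priority pods to make room.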



