Production Runbook: GPU Pod Stuck in Pending
The complete debug path from symptoms to fix. Every step has a command.
Your GPU pod is stuck in the Pending state. The events say:

```
0/12 nodes are available: 12 Insufficient nvidia.com/gpu
```
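That one line packs the whole scheduling verdict: schedulable nodes out of total nodes, then the predicate each failing node tripped. A quick shell parse of the line above (copied verbatim from the event) makes the fields explicit:

```shell
# The event line from the pod's events; the numbers read as
# <schedulable>/<total> nodes, then per-predicate failure counts.
event='0/12 nodes are available: 12 Insufficient nvidia.com/gpu'

# Split on "/" and space to pull out the two node counts.
echo "$event" | awk -F'[/ ]' '{print "schedulable:", $1, "of", $2}'
# → schedulable: 0 of 12
```

Here all 12 nodes fail the same predicate, which is what makes the message ambiguous: "Insufficient nvidia.com/gpu" is the scheduler's summary, not the root cause.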
That one event can mean six different things. Most engineers start by debugging the scheduler; that's almost never the problem.
This runbook walks through the exact diagnostic sequence, in the right order, so you find the root cause in minutes instead of hours.
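Before the first diagnostic step, pull the raw evidence from the cluster. A minimal sketch, assuming a reachable cluster and kubectl configured; the pod name `gpu-job` and namespace `default` are placeholders for your own:

```shell
#!/usr/bin/env sh
# Hypothetical pod name and namespace; substitute your own.
POD=gpu-job
NS=default

# Show the scheduler's verdict in the pod's event stream.
kubectl -n "$NS" describe pod "$POD" | sed -n '/^Events:/,$p'

# The same failures, queried from the events API and sorted by recency,
# so you see the latest scheduling attempt rather than the first one.
kubectl -n "$NS" get events \
  --field-selector "involvedObject.name=$POD" \
  --sort-by=.lastTimestamp
```

Capture this output before changing anything: the event message can differ between scheduling attempts, and the most recent one is the one worth debugging.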