DefiledAI Tools

VRAM CALCULATOR

Calculate exact VRAM requirements for any model size and quantization format, including KV cache overhead at your chosen context length.

Quality: 92%
✗ EXCEEDS VRAM
150.5 GB
102.5 GB short. Try a lower quant.
Breakdown
Model weights (Q4_K_M)    70.0 GB
KV cache (4,096 ctx)      80.0 GB
Runtime overhead           0.5 GB
Total                    150.5 GB
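
The breakdown is a plain sum of three components, and the shortfall message is that sum minus the available VRAM. The following is a minimal sketch of that arithmetic using the figures shown on this page. The function names are hypothetical, and the 48 GB of available VRAM is an assumption inferred from the reported total and shortfall (150.5 - 102.5); the page does not state it directly.

```python
# Figures copied from the breakdown on this page (all in GB).
WEIGHTS_GB = 70.0    # Model weights (Q4_K_M)
KV_CACHE_GB = 80.0   # KV cache (4,096 ctx)
OVERHEAD_GB = 0.5    # Runtime overhead

# ASSUMPTION: not shown on the page; inferred as 150.5 - 102.5.
AVAILABLE_GB = 48.0


def total_vram_gb(weights: float, kv_cache: float, overhead: float) -> float:
    """Sum the three components of the breakdown."""
    return weights + kv_cache + overhead


def shortfall_gb(total: float, available: float) -> float:
    """Return how far the requirement exceeds the available VRAM (0 if it fits)."""
    return max(0.0, total - available)


total = total_vram_gb(WEIGHTS_GB, KV_CACHE_GB, OVERHEAD_GB)
short = shortfall_gb(total, AVAILABLE_GB)

status = "EXCEEDS VRAM" if short > 0 else "FITS"
print(f"{status}: total {total:.1f} GB, {short:.1f} GB short")
```

This only reproduces the sum and the comparison; how the calculator derives each component is not shown on the page and is not modeled here.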
All Quants at 70B
F16       360.5 GB
Q8_0      220.5 GB
Q6_K      185.5 GB
Q5_K_M    168.0 GB
Q4_K_M    150.5 GB
Q3_K_M    133.0 GB
IQ3_M     141.8 GB
Q2_K      115.5 GB
IQ1_M     106.8 GB
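
To act on the "try a lower quant" hint, the per-quant totals can be filtered against a VRAM budget. The sketch below is illustrative only: the totals are copied from the list above, the helper name `quants_that_fit` is hypothetical, and the 48 GB budget is an assumption inferred from the shortfall message (150.5 GB total, 102.5 GB short).

```python
# Total VRAM per quant at 70B, copied from the list on this page (GB).
TOTALS_GB = {
    "F16": 360.5,
    "Q8_0": 220.5,
    "Q6_K": 185.5,
    "Q5_K_M": 168.0,
    "Q4_K_M": 150.5,
    "Q3_K_M": 133.0,
    "IQ3_M": 141.8,
    "Q2_K": 115.5,
    "IQ1_M": 106.8,
}


def quants_that_fit(totals: dict[str, float], available_gb: float) -> list[str]:
    """Return the quant names whose total requirement fits in available_gb."""
    return [name for name, gb in totals.items() if gb <= available_gb]


# ASSUMPTION: 48 GB budget inferred from the shortfall message, not stated.
fits_48 = quants_that_fit(TOTALS_GB, 48.0)
# A second, purely hypothetical budget for comparison.
fits_120 = quants_that_fit(TOTALS_GB, 120.0)

print("Fits in 48 GB:", fits_48 or "none")
print("Fits in 120 GB:", fits_120 or "none")
```

With these figures no quant fits the assumed 48 GB budget, so at this context length a lower quant alone would not be enough.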