DefiledAI Community
BENCHMARK SUBMISSIONS
Submit your real-world inference results. Results are stored locally in your browser and displayed below.
Submit Your Result
Community Results (10)
| Model | Quant | Backend | GPU | VRAM | Tok/s | Context (tokens) | OS | Date |
|---|---|---|---|---|---|---|---|---|
| Llama 3.1 8B | Q4_K_M | ExLlamaV2 | RTX 4090 | 24GB | 128 | 4,096 | Windows 11 | 2026-05-27 |
| Llama 3.1 8B | Q8_0 | Ollama | RTX 4090 | 24GB | 94 | 4,096 | Windows 11 | 2026-05-27 |
| Mistral 7B | Q8_0 | Ollama | RTX 3080 10GB | 10GB | 89 | 4,096 | Windows 11 | 2026-05-23 |
| Phi-3 Medium 14B | Q6_K | llama.cpp | RTX 4090 | 24GB | 68 | 8,192 | Ubuntu 22.04 | 2026-05-22 |
| Gemma 2 27B | Q4_K_M | llama.cpp | RTX 4090 | 24GB | 44 | 4,096 | Windows 11 | 2026-05-24 |
| Mixtral 8x22B | Q4_K_M | ExLlamaV2 | 2× RTX 3090 NVLink | 48GB | 24.7 | 4,096 | Ubuntu 22.04 | 2026-05-25 |
| Llama 3.1 70B | Q4_K_M | ExLlamaV2 | 2× RTX 3090 NVLink | 48GB | 21.3 | 4,096 | Ubuntu 22.04 | 2026-05-28 |
| Qwen 3 72B | Q4_K_M | ExLlamaV2 | 2× RTX 3090 NVLink | 48GB | 19.8 | 4,096 | Ubuntu 22.04 | 2026-05-26 |
| DeepSeek R1 70B | Q4_K_M | ExLlamaV2 | 2× RTX 3090 NVLink | 48GB | 19.2 | 4,096 | Ubuntu 22.04 | 2026-05-26 |
| Llama 3.1 70B | Q4_K_M | llama.cpp | 2× RTX 3090 NVLink | 48GB | 17.8 | 4,096 | Ubuntu 22.04 | 2026-05-28 |
Your submissions are stored only in your browser. Tok/s is the sustained output rate in tokens per second, with the first token excluded.
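
For anyone measuring their own run, here is a minimal sketch of one way to compute Tok/s as defined above. It assumes you can record a timestamp as each output token arrives, and it reads "first token excluded" as meaning that the wait for the first token (prompt processing) is left out and the first token itself is not counted. The function name `sustained_tok_per_s` is hypothetical and is not part of any backend listed in the table.

```python
from typing import Sequence


def sustained_tok_per_s(token_times: Sequence[float]) -> float:
    """Sustained output rate in tokens per second, first token excluded.

    token_times holds the arrival time of each output token in seconds,
    in order, taken from a monotonic clock (an assumption of this sketch).
    """
    if len(token_times) < 2:
        raise ValueError("need at least two tokens to measure a sustained rate")

    # Start the clock at the first token, so time spent before it
    # (prompt processing) does not count toward the rate.
    elapsed = token_times[-1] - token_times[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be strictly increasing overall")

    # The first token only marks the start, so it is not counted.
    return (len(token_times) - 1) / elapsed


# Example: first token arrives at 2.0 s, then one token every 0.25 s.
times = [2.0, 2.25, 2.5, 2.75, 3.0]
print(sustained_tok_per_s(times))  # 4 tokens over 1.0 s -> 4.0
```

Dividing the total token count by total wall-clock time instead would fold prompt processing into the figure and give a lower number, so results computed that way are not comparable with the ones listed here.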