DefiledAI Community

BENCHMARK SUBMISSIONS

Submit your real-world inference results. Results are stored locally in your browser and displayed below.

Community Results (10)
Model             Quant   Backend    GPU                 VRAM  Tok/s  Ctx    OS            Date
Llama 3.1 8B      Q4_K_M  ExLlamaV2  RTX 4090            24GB  128    4,096  Windows 11    2026-05-27
Llama 3.1 8B      Q8_0    Ollama     RTX 4090            24GB  94     4,096  Windows 11    2026-05-27
Mistral 7B        Q8_0    Ollama     RTX 3080 10GB       10GB  89     4,096  Windows 11    2026-05-23
Phi-3 Medium 14B  Q6_K    llama.cpp  RTX 4090            24GB  68     8,192  Ubuntu 22.04  2026-05-22
Gemma 2 27B       Q4_K_M  llama.cpp  RTX 4090            24GB  44     4,096  Windows 11    2026-05-24
Mixtral 8x22B     Q4_K_M  ExLlamaV2  2× RTX 3090 NVLink  48GB  24.7   4,096  Ubuntu 22.04  2026-05-25
Llama 3.1 70B     Q4_K_M  ExLlamaV2  2× RTX 3090 NVLink  48GB  21.3   4,096  Ubuntu 22.04  2026-05-28
Qwen 3 72B        Q4_K_M  ExLlamaV2  2× RTX 3090 NVLink  48GB  19.8   4,096  Ubuntu 22.04  2026-05-26
DeepSeek R1 70B   Q4_K_M  ExLlamaV2  2× RTX 3090 NVLink  48GB  19.2   4,096  Ubuntu 22.04  2026-05-26
Llama 3.1 70B     Q4_K_M  llama.cpp  2× RTX 3090 NVLink  48GB  17.8   4,096  Ubuntu 22.04  2026-05-28

Your submissions are stored in your browser. Tok/s = sustained output tokens per second, with the first token (and the wait before it) excluded. Ctx = context length in tokens.
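
The page does not say how Tok/s is measured, so the following is only a minimal sketch of one reasonable reading of "first token excluded": count the tokens that arrive after the first one and divide by the time elapsed since the first token arrived. The function name `sustained_tok_per_s` and the `token_times` input (arrival time of each output token, in seconds) are assumptions made for illustration, not part of any backend's API.

```python
from typing import Sequence


def sustained_tok_per_s(token_times: Sequence[float]) -> float:
    """Sustained output rate in tokens per second, first token excluded.

    token_times holds the arrival time (in seconds) of each output token,
    in generation order. The first token only marks the start of the timing
    window, so prompt processing and time-to-first-token are not counted.
    """
    if len(token_times) < 2:
        raise ValueError("need at least two output tokens")
    elapsed = token_times[-1] - token_times[0]
    if elapsed <= 0:
        raise ValueError("token times must be strictly increasing overall")
    # Tokens generated after the first one, divided by the time they took.
    return (len(token_times) - 1) / elapsed


if __name__ == "__main__":
    # Hypothetical run: first token at 2.0 s, then one token every 0.5 s.
    times = [2.0, 2.5, 3.0, 3.5, 4.0]
    print(sustained_tok_per_s(times))  # 4 tokens over 2.0 s
```

Measured this way, a long prompt does not lower the reported figure, which keeps results at the same context length comparable across backends.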