DefiledAI

TOOLS

26 interactive tools for local AI inference. No sign-up. Everything runs in your browser.

Flagship

Local MoE Pipeline Builder

Design a macro-scale Mixture-of-Experts pipeline from independent local models. Router + domain experts + conditional synthesizer. Generates Python code and YAML config.
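The router, expert, and synthesizer stages described above can be sketched as follows. This is a minimal illustration, not the code the tool generates. The keyword router, the stub experts, and every name in it are hypothetical, and "conditional" is assumed to mean the synthesizer runs only when more than one expert answers.

```python
from typing import Callable, Dict, List

# A "model" is any callable mapping a prompt to text. In practice each one
# would wrap a locally hosted model; the stubs below are hypothetical.
Model = Callable[[str], str]


def keyword_router(prompt: str, keywords: Dict[str, List[str]]) -> List[str]:
    """Return the names of every expert whose keywords appear in the prompt."""
    text = prompt.lower()
    return [name for name, words in keywords.items()
            if any(word in text for word in words)]


def run_pipeline(prompt: str,
                 experts: Dict[str, Model],
                 keywords: Dict[str, List[str]],
                 synthesizer: Model,
                 fallback: str) -> str:
    """Route the prompt, query the selected experts, and merge if needed."""
    selected = keyword_router(prompt, keywords) or [fallback]
    answers = [experts[name](prompt) for name in selected]
    # Conditional synthesis (assumed meaning): a single answer is returned
    # unchanged; the synthesizer only runs when several experts answered.
    if len(answers) == 1:
        return answers[0]
    merged = "\n".join(f"[{name}] {answer}"
                       for name, answer in zip(selected, answers))
    return synthesizer(merged)


# Stub experts so the sketch runs without any model installed.
experts: Dict[str, Model] = {
    "code": lambda p: "code expert answer",
    "math": lambda p: "math expert answer",
    "general": lambda p: "general answer",
}
keywords = {"code": ["python", "function"], "math": ["integral", "solve"]}
synthesizer: Model = lambda merged: "SYNTHESIS OF:\n" + merged
```

Calling run_pipeline("solve this in python", experts, keywords, synthesizer, "general") matches both experts and therefore passes through the synthesizer, while a prompt that matches no keywords goes straight to the fallback expert.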

Hardware Simulator

NEW & UNIQUE

Goes beyond a basic VRAM calculator: enter your exact setup (e.g., 2× RTX 4090 + 128GB RAM, or 1× 5090 + CPU offload) and get realistic estimates.

Abliteration Test Suite

NEW & UNIQUE

Abliteration test suite that produces a publishable results scorecard.

Multi-Model Planner

NEW & UNIQUE

Generate code for running multiple models together in a variety of configurations.

Prompt Tester

NEW & UNIQUE

Test how effective your prompts are.

Quant Quality Estimator

NEW & UNIQUE

Visual comparison of quantized model quality vs. the baseline.

True VRAM Calculator

NEW & UNIQUE

Calculates true VRAM requirements from all relevant metrics.

Hardware & Performance

Model Compatibility Checker

MOST USEFUL

Select your GPU — see every uncensored and abliterated model that fits with estimated tok/s and HuggingFace links.

Can I Run It?

EMBEDDABLE

Quick GPU check for uncensored models. Embeddable iframe for Discord and websites.

Inference Speed Estimator

POPULAR

Predict tokens per second before downloading. 30+ GPUs, all quants, 6 backends.
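One common rule of thumb for this kind of prediction is that single-stream decoding is limited by memory bandwidth, since each generated token reads roughly all the weights once. The sketch below applies that rule. It is an assumption about how such an estimate can be made, not the tool's actual method, and the efficiency factor is a hypothetical tuning parameter.

```python
def estimate_decode_tok_s(bandwidth_gb_s: float,
                          model_size_gb: float,
                          efficiency: float = 0.7) -> float:
    """Rough decode speed for a memory-bandwidth-bound model.

    bandwidth_gb_s: memory bandwidth of the device holding the weights.
    model_size_gb:  size of the quantized weights read per token.
    efficiency:     fraction of peak bandwidth actually achieved
                    (hypothetical default; it varies by backend).
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return bandwidth_gb_s * efficiency / model_size_gb
```

With placeholder inputs of 1000 GB/s, a 20 GB model, and 50% efficiency, the estimate is 25 tok/s. Prompt processing is usually compute-bound, so this rule does not apply to it.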

Inference Profiler

ADVANCED

Detailed profile: throughput, time-to-first-token, bandwidth utilisation, CPU offload analysis. Compare two configs side-by-side.

GPU Price / Performance

UPDATED WEEKLY

Current GPU rankings by inference value. Tok/s per dollar, sortable by metric. Updated weekly.

Benchmark Compare

VISUAL

Side-by-side GPU comparison across 7B, 13B, and 70B model sizes with visual bar charts.

Hardware Advisor

NEW

4-question wizard that gives specific GPU and build recommendations. Budget-aware and use-case-aware.

Model Selection & Planning

VRAM Calculator

PRECISE

Exact VRAM for any model size, quant, and context length including KV cache breakdown.
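The usual arithmetic behind such a breakdown is weights plus KV cache plus a fixed overhead. The sketch below assumes a standard transformer with one key and one value vector per layer, per KV head, per token. The function names and the overhead parameter are illustrative, not taken from the tool.

```python
def weight_bytes(n_params: int, bits_per_weight: float) -> int:
    """Bytes needed for the weights at a given average bits per weight."""
    return int(n_params * bits_per_weight / 8)


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float = 2.0) -> int:
    """Bytes for the KV cache: keys and values (factor 2) for every layer,
    KV head, and token. bytes_per_elem is 2.0 for a 16-bit cache."""
    return int(2 * n_layers * n_kv_heads * head_dim
               * context_len * bytes_per_elem)


def total_vram_bytes(n_params: int, bits_per_weight: float,
                     n_layers: int, n_kv_heads: int, head_dim: int,
                     context_len: int, kv_bytes_per_elem: float = 2.0,
                     overhead_bytes: int = 0) -> int:
    """Weights + KV cache + a caller-supplied overhead (activations,
    runtime buffers); the overhead value is backend-specific."""
    return (weight_bytes(n_params, bits_per_weight)
            + kv_cache_bytes(n_layers, n_kv_heads, head_dim,
                             context_len, kv_bytes_per_elem)
            + overhead_bytes)
```

For example, a model with 32 layers, 32 KV heads, and a head dimension of 128 needs 2 GiB of 16-bit KV cache at 4096 tokens of context. Models using grouped-query attention have fewer KV heads and a proportionally smaller cache.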

Context Length Calculator

UNIQUE

Find your maximum context window given VRAM, model, and KV cache quantization.
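Finding the maximum context amounts to dividing the VRAM left after loading the weights by the KV cache cost per token. The sketch below assumes a standard transformer KV layout. The parameter names are illustrative, and KV cache quantization is modeled simply as fewer bytes per element.

```python
def max_context_tokens(vram_bytes: int, weight_bytes: int,
                       n_layers: int, n_kv_heads: int, head_dim: int,
                       kv_bytes_per_elem: float = 2.0,
                       overhead_bytes: int = 0) -> int:
    """Largest context whose KV cache fits in the VRAM left over after
    weights and overhead. Returns 0 if nothing is left."""
    free = vram_bytes - weight_bytes - overhead_bytes
    if free <= 0:
        return 0
    # Keys and values (factor 2) for each layer and KV head, per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes_per_elem
    return int(free // per_token)
```

Under these assumptions, halving kv_bytes_per_elem (for example from a 16-bit to an 8-bit cache) doubles the context that fits in the same free memory.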

Token Budget Calculator

NEW

Plan context usage, generation time, and API cost. Works for local and cloud models.
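The three quantities named above reduce to simple arithmetic, sketched below. The function names are hypothetical, and the prices are caller-supplied parameters expressed per million tokens, not real provider rates.

```python
def context_remaining(context_window: int, prompt_tokens: int,
                      max_output_tokens: int) -> int:
    """Tokens left in the window after the prompt and the reserved output.
    A negative result means the request does not fit."""
    return context_window - prompt_tokens - max_output_tokens


def generation_seconds(output_tokens: int, tok_per_s: float) -> float:
    """Time to generate the output at a steady decode speed."""
    if tok_per_s <= 0:
        raise ValueError("tok_per_s must be positive")
    return output_tokens / tok_per_s


def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cloud cost, with prices given in USD per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000
```

For a local model the cost term is zero and only the first two functions matter. Time spent processing the prompt is ignored here.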

Quant Picker

BEGINNER

Answer 3 questions — get the right quantization format with a clear explanation.

Backend Picker

PRACTICAL

4 questions to find the right inference backend for your GPU, OS, and use case.

Uncensored & Abliteration

Abliteration Quality Scorer

UNIQUE

Compare base vs abliterated benchmark scores. Grade retention S–D. 14 known models.

Model Diff

RESEARCH

Side-by-side comparison of base vs abliterated outputs with real examples.

HuggingFace Tracker

CURATED

Curated list of abliterated, uncensored, and Dolphin uploads. Updated weekly.

Configuration & Prompting

Ollama Modelfile Generator

SAVES TIME

Build a Modelfile with system prompt, sampling params, and context. 6 presets.
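As an illustration of what such a generator emits, the sketch below renders a Modelfile using the FROM, PARAMETER, and SYSTEM instructions. The function name, default values, and the base model name in the usage note are hypothetical, and this is not the tool's own output format or one of its presets.

```python
def render_modelfile(base_model: str, system_prompt: str,
                     temperature: float = 0.7, top_p: float = 0.9,
                     num_ctx: int = 4096) -> str:
    """Render a minimal Ollama Modelfile as a string."""
    if '"""' in system_prompt:
        # The system prompt is wrapped in triple quotes below, so an
        # embedded triple quote would end the block early.
        raise ValueError('system prompt must not contain """')
    lines = [
        f"FROM {base_model}",
        f"PARAMETER temperature {temperature}",
        f"PARAMETER top_p {top_p}",
        f"PARAMETER num_ctx {num_ctx}",
        f'SYSTEM """{system_prompt}"""',
    ]
    return "\n".join(lines) + "\n"
```

Writing the result of render_modelfile("my-base-model", "You are a concise coding assistant.", num_ctx=8192) to a file named Modelfile gives something that can be passed to ollama create with the -f flag.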

System Prompt Library

NEW

20 production-ready system prompts — uncensored assistant, coding, creative writing, reasoning, productivity.

Community

Community Model Reviews

COMMUNITY

Structured reviews with hardware, use case, verdict, pros and cons.

Submit Benchmark

COMMUNITY

Share your inference results. Community table with filter and sort.