Decision tool

Local AI stack recommender

Answer a few questions and get a deterministic stack suggestion with hardware, privacy, document workflow, and configuration caveats.

Step 1 of 2: describe your machine and workflow. Step 2 updates the deterministic result below.

System RAM (GB): Use 0 or blank if unknown.
GPU VRAM (GB): Leave blank for Apple unified memory or unknown.
Need a local API

Platform, GPU type, Primary goal, Skill level, UI preference, Docker tolerance, Privacy need, Document workflow, Setup preference

general_chat

Mapped to the simplest source-backed stack that fits the selected workflow.

fits_comfortably

This setup should be able to run the selected model locally at the selected context length, assuming the model format and runtime are configured correctly.

casual

Configuration-dependent.

simplest

Docker-heavy paths are penalized unless browser UI was requested.
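
The penalty rule above can be sketched as a small deterministic scoring function. This is only an illustration: the `Stack` type, the base scores, the `DOCKER_PENALTY` value, and the choice to flag the Open WebUI pairing as Docker-heavy are all assumptions made for the example, not the tool's actual weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    name: str
    base_score: int      # hypothetical suitability score before penalties
    docker_heavy: bool   # hypothetical flag: setup usually involves containers


DOCKER_PENALTY = 2  # hypothetical value chosen for illustration


def score(stack: Stack, wants_browser_ui: bool) -> int:
    """Subtract the Docker penalty unless a browser UI was requested."""
    penalized = stack.docker_heavy and not wants_browser_ui
    return stack.base_score - (DOCKER_PENALTY if penalized else 0)


def rank(stacks: list[Stack], wants_browser_ui: bool) -> list[Stack]:
    """Sort by score (highest first), then by name so ties resolve the same way every run."""
    return sorted(stacks, key=lambda s: (-score(s, wants_browser_ui), s.name))


# Invented candidates and scores, used only to exercise the rule.
CANDIDATES = [
    Stack("LM Studio", base_score=5, docker_heavy=False),
    Stack("Open WebUI + Ollama", base_score=6, docker_heavy=True),
]

if __name__ == "__main__":
    for wants_ui in (False, True):
        best = rank(CANDIDATES, wants_ui)[0]
        print(f"browser UI requested={wants_ui}: {best.name}")
```

Sorting on a `(score, name)` key rather than score alone is what keeps the result deterministic when two stacks tie.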

Official documentation reviewed, with caveats
Source-backed estimate

Use LM Studio as the first stack to evaluate for this input set.

Privacy depends on model/provider selection, embeddings, storage, network binding, sync, and tools.
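
Of the factors listed, network binding is the easiest to check mechanically. The sketch below uses Python's standard `ipaddress` module to test whether a configured bind host is loopback-only. The helper name `is_loopback_only` is hypothetical, and treating the literal `localhost` as loopback is an assumption that holds on typical systems but depends on the local hosts configuration.

```python
import ipaddress


def is_loopback_only(bind_host: str) -> bool:
    """Return True only when the bind host is provably a loopback address."""
    if bind_host == "localhost":
        # Assumption: localhost resolves to loopback on this machine.
        return True
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal (for example a hostname): do not assume it is local.
        return False


if __name__ == "__main__":
    for host in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.20"):
        print(host, is_loopback_only(host))
```

A server bound to `0.0.0.0` listens on every interface, so it fails this check even when it is only ever used from the same machine.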

Do not choose Open WebUI alone as the model runtime; pair it with Ollama or another provider.
Do not assume document chat is local-only without checking the LLM, embedder, vector database, and tool settings.
Do not increase stack complexity before lowering model size, context, or quantization when hardware fit is poor.

This is deterministic planning guidance, not a benchmark or hands-on setup result.
Model fit still depends on model size, quantization, context length, runtime overhead, drivers, and other running apps.
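
As a rough illustration of how these factors combine, the sketch below estimates memory as quantized weights plus a key/value cache plus a fixed overhead. Everything in it is an assumption for the example: the function names, the 1 GB overhead, the 75% "comfortable" threshold, and the labels other than `fits_comfortably` are hypothetical, and the model figures are illustrative rather than taken from any specific model.

```python
def estimate_memory_gb(
    params_billion: float,
    bits_per_weight: float,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    kv_bytes_per_value: int = 2,   # assumes a 16-bit KV cache
    overhead_gb: float = 1.0,      # hypothetical allowance for runtime buffers
) -> float:
    """Rough estimate in decimal GB: weights + KV cache + overhead."""
    weights_gb = params_billion * bits_per_weight / 8
    # Keys and values (factor of 2) for every layer, KV head, and token.
    kv_bytes = 2 * n_layers * n_kv_heads * head_dim * context_tokens * kv_bytes_per_value
    return weights_gb + kv_bytes / 1e9 + overhead_gb


def classify_fit(required_gb: float, available_gb: float) -> str:
    """Map the estimate to a coarse label; thresholds are illustrative."""
    if required_gb <= 0.75 * available_gb:
        return "fits_comfortably"
    if required_gb <= available_gb:
        return "tight"           # hypothetical label
    return "does_not_fit"        # hypothetical label


if __name__ == "__main__":
    # Illustrative 8B-parameter model at about 4.5 bits per weight, 8192-token context.
    required = estimate_memory_gb(8, 4.5, n_layers=32, n_kv_heads=8,
                                  head_dim=128, context_tokens=8192)
    for available in (16, 8, 6):
        print(f"{available} GB available: {classify_fit(required, available)}")
```

Lowering `bits_per_weight` or `context_tokens` shrinks the estimate directly, which is why reducing quantization width or context is the first lever to try when the fit is poor. Drivers and other running apps reduce the memory actually available, so the `available_gb` passed in should be lower than the installed total.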