Local AI runtimes and tools

LOCAL_AI_STACK / runtimes(7)

v2.2 · 2026-06-12

TYPE7 of 7 runtimes/tools

OllamaRUNTIME

**-moderate

FEATURES

OpenAI API: partial

Document chat: possible with integrations

Docker: optional

Multi-user: no

GPU support: yes

FIT_NOTES

Ollama is a local model runner used to pull, run, and serve models locally. It often acts as the backend for other local AI apps.

API_BEHAVIOR

Local REST API; privacy and network exposure depend on host binding, providers, pulls, and configuration.

BEST_FOR

running local models from a simple runtime;

terminal-first local AI workflows;

local API experiments;

pairing with Open WebUI;

developers who want a local backend.

NOT_GOOD_FOR

users who want a polished first GUI-only experience;

built-in PDF chat without another layer;

performance claims without hardware testing;

sensitive workflows before privacy settings are reviewed.

mac / windows / linux · local-first-> view full record

LM StudioRUNTIME

*--beginner

FEATURES

OpenAI API: yes

Document chat: built in

Docker: not required

Multi-user: no

GPU support: yes

FIT_NOTES

LM Studio is a local AI desktop app and developer stack. It can be used as a chat app, model manager, document-chat app, and local server.

API_BEHAVIOR

Desktop app can expose local API/server modes; serving beyond localhost changes the risk boundary.

BEST_FOR

beginner GUI workflows;

model browsing and downloads;

local chat in a desktop app;

document attachments and local document experiments;

optional local API/server workflows.

NOT_GOOD_FOR

the lightest possible backend runtime;

multi-user self-hosted web interfaces;

claims about performance without hardware tests;

assuming offline/privacy behavior without checking settings.

mac / windows / linux · hybrid-> view full record

Open WebUIUI

**-moderate

FEATURES

OpenAI API: partial

Document chat: provider dependent

Docker: common path

Multi-user: yes

GPU support: unknown

FIT_NOTES

Open WebUI is a self-hosted-style web interface that can connect to local providers such as Ollama and to cloud-compatible providers.

API_BEHAVIOR

Connects to Ollama or other providers; the provider choice controls where model calls go.

BEST_FOR

browser UI over Ollama;

users who want a ChatGPT-like local workspace;

home-lab or self-hosted-style workflows;

document and knowledge workflows after setup;

people comfortable with Docker/server concepts.

NOT_GOOD_FOR

users who do not want to manage local server settings;

users confused by Docker networking;

sensitive documents before provider and embedding settings are reviewed;

claims that the full workflow is local without verification.

web / docker / linux / mac / windows · hybrid-> view full record

AnythingLLMRAG-APP

**-moderate

FEATURES

OpenAI API: unknown

Document chat: built in

Docker: optional

Multi-user: yes

GPU support: unknown

FIT_NOTES

AnythingLLM is an AI workspace/document-chat category tool. It can be useful for RAG-style workflows, but it should be evaluated by provider choice and storage behavior.

API_BEHAVIOR

Provider layer; LLM, embedder, vector database, and tool choices determine local/cloud behavior.

BEST_FOR

document chat exploration;

RAG-style workflows;

workspace organization around files;

users comparing PDF/chat options;

local or private knowledge-base experiments.

NOT_GOOD_FOR

assuming local-only behavior without reviewing configuration;

users who only need a simple runtime;

benchmark or citation-accuracy claims without testing;

sensitive use before storage and provider settings are reviewed.

mac / windows / linux / docker · unknown-> view full record

JanRUNTIME

*--beginner

FEATURES

OpenAI API: unknown

Document chat: unknown

Docker: not required

Multi-user: no

GPU support: unknown

FIT_NOTES

Jan is a desktop local AI app category entry. It should not be elevated above better-documented launch tools until its current features, provider behavior, and privacy posture are reviewed.

API_BEHAVIOR

Current provider and API behavior needs current official documentation before stronger site guidance.

BEST_FOR

users comparing LM Studio alternatives;

desktop local AI exploration;

future roundup pages on GUI-first local AI apps.

NOT_GOOD_FOR

high-confidence recommendations without current official documentation;

privacy-sensitive work without current official documentation;

performance claims without testing.

mac / windows / linux · unknown-> view full record

llama.cppLIBRARY

***advanced

FEATURES

OpenAI API: unknown

Document chat: possible with integrations

Docker: optional

Multi-user: no

GPU support: unknown

FIT_NOTES

llama.cpp is a lower-level local inference project often used directly by advanced users and indirectly by local AI apps.

API_BEHAVIOR

Backend/library path; app behavior depends on build flags, serving mode, model format, and wrapper.

BEST_FOR

technical users;

lower-level local inference experiments;

understanding GGUF and quantization;

developers who want control over backend behavior.

NOT_GOOD_FOR

one-click beginner setup;

casual users who only want a chat app;

unsupported hardware-performance claims.

mac / windows / linux · local-first-> view full record

GPT4AllRUNTIME

*--beginner

FEATURES

OpenAI API: unknown

Document chat: unknown

Docker: not required

Multi-user: no

GPU support: unknown

FIT_NOTES

GPT4All is a local AI app/project category entry. It belongs in the ecosystem map, but stronger recommendations need current official documentation.

API_BEHAVIOR

Current app/API behavior needs current official documentation before stronger site guidance.

BEST_FOR

users researching desktop local AI apps;

comparison completeness;

future local AI app alternatives pages.

NOT_GOOD_FOR

high-confidence launch recommendations before current review;

benchmark claims;

exact compatibility claims.

mac / windows / linux · unknown-> view full record

RUNTIME DIRECTORY CONTEXT

How to compare local AI runtimes and tools

Intro

This directory explains the difference between local runtimes, desktop apps, browser interfaces, RAG/document apps, and lower-level libraries.

Category explanation

Category	Meaning	Examples
Runtime	Runs models locally and exposes CLI/API behavior.	Ollama, llama.cpp
Desktop app	Gives users a GUI for downloading and chatting with models.	LM Studio, Jan, GPT4All
Browser UI	Web interface layered over local or cloud providers.	Open WebUI
RAG/document app	Focuses on document ingestion, retrieval, and workspace workflows.	AnythingLLM

Directory caution

Do not treat all tools as equivalent. A runtime, GUI, RAG app, and self-hosted web UI solve different problems and carry different privacy risks.