Local AI services for running Large Language Models.
Overview
Combines Ollama (LLM runner) with Open WebUI (ChatGPT-like interface) for local AI capabilities. Ollama v0.14.0+ natively speaks the Anthropic Messages API, enabling direct integration with Claude Code.
Access:
- Open WebUI: https://ai.janvv.nl (internal) or via Cloudflare tunnel (external)
- Ollama API: https://ollama.janvv.nl (requires Bearer token)
- Container: CT 124
- IP: 192.168.144.62
- Resources: 12 vCPU, 24 GB RAM
- Ports: 11435 (auth proxy -> Ollama), 8080 (Open WebUI)
Architecture
Internal path:
Client -> Pi-hole DNS -> Caddy (192.168.144.31:443) -> Auth proxy (192.168.144.62:11435) -> Ollama
External path:
Client -> Cloudflare DNS -> CF Tunnel -> Auth proxy (192.168.144.62:11435) -> Ollama
Docker stack:
- Ollama - LLM inference (Docker-internal only, not exposed to host)
- Auth proxy (Caddy) on port 11435 - validates Bearer token, proxies to Ollama
- Open WebUI on port 8080 - browser chat interface, connects to Ollama via Docker network
Ollama's port 11434 is not exposed to the host. All API access goes through the auth proxy on 11435.
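The actual compose file lives in the repo linked below; a stripped-down sketch of the three-service layout described here (service names, image tags, and volume paths are assumptions, not the repo's exact contents):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - /data/ollama:/root/.ollama
    # No ports: section -- reachable only on the Docker network.

  auth-proxy:
    image: caddy:2
    ports:
      - "11435:11435"   # the only host-exposed entry point to the API
    environment:
      - OLLAMA_API_KEY=${OLLAMA_API_KEY}
    depends_on:
      - ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "8080:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
```

Keeping Ollama off the host network means the Bearer token check in the proxy cannot be bypassed from outside Docker.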
Source Code
Docker Compose & Install Script: github.com/opajanvv/homelab-docker/tree/main/ai
Local copy: ~/dev/homelab-docker/ai/
Deployment
# Clone LXC template
pct clone 902 124 --hostname ai --full
pct set 124 --cores 12 --memory 24576
pct set 124 -net0 name=eth0,bridge=vmbr0,firewall=1,gw=192.168.144.1,ip=192.168.144.62/23
pct set 124 -mp0 /lxcdata/ai,mp=/data
pct set 124 -mp1 /home/jan/homelab-docker,mp=/opt/homelab-docker
pct set 124 -features nesting=1,keyctl=1
pct set 124 -onboot 1
# Add AppArmor workaround
cat >> /etc/pve/lxc/124.conf << 'EOF'
lxc.apparmor.profile: unconfined
lxc.mount.entry: /dev/null sys/module/apparmor/parameters/enabled none bind 0 0
EOF
# Deploy
pct start 124
pct exec 124 -- bash -c 'systemctl enable --now docker'
# Configs available via bind mount (no git clone needed)
pct exec 124 -- bash -c 'cd /opt/homelab-docker/ai && chmod +x install.sh && ./install.sh'
Configuration
Data Locations:
- /data/ollama/ - Downloaded models (inside container)
- /opt/homelab-docker/ai/ - Docker Compose config (inside container)
- ~/dev/homelab-docker/ai/ - Local working directory (laptop)
Environment variables (in /opt/homelab-docker/ai/.env):
- OLLAMA_API_KEY - Bearer token for API authentication
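The token is just an opaque string; one way to generate a suitably random value (an assumption about how it was created, not something the repo prescribes):

```shell
# Generate a 64-character hex token for OLLAMA_API_KEY
openssl rand -hex 32
```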
Authentication
All API access requires a Bearer token via the auth proxy (Caddy on port 11435). Open WebUI has its own login system and connects to Ollama internally.
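The proxy configuration itself is in the repo; a minimal Caddyfile sketch of the idea (the matcher syntax and upstream name `ollama` are assumptions about the real config):

```
# Hypothetical auth proxy: reject requests without the expected Bearer token
:11435 {
    @authed header Authorization "Bearer {env.OLLAMA_API_KEY}"
    handle @authed {
        reverse_proxy ollama:11434
    }
    handle {
        respond 401
    }
}
```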
# Test without a token (should return 401):
curl -s -o /dev/null -w "%{http_code}" https://ollama.janvv.nl/v1/messages
# Test with auth (should return model response):
curl -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -H "Content-Type: application/json" \
  https://ollama.janvv.nl/v1/messages \
  -d '{"model":"qwen3-coder","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'
Claude Code Integration
Ollama's native Anthropic Messages API (/v1/messages) enables direct use with Claude Code via ANTHROPIC_BASE_URL. Shell wrapper functions are in ~/.config/claude-wrappers.sh:
- claude-ollama - Connects to Ollama on the homelab server
- claude-local - Connects to Ollama running locally on the laptop
API key for the server is stored in ~/.config/ollama-api-key.
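The wrapper file itself isn't reproduced here; a sketch of what claude-ollama might look like, assuming Claude Code picks up ANTHROPIC_BASE_URL, ANTHROPIC_API_KEY, and ANTHROPIC_MODEL from the environment (the exact variables and model choice are assumptions):

```shell
# Hypothetical wrapper -- points Claude Code at the homelab Ollama instance
claude-ollama() {
  ANTHROPIC_BASE_URL="https://ollama.janvv.nl" \
  ANTHROPIC_API_KEY="$(cat ~/.config/ollama-api-key)" \
  ANTHROPIC_MODEL="qwen3-coder" \
  claude "$@"
}
```

Setting the variables only for the single invocation keeps the real Anthropic credentials untouched in other shells.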
Models
Models are downloaded on-demand and stored in /data/ollama/.
Installed:
- qwen3-coder (30b, 18 GB) - Primary code model
- llama3 (4.7 GB) - General purpose
- deepseek-r1 (5.2 GB) - Reasoning
Pull via CLI (in container):
pct exec 124 -- bash -c 'docker exec ollama ollama pull <model-name>'
Backup
What to backup:
- /lxcdata/ai/ollama/ - Downloaded models (can be large)
Note: Models can be re-downloaded, but backing up saves time/bandwidth.
Maintenance
Update:
pct exec 124 -- bash -c 'cd /opt/homelab-docker/ai && docker compose pull && docker compose up -d'
View logs:
pct exec 124 -- bash -c 'cd /opt/homelab-docker/ai && docker compose logs -f'
List models:
pct exec 124 -- bash -c 'docker exec ollama ollama list'
Remove unused models:
pct exec 124 -- bash -c 'docker exec ollama ollama rm <model-name>'