Local AI services for running Large Language Models.
Overview
Combines Ollama (LLM runner) with Open WebUI (ChatGPT-like interface) for local AI capabilities. Ollama v0.14.0+ natively speaks the Anthropic Messages API, enabling direct integration with Claude Code.
Access:
- Open WebUI: https://ai.janvv.nl (internal) or via Cloudflare tunnel (external)
- Ollama API: https://ollama.janvv.nl (requires Bearer token)
- Container: CT 124
- IP: 192.168.144.62
- Resources: 12 vCPU, 24 GB RAM
- Ports: 11435 (auth proxy -> Ollama), 8080 (Open WebUI)
Architecture
Internal path:
Client -> Pi-hole DNS -> Caddy (192.168.144.31:443) -> Auth proxy (192.168.144.62:11435) -> Ollama
External path:
Client -> Cloudflare DNS -> CF Tunnel -> Auth proxy (192.168.144.62:11435) -> Ollama
Docker stack:
- Ollama - LLM inference (Docker-internal only, not exposed to host)
- Auth proxy (Caddy) on port 11435 - validates Bearer token, proxies to Ollama
- Open WebUI on port 8080 - browser chat interface, connects to Ollama via Docker network
Ollama's port 11434 is not exposed to the host. All API access goes through the auth proxy on 11435.
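The actual compose file lives in the repo linked below; a stripped-down sketch of the three-service layout described here (service names, image tags, and volume paths are assumptions, not the repo's exact contents):

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - /data/ollama:/root/.ollama
    # No ports: section -- reachable only on the Docker network.

  auth-proxy:
    image: caddy:2
    ports:
      - "11435:11435"   # the only host-exposed entry point to the API
    environment:
      - OLLAMA_API_KEY=${OLLAMA_API_KEY}
    depends_on:
      - ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "8080:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama
```

Keeping Ollama off the host network means the Bearer token check in the proxy cannot be bypassed from outside Docker.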
Source Code
Docker Compose & Install Script: github.com/opajanvv/homelab-docker/tree/main/ai
Local copy: ~/dev/homelab-docker/ai/
Deployment
# Clone LXC template
pct clone 902 124 --hostname ai --full
pct set 124 --cores 12 --memory 24576
pct set 124 -net0 name=eth0,bridge=vmbr0,firewall=1,gw=192.168.144.1,ip=192.168.144.62/23
pct set 124 -mp0 /lxcdata/ai,mp=/data
pct set 124 -mp1 /home/jan/homelab-docker,mp=/opt/homelab-docker
pct set 124 -features nesting=1,keyctl=1
pct set 124 -onboot 1
# Add AppArmor workaround
cat >> /etc/pve/lxc/124.conf << 'EOF'
lxc.apparmor.profile: unconfined
lxc.mount.entry: /dev/null sys/module/apparmor/parameters/enabled none bind 0 0
EOF
# Deploy
pct start 124
pct exec 124 -- bash -c 'systemctl enable --now docker'
# Configs available via bind mount (no git clone needed)
pct exec 124 -- bash -c 'cd /opt/homelab-docker/ai && chmod +x install.sh && ./install.sh'
Configuration
Data Locations:
- /data/ollama/ - Downloaded models (inside container)
- /opt/homelab-docker/ai/ - Docker Compose config (inside container)
- ~/dev/homelab-docker/ai/ - Local working directory (laptop)
Environment variables (in /opt/homelab-docker/ai/.env):
- OLLAMA_API_KEY - Bearer token for API authentication
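The token is just an opaque string; one way to generate a suitably random value (an assumption about how it was created, not something the repo prescribes):

```shell
# Generate a 64-character hex token for OLLAMA_API_KEY
openssl rand -hex 32
```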
Authentication
All API access requires a Bearer token via the auth proxy (Caddy on port 11435). Open WebUI has its own login system and connects to Ollama internally.
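The proxy configuration itself is in the repo; a minimal Caddyfile sketch of the idea (the matcher syntax and upstream name `ollama` are assumptions about the real config):

```
# Hypothetical auth proxy: reject requests without the expected Bearer token
:11435 {
    @authed header Authorization "Bearer {env.OLLAMA_API_KEY}"
    handle @authed {
        reverse_proxy ollama:11434
    }
    handle {
        respond 401
    }
}
```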
# Test without a token (should return 401):
curl -s -o /dev/null -w "%{http_code}" https://ollama.janvv.nl/v1/messages
# Test with auth (should return model response):
curl -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -H "Content-Type: application/json" \
  https://ollama.janvv.nl/v1/messages \
  -d '{"model":"qwen3-coder","max_tokens":10,"messages":[{"role":"user","content":"hi"}]}'
Claude Code Integration
Ollama's native Anthropic Messages API (/v1/messages) enables direct use with Claude Code via ANTHROPIC_BASE_URL. Shell wrapper functions are in ~/.config/claude-wrappers.sh:
- claude-ollama - Connects to Ollama on the homelab server
- claude-local - Connects to Ollama running locally on the laptop
API key for the server is stored in ~/.config/ollama-api-key.
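The wrapper file itself isn't reproduced here; a sketch of what claude-ollama might look like, assuming Claude Code picks up ANTHROPIC_BASE_URL, ANTHROPIC_API_KEY, and ANTHROPIC_MODEL from the environment (the exact variables and model choice are assumptions):

```shell
# Hypothetical wrapper -- points Claude Code at the homelab Ollama instance
claude-ollama() {
  ANTHROPIC_BASE_URL="https://ollama.janvv.nl" \
  ANTHROPIC_API_KEY="$(cat ~/.config/ollama-api-key)" \
  ANTHROPIC_MODEL="qwen3-coder" \
  claude "$@"
}
```

Setting the variables only for the single invocation keeps the real Anthropic credentials untouched in other shells.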
Models
Models are downloaded on-demand and stored in /data/ollama/.
Installed:
- qwen3-coder (30b, 18 GB) - Primary code model
- llama3 (4.7 GB) - General purpose
- deepseek-r1 (5.2 GB) - Reasoning
Pull via CLI (in container):
pct exec 124 -- bash -c 'docker exec ollama ollama pull <model-name>'
Backup
What to backup:
- /lxcdata/ai/ollama/ - Downloaded models (can be large)
Note: Models can be re-downloaded, but backing up saves time/bandwidth.
Maintenance
Update:
pct exec 124 -- bash -c 'cd /opt/homelab-docker/ai && docker compose pull && docker compose up -d'
View logs:
pct exec 124 -- bash -c 'cd /opt/homelab-docker/ai && docker compose logs -f'
List models:
pct exec 124 -- bash -c 'docker exec ollama ollama list'
Remove unused models:
pct exec 124 -- bash -c 'docker exec ollama ollama rm <model-name>'