Configuration Reference
Detailed configuration options for THON and Lemonade Server
thon.yaml (Unified Config)
The recommended way to configure THON is via a single thon.yaml file, created
by python -m thon init. This file is the single source of truth for all settings — the
CLI, API, and dashboard all read from it (or its .env export).
Full Example
```yaml
# THON Configuration
# Generated by `python -m thon init` — edit freely or re-run the wizard.

# ── Network ───────────────────────────────────────────────────
# External IP for SSL cert SAN and URLs (auto-detected if empty)
external_ip: ""

# ── Groups & Users ────────────────────────────────────────────
# Each user gets their own VS Code sandbox with workspace at
# /workspace/{group}/{username}
groups:
  alpha:
    - alice
    - bob
    - carol
  beta:
    - dave
    - eve
  gamma:
    - frank

# ── Sandbox ───────────────────────────────────────────────────
sandbox:
  domain: "localhost:8080"
  api_key: ""
  image: "waterpistol/thon:latest"
  python_version: "3.11"
  starting_port: 8443
  timeout_minutes: 0

# ── VS Code Instances ────────────────────────────────────────
vscode:
  # Enable per-user password authentication for code-server
  secure: false
  # Path to VS Code settings JSON (injected into each sandbox)
  settings_file: ""
  # Path to extensions.txt list (injected into each sandbox)
  extensions_file: ""

# ── Nginx & SSL ──────────────────────────────────────────────
nginx:
  enabled: true
  ssl_dir: "/etc/nginx/ssl"

# ── Workspace ─────────────────────────────────────────────────
# Set dir to enable persistent bind mounts (empty = ephemeral)
workspace:
  dir: ""

# ── Lemonade Server (Local LLM Inference) ────────────────────
lemonade:
  enabled: false
  host: "0.0.0.0"
  port: 13305
  model: "unsloth/gemma-4-31B-it-GGUF:Q8_K_XL"
  model_name: "gemma-4-31b-it"
  mmproj: "mmproj-BF16.gguf"
  llamacpp_backend: "auto"
  prefer_system: true
  llamacpp_bin: "/usr/local/bin/llama-server"
  generate_keys: true
  api_key: ""
  admin_api_key: ""

# ── Kilo Code ─────────────────────────────────────────────────
# Auto-generated from Lemonade settings when lemonade.enabled=true
kilo:
  config_file: ""

# ── AI Gateway (APISIX Rate Limiting) ────────────────────────
gateway:
  enabled: false
  # per-user: each user gets their own API key and rate limit
  # per-group: each group shares one API key with combined limit
  mode: "per-user"
  admin_key: "edd1c9f034335f136f87ad84b625c8f1"
  redis_host: ""
  redis_port: 6379
  rate_limit: 500
  time_window: 60

# ── Dashboard ─────────────────────────────────────────────────
dashboard:
  host: "0.0.0.0"
  port: 8100
  debug: false

# ── Authentication (OIDC) ─────────────────────────────────────
# For Streamlit dashboard: set AUTH_LOCAL_PASSWORD env variable
auth:
  enabled: false
  session_secret: ""
  github:
    client_id: ""
    client_secret: ""
  gitlab:
    client_id: ""
    client_secret: ""
  linkedin:
    client_id: ""
    client_secret: ""
```

Config Sections
| Section | Description |
|---|---|
| external_ip | External IP for SSL cert SAN and URLs |
| groups | Group name → list of usernames mapping |
| sandbox | Sandbox server connection and instance settings |
| vscode | VS Code security and customization |
| nginx | Nginx reverse proxy and SSL |
| workspace | Persistent workspace bind-mounts |
| lemonade | Local LLM inference server |
| kilo | Kilo Code extension config injection |
| gateway | APISIX AI Gateway with rate limiting |
| dashboard | Web dashboard settings |
| auth | OIDC authentication providers |
Generating .env from thon.yaml
```bash
# Export config as .env file for the API and dashboard
python -m thon config env --output .env
```

The .env file maps thon.yaml fields to environment variables (e.g.,
lemonade.port → LEMONADE_PORT). Values already set in the environment
are not overwritten.
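The mapping is mechanical: nested keys are joined with underscores and upper-cased. A minimal sketch of the idea (illustrative only, not THON's actual implementation; lists and empty values are simply skipped here):

```python
# Illustrative sketch of the thon.yaml -> .env flattening (not THON's code).
import os
import yaml  # pip install pyyaml

def flatten(cfg: dict, prefix: str = "") -> dict:
    """Nested keys like lemonade.port become names like LEMONADE_PORT."""
    env = {}
    for key, value in cfg.items():
        name = f"{prefix}{key}".upper()
        if isinstance(value, dict):
            env.update(flatten(value, f"{name}_"))
        elif isinstance(value, bool):
            env[name] = str(value).lower()
        elif not isinstance(value, (list, type(None))) and value != "":
            env[name] = str(value)
    return env

with open("thon.yaml") as f:
    config = yaml.safe_load(f)

for name, value in flatten(config).items():
    if name not in os.environ:  # values already set in the environment win
        print(f"{name}={value}")
```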
groups.yaml
Defines users and groups for VS Code instance creation. Used by main.py --groups
and can also be imported into the database via the dashboard.
Structure
```yaml
groups:
  <group-name>:
    users:
      - <username>
      - <username>
```

Example
```yaml
groups:
  alpha:
    users:
      - alice
      - bob
      - carol
  beta:
    users:
      - dave
      - eve
  gamma:
    users:
      - frank
```

Behavior
- Each user gets a unique sandbox instance
- Workspace path: /workspace/{group}/{username}
- URL path: https://{ip}/{endpoint_port}/proxy/{code_server_port}/
- Port assignment: sequential from --port (default 8443); see the sketch below
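A short sketch tying these conventions together, using the example groups above (illustrative, not THON's code; a documentation IP stands in for the external address and the internal code-server port is left as a placeholder):

```python
# Illustrative sketch of the per-user workspace/URL/port conventions.
GROUPS = {"alpha": ["alice", "bob", "carol"], "beta": ["dave", "eve"], "gamma": ["frank"]}
STARTING_PORT = 8443  # the --port default

def plan(groups: dict, starting_port: int, ip: str = "203.0.113.10"):
    port = starting_port
    for group, users in groups.items():
        for user in users:
            workspace = f"/workspace/{group}/{user}"
            url = f"https://{ip}/{port}/proxy/<code_server_port>/"
            print(f"{user:<8} endpoint_port={port} workspace={workspace} url={url}")
            port += 1  # sequential assignment, one port per user

plan(GROUPS, STARTING_PORT)
```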
nginx Configuration
Directory Structure
```text
/etc/nginx/
├── sites-available/
│   ├── sandbox-thon-8443
│   ├── sandbox-thon-8444
│   └── ...
├── sites-enabled/
│   ├── sandbox-thon-8443 -> ../sites-available/sandbox-thon-8443
│   └── ...
└── ssl/
    ├── server-<ip-hash>.crt
    ├── server-<ip-hash>.key
    └── ca.crt  (if mkcert)
```

Per-Port Server Block
Each VS Code instance gets its own nginx config:
```nginx
server {
    listen 80;
    listen 443 ssl;
    server_name _;

    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://127.0.0.1:52322/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
        proxy_redirect off;
        add_header Service-Worker-Allowed /;
        proxy_read_timeout 86400;
        proxy_send_timeout 86400;
        proxy_buffering off;
        proxy_request_buffering off;
    }

    location = /ca.crt {
        alias /path/to/mkcert/rootCA.pem;
    }
}
```

Key Configuration Points
| Setting | Value | Reason |
|---|---|---|
| proxy_pass | http://127.0.0.1:{port}/ | No upstream path (prevents path doubling) |
| Service-Worker-Allowed | / | Fixes service-worker scope errors |
| proxy_read_timeout | 86400 | 24h timeout for long-lived WebSockets |
| proxy_buffering | off | Real-time terminal output |
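For intuition, a hedged sketch of how the per-port files under sites-available/ might be templated; it assumes each instance's server block listens on its endpoint port, and it omits most of the proxy headers shown above for brevity:

```python
# Illustrative sketch: render and enable one nginx site per instance port.
# Assumes each instance's server block listens on its endpoint port.
from pathlib import Path

TEMPLATE = """server {{
    listen {endpoint_port} ssl;
    server_name _;
    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;
    location / {{
        proxy_pass http://127.0.0.1:{code_server_port}/;  # no upstream path
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        add_header Service-Worker-Allowed /;
        proxy_read_timeout 86400;
        proxy_buffering off;
    }}
}}
"""

def write_site(endpoint_port: int, code_server_port: int) -> None:
    available = Path(f"/etc/nginx/sites-available/sandbox-thon-{endpoint_port}")
    enabled = Path(f"/etc/nginx/sites-enabled/sandbox-thon-{endpoint_port}")
    available.write_text(TEMPLATE.format(endpoint_port=endpoint_port,
                                         code_server_port=code_server_port))
    if not enabled.exists():
        enabled.symlink_to(f"../sites-available/sandbox-thon-{endpoint_port}")

# write_site(8443, 52322), then: nginx -t && systemctl reload nginx
```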
Lemonade Server Configuration
config.json
Location: /var/lib/lemonade/.cache/lemonade/config.json
```json
{
  "config_version": 1,
  "port": 13305,
  "host": "0.0.0.0",
  "log_level": "info",
  "global_timeout": 300,
  "max_loaded_models": 2,
  "no_broadcast": false,
  "extra_models_dir": "",
  "models_dir": "auto",
  "ctx_size": 1572864,
  "offline": false,
  "disable_model_filtering": false,
  "enable_dgpu_gtt": false,
  "llamacpp": {
    "backend": "auto",
    "args": "",
    "prefer_system": true,
    "rocm_bin": "/usr/local/bin/llama-server",
    "vulkan_bin": "/usr/local/bin/llama-server",
    "cpu_bin": "/usr/local/bin/llama-server"
  },
  "whispercpp": {
    "backend": "auto",
    "args": "",
    "cpu_bin": "builtin",
    "npu_bin": "builtin"
  },
  "sdcpp": {
    "backend": "auto",
    "args": "",
    "steps": 20,
    "cfg_scale": 7.0,
    "width": 512,
    "height": 512,
    "cpu_bin": "builtin",
    "rocm_bin": "builtin",
    "vulkan_bin": "builtin"
  },
  "flm": { "args": "" },
  "ryzenai": { "server_bin": "builtin" },
  "kokoro": { "cpu_bin": "builtin" }
}
```

When embedding is enabled, max_loaded_models is automatically set to 2
(1 chat model + 1 embedding model).
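If you need to change a field by hand (for example max_loaded_models), here is a small sketch of a safe in-place edit; the path is the one given above, and the server must still be restarted afterwards:

```python
# Illustrative sketch: patch one field in Lemonade's config.json.
import json
from pathlib import Path

CONFIG = Path("/var/lib/lemonade/.cache/lemonade/config.json")

def set_option(key: str, value) -> None:
    cfg = json.loads(CONFIG.read_text())
    cfg[key] = value
    tmp = CONFIG.with_name(CONFIG.name + ".tmp")
    tmp.write_text(json.dumps(cfg, indent=2))
    tmp.replace(CONFIG)  # write-then-rename avoids a half-written config

# e.g. keep a chat model and an embedding model resident at once:
set_option("max_loaded_models", 2)
```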
Config Fields
| Field | Type | Default | Description |
|---|---|---|---|
| port | int | 13305 | HTTP server port |
| host | string | localhost | Bind address |
| log_level | string | info | trace, debug, info, warning, error |
| global_timeout | int | 300 | Timeout in seconds |
| max_loaded_models | int | 2 | Max models per type (-1 for unlimited); auto-set to 2 when embedding enabled |
| ctx_size | int | 4096 | Default context size |
| offline | bool | false | Skip model downloads |
| llamacpp.backend | string | auto | auto, vulkan, cpu |
| llamacpp.prefer_system | bool | false | Prefer system llama.cpp |
| llamacpp.*_bin | string | builtin | Path to binary or "builtin" |
user_models.json
Location: /var/lib/lemonade/.cache/lemonade/user_models.json
```json
{
  "gemma-4-31b-it": {
    "model_name": "gemma-4-31b-it",
    "checkpoint": "unsloth/gemma-4-31B-it-GGUF:Q8_K_XL",
    "recipe": "llamacpp",
    "suggested": true,
    "labels": ["custom", "vision"],
    "mmproj": "mmproj-BF16.gguf"
  },
  "harrier-oss-v1-0.6b": {
    "model_name": "harrier-oss-v1-0.6b",
    "checkpoint": "SuperPauly/harrier-oss-v1-0.6b-gguf:harrier-oss-v1-0.6B-BF16",
    "recipe": "llamacpp",
    "suggested": true,
    "labels": ["custom", "embedding"]
  }
}
```

Model Entry Fields
| Field | Required | Description |
|---|---|---|
| model_name | No | Display name (matches key) |
| checkpoint | Yes | HuggingFace checkpoint (org/repo:variant) |
| recipe | Yes | Backend engine (llamacpp, whispercpp, etc.) |
| suggested | No | Show as suggested model |
| labels | No | Tags (custom, vision, embedding) |
| mmproj | No | Vision model mmproj filename |
| size | No | Model size in GB (informational) |
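A sketch of adding an entry programmatically; the model name and checkpoint below are hypothetical placeholders:

```python
# Illustrative sketch: register a custom model in user_models.json.
import json
from pathlib import Path

MODELS = Path("/var/lib/lemonade/.cache/lemonade/user_models.json")

name = "my-model"  # hypothetical
models = json.loads(MODELS.read_text()) if MODELS.exists() else {}
models[name] = {
    "model_name": name,                              # display name, matches the key
    "checkpoint": "some-org/some-repo-GGUF:Q4_K_M",  # hypothetical org/repo:variant
    "recipe": "llamacpp",                            # required: backend engine
    "suggested": True,
    "labels": ["custom"],
}
MODELS.write_text(json.dumps(models, indent=2))
```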
recipe_options.json
Location: /var/lib/lemonade/.cache/lemonade/recipe_options.json
```json
{
  "user.gemma-4-31b-it": {
    "ctx_size": 1572864,
    "llamacpp_backend": "auto",
    "llamacpp_args": "-b 8192 -ub 8192 -to 3600 -ctk q8_0 -ctv q8_0 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.0 --repeat-penalty 1.0 --no-webui --threads-http -1 --threads -1 -np 6"
  },
  "user.harrier-oss-v1-0.6b": {
    "ctx_size": 196608,
    "llamacpp_backend": "auto",
    "llamacpp_args": "-b 8192 -ub 8192 -to 3600 -ctk q8_0 -ctv q8_0 --no-webui --threads-http -1 --threads -1 -np 6"
  }
}
```

Embedding Model Scaling
When embedding is enabled, the embedding model gets its own entry in
recipe_options.json with scaled parameters:
| Parameter | Chat Model | Embedding Model |
|---|---|---|
| ctx_size | 262144 per user | 32768 per user |
| -np | num_users | num_users |
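The totals in the recipe_options.json example above follow directly from this table with 6 parallel users (hence -np 6 in the args):

```python
# Worked example of the per-user context scaling.
num_users = 6                    # matches "-np 6" in the example llamacpp_args

chat_ctx = 262144 * num_users    # = 1572864, the chat model's ctx_size
embed_ctx = 32768 * num_users    # = 196608, the embedding model's ctx_size
print(chat_ctx, embed_ctx)
```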
Recipe Options Fields
| Field | Description |
|---|---|
| ctx_size | Total context size (for all parallel slots) |
| llamacpp_backend | Override backend for this model |
| llamacpp_args | Custom llama.cpp arguments |
server_models.json
Location: /var/lib/lemonade/.cache/lemonade/server_models.json
Same structure as user_models.json. Used for server-suggested models
that appear in the Lemonade desktop app.
kilo.json (Kilo Code Config)
Location in Sandbox
/home/vscode/.config/kilo/config.json

Structure
```json
{
  "provider": {
    "lemonade": {
      "models": {
        "user.gemma-4-31b-it": {
          "name": "unsloth/gemma-4-31B-it-GGUF:Q8_K_XL",
          "limit": {
            "context": 262144,
            "output": 4096
          }
        }
      },
      "options": {
        "apiKey": "your-api-key",
        "baseURL": "http://YOUR_IP:13305/v1"
      }
    }
  },
  "model": "lemonade/user.gemma-4-31b-it",
  "experimental": {
    "batch_tool": false,
    "codebase_search": true,
    "openTelemetry": false,
    "continue_loop_on_deny": true,
    "semantic_indexing": true,
    "agent_manager_tool": true
  },
  "indexing": {
    "enabled": true,
    "provider": "openai-compatible",
    "vectorStore": "lancedb",
    "openai-compatible": {
      "baseUrl": "http://YOUR_IP:13305/v1",
      "apiKey": "your-api-key",
      "model": "user.harrier-oss-v1-0.6b"
    }
  }
}
```

Fields
| Field | Description |
|---|---|
| provider.<name>.models.<id>.name | Display name (checkpoint) |
| provider.<name>.models.<id>.limit.context | Max context tokens |
| provider.<name>.models.<id>.limit.output | Max output tokens |
| provider.<name>.options.apiKey | API key for authentication |
| provider.<name>.options.baseURL | OpenAI-compatible API endpoint |
| model | Active model (provider/model-id format) |
| experimental.batch_tool | Enable batch tool calling |
| experimental.codebase_search | Enable codebase search |
| experimental.openTelemetry | Enable OpenTelemetry tracing |
| experimental.continue_loop_on_deny | Continue agent loop on tool deny |
| experimental.semantic_indexing | Enable semantic code indexing |
| experimental.agent_manager_tool | Enable agent manager tool |
| indexing.enabled | Enable semantic indexing |
| indexing.provider | Indexing provider (openai-compatible) |
| indexing.vectorStore | Vector store type (lancedb) |
| indexing.openai-compatible.baseUrl | Embedding API base URL |
| indexing.openai-compatible.apiKey | Embedding API key |
| indexing.openai-compatible.model | Embedding model ID (e.g., user.harrier-oss-v1-0.6b) |
Gateway-aware kilo.json
When the AI Gateway is enabled, kilo.json points to the gateway
instead of directly to Lemonade, and the indexing config uses the gateway
URL for embedding requests:
```json
{
  "providers": {
    "lemonade-gateway": {
      "baseUrl": "http://1.2.3.4:9080",
      "apiKey": "<consumer-api-key>"
    }
  },
  "models": {
    "gemma-4-31b-it": {
      "provider": "lemonade-gateway",
      "modelId": "user.gemma-4-31b-it"
    }
  },
  "experimental": {
    "batch_tool": false,
    "codebase_search": true,
    "openTelemetry": false,
    "continue_loop_on_deny": true,
    "semantic_indexing": true,
    "agent_manager_tool": true
  },
  "indexing": {
    "enabled": true,
    "provider": "openai-compatible",
    "vectorStore": "lancedb",
    "openai-compatible": {
      "baseUrl": "http://1.2.3.4:9080/v1",
      "apiKey": "<consumer-api-key>",
      "model": "user.harrier-oss-v1-0.6b"
    }
  }
}
```

VS Code Settings
vscode-settings.jsonc
Injected into each sandbox at /home/vscode/.local/share/code-server/User/settings.json.
Key settings for Lemonade integration:
```json
{
  "kilo-code.new.showTaskTimeline": true,
  "kilo-code.new.browserAutomation.enabled": true,
  "telemetry.telemetryLevel": "off"
}
```

Config File Storage
VS Code settings can also be stored in the database (via the Settings page
in the dashboard or the /api/config-files REST API). When main.py runs
without --vscode-settings, it reads the stored config from the database.
| DB Key | Label | Description |
|---|---|---|
| config_vscode_settings | VS Code Settings | Code-server settings JSON |
| config_kilo_json | Kilo Code Config | Kilo Code provider config |
| config_groups_yaml | Groups YAML | Groups and users definition |
Priority: CLI flag (e.g., --vscode-settings) > database > none.
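A minimal sketch of that resolution order (illustrative only; the db dict stands in for the database lookup):

```python
# Illustrative sketch of the config-file resolution order.
def resolve_vscode_settings(cli_path: str | None, db: dict) -> str | None:
    if cli_path:                                # 1. CLI flag wins
        with open(cli_path) as f:
            return f.read()
    if "config_vscode_settings" in db:          # 2. fall back to the database
        return db["config_vscode_settings"]
    return None                                 # 3. none: no settings injected

print(resolve_vscode_settings(None, {"config_vscode_settings": "{}"}))
```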
Authentication
THON supports two authentication mechanisms that operate independently:
Streamlit Dashboard: Local Password
A single shared password that gates access to the Streamlit dashboard.
| Variable | Default | Description |
|---|---|---|
| AUTH_LOCAL_PASSWORD | (none) | Set a password to require login; unset = no auth |
When set, visitors see a login form before any dashboard content renders. Suitable for internal or hackathon use. Not a replacement for OIDC in production.
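A minimal sketch of such a gate in Streamlit (illustrative, not THON's actual dashboard code):

```python
# Illustrative sketch: gate a Streamlit app behind AUTH_LOCAL_PASSWORD.
import hmac
import os
import streamlit as st

password = os.environ.get("AUTH_LOCAL_PASSWORD")
if password:  # unset = no auth, as in the table above
    entered = st.text_input("Password", type="password")
    if not hmac.compare_digest(entered, password):
        st.stop()  # nothing below renders until the password matches

st.title("THON Dashboard")  # regular dashboard content follows
```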
FastAPI REST API: OIDC/OAuth2
Full OIDC/OAuth2 authentication via GitHub, GitLab, or LinkedIn.
| Variable | Default | Description |
|---|---|---|
| AUTH_ENABLED | false | Enable OIDC authentication on the REST API |
| AUTH_SESSION_SECRET | (none) | HMAC secret for signing session tokens (required when enabled) |
| AUTH_GITHUB_CLIENT_ID | (none) | GitHub OAuth App client ID |
| AUTH_GITHUB_CLIENT_SECRET | (none) | GitHub OAuth App client secret |
| AUTH_GITLAB_CLIENT_ID | (none) | GitLab OAuth App client ID |
| AUTH_GITLAB_CLIENT_SECRET | (none) | GitLab OAuth App client secret |
| AUTH_LINKEDIN_CLIENT_ID | (none) | LinkedIn OIDC client ID |
| AUTH_LINKEDIN_CLIENT_SECRET | (none) | LinkedIn OIDC client secret |
When AUTH_ENABLED=true, unauthenticated requests to /api/* return 401.
The OAuth flow uses PKCE (S256) for security. Sessions are HMAC-signed tokens
stored as HttpOnly cookies with 24-hour expiry.
See Dashboard → Authentication for the full OAuth flow diagram, session details, and provider setup instructions.
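For intuition, here is a self-contained sketch of an HMAC-signed, expiring session token; the payload layout and secret below are hypothetical, not THON's exact scheme:

```python
# Illustrative sketch of HMAC-signed session tokens with a 24-hour expiry.
import base64, hashlib, hmac, json, time

SECRET = b"value-of-AUTH_SESSION_SECRET"  # hypothetical secret

def sign_session(user: str, ttl: int = 24 * 3600) -> str:
    payload = base64.urlsafe_b64encode(
        json.dumps({"sub": user, "exp": int(time.time()) + ttl}).encode()
    ).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify_session(token: str) -> dict | None:
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # signature mismatch: token was tampered with
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims if claims["exp"] > time.time() else None  # expired

print(verify_session(sign_session("alice")))
```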
Systemd Override
Location
/etc/systemd/system/lemonade-server.service.d/override.conf

Structure
```ini
[Service]
Environment="LEMONADE_API_KEY=your-api-key"
Environment="LEMONADE_ADMIN_API_KEY=your-admin-key"
```

Apply Changes

```bash
sudo systemctl daemon-reload
sudo systemctl restart lemonade-server
```

PVC Workspace Volumes
When users are created via the dashboard or imported from YAML, each user is assigned a Docker named volume (PVC) for persistent workspace storage.
Volume Naming
- Workspace: thon-workspace-{group}-{username}
- Storage: thon-storage-{group}-{username}
Behavior
- Volumes are created automatically via docker volume create before sandbox creation (see the sketch after this list)
- When a sandbox is recreated (via the dashboard or --from-db), the same PVC volume is reattached so files persist across instance lifecycles
- PVC volumes take precedence over --workspace-dir bind mounts
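A sketch of pre-creating volumes under this naming scheme (illustrative; the dashboard and YAML importer do this for you):

```python
# Illustrative sketch: pre-create PVC volumes with the naming scheme above.
import subprocess

GROUPS = {"alpha": ["alice", "bob", "carol"], "beta": ["dave", "eve"]}

for group, users in GROUPS.items():
    for user in users:
        for kind in ("workspace", "storage"):
            name = f"thon-{kind}-{group}-{user}"
            # `docker volume create` is idempotent: rerunning reuses the volume
            subprocess.run(["docker", "volume", "create", name], check=True)
```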
From main.py
```bash
# When using --from-db, PVC volumes from the database are used automatically
python ./scripts/main.py --from-db --external-ip 1.2.3.4
```

Dashboard Configuration
The Streamlit dashboard reads the same environment variables as the backend
services. No separate configuration file is needed; run python -m thon config env
to generate a .env file from thon.yaml.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SANDBOX_DOMAIN | localhost:8080 | Sandbox server address |
| SANDBOX_API_KEY | (none) | Sandbox API key |
| SANDBOX_IMAGE | waterpistol/thon:latest | Docker image for sandboxes |
| LEMONADE_HOST | 0.0.0.0 | Lemonade server bind address |
| LEMONADE_PORT | 13305 | Lemonade server port |
| LEMONADE_API_KEY | (none) | Lemonade API key |
| LEMONADE_ADMIN_API_KEY | (none) | Lemonade admin API key |
| THON_DB_PATH | ~/.thon/thon.db | SQLite database path |
| THON_WORKSPACE_DIR | ~/.thon/workspace | Workspace directory for groups |
| DASHBOARD_HOST | 0.0.0.0 | FastAPI bind address |
| DASHBOARD_PORT | 8100 | FastAPI port |
| DASHBOARD_DEBUG | false | Enable FastAPI debug/reload mode |
AI Gateway Configuration
| Variable | Default | Description |
|---|---|---|
| GATEWAY_ENABLED | false | Enable APISIX AI Gateway |
| GATEWAY_ADMIN_URL | http://127.0.0.1:9180 | APISIX Admin API URL |
| GATEWAY_ADMIN_KEY | edd1c9f034335f136f87ad84b625c8f1 | APISIX Admin API key |
| GATEWAY_PROXY_PORT | 9080 | APISIX proxy port |
| GATEWAY_REDIS_HOST | (none) | Redis host for shared rate limiting |
| GATEWAY_REDIS_PORT | 6379 | Redis port |
| GATEWAY_REDIS_PASSWORD | (none) | Redis password |
| GATEWAY_RATE_LIMIT_TOKENS | 500 | Token limit per consumer per time window |
| GATEWAY_RATE_LIMIT_WINDOW | 60 | Rate limit time window in seconds |
| GATEWAY_MODE | per-user | Consumer mode: per-user or per-group |
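In per-user mode, each user maps to one APISIX consumer holding its own key. A hedged sketch using APISIX's standard Admin API (THON's exact plugin setup, including the token-based rate limiting, may differ):

```python
# Illustrative sketch: register an APISIX consumer with a key-auth credential.
import os
import secrets
import requests

ADMIN_URL = os.environ.get("GATEWAY_ADMIN_URL", "http://127.0.0.1:9180")
ADMIN_KEY = os.environ["GATEWAY_ADMIN_KEY"]

def create_consumer(username: str) -> str:
    api_key = secrets.token_hex(16)
    resp = requests.put(
        f"{ADMIN_URL}/apisix/admin/consumers",
        headers={"X-API-KEY": ADMIN_KEY},  # APISIX Admin API auth header
        json={"username": username, "plugins": {"key-auth": {"key": api_key}}},
        timeout=10,
    )
    resp.raise_for_status()
    return api_key  # this becomes the user's <consumer-api-key> in kilo.json

# create_consumer("alice")
```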
Running
```bash
# Streamlit dashboard (port 8501)
streamlit run dashboard/streamlit_app.py --server.port 8501

# FastAPI REST API (port 8100, optional)
python -m app.main
```

See Dashboard for full documentation including pages and implementation details.