Configuration Reference
Detailed configuration options for THON and Lemonade Server
thon.yaml (Unified Config)
The recommended way to configure THON is via a single thon.yaml file, created
by python -m thon init. This file is the single source of truth for all settings — the
CLI, API, and dashboard all read from it (or its .env export).
Full Example
```yaml
# THON Configuration
# Generated by `python -m thon init` — edit freely or re-run the wizard.

# ── Network ───────────────────────────────────────────────────
# External IP for SSL cert SAN and URLs (auto-detected if empty)
external_ip: ""

# ── Groups & Users ────────────────────────────────────────────
# Each user gets their own VS Code sandbox with workspace at
# /workspace/{group}/{username}
groups:
  alpha:
    - alice
    - bob
    - carol
  beta:
    - dave
    - eve
  gamma:
    - frank

# ── Sandbox ───────────────────────────────────────────────────
sandbox:
  domain: "localhost:8080"
  api_key: ""
  image: "waterpistol/thon:latest"
  python_version: "3.11"
  starting_port: 8443
  timeout_minutes: 0

# ── VS Code Instances ────────────────────────────────────────
vscode:
  # Enable per-user password authentication for code-server
  secure: false
  # Path to VS Code settings JSON (injected into each sandbox)
  settings_file: ""
  # Path to extensions.txt list (injected into each sandbox)
  extensions_file: ""

# ── Nginx & SSL ──────────────────────────────────────────────
nginx:
  enabled: true
  ssl_dir: "/etc/nginx/ssl"

# ── Workspace ─────────────────────────────────────────────────
# Set dir to enable persistent bind mounts (empty = ephemeral)
workspace:
  dir: ""

# ── Lemonade Server (Local LLM Inference) ────────────────────
lemonade:
  enabled: false
  host: "0.0.0.0"
  port: 13305
  model: "unsloth/gemma-4-31B-it-GGUF:Q8_K_XL"
  model_name: "gemma-4-31b-it"
  mmproj: "mmproj-BF16.gguf"
  llamacpp_backend: "auto"
  prefer_system: true
  llamacpp_bin: "/usr/local/bin/llama-server"
  generate_keys: true
  api_key: ""
  admin_api_key: ""

# ── Kilo Code ─────────────────────────────────────────────────
# Auto-generated from Lemonade settings when lemonade.enabled=true
kilo:
  config_file: ""

# ── AI Gateway (APISIX Rate Limiting) ────────────────────────
gateway:
  enabled: false
  # per-user: each user gets their own API key and rate limit
  # per-group: each group shares one API key with combined limit
  mode: "per-user"
  admin_key: "edd1c9f034335f136f87ad84b625c8f1"
  redis_host: ""
  redis_port: 6379
  rate_limit: 500
  time_window: 60

# ── Dashboard ─────────────────────────────────────────────────
dashboard:
  host: "0.0.0.0"
  port: 8100
  debug: false

# ── Authentication (OIDC) ─────────────────────────────────────
# For Streamlit dashboard: set AUTH_LOCAL_PASSWORD env variable
auth:
  enabled: false
  session_secret: ""
  github:
    client_id: ""
    client_secret: ""
  gitlab:
    client_id: ""
    client_secret: ""
  linkedin:
    client_id: ""
    client_secret: ""
```

Config Sections
| Section | Description |
|---|---|
| external_ip | External IP for SSL cert SAN and URLs |
| groups | Group name → list of usernames mapping |
| sandbox | Sandbox server connection and instance settings |
| vscode | VS Code security and customization |
| nginx | Nginx reverse proxy and SSL |
| workspace | Persistent workspace bind-mounts |
| lemonade | Local LLM inference server |
| kilo | Kilo Code extension config injection |
| gateway | APISIX AI Gateway with rate limiting |
| dashboard | Web dashboard settings |
| auth | OIDC authentication providers |
Generating .env from thon.yaml
```bash
# Export config as .env file for the API and dashboard
python -m thon config env --output .env
```

The .env file maps thon.yaml fields to environment variables (e.g.,
lemonade.port → LEMONADE_PORT). Values already set in the environment
are not overwritten.
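The mapping is mechanical: nested keys are joined with underscores and upper-cased. A minimal sketch of the idea (illustrative only, not THON's actual implementation; lists and empty values are simply skipped here):

```python
# Illustrative sketch of the thon.yaml -> .env flattening (not THON's code).
import os
import yaml  # pip install pyyaml

def flatten(cfg: dict, prefix: str = "") -> dict:
    """Nested keys like lemonade.port become names like LEMONADE_PORT."""
    env = {}
    for key, value in cfg.items():
        name = f"{prefix}{key}".upper()
        if isinstance(value, dict):
            env.update(flatten(value, f"{name}_"))
        elif isinstance(value, bool):
            env[name] = str(value).lower()
        elif not isinstance(value, (list, type(None))) and value != "":
            env[name] = str(value)
    return env

with open("thon.yaml") as f:
    config = yaml.safe_load(f)

for name, value in flatten(config).items():
    if name not in os.environ:  # values already set in the environment win
        print(f"{name}={value}")
```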
groups.yaml
Defines users and groups for VS Code instance creation. Used by main.py --groups
and can also be imported into the database via the dashboard.
Structure
```yaml
groups:
  <group-name>:
    users:
      - <username>
      - <username>
```

Example
```yaml
groups:
  alpha:
    users:
      - alice
      - bob
      - carol
  beta:
    users:
      - dave
      - eve
  gamma:
    users:
      - frank
```

Behavior
- Each user gets a unique sandbox instance
- Workspace path: /workspace/{group}/{username}
- URL path: https://{ip}/{endpoint_port}/proxy/{code_server_port}/
- Port assignment: sequential from --port (default 8443); see the sketch below
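A short sketch tying these conventions together, using the example groups above (illustrative, not THON's code; a documentation IP stands in for the external address and the internal code-server port is left as a placeholder):

```python
# Illustrative sketch of the per-user workspace/URL/port conventions.
GROUPS = {"alpha": ["alice", "bob", "carol"], "beta": ["dave", "eve"], "gamma": ["frank"]}
STARTING_PORT = 8443  # the --port default

def plan(groups: dict, starting_port: int, ip: str = "203.0.113.10"):
    port = starting_port
    for group, users in groups.items():
        for user in users:
            workspace = f"/workspace/{group}/{user}"
            url = f"https://{ip}/{port}/proxy/<code_server_port>/"
            print(f"{user:<8} endpoint_port={port} workspace={workspace} url={url}")
            port += 1  # sequential assignment, one port per user

plan(GROUPS, STARTING_PORT)
```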
nginx Configuration
Directory Structure
```text
/etc/nginx/
├── sites-available/
│   ├── sandbox-thon-8443
│   ├── sandbox-thon-8444
│   └── ...
├── sites-enabled/
│   ├── sandbox-thon-8443 -> ../sites-available/sandbox-thon-8443
│   └── ...
└── ssl/
    ├── server-<ip-hash>.crt
    ├── server-<ip-hash>.key
    └── ca.crt  (if mkcert)
```

Per-Port Server Block
Each VS Code instance gets its own nginx config:
```nginx
server {
    listen 80;
    listen 443 ssl;
    server_name _;

    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://127.0.0.1:52322/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
        proxy_redirect off;
        add_header Service-Worker-Allowed /;
        proxy_read_timeout 86400;
        proxy_send_timeout 86400;
        proxy_buffering off;
        proxy_request_buffering off;
    }

    location = /ca.crt {
        alias /path/to/mkcert/rootCA.pem;
    }
}
```

Key Configuration Points
| Setting | Value | Reason |
|---|---|---|
| proxy_pass | http://127.0.0.1:{port}/ | No upstream path (prevents path doubling) |
| Service-Worker-Allowed | / | Fixes service-worker scope errors |
| proxy_read_timeout | 86400 | 24h timeout for long-lived WebSockets |
| proxy_buffering | off | Real-time terminal output |
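For intuition, a hedged sketch of how the per-port files under sites-available/ might be templated; it assumes each instance's server block listens on its endpoint port, and it omits most of the proxy headers shown above for brevity:

```python
# Illustrative sketch: render and enable one nginx site per instance port.
# Assumes each instance's server block listens on its endpoint port.
from pathlib import Path

TEMPLATE = """server {{
    listen {endpoint_port} ssl;
    server_name _;
    ssl_certificate     /etc/nginx/ssl/server.crt;
    ssl_certificate_key /etc/nginx/ssl/server.key;
    location / {{
        proxy_pass http://127.0.0.1:{code_server_port}/;  # no upstream path
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        add_header Service-Worker-Allowed /;
        proxy_read_timeout 86400;
        proxy_buffering off;
    }}
}}
"""

def write_site(endpoint_port: int, code_server_port: int) -> None:
    available = Path(f"/etc/nginx/sites-available/sandbox-thon-{endpoint_port}")
    enabled = Path(f"/etc/nginx/sites-enabled/sandbox-thon-{endpoint_port}")
    available.write_text(TEMPLATE.format(endpoint_port=endpoint_port,
                                         code_server_port=code_server_port))
    if not enabled.exists():
        enabled.symlink_to(f"../sites-available/sandbox-thon-{endpoint_port}")

# write_site(8443, 52322), then: nginx -t && systemctl reload nginx
```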
Lemonade Server Configuration
config.json
Location: /var/lib/lemonade/.cache/lemonade/config.json
```json
{
  "config_version": 1,
  "port": 13305,
  "host": "0.0.0.0",
  "log_level": "info",
  "global_timeout": 300,
  "max_loaded_models": 2,
  "no_broadcast": false,
  "extra_models_dir": "",
  "models_dir": "auto",
  "ctx_size": 1572864,
  "offline": false,
  "disable_model_filtering": false,
  "enable_dgpu_gtt": false,
  "llamacpp": {
    "backend": "auto",
    "args": "",
    "prefer_system": true,
    "rocm_bin": "/usr/local/bin/llama-server",
    "vulkan_bin": "/usr/local/bin/llama-server",
    "cpu_bin": "/usr/local/bin/llama-server"
  },
  "whispercpp": {
    "backend": "auto",
    "args": "",
    "cpu_bin": "builtin",
    "npu_bin": "builtin"
  },
  "sdcpp": {
    "backend": "auto",
    "args": "",
    "steps": 20,
    "cfg_scale": 7.0,
    "width": 512,
    "height": 512,
    "cpu_bin": "builtin",
    "rocm_bin": "builtin",
    "vulkan_bin": "builtin"
  },
  "flm": { "args": "" },
  "ryzenai": { "server_bin": "builtin" },
  "kokoro": { "cpu_bin": "builtin" }
}
```

When embedding is enabled, max_loaded_models is automatically set to 2
(1 chat model + 1 embedding model).
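If you need to change a field by hand (for example max_loaded_models), here is a small sketch of a safe in-place edit; the path is the one given above, and the server must still be restarted afterwards:

```python
# Illustrative sketch: patch one field in Lemonade's config.json.
import json
from pathlib import Path

CONFIG = Path("/var/lib/lemonade/.cache/lemonade/config.json")

def set_option(key: str, value) -> None:
    cfg = json.loads(CONFIG.read_text())
    cfg[key] = value
    tmp = CONFIG.with_name(CONFIG.name + ".tmp")
    tmp.write_text(json.dumps(cfg, indent=2))
    tmp.replace(CONFIG)  # write-then-rename avoids a half-written config

# e.g. keep a chat model and an embedding model resident at once:
set_option("max_loaded_models", 2)
```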
Config Fields
| Field | Type | Default | Description |
|---|---|---|---|
| port | int | 13305 | HTTP server port |
| host | string | localhost | Bind address |
| log_level | string | info | trace, debug, info, warning, error |
| global_timeout | int | 300 | Timeout in seconds |
| max_loaded_models | int | 2 | Max models per type (-1 for unlimited); auto-set to 2 when embedding enabled |
| ctx_size | int | 4096 | Default context size |
| offline | bool | false | Skip model downloads |
| llamacpp.backend | string | auto | auto, vulkan, cpu |
| llamacpp.prefer_system | bool | false | Prefer system llama.cpp |
| llamacpp.*_bin | string | builtin | Path to binary or "builtin" |
user_models.json
Location: /var/lib/lemonade/.cache/lemonade/user_models.json
```json
{
  "gemma-4-31b-it": {
    "model_name": "gemma-4-31b-it",
    "checkpoint": "unsloth/gemma-4-31B-it-GGUF:Q8_K_XL",
    "recipe": "llamacpp",
    "suggested": true,
    "labels": ["custom", "vision"],
    "mmproj": "mmproj-BF16.gguf"
  },
  "harrier-oss-v1-0.6b": {
    "model_name": "harrier-oss-v1-0.6b",
    "checkpoint": "SuperPauly/harrier-oss-v1-0.6b-gguf:harrier-oss-v1-0.6B-BF16",
    "recipe": "llamacpp",
    "suggested": true,
    "labels": ["custom", "embedding"]
  }
}
```

Model Entry Fields
| Field | Required | Description |
|---|---|---|
| model_name | No | Display name (matches key) |
| checkpoint | Yes | HuggingFace checkpoint (org/repo:variant) |
| recipe | Yes | Backend engine (llamacpp, whispercpp, etc.) |
| suggested | No | Show as suggested model |
| labels | No | Tags (custom, vision, embedding) |
| mmproj | No | Vision model mmproj filename |
| size | No | Model size in GB (informational) |
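A sketch of adding an entry programmatically; the model name and checkpoint below are hypothetical placeholders:

```python
# Illustrative sketch: register a custom model in user_models.json.
import json
from pathlib import Path

MODELS = Path("/var/lib/lemonade/.cache/lemonade/user_models.json")

name = "my-model"  # hypothetical
models = json.loads(MODELS.read_text()) if MODELS.exists() else {}
models[name] = {
    "model_name": name,                              # display name, matches the key
    "checkpoint": "some-org/some-repo-GGUF:Q4_K_M",  # hypothetical org/repo:variant
    "recipe": "llamacpp",                            # required: backend engine
    "suggested": True,
    "labels": ["custom"],
}
MODELS.write_text(json.dumps(models, indent=2))
```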
recipe_options.json
Location: /var/lib/lemonade/.cache/lemonade/recipe_options.json
```json
{
  "user.gemma-4-31b-it": {
    "ctx_size": 1572864,
    "llamacpp_backend": "auto",
    "llamacpp_args": "-b 8192 -ub 8192 -to 3600 -ctk q8_0 -ctv q8_0 --temp 1.0 --top-k 64 --top-p 0.95 --min-p 0.0 --repeat-penalty 1.0 --no-webui --threads-http -1 --threads -1 -np 6"
  },
  "user.harrier-oss-v1-0.6b": {
    "ctx_size": 196608,
    "llamacpp_backend": "auto",
    "llamacpp_args": "-b 8192 -ub 8192 -to 3600 -ctk q8_0 -ctv q8_0 --no-webui --threads-http -1 --threads -1 -np 6"
  }
}
```

Embedding Model Scaling
When embedding is enabled, the embedding model gets its own entry in
recipe_options.json with scaled parameters:
| Parameter | Chat Model | Embedding Model |
|---|---|---|
| ctx_size | 262144 per user | 32768 per user |
| -np | num_users | num_users |
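The totals in the recipe_options.json example above follow directly from this table with 6 parallel users (hence -np 6 in the args):

```python
# Worked example of the per-user context scaling.
num_users = 6                    # matches "-np 6" in the example llamacpp_args

chat_ctx = 262144 * num_users    # = 1572864, the chat model's ctx_size
embed_ctx = 32768 * num_users    # = 196608, the embedding model's ctx_size
print(chat_ctx, embed_ctx)
```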
Recipe Options Fields
| Field | Description |
|---|---|
| ctx_size | Total context size (for all parallel slots) |
| llamacpp_backend | Override backend for this model |
| llamacpp_args | Custom llama.cpp arguments |
server_models.json
Location: /var/lib/lemonade/.cache/lemonade/server_models.json
Same structure as user_models.json. Used for server-suggested models
that appear in the Lemonade desktop app.
kilo.json (Kilo Code Config)
Location in Sandbox
/home/vscode/.config/kilo/config.json

Structure
```json
{
  "provider": {
    "lemonade": {
      "models": {
        "user.gemma-4-31b-it": {
          "name": "unsloth/gemma-4-31B-it-GGUF:Q8_K_XL",
          "limit": {
            "context": 262144,
            "output": 4096
          }
        }
      },
      "options": {
        "apiKey": "your-api-key",
        "baseURL": "http://YOUR_IP:13305/v1"
      }
    }
  },
  "model": "lemonade/user.gemma-4-31b-it",
  "experimental": {
    "batch_tool": false,
    "codebase_search": true,
    "openTelemetry": false,
    "continue_loop_on_deny": true,
    "semantic_indexing": true,
    "agent_manager_tool": true
  },
  "indexing": {
    "enabled": true,
    "provider": "openai-compatible",
    "vectorStore": "lancedb",
    "openai-compatible": {
      "baseUrl": "http://YOUR_IP:13305/v1",
      "apiKey": "your-api-key",
      "model": "user.harrier-oss-v1-0.6b"
    }
  }
}
```

Fields
| Field | Description |
|---|---|
| provider.<name>.models.<id>.name | Display name (checkpoint) |
| provider.<name>.models.<id>.limit.context | Max context tokens |
| provider.<name>.models.<id>.limit.output | Max output tokens |
| provider.<name>.options.apiKey | API key for authentication |
| provider.<name>.options.baseURL | OpenAI-compatible API endpoint |
| model | Active model (provider/model-id format) |
| experimental.batch_tool | Enable batch tool calling |
| experimental.codebase_search | Enable codebase search |
| experimental.openTelemetry | Enable OpenTelemetry tracing |
| experimental.continue_loop_on_deny | Continue agent loop on tool deny |
| experimental.semantic_indexing | Enable semantic code indexing |
| experimental.agent_manager_tool | Enable agent manager tool |
| indexing.enabled | Enable semantic indexing |
| indexing.provider | Indexing provider (openai-compatible) |
| indexing.vectorStore | Vector store type (lancedb) |
| indexing.openai-compatible.baseUrl | Embedding API base URL |
| indexing.openai-compatible.apiKey | Embedding API key |
| indexing.openai-compatible.model | Embedding model ID (e.g., user.harrier-oss-v1-0.6b) |
Gateway-aware kilo.json
When the AI Gateway is enabled, kilo.json points to the gateway
instead of directly to Lemonade, and the indexing config uses the gateway
URL for embedding requests:
```json
{
  "providers": {
    "lemonade-gateway": {
      "baseUrl": "http://1.2.3.4:9080",
      "apiKey": "<consumer-api-key>"
    }
  },
  "models": {
    "gemma-4-31b-it": {
      "provider": "lemonade-gateway",
      "modelId": "user.gemma-4-31b-it"
    }
  },
  "experimental": {
    "batch_tool": false,
    "codebase_search": true,
    "openTelemetry": false,
    "continue_loop_on_deny": true,
    "semantic_indexing": true,
    "agent_manager_tool": true
  },
  "indexing": {
    "enabled": true,
    "provider": "openai-compatible",
    "vectorStore": "lancedb",
    "openai-compatible": {
      "baseUrl": "http://1.2.3.4:9080/v1",
      "apiKey": "<consumer-api-key>",
      "model": "user.harrier-oss-v1-0.6b"
    }
  }
}
```

VS Code Settings
vscode-settings.jsonc
Injected into each sandbox at /home/vscode/.local/share/code-server/User/settings.json.
Key settings for Lemonade integration:
```json
{
  "kilo-code.new.showTaskTimeline": true,
  "kilo-code.new.browserAutomation.enabled": true,
  "telemetry.telemetryLevel": "off"
}
```

Config File Storage
VS Code settings can also be stored in the database (via the Settings page
in the dashboard or the /api/config-files REST API). When main.py runs
without --vscode-settings, it reads the stored config from the database.
| DB Key | Label | Description |
|---|---|---|
| config_vscode_settings | VS Code Settings | Code-server settings JSON |
| config_kilo_json | Kilo Code Config | Kilo Code provider config |
| config_groups_yaml | Groups YAML | Groups and users definition |
Priority: CLI flag (e.g., --vscode-settings) > database > none.
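A minimal sketch of that resolution order (illustrative only; the db dict stands in for the database lookup):

```python
# Illustrative sketch of the config-file resolution order.
def resolve_vscode_settings(cli_path: str | None, db: dict) -> str | None:
    if cli_path:                                # 1. CLI flag wins
        with open(cli_path) as f:
            return f.read()
    if "config_vscode_settings" in db:          # 2. fall back to the database
        return db["config_vscode_settings"]
    return None                                 # 3. none: no settings injected

print(resolve_vscode_settings(None, {"config_vscode_settings": "{}"}))
```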
Authentication
THON supports two authentication mechanisms that operate independently:
Streamlit Dashboard: Local Password
A single shared password that gates access to the Streamlit dashboard.
| Variable | Default | Description |
|---|---|---|
| AUTH_LOCAL_PASSWORD | (none) | Set a password to require login; unset = no auth |
When set, visitors see a login form before any dashboard content renders. Suitable for internal or hackathon use. Not a replacement for OIDC in production.
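A minimal sketch of such a gate in Streamlit (illustrative, not THON's actual dashboard code):

```python
# Illustrative sketch: gate a Streamlit app behind AUTH_LOCAL_PASSWORD.
import hmac
import os
import streamlit as st

password = os.environ.get("AUTH_LOCAL_PASSWORD")
if password:  # unset = no auth, as in the table above
    entered = st.text_input("Password", type="password")
    if not hmac.compare_digest(entered, password):
        st.stop()  # nothing below renders until the password matches

st.title("THON Dashboard")  # regular dashboard content follows
```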
FastAPI REST API: OIDC/OAuth2
Full OIDC/OAuth2 authentication via GitHub, GitLab, or LinkedIn.
| Variable | Default | Description |
|---|---|---|
| AUTH_ENABLED | false | Enable OIDC authentication on the REST API |
| AUTH_SESSION_SECRET | (none) | HMAC secret for signing session tokens (required when enabled) |
| AUTH_GITHUB_CLIENT_ID | (none) | GitHub OAuth App client ID |
| AUTH_GITHUB_CLIENT_SECRET | (none) | GitHub OAuth App client secret |
| AUTH_GITLAB_CLIENT_ID | (none) | GitLab OAuth App client ID |
| AUTH_GITLAB_CLIENT_SECRET | (none) | GitLab OAuth App client secret |
| AUTH_LINKEDIN_CLIENT_ID | (none) | LinkedIn OIDC client ID |
| AUTH_LINKEDIN_CLIENT_SECRET | (none) | LinkedIn OIDC client secret |
When AUTH_ENABLED=true, unauthenticated requests to /api/* return 401.
The OAuth flow uses PKCE (S256) for security. Sessions are HMAC-signed tokens
stored as HttpOnly cookies with 24-hour expiry.
See Dashboard → Authentication for the full OAuth flow diagram, session details, and provider setup instructions.
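For intuition, here is a self-contained sketch of an HMAC-signed, expiring session token; the payload layout and secret below are hypothetical, not THON's exact scheme:

```python
# Illustrative sketch of HMAC-signed session tokens with a 24-hour expiry.
import base64, hashlib, hmac, json, time

SECRET = b"value-of-AUTH_SESSION_SECRET"  # hypothetical secret

def sign_session(user: str, ttl: int = 24 * 3600) -> str:
    payload = base64.urlsafe_b64encode(
        json.dumps({"sub": user, "exp": int(time.time()) + ttl}).encode()
    ).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify_session(token: str) -> dict | None:
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # signature mismatch: token was tampered with
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims if claims["exp"] > time.time() else None  # expired

print(verify_session(sign_session("alice")))
```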
Systemd Override
Location
/etc/systemd/system/lemonade-server.service.d/override.conf

Structure
```ini
[Service]
Environment="LEMONADE_API_KEY=your-api-key"
Environment="LEMONADE_ADMIN_API_KEY=your-admin-key"
```

Apply Changes

```bash
sudo systemctl daemon-reload
sudo systemctl restart lemonade-server
```

PVC Workspace Volumes
When users are created via the dashboard or imported from YAML, each user is assigned a Docker named volume (PVC) for persistent workspace storage.
Volume Naming
- Workspace: thon-workspace-{group}-{username}
- Storage: thon-storage-{group}-{username}
Behavior
- Volumes are created automatically via docker volume create before sandbox creation (see the sketch after this list)
- When a sandbox is recreated (via the dashboard or --from-db), the same PVC volume is reattached so files persist across instance lifecycles
- PVC volumes take precedence over --workspace-dir bind mounts
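A sketch of pre-creating volumes under this naming scheme (illustrative; the dashboard and YAML importer do this for you):

```python
# Illustrative sketch: pre-create PVC volumes with the naming scheme above.
import subprocess

GROUPS = {"alpha": ["alice", "bob", "carol"], "beta": ["dave", "eve"]}

for group, users in GROUPS.items():
    for user in users:
        for kind in ("workspace", "storage"):
            name = f"thon-{kind}-{group}-{user}"
            # `docker volume create` is idempotent: rerunning reuses the volume
            subprocess.run(["docker", "volume", "create", name], check=True)
```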
From main.py
```bash
# When using --from-db, PVC volumes from the database are used automatically
python ./scripts/main.py --from-db --external-ip 1.2.3.4
```

Dashboard Configuration
The Streamlit dashboard reads the same environment variables as the backend
services. No separate configuration file is needed; run python -m thon config env
to generate a .env file from thon.yaml.
Environment Variables
| Variable | Default | Description |
|---|---|---|
| SANDBOX_DOMAIN | localhost:8080 | Sandbox server address |
| SANDBOX_API_KEY | (none) | Sandbox API key |
| SANDBOX_IMAGE | waterpistol/thon:latest | Docker image for sandboxes |
| LEMONADE_HOST | 0.0.0.0 | Lemonade server bind address |
| LEMONADE_PORT | 13305 | Lemonade server port |
| LEMONADE_API_KEY | (none) | Lemonade API key |
| LEMONADE_ADMIN_API_KEY | (none) | Lemonade admin API key |
| THON_DB_PATH | ~/.thon/thon.db | SQLite database path |
| THON_WORKSPACE_DIR | ~/.thon/workspace | Workspace directory for groups |
| DASHBOARD_HOST | 0.0.0.0 | FastAPI bind address |
| DASHBOARD_PORT | 8100 | FastAPI port |
| DASHBOARD_DEBUG | false | Enable FastAPI debug/reload mode |
AI Gateway Configuration
| Variable | Default | Description |
|---|---|---|
| GATEWAY_ENABLED | false | Enable APISIX AI Gateway |
| GATEWAY_ADMIN_URL | http://127.0.0.1:9180 | APISIX Admin API URL |
| GATEWAY_ADMIN_KEY | edd1c9f034335f136f87ad84b625c8f1 | APISIX Admin API key |
| GATEWAY_PROXY_PORT | 9080 | APISIX proxy port |
| GATEWAY_REDIS_HOST | (none) | Redis host for shared rate limiting |
| GATEWAY_REDIS_PORT | 6379 | Redis port |
| GATEWAY_REDIS_PASSWORD | (none) | Redis password |
| GATEWAY_RATE_LIMIT_TOKENS | 500 | Token limit per consumer per time window |
| GATEWAY_RATE_LIMIT_WINDOW | 60 | Rate limit time window in seconds |
| GATEWAY_MODE | per-user | Consumer mode: per-user or per-group |
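In per-user mode, each user maps to one APISIX consumer holding its own key. A hedged sketch using APISIX's standard Admin API (THON's exact plugin setup, including the token-based rate limiting, may differ):

```python
# Illustrative sketch: register an APISIX consumer with a key-auth credential.
import os
import secrets
import requests

ADMIN_URL = os.environ.get("GATEWAY_ADMIN_URL", "http://127.0.0.1:9180")
ADMIN_KEY = os.environ["GATEWAY_ADMIN_KEY"]

def create_consumer(username: str) -> str:
    api_key = secrets.token_hex(16)
    resp = requests.put(
        f"{ADMIN_URL}/apisix/admin/consumers",
        headers={"X-API-KEY": ADMIN_KEY},  # APISIX Admin API auth header
        json={"username": username, "plugins": {"key-auth": {"key": api_key}}},
        timeout=10,
    )
    resp.raise_for_status()
    return api_key  # this becomes the user's <consumer-api-key> in kilo.json

# create_consumer("alice")
```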
Running
```bash
# Streamlit dashboard (port 8501)
streamlit run dashboard/streamlit_app.py --server.port 8501

# FastAPI REST API (port 8100, optional)
python -m app.main
```

See Dashboard for full documentation including pages and implementation details.