CLI Reference
Complete command-line reference for THON and Lemonade Server.
The thon CLI provides a unified entry point for interactive setup, configuration,
and instance management. It reads from a single thon.yaml config file.
| Command | Description |
| --- | --- |
| thon init | Interactive setup wizard (creates thon.yaml) |
| thon setup | Install prerequisites + configure from thon.yaml |
| thon run | Start VS Code instances from thon.yaml |
| thon config show | Display current config |
| thon config env | Export config as .env file |
| thon config validate | Validate thon.yaml |
| thon cleanup | Tear down all resources |

| Option | Default | Description |
| --- | --- | --- |
| --config PATH | ./thon.yaml | Path to thon.yaml config file |
python -m thon init [OPTIONS]
Interactive guided setup wizard that walks through every THON feature with
sensible defaults, validates choices, and writes a thon.yaml config file.
| Option | Default | Description |
| --- | --- | --- |
| --non-interactive | false | Generate config with defaults (no prompts) |
| --config PATH | ./thon.yaml | Output path for config file |
python -m thon setup [OPTIONS]
Installs system prerequisites and configures all components from thon.yaml:
1. System prerequisites (setup.sh)
2. SSL directory
3. Lemonade Server (if lemonade.enabled)
4. AI Gateway (if gateway.enabled)
5. .env file generation
6. Summary
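The conditional steps listed above can be sketched as a small planning function. This is only an illustration: it assumes thon.yaml has already been parsed into a nested dict, and `plan_setup_steps` is a hypothetical helper, not part of the thon CLI.

```python
def plan_setup_steps(config: dict) -> list:
    """Return the setup steps, in order, for a parsed thon.yaml mapping.

    Assumption: optional components are toggled by `lemonade.enabled` and
    `gateway.enabled`, as the step list in this reference suggests.
    """
    steps = ["System prerequisites (setup.sh)", "SSL directory"]
    # Optional components are included only when their `enabled` flag is true.
    if config.get("lemonade", {}).get("enabled", False):
        steps.append("Lemonade Server")
    if config.get("gateway", {}).get("enabled", False):
        steps.append("AI Gateway")
    steps += [".env file generation", "Summary"]
    return steps


if __name__ == "__main__":
    plan = plan_setup_steps({"lemonade": {"enabled": True}})
    for number, step in enumerate(plan, 1):
        print(f"{number}. {step}")
```

With only `lemonade.enabled` set, the plan contains five steps and skips the AI Gateway.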
python -m thon run [OPTIONS]
Starts VS Code instances from thon.yaml. Delegates to scripts/main.py.
| Option | Default | Description |
| --- | --- | --- |
| --group GROUP | (all) | Run only this group |
| --config PATH | ./thon.yaml | Path to config file |
python -m thon config show [OPTIONS]
Displays the full resolved config as YAML.
thon config env [OPTIONS]
Exports configuration as a .env file.
| Option | Default | Description |
| --- | --- | --- |
| --output PATH | .env | Output .env file path |
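To give a feel for what exporting a nested config as a .env file involves, here is a minimal sketch that flattens a dict into `NAME=value` lines. The mapping rule (section and key joined with an underscore and uppercased) is an assumption inferred from variable names such as GATEWAY_ENABLED and LEMONADE_PORT; the real exporter may map keys differently.

```python
def to_env_lines(config: dict, prefix: str = "") -> list:
    """Flatten a nested mapping into NAME=value lines (hypothetical rule)."""
    lines = []
    for key, value in config.items():
        name = f"{prefix}{key}".upper().replace("-", "_")
        if isinstance(value, dict):
            # Nested sections contribute their name as a prefix.
            lines.extend(to_env_lines(value, prefix=f"{name}_"))
        elif isinstance(value, bool):
            # Booleans are written as true/false, matching the documented variables.
            lines.append(f"{name}={'true' if value else 'false'}")
        elif value is not None:
            lines.append(f"{name}={value}")
    return lines


if __name__ == "__main__":
    example = {"gateway": {"enabled": True}, "lemonade": {"port": 13305}}
    print("\n".join(to_env_lines(example)))
```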
thon config validate [OPTIONS]
Validates thon.yaml for common errors (missing groups, auth without providers, etc.).
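The two checks named above (missing groups, auth without providers) might look roughly like the following. The key names `groups`, `auth.enabled`, and `auth.providers` are assumptions made for illustration; the actual thon.yaml schema is not shown in this reference.

```python
def validate_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid.

    Assumed (hypothetical) schema: a top-level `groups` mapping and an
    `auth` section with `enabled` and `providers` keys.
    """
    errors = []
    if not config.get("groups"):
        errors.append("no groups defined")
    auth = config.get("auth", {})
    if auth.get("enabled") and not auth.get("providers"):
        errors.append("auth is enabled but no providers are configured")
    return errors


if __name__ == "__main__":
    for problem in validate_config({"auth": {"enabled": True}}):
        print(f"error: {problem}")
```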
thon cleanup tears down all resources: nginx configs, the Lemonade server, and the AI Gateway.
# Interactive setup wizard
python -m thon init
# Non-interactive (CI-friendly)
python -m thon init --non-interactive
# Install prerequisites and configure
python -m thon setup
# Start instances
python -m thon run
# Start only one group
python -m thon run --group alpha
# Validate config
python -m thon config validate
# Export .env file
python -m thon config env --output .env
# Clean up all resources
python -m thon cleanup
python ./scripts/main.py [OPTIONS]
| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --groups FILE | string | (none) | Path to groups.yaml file |
| --group GROUP | string | (all) | Run only this group (works with --groups or --from-db) |
| --from-db | flag | false | Read groups/users from the database instead of a YAML file |
| --port PORT | int | 8443 | Starting port for code-server instances |
| --timeout MIN | int | 0 | Timeout in minutes (0 = no timeout) |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --domain DOMAIN | string | localhost:8080 | Sandbox server domain |
| --api-key KEY | string | (none) | Sandbox API key |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --image IMAGE | string | waterpistol/thon:latest | Docker image for sandbox |
| --python-version VER | string | 3.11 | Python version in sandbox |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --secure | flag | false | Enable per-user password authentication |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --external-ip IP | string | (auto-detect) | External IP for SSL cert and URLs |
| --ssl-dir DIR | string | /etc/nginx/ssl | SSL certificate storage directory |
| --no-nginx | flag | false | Disable nginx, use direct HTTP access |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --workspace-dir DIR | string | (none) | Host dir for persistent bind mounts |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --lemonade KILO_JSON | string | (none) | Path to kilo.json for LLM config injection |
| --vscode-settings JSON | string | (none) | VS Code settings file to inject |

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --gateway | flag | false | Enable APISIX AI Gateway with rate limiting |
| --gateway-per-group | flag | false | One consumer per group (shared API key) instead of per user |
| --gateway-redis-host HOST | string | (none) | Redis host for shared rate limiting |
| --gateway-rate-limit N | int | 500 | Token limit per consumer per time window |
| --gateway-time-window N | int | 60 | Rate limit time window in seconds |
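The phrase "token limit per consumer per time window" can be illustrated with a simplified in-memory fixed-window counter. This is a conceptual sketch only, using the documented defaults of 500 tokens and 60 seconds; it is not how APISIX implements rate limiting, and `FixedWindowTokenLimiter` is a hypothetical name.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowTokenLimiter:
    """Illustrative fixed-window token budget, tracked per consumer."""

    def __init__(self, limit: int = 500, window_seconds: int = 60,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # consumer -> (window index, tokens used in that window)
        self._usage: Dict[str, Tuple[int, int]] = {}

    def allow(self, consumer: str, tokens: int) -> bool:
        """Return True and record usage if the request fits the budget."""
        window = int(self._clock() // self.window_seconds)
        seen_window, used = self._usage.get(consumer, (window, 0))
        if seen_window != window:
            used = 0  # a new window starts with a fresh budget
        if used + tokens > self.limit:
            self._usage[consumer] = (window, used)
            return False
        self._usage[consumer] = (window, used + tokens)
        return True


if __name__ == "__main__":
    limiter = FixedWindowTokenLimiter()
    print(limiter.allow("alice", 400), limiter.allow("alice", 200))
```

Each consumer has its own budget, so one user exhausting their limit does not affect another. In per-group mode the consumer key would be the group name instead of the username.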

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --cleanup | flag | false | Remove all nginx configs and exit |
# Single instance (no groups)
python ./scripts/main.py
# All groups with nginx SSL
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4
# Single group
python ./scripts/main.py --groups groups.yaml --group alpha --external-ip 1.2.3.4
# Start instances from database instead of YAML
python ./scripts/main.py --from-db --external-ip 1.2.3.4
# Start a specific group from database
python ./scripts/main.py --from-db --group beta --external-ip 1.2.3.4
# Per-user passwords
python ./scripts/main.py --groups groups.yaml --secure --external-ip 1.2.3.4
# Persistent workspace bind mounts
python ./scripts/main.py --groups groups.yaml --workspace-dir /thon-workspace --external-ip 1.2.3.4
# Local LLM inference
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --lemonade kilo.json
# Per-user rate limiting (each user gets own API key)
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway
# Per-group rate limiting (shared API key per group)
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 --gateway --gateway-per-group
# With Redis-backed rate limiting
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 \
--gateway --gateway-redis-host 127.0.0.1
# Custom VS Code settings
python ./scripts/main.py --groups groups.yaml --external-ip 1.2.3.4 \
--vscode-settings vscode-settings.jsonc
# Direct HTTP access
python ./scripts/main.py --groups groups.yaml --no-nginx
# Remove nginx configs
python ./scripts/main.py --cleanup
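When launching scripts/main.py from another Python program, the invocations above can be assembled as an argument list. The sketch below uses only flags documented in this reference; `build_main_command` itself is a hypothetical helper and covers just a subset of the options.

```python
def build_main_command(groups=None, group=None, external_ip=None,
                       secure=False, gateway=False,
                       gateway_per_group=False, no_nginx=False):
    """Assemble an argv list for scripts/main.py from documented options."""
    cmd = ["python", "./scripts/main.py"]
    # Valued options are emitted only when a value is supplied.
    valued = (("--groups", groups), ("--group", group),
              ("--external-ip", external_ip))
    for flag, value in valued:
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean flags are emitted only when enabled.
    switches = (("--secure", secure), ("--gateway", gateway),
                ("--gateway-per-group", gateway_per_group),
                ("--no-nginx", no_nginx))
    for flag, enabled in switches:
        if enabled:
            cmd.append(flag)
    return cmd


if __name__ == "__main__":
    print(" ".join(build_main_command(groups="groups.yaml",
                                      external_ip="1.2.3.4", gateway=True)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues.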
bash ./scripts/setup-lemonade.sh [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| --groups FILE | (none) | groups.yaml for user count |
| --group GROUP | (all) | Filter to single group |
| --num-users N | 1 | Override parallel user count |
| --port PORT | 13305 | Server port |
| --host HOST | 0.0.0.0 | Bind address |
| --backend BACKEND | auto | llama.cpp backend: auto, vulkan, cpu |
| --ctx-size SIZE | 262144 | Per-user context size |
| --model MODEL | unsloth/gemma-4-31B-it-GGUF:Q8_K_XL | HuggingFace checkpoint |
| --model-name NAME | gemma-4-31b-it | Short model name |
| --mmproj FILE | mmproj-BF16.gguf | Vision mmproj filename |
| --external-ip IP | (auto) | External IP for kilo.json |
| --generate-keys | false | Generate API keys |
| --no-prefer-system | (system) | Use bundled llama.cpp |
| --llamacpp-bin PATH | /usr/local/bin/llama-server | System binary path |
| --kilo-config PATH | ./kilo.json | Output path for kilo.json |
| --embedding | true | Enable embedding model for semantic indexing |
| --no-embedding | false | Disable embedding model |
| --embedding-model MODEL | SuperPauly/harrier-oss-v1-0.6b-gguf:harrier-oss-v1-0.6B-BF16 | Embedding model HuggingFace checkpoint |
| --embedding-model-name NAME | harrier-oss-v1-0.6b | Short name for embedding model |
| -h, --help | | Show help message |

| Variable | Description |
| --- | --- |
| LEMONADE_PORT | Server port |
| LEMONADE_HOST | Bind address |
| LEMONADE_BACKEND | llama.cpp backend |
| LEMONADE_CTX_SIZE | Per-user context size |
| LEMONADE_MODEL | HuggingFace checkpoint |
| LEMONADE_MODEL_NAME | Short model name |
| LEMONADE_EXTERNAL_IP | External IP |
| LEMONADE_GENERATE_KEYS | Generate API keys (true/false) |
| LEMONADE_NUM_USERS | Parallel user count |
| LEMONADE_KILO_CONFIG | kilo.json output path |
| LEMONADE_PREFER_SYSTEM | Prefer system binary (true/false) |
| LEMONADE_LLMACPP_BIN | System binary path |
| LEMONADE_MMPROJ | mmproj filename |
| LEMONADE_EMBEDDING | Enable embedding model (true/false) |
| LEMONADE_EMBEDDING_MODEL | Embedding model HuggingFace checkpoint |
| LEMONADE_EMBEDDING_MODEL_NAME | Short name for embedding model |
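Since each of these variables mirrors a command-line option, a setting usually has three possible sources. The sketch below assumes the common precedence of flag over environment variable over built-in default; that ordering is an assumption, as this reference does not state how setup-lemonade.sh resolves conflicts. The defaults are taken from the option table, and `resolve_setting` is a hypothetical helper.

```python
import os

# Defaults copied from the setup-lemonade.sh option table.
DEFAULTS = {
    "LEMONADE_PORT": "13305",
    "LEMONADE_HOST": "0.0.0.0",
    "LEMONADE_BACKEND": "auto",
    "LEMONADE_CTX_SIZE": "262144",
}


def resolve_setting(name, cli_value=None, env=None):
    """Resolve one setting: CLI value, then environment, then default.

    The precedence order is assumed, not confirmed by the reference.
    """
    env = os.environ if env is None else env
    if cli_value is not None:
        return str(cli_value)
    return env.get(name, DEFAULTS[name])


if __name__ == "__main__":
    print(resolve_setting("LEMONADE_PORT"))
```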
# Default model with generated API keys
bash setup-lemonade.sh --generate-keys --external-ip 1.2.3.4
# Take the user count from groups.yaml
bash setup-lemonade.sh --groups groups.yaml --generate-keys --external-ip 1.2.3.4
# Disable the embedding model
bash setup-lemonade.sh --generate-keys --external-ip 1.2.3.4 --no-embedding
# Custom embedding model
bash setup-lemonade.sh \
--embedding-model some-org/embedding-model-GGUF:Q8_0 \
--embedding-model-name my-embedding \
--generate-keys \
--external-ip 1.2.3.4
# Custom chat model
bash setup-lemonade.sh \
--model Qwen/Qwen2.5-Coder-7B-Instruct-GGUF:Q4_K_M \
--model-name qwen-coder-7b \
--generate-keys \
--external-ip 1.2.3.4
# Custom system llama-server binary
bash setup-lemonade.sh \
--llamacpp-bin /opt/llama.cpp/llama-server \
--generate-keys \
--external-ip 1.2.3.4
python ./lemonade_server.py COMMAND [OPTIONS]
| Command | Description |
| --- | --- |
| install | Install lemonade-server via PPA |
| configure | Configure server settings |
| start | Start the server |
| stop | Stop the server |
| restart | Restart the server |
| status | Check server status |
| pull | Pull a model to local cache |
| run | Full setup + keep alive |
| count-users | Count users from groups.yaml |
| write-model-configs | Write user_models.json and recipe_options.json |
| generate-kilo-config | Generate kilo.json for Kilo Code |
| cleanup | Stop server and clean up |
python lemonade_server.py install
Installs lemonade-server from PPA.
python lemonade_server.py configure [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| --port PORT | 13305 | Server port |
| --host HOST | 0.0.0.0 | Bind address |
| --llamacpp-backend BACKEND | auto | Backend: auto, vulkan, cpu |
| --ctx-size SIZE | 4096 | Default context size |
| --max-loaded-models N | 1 | Max models per type slot |
| --generate-keys | false | Generate API keys |
| --prefer-system | true | Prefer system llama.cpp |
| --no-prefer-system | | Use bundled llama.cpp |
| --llamacpp-bin PATH | /usr/local/bin/llama-server | System binary path |
| --kilo-config PATH | (none) | Generate kilo.json |
| --model MODEL | (default) | Model for kilo.json |
| --external-ip IP | (auto) | External IP for kilo.json |
python lemonade_server.py pull --model MODEL
| Option | Default | Description |
| --- | --- | --- |
| --model MODEL | (required) | HuggingFace checkpoint |
python lemonade_server.py run [OPTIONS]
Full setup: install + configure + start + pull model + keep alive.
| Option | Default | Description |
| --- | --- | --- |
| --model MODEL | unsloth/gemma-4-31B-it-GGUF:Q8_K_XL | HuggingFace checkpoint |
| --model-name NAME | gemma-4-31b-it | Short model name |
| --groups FILE | (none) | groups.yaml for user count |
| --group GROUP | (all) | Filter to single group |
| --num-users N | 1 | Override parallel user count |
| --port PORT | 13305 | Server port |
| --host HOST | 0.0.0.0 | Bind address |
| --llamacpp-backend BACKEND | auto | Backend: auto, vulkan, cpu |
| --ctx-size SIZE | 4096 | Default context size |
| --generate-keys | false | Generate API keys |
| --external-ip IP | (auto) | External IP |
| --kilo-config PATH | (auto) | kilo.json output path |
| --prefer-system | true | Prefer system binary |
| --llamacpp-bin PATH | /usr/local/bin/llama-server | System binary path |
| --mmproj FILE | mmproj-BF16.gguf | Vision mmproj filename |
| --skip-install | false | Skip installation check |
| --embedding | true | Enable embedding model for semantic indexing |
| --no-embedding | false | Disable embedding model |
| --embedding-model MODEL | SuperPauly/harrier-oss-v1-0.6b-gguf:harrier-oss-v1-0.6B-BF16 | Embedding model checkpoint |
| --embedding-model-name NAME | harrier-oss-v1-0.6b | Short name for embedding model |
python lemonade_server.py write-model-configs [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| --model MODEL | (default) | HuggingFace checkpoint |
| --model-name NAME | gemma-4-31b-it | Short model name |
| --num-users N | 1 | Parallel user count |
| --llamacpp-backend BACKEND | auto | Backend |
| --mmproj FILE | mmproj-BF16.gguf | Vision mmproj filename |
| --embedding | true | Also write embedding model configs |
| --no-embedding | false | Skip embedding model configs |
| --embedding-model MODEL | SuperPauly/harrier-oss-v1-0.6b-gguf:harrier-oss-v1-0.6B-BF16 | Embedding model checkpoint |
| --embedding-model-name NAME | harrier-oss-v1-0.6b | Short name for embedding model |
python lemonade_server.py generate-kilo-config [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| --model MODEL | (default) | HuggingFace checkpoint |
| --model-name NAME | gemma-4-31b-it | Short model name |
| --external-ip IP | (auto) | External IP |
| --output PATH | kilo.json | Output path |
| --api-key KEY | (none) | API key |
| --admin-api-key KEY | (none) | Admin API key |
| --embedding-model-name NAME | harrier-oss-v1-0.6b | Embedding model name for indexing config |
| --no-embedding | false | Omit indexing section from kilo.json |
# Full setup (with embedding model)
python lemonade_server.py run --groups groups.yaml --generate-keys --external-ip 1.2.3.4
# Full setup without embedding model
python lemonade_server.py run --groups groups.yaml --generate-keys --external-ip 1.2.3.4 --no-embedding
# Just configure
python lemonade_server.py configure --generate-keys --external-ip 1.2.3.4
# Write model configs only (includes embedding by default)
python lemonade_server.py write-model-configs --num-users 6
# Generate kilo.json without embedding/indexing section
python lemonade_server.py generate-kilo-config --admin-api-key YOUR_KEY --external-ip 1.2.3.4 --no-embedding
| Variable | Default | Description |
| --- | --- | --- |
| SANDBOX_DOMAIN | localhost:8080 | Sandbox server address |
| SANDBOX_API_KEY | (none) | Sandbox API key |
| SANDBOX_IMAGE | waterpistol/thon:latest | Docker image |
| PYTHON_VERSION | 3.11 | Python in sandbox |

| Variable | Description |
| --- | --- |
| LEMONADE_API_KEY | API key for regular endpoints |
| LEMONADE_ADMIN_API_KEY | API key for admin endpoints |
| LEMONADE_EMBEDDING | Enable embedding model (true/false) |
| LEMONADE_EMBEDDING_MODEL | Embedding model HuggingFace checkpoint |
| LEMONADE_EMBEDDING_MODEL_NAME | Short name for embedding model |

| Variable | Description |
| --- | --- |
| GATEWAY_ENABLED | Enable AI Gateway (true/false) |
| GATEWAY_ADMIN_URL | APISIX Admin API URL |
| GATEWAY_ADMIN_KEY | APISIX Admin API key |
| GATEWAY_PROXY_PORT | APISIX proxy port |
| GATEWAY_REDIS_HOST | Redis host for rate limiting |
| GATEWAY_REDIS_PORT | Redis port |
| GATEWAY_REDIS_PASSWORD | Redis password |
| GATEWAY_RATE_LIMIT_TOKENS | Token limit per consumer per window |
| GATEWAY_RATE_LIMIT_WINDOW | Time window in seconds |
| GATEWAY_MODE | Consumer mode: per-user or per-group |

| Variable | Description |
| --- | --- |
| THON_DB_PATH | SQLite database path (default: ~/.thon/thon.db) |
| THON_WORKSPACE_DIR | Workspace directory for groups |
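Code that needs the same database as THON can resolve the path the way the table describes: use THON_DB_PATH when set, otherwise fall back to ~/.thon/thon.db. The sketch below is a hypothetical helper; whether THON itself expands `~` in an explicitly set value is an assumption.

```python
import os
from pathlib import Path


def resolve_db_path(env=None) -> Path:
    """Return the SQLite path from THON_DB_PATH, or the documented default."""
    env = os.environ if env is None else env
    raw = env.get("THON_DB_PATH", "~/.thon/thon.db")
    # expanduser() turns a leading "~" into the current user's home directory.
    return Path(raw).expanduser()


if __name__ == "__main__":
    print(resolve_db_path())
```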

| Variable | Default | Description |
| --- | --- | --- |
| AUTH_ENABLED | false | Enable OIDC authentication on the REST API |
| AUTH_SESSION_SECRET | (none) | HMAC secret for signing session tokens |
| AUTH_LOCAL_PASSWORD | (none) | Single password for Streamlit dashboard access |
| AUTH_GITHUB_CLIENT_ID | (none) | GitHub OAuth App client ID |
| AUTH_GITHUB_CLIENT_SECRET | (none) | GitHub OAuth App client secret |
| AUTH_GITLAB_CLIENT_ID | (none) | GitLab OAuth App client ID |
| AUTH_GITLAB_CLIENT_SECRET | (none) | GitLab OAuth App client secret |
| AUTH_LINKEDIN_CLIENT_ID | (none) | LinkedIn OIDC client ID |
| AUTH_LINKEDIN_CLIENT_SECRET | (none) | LinkedIn OIDC client secret |
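AUTH_SESSION_SECRET is described as an HMAC secret for signing session tokens. The sketch below shows the general technique with Python's standard `hmac` module. The token layout (`payload.signature` with a hex SHA-256 digest) is an assumption chosen for illustration and is not necessarily the format THON uses.

```python
import hashlib
import hmac


def sign_session(payload: str, secret: str) -> str:
    """Append an HMAC-SHA256 signature to the payload (illustrative format)."""
    digest = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{digest}"


def verify_session(token: str, secret: str) -> bool:
    """Check that the token's signature matches its payload."""
    payload, separator, signature = token.rpartition(".")
    if not separator:
        return False  # no signature part present
    expected = hmac.new(secret.encode(), payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(signature.encode(), expected.encode())


if __name__ == "__main__":
    token = sign_session("user=alice", "example-secret")
    print(verify_session(token, "example-secret"))
```

A long random secret matters here: anyone who knows it can forge valid tokens.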
python scripts/apisix_gateway.py COMMAND [OPTIONS]
| Command | Description |
| --- | --- |
| setup | Full gateway setup: create routes + consumers from groups.yaml |
| create-consumer | Create a single consumer with API key |
| delete-consumer | Delete a consumer by username |
| generate-kilo | Generate kilo.json for a consumer |
| status | Check gateway status |
| cleanup | Remove all consumers and routes |
python scripts/apisix_gateway.py setup [OPTIONS]
Creates up to two APISIX routes:
- /v1/chat/completions: chat completions via ai-proxy-multi
- /v1/embeddings: embedding requests via upstream proxy (skipped when --no-embedding is set)
| Option | Default | Description |
| --- | --- | --- |
| --groups FILE | (none) | Path to groups.yaml |
| --group GROUP | (all) | Filter to single group |
| --lemonade-url URL | http://127.0.0.1:13305 | Lemonade server URL |
| --lemonade-api-key KEY | (none) | Lemonade API key |
| --lemonade-model MODEL | user.gemma-4-31b-it | Lemonade model name |
| --per-group | false | One consumer per group with shared API key |
| --admin-key KEY | (default) | APISIX Admin API key |
| --admin-port PORT | 9180 | APISIX Admin API port |
| --proxy-port PORT | 9080 | APISIX proxy port |
| --redis-host HOST | (none) | Redis host for rate limiting |
| --redis-port PORT | 6379 | Redis port |
| --redis-password PW | (none) | Redis password |
| --rate-limit N | 500 | Token limit per consumer per time window |
| --time-window N | 60 | Rate limit time window in seconds |
| --generate-kilo | false | Generate kilo.json for each consumer |
| --external-ip IP | (auto) | External IP for kilo.json base URL |
| --embedding-model MODEL | user.harrier-oss-v1-0.6b | Embedding model name for Lemonade |
| --no-embedding | false | Disable embedding route creation |
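The difference between the default per-user mode and --per-group comes down to which names become gateway consumers. The sketch below assumes groups.yaml has been parsed into a mapping of group name to a list of usernames; that shape is an assumption, since the file format is not shown in this reference, and `consumer_names` is a hypothetical helper.

```python
from typing import Dict, List


def consumer_names(groups: Dict[str, List[str]], per_group: bool = False) -> List[str]:
    """List the consumers to create for the given groups.

    Per-user mode: one consumer per distinct username.
    Per-group mode: one consumer per group, shared by its members.
    """
    if per_group:
        return sorted(groups)
    # A set removes duplicates when a user belongs to several groups.
    return sorted({user for users in groups.values() for user in users})


if __name__ == "__main__":
    example = {"alpha": ["alice", "bob"], "beta": ["carol"]}
    print(consumer_names(example))
    print(consumer_names(example, per_group=True))
```

Per-group mode creates fewer consumers, but every member of a group then draws from the same token budget.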
python scripts/apisix_gateway.py create-consumer --username alice [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| --username | (required) | Consumer username |
| --api-key | (auto) | API key (auto-generated if omitted) |
| --rate-limit | 500 | Token limit per time window |
| --time-window | 60 | Time window in seconds |
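The "auto-generated if omitted" behavior of --api-key can be approximated with Python's `secrets` module, which is designed for security-sensitive random values. The key length and encoding below are assumptions; the format that apisix_gateway.py actually generates is not documented here.

```python
import secrets
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Return the supplied key, or a fresh random one when none is given."""
    if api_key:
        return api_key
    # 32 random bytes, URL-safe base64 encoded (hypothetical key format).
    return secrets.token_urlsafe(32)


if __name__ == "__main__":
    print(resolve_api_key())
```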
python scripts/apisix_gateway.py generate-kilo --username alice --api-key KEY [OPTIONS]
| Option | Default | Description |
| --- | --- | --- |
| --username | (required) | Consumer username |
| --api-key | (required) | Consumer API key |
| --proxy-port | 9080 | APISIX proxy port |
| --external-ip | 127.0.0.1 | External IP for gateway URL |
| --model | user.gemma-4-31b-it | Model name for kilo.json |
| --embedding-model | user.harrier-oss-v1-0.6b | Embedding model name for indexing config |
| --no-embedding | false | Omit indexing section from kilo.json |
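The --external-ip and --proxy-port options together determine the gateway URL that clients connect to. The sketch below shows one plausible way to combine them. The `http` scheme and the `/v1` path prefix are assumptions (the prefix is inferred from the /v1/chat/completions route), and `gateway_base_url` is a hypothetical helper.

```python
def gateway_base_url(external_ip: str = "127.0.0.1", proxy_port: int = 9080,
                     scheme: str = "http") -> str:
    """Build the base URL for the gateway (assumed layout)."""
    return f"{scheme}://{external_ip}:{proxy_port}/v1"


if __name__ == "__main__":
    base = gateway_base_url(external_ip="1.2.3.4")
    print(f"{base}/chat/completions")
```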
# Full setup from groups.yaml (includes embedding route)
python scripts/apisix_gateway.py setup --groups groups.yaml \
--lemonade-url http://127.0.0.1:13305
# Setup without embedding route
python scripts/apisix_gateway.py setup --groups groups.yaml \
--lemonade-url http://127.0.0.1:13305 --no-embedding
# Per-group setup with Redis rate limiting
python scripts/apisix_gateway.py setup --groups groups.yaml \
--lemonade-url http://127.0.0.1:13305 --per-group --redis-host 127.0.0.1
# Create single consumer
python scripts/apisix_gateway.py create-consumer --username alice --rate-limit 500
# Generate kilo.json (with embedding/indexing config)
python scripts/apisix_gateway.py generate-kilo --username alice --api-key KEY \
--external-ip 1.2.3.4
# Generate kilo.json without embedding
python scripts/apisix_gateway.py generate-kilo --username alice --api-key KEY \
--external-ip 1.2.3.4 --no-embedding
# Cleanup all gateway resources
python scripts/apisix_gateway.py cleanup