# LLM Inference Service

Optional local AI inference service that powers Windshift's AI features. It runs a 1.2B-parameter language model via llama.cpp: no GPU required, CPU-only inference with ~2 GB of RAM.
## AI Features

The LLM service enables the following features in Windshift:
| Feature | Description |
|---|---|
| Plan My Day | Generates a prioritized daily schedule from your assigned items |
| Catch Me Up | Summarizes recent activity and changes on a work item |
| Find Similar | Detects duplicate or related items across your workspace |
| Decompose | Breaks a work item into smaller sub-tasks |
| Release Notes | Generates release notes from a milestone's completed items |
| Dependency Analysis | Identifies dependencies between work items across teams and sprints |
| AI Chat | Interactive assistant that can look up and update items, track time, and answer questions |
| Daily Briefing | Morning briefing with activity recap, today's focus, and upcoming deadlines |
## Docker Compose

Add the LLM service to your `docker-compose.yml`:
```yaml
services:
  windshift:
    image: ghcr.io/windshiftapp/windshift:latest
    environment:
      - BASE_URL=https://windshift.example.com
      - SSO_SECRET=${SSO_SECRET}
      - LLM_ENDPOINT=http://llm:8081
    depends_on:
      llm:
        condition: service_healthy

  llm:
    image: ghcr.io/windshiftapp/llm:latest
    container_name: windshift-llm
    restart: unless-stopped
    environment:
      - LLM_PORT=8081
      - LLM_CTX_SIZE=4096
      - LLM_THREADS=${LLM_THREADS:-4}
      - LLM_PARALLEL=2
      - LLM_BATCH_SIZE=512
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8081/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 120s
```

## Environment Variables
| Variable | Default | Description |
|---|---|---|
| `LLM_PORT` | `8081` | Server port |
| `LLM_CTX_SIZE` | `4096` | Context window size |
| `LLM_THREADS` | `4` | CPU threads for inference |
| `LLM_PARALLEL` | `2` | Maximum concurrent requests |
| `LLM_BATCH_SIZE` | `512` | Batch size for prompt processing |
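As an illustration, a host with more cores and headroom could raise throughput by increasing threads and parallelism. The values below are examples, not tuned recommendations, and more parallel requests increase KV-cache memory use:

```yaml
llm:
  environment:
    - LLM_THREADS=8     # example: an 8+ core host
    - LLM_PARALLEL=4    # more concurrent requests, at the cost of more RAM
    - LLM_CTX_SIZE=4096 # keep the default unless your model supports more
```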
## Connecting to Windshift

Set `LLM_ENDPOINT` on your main Windshift service and add a `depends_on` entry with a healthcheck condition:

```yaml
windshift:
  environment:
    - LLM_ENDPOINT=http://llm:8081
  depends_on:
    llm:
      condition: service_healthy
```

See Environment Variables for the full configuration reference.
## Resource Requirements
- RAM: ~2 GB (1.25 GB model + KV cache + processing overhead)
- CPU: Any modern x86_64 or ARM64 processor. More threads = faster inference.
- GPU: Not required. The service runs entirely on CPU.
- Startup: ~120 seconds while the model loads into memory. The healthcheck `start_period` accounts for this.
Adjust `LLM_THREADS` based on your available CPU cores. The default of 4 threads works well for most deployments.
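As a sketch, you could derive the thread count from the host's core count before starting Compose. This uses `nproc` from GNU coreutils; reserving two cores for the main Windshift service is an assumption, not a product recommendation:

```shell
# Derive an inference thread count from the host's cores,
# keeping two cores free for the main Windshift service.
cores=$(nproc)
if [ "$cores" -gt 2 ]; then
  threads=$((cores - 2))
else
  threads=1
fi
echo "LLM_THREADS=${threads}"
```

You could append the resulting line to your `.env` file so Compose picks it up via `${LLM_THREADS:-4}`.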
## Bundled Model & Licensing
The default Docker image ships with LiquidAI's LFM2.5-1.2B-Instruct model, licensed under the LFM Open License v1.0.
**Important:** Organizations with $10M+ annual revenue cannot use this model commercially without a separate license from Liquid AI.

If this restriction applies to you, swap the model: the llama.cpp server supports any GGUF-format model. Mount a different `.gguf` file into the container and adjust `LLM_CTX_SIZE` to match:
```yaml
llm:
  image: ghcr.io/windshiftapp/llm:latest
  environment:
    - LLM_CTX_SIZE=8192
  volumes:
    - ./my-model.gguf:/models/model.gguf
```

## External LLM Providers
Instead of running the local inference service, you can configure external LLM providers (OpenAI, Anthropic) through the admin UI. This is useful if you prefer cloud-hosted models or need more capable models for your workload.
Configure providers via the `LLM_PROVIDERS_FILE` environment variable or the `--llm-providers` CLI flag. See Configuration Options for details.
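The actual schema of the providers file is documented in Configuration Options; the snippet below is purely illustrative, with hypothetical field names that may not match the real format:

```yaml
# Hypothetical example only - see Configuration Options for the real schema.
providers:
  - name: openai
    api_key: ${OPENAI_API_KEY}
    model: gpt-4o-mini
```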
## Customizing AI Prompts

Windshift's AI features use embedded system prompts that work out of the box. You can override any prompt by placing a `.txt` file in the prompts directory; only the prompts you provide are replaced, and the rest keep their defaults.
### Prompt Files

| Filename | Feature |
|---|---|
| `plan_my_day.txt` | Plan My Day |
| `catch_me_up.txt` | Catch Me Up |
| `find_similar.txt` | Find Similar |
| `decompose.txt` | Decompose |
| `release_notes.txt` | Release Notes |
| `dependency_analysis.txt` | Dependency Analysis |
| `ai_chat.txt` | AI Chat |
| `daily_briefing.txt` | Daily Briefing |
### Docker Setup

The Docker image sets `AI_PROMPTS_DIR=/data/prompts` by default. Mount a volume to that path with your override files:

```yaml
services:
  windshift:
    image: ghcr.io/windshiftapp/windshift:latest
    volumes:
      - ./my-prompts:/data/prompts:ro
    environment:
      - LLM_ENDPOINT=http://llm:8081
```

Only the files you place in the directory are overridden. For example, to customize just the "Plan My Day" prompt, create a single `plan_my_day.txt` file; all other prompts use the built-in defaults.
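For instance, an override directory with a single custom prompt can be prepared like this (the prompt wording is a placeholder, not the shipped default):

```shell
mkdir -p my-prompts

# Override only the "Plan My Day" prompt; every other feature
# keeps its built-in default.
cat > my-prompts/plan_my_day.txt <<'EOF'
You are a scheduling assistant. Prioritize blocked and overdue
items first, then order the rest by due date.
EOF
```

Mount `./my-prompts` at `/data/prompts` as shown above and restart the container to pick up the change.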
Outside Docker, pass the directory path with --ai-prompts-dir or AI_PROMPTS_DIR. See Configuration Options for details.
### Format Placeholders in ai_chat.txt

The `ai_chat.txt` prompt contains four format placeholders that are filled at runtime:

```text
Today's date is %s.
The current user is %s (user ID: %d). When the user asks about
"my items", "assigned to me", or similar, use the list_items
tool with assignee_id=%d.
```

When customizing this file, you must preserve these `%s` and `%d` verbs in the correct order. Removing or reordering them will cause incorrect output.
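One way to sanity-check a customized file is to compare its placeholder sequence against the expected `%s %s %d %d` order. This is a quick script sketch, not an official validator, and the sample prompt text here is invented:

```shell
# Sample customized ai_chat.txt (placeholder wording, order preserved).
cat > ai_chat.txt <<'EOF'
Today's date is %s. You are a terse assistant.
The current user is %s (user ID: %d). For "my items",
call list_items with assignee_id=%d.
EOF

# Extract %s/%d placeholders in order of appearance and compare
# against the sequence the default ai_chat.txt uses.
expected="%s %s %d %d"
actual=$(grep -o '%[sd]' ai_chat.txt | tr '\n' ' ' | sed 's/ *$//')
if [ "$actual" = "$expected" ]; then
  echo "placeholders OK"
else
  echo "placeholder mismatch: got '$actual'" >&2
  exit 1
fi
```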
### Testing Prompts

Append `?preview=true` to any AI feature endpoint to see the assembled prompts without calling the LLM:

```shell
curl http://localhost:8080/ai/plan-my-day?preview=true
```

The response includes `system_prompt` and `prompt` fields showing exactly what would be sent to the model. Use this to verify your override loaded correctly.
## Checking Status

Verify the AI service is available:

```shell
curl http://localhost:8081/health
```

From within Windshift, the `GET /ai/status` endpoint reports whether AI features are active and which provider is in use.
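Because the model takes a while to load, a deployment script may want to wait for the service before proceeding. A minimal retry loop sketch (the URL, retry count, and interval are illustrative):

```shell
# Poll the health endpoint until it responds or we give up.
# Usage: wait_for_health <url> [tries]
wait_for_health() {
  url=$1
  tries=${2:-12}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after $i retries"
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  echo "service did not become healthy" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8081/health 30` covers the ~120-second model load with room to spare.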