Configuration¶
Rocco supports multiple LLM providers. Configure your preferred provider in the .env file.
Setting Up .env¶
Copy the template:
cp .env.example .env
Then edit .env with your chosen provider.
Supported LLM Providers¶
OpenAI (Default)¶
LLM_PROVIDER=openai
LLM_API_KEY=sk-proj-your-key-here
LLM_MODEL=gpt-4o-mini
Get an API key: https://platform.openai.com/api-keys
Model options:
Check https://developers.openai.com/api/docs/models for the latest available models and pricing.
SambaNova¶
LLM_PROVIDER=sambanova
LLM_API_KEY=sk-your-key-here
LLM_MODEL=Llama-4-Maverick-17B-128E-Instruct
SambaNova at TACC endpoint: https://ai.tejas.tacc.utexas.edu/v1
Anthropic (Claude)¶
LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-your-key-here
LLM_MODEL=claude-opus-4-7
Get an API key: https://console.anthropic.com/
Model options: Check https://platform.claude.com/docs/en/about-claude/models/overview for the latest available models and pricing.
Google Gemini¶
LLM_PROVIDER=gemini
LLM_API_KEY=AIza-your-key-here
LLM_MODEL=gemini-2.0-flash
Get an API key: https://aistudio.google.com/app/apikey
Model options: Check https://ai.google.dev/gemini-api/docs/models for the latest available models and pricing.
DeepSeek¶
LLM_PROVIDER=deepseek
LLM_API_KEY=sk-your-key-here
LLM_MODEL=deepseek-chat
Get an API key: https://platform.deepseek.com/
Model options: Check https://api-docs.deepseek.com/quick_start/pricing for the latest available models and pricing.
HuggingFace (Serverless Inference)¶
LLM_PROVIDER=huggingface
LLM_API_KEY=hf_your-token-here
LLM_MODEL=meta-llama/Llama-3.1-8B-Instruct
Get an API key: https://huggingface.co/settings/tokens
Model options: Use only instruction-tuned model ID from [HuggingFace Hub](https://huggingface.co/models?sort=trending&search=instruction)
Warning
Important caveats:
Only instruction-tuned models work (not base/pretrained models)
Free tier is heavily rate-limited — HF Pro subscription or Inference Endpoints recommended
Not all models support OpenAI-compatible API — check model docs for compatibility
Ollama (Local)¶
Ollama runs locally on your machine.
Installing Ollama If you haven’t installed Ollama yet, you can do so from the command line.
For more information, see the [Ollama installation guide](https://ollama.com/download).
Note
We recommend installing Ollama on WSL2 (Windows Subsystem for Linux) if you’re on Windows, rather than the Windows Desktop app, as it provides better integration with development tools.
If you don’t have WSL2 installed, see Windows Subsystem for Linux Installation.
Install Ollama from the command line:
curl -fsSL https://ollama.com/install.sh | sh
Start Ollama:
ollama serveIn another terminal, pull a model (e.g., llama2):
ollama pull llama2
Verify Ollama is running:
curl http://localhost:11434/api/tags
Configure Rocco for Ollama
Edit your .env file:
LLM_PROVIDER=ollama
LLM_BASE_URL=http://localhost:11434/v1
# LLM_API_KEY is auto-set to "ollama" (no real key needed)
LLM_MODEL=llama2
Available models: llama2, llama3, mistral, phi3, neural-chat, etc. (any model you’ve pulled)
Custom OpenAI-Compatible Endpoint¶
For any OpenAI-compatible API — custom servers, proxies, or open-source models deployed with OpenAI compatibility:
LLM_PROVIDER=openai_compatible
LLM_API_KEY=your-key-here
LLM_BASE_URL=https://your-custom-endpoint.com/v1
LLM_MODEL=your-model-name
What “OpenAI-compatible” means:
The API must expose a /v1/chat/completions endpoint with the same request/response format as OpenAI. Examples:
Text Generation Inference (TGI) with
--openai-compatible-apiflagvLLM with
--served-model-nameAny proxy or router that wraps an LLM with OpenAI API compatibility
If your endpoint is NOT OpenAI-compatible (e.g., native HuggingFace API, custom format), you’ll need to:
Wrap your API with an OpenAI-compatible adapter, OR
Fork Rocco and modify
src/llm/client.pyto support your specific API format
Environment Variable Reference¶
Variable |
Required? |
Description |
|---|---|---|
|
No |
Shortcut to pre-fill |
|
Yes |
Your API key. For Ollama or local services, can
be dummy value (e.g., |
|
No |
Custom endpoint URL. Required for custom or proprietary providers. If not set, provider mapping is used (or defaults to OpenAI). |
|
No |
Model name (e.g., |
Changing Providers¶
To switch providers at runtime, just edit .env and restart the Streamlit app.
Troubleshooting¶
- “API Key not found”
Check that
.envexists in the repo root andLLM_API_KEYis set.- “Invalid API key”
Verify your key is correct (no extra spaces, matches the provider format).
- “Connection timeout”
For Ollama, ensure the server is running (
ollama serve). For cloud providers, check your internet connection.- “Model not found”
Verify the model name is correct for the provider (e.g.,
gpt-4o-minifor OpenAI, notgpt-4).
Next Steps¶
Ready to use Rocco? See Quick Start
Want to understand the architecture? See Architecture
Need help? Check Contributing