Configuration¶

Rocco supports multiple LLM providers. Configure your preferred provider in the .env file.

Setting Up .env¶

Copy the template:

cp .env.example .env

Then edit .env with your chosen provider.

Supported LLM Providers¶

OpenAI (Default)¶

LLM_PROVIDER=openai
LLM_API_KEY=sk-proj-your-key-here
LLM_MODEL=gpt-4o-mini

Get an API key: https://platform.openai.com/api-keys

Model options:

Check https://developers.openai.com/api/docs/models for the latest available models and pricing.

SambaNova¶

LLM_PROVIDER=sambanova
LLM_API_KEY=sk-your-key-here
LLM_MODEL=Llama-4-Maverick-17B-128E-Instruct

SambaNova at TACC endpoint: https://ai.tejas.tacc.utexas.edu/v1

Anthropic (Claude)¶

LLM_PROVIDER=anthropic
LLM_API_KEY=sk-ant-your-key-here
LLM_MODEL=claude-opus-4-7

Get an API key: https://console.anthropic.com/

Model options: Check https://platform.claude.com/docs/en/about-claude/models/overview for the latest available models and pricing.

Google Gemini¶

LLM_PROVIDER=gemini
LLM_API_KEY=AIza-your-key-here
LLM_MODEL=gemini-2.0-flash

Get an API key: https://aistudio.google.com/app/apikey

Model options: Check https://ai.google.dev/gemini-api/docs/models for the latest available models and pricing.

DeepSeek¶

LLM_PROVIDER=deepseek
LLM_API_KEY=sk-your-key-here
LLM_MODEL=deepseek-chat

Get an API key: https://platform.deepseek.com/

Model options: Check https://api-docs.deepseek.com/quick_start/pricing for the latest available models and pricing.

HuggingFace (Serverless Inference)¶

LLM_PROVIDER=huggingface
LLM_API_KEY=hf_your-token-here
LLM_MODEL=meta-llama/Llama-3.1-8B-Instruct

Get an API key: https://huggingface.co/settings/tokens

Model options: Use only instruction-tuned model ID from [HuggingFace Hub](https://huggingface.co/models?sort=trending&search=instruction)

Warning

Important caveats:

Only instruction-tuned models work (not base/pretrained models)
Free tier is heavily rate-limited — HF Pro subscription or Inference Endpoints recommended
Not all models support OpenAI-compatible API — check model docs for compatibility

Ollama (Local)¶

Ollama runs locally on your machine.

Installing Ollama If you haven’t installed Ollama yet, you can do so from the command line.

For more information, see the [Ollama installation guide](https://ollama.com/download).

Note

We recommend installing Ollama on WSL2 (Windows Subsystem for Linux) if you’re on Windows, rather than the Windows Desktop app, as it provides better integration with development tools.

If you don’t have WSL2 installed, see Windows Subsystem for Linux Installation.

Install Ollama from the command line:

curl -fsSL https://ollama.com/install.sh | sh

Start Ollama:
```
ollama serve
```
In another terminal, pull a model (e.g., llama2):
```
ollama pull llama2
```
Verify Ollama is running:
```
curl http://localhost:11434/api/tags
```

Configure Rocco for Ollama

Edit your .env file:

LLM_PROVIDER=ollama
LLM_BASE_URL=http://localhost:11434/v1
# LLM_API_KEY is auto-set to "ollama" (no real key needed)
LLM_MODEL=llama2

Available models: llama2, llama3, mistral, phi3, neural-chat, etc. (any model you’ve pulled)

Custom OpenAI-Compatible Endpoint¶

For any OpenAI-compatible API — custom servers, proxies, or open-source models deployed with OpenAI compatibility:

LLM_PROVIDER=openai_compatible
LLM_API_KEY=your-key-here
LLM_BASE_URL=https://your-custom-endpoint.com/v1
LLM_MODEL=your-model-name

What “OpenAI-compatible” means: The API must expose a /v1/chat/completions endpoint with the same request/response format as OpenAI. Examples:

Text Generation Inference (TGI) with --openai-compatible-api flag
vLLM with --served-model-name
Any proxy or router that wraps an LLM with OpenAI API compatibility

If your endpoint is NOT OpenAI-compatible (e.g., native HuggingFace API, custom format), you’ll need to:

Wrap your API with an OpenAI-compatible adapter, OR
Fork Rocco and modify src/llm/client.py to support your specific API format

Environment Variable Reference¶

Variable	Required?	Description
`LLM_PROVIDER`	No	Shortcut to pre-fill `LLM_BASE_URL`. Options: `openai`, `anthropic`, `gemini`, `deepseek`, `huggingface`, `ollama`, `sambanova`, `openai_compatible` (custom).
`LLM_API_KEY`	Yes	Your API key. For Ollama or local services, can be dummy value (e.g., `ollama` or `unused`).
`LLM_BASE_URL`	No	Custom endpoint URL. Required for custom or proprietary providers. If not set, provider mapping is used (or defaults to OpenAI).
`LLM_MODEL`	No	Model name (e.g., `gpt-4o-mini`). Defaults to `gpt-4o-mini`.

Changing Providers¶

To switch providers at runtime, just edit .env and restart the Streamlit app.

Troubleshooting¶

“API Key not found”: Check that .env exists in the repo root and LLM_API_KEY is set.
“Invalid API key”: Verify your key is correct (no extra spaces, matches the provider format).
“Connection timeout”: For Ollama, ensure the server is running (ollama serve). For cloud providers, check your internet connection.
“Model not found”: Verify the model name is correct for the provider (e.g., gpt-4o-mini for OpenAI, not gpt-4).

Next Steps¶

Ready to use Rocco? See Quick Start
Want to understand the architecture? See Architecture
Need help? Check Contributing