```shell
pip install mellea
```
Install Ollama
Download the installer from ollama.ai or install Ollama with your package manager. On macOS, the .dmg app starts the server automatically as a background service.
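For reference, these are the commonly documented install commands; check ollama.ai for the current instructions for your platform:

```shell
# macOS: install via Homebrew (the .dmg from ollama.ai works too)
brew install ollama

# Linux: the official install script
curl -fsSL https://ollama.com/install.sh | sh
```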
Default setup
start_session() connects to Ollama on localhost:11434 and uses
IBM Granite 4 Micro (granite4:micro) by default. On first run, Mellea
automatically pulls the model if it is not already downloaded:
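A minimal sketch of the default setup (the prompt text is illustrative):

```python
import mellea

# Connects to Ollama on localhost:11434 and pulls granite4:micro if needed
m = mellea.start_session()
print(m.instruct("Write one sentence about local inference."))
```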
Note: The first run pulls granite4:micro (~2 GB). Subsequent runs start
immediately from the local cache.
Switching models
Pass any model name that Ollama supports as the model_id. Mellea also provides model_ids constants for well-known models, which carry the correct Ollama model name automatically:
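Both forms are sketched below, assuming the model_ids constants live in mellea.backends:

```python
import mellea
from mellea.backends import model_ids

# By raw Ollama model name...
m = mellea.start_session(model_id="llama3.2:3b")

# ...or by constant, which resolves to the matching Ollama name
m = mellea.start_session(model_id=model_ids.IBM_GRANITE_3_3_8B)
```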
Recommended models
| `model_ids` constant | Ollama name | Notes |
|---|---|---|
| `IBM_GRANITE_4_MICRO_3B` | `granite4:micro` | Default. Fast, low memory (~2 GB). |
| `IBM_GRANITE_4_HYBRID_MICRO` | `granite4:micro-h` | Hybrid variant with extended thinking. |
| `IBM_GRANITE_3_3_8B` | `granite3.3:8b` | Higher quality, ~5 GB. |
| `IBM_GRANITE_3_3_VISION_2B` | `ibm/granite3.3-vision:2b` | Vision model for image inputs. |
| `META_LLAMA_3_2_3B` | `llama3.2:3b` | Compact Llama model. |
| `MISTRALAI_MISTRAL_0_3_7B` | `mistral:7b` | Mistral 7B. |
| `QWEN3_8B` | `qwen3:8b` | Qwen3 8B. |
| `DEEPSEEK_R1_8B` | `deepseek-r1:8b` | Reasoning-capable model. |
Run `ollama list` to see which models are already downloaded locally.
Direct backend construction
For full control, construct an OllamaModelBackend directly:
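A sketch, assuming OllamaModelBackend lives in mellea.backends.ollama and that a session is built from a backend plus a context (the SimpleContext import path is an assumption):

```python
from mellea import MelleaSession
from mellea.backends import model_ids
from mellea.backends.ollama import OllamaModelBackend
from mellea.stdlib.base import SimpleContext

# Explicit backend: choose the model and keep a handle for reuse
backend = OllamaModelBackend(model_id=model_ids.IBM_GRANITE_4_MICRO_3B)
m = MelleaSession(backend=backend, ctx=SimpleContext())
```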
Custom host
Mellea reads the OLLAMA_HOST environment variable or accepts a base_url parameter. Use either to connect to Ollama running on a remote machine or a non-standard port:
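A sketch with a hypothetical remote host:

```python
import mellea

# Hypothetical remote host; any reachable Ollama endpoint works
m = mellea.start_session(base_url="http://192.168.1.50:11434")
```

Setting `OLLAMA_HOST=http://192.168.1.50:11434` in the environment achieves the same thing without code changes.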
base_url takes precedence over OLLAMA_HOST if both are set.
Model options
Pass generation parameters via ModelOption. Options passed to individual instruct()
or chat() calls apply to that call only and take precedence over session-level defaults:
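A sketch, assuming ModelOption is importable from mellea.backends.types and that its TEMPERATURE and MAX_NEW_TOKENS keys exist:

```python
import mellea
from mellea.backends.types import ModelOption

# Session-level defaults apply to every call
m = mellea.start_session(
    model_options={
        ModelOption.TEMPERATURE: 0.2,
        ModelOption.MAX_NEW_TOKENS: 256,
    }
)

# Per-call options override the session defaults for that call only
m.instruct(
    "Brainstorm five taglines.",
    model_options={ModelOption.TEMPERATURE: 0.9},
)
```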
Vision models
Ollama hosts vision-capable models. Use IBM_GRANITE_3_3_VISION_2B or any Ollama
vision model via the OpenAI-compatible endpoint:
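A sketch of an image request; the `images=` parameter and the placeholder file name are assumptions, so check the vision documentation for the exact call shape:

```python
import mellea
from mellea.backends import model_ids

m = mellea.start_session(model_id=model_ids.IBM_GRANITE_3_3_VISION_2B)

# "photo.jpg" is a placeholder path; images= is an assumed parameter name
print(m.instruct("Describe this image.", images=["photo.jpg"]))
```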
Backend note: Vision requires a model that supports image inputs. The default granite4:micro is text-only. Pull a vision model explicitly before using images: `ollama pull ibm/granite3.3-vision:2b`.
Ollama’s OpenAI-compatible endpoint
Ollama exposes an OpenAI-compatible API at http://localhost:11434/v1. Use this
with the OpenAIBackend to access any Ollama model with OpenAI-style tool calling
or vision support:
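A sketch, assuming OpenAIBackend lives in mellea.backends.openai and accepts base_url and api_key arguments (the SimpleContext import path is an assumption):

```python
from mellea import MelleaSession
from mellea.backends.openai import OpenAIBackend
from mellea.stdlib.base import SimpleContext

backend = OpenAIBackend(
    model_id="granite4:micro",
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # Ollama ignores the value, but the client requires one
)
m = MelleaSession(backend=backend, ctx=SimpleContext())
```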
See the OpenAIBackend reference for details.
Troubleshooting
Connection refused on port 11434
The Ollama server is not running. Start it with `ollama serve`, or on macOS,
launch the Ollama app from Applications.
Model not found
The model has not been pulled. Run `ollama pull <model-name>` before using it, or
let Mellea pull it automatically on first use.
Slow first run
Ollama loads the model into memory on the first request. Subsequent requests in the same session are much faster. On machines with less than 8 GB RAM, consider using `granite4:micro` or `llama3.2:1b`.
Intel Mac torch errors
Some dependencies require a Rosetta-compatible environment on Intel Macs. Create a conda environment and install torchvision before running `pip install mellea`:
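A sketch of that setup; the Python version and conda channel below are illustrative, so adjust them to your environment:

```shell
# Create and activate an isolated environment first
conda create -n mellea python=3.11 -y
conda activate mellea

# Install torchvision from conda before pulling in Mellea via pip
conda install -c pytorch torchvision -y
pip install mellea
```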
See also: Backends and Configuration | Getting Started