Prerequisites: pip install mellea, plus Ollama for local inference or appropriate credentials for cloud backends.
A backend is the engine that runs the LLM. Mellea ships with backends for Ollama,
OpenAI-compatible APIs, LiteLLM, HuggingFace transformers, and IBM WatsonX. You
configure the backend when you create a session.
Default backend
start_session() defaults to Ollama with IBM Granite 4 Micro (granite4:micro).
No API keys needed — just have Ollama running:
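A minimal sketch of the default setup, assuming Ollama is running locally and the default model has been pulled:

```python
from mellea import start_session

# Defaults to the Ollama backend with IBM Granite 4 Micro (granite4:micro).
m = start_session()

# The session object is the entry point for all model calls.
result = m.instruct("Explain in one sentence what an LLM backend is.")
print(result)
```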
Backend comparison
The following table shows all available backends, their class names, import paths, required extras, and start_session() configuration:
| Backend | Class | Import | Required extras | start_session() |
|---|---|---|---|---|
| Ollama | OllamaModelBackend | mellea.backends.ollama | base | backend_name="ollama" |
| OpenAI | OpenAIBackend | mellea.backends.openai | base | backend_name="openai" |
| LiteLLM | LiteLLMBackend | mellea.backends.litellm | mellea[litellm] | backend_name="litellm" |
| HuggingFace | LocalHFBackend | mellea.backends.huggingface | mellea[hf] | backend_name="hf" |
| WatsonX | WatsonxAIBackend | mellea.backends.watsonx | mellea[watsonx] | backend_name="watsonx" (deprecated) |
Note: Vertex AI uses the LiteLLM backend with appropriate model IDs. See the Vertex AI integration for details. For detailed setup instructions, click the backend name in the table above.
Switching the model
Pass any model string your backend supports, or use the model_ids constants for known models:
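For example (the model string and the constant name below are illustrative; check mellea.backends.model_ids for the actual constants your version ships):

```python
from mellea import start_session
from mellea.backends import model_ids

# Any model string the backend can serve works:
m = start_session(model_id="llama3.2:1b")

# Or use a known-model constant (name is illustrative; see model_ids for the list):
m = start_session(model_id=model_ids.IBM_GRANITE_4_MICRO_3B)
```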
OpenAI backend
Backend note: requires only pip install mellea (no extras needed; the OpenAI client is included). A valid api_key is needed for the OpenAI API itself; local endpoints such as LM Studio and Ollama's OpenAI endpoint do not require a real key.
Use any OpenAI-compatible API: OpenAI itself, LM Studio, vLLM, or Ollama's OpenAI-compatible endpoint:
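A sketch of both variants, assuming start_session() forwards base_url and api_key to the OpenAI backend (model names are illustrative):

```python
from mellea import start_session

# Hosted OpenAI API; reads a real API key.
m = start_session("openai", model_id="gpt-4o-mini", api_key="sk-...")

# Local OpenAI-compatible server (here: Ollama's endpoint); key can be a dummy.
m = start_session(
    "openai",
    model_id="granite4:micro",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)
```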
LiteLLM backend
Backend note: requires pip install "mellea[litellm]". Provider-specific environment variables must be set (e.g., AWS_BEARER_TOKEN_BEDROCK for Bedrock); see the LiteLLM docs for your provider's setup.
LiteLLM provides unified access to 100+ providers: Anthropic, AWS Bedrock, Azure, and more:
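A sketch for one provider, assuming LiteLLM's provider/model naming (the model string is illustrative; credentials go in the environment variables LiteLLM expects for that provider):

```python
import os

from mellea import start_session

# LiteLLM reads provider credentials from the environment.
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

# Model strings use LiteLLM's "provider/model" convention.
m = start_session("litellm", model_id="anthropic/claude-3-5-sonnet-20241022")
```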
HuggingFace backend
Backend note: Requires pip install "mellea[hf]". Models are downloaded from
HuggingFace Hub on first use. GPU recommended for reasonable inference speed.
This backend is required for Intrinsics.
Run models locally using HuggingFace transformers:
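A sketch of local inference (the HuggingFace model id is illustrative; any Hub model the transformers library can load should work):

```python
from mellea import start_session

# Weights are downloaded from HuggingFace Hub on first use; GPU recommended.
m = start_session("hf", model_id="ibm-granite/granite-4.0-micro")
result = m.instruct("Name one advantage of running models locally.")
print(result)
```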
WatsonX backend
Deprecated: The native WatsonX backend is deprecated. Use the LiteLLM or OpenAI backend with a WatsonX-compatible endpoint instead. See IBM WatsonX integration for the recommended setup.
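The recommended replacement can be sketched via LiteLLM, which supports WatsonX through its standard environment variables (the model string below is illustrative):

```python
import os

from mellea import start_session

# LiteLLM's WatsonX credentials come from the environment.
os.environ["WATSONX_URL"] = "https://us-south.ml.cloud.ibm.com"
os.environ["WATSONX_APIKEY"] = "..."
os.environ["WATSONX_PROJECT_ID"] = "..."

# WatsonX models use LiteLLM's "watsonx/" prefix.
m = start_session("litellm", model_id="watsonx/ibm/granite-3-3-8b-instruct")
```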
Model options
ModelOption provides backend-agnostic keys for common generation parameters.
Options set at session level apply to all calls; options passed to instruct() or
chat() apply to that call only and take precedence:
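A sketch of both scopes, assuming ModelOption is importable from mellea.backends.types:

```python
from mellea import start_session
from mellea.backends.types import ModelOption

# Session-level options apply to every call made through this session.
m = start_session(model_options={ModelOption.TEMPERATURE: 0.2})

# Call-level options apply to this call only and override session-level ones:
# this call runs at temperature 0.8, not 0.2.
result = m.instruct(
    "List three LLM backends.",
    model_options={
        ModelOption.TEMPERATURE: 0.8,
        ModelOption.MAX_NEW_TOKENS: 64,
    },
)
```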
ModelOption constants:
| Constant | Description |
|---|---|
| ModelOption.TEMPERATURE | Sampling temperature |
| ModelOption.MAX_NEW_TOKENS | Maximum tokens to generate |
| ModelOption.SEED | Random seed for reproducibility |
| ModelOption.SYSTEM_PROMPT | System prompt override |
| ModelOption.THINKING | Enable thinking / reasoning mode |
| ModelOption.STREAM | Enable streaming output |
| ModelOption.TOOLS | List of tools available to the model |
| ModelOption.CONTEXT_WINDOW | Context window size |
Backend-specific parameter names can also be passed in model_options alongside the ModelOption constants. If the same parameter is specified both ways, the ModelOption key takes precedence.
System prompt
ModelOption.SYSTEM_PROMPT is the recommended way to set a system message. It is
translated correctly for all backends regardless of how each provider serializes the
system role:
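A minimal sketch, assuming ModelOption lives in mellea.backends.types:

```python
from mellea import start_session
from mellea.backends.types import ModelOption

# The system prompt is translated to each provider's own system-role format.
m = start_session(
    model_options={ModelOption.SYSTEM_PROMPT: "You are a terse release-notes bot."}
)
print(m.chat("Summarize the latest changes."))
```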
Direct backend construction
For full control, construct the backend and pass it to MelleaSession directly:
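A sketch using the Ollama backend, assuming MelleaSession is importable from the top-level mellea package:

```python
from mellea import MelleaSession
from mellea.backends.ollama import OllamaModelBackend

# Constructing the backend yourself exposes all of its parameters.
backend = OllamaModelBackend(model_id="granite4:micro")
m = MelleaSession(backend)
```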
start_session() accepts the same arguments as keyword parameters:
backend_name values: "ollama", "openai", "hf", "litellm", "watsonx".
See also: Configure Model Options | Integrations