Prerequisites: `pip install mellea`, plus Ollama for local inference or appropriate credentials for cloud backends.
A backend is the engine that runs the LLM. Mellea ships with backends for Ollama,
OpenAI-compatible APIs, LiteLLM, HuggingFace transformers, and IBM WatsonX. You
configure the backend when you create a session.
Default backend
start_session() defaults to Ollama with IBM Granite 4 Micro (granite4:micro).
No API keys needed — just have Ollama running:
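With Ollama running and `granite4:micro` pulled, a default session needs no configuration. A minimal sketch, assuming `instruct()` returns a printable result as the usage elsewhere in these docs suggests:

```python
import mellea

# Uses the default backend: Ollama with IBM Granite 4 Micro (granite4:micro).
# No API keys are needed; Ollama just has to be running locally.
m = mellea.start_session()

# Send a single instruction to the model and print the response.
result = m.instruct("Summarize what an LLM backend is in one sentence.")
print(result)
```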
Switching the model
Pass any model string your backend supports, or use the `model_ids` constants for known models:
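A sketch of both styles; the `model_id` keyword, the `model_ids` import path, and the constant name shown are assumptions for illustration:

```python
import mellea
from mellea.backends import model_ids  # assumed location of the model_ids constants

# Any model string the backend supports works; here, a different Ollama model:
m = mellea.start_session(model_id="llama3.2:1b")

# ...or use a model_ids constant for a known model (illustrative constant name):
m2 = mellea.start_session(model_id=model_ids.IBM_GRANITE_4_MICRO)
```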
OpenAI backend
Use any OpenAI-compatible API: OpenAI itself, LM Studio, vLLM, or Ollama's OpenAI-compatible endpoint.

Backend note: Requires only `pip install mellea` (no extras needed; the OpenAI client is included). A valid `api_key` is needed for the OpenAI API itself; local endpoints such as LM Studio and Ollama's OpenAI endpoint do not require a real key.
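A sketch of pointing the OpenAI backend at Ollama's OpenAI-compatible endpoint; the `base_url` parameter name is an assumption, while `backend_name="openai"` comes from the values listed in this section:

```python
import mellea

# Ollama exposes an OpenAI-compatible endpoint at /v1.
# Local endpoints accept any placeholder api_key; a real key is only
# needed when talking to the OpenAI API itself.
m = mellea.start_session(
    backend_name="openai",
    model_id="granite4:micro",
    base_url="http://localhost:11434/v1",  # assumed parameter name for the endpoint URL
    api_key="ollama",
)
```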
LiteLLM backend
LiteLLM provides unified access to 100+ providers, including Anthropic, AWS Bedrock, and Azure.

Backend note: Requires `pip install "mellea[litellm]"`. Provider-specific environment variables must be set (e.g., `AWS_BEARER_TOKEN_BEDROCK` for Bedrock). See the LiteLLM docs for your provider's setup.
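A sketch of a Bedrock call through LiteLLM; the provider-prefixed model string is illustrative, and credentials are read from the environment variables your provider requires:

```python
import mellea

# LiteLLM routes by provider-prefixed model strings.
# Requires: pip install "mellea[litellm]" and, for Bedrock,
# AWS_BEARER_TOKEN_BEDROCK set in the environment.
m = mellea.start_session(
    backend_name="litellm",
    model_id="bedrock/anthropic.claude-3-haiku-20240307-v1:0",  # illustrative
)
```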
HuggingFace backend
Run models locally using HuggingFace transformers. Required for Intrinsics.

Backend note: Requires `pip install "mellea[hf]"`. Models are downloaded from HuggingFace Hub on first use. GPU recommended for reasonable inference speed.
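A sketch of a local transformers run; the Hub model id shown is illustrative, while `backend_name="hf"` comes from the values listed in this section:

```python
import mellea

# Requires: pip install "mellea[hf]"
# The model weights are downloaded from HuggingFace Hub on first use,
# so the first call may take a while; a GPU is recommended.
m = mellea.start_session(
    backend_name="hf",
    model_id="ibm-granite/granite-4.0-micro",  # illustrative Hub model id
)
```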
WatsonX backend
Deprecated: The native WatsonX backend is deprecated. Use the LiteLLM or OpenAI backend with a WatsonX-compatible endpoint instead. See IBM WatsonX integration for the recommended setup.
Model options
ModelOption provides backend-agnostic keys for common generation parameters.
Options set at session level apply to all calls; options passed to instruct() or
chat() apply to that call only and take precedence:
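A sketch of both levels, assuming `ModelOption` is importable from `mellea.backends.types`:

```python
import mellea
from mellea.backends.types import ModelOption  # assumed import path

# Session-level options apply to every call made in this session.
m = mellea.start_session(
    model_options={ModelOption.TEMPERATURE: 0.2, ModelOption.SEED: 42},
)

# Call-level options apply to this call only and override the session-level
# values where the same key appears.
answer = m.instruct(
    "List three uses of a random seed.",
    model_options={ModelOption.TEMPERATURE: 0.9, ModelOption.MAX_NEW_TOKENS: 200},
)
```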
ModelOption constants:

| Constant | Description |
|---|---|
| `ModelOption.TEMPERATURE` | Sampling temperature |
| `ModelOption.MAX_NEW_TOKENS` | Maximum tokens to generate |
| `ModelOption.SEED` | Random seed for reproducibility |
| `ModelOption.SYSTEM_PROMPT` | System prompt override |
| `ModelOption.THINKING` | Enable thinking / reasoning mode |
| `ModelOption.STREAM` | Enable streaming output |
| `ModelOption.TOOLS` | List of tools available to the model |
| `ModelOption.CONTEXT_WINDOW` | Context window size |
Backend-specific generation parameters can also be passed alongside ModelOption constants. If the same parameter is specified both ways, ModelOption takes precedence.
System prompt
ModelOption.SYSTEM_PROMPT is the recommended way to set a system message. It is
translated correctly for all backends regardless of how each provider serializes the
system role:
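A sketch, again assuming `ModelOption` is importable from `mellea.backends.types`:

```python
import mellea
from mellea.backends.types import ModelOption  # assumed import path

m = mellea.start_session()

# SYSTEM_PROMPT is translated into whatever shape the active backend
# uses for the system role, so this works unchanged across backends.
reply = m.chat(
    "How do I switch backends?",
    model_options={
        ModelOption.SYSTEM_PROMPT: "You are a concise technical assistant.",
    },
)
```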
Direct backend construction
For full control, construct the backend and pass it to MelleaSession directly:
start_session() accepts the same arguments as keyword parameters:
backend_name values: "ollama", "openai", "hf", "litellm", "watsonx".
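A sketch of both styles; the `OllamaModelBackend` class name and its import path are assumptions, while `MelleaSession`, `start_session()`, and the `backend_name` values come from the text above:

```python
import mellea
from mellea import MelleaSession
from mellea.backends.ollama import OllamaModelBackend  # assumed import path

# Direct construction, for full control over the backend:
backend = OllamaModelBackend(model_id="granite4:micro")
m = MelleaSession(backend=backend)

# Equivalent convenience form via start_session keyword parameters:
m2 = mellea.start_session(backend_name="ollama", model_id="granite4:micro")
```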
See also: Configure Model Options | Integrations