Skip to main content
Prerequisites: pip install mellea, Ollama for local inference or appropriate credentials for cloud backends. A backend is the engine that runs the LLM. Mellea ships with backends for Ollama, OpenAI-compatible APIs, LiteLLM, HuggingFace transformers, and IBM WatsonX. You configure the backend when you create a session.

Default backend

start_session() defaults to Ollama with IBM Granite 4 Micro (granite4:micro). No API keys needed — just have Ollama running:
import mellea

m = mellea.start_session()

Backend comparison

The following table shows all available backends, their class names, import paths, required extras, and start_session() configuration:
BackendClassImportRequired extrasstart_session()
OllamaOllamaModelBackendmellea.backends.ollamabasebackend_name="ollama"
OpenAIOpenAIBackendmellea.backends.openaibasebackend_name="openai"
LiteLLMLiteLLMBackendmellea.backends.litellmmellea[litellm]backend_name="litellm"
HuggingFaceLocalHFBackendmellea.backends.huggingfacemellea[hf]backend_name="hf"
WatsonXWatsonxAIBackendmellea.backends.watsonxmellea[watsonx]backend_name="watsonx" (deprecated)
Note: Vertex AI uses the LiteLLM backend with appropriate model IDs. See the Vertex AI integration for details. For detailed setup instructions, click the backend name in the table above.

Switching the model

Pass any model string your backend supports:
import mellea

m = mellea.start_session(model_id="llama3.2:3b")
Use model_ids constants for known models:
from mellea import start_session
from mellea.backends import model_ids

m = start_session(model_id=model_ids.IBM_GRANITE_3_3_8B)

OpenAI backend

Backend note: This section requires pip install mellea (no extras needed — the OpenAI client is included). Needs a valid api_key for the OpenAI API; local endpoints such as LM Studio and Ollama’s OpenAI endpoint do not require a real key.
Use any OpenAI-compatible API — OpenAI itself, LM Studio, vLLM, or Ollama’s OpenAI-compatible endpoint:
from mellea import MelleaSession
from mellea.backends.openai import OpenAIBackend
from mellea.stdlib.context import ChatContext

# OpenAI API
m = MelleaSession(
    OpenAIBackend(model_id="gpt-4o", api_key="sk-..."),
    ctx=ChatContext(),
)
from mellea import MelleaSession
from mellea.backends.openai import OpenAIBackend

# LM Studio (local, no real key needed)
m = MelleaSession(
    OpenAIBackend(model_id="qwen2.5vl:7b", base_url="http://127.0.0.1:1234/v1"),
)

# Ollama via OpenAI-compatible endpoint
m = MelleaSession(
    OpenAIBackend(
        model_id="qwen2.5vl:7b",
        base_url="http://localhost:11434/v1",
        api_key="ollama",
    ),
)

LiteLLM backend

Backend note: Requires pip install "mellea[litellm]". Provider-specific environment variables must be set (e.g., AWS_BEARER_TOKEN_BEDROCK for Bedrock). See the LiteLLM docs for your provider’s setup.
LiteLLM provides unified access to 100+ providers — Anthropic, AWS Bedrock, Azure, and more:
import mellea

m = mellea.start_session(
    backend_name="litellm",
    model_id="bedrock/converse/us.amazon.nova-pro-v1:0",
)
result = m.chat("Give me three facts about the Amazon rainforest.")
print(str(result))
# Output will vary — LLM responses depend on model and temperature.

HuggingFace backend

Backend note: Requires pip install "mellea[hf]". Models are downloaded from HuggingFace Hub on first use. GPU recommended for reasonable inference speed. Required for Intrinsics.
Run models locally using HuggingFace transformers:
from mellea import MelleaSession
from mellea.backends.huggingface import LocalHFBackend

backend = LocalHFBackend(model_id="ibm-granite/granite-4.0-micro")
m = MelleaSession(backend=backend)

WatsonX backend

Deprecated: The native WatsonX backend is deprecated. Use the LiteLLM or OpenAI backend with a WatsonX-compatible endpoint instead. See IBM WatsonX integration for the recommended setup.

Model options

ModelOption provides backend-agnostic keys for common generation parameters. Options set at session level apply to all calls; options passed to instruct() or chat() apply to that call only and take precedence:
from mellea import MelleaSession
from mellea.backends import ModelOption
from mellea.backends.ollama import OllamaModelBackend

# Set seed for all calls in this session
m = MelleaSession(
    backend=OllamaModelBackend(model_options={ModelOption.SEED: 42})
)

# Override temperature and token limit for a single call
answer = m.instruct(
    "What is 2 × 2?",
    model_options={
        ModelOption.TEMPERATURE: 0.5,
        ModelOption.MAX_NEW_TOKENS: 15,
    },
)
print(str(answer))
# Output will vary — LLM responses depend on model and temperature.
Available ModelOption constants:
ConstantDescription
ModelOption.TEMPERATURESampling temperature
ModelOption.MAX_NEW_TOKENSMaximum tokens to generate
ModelOption.SEEDRandom seed for reproducibility
ModelOption.SYSTEM_PROMPTSystem prompt override
ModelOption.THINKINGEnable thinking / reasoning mode
ModelOption.STREAMEnable streaming output
ModelOption.TOOLSList of tools available to the model
ModelOption.CONTEXT_WINDOWContext window size
You can also pass raw backend-native keys alongside ModelOption constants. If the same parameter is specified both ways, ModelOption takes precedence.

System prompt

ModelOption.SYSTEM_PROMPT is the recommended way to set a system message. It is translated correctly for all backends regardless of how each provider serializes the system role:
from mellea import start_session
from mellea.backends import ModelOption

m = start_session(model_options={ModelOption.SYSTEM_PROMPT: "You are a concise assistant."})
reply = m.chat("What is the capital of France?")
print(str(reply))
# Output will vary — LLM responses depend on model and temperature.

Direct backend construction

For full control, construct the backend and pass it to MelleaSession directly:
import mellea
from mellea.backends.ollama import OllamaModelBackend
from mellea.stdlib.context import ChatContext

backend = OllamaModelBackend(model_id="phi4-mini:latest")
m = mellea.MelleaSession(backend=backend, ctx=ChatContext())
start_session() accepts the same arguments as keyword parameters:
import mellea
from mellea.backends import ModelOption
from mellea.stdlib.context import ChatContext

m = mellea.start_session(
    backend_name="ollama",
    model_id="phi4-mini:latest",
    ctx=ChatContext(),
    model_options={ModelOption.TEMPERATURE: 0.1},
)
Valid backend_name values: "ollama", "openai", "hf", "litellm", "watsonx".
See also: Configure Model Options | Integrations