Prerequisites: `pip install mellea` and Ollama running locally.
## Async methods
Every sync method on `MelleaSession` has an `a`-prefixed async counterpart with the
same signature and return type:
| Sync | Async |
|---|---|
| `instruct()` | `ainstruct()` |
| `chat()` | `achat()` |
| `act()` | `aact()` |
| `validate()` | `avalidate()` |
| `query()` | `aquery()` |
| `transform()` | `atransform()` |
## Parallel generation
`ainstruct()` returns a `ModelOutputThunk` immediately: generation starts in the
background, but the value is not resolved until you call `avalue()`. This lets you
fire off multiple generations and resolve them all at once:
`wait_for_all_mots()` is a convenience wrapper for resolving several thunks at once:
Note: All thunks passed to `wait_for_all_mots()` must belong to the same event loop, which is always the case when using `MelleaSession`.
## Streaming
Enable streaming by passing `ModelOption.STREAM: True` in `model_options`. Consume
incremental output chunks with `mot.astream()`:
`astream()` behaves as follows:
- Each call returns only the new content since the previous call.
- When the thunk is fully computed (`is_computed()` returns `True`), the final `astream()` call returns the complete value.
- If the thunk is already computed, `astream()` returns the full value immediately.
Warning: Do not call `astream()` from multiple coroutines simultaneously on
the same thunk. Each thunk should have a single reader.
## Async and context
Use `SimpleContext` (the default) for concurrent async requests. Using `ChatContext`
with concurrent requests can cause stale-context issues; Mellea logs a warning
when this is detected. To use `ChatContext` with async, await each call fully
before starting the next, or switch to `SimpleContext`.
See also: Tutorial 02: Streaming and Async | act() and aact()