Mellea command-line tool for LLM-powered workflows. Provides sub-commands for serving models (m serve), training and uploading adapters (m alora), decomposing tasks into subtasks (m decompose), running LLM-as-a-judge evaluation pipelines (m eval), and applying automated code migrations (m fix).

m alora

Train or upload aLoRAs for requirement validation.

m alora add-readme

Generate and upload an INTRINSIC_README.md for a trained adapter. Uses an LLM to auto-generate documentation based on the training data and model configuration, then uploads it to the Hugging Face Hub repository.
Prerequisites:
  • Hugging Face CLI authenticated (huggingface-cli login).
  • An LLM backend available for README generation.
m alora add-readme <DATAFILE> --basemodel <value> [--promptfile] --name <value> [--hints] [--io-yaml]
Arguments:
  • DATAFILE (text, required): JSONL file with item/label pairs
Options:
  • --basemodel (text, required): Base model ID or path
  • --promptfile (text): Path to load the prompt format file
  • --name (text, required): Destination model name (e.g., acme/carbchecker-alora)
  • --hints (text): File containing any additional hints
  • --io-yaml (text): Location of the io.yaml file that configures input and output processing if the model is invoked as an intrinsic
Output: Generates a README.md file, displays it for confirmation, and uploads it to the Hugging Face Hub repository specified by --name. Example:
m alora add-readme data.jsonl --basemodel ibm-granite/granite-3.3-2b-instruct --name acme/my-alora
See also: Lora and Alora Adapters

m alora train

Train an aLoRA or LoRA adapter on a labelled dataset. Fine-tunes a base causal language model using a JSONL dataset of item/label pairs. Supports both aLoRA (activated LoRA) and standard LoRA adapters.
Prerequisites:
  • Mellea installed with adapter extras (uv add mellea[adapters]).
  • A CUDA, MPS, or CPU device available for training.
m alora train <DATAFILE> --basemodel <value> --outfile <value> [--promptfile] [--adapter] [--device] [--epochs] [--learning-rate] [--batch-size] [--max-length] [--grad-accum]
Arguments:
  • DATAFILE (text, required): JSONL file with item/label pairs
Options:
  • --basemodel (text, required): Base model ID or path
  • --outfile (text, required): Path to save adapter weights
  • --promptfile (text): Path to load the prompt format file
  • --adapter (text, default alora): Adapter type: alora or lora
  • --device (text, default auto): Device: auto, cpu, cuda, or mps
  • --epochs (integer, default 6): Number of training epochs
  • --learning-rate (float, default 6e-06): Learning rate
  • --batch-size (integer, default 2): Per-device batch size
  • --max-length (integer, default 1024): Max sequence length
  • --grad-accum (integer, default 4): Gradient accumulation steps
Output: Saves adapter weights to the path specified by --outfile. The output directory contains an adapter_config.json and the trained weight files, ready for upload or local inference. Example:
m alora train data.jsonl --basemodel ibm-granite/granite-3.3-2b-instruct --outfile ./adapter
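The training data is a JSONL file with one record per line. The exact field names are not documented here, so the sketch below assumes records keyed "item" and "label", as the command descriptions suggest; check the Lora and Alora Adapters guide for the real schema.

```python
import json

# Hypothetical item/label records; the real field names may differ.
records = [
    {"item": "The pasta dish contains 45g of carbohydrates.", "label": "Y"},
    {"item": "This response ignores the stated requirement.", "label": "N"},
]

with open("data.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Each line of a JSONL file must parse as a standalone JSON object.
with open("data.jsonl") as f:
    parsed = [json.loads(line) for line in f]
print(len(parsed))  # 2
```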
See also: Lora and Alora Adapters

m alora upload

Upload a trained adapter to a remote model registry. Pushes adapter weights to Hugging Face Hub, optionally packaging the adapter as an intrinsic with an io.yaml configuration file.
Prerequisites: Hugging Face CLI authenticated (huggingface-cli login).
m alora upload <WEIGHT_PATH> --name <value> [--intrinsic] [--io-yaml]
Arguments:
  • WEIGHT_PATH (text, required): Path to saved adapter weights
Options:
  • --name (text, required): Destination model name (e.g., acme/carbchecker-alora)
  • --intrinsic (boolean, default false): True if the uploaded adapter implements an intrinsic. If true, the caller must provide an io.yaml file.
  • --io-yaml (text): Location of the io.yaml file that configures input and output processing if the model is invoked as an intrinsic
Output: Creates or updates a Hugging Face Hub repository at the name specified by --name and uploads the adapter weight files. Example:
m alora upload ./adapter --name acme/my-alora
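Before uploading, it can help to sanity-check that the weights directory has the layout m alora train produces (an adapter_config.json beside the weight files). A minimal sketch, assuming only that layout:

```python
import json
from pathlib import Path

def looks_like_adapter(weight_path: str) -> bool:
    """Heuristic check that a directory contains a saved adapter:
    it must hold an adapter_config.json that parses as JSON."""
    cfg = Path(weight_path) / "adapter_config.json"
    if not cfg.is_file():
        return False
    try:
        json.loads(cfg.read_text())
    except json.JSONDecodeError:
        return False
    return True
```

For example, looks_like_adapter("./adapter") should be true for the directory written by the train command above.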
See also: Lora and Alora Adapters

m decompose

Utility pipeline for decomposing task prompts.

m decompose run

Break a complex task into ordered, executable subtasks. Reads user queries from a file or interactive input, runs the LLM-driven decomposition pipeline for each task job, and writes one JSON file, one rendered Python script, and any generated validation modules under a per-job output directory.
Prerequisites:
  • Mellea installed (uv add mellea).
  • An Ollama instance running locally, or an OpenAI-compatible endpoint configured via --backend-endpoint.
m decompose run --out-dir <value> [--out-name] [--input-file] [--model-id] [--backend] [--backend-req-timeout] [--backend-endpoint] [--backend-api-key] [--version] [--input-var] [--log-mode] [--enable-script-run]
Options:
  • --out-dir (path, required): Path to an existing directory to save the output files
  • --out-name (text, default m_decomp_result): Name for the output files
  • --input-file (text): Path to a text file containing user queries
  • --model-id (text, default mistral-small3.2:latest): Model name/id used to run the decomposition pipeline; the default is valid for the ollama backend
  • --backend (ollama | openai, default ollama): Backend used for inference
  • --backend-req-timeout (integer, default 300): Timeout in seconds for backend requests
  • --backend-endpoint (text): Backend endpoint / base URL. Required for openai.
  • --backend-api-key (text): Backend API key. Required for openai.
  • --version (latest | v1 | v2 | v3, default latest): Version of the mellea program generator template to use
  • --input-var (text): Optional user input variable names. May be passed multiple times; each value must be a valid Python identifier.
  • --log-mode (demo | debug, default demo): Readable logging mode
  • --enable-script-run (boolean, default false): When true, generated scripts expose argparse runtime options for backend, model, endpoint, and API key overrides
Output: Creates a directory <out-dir>/<out-name>/ containing a JSON decomposition result file, a ready-to-run Python script, and any generated validation modules. One directory per task job. Example:
m decompose run --out-dir ./output --input-file tasks.txt
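The layout of the input file is not specified in this reference; assuming one user query per line (verify against the M Decompose docs), tasks.txt could be generated like this:

```python
queries = [
    "Write a product announcement email for our new API.",
    "Summarize this quarter's incident reports into three bullets.",
]

# Assumed layout: one user query per line.
with open("tasks.txt", "w") as f:
    f.write("\n".join(queries) + "\n")
```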
See also: M Decompose, Refactor Prompts with Cli

m eval

LLM-as-a-judge evaluation pipelines.

m eval run

Run LLM-as-a-judge evaluation on one or more test files. Loads test cases from JSON/JSONL files, generates candidate responses using the specified generation backend, scores them with a judge model, and writes aggregated results to a file.
Prerequisites:
  • Mellea installed (uv add mellea).
  • At least one inference backend available (Ollama by default).
  • A separate judge backend/model is recommended but optional (defaults to the generation backend).
m eval run <TEST_FILES> [--backend] [--model] [--max-gen-tokens] [--judge-backend] [--judge-model] [--max-judge-tokens] [--output-path] [--output-format] [--continue-on-error]
Arguments:
  • TEST_FILES (text, required): List of paths to JSON/JSONL files containing test cases
Options:
  • --backend, -b (text, default ollama): Inference backend for generating candidate responses (e.g. ollama, openai)
  • --model (text): Model name/id for the generation backend; uses the backend default if omitted
  • --max-gen-tokens (integer, default 256): Max tokens to generate for responses
  • --judge-backend, -jb (text): Inference backend for the judge model; reuses --backend if omitted
  • --judge-model (text): Model name/id for the judge; uses the judge backend default if omitted
  • --max-judge-tokens (integer, default 256): Max tokens for the judge model's judgement
  • --output-path, -o (text, default eval_results): Output path for results
  • --output-format (text, default json): Either json or jsonl format for results
  • --continue-on-error (boolean, default true): Skip failed test cases instead of aborting the entire run
Output: Writes evaluation results to <output-path>.<output-format> (default eval_results.json). The file contains per-test-case scores, judge verdicts, and aggregate statistics. Example:
m eval run tests.jsonl --backend ollama --model granite3.3:2b
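A test-case file is JSONL with one case per line. The field names below are illustrative assumptions, not the documented schema; see the Evaluate with Llm as a Judge guide for the real format.

```python
import json

# Illustrative test cases; "prompt" and "criteria" are assumed field
# names, not the documented schema for m eval.
cases = [
    {"prompt": "Explain what a mutex is.",
     "criteria": "Mentions mutual exclusion of threads."},
    {"prompt": "What is 2 + 2?",
     "criteria": "Answers 4."},
]

with open("tests.jsonl", "w") as f:
    for case in cases:
        f.write(json.dumps(case) + "\n")
```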
See also: Evaluate with Llm as a Judge

m fix

Fix code for API changes.

m fix async

Fix async calls for the await_result default change. Scans Python source files for aact, ainstruct, and aquery calls and applies an automated migration to restore blocking behaviour after the await_result default changed from True to False.
Prerequisites: Mellea installed (uv add mellea).
m fix async <PATH> [--mode] [--dry-run]
Arguments:
  • PATH (text, required): File or directory to scan
Options:
  • --mode, -m (add-await-result | add-stream-loop, default add-await-result): Fix strategy to apply
  • --dry-run (boolean, default false): Report locations without modifying files
Modes:
  • add-await-result — (default) Adds await_result=True to each call so it blocks until the result is ready. Use this if you don’t need to stream partial results.
  • add-stream-loop — Inserts a while not r.is_computed(): await r.astream() loop after each call. This only works if you passed a streaming model option (e.g. stream=True) to the call; otherwise the loop will finish immediately.
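As a rough illustration of the add-await-result strategy, the rewrite below appends the keyword to a call so it blocks again. This is a naive single-line string sketch for intuition only; the real tool is AST-aware, handles import styles and nesting, and skips previously fixed calls.

```python
import re

def add_await_result(line: str) -> str:
    """Sketch of the add-await-result rewrite on one call line.
    Assumes the call has at least one argument and ends in ')'."""
    if "await_result" in line:  # skip already-fixed calls
        return line
    return re.sub(r"\)\s*$", ", await_result=True)", line)

print(add_await_result("r = await m.aact(instr)"))
# r = await m.aact(instr, await_result=True)
```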
Best practices:
  • Run with --dry-run first to review what will be changed.
  • Only run a given mode once per file. The tool detects prior fixes and skips calls that already have await_result=True or a stream loop, but it is safest to treat it as a one-shot migration.
  • Do not run both modes on the same file. If a stream loop is already present, add-await-result will skip that call (and vice versa).
Detection notes:
  • Most import styles are detected: import mellea, from mellea import MelleaSession, from mellea.stdlib.functional import aact, module aliases, etc.
  • Calls that are already followed by await r.avalue(), await r.astream(), or a while not r.is_computed() loop are automatically skipped, even when nested inside if/try/for blocks.
Output: Modifies Python source files in place (unless --dry-run). Prints a summary of fixed call sites with file paths and line numbers. Example:
m fix async src/ --dry-run

m fix genslots

Rewrite genslot imports and class names to genstub equivalents. Scans Python source files and replaces deprecated GenerativeSlot imports and class references with their GenerativeStub replacements.
Prerequisites: Mellea installed (uv add mellea).
m fix genslots <PATH> [--dry-run]
Arguments:
  • PATH (text, required): File or directory to scan
Options:
  • --dry-run (boolean, default false): Report locations without modifying files
Rewrites:
  • mellea.stdlib.components.genslot → mellea.stdlib.components.genstub
  • GenerativeSlot → GenerativeStub
  • SyncGenerativeSlot → SyncGenerativeStub
  • AsyncGenerativeSlot → AsyncGenerativeStub
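The rewrites above are plain textual renames, so their effect can be sketched with string replacement (the real tool also reports file paths and line numbers):

```python
# The rename table m fix genslots applies, as plain string pairs.
RENAMES = {
    "mellea.stdlib.components.genslot": "mellea.stdlib.components.genstub",
    "AsyncGenerativeSlot": "AsyncGenerativeStub",
    "SyncGenerativeSlot": "SyncGenerativeStub",
    "GenerativeSlot": "GenerativeStub",
}

def rewrite(source: str) -> str:
    for old, new in RENAMES.items():
        source = source.replace(old, new)
    return source

print(rewrite("from mellea.stdlib.components.genslot import GenerativeSlot"))
# from mellea.stdlib.components.genstub import GenerativeStub
```

Like the command itself, this sketch is idempotent: once every Slot name has become a Stub name, a second pass finds nothing to replace.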
Best practices:
  • Run with --dry-run first to review what will be changed.
  • The tool is idempotent — running it twice on the same file is safe.
Output: Modifies Python source files in place (unless --dry-run). Prints a summary of rewritten references with file paths and line numbers. Example:
m fix genslots src/ --dry-run

m serve

Serve a Mellea program as an OpenAI-compatible HTTP endpoint. Loads a Python file containing a serve function and exposes it via a FastAPI server implementing the OpenAI chat completions API. The server accepts POST /v1/chat/completions requests.
Prerequisites:
  • Mellea installed with server dependency group (uv add 'mellea[server]').
  • The python file being loaded must have a serve function.
m serve [SCRIPT_PATH] [--host] [--port]
Arguments:
  • SCRIPT_PATH (text, optional): Path to the Python script to import and serve
Options:
  • --host (text, default 0.0.0.0): Host to bind to
  • --port (integer, default 8080): Port to bind to
Output: Starts a long-running HTTP server on the specified host and port. The /v1/chat/completions endpoint accepts OpenAI-format chat completion requests and returns ChatCompletion JSON responses. Example:
m serve my_app.py --port 9000
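Because the endpoint speaks the OpenAI chat completions format, any OpenAI-compatible client can call it. A minimal stdlib sketch that builds such a request for the server started above (the "model" value is an assumption; what your serve function does with it depends on your script):

```python
import json
import urllib.request

# OpenAI-format chat completion request body.
payload = {
    "model": "my_app",  # assumed value; interpretation is up to serve()
    "messages": [
        {"role": "user", "content": "Hello from the client."},
    ],
}

req = urllib.request.Request(
    "http://localhost:9000/v1/chat/completions",  # host/port from m serve
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, send it and read a ChatCompletion response:
# resp = json.load(urllib.request.urlopen(req))
```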
See also: M Serve