The purpose of the vLLM backend is to provide a fast, locally running inference engine.

Classes

CLASS LocalVLLMBackend

The LocalVLLMBackend uses vLLM’s Python interface for inference and uses a Formatter to convert Components into prompts. Support for [Activated LoRAs (ALoras)](https://arxiv.org/pdf/2504.12397) is planned. This backend is designed for running a Hugging Face model locally for small-scale inference. Its throughput is generally higher than that of LocalHFBackend, but it takes longer to load the model weights during instantiation, and it can be slower when requests are submitted one at a time rather than in batches.
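
A minimal instantiation sketch is shown below. The import paths, the `model_id` keyword, the session API, and the model id are assumptions for illustration only; they are not specified on this page, so adjust them to your installation.

```python
# Hypothetical sketch: import paths, the `model_id` keyword, and the session
# API shown here are assumptions, not confirmed by this reference page.
from mellea import MelleaSession                    # assumed import
from mellea.backends.vllm import LocalVLLMBackend   # assumed module path
from mellea.stdlib.base import SimpleContext        # assumed import

# Instantiation loads the model weights into vLLM, which can take a while
# for large checkpoints.
backend = LocalVLLMBackend(model_id="ibm-granite/granite-3.3-8b-instruct")

# The backend plugs into a session like any other backend.
m = MelleaSession(backend, ctx=SimpleContext())
answer = m.instruct("Summarize the advantage of batched inference in one sentence.")
print(answer)
```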
Methods:

FUNC generate_from_context

generate_from_context(self, action: Component[C] | CBlock, ctx: Context) -> tuple[ModelOutputThunk[C], Context]
Generate using the vLLM model.
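
A hedged sketch of calling this method directly on the backend follows; the import paths and the thunk-resolution behavior are assumptions, and a CBlock is used as the action since the signature accepts `Component[C] | CBlock`.

```python
# Hypothetical sketch: import paths and the `model_id` keyword are assumptions.
from mellea.backends.vllm import LocalVLLMBackend      # assumed module path
from mellea.stdlib.base import CBlock, SimpleContext   # assumed import

backend = LocalVLLMBackend(model_id="ibm-granite/granite-3.3-8b-instruct")
mot, new_ctx = backend.generate_from_context(
    CBlock("Explain paged attention in two sentences."),
    SimpleContext(),
)
# `mot` is a ModelOutputThunk; its value is filled in once generation
# completes (resolution is normally handled by the session layer).
```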

FUNC processing

processing(self, mot: ModelOutputThunk, chunk: vllm.RequestOutput)
Process the returned chunks or the complete response.
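
For reference, the `chunk` argument is vLLM’s own RequestOutput object. The standalone sketch below shows its shape using vLLM’s offline API directly, independent of this backend; the model id is illustrative.

```python
# Standalone illustration of vllm.RequestOutput, independent of this backend.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-3.3-8b-instruct")  # illustrative model id
params = SamplingParams(temperature=0.0, max_tokens=64)

# `generate` returns one vllm.RequestOutput per prompt.
request_output = llm.generate(["Write a haiku about the ocean."], params)[0]

completion = request_output.outputs[0]   # vllm.CompletionOutput
print(completion.text)                   # generated text
print(completion.finish_reason)          # e.g. "stop" or "length"
```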

FUNC post_processing

post_processing(self, mot: ModelOutputThunk, conversation: list[dict], _format: type[BaseModelSubclass] | None, tool_calls: bool, tools: dict[str, Callable], seed)
Called when generation is done.

FUNC generate_from_raw

generate_from_raw(self, actions: list[Component[C]], ctx: Context) -> list[ModelOutputThunk[C]]

FUNC generate_from_raw

generate_from_raw(self, actions: list[Component[C] | CBlock], ctx: Context) -> list[ModelOutputThunk[C | str]]

FUNC generate_from_raw

generate_from_raw(self, actions: Sequence[Component[C] | CBlock], ctx: Context) -> list[ModelOutputThunk]
Generate using the completions API. The input is passed to the model as-is, without templating.
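
A hedged sketch of raw, completion-style generation follows; the import paths and the `model_id` keyword are assumptions for illustration.

```python
# Hypothetical sketch: import paths and the `model_id` keyword are assumptions.
from mellea.backends.vllm import LocalVLLMBackend      # assumed module path
from mellea.stdlib.base import CBlock, SimpleContext   # assumed import

backend = LocalVLLMBackend(model_id="ibm-granite/granite-3.3-8b-instruct")

# Each CBlock is sent to the model verbatim (no chat template); vLLM batches
# the requests for throughput.
thunks = backend.generate_from_raw(
    [CBlock("def fibonacci(n):"), CBlock("The capital of France is")],
    SimpleContext(),
)
# One ModelOutputThunk per input; each resolves to the raw completion text
# once generation finishes.
```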