# Importing Llama Stack as a Python Library
Llama Stack is typically used in a client-server configuration. To get started quickly, you can import Llama Stack as a library and call the APIs directly, without needing to set up a server. For example:
```python
from llama_stack_client.lib.direct.direct import LlamaStackDirectClient

client = await LlamaStackDirectClient.from_template("ollama")
await client.initialize()
```
This will parse the template's config and set up any inline implementations and remote clients needed to serve the APIs in-process.
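Because `from_template` and `initialize` are coroutines, they have to be awaited from inside an event loop. A minimal, self-contained sketch (the `asyncio.run`/`main` wrapper is only illustrative, not part of the Llama Stack API):

```python
import asyncio

from llama_stack_client.lib.direct.direct import LlamaStackDirectClient


async def main():
    # Build an in-process client from the "ollama" template; no server required.
    client = await LlamaStackDirectClient.from_template("ollama")
    await client.initialize()
    return client


asyncio.run(main())
```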
Then, you can access the APIs like `models` and `inference` on the client and call their methods directly:
```python
response = await client.models.list()
print(response)
```
```python
from llama_stack_client.types import UserMessage

response = await client.inference.chat_completion(
    messages=[UserMessage(content="What is the capital of France?", role="user")],
    model="Llama3.1-8B-Instruct",
    stream=False,
)
print("\nChat completion response:")
print(response)
```
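The same call can also stream tokens as they are generated. A hedged sketch, continuing from the `client` created above and assuming that `stream=True` returns an async iterator of chunk objects (the exact chunk schema depends on your `llama_stack_client` version):

```python
# Hedged sketch: assumes stream=True yields an async iterator of chunks.
response = await client.inference.chat_completion(
    messages=[UserMessage(content="What is the capital of France?", role="user")],
    model="Llama3.1-8B-Instruct",
    stream=True,
)
async for chunk in response:
    print(chunk)  # inspect each streamed event; its exact shape is version-dependent
```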
If you've created a custom distribution, you can also import it with the `from_config` constructor:
```python
import yaml

# Assumed import location; adjust to where your Llama Stack install exposes it.
from llama_stack.distribution.configure import parse_and_maybe_upgrade_config

with open(config_path, "r") as f:
    config_dict = yaml.safe_load(f)

run_config = parse_and_maybe_upgrade_config(config_dict)
client = await LlamaStackDirectClient.from_config(run_config)
```
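As with the template-based client, this needs to run inside an event loop. A minimal end-to-end sketch under the same assumptions as above (the `run.yaml` path and the import location of `parse_and_maybe_upgrade_config` are placeholders to adapt to your installation):

```python
import asyncio

import yaml

from llama_stack_client.lib.direct.direct import LlamaStackDirectClient

# Assumed import location; adjust to your Llama Stack installation.
from llama_stack.distribution.configure import parse_and_maybe_upgrade_config


async def main(config_path: str):
    # Load and validate the run config for your custom distribution.
    with open(config_path, "r") as f:
        run_config = parse_and_maybe_upgrade_config(yaml.safe_load(f))

    client = await LlamaStackDirectClient.from_config(run_config)
    await client.initialize()
    print(await client.models.list())


asyncio.run(main("run.yaml"))  # "run.yaml" is a placeholder path
```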