mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 20:14:13 +00:00
# What does this PR do? <!-- Provide a short summary of what this PR does and why. Link to relevant issues if applicable. --> This PR removes `init()` from `LlamaStackAsLibrary` Currently client.initialize() had to be invoked by user. To improve dev experience and to avoid runtime errors, this PR init LlamaStackAsLibrary implicitly upon using the client. It prevents also multiple init of the same client, while maintaining backward ccompatibility. This PR does the following - Automatic Initialization: Constructor calls initialize_impl() automatically. - Client is fully initialized after __init__ completes. - Prevents consecutive initialization after the client has been successfully initialized. - initialize() method still exists but is now a no-op. <!-- If resolving an issue, uncomment and update the line below --> <!-- Closes #[issue-number] --> fixes https://github.com/meta-llama/llama-stack/issues/2946 --------- Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
1.2 KiB
1.2 KiB
Using Llama Stack as a Library
Setup Llama Stack without a Server
If you are planning to use an external service for Inference (even Ollama or TGI counts as external), it is often easier to use Llama Stack as a library. This avoids the overhead of setting up a server.
# setup
uv pip install llama-stack
llama stack build --distro starter --image-type venv
from llama_stack.core.library_client import LlamaStackAsLibraryClient
client = LlamaStackAsLibraryClient(
"starter",
# provider_data is optional, but if you need to pass in any provider specific data, you can do so here.
provider_data={"tavily_search_api_key": os.environ["TAVILY_SEARCH_API_KEY"]},
)
This will parse your config and set up any inline implementations and remote clients needed for your implementation.
Then, you can access the APIs like models
and inference
on the client and call their methods directly:
response = client.models.list()
If you've created a custom distribution, you can also use the run.yaml configuration file directly:
client = LlamaStackAsLibraryClient(config_path)