---
title: Starting a Llama Stack Server
description: Different ways to run Llama Stack servers - as library, container, or Kubernetes deployment
sidebar_label: Starting Llama Stack Server
sidebar_position: 7
---

# Starting a Llama Stack Server

You can run a Llama Stack server in one of the following ways:

## As a Library:

This is the simplest way to get started. Using Llama Stack as a library means you do not need to start a server, which is especially useful when you are not running inference locally and are instead relying on an external inference service (e.g. Fireworks, Together, Groq). See [Using Llama Stack as a Library](importing_as_library) for details.
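For illustration, here is a minimal sketch of the library pattern. The exact import path, client class, and distribution name can vary between Llama Stack versions, so treat this as an outline and follow the linked guide for the authoritative example:

```python
# Sketch only: the import path and distribution name ("starter") are examples
# and may differ in your installed version of llama-stack.
from llama_stack.distribution.library_client import LlamaStackAsLibraryClient

client = LlamaStackAsLibraryClient("starter")
client.initialize()  # builds the stack in-process; no separate server needed

# The client then behaves like a regular LlamaStackClient, e.g.:
for model in client.models.list():
    print(model.identifier)
```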

## Container:

Another simple way to start interacting with Llama Stack is to spin up a container (via Docker or Podman) that comes pre-built with all the providers you need. We provide a number of pre-built images so you can start a Llama Stack server instantly. You can also build your own custom container. Which distribution to choose depends on the hardware you have. See [Selection of a Distribution](./list_of_distributions) for more details.
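As a rough example, a typical invocation looks something like the following. The image name (`llamastack/distribution-starter`), port, and volume mount are illustrative; check the distribution you chose for the exact image name and any required environment variables:

```bash
# Run a pre-built distribution image, persisting state under ~/.llama on the host
docker run -it \
  -p 8321:8321 \
  -v ~/.llama:/root/.llama \
  llamastack/distribution-starter \
  --port 8321
```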

## Kubernetes:

If you have built a container image, you can deploy it in a Kubernetes cluster instead of starting the Llama Stack server locally. See the [Kubernetes Deployment Guide](../deploying/kubernetes_deployment) for more details.
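As a rough sketch (not a substitute for the full guide), the same container image can be run in a cluster with plain `kubectl` commands; the deployment name, image, and port below are examples:

```bash
# Create a Deployment from a pre-built (or custom) Llama Stack image and expose it inside the cluster
kubectl create deployment llama-stack --image=llamastack/distribution-starter --port=8321
kubectl expose deployment llama-stack --port=8321 --target-port=8321
```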

## Configure logging

Control log output via environment variables before starting the server.

- `LLAMA_STACK_LOGGING` sets per-component levels, e.g. `LLAMA_STACK_LOGGING=server=debug;core=info`.
- Supported categories: `all`, `core`, `server`, `router`, `inference`, `agents`, `safety`, `eval`, `tools`, `client`.
- Levels: `debug`, `info`, `warning`, `error`, `critical` (default is `info`). Use `all=<level>` to apply globally.
- `LLAMA_STACK_LOG_FILE=/path/to/log` mirrors logs to a file while still printing to stdout.

Export these variables prior to running `llama stack run`, launching a container, or starting the server through any other pathway.
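For example (the distribution name `starter` and the log file path are placeholders):

```bash
# Debug-level logs for the server component, info for core
export LLAMA_STACK_LOGGING="server=debug;core=info"
# Also mirror all log output to a file
export LLAMA_STACK_LOG_FILE=~/llama-stack.log

llama stack run starter
```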

```{toctree}
:maxdepth: 1
:hidden:

importing_as_library
configuration
```