Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-23 00:59:41 +00:00)

Commit 688f42ba3a — Merge remote-tracking branch 'upstream/main' into patch-1

586 changed files with 65,959 additions and 3,793 deletions

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # inline::meta-reference
 
 ## Description

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # remote::nvidia
 
 ## Description

@@ -43,7 +43,7 @@ We have built-in functionality to run the supported open-benckmarks using llama-
 
 Spin up llama stack server with 'open-benchmark' template
 ```
-llama stack run llama_stack/templates/open-benchmark/run.yaml
+llama stack run llama_stack/distributions/open-benchmark/run.yaml
 
 ```

@@ -23,7 +23,7 @@ To use the HF SFTTrainer in your Llama Stack project, follow these steps:
 
 You can access the HuggingFace trainer via the `ollama` distribution:
 
 ```bash
-llama stack build --template starter --image-type venv
+llama stack build --distro starter --image-type venv
 llama stack run --image-type venv ~/.llama/distributions/ollama/ollama-run.yaml
 ```

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # inline::huggingface
 
 ## Description

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # inline::torchtune
 
 ## Description

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # remote::nvidia
 
 ## Description

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # inline::basic
 
 ## Description

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # inline::braintrust
 
 ## Description

@@ -1,3 +1,7 @@
+---
+orphan: true
+---
+
 # inline::llm-as-judge
 
 ## Description

@@ -355,7 +355,7 @@ server:
 8. Run the server:
 
 ```bash
-python -m llama_stack.distribution.server.server --yaml-config ~/.llama/run-byoa.yaml
+python -m llama_stack.core.server.server --yaml-config ~/.llama/run-byoa.yaml
 ```
 
 9. Test the API:

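Step 9's body falls outside this hunk, but the renamed module path is easy to sanity-check once the server is up. A minimal smoke test, assuming the server listens on the default port 8321 (adjust if `run-byoa.yaml` overrides it):

```bash
# Assumes the default port 8321; lists the models the stack registered
curl -s http://localhost:8321/v1/models
```
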
@@ -97,11 +97,11 @@ To start the Llama Stack Playground, run the following commands:
 
 1. Start up the Llama Stack API server
 
 ```bash
-llama stack build --template together --image-type conda
+llama stack build --distro together --image-type venv
 llama stack run together
 ```
 
 2. Start Streamlit UI
 ```bash
-uv run --with ".[ui]" streamlit run llama_stack/distribution/ui/app.py
+uv run --with ".[ui]" streamlit run llama_stack.core/ui/app.py
 ```

@@ -11,4 +11,5 @@ See the [Adding a New API Provider](new_api_provider.md) which describes how to
 :hidden:
 
 new_api_provider
+testing
 ```

@@ -6,7 +6,7 @@ This guide will walk you through the process of adding a new API provider to Lla
 - Begin by reviewing the [core concepts](../concepts/index.md) of Llama Stack and choose the API your provider belongs to (Inference, Safety, VectorIO, etc.)
 - Determine the provider type ({repopath}`Remote::llama_stack/providers/remote` or {repopath}`Inline::llama_stack/providers/inline`). Remote providers make requests to external services, while inline providers execute implementation locally.
 - Add your provider to the appropriate {repopath}`Registry::llama_stack/providers/registry/`. Specify pip dependencies necessary.
-- Update any distribution {repopath}`Templates::llama_stack/templates/` `build.yaml` and `run.yaml` files if they should include your provider by default. Run {repopath}`./scripts/distro_codegen.py` if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.
+- Update any distribution {repopath}`Templates::llama_stack/distributions/` `build.yaml` and `run.yaml` files if they should include your provider by default. Run {repopath}`./scripts/distro_codegen.py` if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.
 
 
 Here are some example PRs to help you get started:

@@ -52,7 +52,7 @@ def get_base_url(self) -> str:
 
 ## Testing the Provider
 
-Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, you should install dependencies via `llama stack build --template together`.
+Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, you should install dependencies via `llama stack build --distro together`.
 
 ### 1. Integration Testing

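The integration tests themselves live under `tests/integration`. A sketch of the end-to-end flow, with the exact pytest options left as an assumption — consult `tests/integration/README.md` for the authoritative invocation:

```bash
# Hypothetical sketch: install the distribution's dependencies, then run the
# integration suite; provider-selection flags are described in the tests README.
llama stack build --distro together --image-type venv
pytest tests/integration
```
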
@@ -174,7 +174,7 @@ spec:
       - name: llama-stack
         image: localhost/llama-stack-run-k8s:latest
         imagePullPolicy: IfNotPresent
-        command: ["python", "-m", "llama_stack.distribution.server.server", "--config", "/app/config.yaml"]
+        command: ["python", "-m", "llama_stack.core.server.server", "--config", "/app/config.yaml"]
         ports:
           - containerPort: 5000
         volumeMounts:

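Only the module path in `command` changes; deploying the manifest is unchanged. A sketch, assuming the manifest above is saved as `stack-k8s.yaml`:

```bash
# Hypothetical file name; substitute your own manifest
kubectl apply -f stack-k8s.yaml
kubectl get pods -w   # wait for the llama-stack pod to become Ready
```
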
@@ -47,30 +47,37 @@ pip install -e .
 ```
 Use the CLI to build your distribution.
 The main points to consider are:
-1. **Image Type** - Do you want a Conda / venv environment or a Container (eg. Docker)
+1. **Image Type** - Do you want a venv environment or a Container (eg. Docker)
 2. **Template** - Do you want to use a template to build your distribution? or start from scratch ?
 3. **Config** - Do you want to use a pre-existing config file to build your distribution?
 
 ```
 llama stack build -h
-usage: llama stack build [-h] [--config CONFIG] [--template TEMPLATE] [--list-templates] [--image-type {conda,container,venv}] [--image-name IMAGE_NAME] [--print-deps-only] [--run]
+usage: llama stack build [-h] [--config CONFIG] [--template TEMPLATE] [--distro DISTRIBUTION] [--list-distros] [--image-type {container,venv}] [--image-name IMAGE_NAME] [--print-deps-only]
+                         [--run] [--providers PROVIDERS]
 
 Build a Llama stack container
 
 options:
   -h, --help            show this help message and exit
-  --config CONFIG       Path to a config file to use for the build. You can find example configs in llama_stack/distributions/**/build.yaml. If this argument is not provided, you will
-                        be prompted to enter information interactively (default: None)
-  --template TEMPLATE   Name of the example template config to use for build. You may use `llama stack build --list-templates` to check out the available templates (default: None)
-  --list-templates      Show the available templates for building a Llama Stack distribution (default: False)
-  --image-type {conda,container,venv}
+  --config CONFIG       Path to a config file to use for the build. You can find example configs in llama_stack.cores/**/build.yaml. If this argument is not provided, you will be prompted to
+                        enter information interactively (default: None)
+  --template TEMPLATE   (deprecated) Name of the example template config to use for build. You may use `llama stack build --list-distros` to check out the available distributions (default:
+                        None)
+  --distro DISTRIBUTION, --distribution DISTRIBUTION
+                        Name of the distribution to use for build. You may use `llama stack build --list-distros` to check out the available distributions (default: None)
+  --list-distros, --list-distributions
+                        Show the available distributions for building a Llama Stack distribution (default: False)
+  --image-type {container,venv}
                         Image Type to use for the build. If not specified, will use the image type from the template config. (default: None)
   --image-name IMAGE_NAME
-                        [for image-type=conda|container|venv] Name of the conda or virtual environment to use for the build. If not specified, currently active environment will be used if
-                        found. (default: None)
+                        [for image-type=container|venv] Name of the virtual environment to use for the build. If not specified, currently active environment will be used if found. (default:
+                        None)
   --print-deps-only     Print the dependencies for the stack only, without building the stack (default: False)
   --run                 Run the stack after building using the same image type, name, and other applicable arguments (default: False)
+  --providers PROVIDERS
+                        Build a config for a list of providers and only those providers. This list is formatted like: api1=provider1,api2=provider2. Where there can be multiple providers per
+                        API. (default: None)
 ```
 
 After this step is complete, a file named `<name>-build.yaml` and template file `<name>-run.yaml` will be generated and saved at the output file path specified at the end of the command.

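The new `--providers` flag builds a stack from an explicit provider list rather than a named distribution. A sketch using the `api=provider` format from the help text above (the specific providers chosen here are illustrative):

```bash
# Illustrative: one inference and one safety provider, venv image type
llama stack build \
  --providers inference=remote::ollama,safety=inline::llama-guard \
  --image-type venv --image-name my-providers-env
```
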
@@ -141,7 +148,7 @@ You may then pick a template to build your distribution with providers fitted to
 
 For example, to build a distribution with TGI as the inference provider, you can run:
 ```
-$ llama stack build --template starter
+$ llama stack build --distro starter
 ...
 You can now edit ~/.llama/distributions/llamastack-starter/starter-run.yaml and run `llama stack run ~/.llama/distributions/llamastack-starter/starter-run.yaml`
 ```

@@ -159,7 +166,7 @@ It would be best to start with a template and understand the structure of the co
 llama stack build
 
 > Enter a name for your Llama Stack (e.g. my-local-stack): my-stack
-> Enter the image type you want your Llama Stack to be built as (container or conda or venv): conda
+> Enter the image type you want your Llama Stack to be built as (container or venv): venv
 
 Llama Stack is composed of several APIs working together. Let's select
 the provider types (implementations) you want to use for these APIs.

@@ -184,10 +191,10 @@ You can now edit ~/.llama/distributions/llamastack-my-local-stack/my-local-stack
 :::{tab-item} Building from a pre-existing build config file
 - In addition to templates, you may customize the build to your liking through editing config files and build from config files with the following command.
 
-- The config file will be of contents like the ones in `llama_stack/templates/*build.yaml`.
+- The config file will be of contents like the ones in `llama_stack/distributions/*build.yaml`.
 
 ```
-llama stack build --config llama_stack/templates/starter/build.yaml
+llama stack build --config llama_stack/distributions/starter/build.yaml
 ```
 :::
 

@@ -253,11 +260,11 @@ Podman is supported as an alternative to Docker. Set `CONTAINER_BINARY` to `podm
 To build a container image, you may start off from a template and use the `--image-type container` flag to specify `container` as the build image type.
 
 ```
-llama stack build --template starter --image-type container
+llama stack build --distro starter --image-type container
 ```
 
 ```
-$ llama stack build --template starter --image-type container
+$ llama stack build --distro starter --image-type container
 ...
 Containerfile created successfully in /tmp/tmp.viA3a3Rdsg/ContainerfileFROM python:3.10-slim
 ...

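Once the Containerfile build finishes, the resulting image is started like the pre-built distribution images shown later in this guide. A sketch, assuming the build tagged the image `distribution-starter` (check `docker images` for the actual tag on your machine):

```bash
# Hypothetical image tag and default port; mirrors the docker run pattern used
# for the pre-built distribution images elsewhere in these docs
docker run -p 8321:8321 -v ~/.llama:/root/.llama distribution-starter --port 8321
```
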
@@ -312,7 +319,7 @@ Now, let's start the Llama Stack Distribution Server. You will need the YAML con
 ```
 llama stack run -h
 usage: llama stack run [-h] [--port PORT] [--image-name IMAGE_NAME] [--env KEY=VALUE]
-                       [--image-type {conda,venv}] [--enable-ui]
+                       [--image-type {venv}] [--enable-ui]
                        [config | template]
 
 Start the server for a Llama Stack Distribution. You should have already built (or downloaded) and configured the distribution.

@@ -326,8 +333,8 @@ options:
   --image-name IMAGE_NAME
                         Name of the image to run. Defaults to the current environment (default: None)
   --env KEY=VALUE       Environment variables to pass to the server in KEY=VALUE format. Can be specified multiple times. (default: None)
-  --image-type {conda,venv}
-                        Image Type used during the build. This can be either conda or venv. (default: None)
+  --image-type {venv}
+                        Image Type used during the build. This should be venv. (default: None)
   --enable-ui           Start the UI server (default: False)
 ```
 

@@ -342,9 +349,6 @@ llama stack run ~/.llama/distributions/llamastack-my-local-stack/my-local-stack-
 
 # Start using a venv
 llama stack run --image-type venv ~/.llama/distributions/llamastack-my-local-stack/my-local-stack-run.yaml
-
-# Start using a conda environment
-llama stack run --image-type conda ~/.llama/distributions/llamastack-my-local-stack/my-local-stack-run.yaml
 ```
 
 ```

@@ -10,7 +10,6 @@ The default `run.yaml` files generated by templates are starting points for your
 
 ```yaml
 version: 2
-conda_env: ollama
 apis:
 - agents
 - inference

@@ -6,11 +6,11 @@ This avoids the overhead of setting up a server.
 ```bash
 # setup
 uv pip install llama-stack
-llama stack build --template starter --image-type venv
+llama stack build --distro starter --image-type venv
 ```
 
 ```python
-from llama_stack.distribution.library_client import LlamaStackAsLibraryClient
+from llama_stack.core.library_client import LlamaStackAsLibraryClient
 
 client = LlamaStackAsLibraryClient(
     "starter",

@@ -9,6 +9,7 @@ This section provides an overview of the distributions available in Llama Stack.
 list_of_distributions
 building_distro
+customizing_run_yaml
 starting_llama_stack_server
 importing_as_library
 configuration
 ```

@@ -34,6 +34,13 @@ data:
       provider_type: remote::chromadb
       config:
         url: ${env.CHROMADB_URL:=}
+        kvstore:
+          type: postgres
+          host: ${env.POSTGRES_HOST:=localhost}
+          port: ${env.POSTGRES_PORT:=5432}
+          db: ${env.POSTGRES_DB:=llamastack}
+          user: ${env.POSTGRES_USER:=llamastack}
+          password: ${env.POSTGRES_PASSWORD:=llamastack}
     safety:
     - provider_id: llama-guard
       provider_type: inline::llama-guard

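The added kvstore block resolves each `${env.VAR:=default}` reference at startup, so the Postgres connection can be pointed elsewhere without editing the config. A sketch with placeholder values:

```bash
# Placeholder values; match these to your actual Postgres deployment
export POSTGRES_HOST=db.internal
export POSTGRES_PORT=5432
export POSTGRES_DB=llamastack
export POSTGRES_USER=llamastack
export POSTGRES_PASSWORD=change-me
```
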
@@ -52,7 +52,7 @@ spec:
           value: "${SAFETY_MODEL}"
         - name: TAVILY_SEARCH_API_KEY
           value: "${TAVILY_SEARCH_API_KEY}"
-        command: ["python", "-m", "llama_stack.distribution.server.server", "--config", "/etc/config/stack_run_config.yaml", "--port", "8321"]
+        command: ["python", "-m", "llama_stack.core.server.server", "--config", "/etc/config/stack_run_config.yaml", "--port", "8321"]
         ports:
           - containerPort: 8321
         volumeMounts:

@@ -31,6 +31,13 @@ providers:
       provider_type: remote::chromadb
       config:
         url: ${env.CHROMADB_URL:=}
+        kvstore:
+          type: postgres
+          host: ${env.POSTGRES_HOST:=localhost}
+          port: ${env.POSTGRES_PORT:=5432}
+          db: ${env.POSTGRES_DB:=llamastack}
+          user: ${env.POSTGRES_USER:=llamastack}
+          password: ${env.POSTGRES_PASSWORD:=llamastack}
   safety:
   - provider_id: llama-guard
     provider_type: inline::llama-guard

@@ -56,12 +56,12 @@ Breaking down the demo app, this section will show the core pieces that are used
 ### Setup Remote Inferencing
 Start a Llama Stack server on localhost. Here is an example of how you can do this using the firework.ai distribution:
 ```
-conda create -n stack-fireworks python=3.10
-conda activate stack-fireworks
+uv venv starter --python 3.12
+source starter/bin/activate # On Windows: starter\Scripts\activate
 pip install --no-cache llama-stack==0.2.2
-llama stack build --template fireworks --image-type conda
+llama stack build --distro starter --image-type venv
 export FIREWORKS_API_KEY=<SOME_KEY>
-llama stack run fireworks --port 5050
+llama stack run starter --port 5050
 ```
 
 Ensure the Llama Stack server version is the same as the Kotlin SDK Library for maximum compatibility.

@@ -57,7 +57,7 @@ Make sure you have access to a watsonx API Key. You can get one by referring [wa
 
 ## Running Llama Stack with watsonx
 
-You can do this via Conda (build code), venv or Docker which has a pre-built image.
+You can do this via venv or Docker which has a pre-built image.
 
 ### Via Docker
 

@@ -76,13 +76,3 @@ docker run \
   --env WATSONX_PROJECT_ID=$WATSONX_PROJECT_ID \
   --env WATSONX_BASE_URL=$WATSONX_BASE_URL
 ```
-
-### Via Conda
-
-```bash
-llama stack build --template watsonx --image-type conda
-llama stack run ./run.yaml \
-  --port $LLAMA_STACK_PORT \
-  --env WATSONX_API_KEY=$WATSONX_API_KEY \
-  --env WATSONX_PROJECT_ID=$WATSONX_PROJECT_ID
-```

@@ -114,7 +114,7 @@ podman run --rm -it \
 
 ## Running Llama Stack
 
-Now you are ready to run Llama Stack with TGI as the inference provider. You can do this via Conda (build code) or Docker which has a pre-built image.
+Now you are ready to run Llama Stack with TGI as the inference provider. You can do this via venv or Docker which has a pre-built image.
 
 ### Via Docker
 

@@ -153,7 +153,7 @@ docker run \
   --pull always \
   -p $LLAMA_STACK_PORT:$LLAMA_STACK_PORT \
   -v $HOME/.llama:/root/.llama \
-  -v ./llama_stack/templates/tgi/run-with-safety.yaml:/root/my-run.yaml \
+  -v ./llama_stack/distributions/tgi/run-with-safety.yaml:/root/my-run.yaml \
   llamastack/distribution-dell \
   --config /root/my-run.yaml \
   --port $LLAMA_STACK_PORT \

@@ -164,12 +164,12 @@ docker run \
   --env CHROMA_URL=$CHROMA_URL
 ```
 
-### Via Conda
+### Via venv
 
 Make sure you have done `pip install llama-stack` and have the Llama Stack CLI available.
 
 ```bash
-llama stack build --template dell --image-type conda
+llama stack build --distro dell --image-type venv
 llama stack run dell
   --port $LLAMA_STACK_PORT \
   --env INFERENCE_MODEL=$INFERENCE_MODEL \

@@ -70,7 +70,7 @@ $ llama model list --downloaded
 
 ## Running the Distribution
 
-You can do this via Conda (build code) or Docker which has a pre-built image.
+You can do this via venv or Docker which has a pre-built image.
 
 ### Via Docker
 

@@ -104,12 +104,12 @@ docker run \
   --env SAFETY_MODEL=meta-llama/Llama-Guard-3-1B
 ```
 
-### Via Conda
+### Via venv
 
 Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
 
 ```bash
-llama stack build --template meta-reference-gpu --image-type conda
+llama stack build --distro meta-reference-gpu --image-type venv
 llama stack run distributions/meta-reference-gpu/run.yaml \
   --port 8321 \
   --env INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct

@@ -133,7 +133,7 @@ curl -X DELETE "$NEMO_URL/v1/deployment/model-deployments/meta/llama-3.1-8b-inst
 
 ## Running Llama Stack with NVIDIA
 
-You can do this via Conda or venv (build code), or Docker which has a pre-built image.
+You can do this via venv (build code), or Docker which has a pre-built image.
 
 ### Via Docker
 

@@ -152,24 +152,13 @@ docker run \
   --env NVIDIA_API_KEY=$NVIDIA_API_KEY
 ```
 
-### Via Conda
-
-```bash
-INFERENCE_MODEL=meta-llama/Llama-3.1-8b-Instruct
-llama stack build --template nvidia --image-type conda
-llama stack run ./run.yaml \
-  --port 8321 \
-  --env NVIDIA_API_KEY=$NVIDIA_API_KEY \
-  --env INFERENCE_MODEL=$INFERENCE_MODEL
-```
-
 ### Via venv
 
 If you've set up your local development environment, you can also build the image using your local virtual environment.
 
 ```bash
 INFERENCE_MODEL=meta-llama/Llama-3.1-8b-Instruct
-llama stack build --template nvidia --image-type venv
+llama stack build --distro nvidia --image-type venv
 llama stack run ./run.yaml \
   --port 8321 \
   --env NVIDIA_API_KEY=$NVIDIA_API_KEY \

@@ -100,10 +100,6 @@ The following environment variables can be configured:
 ### Model Configuration
 - `INFERENCE_MODEL`: HuggingFace model for serverless inference
 - `INFERENCE_ENDPOINT_NAME`: HuggingFace endpoint name
-- `OLLAMA_INFERENCE_MODEL`: Ollama model name
-- `OLLAMA_EMBEDDING_MODEL`: Ollama embedding model name
-- `OLLAMA_EMBEDDING_DIMENSION`: Ollama embedding dimension (default: `384`)
-- `VLLM_INFERENCE_MODEL`: vLLM model name
 
 ### Vector Database Configuration
 - `SQLITE_STORE_DIR`: SQLite store directory (default: `~/.llama/distributions/starter`)

@@ -127,47 +123,29 @@ The following environment variables can be configured:
 
 ## Enabling Providers
 
-You can enable specific providers by setting their provider ID to a valid value using environment variables. This is useful when you want to use certain providers or don't have the required API keys.
+You can enable specific providers by setting appropriate environment variables. For example,
 
-### Examples of Enabling Providers
-
-#### Enable FAISS Vector Provider
 ```bash
-export ENABLE_FAISS=faiss
+# self-hosted
+export OLLAMA_URL=http://localhost:11434  # enables the Ollama inference provider
+export VLLM_URL=http://localhost:8000/v1  # enables the vLLM inference provider
+export TGI_URL=http://localhost:8000/v1  # enables the TGI inference provider
+
+# cloud-hosted requiring API key configuration on the server
+export CEREBRAS_API_KEY=your_cerebras_api_key  # enables the Cerebras inference provider
+export NVIDIA_API_KEY=your_nvidia_api_key  # enables the NVIDIA inference provider
+
+# vector providers
+export MILVUS_URL=http://localhost:19530  # enables the Milvus vector provider
+export CHROMADB_URL=http://localhost:8000/v1  # enables the ChromaDB vector provider
+export PGVECTOR_DB=llama_stack_db  # enables the PGVector vector provider
 ```
 
-#### Enable Ollama Models
-```bash
-export ENABLE_OLLAMA=ollama
-```
-
-#### Disable vLLM Models
-```bash
-export VLLM_INFERENCE_MODEL=__disabled__
-```
-
-#### Disable Optional Vector Providers
-```bash
-export ENABLE_SQLITE_VEC=__disabled__
-export ENABLE_CHROMADB=__disabled__
-export ENABLE_PGVECTOR=__disabled__
-```
-
-### Provider ID Patterns
-
-The starter distribution uses several patterns for provider IDs:
-
-1. **Direct provider IDs**: `faiss`, `ollama`, `vllm`
-2. **Environment-based provider IDs**: `${env.ENABLE_SQLITE_VEC:+sqlite-vec}`
-3. **Model-based provider IDs**: `${env.OLLAMA_INFERENCE_MODEL:__disabled__}`
-
-When using the `+` pattern (like `${env.ENABLE_SQLITE_VEC+sqlite-vec}`), the provider is enabled by default and can be disabled by setting the environment variable to `__disabled__`.
-
-When using the `:` pattern (like `${env.OLLAMA_INFERENCE_MODEL:__disabled__}`), the provider is disabled by default and can be enabled by setting the environment variable to a valid value.
+This distribution comes with a default "llama-guard" shield that can be enabled by setting the `SAFETY_MODEL` environment variable to point to an appropriate Llama Guard model id. Use `llama-stack-client models list` to see the list of available models.
 
 ## Running the Distribution
 
-You can run the starter distribution via Docker, Conda, or venv.
+You can run the starter distribution via Docker or venv.
 
 ### Via Docker
 

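The added note about the default shield leaves the enabling step implicit. A sketch, reusing the Llama Guard model id that appears in the meta-reference-gpu hunk earlier in this commit (pick a model id from `llama-stack-client models list` for your own setup):

```bash
# Example model id borrowed from the meta-reference-gpu docs in this same commit
export SAFETY_MODEL=meta-llama/Llama-Guard-3-1B
```
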
@@ -186,12 +164,12 @@ docker run \
   --port $LLAMA_STACK_PORT
 ```
 
-### Via Conda or venv
+### Via venv
 
 Ensure you have configured the starter distribution using the environment variables explained above.
 
 ```bash
-uv run --with llama-stack llama stack build --template starter --image-type <conda|venv> --run
+uv run --with llama-stack llama stack build --distro starter --image-type venv --run
 ```
 
 ## Example Usage

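After `--run` brings the server up, the `llama-stack-client` CLI referenced above is the quickest check that the expected providers and models registered. A sketch, assuming the client is installed and the server is on its default endpoint:

```bash
# Assumes `pip install llama-stack-client` and a local server
llama-stack-client models list
llama-stack-client providers list
```
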
@@ -11,12 +11,6 @@ This is the simplest way to get started. Using Llama Stack as a library means yo
 
 Another simple way to start interacting with Llama Stack is to just spin up a container (via Docker or Podman) which is pre-built with all the providers you need. We provide a number of pre-built images so you can start a Llama Stack server instantly. You can also build your own custom container. Which distribution to choose depends on the hardware you have. See [Selection of a Distribution](selection) for more details.
 
-
-## Conda:
-
-If you have a custom or an advanced setup or you are developing on Llama Stack you can also build a custom Llama Stack server. Using `llama stack build` and `llama stack run` you can build/run a custom Llama Stack server containing the exact combination of providers you wish. We have also provided various templates to make getting started easier. See [Building a Custom Distribution](building_distro) for more details.
-
-
 ## Kubernetes:
 
 If you have built a container image and want to deploy it in a Kubernetes cluster instead of starting the Llama Stack server locally. See [Kubernetes Deployment Guide](kubernetes_deployment) for more details.

@@ -59,10 +59,10 @@ Now let's build and run the Llama Stack config for Ollama.
 We use `starter` as template. By default all providers are disabled, this requires enable ollama by passing environment variables.
 
 ```bash
-llama stack build --template starter --image-type venv --run
+llama stack build --distro starter --image-type venv --run
 ```
 :::
-:::{tab-item} Using `conda`
+:::{tab-item} Using `venv`
 You can use Python to build and run the Llama Stack server, which is useful for testing and development.
 
 Llama Stack uses a [YAML configuration file](../distributions/configuration.md) to specify the stack setup,

@@ -70,7 +70,7 @@ which defines the providers and their settings.
 Now let's build and run the Llama Stack config for Ollama.
 
 ```bash
-llama stack build --template starter --image-type conda --run
+llama stack build --distro starter --image-type venv --run
 ```
 :::
 :::{tab-item} Using a Container

@@ -150,13 +150,7 @@ pip install llama-stack-client
 ```
 :::
 
-:::{tab-item} Install with `conda`
-```bash
-yes | conda create -n stack-client python=3.12
-conda activate stack-client
-pip install llama-stack-client
-```
-:::
-
 ::::
 
 Now let's use the `llama-stack-client` [CLI](../references/llama_stack_client_cli_reference.md) to check the

@@ -16,10 +16,13 @@ as the inference [provider](../providers/inference/index) for a Llama Model.
 ```bash
 ollama run llama3.2:3b --keepalive 60m
 ```
 
 #### Step 2: Run the Llama Stack server
 
+We will use `uv` to run the Llama Stack server.
 ```bash
-uv run --with llama-stack llama stack build --template starter --image-type venv --run
+OLLAMA_URL=http://localhost:11434 \
+uv run --with llama-stack llama stack build --distro starter --image-type venv --run
 ```
 #### Step 3: Run the demo
 Now open up a new terminal and copy the following script into a file named `demo_script.py`.

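Before starting the stack, it can save a round-trip to confirm Ollama is actually serving the model pulled in Step 1. A quick check against Ollama's standard model-listing endpoint:

```bash
# llama3.2:3b from Step 1's `ollama run` should appear in the response
curl -s http://localhost:11434/api/tags
```
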
@@ -1,5 +1,13 @@
-# Agents Providers
+# Agents
+
+## Overview
 
 This section contains documentation for all available providers for the **agents** API.
 
-- [inline::meta-reference](inline_meta-reference.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_meta-reference
+```

@@ -1,7 +1,15 @@
-# Datasetio Providers
+# Datasetio
+
+## Overview
 
 This section contains documentation for all available providers for the **datasetio** API.
 
-- [inline::localfs](inline_localfs.md)
-- [remote::huggingface](remote_huggingface.md)
-- [remote::nvidia](remote_nvidia.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_localfs
+remote_huggingface
+remote_nvidia
+```

@@ -1,6 +1,14 @@
-# Eval Providers
+# Eval
+
+## Overview
 
 This section contains documentation for all available providers for the **eval** API.
 
-- [inline::meta-reference](inline_meta-reference.md)
-- [remote::nvidia](remote_nvidia.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_meta-reference
+remote_nvidia
+```

@@ -1,9 +1,4 @@
-# External Providers Guide
-
-Llama Stack supports external providers that live outside of the main codebase. This allows you to:
-- Create and maintain your own providers independently
-- Share providers with others without contributing to the main codebase
-- Keep provider-specific code separate from the core Llama Stack code
+# Creating External Providers
 
 ## Configuration
 

|
|||
1. **Remote Providers**: Providers that communicate with external services (e.g., cloud APIs)
|
||||
2. **Inline Providers**: Providers that run locally within the Llama Stack process
|
||||
|
||||
## Known External Providers
|
||||
|
||||
Here's a list of known external providers that you can use with Llama Stack:
|
||||
|
||||
| Name | Description | API | Type | Repository |
|
||||
|------|-------------|-----|------|------------|
|
||||
| KubeFlow Training | Train models with KubeFlow | Post Training | Remote | [llama-stack-provider-kft](https://github.com/opendatahub-io/llama-stack-provider-kft) |
|
||||
| KubeFlow Pipelines | Train models with KubeFlow Pipelines | Post Training | Inline **and** Remote | [llama-stack-provider-kfp-trainer](https://github.com/opendatahub-io/llama-stack-provider-kfp-trainer) |
|
||||
| RamaLama | Inference models with RamaLama | Inference | Remote | [ramalama-stack](https://github.com/containers/ramalama-stack) |
|
||||
| TrustyAI LM-Eval | Evaluate models with TrustyAI LM-Eval | Eval | Remote | [llama-stack-provider-lmeval](https://github.com/trustyai-explainability/llama-stack-provider-lmeval) |
|
||||
|
||||
### Remote Provider Specification
|
||||
|
||||
Remote providers are used when you need to communicate with external services. Here's an example for a custom Ollama provider:
|
||||
|
|
@@ -119,9 +103,9 @@ container_image: custom-vector-store:latest # optional
 - `provider_data_validator`: Optional validator for provider data
 - `container_image`: Optional container image to use instead of pip packages
 
-## Required Implementation
+## Required Fields
 
-## All Providers
+### All Providers
 
 All providers must contain a `get_provider_spec` function in their `provider` module. This is a standardized structure that Llama Stack expects and is necessary for getting things such as the config class. The `get_provider_spec` method returns a structure identical to the `adapter`. An example function may look like:
 

@@ -146,7 +130,7 @@ def get_provider_spec() -> ProviderSpec:
     )
 ```
 
-### Remote Providers
+#### Remote Providers
 
 Remote providers must expose a `get_adapter_impl()` function in their module that takes two arguments:
 1. `config`: An instance of the provider's config class

@@ -162,7 +146,7 @@ async def get_adapter_impl(
     return OllamaInferenceAdapter(config)
 ```
 
-### Inline Providers
+#### Inline Providers
 
 Inline providers must expose a `get_provider_impl()` function in their module that takes two arguments:
 1. `config`: An instance of the provider's config class

@@ -189,7 +173,40 @@ Version: 0.1.0
 Location: /path/to/venv/lib/python3.10/site-packages
 ```
 
-## Example using `external_providers_dir`: Custom Ollama Provider
+## Best Practices
+
+1. **Package Naming**: Use the prefix `llama-stack-provider-` for your provider packages to make them easily identifiable.
+
+2. **Version Management**: Keep your provider package versioned and compatible with the Llama Stack version you're using.
+
+3. **Dependencies**: Only include the minimum required dependencies in your provider package.
+
+4. **Documentation**: Include clear documentation in your provider package about:
+   - Installation requirements
+   - Configuration options
+   - Usage examples
+   - Any limitations or known issues
+
+5. **Testing**: Include tests in your provider package to ensure it works correctly with Llama Stack.
+   You can refer to the [integration tests
+   guide](https://github.com/meta-llama/llama-stack/blob/main/tests/integration/README.md) for more
+   information. Execute the test for the Provider type you are developing.
+
+## Troubleshooting
+
+If your external provider isn't being loaded:
+
+1. Check that `module` points to a published pip package with a top level `provider` module including `get_provider_spec`.
+1. Check that the `external_providers_dir` path is correct and accessible.
+2. Verify that the YAML files are properly formatted.
+3. Ensure all required Python packages are installed.
+4. Check the Llama Stack server logs for any error messages - turn on debug logging to get more
+   information using `LLAMA_STACK_LOGGING=all=debug`.
+5. Verify that the provider package is installed in your Python environment if using `external_providers_dir`.
+
+## Examples
+
+### Example using `external_providers_dir`: Custom Ollama Provider
 
 Here's a complete example of creating and using a custom Ollama provider:
 

@@ -241,7 +258,7 @@ external_providers_dir: ~/.llama/providers.d/
 The provider will now be available in Llama Stack with the type `remote::custom_ollama`.
 
 
-## Example using `module`: ramalama-stack
+### Example using `module`: ramalama-stack
 
 [ramalama-stack](https://github.com/containers/ramalama-stack) is a recognized external provider that supports installation via module.
 

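For orientation, the `external_providers_dir` used in the custom_ollama example conventionally holds one YAML spec per provider, grouped by provider type and API; treat the exact tree below as an assumption matching that example:

```bash
# Hypothetical layout under ~/.llama/providers.d/
# remote/
#   inference/
#     custom_ollama.yaml
ls -R ~/.llama/providers.d/
```
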
@@ -266,35 +283,4 @@ additional_pip_packages:
 
 No other steps are required other than `llama stack build` and `llama stack run`. The build process will use `module` to install all of the provider dependencies, retrieve the spec, etc.
 
-The provider will now be available in Llama Stack with the type `remote::ramalama`.
-
-## Best Practices
-
-1. **Package Naming**: Use the prefix `llama-stack-provider-` for your provider packages to make them easily identifiable.
-
-2. **Version Management**: Keep your provider package versioned and compatible with the Llama Stack version you're using.
-
-3. **Dependencies**: Only include the minimum required dependencies in your provider package.
-
-4. **Documentation**: Include clear documentation in your provider package about:
-   - Installation requirements
-   - Configuration options
-   - Usage examples
-   - Any limitations or known issues
-
-5. **Testing**: Include tests in your provider package to ensure it works correctly with Llama Stack.
-   You can refer to the [integration tests
-   guide](https://github.com/meta-llama/llama-stack/blob/main/tests/integration/README.md) for more
-   information. Execute the test for the Provider type you are developing.
-
-## Troubleshooting
-
-If your external provider isn't being loaded:
-
-1. Check that `module` points to a published pip package with a top level `provider` module including `get_provider_spec`.
-1. Check that the `external_providers_dir` path is correct and accessible.
-2. Verify that the YAML files are properly formatted.
-3. Ensure all required Python packages are installed.
-4. Check the Llama Stack server logs for any error messages - turn on debug logging to get more
-   information using `LLAMA_STACK_LOGGING=all=debug`.
-5. Verify that the provider package is installed in your Python environment if using `external_providers_dir`.
+The provider will now be available in Llama Stack with the type `remote::ramalama`.

new file: docs/source/providers/external/external-providers-list.md (+10 lines)

@@ -0,0 +1,10 @@
+# Known External Providers
+
+Here's a list of known external providers that you can use with Llama Stack:
+
+| Name | Description | API | Type | Repository |
+|------|-------------|-----|------|------------|
+| KubeFlow Training | Train models with KubeFlow | Post Training | Remote | [llama-stack-provider-kft](https://github.com/opendatahub-io/llama-stack-provider-kft) |
+| KubeFlow Pipelines | Train models with KubeFlow Pipelines | Post Training | Inline **and** Remote | [llama-stack-provider-kfp-trainer](https://github.com/opendatahub-io/llama-stack-provider-kfp-trainer) |
+| RamaLama | Inference models with RamaLama | Inference | Remote | [ramalama-stack](https://github.com/containers/ramalama-stack) |
+| TrustyAI LM-Eval | Evaluate models with TrustyAI LM-Eval | Eval | Remote | [llama-stack-provider-lmeval](https://github.com/trustyai-explainability/llama-stack-provider-lmeval) |

new file: docs/source/providers/external/index.md (+13 lines)

@@ -0,0 +1,13 @@
+# External Providers
+
+Llama Stack supports external providers that live outside of the main codebase. This allows you to:
+- Create and maintain your own providers independently
+- Share providers with others without contributing to the main codebase
+- Keep provider-specific code separate from the core Llama Stack code
+
+```{toctree}
+:maxdepth: 1
+
+external-providers-list
+external-providers-guide
+```

@@ -1,5 +1,13 @@
-# Files Providers
+# Files
+
+## Overview
 
 This section contains documentation for all available providers for the **files** API.
 
-- [inline::localfs](inline_localfs.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_localfs
+```

@@ -8,7 +8,7 @@ Local filesystem-based file storage provider for managing files and documents lo
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `storage_dir` | `<class 'str'>` | No | PydanticUndefined | Directory to store uploaded files |
+| `storage_dir` | `<class 'str'>` | No | | Directory to store uploaded files |
 | `metadata_store` | `utils.sqlstore.sqlstore.SqliteSqlStoreConfig \| utils.sqlstore.sqlstore.PostgresSqlStoreConfig` | No | sqlite | SQL store configuration for file metadata |
 | `ttl_secs` | `<class 'int'>` | No | 31536000 | |

@@ -1,4 +1,4 @@
-# API Providers Overview
+# API Providers
 
 The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations for the same API. Examples for these include:
 - LLM inference providers (e.g., Meta Reference, Ollama, Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, vLLM, OpenAI, Anthropic, Gemini, WatsonX, etc.),

@@ -12,81 +12,17 @@ Providers come in two flavors:
 
 Importantly, Llama Stack always strives to provide at least one fully inline provider for each API so you can iterate on a fully featured environment locally.
 
-## External Providers
-Llama Stack supports external providers that live outside of the main codebase. This allows you to create and maintain your own providers independently.
-
-```{toctree}
-:maxdepth: 1
-
-external.md
-```
-
-```{include} openai.md
-:start-after: ## OpenAI API Compatibility
-```
-
-## Inference
-Runs inference with an LLM.
-
 ```{toctree}
 :maxdepth: 1
 
+external/index
+openai
 inference/index
-```
-
-## Agents
-Run multi-step agentic workflows with LLMs with tool usage, memory (RAG), etc.
-
-```{toctree}
-:maxdepth: 1
-
 agents/index
-```
-
-## DatasetIO
-Interfaces with datasets and data loaders.
-
-```{toctree}
-:maxdepth: 1
-
 datasetio/index
-```
-
-## Safety
-Applies safety policies to the output at a Systems (not only model) level.
-
-```{toctree}
-:maxdepth: 1
-
 safety/index
-```
-
-## Telemetry
-Collects telemetry data from the system.
-
-```{toctree}
-:maxdepth: 1
-
 telemetry/index
-```
-
-## Vector IO
-
-Vector IO refers to operations on vector databases, such as adding documents, searching, and deleting documents.
-Vector IO plays a crucial role in [Retreival Augmented Generation (RAG)](../..//building_applications/rag), where the vector
-io and database are used to store and retrieve documents for retrieval.
-
-```{toctree}
-:maxdepth: 1
-
 vector_io/index
-```
-
-## Tool Runtime
-Is associated with the ToolGroup resources.
-
-```{toctree}
-:maxdepth: 1
-
 tool_runtime/index
+files/index
 ```

@@ -1,26 +1,34 @@
-# Inference Providers
+# Inference
+
+## Overview
 
 This section contains documentation for all available providers for the **inference** API.
 
-- [inline::meta-reference](inline_meta-reference.md)
-- [inline::sentence-transformers](inline_sentence-transformers.md)
-- [remote::anthropic](remote_anthropic.md)
-- [remote::bedrock](remote_bedrock.md)
-- [remote::cerebras](remote_cerebras.md)
-- [remote::databricks](remote_databricks.md)
-- [remote::fireworks](remote_fireworks.md)
-- [remote::gemini](remote_gemini.md)
-- [remote::groq](remote_groq.md)
-- [remote::hf::endpoint](remote_hf_endpoint.md)
-- [remote::hf::serverless](remote_hf_serverless.md)
-- [remote::llama-openai-compat](remote_llama-openai-compat.md)
-- [remote::nvidia](remote_nvidia.md)
-- [remote::ollama](remote_ollama.md)
-- [remote::openai](remote_openai.md)
-- [remote::passthrough](remote_passthrough.md)
-- [remote::runpod](remote_runpod.md)
-- [remote::sambanova](remote_sambanova.md)
-- [remote::tgi](remote_tgi.md)
-- [remote::together](remote_together.md)
-- [remote::vllm](remote_vllm.md)
-- [remote::watsonx](remote_watsonx.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_meta-reference
+inline_sentence-transformers
+remote_anthropic
+remote_bedrock
+remote_cerebras
+remote_databricks
+remote_fireworks
+remote_gemini
+remote_groq
+remote_hf_endpoint
+remote_hf_serverless
+remote_llama-openai-compat
+remote_nvidia
+remote_ollama
+remote_openai
+remote_passthrough
+remote_runpod
+remote_sambanova
+remote_tgi
+remote_together
+remote_vllm
+remote_watsonx
+```

@@ -1,21 +0,0 @@
-# remote::cerebras-openai-compat
-
-## Description
-
-Cerebras OpenAI-compatible provider for using Cerebras models with OpenAI API format.
-
-## Configuration
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `api_key` | `str \| None` | No | | The Cerebras API key |
-| `openai_compat_api_base` | `<class 'str'>` | No | https://api.cerebras.ai/v1 | The URL for the Cerebras API server |
-
-## Sample Configuration
-
-```yaml
-openai_compat_api_base: https://api.cerebras.ai/v1
-api_key: ${env.CEREBRAS_API_KEY}
-
-```
-

@@ -1,21 +0,0 @@
-# remote::fireworks-openai-compat
-
-## Description
-
-Fireworks AI OpenAI-compatible provider for using Fireworks models with OpenAI API format.
-
-## Configuration
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `api_key` | `str \| None` | No | | The Fireworks API key |
-| `openai_compat_api_base` | `<class 'str'>` | No | https://api.fireworks.ai/inference/v1 | The URL for the Fireworks API server |
-
-## Sample Configuration
-
-```yaml
-openai_compat_api_base: https://api.fireworks.ai/inference/v1
-api_key: ${env.FIREWORKS_API_KEY}
-
-```
-

@@ -1,21 +0,0 @@
-# remote::groq-openai-compat
-
-## Description
-
-Groq OpenAI-compatible provider for using Groq models with OpenAI API format.
-
-## Configuration
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `api_key` | `str \| None` | No | | The Groq API key |
-| `openai_compat_api_base` | `<class 'str'>` | No | https://api.groq.com/openai/v1 | The URL for the Groq API server |
-
-## Sample Configuration
-
-```yaml
-openai_compat_api_base: https://api.groq.com/openai/v1
-api_key: ${env.GROQ_API_KEY}
-
-```
-

@@ -8,7 +8,7 @@ HuggingFace Inference Endpoints provider for dedicated model serving.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `endpoint_name` | `<class 'str'>` | No | PydanticUndefined | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
+| `endpoint_name` | `<class 'str'>` | No | | The name of the Hugging Face Inference Endpoint in the format of '{namespace}/{endpoint_name}' (e.g. 'my-cool-org/meta-llama-3-1-8b-instruct-rce'). Namespace is optional and will default to the user account if not provided. |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
 
 ## Sample Configuration

@@ -8,7 +8,7 @@ HuggingFace Inference API serverless provider for on-demand model inference.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `huggingface_repo` | `<class 'str'>` | No | PydanticUndefined | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
+| `huggingface_repo` | `<class 'str'>` | No | | The model ID of the model on the Hugging Face Hub (e.g. 'meta-llama/Meta-Llama-3.1-70B-Instruct') |
 | `api_token` | `pydantic.types.SecretStr \| None` | No | | Your Hugging Face user access token (will default to locally saved token if not provided) |
 
 ## Sample Configuration

@@ -8,7 +8,7 @@ Text Generation Inference (TGI) provider for HuggingFace model serving.
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `url` | `<class 'str'>` | No | PydanticUndefined | The URL for the TGI serving endpoint |
+| `url` | `<class 'str'>` | No | | The URL for the TGI serving endpoint |
 
 ## Sample Configuration

@@ -1,21 +0,0 @@
-# remote::together-openai-compat
-
-## Description
-
-Together AI OpenAI-compatible provider for using Together models with OpenAI API format.
-
-## Configuration
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `api_key` | `str \| None` | No | | The Together API key |
-| `openai_compat_api_base` | `<class 'str'>` | No | https://api.together.xyz/v1 | The URL for the Together API server |
-
-## Sample Configuration
-
-```yaml
-openai_compat_api_base: https://api.together.xyz/v1
-api_key: ${env.TOGETHER_API_KEY}
-
-```
-

@@ -1,7 +1,15 @@
-# Post_Training Providers
+# Post_Training
+
+## Overview
 
 This section contains documentation for all available providers for the **post_training** API.
 
-- [inline::huggingface](inline_huggingface.md)
-- [inline::torchtune](inline_torchtune.md)
-- [remote::nvidia](remote_nvidia.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_huggingface
+inline_torchtune
+remote_nvidia
+```

@@ -24,6 +24,10 @@ HuggingFace-based post-training provider for fine-tuning models using the Huggin
 | `weight_decay` | `<class 'float'>` | No | 0.01 | |
 | `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
 | `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
+| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
+| `use_reference_model` | `<class 'bool'>` | No | True | |
+| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair'` | No | sigmoid | |
+| `dpo_output_dir` | `<class 'str'>` | No | | |
 
 ## Sample Configuration
 

@@ -31,6 +35,7 @@ HuggingFace-based post-training provider for fine-tuning models using the Huggin
 checkpoint_format: huggingface
 distributed_backend: null
 device: cpu
+dpo_output_dir: ~/.llama/dummy/dpo_output
 
 ```

@@ -1,10 +1,18 @@
-# Safety Providers
+# Safety
+
+## Overview
 
 This section contains documentation for all available providers for the **safety** API.
 
-- [inline::code-scanner](inline_code-scanner.md)
-- [inline::llama-guard](inline_llama-guard.md)
-- [inline::prompt-guard](inline_prompt-guard.md)
-- [remote::bedrock](remote_bedrock.md)
-- [remote::nvidia](remote_nvidia.md)
-- [remote::sambanova](remote_sambanova.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_code-scanner
+inline_llama-guard
+inline_prompt-guard
+remote_bedrock
+remote_nvidia
+remote_sambanova
+```

@@ -1,7 +1,15 @@
-# Scoring Providers
+# Scoring
+
+## Overview
 
 This section contains documentation for all available providers for the **scoring** API.
 
-- [inline::basic](inline_basic.md)
-- [inline::braintrust](inline_braintrust.md)
-- [inline::llm-as-judge](inline_llm-as-judge.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_basic
+inline_braintrust
+inline_llm-as-judge
+```

@@ -1,5 +1,13 @@
-# Telemetry Providers
+# Telemetry
+
+## Overview
 
 This section contains documentation for all available providers for the **telemetry** API.
 
-- [inline::meta-reference](inline_meta-reference.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_meta-reference
+```

@@ -1,10 +1,18 @@
-# Tool_Runtime Providers
+# Tool_Runtime
+
+## Overview
 
 This section contains documentation for all available providers for the **tool_runtime** API.
 
-- [inline::rag-runtime](inline_rag-runtime.md)
-- [remote::bing-search](remote_bing-search.md)
-- [remote::brave-search](remote_brave-search.md)
-- [remote::model-context-protocol](remote_model-context-protocol.md)
-- [remote::tavily-search](remote_tavily-search.md)
-- [remote::wolfram-alpha](remote_wolfram-alpha.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_rag-runtime
+remote_bing-search
+remote_brave-search
+remote_model-context-protocol
+remote_tavily-search
+remote_wolfram-alpha
+```

@@ -1,16 +1,24 @@
-# Vector_Io Providers
+# Vector_Io
+
+## Overview
 
 This section contains documentation for all available providers for the **vector_io** API.
 
-- [inline::chromadb](inline_chromadb.md)
-- [inline::faiss](inline_faiss.md)
-- [inline::meta-reference](inline_meta-reference.md)
-- [inline::milvus](inline_milvus.md)
-- [inline::qdrant](inline_qdrant.md)
-- [inline::sqlite-vec](inline_sqlite-vec.md)
-- [inline::sqlite_vec](inline_sqlite_vec.md)
-- [remote::chromadb](remote_chromadb.md)
-- [remote::milvus](remote_milvus.md)
-- [remote::pgvector](remote_pgvector.md)
-- [remote::qdrant](remote_qdrant.md)
-- [remote::weaviate](remote_weaviate.md)
+## Providers
+
+```{toctree}
+:maxdepth: 1
+
+inline_chromadb
+inline_faiss
+inline_meta-reference
+inline_milvus
+inline_qdrant
+inline_sqlite-vec
+inline_sqlite_vec
+remote_chromadb
+remote_milvus
+remote_pgvector
+remote_qdrant
+remote_weaviate
+```

@@ -41,7 +41,7 @@ See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introducti
 
 | Field | Type | Required | Default | Description |
 |-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | PydanticUndefined | |
+| `db_path` | `<class 'str'>` | No | | |
 | `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend |
 
 ## Sample Configuration

@@ -10,7 +10,7 @@ Please refer to the remote provider documentation.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | PydanticUndefined | |
+| `db_path` | `<class 'str'>` | No | | |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend (SQLite only for now) |
| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
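For reference, an inline Milvus configuration assembled from the fields above might look like the sketch below. This is a hedged example only: the `MILVUS_DB_PATH` env variable name is illustrative, not taken from a shipped sample.

```yaml
# Hedged sketch from the field table above; env variable names are illustrative.
db_path: ${env.MILVUS_DB_PATH:=~/.llama/dummy}/milvus.db
kvstore:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/milvus_registry.db
consistency_level: Strong
```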
@@ -50,12 +50,16 @@ See the [Qdrant documentation](https://qdrant.tech/documentation/) for more deta

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `path` | `<class 'str'>` | No | PydanticUndefined | |
+| `path` | `<class 'str'>` | No | | |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
path: ${env.QDRANT_PATH:=~/.llama/~/.llama/dummy}/qdrant.db
kvstore:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/qdrant_registry.db
```
@@ -205,7 +205,7 @@ See [sqlite-vec's GitHub repo](https://github.com/asg017/sqlite-vec/tree/main) f

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | PydanticUndefined | Path to the SQLite database file |
+| `db_path` | `<class 'str'>` | No | | Path to the SQLite database file |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend (SQLite only for now) |

## Sample Configuration
@@ -10,7 +10,7 @@ Please refer to the sqlite-vec provider documentation.

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `db_path` | `<class 'str'>` | No | PydanticUndefined | Path to the SQLite database file |
+| `db_path` | `<class 'str'>` | No | | Path to the SQLite database file |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend (SQLite only for now) |

## Sample Configuration
@@ -40,7 +40,7 @@ See [Chroma's documentation](https://docs.trychroma.com/docs/overview/introducti

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `url` | `str \| None` | No | PydanticUndefined | |
+| `url` | `str \| None` | No | | |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend |

## Sample Configuration
@@ -111,8 +111,8 @@ For more details on TLS configuration, refer to the [TLS setup guide](https://mi

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
-| `uri` | `<class 'str'>` | No | PydanticUndefined | The URI of the Milvus server |
-| `token` | `str \| None` | No | PydanticUndefined | The token of the Milvus server |
+| `uri` | `<class 'str'>` | No | | The URI of the Milvus server |
+| `token` | `str \| None` | No | | The token of the Milvus server |
| `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend |
| `config` | `dict` | No | {} | This configuration allows additional fields to be passed through to the underlying Milvus client. See the [Milvus](https://milvus.io/docs/install-overview.md) documentation for more details about Milvus in general. |
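For reference, a minimal remote configuration assembled from the fields above could look like this. A hedged sketch only: the `MILVUS_ENDPOINT` and `MILVUS_TOKEN` env variable names are illustrative, not taken from a shipped sample (19530 is Milvus's default port).

```yaml
# Hedged sketch from the field table above; env variable names are illustrative.
uri: ${env.MILVUS_ENDPOINT:=http://localhost:19530}
token: ${env.MILVUS_TOKEN:=}
consistency_level: Strong
kvstore:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/milvus_remote_registry.db
```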
@@ -20,11 +20,15 @@ Please refer to the inline provider documentation.
| `prefix` | `str \| None` | No | | |
| `timeout` | `int \| None` | No | | |
| `host` | `str \| None` | No | | |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | |

## Sample Configuration

```yaml
-api_key: ${env.QDRANT_API_KEY}
+api_key: ${env.QDRANT_API_KEY:=}
kvstore:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/qdrant_registry.db
```

(The `${env.VAR:=default}` form substitutes a default — here an empty string — when the variable is unset, so the stack can start without `QDRANT_API_KEY` being defined.)
@@ -33,9 +33,19 @@ To install Weaviate see the [Weaviate quickstart documentation](https://weaviate
See [Weaviate's documentation](https://weaviate.io/developers/weaviate) for more details about Weaviate in general.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `weaviate_api_key` | `str \| None` | No | | The API key for the Weaviate instance |
| `weaviate_cluster_url` | `str \| None` | No | localhost:8080 | The URL of the Weaviate cluster |
| `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend (SQLite only for now) |

## Sample Configuration

```yaml
weaviate_api_key: null
weaviate_cluster_url: ${env.WEAVIATE_CLUSTER_URL:=localhost:8080}
kvstore:
  type: sqlite
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/dummy}/weaviate_registry.db
```
@@ -366,7 +366,7 @@ The purpose of scoring function is to calculate the score for each example based
Firstly, you can check whether the existing [llama stack scoring functions](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/inline/scoring) fulfill your need. If not, you need to write a new scoring function based on what the benchmark author or other open-source repos describe.

### Add new benchmark into template
-Firstly, you need to add the evaluation dataset associated with your benchmark under the `datasets` resource in the [open-benchmark](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/templates/open-benchmark/run.yaml) template.
+Firstly, you need to add the evaluation dataset associated with your benchmark under the `datasets` resource in the [open-benchmark](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/distributions/open-benchmark/run.yaml) template.

Secondly, you need to add the new benchmark you just created under the `benchmarks` resource in the same template (see the sketch after this list). To add the new benchmark, you need to have:
- `benchmark_id`: identifier of the benchmark
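For illustration, the two resources described above take roughly this shape in `run.yaml`. This is a hedged sketch: the ids, dataset URI, and scoring function below are placeholders, and exact keys may differ between releases.

```yaml
datasets:
  - dataset_id: my_eval_dataset            # illustrative id
    provider_id: huggingface
    purpose: eval/messages-answer
    source:
      type: uri
      uri: huggingface://datasets/your-org/your-dataset?split=test   # placeholder URI
benchmarks:
  - benchmark_id: my-new-benchmark         # identifier of the benchmark
    dataset_id: my_eval_dataset            # must match the dataset registered above
    scoring_functions:
      - basic::equality                    # any registered scoring function
```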
@@ -378,7 +378,7 @@ Secondly, you need to add the new benchmark you just created under the `benchmar

Spin up the llama stack server with the 'open-benchmark' template:
```
-llama stack run llama_stack/templates/open-benchmark/run.yaml
+llama stack run llama_stack/distributions/open-benchmark/run.yaml
```
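Once the server is up, the registered benchmark can be exercised from the client. A hedged sketch — the `eval run-benchmark` subcommand and flag spellings are as we understand the `llama-stack-client` CLI and may differ by version:

```bash
# Hedged sketch: run a registered benchmark against a served model.
# Subcommand and flag names may differ across llama-stack-client versions.
llama-stack-client eval run-benchmark my-new-benchmark \
  --model_id meta-llama/Llama-3.3-70B-Instruct \
  --output_dir /tmp/eval_results \
  --num_examples 10
```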
@@ -19,11 +19,11 @@ You have two ways to install Llama Stack:
cd ~/local
git clone git@github.com:meta-llama/llama-stack.git

-conda create -n myenv python=3.10
-conda activate myenv
+uv venv myenv --python 3.12
+source myenv/bin/activate  # On Windows: myenv\Scripts\activate

cd llama-stack
-$CONDA_PREFIX/bin/pip install -e .
+pip install -e .
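Assuming the editable install completed without errors, a quick sanity check is to invoke the CLI it installs:

```bash
llama --help   # should print the available `llama` subcommands
```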

## Downloading models via CLI
@@ -19,11 +19,11 @@ You have two ways to install Llama Stack:
cd ~/local
git clone git@github.com:meta-llama/llama-stack.git

-conda create -n myenv python=3.10
-conda activate myenv
+uv venv myenv --python 3.12
+source myenv/bin/activate  # On Windows: myenv\Scripts\activate

cd llama-stack
-$CONDA_PREFIX/bin/pip install -e .
+pip install -e .

## `llama` subcommands