Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-14 12:22:37 +00:00)
readme
commit 29c8edb4f6 (parent 5ea36b0274)
2 changed files with 34 additions and 1 deletion
distributions/meta-reference-gpu/README.md (new file, 33 lines)
@@ -0,0 +1,33 @@
# Meta Reference Distribution
The `llamastack/distribution-meta-reference-gpu` distribution consists of the following provider configurations.
| **API** | **Inference** | **Agents** | **Memory** | **Safety** | **Telemetry** |
|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |---------------- |
| **Provider(s)** | meta-reference | meta-reference | meta-reference, remote::pgvector, remote::chroma | meta-reference | meta-reference |
### Start the Distribution (Single Node GPU)
> [!NOTE]
> This assumes you have access to a GPU, since this distribution runs inference locally on it.
> [!NOTE]
> For GPU inference, you need to set the following environment variable to point to the local directory containing your model checkpoints, and enable GPU access when starting the docker container.
```
export LLAMA_CHECKPOINT_DIR=~/.llama
```
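
Before starting the container, it can be worth confirming that docker can actually see the GPU. The following is a generic check (not specific to this distribution) and assumes the NVIDIA Container Toolkit is installed and configured:

```
# Generic sanity check: the NVIDIA runtime should expose the GPU inside containers
docker run --rm --gpus=all ubuntu nvidia-smi
```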
> [!NOTE]
> `~/.llama` should be the path to the directory containing the downloaded Llama model weights.
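
As a quick sanity check before mounting this directory, you can confirm it actually contains model weights. The exact layout depends on how the models were downloaded, so treat the paths below as illustrative:

```
# The directory that will be mounted into the container
ls -lh ~/.llama

# Checkpoints typically sit in a subdirectory; adjust the path to your layout
ls -lh ~/.llama/checkpoints
```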
To download and run a pre-built docker container, you may use the following command:
```
docker run -it -p 5000:5000 -v ~/.llama:/root/.llama --gpus=all llamastack/llamastack-local-gpu
```
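
If the server does not come up, two standard docker commands (nothing Llama Stack specific) can help confirm the container's status and surface startup errors:

```
# Confirm the container started from the expected image is running
docker ps --filter "ancestor=llamastack/llamastack-local-gpu"

# Follow the server logs to watch for startup errors
docker logs -f $(docker ps -q --filter "ancestor=llamastack/llamastack-local-gpu")
```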
### Alternative (Build and start distribution locally via conda)
- You may check out the [Getting Started](../../docs/getting_started.md) guide for more details on starting up a meta-reference distribution; a rough sketch of that flow is shown below.
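
For orientation, the conda flow described in that guide roughly follows a build, configure, run sequence with the `llama` CLI. The distribution name below is illustrative, and exact prompts and flags may differ between versions, so treat this as a sketch rather than the canonical steps:

```
# Build a distribution from scratch (interactive prompts ask which providers to use)
llama stack build

# Configure the resulting distribution ("my-local-stack" is an illustrative name)
llama stack configure my-local-stack

# Start the Llama Stack server
llama stack run my-local-stack --port 5000
```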
@@ -49,7 +49,7 @@ docker run -it -p 5000:5000 -v ~/.llama:/root/.llama --gpus=all llamastack/llama
 ```

 > [!TIP]
-> Pro Tip: We may use `docker compose up` for starting up a distribution with remote providers (e.g. TGI) using [llamastack-local-cpu](https://hub.docker.com/repository/docker/llamastack/llamastack-local-cpu/general). You can checkout [these scripts](../llama_stack/distribution/docker/README.md) to help you get started.
+> Pro Tip: We may use `docker compose up` for starting up a distribution with remote providers (e.g. TGI) using [llamastack-local-cpu](https://hub.docker.com/repository/docker/llamastack/llamastack-local-cpu/general). You can checkout [these scripts](../distributions/) to help you get started.

 #### Build->Configure->Run Llama Stack server via conda
 You may also build a LlamaStack distribution from scratch, configure it, and start running the distribution. This is useful for developing on LlamaStack.