mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-07-29 15:23:51 +00:00)

commit 0d619b9f8e (parent fb3c4566ce): Change name of build for less confusion

1 changed file with 10 additions and 10 deletions
@@ -309,16 +309,16 @@ To install a distribution, we run a simple command providing 2 inputs:
 - **Distribution Id** of the distribution that we want to install (as obtained from the `list-distributions` command; see the sketch below)
 - A **Name** for the specific build and configuration of this distribution.
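
For the first input, a minimal sketch of obtaining the Distribution Id with the `list-distributions` command mentioned above (here the id is `local`, which the rest of this walkthrough uses):

```
# List the available distributions and note the id of the one you want,
# e.g. `local` in the commands below.
llama stack list-distributions
```
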
-Let's imagine you are working with an 8B-Instruct model. The following command will build a package (in the form of a Conda environment) _and_ configure it. As part of the configuration, you will be asked for some inputs (model_id, max_seq_len, etc.).
+Let's imagine you are working with an 8B-Instruct model. The following command will build a package (in the form of a Conda environment) _and_ configure it. As part of the configuration, you will be asked for some inputs (model_id, max_seq_len, etc.). Since we are working with an 8B model, we will name our build `8b-instruct` to help us remember the config.

 ```
-llama stack build local --name llama-8b
+llama stack build local --name 8b-instruct
 ```

 Once it runs successfully, you should see output of the form:

 ```
-$ llama stack build local --name llama-8b
+$ llama stack build local --name 8b-instruct
 ....
 ....
 Successfully installed cfgv-3.4.0 distlib-0.3.8 identify-2.6.0 libcst-1.4.0 llama_toolchain-0.0.2 moreorless-0.4.0 nodeenv-1.9.1 pre-commit-3.8.0 stdlibs-2024.5.15 toml-0.10.2 tomlkit-0.13.0 trailrunner-1.4.0 ufmt-2.7.0 usort-1.0.8 virtualenv-20.26.3

@@ -328,17 +328,17 @@ Successfully setup conda environment. Configuring build...
 ...
 ...

-YAML configuration has been written to ~/.llama/builds/local/conda/llama-8b.yaml
+YAML configuration has been written to ~/.llama/builds/local/conda/8b-instruct.yaml
 ```

 You can re-configure this distribution by running:
 ```
-llama stack configure local --name llama-8b
+llama stack configure local --name 8b-instruct
 ```

 Here is an example run of how the CLI will guide you to fill in the configuration:
 ```
-$ llama stack configure local --name llama-8b
+$ llama stack configure local --name 8b-instruct

 Configuring API: inference (meta-reference)
 Enter value for model (required): Meta-Llama3.1-8B-Instruct

@@ -358,7 +358,7 @@ Entering sub-configuration for prompt_guard_shield:
 Enter value for model (required): Prompt-Guard-86M
 ...
 ...
-YAML configuration has been written to ~/.llama/builds/local/conda/llama-8b.yaml
+YAML configuration has been written to ~/.llama/builds/local/conda/8b-instruct.yaml
 ```
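
The generated file records the values you enter during configuration. A hypothetical sketch of what `~/.llama/builds/local/conda/8b-instruct.yaml` might look like after the run above (the field names and nesting are illustrative, not the exact schema):

```yaml
# Illustrative sketch only; the real file written by `llama stack build` may differ.
inference:
  provider: meta-reference
  model: Meta-Llama3.1-8B-Instruct
  max_seq_len: 4096   # see the note near the end about increasing this
safety:
  prompt_guard_shield:
    model: Prompt-Guard-86M
```

If the values need to change later, re-running `llama stack configure local --name 8b-instruct` regenerates this file from your new answers.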

 As you can see, we did basic configuration above and configured:

@@ -377,12 +377,12 @@ Now let’s start Llama Stack server.
 You need the YAML configuration file which was written out at the end by the `llama stack build` step.

 ```
-llama stack run local --name llama-8b --port 5000
+llama stack run local --name 8b-instruct --port 5000
 ```
 You should see the Stack server start and print the APIs that it supports:

 ```
-$ llama stack run local --name llama-8b --port 5000
+$ llama stack run local --name 8b-instruct --port 5000

 > initializing model parallel with size 1
 > initializing ddp with size 1

@@ -414,7 +414,7 @@ INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit)


 > [!NOTE]
-> Configuration is in `~/.llama/builds/local/conda/llama-8b.yaml`. Feel free to increase `max_seq_len`.
+> Configuration is in `~/.llama/builds/local/conda/8b-instruct.yaml`. Feel free to increase `max_seq_len`.

 > [!IMPORTANT]
 > The "local" distribution inference server currently only supports CUDA. It will not work on Apple Silicon machines.
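
Once the server is up, you can sanity-check it from another terminal. A minimal sketch, assuming an inference chat-completion route is among the APIs the server prints at startup (verify the exact path and payload shape against that printed route list):

```
# Hypothetical request; adjust the route to match what the server actually prints.
curl -X POST http://localhost:5000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{"model": "Meta-Llama3.1-8B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'
```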