Change name of build for less confusion

Ashwin Bharambe 2024-09-03 18:48:29 -07:00
parent fb3c4566ce
commit 0d619b9f8e

@@ -309,16 +309,16 @@ To install a distribution, we run a simple command providing 2 inputs:
 - **Distribution Id** of the distribution that we want to install ( as obtained from the list-distributions command )
 - A **Name** for the specific build and configuration of this distribution.
-Let's imagine you are working with a 8B-Instruct model. The following command will build a package (in the form of a Conda environment) _and_ configure it. As part of the configuration, you will be asked for some inputs (model_id, max_seq_len, etc.)
+Let's imagine you are working with a 8B-Instruct model. The following command will build a package (in the form of a Conda environment) _and_ configure it. As part of the configuration, you will be asked for some inputs (model_id, max_seq_len, etc.) Since we are working with a 8B model, we will name our build `8b-instruct` to help us remember the config.
 ```
-llama stack build local --name llama-8b
+llama stack build local --name 8b-instruct
 ```
 Once it runs successfully , you should see some outputs in the form:
 ```
-$ llama stack build local --name llama-8b
+$ llama stack build local --name 8b-instruct
 ....
 ....
 Successfully installed cfgv-3.4.0 distlib-0.3.8 identify-2.6.0 libcst-1.4.0 llama_toolchain-0.0.2 moreorless-0.4.0 nodeenv-1.9.1 pre-commit-3.8.0 stdlibs-2024.5.15 toml-0.10.2 tomlkit-0.13.0 trailrunner-1.4.0 ufmt-2.7.0 usort-1.0.8 virtualenv-20.26.3
@@ -328,17 +328,17 @@ Successfully setup conda environment. Configuring build...
 ...
 ...
-YAML configuration has been written to ~/.llama/builds/local/conda/llama-8b.yaml
+YAML configuration has been written to ~/.llama/builds/local/conda/8b-instruct.yaml
 ```
 You can re-configure this distribution by running:
 ```
-llama stack configure local --name llama-8b
+llama stack configure local --name 8b-instruct
 ```
 Here is an example run of how the CLI will guide you to fill the configuration
 ```
-$ llama stack configure local --name llama-8b
+$ llama stack configure local --name 8b-instruct
 Configuring API: inference (meta-reference)
 Enter value for model (required): Meta-Llama3.1-8B-Instruct
@@ -358,7 +358,7 @@ Entering sub-configuration for prompt_guard_shield:
 Enter value for model (required): Prompt-Guard-86M
 ...
 ...
-YAML configuration has been written to ~/.llama/builds/local/conda/llama-8b.yaml
+YAML configuration has been written to ~/.llama/builds/local/conda/8b-instruct.yaml
 ```
 As you can see, we did basic configuration above and configured:
@@ -377,12 +377,12 @@ Now lets start Llama Stack server.
 You need the YAML configuration file which was written out at the end by the `llama stack build` step.
 ```
-llama stack run local --name llama-8b --port 5000
+llama stack run local --name 8b-instruct --port 5000
 ```
 You should see the Stack server start and print the APIs that it is supporting,
 ```
-$ llama stack run local --name llama-8b --port 5000
+$ llama stack run local --name 8b-instruct --port 5000
 > initializing model parallel with size 1
 > initializing ddp with size 1
@@ -414,7 +414,7 @@ INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit)
 > [!NOTE]
-> Configuration is in `~/.llama/builds/local/conda/llama-8b.yaml`. Feel free to increase `max_seq_len`.
+> Configuration is in `~/.llama/builds/local/conda/8b-instruct.yaml`. Feel free to increase `max_seq_len`.
 > [!IMPORTANT]
 > The "local" distribution inference server currently only supports CUDA. It will not work on Apple Silicon machines.