Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-10-04 04:04:14 +00:00)
MDX leftover fixes
Commit cfc8357930 (parent aebd728c81)
11 changed files with 96 additions and 110 deletions
@@ -75,7 +75,7 @@ outlined on that page and do not file a public issue.
 In order to accept your pull request, we need you to submit a CLA. You only need
 to do this once to work on any of Meta's open source projects.

-Complete your CLA here: <https://code.facebook.com/cla>
+Complete your CLA here: [https://code.facebook.com/cla](https://code.facebook.com/cla)

 **I'd like to contribute!**

@@ -12,9 +12,9 @@ This guide will walk you through the process of adding a new API provider to Lla


 - Begin by reviewing the [core concepts](../concepts/index.md) of Llama Stack and choose the API your provider belongs to (Inference, Safety, VectorIO, etc.)
-- Determine the provider type ({repopath}`Remote::llama_stack/providers/remote` or {repopath}`Inline::llama_stack/providers/inline`). Remote providers make requests to external services, while inline providers execute implementation locally.
-- Add your provider to the appropriate {repopath}`Registry::llama_stack/providers/registry/`. Specify pip dependencies necessary.
-- Update any distribution {repopath}`Templates::llama_stack/distributions/` `build.yaml` and `run.yaml` files if they should include your provider by default. Run {repopath}`./scripts/distro_codegen.py` if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.
+- Determine the provider type ([Remote](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/remote) or [Inline](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/inline)). Remote providers make requests to external services, while inline providers execute implementation locally.
+- Add your provider to the appropriate [Registry](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/registry/). Specify pip dependencies necessary.
+- Update any distribution [Templates](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/distributions/) `build.yaml` and `run.yaml` files if they should include your provider by default. Run [./scripts/distro_codegen.py](https://github.com/meta-llama/llama-stack/blob/main/scripts/distro_codegen.py) if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.


 Here are some example PRs to help you get started:
@@ -71,9 +71,9 @@ Before running tests, you must have required dependencies installed. This depend

 ### 1. Integration Testing

-Integration tests are located in {repopath}`tests/integration`. These tests use the python client-SDK APIs (from the `llama_stack_client` package) to test functionality. Since these tests use client APIs, they can be run either by pointing to an instance of the Llama Stack server or "inline" by using `LlamaStackAsLibraryClient`.
+Integration tests are located in [tests/integration](https://github.com/meta-llama/llama-stack/tree/main/tests/integration). These tests use the python client-SDK APIs (from the `llama_stack_client` package) to test functionality. Since these tests use client APIs, they can be run either by pointing to an instance of the Llama Stack server or "inline" by using `LlamaStackAsLibraryClient`.

-Consult {repopath}`tests/integration/README.md` for more details on how to run the tests.
+Consult [tests/integration/README.md](https://github.com/meta-llama/llama-stack/blob/main/tests/integration/README.md) for more details on how to run the tests.

 Note that each provider's `sample_run_config()` method (in the configuration class for that provider)
 typically references some environment variables for specifying API keys and the like. You can set these in the environment or pass these via the `--env` flag to the test command.
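To make the `sample_run_config()` note above concrete, here is a minimal sketch of a provider config class whose sample config defers to environment variables. The class name, fields, and the `${env.…}` placeholder syntax are illustrative assumptions, not code from this commit.

```python
from typing import Any

from pydantic import BaseModel, Field


class ExampleRemoteProviderConfig(BaseModel):
    """Hypothetical config for a remote provider (illustration only)."""

    url: str = Field(default="http://localhost:11434", description="Base URL of the remote service")
    api_key: str | None = Field(default=None, description="API key, normally supplied via the environment")

    @classmethod
    def sample_run_config(cls, **kwargs: Any) -> dict[str, Any]:
        # Leave values as ${env.*} placeholders so they can be resolved from the
        # environment, or overridden with `--env` when running the tests.
        return {
            "url": "${env.EXAMPLE_BASE_URL:=http://localhost:11434}",
            "api_key": "${env.EXAMPLE_API_KEY:=}",
        }
```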
@@ -81,9 +81,9 @@ Note that each provider's `sample_run_config()` method (in the configuration cla

 ### 2. Unit Testing

-Unit tests are located in {repopath}`tests/unit`. Provider-specific unit tests are located in {repopath}`tests/unit/providers`. These tests are all run automatically as part of the CI process.
+Unit tests are located in [tests/unit](https://github.com/meta-llama/llama-stack/tree/main/tests/unit). Provider-specific unit tests are located in [tests/unit/providers](https://github.com/meta-llama/llama-stack/tree/main/tests/unit/providers). These tests are all run automatically as part of the CI process.

-Consult {repopath}`tests/unit/README.md` for more details on how to run the tests manually.
+Consult [tests/unit/README.md](https://github.com/meta-llama/llama-stack/blob/main/tests/unit/README.md) for more details on how to run the tests manually.

 ### 3. Additional end-to-end testing

@@ -39,7 +39,7 @@ filtering, sorting, and aggregating vectors.
 - `YourVectorIOAdapter.query_chunks()`
 - `YourVectorIOAdapter.delete_chunks()`
 3. **Add to Registry**: Register your provider in the appropriate registry file.
-- Update {repopath}`llama_stack/providers/registry/vector_io.py` to include your new provider.
+- Update [llama_stack/providers/registry/vector_io.py](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/registry/vector_io.py) to include your new provider.
 ```python
 from llama_stack.providers.registry.specs import InlineProviderSpec
 from llama_stack.providers.registry.api import Api
@@ -65,7 +65,7 @@ InlineProviderSpec(
 5. Add your provider to the `vector_io_providers` fixture dictionary.
 - Please follow the naming convention of `your_vectorprovider_index` and `your_vectorprovider_adapter` as the tests require this to execute properly.
 - Integration Tests
-- Integration tests are located in {repopath}`tests/integration`. These tests use the python client-SDK APIs (from the `llama_stack_client` package) to test functionality.
+- Integration tests are located in [tests/integration](https://github.com/meta-llama/llama-stack/tree/main/tests/integration). These tests use the python client-SDK APIs (from the `llama_stack_client` package) to test functionality.
 - The two set of integration tests are:
 - `tests/integration/vector_io/test_vector_io.py`: This file tests registration, insertion, and retrieval.
 - `tests/integration/vector_io/test_openai_vector_stores.py`: These tests are for OpenAI-compatible vector stores and test the OpenAI API compatibility.
@@ -79,5 +79,5 @@ InlineProviderSpec(
 - If you are adding tests for the `remote` provider you will have to update the `test` group, which is used in the GitHub CI for integration tests.
 - `uv add new_pip_package --group test`
 5. **Update Documentation**: Please update the documentation for end users
-- Generate the provider documentation by running {repopath}`./scripts/provider_codegen.py`.
+- Generate the provider documentation by running [./scripts/provider_codegen.py](https://github.com/meta-llama/llama-stack/blob/main/scripts/provider_codegen.py).
 - Update the autogenerated content in the registry/vector_io.py file with information about your provider. Please see other providers for examples.

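As a companion to the registry snippet shown in the hunks above, here is a rough sketch of what a complete vector_io registry entry could look like. The imports mirror the ones in the diff; the field names and values are assumptions to verify against the existing entries in `llama_stack/providers/registry/vector_io.py`.

```python
from llama_stack.providers.registry.api import Api
from llama_stack.providers.registry.specs import InlineProviderSpec

# Hypothetical inline vector_io provider entry (illustration only).
my_vector_io_spec = InlineProviderSpec(
    api=Api.vector_io,
    provider_type="inline::my-vector-store",
    pip_packages=["my-vector-store-client"],  # pip dependencies the provider needs
    module="llama_stack.providers.inline.vector_io.my_vector_store",
    config_class="llama_stack.providers.inline.vector_io.my_vector_store.config.MyVectorStoreConfig",
)
```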
@@ -86,8 +86,11 @@ options:

 After this step is complete, a file named `<name>-build.yaml` and template file `<name>-run.yaml` will be generated and saved at the output file path specified at the end of the command.

-::::{tab-set}
-:::{tab-item} Building from a template
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';

+<Tabs>
+<TabItem value="template" label="Building from a template">
 To build from alternative API providers, we provide distribution templates for users to get started building a distribution backed by different providers.

 The following command will allow you to see the available templates and their corresponding providers.
@@ -160,8 +163,8 @@ You can now edit ~/.llama/distributions/llamastack-starter/starter-run.yaml and
 ```{tip}
 The generated `run.yaml` file is a starting point for your configuration. For comprehensive guidance on customizing it for your specific needs, infrastructure, and deployment scenarios, see [Customizing Your run.yaml Configuration](customizing_run_yaml.md).
 ```
-:::
-:::{tab-item} Building from Scratch
+</TabItem>
+<TabItem value="scratch" label="Building from Scratch">

 If the provided templates do not fit your use case, you could start off with running `llama stack build` which will allow you to a interactively enter wizard where you will be prompted to enter build configurations.

@@ -190,9 +193,8 @@ Tip: use <TAB> to see options for the providers.

 You can now edit ~/.llama/distributions/llamastack-my-local-stack/my-local-stack-run.yaml and run `llama stack run ~/.llama/distributions/llamastack-my-local-stack/my-local-stack-run.yaml`
 ```
-:::
-
-:::{tab-item} Building from a pre-existing build config file
+</TabItem>
+<TabItem value="config" label="Building from a pre-existing build config file">

 - In addition to templates, you may customize the build to your liking through editing config files and build from config files with the following command.

 - The config file will be of contents like the ones in `llama_stack/distributions/*build.yaml`.
@@ -200,9 +202,8 @@ You can now edit ~/.llama/distributions/llamastack-my-local-stack/my-local-stack
 ```
 llama stack build --config llama_stack/distributions/starter/build.yaml
 ```
-:::
-
-:::{tab-item} Building with External Providers
+</TabItem>
+<TabItem value="external" label="Building with External Providers">

 Llama Stack supports external providers that live outside of the main codebase. This allows you to create and maintain your own providers independently or use community-provided providers.

@@ -251,15 +252,12 @@ llama stack build --config my-external-stack.yaml
 ```

 For more information on external providers, including directory structure, provider types, and implementation requirements, see the [External Providers documentation](../providers/external.md).
-:::
-
-:::{tab-item} Building Container
-
-```{admonition} Podman Alternative
-:class: tip
-
+</TabItem>
+<TabItem value="container" label="Building Container">
+
+:::tip Podman Alternative
 Podman is supported as an alternative to Docker. Set `CONTAINER_BINARY` to `podman` in your environment to use Podman.
-```
+:::

 To build a container image, you may start off from a template and use the `--image-type container` flag to specify `container` as the build image type.

@@ -278,7 +276,8 @@ You can now edit ~/meta-llama/llama-stack/tmp/configs/ollama-run.yaml and run `l
 ```

 Now set some environment variables for the inference model ID and Llama Stack Port and create a local directory to mount into the container's file system.
-```
+
+```bash
 export INFERENCE_MODEL="llama3.2:3b"
 export LLAMA_STACK_PORT=8321
 mkdir -p ~/.llama
@@ -312,9 +311,8 @@ Here are the docker flags and their uses:

 * `--env OLLAMA_URL=http://host.docker.internal:11434`: Configures the URL for the Ollama service

-:::
-::::
+</TabItem>
+</Tabs>


 ### Running your Stack server
@@ -478,12 +478,12 @@ A rule may also specify a condition, either a 'when' or an 'unless',
 with additional constraints as to where the rule applies. The
 constraints supported at present are:

-- 'user with <attr-value> in <attr-name>'
-- 'user with <attr-value> not in <attr-name>'
+- 'user with `<attr-value>` in `<attr-name>`'
+- 'user with `<attr-value>` not in `<attr-name>`'
 - 'user is owner'
 - 'user is not owner'
-- 'user in owners <attr-name>'
-- 'user not in owners <attr-name>'
+- 'user in owners `<attr-name>`'
+- 'user not in owners `<attr-name>`'

 The attributes defined for a user will depend on how the auth
 configuration is defined.
@@ -31,23 +31,21 @@ ollama run llama3.2:3b --keepalive 60m

 Install [uv](https://docs.astral.sh/uv/) to setup your virtual environment

-::::{tab-set}
-:::{tab-item} macOS and Linux
+<Tabs>
+<TabItem value="unix" label="macOS and Linux">
 Use `curl` to download the script and execute it with `sh`:
 ```console
 curl -LsSf https://astral.sh/uv/install.sh | sh
 ```
-:::
-:::{tab-item} Windows
+</TabItem>
+<TabItem value="windows" label="Windows">
 Use `irm` to download the script and execute it with `iex`:

 ```console
 powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
 ```
-:::
-::::
+</TabItem>
+</Tabs>

 Setup your virtual environment.

@@ -58,9 +56,8 @@ source .venv/bin/activate
 ### Step 2: Run Llama Stack
 Llama Stack is a server that exposes multiple APIs, you connect with it using the Llama Stack client SDK.

-::::{tab-set}
-:::{tab-item} Using `venv`
+<Tabs>
+<TabItem value="venv" label="Using venv">
 You can use Python to build and run the Llama Stack server, which is useful for testing and development.

 Llama Stack uses a [YAML configuration file](../distributions/configuration.md) to specify the stack setup,
@@ -71,19 +68,8 @@ We use `starter` as template. By default all providers are disabled, this requir
 ```bash
 llama stack build --distro starter --image-type venv --run
 ```
-:::
-:::{tab-item} Using `venv`
-You can use Python to build and run the Llama Stack server, which is useful for testing and development.
-
-Llama Stack uses a [YAML configuration file](../distributions/configuration.md) to specify the stack setup,
-which defines the providers and their settings.
-Now let's build and run the Llama Stack config for Ollama.
-
-```bash
-llama stack build --distro starter --image-type venv --run
-```
-:::
-:::{tab-item} Using a Container
+</TabItem>
+<TabItem value="container" label="Using a Container">
 You can use a container image to run the Llama Stack server. We provide several container images for the server
 component that works with different inference providers out of the box. For this guide, we will use
 `llamastack/distribution-starter` as the container image. If you'd like to build your own image or customize the
@@ -110,9 +96,8 @@ with `host.containers.internal`.

 The configuration YAML for the Ollama distribution is available at `distributions/ollama/run.yaml`.

-```{tip}
-
-Docker containers run in their own isolated network namespaces on Linux. To allow the container to communicate with services running on the host via `localhost`, you need `--network=host`. This makes the container use the host’s network directly so it can connect to Ollama running on `localhost:11434`.
+:::tip
+Docker containers run in their own isolated network namespaces on Linux. To allow the container to communicate with services running on the host via `localhost`, you need `--network=host`. This makes the container use the host's network directly so it can connect to Ollama running on `localhost:11434`.

 Linux users having issues running the above command should instead try the following:
 ```bash
@@ -126,7 +111,6 @@ docker run -it \
 --env OLLAMA_URL=http://localhost:11434
 ```
 :::
-::::
 You will see output like below:
 ```
 INFO:  Application startup complete.
@@ -137,31 +121,29 @@ Now you can use the Llama Stack client to run inference and build agents!

 You can reuse the server setup or use the [Llama Stack Client](https://github.com/meta-llama/llama-stack-client-python/).
 Note that the client package is already included in the `llama-stack` package.
+</TabItem>
+</Tabs>

 ### Step 3: Run Client CLI

 Open a new terminal and navigate to the same directory you started the server from. Then set up a new or activate your
 existing server virtual environment.

-::::{tab-set}
-:::{tab-item} Reuse Server `venv`
+<Tabs>
+<TabItem value="reuse" label="Reuse Server venv">
 ```bash
 # The client is included in the llama-stack package so we just activate the server venv
 source .venv/bin/activate
 ```
-:::
-:::{tab-item} Install with `venv`
+</TabItem>
+<TabItem value="install" label="Install with venv">
 ```bash
 uv venv client --python 3.12
 source client/bin/activate
 pip install llama-stack-client
 ```
-:::
+</TabItem>
+</Tabs>

-::::

 Now let's use the `llama-stack-client` [CLI](../references/llama_stack_client_cli_reference.md) to check the
 connectivity to the server.
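If you prefer the Python SDK to the CLI for the connectivity check mentioned above, a minimal sketch looks like the following (the default port 8321 comes from the quickstart; treat the exact client methods as assumptions to verify against the llama-stack-client documentation):

```python
from llama_stack_client import LlamaStackClient

# Point the client at the locally running stack server.
client = LlamaStackClient(base_url="http://localhost:8321")

# Listing the registered models is a quick way to confirm connectivity.
for model in client.models.list():
    print(model.identifier)
```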
@@ -237,9 +219,8 @@ OpenAIChatCompletion(
 Note that these demos show the [Python Client SDK](../references/python_sdk_reference/index.md).
 Other SDKs are also available, please refer to the [Client SDK](../index.md#client-sdks) list for the complete options.

-::::{tab-set}
-:::{tab-item} Basic Inference
+<Tabs>
+<TabItem value="inference" label="Basic Inference">
 Now you can run inference using the Llama Stack client SDK.

 #### i. Create the Script
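The inference script itself is not shown in this diff; as a rough sketch, an OpenAI-compatible chat completion against the model from the sample output below might look like this (the exact client call surface is an assumption to verify):

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

model_id = "ollama/llama3.2:3b"  # matches the model shown in the output below

response = client.chat.completions.create(
    model=model_id,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a haiku about coding."},
    ],
)
print(f"Model: {model_id}")
print(response)
```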
@@ -279,9 +260,8 @@ Which will output:
 Model: ollama/llama3.2:3b
 OpenAIChatCompletion(id='chatcmpl-30cd0f28-a2ad-4b6d-934b-13707fc60ebf', choices=[OpenAIChatCompletionChoice(finish_reason='stop', index=0, message=OpenAIChatCompletionChoiceMessageOpenAIAssistantMessageParam(role='assistant', content="Lines of code unfold\nAlgorithms dance with ease\nLogic's gentle kiss", name=None, tool_calls=None, refusal=None, annotations=None, audio=None, function_call=None), logprobs=None)], created=1751732480, model='llama3.2:3b', object='chat.completion', service_tier=None, system_fingerprint='fp_ollama', usage={'completion_tokens': 16, 'prompt_tokens': 37, 'total_tokens': 53, 'completion_tokens_details': None, 'prompt_tokens_details': None})
 ```
-:::
-:::{tab-item} Build a Simple Agent
+</TabItem>
+<TabItem value="agent" label="Build a Simple Agent">
 Next we can move beyond simple inference and build an agent that can perform tasks using the Llama Stack server.
 #### i. Create the Script
 Create a file `agent.py` and add the following code:
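The body of `agent.py` is outside this diff. A rough, unverified sketch of the usual shape of such a script is below; the `Agent` and `AgentEventLogger` names, their arguments, and the streaming pattern are assumptions to check against the llama-stack-client documentation.

```python
from llama_stack_client import Agent, AgentEventLogger, LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Model ID matches the one used in the inference example above.
agent = Agent(
    client,
    model="ollama/llama3.2:3b",
    instructions="You are a helpful assistant.",
)
session_id = agent.create_session("quickstart-session")

# Stream the agent's reply for a single user turn.
turn = agent.create_turn(
    messages=[{"role": "user", "content": "Tell me a bit about yourself."}],
    session_id=session_id,
    stream=True,
)
for event in AgentEventLogger().log(turn):
    event.print()
```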
@@ -449,9 +429,8 @@ uv run python agent.py

 So, that's me in a nutshell!
 ```
-:::
-:::{tab-item} Build a RAG Agent
+</TabItem>
+<TabItem value="rag" label="Build a RAG Agent">

 For our last demo, we can build a RAG agent that can answer questions about the Torchtune project using the documents
 in a vector database.
@@ -554,9 +533,8 @@ uv run python rag_agent.py
 ...
 Overall, DORA is a powerful reinforcement learning algorithm that can learn complex tasks from human demonstrations. However, it requires careful consideration of the challenges and limitations to achieve optimal results.
 ```
-:::
-::::
+</TabItem>
+</Tabs>

 **You're Ready to Build Your Own Apps!**

@@ -11,11 +11,8 @@ HuggingFace-based post-training provider for fine-tuning models using the Huggin
 | `device` | `<class 'str'>` | No | cuda | |
 | `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
 | `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
-| `chat_template` | `<class 'str'>` | No | <|user|>
-{input}
-<|assistant|>
-{output} | |
-| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
+| `chat_template` | `<class 'str'>` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
+| `model_specific_config` | `<class 'dict'>` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
 | `max_seq_length` | `<class 'int'>` | No | 2048 | |
 | `gradient_checkpointing` | `<class 'bool'>` | No | False | |
 | `save_total_limit` | `<class 'int'>` | No | 3 | |
@@ -17,8 +17,8 @@ HuggingFace-based post-training provider for fine-tuning models using the Huggin
 | `device` | `<class 'str'>` | No | cuda | |
 | `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
 | `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
-| `chat_template` | `<class 'str'>` | No | <|user|><br/>{input}<br/><|assistant|><br/>{output} | |
-| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
+| `chat_template` | `<class 'str'>` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
+| `model_specific_config` | `<class 'dict'>` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
 | `max_seq_length` | `<class 'int'>` | No | 2048 | |
 | `gradient_checkpointing` | `<class 'bool'>` | No | False | |
 | `save_total_limit` | `<class 'int'>` | No | 3 | |
@@ -11,11 +11,8 @@ HuggingFace-based post-training provider for fine-tuning models using the Huggin
 | `device` | `<class 'str'>` | No | cuda | |
 | `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
 | `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
-| `chat_template` | `<class 'str'>` | No | <|user|>
-{input}
-<|assistant|>
-{output} | |
-| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
+| `chat_template` | `<class 'str'>` | No | `<|user|>`<br/>`{input}`<br/>`<|assistant|>`<br/>`{output}` | |
+| `model_specific_config` | `<class 'dict'>` | No | `{'trust_remote_code': True, 'attn_implementation': 'sdpa'}` | |
 | `max_seq_length` | `<class 'int'>` | No | 2048 | |
 | `gradient_checkpointing` | `<class 'bool'>` | No | False | |
 | `save_total_limit` | `<class 'int'>` | No | 3 | |
@@ -409,7 +409,7 @@ For more details on TLS configuration, refer to the [TLS setup guide](https://mi
 | `token` | `str \| None` | No | | The token of the Milvus server |
 | `consistency_level` | `<class 'str'>` | No | Strong | The consistency level of the Milvus server |
 | `kvstore` | `utils.kvstore.config.RedisKVStoreConfig \| utils.kvstore.config.SqliteKVStoreConfig \| utils.kvstore.config.PostgresKVStoreConfig \| utils.kvstore.config.MongoDBKVStoreConfig` | No | sqlite | Config for KV store backend |
-| `config` | `dict` | No | {} | This configuration allows additional fields to be passed through to the underlying Milvus client. See the [Milvus](https://milvus.io/docs/install-overview.md) documentation for more details about Milvus in general. |
+| `config` | `dict` | No | `{}` | This configuration allows additional fields to be passed through to the underlying Milvus client. See the [Milvus](https://milvus.io/docs/install-overview.md) documentation for more details about Milvus in general. |

 :::note
 This configuration class accepts additional fields beyond those listed above. You can pass any additional configuration options that will be forwarded to the underlying provider.
@@ -229,17 +229,33 @@ def generate_provider_docs(progress, provider_spec: Any, api_name: str) -> str:

         # Handle multiline default values and escape problematic characters for MDX
         if "\n" in default:
-            default = (
-                default.replace("\n", "<br/>")
-                .replace("<", "&lt;")
-                .replace(">", "&gt;")
-                .replace("{", "&#123;")
-                .replace("}", "&#125;")
-            )
+            # For multiline defaults, escape angle brackets and use <br/> for line breaks
+            lines = default.split("\n")
+            escaped_lines = []
+            for line in lines:
+                if line.strip():
+                    # Escape angle brackets and wrap template tokens in backticks
+                    escaped_line = line.strip().replace("<", "&lt;").replace(">", "&gt;")
+                    if ("{" in escaped_line and "}" in escaped_line) or (
+                        "<|" in escaped_line and "|>" in escaped_line
+                    ):
+                        escaped_lines.append(f"`{escaped_line}`")
+                    else:
+                        escaped_lines.append(escaped_line)
+                else:
+                    escaped_lines.append("")
+            default = "<br/>".join(escaped_lines)
         else:
-            default = (
-                default.replace("<", "&lt;").replace(">", "&gt;").replace("{", "&#123;").replace("}", "&#125;")
-            )
+            # For single line defaults, escape angle brackets first
+            escaped_default = default.replace("<", "&lt;").replace(">", "&gt;")
+            # Then wrap template tokens in backticks
+            if ("{" in escaped_default and "}" in escaped_default) or (
+                "<|" in escaped_default and "|>" in escaped_default
+            ):
+                default = f"`{escaped_default}`"
+            else:
+                # Apply additional escaping for curly braces
+                default = escaped_default.replace("{", "&#123;").replace("}", "&#125;")

         description_text = field_info["description"] or ""
         # Escape curly braces in description text for MDX compatibility
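To see the effect of the new escaping logic in the hunk above, here is a small standalone sketch that mirrors it. The entity replacement strings (`&lt;`, `&gt;`, `&#123;`, `&#125;`) are assumptions about what the script escapes to, not values confirmed by this diff.

```python
def escape_default_for_mdx(default: str) -> str:
    """Mirror of the escaping shown in the diff above (illustrative only)."""
    if "\n" in default:
        escaped_lines = []
        for line in default.split("\n"):
            if line.strip():
                escaped_line = line.strip().replace("<", "&lt;").replace(">", "&gt;")
                if ("{" in escaped_line and "}" in escaped_line) or (
                    "<|" in escaped_line and "|>" in escaped_line
                ):
                    escaped_lines.append(f"`{escaped_line}`")
                else:
                    escaped_lines.append(escaped_line)
            else:
                escaped_lines.append("")
        return "<br/>".join(escaped_lines)
    escaped = default.replace("<", "&lt;").replace(">", "&gt;")
    if ("{" in escaped and "}" in escaped) or ("<|" in escaped and "|>" in escaped):
        return f"`{escaped}`"
    return escaped.replace("{", "&#123;").replace("}", "&#125;")


# Example: the HuggingFace chat_template default from the provider tables above.
print(escape_default_for_mdx("<|user|>\n{input}\n<|assistant|>\n{output}"))
```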