llama-stack-mirror/llama_stack/distribution
Ashwin Bharambe 96e7ef646f
add support for ${env.FOO_BAR} placeholders in run.yaml files (#439)
# What does this PR do?

We'd like our docker steps to require _ZERO EDITS_ to a YAML file in
order to get going. This is often not possible because depending on the
provider, we do need some configuration input from the user. Environment
variables are the best way to obtain this information.

This PR allows our run.yaml to contain `${env.FOO_BAR}` placeholders
which can be replaced using `docker run -e FOO_BAR=baz` (and similar
`docker compose` equivalent).

## Test Plan

For remote-vllm, example `run.yaml` snippet looks like this:
```yaml
providers:
  inference:
  # serves main inference model
  - provider_id: vllm-0
    provider_type: remote::vllm
    config:
      # NOTE: replace with "localhost" if you are running in "host" network mode
      url: ${env.LLAMA_INFERENCE_VLLM_URL:http://host.docker.internal:5100/v1}
      max_tokens: ${env.MAX_TOKENS:4096}
      api_token: fake
  # serves safety llama_guard model
  - provider_id: vllm-1
    provider_type: remote::vllm
    config:
      # NOTE: replace with "localhost" if you are running in "host" network mode
      url: ${env.LLAMA_SAFETY_VLLM_URL:http://host.docker.internal:5101/v1}
      max_tokens: ${env.MAX_TOKENS:4096}
      api_token: fake
```

`compose.yaml` snippet looks like this:
```yaml
llamastack:
    depends_on:
    - vllm-0
    - vllm-1
      # image: llamastack/distribution-remote-vllm
    image: llamastack/distribution-remote-vllm:test-0.0.52rc3
    volumes:
      - ~/.llama:/root/.llama
      - ~/local/llama-stack/distributions/remote-vllm/run.yaml:/root/llamastack-run-remote-vllm.yaml
    # network_mode: "host"
    environment:
      - LLAMA_INFERENCE_VLLM_URL=${LLAMA_INFERENCE_VLLM_URL:-http://host.docker.internal:5100/v1}
      - LLAMA_INFERENCE_MODEL=${LLAMA_INFERENCE_MODEL:-Llama3.1-8B-Instruct}
      - MAX_TOKENS=${MAX_TOKENS:-4096}
      - SQLITE_STORE_DIR=${SQLITE_STORE_DIR:-$HOME/.llama/distributions/remote-vllm}
      - LLAMA_SAFETY_VLLM_URL=${LLAMA_SAFETY_VLLM_URL:-http://host.docker.internal:5101/v1}
      - LLAMA_SAFETY_MODEL=${LLAMA_SAFETY_MODEL:-Llama-Guard-3-1B}
```
2024-11-13 11:25:58 -08:00
..
routers change schema -> dataset_schema for register_dataset api (#443) 2024-11-13 11:17:46 -05:00
server add support for ${env.FOO_BAR} placeholders in run.yaml files (#439) 2024-11-13 11:25:58 -08:00
store Remove the "ShieldType" concept (#430) 2024-11-12 12:37:24 -08:00
utils Replace colon in path so it doesn't cause issue on Windows 2024-11-11 17:33:53 -08:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
build.py Fix bug in llama stack build; SERVER_DEPENDENCIES were dropped 2024-11-11 20:12:13 -08:00
build_conda_env.sh fix prompt guard (#177) 2024-10-03 11:07:53 -07:00
build_container.sh Change order of building the Docker 2024-11-12 13:09:04 -08:00
client.py Kill "remote" providers and fix testing with a remote stack properly (#435) 2024-11-12 21:51:29 -08:00
common.sh API Updates (#73) 2024-09-17 19:51:35 -07:00
configure.py Kill --name from llama stack build (#340) 2024-10-28 23:07:32 -07:00
configure_container.sh docker: Check for selinux before using --security-opt (#167) 2024-10-02 10:37:41 -07:00
datatypes.py Enable sane naming of registered objects with defaults (#429) 2024-11-12 11:18:05 -08:00
distribution.py Kill "remote" providers and fix testing with a remote stack properly (#435) 2024-11-12 21:51:29 -08:00
inspect.py Remove "routing_table" and "routing_key" concepts for the user (#201) 2024-10-10 10:24:13 -07:00
request_headers.py provider_id => provider_type, adapter_id => adapter_type 2024-10-02 14:05:59 -07:00
resolver.py Kill "remote" providers and fix testing with a remote stack properly (#435) 2024-11-12 21:51:29 -08:00
stack.py Kill "remote" providers and fix testing with a remote stack properly (#435) 2024-11-12 21:51:29 -08:00
start_conda_env.sh API Updates (#73) 2024-09-17 19:51:35 -07:00
start_container.sh Use tags for docker images instead of changing image name 2024-11-12 12:42:30 -08:00