# What does this PR do?
probably related to 3.11 upgrade
^^^^
File
"/opt/homebrew/Caskroom/miniconda/base/envs/myenv/lib/python3.11/site-packages/termcolor/termcolor.py",
line 147, in colored
text = fmt_str % (COLORS[color], text)
~~~~~~^^^^^^^
KeyError: 'light_blue'
## Test Plan
# What does this PR do?
dropped python3.10, updated pyproject and dependencies, and also removed
some blocks of code with special handling for enum.StrEnum
Closes#2458
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR fixes a bug where running a known template by name using:
`llama stack run ollama`
would fail with the following error:
`ValueError: Config file ollama does not exist`
<!-- If resolving an issue, uncomment and update the line below -->
Closes#2291
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
`llama stack run ollama` should work
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
Removes the ability to run llama stack container images through the
llama stack CLI
Closes#2110
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
Run:
```
llama stack run /path/to/run.yaml --image-type container
```
Expected outcome:
```
llama stack run: error: argument --image-type: invalid choice: 'container' (choose from 'conda', 'venv')
```
[//]: # (## Documentation)
# What does this PR do?
Fixes an issue where running `llama stack build --template ollama
--image-type venv --run` fails with a TypeError when validating external
providers directory paths.
The error occurs because `os.path.exists()` is called with `Path(None)`
instead of converting it to a string first. This change ensures
consistent handling of `None` values for `external_providers_dir` across
both build and
[run](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/cli/stack/run.py#L134)
commands by using `str()` conversion before path validation.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```bash
INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template ollama --image-type venv --run
```
Command completes successfully without TypeError
[//]: # (## Documentation)
# What does this PR do?
TSIA
`--enable-ui` to enable
## Test Plan
`llama stack run dev --image-type conda --enable-ui`
`localhost:8322` shows UI
llama stack run dev --image-type conda
`localhost:8322` does not work
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
```
llama stack rm llamastack-test
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
#225
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
# What does this PR do?
The `external_config_dir` configuration parameter is not being passed to
the `BuildConfig` for `LlamaStackAsLibraryClient`.
This prevents _plugin_ providers from being loaded when `llama-stack` is
uses as a library.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
I ran `LlamaStackAsLibraryClient` with a configuration file that
contained `external_config_dir` and related configuration.
It does not work without this change: _external_ providers are not
resolved.
It does work with this change 👍
[//]: # (## Documentation)
closes#2162
# test plan
run `llama stack build --image-name ollama --image-type
<venv/conda/container> --config llama_stack/templates/ollama/build.yaml`
and verify venv | conda | container are built.
# What does this PR do?
fixes#2188
## Test Plan
`INFERENCE_MODEL=meta-llama/Llama-3.3-70B-Instruct llama stack build
--image-name ollama --image-type conda --template ollama --run` without
error
# What does this PR do?
currently the "default" dir for external providers is
`/etc/llama-stack/providers.d`
This dir is not used anywhere nor created.
Switch to a more friendly `~/.llama/providers.d/`
This allows external providers to actually create this dir and/or
populate it upon installation, `pip` cannot create directories in `etc`.
If a user does not specify a dir, default to this one
see https://github.com/containers/ramalama-stack/issues/36
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
We are dropping configuration via CLI flag almost entirely. If any
server configuration has to be tweak it must be done through the server
section in the run.yaml.
This is unfortunately a breaking change for whover was using:
* `--tls-*`
* `--disable_ipv6`
`--port` stays around and get a special treatment since we believe, it's
common for user dev to change port for quick experimentations.
Closes: https://github.com/meta-llama/llama-stack/issues/1076
## Test Plan
Simply do `llama stack run <config>` nothing should break :)
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Mainly tried to cover the entire llama_stack/apis directory, we only
have one left. Some excludes were just noop.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
The goal of this PR is code base modernization.
Schema reflection code needed a minor adjustment to handle UnionTypes
and collections.abc.AsyncIterator. (Both are preferred for latest Python
releases.)
Note to reviewers: almost all changes here are automatically generated
by pyupgrade. Some additional unused imports were cleaned up. The only
change worth of note can be found under `docs/openapi_generator` and
`llama_stack/strong_typing/schema.py` where reflection code was updated
to deal with "newer" types.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
Partial revert of fa68ded07c
this commit ensures users know where their new templates are generated
and how to run the newly built distro locally
discussion on Discord:
1351652390
## Test Plan
Did a local run - let me know if we want any unit testing covering this

## Documentation
Updated "Zero to Hero" guide with new output
---------
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
- Added new Ruff lint rules to detect ambiguous or non-ASCII characters:
- Added per-file ignores where Unicode usage is still required.
- Fixed whatever had to be fixed
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Enhances the user experience in the `llama stack build` command by
adding interactive TAB completion for image type selection. This ensures
the UX consistency with other parts of the CLI that already support tab
completion, such as provider selection, providing a more intuitive and
discoverable interface for users.
<img width="1531" alt="image"
src="https://github.com/user-attachments/assets/12161d45-451d-4820-b34d-7ea4decf810f"
/>
As part of the build process, we now include the generated run.yaml
(based of the provided build configuration file) into the container. We
updated the entrypoint to use this run configuration as well.
Given this simple distribution configuration:
```
# build.yaml
version: '2'
distribution_spec:
description: Use (an external) Ollama server for running LLM inference
providers:
inference:
- remote::ollama
vector_io:
- inline::faiss
safety:
- inline::llama-guard
agents:
- inline::meta-reference
telemetry:
- inline::meta-reference
eval:
- inline::meta-reference
datasetio:
- remote::huggingface
- inline::localfs
scoring:
- inline::basic
- inline::llm-as-judge
- inline::braintrust
tool_runtime:
- remote::brave-search
- remote::tavily-search
- inline::code-interpreter
- inline::rag-runtime
- remote::model-context-protocol
- remote::wolfram-alpha
container_image: "registry.access.redhat.com/ubi9"
image_type: container
image_name: test
```
Build it:
```
llama stack build --config build.yaml
```
Run it:
```
podman run --rm \
-p 8321:8321 \
-e OLLAMA_URL=http://host.containers.internal:11434 \
--name llama-stack-server \
localhost/leseb-test:0.2.2
```
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Fixes a crash that occurred when building a stack as a container image
via the interactive wizard without supplying --template or --config.
- Root cause: template_or_config was None; only the container path
relies on that parameter, which later reaches subprocess.run() and
triggers
`TypeError: expected str, bytes or os.PathLike object, not NoneType.`
- Change: in `_run_stack_build_command_from_build_config` we now fall
back to the freshly‑written build‑spec file whenever both optional
sources are missing. Also adds a spy‑based unit test that asserts a
valid string path is passed to build_image() for container builds.
### Closes#1976
## Test Plan
- New unit test: test_build_path.py. Monkey‑patches build_image,
captures the fourth argument, and verifies it is a real path
- Manual smoke test:
```
llama stack build --image-type container
# answer wizard prompts
```
Build proceeds into Docker without raising the previous TypeError.
## Future Work
Harmonise `build_image` arguments so every image type receives the same
inputs, eliminating this asymmetric special‑case.
# What does this PR do?
Build failures are hard to read, sometimes we get errors like:
```
Error building stack: 'key'
```
Which are difficult to debug without a proper trace.
## Test Plan
If `llama stack build` fails you get a traceback now.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
allow users to specify only the providers they want in the llama stack
build command. If a user wants a non-interactive build, but doesn't want
to use a template, `--providers` allows someone to specify something
like `--providers inference=remote::ollama` for a distro with JUST
ollama
## Test Plan
`llama stack build --providers inference=remote::ollama --image-type
venv`
<img width="1084" alt="Screenshot 2025-03-20 at 9 34 14 AM"
src="https://github.com/user-attachments/assets/502b5fa2-edab-4267-a595-4f987204a6a9"
/>
`llama stack run --image-type venv
/Users/charliedoern/projects/Documents/llama-stack/venv-run.yaml`
<img width="1149" alt="Screenshot 2025-03-20 at 9 35 19 AM"
src="https://github.com/user-attachments/assets/433765f3-6b7f-4383-9241-dad085b69228"
/>
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
current text for 'llama stack build' and 'llama stack run' says that if
no argument is passed to '--image-name' that the active Conda
environment will be used
in reality, the active enviroment is used whether it is from conda,
virtualenv, etc.
## Test Plan
N/A
## Documentation
N/A
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Move around bits. This makes the copies from llama-models _much_ easier
to maintain and ensures we don't entangle meta-reference specific
tidbits into llama-models code even by accident.
Also, kills the meta-reference-quantized-gpu distro and rolls
quantization deps into meta-reference-gpu.
## Test Plan
```
LLAMA_MODELS_DEBUG=1 \
with-proxy llama stack run meta-reference-gpu \
--env INFERENCE_MODEL=meta-llama/Llama-4-Scout-17B-16E-Instruct \
--env INFERENCE_CHECKPOINT_DIR=<DIR> \
--env MODEL_PARALLEL_SIZE=4 \
--env QUANTIZATION_TYPE=fp8_mixed
```
Start a server with and without quantization. Point integration tests to
it using:
```
pytest -s -v tests/integration/inference/test_text_inference.py \
--stack-config http://localhost:8321 --text-model meta-llama/Llama-4-Scout-17B-16E-Instruct
```
# What does this PR do?
This is the second attempt to switch to system packages by default. Now
with a hack to detect conda environment - in which case conda image-type
is used.
Note: Conda will only be used when --image-name is unset *and*
CONDA_DEFAULT_ENV is set. This means that users without conda will
correctly fall back to using system packages when no --image-* arguments
are passed at all.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Uses virtualenv:
```
$ llama stack build --template ollama --image-type venv
$ llama stack run --image-type venv ~/.llama/distributions/ollama/ollama-run.yaml
[...]
Using virtual environment: /home/ec2-user/src/llama-stack/schedule/.local
[...]
```
Uses system packages (virtualenv already initialized):
```
$ llama stack run ~/.llama/distributions/ollama/ollama-run.yaml
[...]
INFO 2025-03-27 20:46:22,882 llama_stack.cli.stack.run:142 server: No image type or image name provided. Assuming environment packages.
[...]
```
Attempt to run from environment packages without necessary packages
installed:
```
$ python -m venv barebones
$ . ./barebones/bin/activate
$ pip install -e . # to install llama command
$ llama stack run ~/.llama/distributions/ollama/ollama-run.yaml
[...]
ModuleNotFoundError: No module named 'fastapi'
```
^ failed as expected because the environment doesn't have necessary
packages installed.
Now install some packages in the new environment:
```
$ pip install fastapi opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp aiosqlite ollama openai datasets faiss-cpu mcp autoevals
$ llama stack run ~/.llama/distributions/ollama/ollama-run.yaml
[...]
Uvicorn running on http://['::', '0.0.0.0']:8321 (Press CTRL+C to quit)
```
Now see if setting CONDA_DEFAULT_ENV will change what happens by
default:
```
$ export CONDA_DEFAULT_ENV=base
$ llama stack run ~/.llama/distributions/ollama/ollama-run.yaml
[...]
Using conda environment: base
Conda environment base does not exist.
[...]
```
---------
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
Fixes multiple issues
1. llama stack build of dependencies was breaking with incompatible
numpy / pandas when importing datasets
Moved the notebook to start a local server instead of using library as a
client. This way the setup is cleaner since its all contained and by
using `uv run --with` we can test both the server setup process too in
CI and release time.
2. The change to [1] surfaced some other issues
- running `llama stack run` was defaulting to conda env name
- provider data was not being managed properly
- Some notebook cells (telemetry for evals) were not updated with latest
changes
Fixed all the issues and update the notebook.
### Test
1. Manually run it all in local env
2. `pytest -v -s --nbval-lax docs/getting_started.ipynb`
# What does this PR do?
A PTY is unnecessary for interactive mode since `subprocess.run()`
already inherits the calling terminal’s stdin, stdout, and stderr,
allowing natural interaction. Using a PTY can introduce unwanted side
effects like buffering issues and inconsistent signal handling. Standard
input/output is sufficient for most interactive programs.
This commit simplifies the command execution by:
1. Removing PTY-based execution in favor of direct subprocess handling
2. Consolidating command execution into a single run_command function
3. Improving error handling with specific subprocess error types
4. Adding proper type hints and documentation
5. Maintaining Ctrl+C handling for graceful interruption
## Test Plan
```
llama stack run
```
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Updates the help text for the `llama model prompt-format` command to
clarify that users should provide a specific model name (e.g.,
Llama3.1-8B, Llama3.2-11B-Vision), not a model family. Removes the
default value and field for `--model-name` to prevent users from
mistakenly thinking a model family name is acceptable. Adds guidance to
run `llama model list` to view valid model names.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Output of `llama model prompt-format -h` Before:
```
(venv) alina@fedora:~/dev/llama/llama-stack$ llama model prompt-format -h
usage: llama model prompt-format [-h] [-m MODEL_NAME]
Show llama model message formats
options:
-h, --help show this help message and exit
-m MODEL_NAME, --model-name MODEL_NAME
Model Family (llama3_1, llama3_X, etc.)
Example:
llama model prompt-format <options>
(venv) alina@fedora:~/dev/llama/llama-stack$ llama model prompt-format --model-name llama3_1
usage: llama model prompt-format [-h] [-m MODEL_NAME]
llama model prompt-format: error: llama3_1 is not a valid Model. Choose one from --
Llama3.1-8B
Llama3.1-70B
Llama3.1-405B
Llama3.1-8B-Instruct
Llama3.1-70B-Instruct
Llama3.1-405B-Instruct
Llama3.2-1B
Llama3.2-3B
Llama3.2-1B-Instruct
Llama3.2-3B-Instruct
Llama3.2-11B-Vision
Llama3.2-90B-Vision
Llama3.2-11B-Vision-Instruct
Llama3.2-90B-Vision-Instruct
```
Output of `llama model prompt-format -h` After:
```
(venv) alina@fedora:~/dev/llama/llama-stack$ llama model prompt-format -h
usage: llama model prompt-format [-h] [-m MODEL_NAME]
Show llama model message formats
options:
-h, --help show this help message and exit
-m MODEL_NAME, --model-name MODEL_NAME
Example: Llama3.1-8B or Llama3.2-11B-Vision, etc
(Run `llama model list` to see a list of valid model names)
Example:
llama model prompt-format <options>
```
Signed-off-by: Alina Ryan <aliryan@redhat.com>
# What does this PR do?
Updated all instances of datetime.now() to use timezone.utc for
consistency in handling time across different systems. This ensures that
timestamps are always in Coordinated Universal Time (UTC), avoiding
issues with time zone discrepancies and promoting uniformity in
time-related data.
Signed-off-by: Sébastien Han <seb@redhat.com>
Reverts meta-llama/llama-stack#1252
The above PR breaks the following invocation:
```bash
llama stack run ~/.llama/distributions/together/together-run.yaml
```
# What does this PR do?
Users prefer to rely on the main CLI rather than invoking the server
through a Python module. Users interact with a high-level CLI rather
than needing to know internal module structures.
Now, when running llama stack run <path-to-config>, the server will
attempt to use the system package or a virtual environment if one is
active.
This also eliminates the current process dependency chain when running
from a virtual environment:
-> llama stack run
-> start_env.sh
-> python -m server...
Signed-off-by: Sébastien Han <seb@redhat.com>
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Run:
```
ollama run llama3.2:3b-instruct-fp16 --keepalive=2m &
llama stack run ./llama_stack/templates/ollama/run.yaml --disable-ipv6
```
Notice that the server starts and shutdowns normally.
[//]: # (## Documentation)
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
This disambiguates "Image" term from "container image" alternative usage
and allows for:
```python
if image_type == LlamaStackImagetype.venv:
...
```
accesses rather than `ImageType.venv.value`
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
Changes enum use to comply with semantic python styling and naming
conventions.
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
Refactor was automated and small so simple run-through of creating
images was done.
Signed-off-by: James Kunstle <jkunstle@redhat.com>
Summary:
+ llama model prompt-format -m Llama3.2-11B-Vision-Instruct
Traceback (most recent call last):
File "/tmp/tmp.gCwyyCcjoA/.venv/bin/llama", line 10, in <module>
sys.exit(main())
File
"/tmp/tmp.gCwyyCcjoA/.venv/lib/python3.10/site-packages/llama_stack/cli/llama.py",
line 50, in main
parser.run(args)
File
"/tmp/tmp.gCwyyCcjoA/.venv/lib/python3.10/site-packages/llama_stack/cli/llama.py",
line 44, in run
args.func(args)
File
"/tmp/tmp.gCwyyCcjoA/.venv/lib/python3.10/site-packages/llama_stack/cli/model/prompt_format.py",
line 59, in _run_model_template_cmd
if args.list:
AttributeError: 'Namespace' object has no attribute 'list'
Test Plan:
llama model prompt-format -m Llama3.2-11B-Vision-Instruct
# What does this PR do?
This commit introduces a new logging system that allows loggers to be
assigned
a category while retaining the logger name based on the file name. The
log
format includes both the logger name and the category, producing output
like:
```
INFO 2025-03-03 21:44:11,323 llama_stack.distribution.stack:103 [core]: Tool_groups: builtin::websearch served by
tavily-search
```
Key features include:
- Category-based logging: Loggers can be assigned a category (e.g.,
"core", "server") when programming. The logger can be loaded like
this: `logger = get_logger(name=__name__, category="server")`
- Environment variable control: Log levels can be configured
per-category using the
`LLAMA_STACK_LOGGING` environment variable. For example:
`LLAMA_STACK_LOGGING="server=DEBUG;core=debug"` enables DEBUG level for
the "server"
and "core" categories.
- `LLAMA_STACK_LOGGING="all=debug"` sets DEBUG level globally for all
categories and
third-party libraries.
This provides fine-grained control over logging levels while maintaining
a clean and
informative log format.
The formatter uses the rich library which provides nice colors better
stack traces like so:
```
ERROR 2025-03-03 21:49:37,124 asyncio:1758 [uncategorized]: unhandled exception during asyncio.run() shutdown
task: <Task finished name='Task-16' coro=<handle_signal.<locals>.shutdown() done, defined at
/Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/server/server.py:146>
exception=UnboundLocalError("local variable 'loop' referenced before assignment")>
╭────────────────────────────────────── Traceback (most recent call last) ───────────────────────────────────────╮
│ /Users/leseb/Documents/AI/llama-stack/llama_stack/distribution/server/server.py:178 in shutdown │
│ │
│ 175 │ │ except asyncio.CancelledError: │
│ 176 │ │ │ pass │
│ 177 │ │ finally: │
│ ❱ 178 │ │ │ loop.stop() │
│ 179 │ │
│ 180 │ loop = asyncio.get_running_loop() │
│ 181 │ loop.create_task(shutdown()) │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
UnboundLocalError: local variable 'loop' referenced before assignment
```
Co-authored-by: Ashwin Bharambe <@ashwinb>
Signed-off-by: Sébastien Han <seb@redhat.com>
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
python -m llama_stack.distribution.server.server --yaml-config ./llama_stack/templates/ollama/run.yaml
INFO 2025-03-03 21:55:35,918 __main__:365 [server]: Using config file: llama_stack/templates/ollama/run.yaml
INFO 2025-03-03 21:55:35,925 __main__:378 [server]: Run configuration:
INFO 2025-03-03 21:55:35,928 __main__:380 [server]: apis:
- agents
```
[//]: # (## Documentation)
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
using `formatter_class=argparse.ArgumentDefaultsHelpFormatter` displays
(default: DEFAULT_VALUE) for each flag. add this formatter class to
build and run to show users some default values like `conda`, `8321`,
etc
## Test Plan
ran locally with following output:
before:
```
llama stack run --help
usage: llama stack run [-h] [--port PORT] [--image-name IMAGE_NAME] [--disable-ipv6] [--env KEY=VALUE] [--tls-keyfile TLS_KEYFILE] [--tls-certfile TLS_CERTFILE]
[--image-type {conda,container,venv}]
config
Start the server for a Llama Stack Distribution. You should have already built (or downloaded) and configured the distribution.
positional arguments:
config Path to config file to use for the run
options:
-h, --help show this help message and exit
--port PORT Port to run the server on. It can also be passed via the env var LLAMA_STACK_PORT. Defaults to 8321
--image-name IMAGE_NAME
Name of the image to run. Defaults to the current conda environment
--disable-ipv6 Disable IPv6 support
--env KEY=VALUE Environment variables to pass to the server in KEY=VALUE format. Can be specified multiple times.
--tls-keyfile TLS_KEYFILE
Path to TLS key file for HTTPS
--tls-certfile TLS_CERTFILE
Path to TLS certificate file for HTTPS
--image-type {conda,container,venv}
Image Type used during the build. This can be either conda or container or venv.
```
after:
```
llama stack run --help
usage: llama stack run [-h] [--port PORT] [--image-name IMAGE_NAME] [--disable-ipv6] [--env KEY=VALUE] [--tls-keyfile TLS_KEYFILE] [--tls-certfile TLS_CERTFILE]
[--image-type {conda,container,venv}]
config
Start the server for a Llama Stack Distribution. You should have already built (or downloaded) and configured the distribution.
positional arguments:
config Path to config file to use for the run
options:
-h, --help show this help message and exit
--port PORT Port to run the server on. It can also be passed via the env var LLAMA_STACK_PORT. (default: 8321)
--image-name IMAGE_NAME
Name of the image to run. Defaults to the current conda environment (default: None)
--disable-ipv6 Disable IPv6 support (default: False)
--env KEY=VALUE Environment variables to pass to the server in KEY=VALUE format. Can be specified multiple times. (default: [])
--tls-keyfile TLS_KEYFILE
Path to TLS key file for HTTPS (default: None)
--tls-certfile TLS_CERTFILE
Path to TLS certificate file for HTTPS (default: None)
--image-type {conda,container,venv}
Image Type used during the build. This can be either conda or container or venv. (default: conda)
```
[//]: # (## Documentation)
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
The method "dict" in class "BaseModel" is deprecated we should use
model_dump instead.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
- From old PR, it use `BUILDS_BASE_DIR` in
`llama_stack/cli/stack/configure.py`(removed).
https://github.com/meta-llama/llama-stack/pull/371/files
- Based on the current `build` code, it should only use
`DISTRIBS_BASE_DIR` to save it.
46b0a404e8/llama_stack/cli/stack/_build.py (L298)46b0a404e8/llama_stack/cli/stack/_build.py (L301)
Pls correct me if I am understand incorrectly.
So it should no need to use in `run` now.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
It would be better to tell user env var usage in help text.
```
before:
$ llama stack run --help
--port PORT Port to run the server on. Defaults to 8321
after
$ llama stack run --help
--port PORT Port to run the server on. It can also be passed via the env var LLAMA_STACK_PORT. Defaults to 8321
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
21ec67356c/distributions
It should missed the `s`.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
19ae4b35d9/llama_stack/cli/model/prompt_format.py (L47)
Based on the comment: `Only Llama 3.1 and 3.2 are supported`, even 3.1,
3.2 are not all models can show it with `prompt-format`, so cannot refer
to `llama model list`,
only refer to list when enter a invalid model, so it would be nice to
help to check the valid models:
```
llama model prompt-format -m Llama3.1-405B-Instruct:bf16-mp8
usage: llama model prompt-format [-h] [-m MODEL_NAME] [-l]
llama model prompt-format: error: Llama3.1-405B-Instruct:bf16-mp8 is not a valid Model <<<<---. Choose one from --
Llama3.1-8B
Llama3.1-70B
Llama3.1-405B
Llama3.1-8B-Instruct
Llama3.1-70B-Instruct
Llama3.1-405B-Instruct
Llama3.2-1B
Llama3.2-3B
Llama3.2-1B-Instruct
Llama3.2-3B-Instruct
Llama3.2-11B-Vision
Llama3.2-90B-Vision
Llama3.2-11B-Vision-Instruct
Llama3.2-90B-Vision-Instruct
before:
$ llama model prompt-format --help
usage: llama model prompt-format [-h] [-m MODEL_NAME]
Show llama model message formats
options:
-h, --help show this help message and exit
-m MODEL_NAME, --model-name MODEL_NAME
Model Family (llama3_1, llama3_X, etc.)
Example:
llama model prompt-format <options>
after:
$ llama model prompt-format --help
usage: llama model prompt-format [-h] [-m MODEL_NAME] [-l]
Show llama model message formats
options:
-h, --help show this help message and exit
-m MODEL_NAME, --model-name MODEL_NAME
Model Family (llama3_1, llama3_X, etc.)
-l, --list List the valid supported models
Example:
llama model prompt-format <options>
$ llama model prompt-format -l
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Model ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Llama3.1-8B │
├──────────────────────────────┤
│ Llama3.1-70B │
├──────────────────────────────┤
│ Llama3.1-405B │
├──────────────────────────────┤
│ Llama3.1-8B-Instruct │
├──────────────────────────────┤
│ Llama3.1-70B-Instruct │
├──────────────────────────────┤
│ Llama3.1-405B-Instruct │
├──────────────────────────────┤
│ Llama3.2-1B │
├──────────────────────────────┤
│ Llama3.2-3B │
├──────────────────────────────┤
│ Llama3.2-1B-Instruct │
├──────────────────────────────┤
│ Llama3.2-3B-Instruct │
├──────────────────────────────┤
│ Llama3.2-11B-Vision │
├──────────────────────────────┤
│ Llama3.2-90B-Vision │
├──────────────────────────────┤
│ Llama3.2-11B-Vision-Instruct │
├──────────────────────────────┤
│ Llama3.2-90B-Vision-Instruct │
└──────────────────────────────┘
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
---------
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>