# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
---------
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
Fix SQL syntax errors caused by hyphens in Vector DB IDs by sanitizing
table
# (Closes#1332 )
## Test Plan
Test confirms table names with hyphens are properly converted to
underscores
# What does this PR do?
Fix end of files hook for pre-commit.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Run pre-commit without any errors:
```
uv run pre-commit run --all-files
```
Signed-off-by: Sébastien Han <seb@redhat.com>
Summary:
1. The `tools` parameter we construct to pass the inference API is
non-deterministic. As a result, our recordable mocks is flaky as the
ordering change sometimes. This PR makes it so that `tools` ordering is
deterministic and aligned with the order user specified.
2. In recordable mock key generation, client tool's parameter type was
'str' and now is 'string' for some reason. I didn't dig into exactly
why, but just regenerated the fixtures.
Test Plan:
Regenerate mocks:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --record-responses
```
Rerun tests without --record-responses:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B
```
Move unittests to tests/unittests. Gradually nuking tests from
providers/tests/ and unifying them into tests/api (which are e2e tests
using SDK types)
## Test Plan
`pytest -s -v tests/unittests/`
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
It would be better to tell user env var usage in help text.
```
before:
$ llama stack run --help
--port PORT Port to run the server on. Defaults to 8321
after
$ llama stack run --help
--port PORT Port to run the server on. It can also be passed via the env var LLAMA_STACK_PORT. Defaults to 8321
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Summary:
Agent tests shouldn't need to run inference and tools calls repeatedly.
This PR introduces a way to record inference/tool calls and reuse them
in subsequent test runs, which makes the tests more reliable and saves
costs.
Test Plan:
Run when there's no recorded calls created (fails):
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B
```
Run when `--record-responses` to record calls:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --record-responses
```
Run without `--record-responses` again (succeeds):
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B
```
# What does this PR do?
- Modularized `resolve_impls` by extracting helper functions for
validation, sorting, and instantiation.
- Improved readability by introducing `validate_and_prepare_providers`,
`sort_providers_by_dependency`, and `instantiate_providers`.
- Enhanced type safety with explicit type hints (`Tuple`, `Dict`, `Set`,
etc.).
- Fixed potential issues with provider module imports and added error
handling.
- Updated `pyproject.toml` to enforce type checking on `resolver.py`
using `mypy`.
Signed-off-by: Sébastien Han <seb@redhat.com>
- [//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Run the server.
[//]: # (## Documentation)
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
We currently use `max_infer_iters` in 2 different ways
1/ Server: track number of times
2/ Client side: track number of times we send `resume_turn` request
This PR gets rid of the need of (2) and makes server track total number
of times we perform inference within a Turn
**NOTE**
The PR will assume StopReason is set to
- end_of_message: turn is not finished, we could be waiting for client
tool call responses
- end_of_turn: if the entire turn is finished and there's no more things
to be done.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v tests/client-sdk/agents/test_agents.py::test_custom_tool_infinite_loop --inference-model "meta-llama/Llama-3.3-70B-Instruct"
```
[//]: # (## Documentation)
A self-respecting server needs good observability which starts with
configurable logging. Llama Stack had little until now. This PR adds a
`logcat` facility towards that. Callsites look like:
```python
logcat.debug("inference", f"params to ollama: {params}")
```
- the first parameter is a category. there is a static list of
categories in `llama_stack/logcat.py`
- each category can be associated with a log-level which can be
configured via the `LLAMA_STACK_LOGGING` env var.
- a value `LLAMA_STACK_LOGGING=inference=debug;server=info"` does the
obvious thing. there is a special key called `all` which is an alias for
all categories
## Test Plan
Ran with `LLAMA_STACK_LOGGING="all=debug" llama stack run fireworks` and
saw the following:

Hit it with a client-sdk test case and saw this:

# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
- add missing `import`
- add client define
- update `attachments` to `documents`,
40da0d0e76
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
Sorry for the https://github.com/meta-llama/llama-stack/pull/1340 logic,
it will cause issue if in `non-container` env.
```
Using conda <<<<<<<------ environment: stack
+ is_command_available docker
+ command -v docker
+ printf '\033[0;31mError: docker command not found. Is docker installed and in your PATH?\033[0m'
Error: docker command not found. Is docker installed and in your PATH?+ exit 1
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
For
3805604220
```
Fixing docs/source/building_applications/tools.md
check for added large files..............................................Passed
fix end of files.........................................................Passed
Insert license in comments...............................................Passed
ruff.....................................................................Passed
ruff-format..............................................................Passed
blacken-docs.............................................................Passed
uv-lock..................................................................Passed
uv-export................................................................Passed
mypy.....................................................................Passed
Distribution Template Codegen............................................Passed
pre-commit hook(s) made changes.
If you are seeing this message in CI, reproduce locally with: `pre-commit run --all-files`.
To run `pre-commit` as part of git workflow, use `pre-commit install`.
All changes made by hooks:
diff --git a/docs/source/building_applications/tools.md b/docs/source/building_applications/tools.md
index afffbc8..5a569ff 100644
--- a/docs/source/building_applications/tools.md
+++ b/docs/source/building_applications/tools.md
@@ -127,7 +127,7 @@ MCP tools require:
## Adding Custom Tools
-When you want to use tools other than the built-in tools, you can implement a python function and decorate it with `@client_tool`.
+When you want to use tools other than the built-in tools, you can implement a python function and decorate it with `@client_tool`.
To define a custom tool, you need to use the `@client_tool` decorator.
```python
Error: Process completed with exit code 1.
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Summary:
- [new] Agent concepts (session, turn)
- [new] how to write custom tools
- [new] non-streaming API and how to get outputs
- [update] remaining `memory` -> `rag` rename
- [new] note importance of `instructions`
Test Plan:
read
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
`start_venv.sh` lifecycle should be:
025f615868
>>
34e3faa4e8
>>
4684fd3f8d
Finally replaced by `start_stack.sh`
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
---------
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
We want to bundle a bunch of (typically remote) providers in a distro
template and be able to configure them "on the fly" via environment
variables. So far, we have been able to do this with simple env var
replacements. However, sometimes you want to only conditionally enable
providers (because the relevant remote services may not be alive, or
relevant.) This was not possible until now.
To aid this, we add a simple (bash-like) env var replacement
enhancement: `${env.FOO+bar}` evaluates to `bar` if the variable is SET
and evaluates to empty string if it is not. On top of that, we update
our main resolver to ignore any provider whose ID is null.
This allows using the distro like this:
```bash
llama stack run dev --env CHROMADB_URL=http://localhost:6001 --env ENABLE_CHROMADB=1
```
when only Chroma is UP. This disables the other `pgvector` provider in
the run configuration.
## Test Plan
Hard code `chromadb` as the vector io provider inside
`test_vector_io.py` and run:
```bash
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -s -v tests/client-sdk/vector_io/ --embedding-model all-MiniLM-L6-v2
```
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# Summary:
This led to extremely hard to debug messages.
Before:
llama_stack/distribution/library_client.py:275: in request
response = await self._call_non_streaming(
llama_stack/distribution/library_client.py:322: in _call_non_streaming
result = await matched_func(**body)
llama_stack/providers/utils/telemetry/trace_protocol.py:102: in
async_wrapper
result = await method(self, *args, **kwargs)
llama_stack/providers/inline/agents/meta_reference/agents.py:80: in
create_agent
value=agent_config.model_dump_json(),
E AttributeError: 'dict' object has no attribute 'model_dump_json'
After:
E ValueError: Failed to convert parameter {'model':
'meta-llama/Llama-3.1-8B-Instruct', 'instructions': 'You are a helpful
assistant', 'sampling_params': {'strategy': {'type': 'top_p',
'temperature': 0.0001, 'top_p': 0.9}}, 'toolgroups': [{'name':
'builtin::rag'}], 'input_shields': ['meta-llama/Llama-Guard-3-8B'],
'output_shields': ['meta-llama/Llama-Guard-3-8B'],
'enable_session_persistence': False} into <class
'llama_stack.apis.agents.agents.AgentConfig'>: 2 validation errors for
AgentConfig
E toolgroups.0.str
E Input should be a valid string [type=string_type, input_value={'name':
'builtin::rag'}, input_type=dict]
E For further information visit
https://errors.pydantic.dev/2.10/v/string_type
E toolgroups.0.AgentToolGroupWithArgs.args
E Field required [type=missing, input_value={'name': 'builtin::rag'},
input_type=dict]
E For further information visit
https://errors.pydantic.dev/2.10/v/missing
# Test Plan:
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/
--safety-shield meta-llama/Llama-Guard-3-8B
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
21ec67356c/distributions
It should missed the `s`.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Summary:
This test is no longer relevant. We updated the default system prompt in
https://github.com/meta-llama/llama-stack/pull/1310, and system override
behavior is already unit-tested in test_prompt_adapter.py
Test Plan:
read
# What does this PR do?
This PR updates the version in the
[README.md](https://github.com/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/README.md)
to reflect the latest changes in Llama Stack setup.
Previously, using **llama-stack==0.1.0** caused an error when running:
```bash
llama stack build --template ollama --image-type conda
```
Upgrading to llama-stack==0.1.3 resolves this issue.
## Test Plan
- Verified that `llama stack build --template ollama --image-type conda`
works correctly.
---------
Signed-off-by: Surya Prakash Pathak <supathak@redhat.com>
# What does this PR do?
Use @client_tool decorator instead of ClientTool
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
LLAMA_STACK_CONFIG=fireworks pytest -v tests/client-sdk/agents/test_agents.py --inference-model "meta-llama/Llama-3.3-70B-Instruct"
```
<img width="1053" alt="image"
src="https://github.com/user-attachments/assets/d3ade884-ef42-494e-8028-3b09d9ef1978"
/>
[//]: # (## Documentation)
# What does this PR do?
- using `eval` is a security risk
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
- see https://github.com/meta-llama/llama-stack/pull/1327
cc @SLR722 we will need to update the corresponding dataset via
```python
def update_to_json_str():
dataset = datasets.load_dataset(...)
processed_dataset = dataset[split].map(
lambda x: {
"column": json.dumps(eval(x["column"]))
}
)
processed_dataset.push_to_hub(...)
```
[//]: # (## Documentation)
# What does this PR do?
An API spec must talk about Error handling. This was a pretty glaring
omission so far. This PR begins to address it by adding a set of
standard error responses we can attach to all our API calls.
At a future point, we can add specific error types where necessary
(although we should not hurry to do that; it is best done very late.)
## Test Plan
Checked that Stainless SDK generation succeeds.
# What does this PR do?
- Using `eval` on server is a security risk
- Replace `eval` with `json.loads`
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
pytest -v -s --nbval-lax ./llama-stack/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb
```
<img width="747" alt="image"
src="https://github.com/user-attachments/assets/7aff3d95-0b12-4394-b9d0-aeff791eee38"
/>
[//]: # (## Documentation)
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
Since released the `--downloaded` option, so update the related
documents.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
55eb257459/llama_stack/cli/stack/run.py (L22)55eb257459/llama_stack/cli/stack/_build.py (L103)
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>