All of the tests from `llama_stack/providers/tests/` are now moved to
`tests/integration`.
I converted the `tools`, `scoring` and `datasetio` tests to use API.
However, `eval` and `post_training` proved to be a bit challenging to
leaving those. I think `post_training` should be relatively
straightforward also.
As part of this, I noticed that `wolfram_alpha` tool wasn't added to
some of our commonly used distros so I added it. I am going to remove a
lot of code duplication from distros next so while this looks like a
one-off right now, it will go away and be there uniformly for all
distros.
Summary:
Test Plan:
added new test
LLAMA_STACK_CONFIG=fireworks pytest -s -v
tests/api/agents/test_agents.py --safety-shield
meta-llama/Llama-Guard-3-8B
# What does this PR do?
- This was missed from previous deprecation:
https://github.com/meta-llama/llama-stack/pull/1186
- Part of https://github.com/meta-llama/llama-stack/issues/1396
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
pytest -v -s --nbval-lax ./llama-stack/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb
```
[//]: # (## Documentation)
# What does this PR do?
- Deprecate allow_turn_resume flag as this is used for staying backward
compat.
- Closes https://github.com/meta-llama/llama-stack/issues/1363
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
LLAMA_STACK_CONFIG=fireworks pytest -v tests/api/agents/test_agents.py --inference-model "meta-llama/Llama-3.3-70B-Instruct" --record-responses
```
<img width="1054" alt="image"
src="https://github.com/user-attachments/assets/d31de2d4-0953-41e1-a71a-7e1579fa351a"
/>
[//]: # (## Documentation)
Continues the refactor of tests.
Tests from `providers/tests` should be considered deprecated. For this
PR, I deleted most of the tests in
- inference
- safety
- agents
since much more comprehensive tests exist in
`tests/integration/{inference,safety,agents}` already.
I moved `test_persistence.py` from agents, but disabled all the tests
since that test needs to be properly migrated.
## Test Plan
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v agents --vision-inference-model=''
/Users/ashwin/homebrew/Caskroom/miniconda/base/envs/toolchain/lib/python3.10/site-packages/pytest_asyncio/plugin.py:208: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"
warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
======================================================================================================= test session starts ========================================================================================================
platform darwin -- Python 3.10.16, pytest-8.3.3, pluggy-1.5.0 -- /Users/ashwin/homebrew/Caskroom/miniconda/base/envs/toolchain/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.10.16', 'Platform': 'macOS-15.3.1-arm64-arm-64bit', 'Packages': {'pytest': '8.3.3', 'pluggy': '1.5.0'}, 'Plugins': {'asyncio': '0.24.0', 'html': '4.1.1', 'metadata': '3.1.1', 'anyio': '4.8.0', 'nbval': '0.11.0'}}
rootdir: /Users/ashwin/local/llama-stack
configfile: pyproject.toml
plugins: asyncio-0.24.0, html-4.1.1, metadata-3.1.1, anyio-4.8.0, nbval-0.11.0
asyncio: mode=strict, default_loop_scope=None
collected 15 items
agents/test_agents.py::test_agent_simple[txt=8B] PASSED
agents/test_agents.py::test_tool_config[txt=8B] PASSED
agents/test_agents.py::test_builtin_tool_web_search[txt=8B] PASSED
agents/test_agents.py::test_builtin_tool_code_execution[txt=8B] PASSED
agents/test_agents.py::test_code_interpreter_for_attachments[txt=8B] PASSED
agents/test_agents.py::test_custom_tool[txt=8B] PASSED
agents/test_agents.py::test_custom_tool_infinite_loop[txt=8B] PASSED
agents/test_agents.py::test_tool_choice[txt=8B] PASSED
agents/test_agents.py::test_rag_agent[txt=8B-builtin::rag/knowledge_search] PASSED
agents/test_agents.py::test_rag_agent[txt=8B-builtin::rag] PASSED
agents/test_agents.py::test_rag_agent_with_attachments[txt=8B] PASSED
agents/test_agents.py::test_rag_and_code_agent[txt=8B] PASSED
agents/test_agents.py::test_create_turn_response[txt=8B] PASSED
agents/test_persistence.py::test_delete_agents_and_sessions SKIPPED (This test needs to be migrated to api / client-sdk world)
agents/test_persistence.py::test_get_agent_turns_and_steps SKIPPED (This test needs to be migrated to api / client-sdk world)
```
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
---------
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
Fix SQL syntax errors caused by hyphens in Vector DB IDs by sanitizing
table
# (Closes#1332 )
## Test Plan
Test confirms table names with hyphens are properly converted to
underscores
# What does this PR do?
Fix end of files hook for pre-commit.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Run pre-commit without any errors:
```
uv run pre-commit run --all-files
```
Signed-off-by: Sébastien Han <seb@redhat.com>
Summary:
1. The `tools` parameter we construct to pass the inference API is
non-deterministic. As a result, our recordable mocks is flaky as the
ordering change sometimes. This PR makes it so that `tools` ordering is
deterministic and aligned with the order user specified.
2. In recordable mock key generation, client tool's parameter type was
'str' and now is 'string' for some reason. I didn't dig into exactly
why, but just regenerated the fixtures.
Test Plan:
Regenerate mocks:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --record-responses
```
Rerun tests without --record-responses:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B
```
Move unittests to tests/unittests. Gradually nuking tests from
providers/tests/ and unifying them into tests/api (which are e2e tests
using SDK types)
## Test Plan
`pytest -s -v tests/unittests/`
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
It would be better to tell user env var usage in help text.
```
before:
$ llama stack run --help
--port PORT Port to run the server on. Defaults to 8321
after
$ llama stack run --help
--port PORT Port to run the server on. It can also be passed via the env var LLAMA_STACK_PORT. Defaults to 8321
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Summary:
Agent tests shouldn't need to run inference and tools calls repeatedly.
This PR introduces a way to record inference/tool calls and reuse them
in subsequent test runs, which makes the tests more reliable and saves
costs.
Test Plan:
Run when there's no recorded calls created (fails):
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B
```
Run when `--record-responses` to record calls:
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B --record-responses
```
Run without `--record-responses` again (succeeds):
```
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/agents/test_agents.py --safety-shield meta-llama/Llama-Guard-3-8B
```
# What does this PR do?
- Modularized `resolve_impls` by extracting helper functions for
validation, sorting, and instantiation.
- Improved readability by introducing `validate_and_prepare_providers`,
`sort_providers_by_dependency`, and `instantiate_providers`.
- Enhanced type safety with explicit type hints (`Tuple`, `Dict`, `Set`,
etc.).
- Fixed potential issues with provider module imports and added error
handling.
- Updated `pyproject.toml` to enforce type checking on `resolver.py`
using `mypy`.
Signed-off-by: Sébastien Han <seb@redhat.com>
- [//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
Run the server.
[//]: # (## Documentation)
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
We currently use `max_infer_iters` in 2 different ways
1/ Server: track number of times
2/ Client side: track number of times we send `resume_turn` request
This PR gets rid of the need of (2) and makes server track total number
of times we perform inference within a Turn
**NOTE**
The PR will assume StopReason is set to
- end_of_message: turn is not finished, we could be waiting for client
tool call responses
- end_of_turn: if the entire turn is finished and there's no more things
to be done.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v tests/client-sdk/agents/test_agents.py::test_custom_tool_infinite_loop --inference-model "meta-llama/Llama-3.3-70B-Instruct"
```
[//]: # (## Documentation)
A self-respecting server needs good observability which starts with
configurable logging. Llama Stack had little until now. This PR adds a
`logcat` facility towards that. Callsites look like:
```python
logcat.debug("inference", f"params to ollama: {params}")
```
- the first parameter is a category. there is a static list of
categories in `llama_stack/logcat.py`
- each category can be associated with a log-level which can be
configured via the `LLAMA_STACK_LOGGING` env var.
- a value `LLAMA_STACK_LOGGING=inference=debug;server=info"` does the
obvious thing. there is a special key called `all` which is an alias for
all categories
## Test Plan
Ran with `LLAMA_STACK_LOGGING="all=debug" llama stack run fireworks` and
saw the following:

Hit it with a client-sdk test case and saw this:

# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
- add missing `import`
- add client define
- update `attachments` to `documents`,
40da0d0e76
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
Sorry for the https://github.com/meta-llama/llama-stack/pull/1340 logic,
it will cause issue if in `non-container` env.
```
Using conda <<<<<<<------ environment: stack
+ is_command_available docker
+ command -v docker
+ printf '\033[0;31mError: docker command not found. Is docker installed and in your PATH?\033[0m'
Error: docker command not found. Is docker installed and in your PATH?+ exit 1
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
For
3805604220
```
Fixing docs/source/building_applications/tools.md
check for added large files..............................................Passed
fix end of files.........................................................Passed
Insert license in comments...............................................Passed
ruff.....................................................................Passed
ruff-format..............................................................Passed
blacken-docs.............................................................Passed
uv-lock..................................................................Passed
uv-export................................................................Passed
mypy.....................................................................Passed
Distribution Template Codegen............................................Passed
pre-commit hook(s) made changes.
If you are seeing this message in CI, reproduce locally with: `pre-commit run --all-files`.
To run `pre-commit` as part of git workflow, use `pre-commit install`.
All changes made by hooks:
diff --git a/docs/source/building_applications/tools.md b/docs/source/building_applications/tools.md
index afffbc8..5a569ff 100644
--- a/docs/source/building_applications/tools.md
+++ b/docs/source/building_applications/tools.md
@@ -127,7 +127,7 @@ MCP tools require:
## Adding Custom Tools
-When you want to use tools other than the built-in tools, you can implement a python function and decorate it with `@client_tool`.
+When you want to use tools other than the built-in tools, you can implement a python function and decorate it with `@client_tool`.
To define a custom tool, you need to use the `@client_tool` decorator.
```python
Error: Process completed with exit code 1.
```
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Summary:
- [new] Agent concepts (session, turn)
- [new] how to write custom tools
- [new] non-streaming API and how to get outputs
- [update] remaining `memory` -> `rag` rename
- [new] note importance of `instructions`
Test Plan:
read
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
`start_venv.sh` lifecycle should be:
025f615868
>>
34e3faa4e8
>>
4684fd3f8d
Finally replaced by `start_stack.sh`
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
---------
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# What does this PR do?
We want to bundle a bunch of (typically remote) providers in a distro
template and be able to configure them "on the fly" via environment
variables. So far, we have been able to do this with simple env var
replacements. However, sometimes you want to only conditionally enable
providers (because the relevant remote services may not be alive, or
relevant.) This was not possible until now.
To aid this, we add a simple (bash-like) env var replacement
enhancement: `${env.FOO+bar}` evaluates to `bar` if the variable is SET
and evaluates to empty string if it is not. On top of that, we update
our main resolver to ignore any provider whose ID is null.
This allows using the distro like this:
```bash
llama stack run dev --env CHROMADB_URL=http://localhost:6001 --env ENABLE_CHROMADB=1
```
when only Chroma is UP. This disables the other `pgvector` provider in
the run configuration.
## Test Plan
Hard code `chromadb` as the vector io provider inside
`test_vector_io.py` and run:
```bash
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -s -v tests/client-sdk/vector_io/ --embedding-model all-MiniLM-L6-v2
```
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
# Summary:
This led to extremely hard to debug messages.
Before:
llama_stack/distribution/library_client.py:275: in request
response = await self._call_non_streaming(
llama_stack/distribution/library_client.py:322: in _call_non_streaming
result = await matched_func(**body)
llama_stack/providers/utils/telemetry/trace_protocol.py:102: in
async_wrapper
result = await method(self, *args, **kwargs)
llama_stack/providers/inline/agents/meta_reference/agents.py:80: in
create_agent
value=agent_config.model_dump_json(),
E AttributeError: 'dict' object has no attribute 'model_dump_json'
After:
E ValueError: Failed to convert parameter {'model':
'meta-llama/Llama-3.1-8B-Instruct', 'instructions': 'You are a helpful
assistant', 'sampling_params': {'strategy': {'type': 'top_p',
'temperature': 0.0001, 'top_p': 0.9}}, 'toolgroups': [{'name':
'builtin::rag'}], 'input_shields': ['meta-llama/Llama-Guard-3-8B'],
'output_shields': ['meta-llama/Llama-Guard-3-8B'],
'enable_session_persistence': False} into <class
'llama_stack.apis.agents.agents.AgentConfig'>: 2 validation errors for
AgentConfig
E toolgroups.0.str
E Input should be a valid string [type=string_type, input_value={'name':
'builtin::rag'}, input_type=dict]
E For further information visit
https://errors.pydantic.dev/2.10/v/string_type
E toolgroups.0.AgentToolGroupWithArgs.args
E Field required [type=missing, input_value={'name': 'builtin::rag'},
input_type=dict]
E For further information visit
https://errors.pydantic.dev/2.10/v/missing
# Test Plan:
LLAMA_STACK_CONFIG=fireworks pytest -s -v tests/client-sdk/
--safety-shield meta-llama/Llama-Guard-3-8B
# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]
21ec67356c/distributions
It should missed the `s`.
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]
[//]: # (## Documentation)
Signed-off-by: reidliu <reid201711@gmail.com>
Co-authored-by: reidliu <reid201711@gmail.com>
Summary:
This test is no longer relevant. We updated the default system prompt in
https://github.com/meta-llama/llama-stack/pull/1310, and system override
behavior is already unit-tested in test_prompt_adapter.py
Test Plan:
read
# What does this PR do?
This PR updates the version in the
[README.md](https://github.com/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/README.md)
to reflect the latest changes in Llama Stack setup.
Previously, using **llama-stack==0.1.0** caused an error when running:
```bash
llama stack build --template ollama --image-type conda
```
Upgrading to llama-stack==0.1.3 resolves this issue.
## Test Plan
- Verified that `llama stack build --template ollama --image-type conda`
works correctly.
---------
Signed-off-by: Surya Prakash Pathak <supathak@redhat.com>
# What does this PR do?
Use @client_tool decorator instead of ClientTool
[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])
## Test Plan
```
LLAMA_STACK_CONFIG=fireworks pytest -v tests/client-sdk/agents/test_agents.py --inference-model "meta-llama/Llama-3.3-70B-Instruct"
```
<img width="1053" alt="image"
src="https://github.com/user-attachments/assets/d3ade884-ef42-494e-8028-3b09d9ef1978"
/>
[//]: # (## Documentation)