mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-08 04:54:38 +00:00

History

Xi Yan 0fe071764f feat(1/n): api: unify agents for handling server & client tools (#1178 ) # Problem Our current Agent framework has discrepancies in definition on how we handle server side and client side tools. 1. Server Tools: a single Turn is returned including `ToolExecutionStep` in agenst 2. Client Tools: `create_agent_turn` is called in loop with client agent lib yielding the agent chunk `ad6ffc63df/src/llama_stack_client/lib/agents/agent.py (L186-L211)` This makes it inconsistent to work with server & client tools. It also complicates the logs to telemetry to get information about agents turn / history for observability. #### Principle The same `turn_id` should be used to represent the steps required to complete a user message including client tools. ## Solution 1. `AgentTurnResponseEventType.turn_awaiting_input` status to indicate that the current turn is not completed, and awaiting tool input 2. `continue_agent_turn` endpoint to update agent turn with client's tool response. # What does this PR do? - Skeleton API as example ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] - Just API update, no functionality change ``` llama stack run + client-sdk test ``` <img width="842" alt="image" src="https://github.com/user-attachments/assets/7ac56b5f-f424-4632-9476-7e0f57555bc3" /> [//]: # (## Documentation)		2025-02-21 11:48:27 -08:00
..
agents	feat(1/n): api: unify agents for handling server & client tools (#1178 )	2025-02-21 11:48:27 -08:00
inference	fix: remove list of list tests, no longer relevant after #1161 (#1205 )	2025-02-21 08:07:35 -08:00
safety	Fix precommit check after moving to ruff (#927 )	2025-02-02 06:46:45 -08:00
tool_runtime	build: format codebase imports using ruff linter (#1028 )	2025-02-13 10:06:21 -08:00
vector_io	fix: disable sqlite-vec test (#1090 )	2025-02-13 18:40:16 -08:00
__init__.py	[tests] add client-sdk pytests & delete client.py (#638 )	2024-12-16 12:04:56 -08:00
conftest.py	feat(providers): add NVIDIA Inference embedding provider and tests (#935 )	2025-02-20 16:59:48 -08:00
metadata.py	Report generation minor fixes (#884 )	2025-01-28 04:58:12 -08:00
README.md	Update README.md	2025-02-14 15:45:08 -08:00
report.py	script for running client sdk tests (#895 )	2025-02-19 22:38:06 -08:00

README.md

Llama Stack Integration Tests

You can run llama stack integration tests on either a Llama Stack Library or a Llama Stack endpoint.

To test on a Llama Stack library with certain configuration, run

LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml pytest -s -v tests/client-sdk/inference/

or just the template name

LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/inference/

To test on a Llama Stack endpoint, run

LLAMA_STACK_BASE_URL=http://localhost:8089 pytest -s -v tests/client-sdk/inference

Report Generation

To generate a report, run with --report option

LLAMA_STACK_CONFIG=together pytest -s -v report.md tests/client-sdk/ --report

Common options

Depending on the API, there are custom options enabled

For tests in inference/ and agents/, we support --inference-model(to be used in text inference tests) and--vision-inference-model` (only used in image inference tests) overrides
For tests in vector_io/, we support --embedding-model override
For tests in safety/, we support --safety-shield override
The param can be --report or --report <path> If path is not provided, we do a best effort to infer based on the config / template name. For url endpoints, path is required.