llama-stack-mirror/tests/client-sdk
Xi Yan 0fe071764f
feat(1/n): api: unify agents for handling server & client tools (#1178)
# Problem

Our current Agent framework has discrepancies in definition on how we
handle server side and client side tools.

1. Server Tools: a single Turn is returned including `ToolExecutionStep`
in agenst
2. Client Tools: `create_agent_turn` is called in loop with client agent
lib yielding the agent chunk

ad6ffc63df/src/llama_stack_client/lib/agents/agent.py (L186-L211)

This makes it inconsistent to work with server & client tools. It also
complicates the logs to telemetry to get information about agents turn /
history for observability.

#### Principle
The same `turn_id` should be used to represent the steps required to
complete a user message including client tools.

## Solution

1. `AgentTurnResponseEventType.turn_awaiting_input` status to indicate
that the current turn is not completed, and awaiting tool input
2. `continue_agent_turn` endpoint to update agent turn with client's
tool response.


# What does this PR do?
- Skeleton API as example

## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]

- Just API update, no functionality change
```
llama stack run + client-sdk test
```

<img width="842" alt="image"
src="https://github.com/user-attachments/assets/7ac56b5f-f424-4632-9476-7e0f57555bc3"
/>


[//]: # (## Documentation)
2025-02-21 11:48:27 -08:00
..
agents feat(1/n): api: unify agents for handling server & client tools (#1178) 2025-02-21 11:48:27 -08:00
inference fix: remove list of list tests, no longer relevant after #1161 (#1205) 2025-02-21 08:07:35 -08:00
safety Fix precommit check after moving to ruff (#927) 2025-02-02 06:46:45 -08:00
tool_runtime build: format codebase imports using ruff linter (#1028) 2025-02-13 10:06:21 -08:00
vector_io fix: disable sqlite-vec test (#1090) 2025-02-13 18:40:16 -08:00
__init__.py [tests] add client-sdk pytests & delete client.py (#638) 2024-12-16 12:04:56 -08:00
conftest.py feat(providers): add NVIDIA Inference embedding provider and tests (#935) 2025-02-20 16:59:48 -08:00
metadata.py Report generation minor fixes (#884) 2025-01-28 04:58:12 -08:00
README.md Update README.md 2025-02-14 15:45:08 -08:00
report.py script for running client sdk tests (#895) 2025-02-19 22:38:06 -08:00

Llama Stack Integration Tests

You can run llama stack integration tests on either a Llama Stack Library or a Llama Stack endpoint.

To test on a Llama Stack library with certain configuration, run

LLAMA_STACK_CONFIG=./llama_stack/templates/cerebras/run.yaml pytest -s -v tests/client-sdk/inference/

or just the template name

LLAMA_STACK_CONFIG=together pytest -s -v tests/client-sdk/inference/

To test on a Llama Stack endpoint, run

LLAMA_STACK_BASE_URL=http://localhost:8089 pytest -s -v tests/client-sdk/inference

Report Generation

To generate a report, run with --report option

LLAMA_STACK_CONFIG=together pytest -s -v report.md tests/client-sdk/ --report

Common options

Depending on the API, there are custom options enabled

  • For tests in inference/ and agents/, we support --inference-model(to be used in text inference tests) and--vision-inference-model` (only used in image inference tests) overrides
  • For tests in vector_io/, we support --embedding-model override
  • For tests in safety/, we support --safety-shield override
  • The param can be --report or --report <path> If path is not provided, we do a best effort to infer based on the config / template name. For url endpoints, path is required.