llama-stack-mirror/llama_stack/providers/inline
Xi Yan 7d111c7510
feat: unify max_infer_iters in client/server agent loop (#1309)
# What does this PR do?

We currently use `max_infer_iters` in 2 different ways:
1/ Server: track number of times
2/ Client side: track number of times we send `resume_turn` requests

This PR removes the need for (2) and makes the server track the total number
of times we perform inference within a Turn.

**NOTE**
The PR assumes `StopReason` is set to one of:
- `end_of_message`: the turn is not finished; we could be waiting for client
tool call responses
- `end_of_turn`: the entire turn is finished and there is nothing more
to be done
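
The behavior described above can be sketched as follows. This is a minimal illustration, not the actual llama-stack implementation: `Turn`, `Outcome`, `run_turn`, and `always_calls_client_tool` are hypothetical names, and treating an exhausted budget as `end_of_turn` is an assumption. The point it shows is that the iteration counter lives on the server-side turn state, so it keeps counting across resume calls instead of being tracked again by the client.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class StopReason(Enum):
    end_of_turn = "end_of_turn"
    end_of_message = "end_of_message"


class Outcome(Enum):
    # Hypothetical result of one inference call.
    DONE = "done"                # model produced a final answer
    CLIENT_TOOL = "client_tool"  # model requested a client-side tool call
    CONTINUE = "continue"        # e.g. a server-side tool ran; infer again


@dataclass
class Turn:
    # Hypothetical server-side turn state; n_iter persists across resumes.
    max_infer_iters: int
    n_iter: int = 0


def run_turn(turn: Turn, infer: Callable[[], Outcome]) -> StopReason:
    """Run (or resume) a turn until it finishes, needs the client, or hits the budget."""
    while turn.n_iter < turn.max_infer_iters:
        turn.n_iter += 1
        outcome = infer()
        if outcome is Outcome.CLIENT_TOOL:
            # Turn is not finished: hand control back to the client.
            return StopReason.end_of_message
        if outcome is Outcome.DONE:
            return StopReason.end_of_turn
    # Assumption: an exhausted budget ends the turn.
    return StopReason.end_of_turn


def always_calls_client_tool() -> Outcome:
    # Simulates a model that requests a client tool on every inference.
    return Outcome.CLIENT_TOOL


turn = Turn(max_infer_iters=3)
reasons = []
while True:
    # Each loop pass stands in for the initial request or a resume request.
    reason = run_turn(turn, always_calls_client_tool)
    reasons.append(reason)
    if reason is StopReason.end_of_turn:
        break
```

In this sketch the client only loops until it sees `end_of_turn`; it never counts iterations itself, yet a model that requests a tool call on every step still stops after `max_infer_iters` inferences.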

## Test Plan
```
LLAMA_STACK_BASE_URL=http://localhost:8321 pytest -v tests/client-sdk/agents/test_agents.py::test_custom_tool_infinite_loop --inference-model "meta-llama/Llama-3.3-70B-Instruct"
```

2025-03-03 10:08:36 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| .. | | |
| agents | feat: unify max_infer_iters in client/server agent loop (#1309) | 2025-03-03 10:08:36 -08:00 |
| datasetio | build: format codebase imports using ruff linter (#1028) | 2025-02-13 10:06:21 -08:00 |
| eval | fix: replace eval with json decoding (#1327) | 2025-02-28 11:10:45 -08:00 |
| inference | chore: remove straggler references to llama-models (#1345) | 2025-03-01 14:26:03 -08:00 |
| ios/inference | chore: removed executorch submodule (#1265) | 2025-02-25 21:57:21 -08:00 |
| post_training | fix: replace eval with json decoding for format_adapter (#1328) | 2025-02-28 11:25:23 -08:00 |
| safety | chore: move all Llama Stack types from llama-models to llama-stack (#1098) | 2025-02-14 09:10:59 -08:00 |
| scoring | chore(lint): update Ruff ignores for project conventions and maintainability (#1184) | 2025-02-28 09:36:49 -08:00 |
| telemetry | chore: better raise (#1335) | 2025-02-28 16:41:20 -08:00 |
| tool_runtime | chore: remove dependency on llama_models completely (#1344) | 2025-03-01 12:48:08 -08:00 |
| vector_io | feat: allow conditionally enabling providers in run.yaml (#1321) | 2025-03-01 11:19:14 -08:00 |
| __init__.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |