llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-07 04:45:44 +00:00

Dinesh Yeduguru ead9397e22 fix: tracing fixes for trace context propogation across coroutines (#1522)	2025-03-11 07:12:48 -07:00

# What does this PR do?

This PR has two fixes needed for correct trace context propagation across the asyncio boundary.

Fix 1: Start using context vars to store the global trace context. This is needed because the same trace context cannot be used across coroutines: its state would be shared. Each coroutine should have its own trace context so that it can store its state correctly.

Fix 2: Start a new span for each new coroutine started for running shields, to keep the span tree clean.

## Test Plan

### Integration tests with server

LLAMA_STACK_DISABLE_VERSION_CHECK=true llama stack run ~/.llama/distributions/together/together-run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 pytest -s --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct

server logs: https://gist.github.com/dineshyv/51ac5d9864ed031d0d89ce77352821fe
test logs: https://gist.github.com/dineshyv/e66acc1c4648a42f1854600609c467f3

### Integration tests with library client

LLAMA_STACK_CONFIG=fireworks pytest -s --safety-shield meta-llama/Llama-Guard-3-8B --text-model meta-llama/Llama-3.1-8B-Instruct

logs: https://gist.github.com/dineshyv/ca160696a0b167223378673fb1dcefb8

### Apps test with server:

```
LLAMA_STACK_DISABLE_VERSION_CHECK=true llama stack run ~/.llama/distributions/together/together-run.yaml
python -m examples.agents.e2e_loop_with_client_tools localhost 8321
```

server logs: https://gist.github.com/dineshyv/1717a572d8f7c14279c36123b79c5797
app logs: https://gist.github.com/dineshyv/44167e9f57806a0ba3b710c32aec02f8
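
The idea behind Fix 1 in the commit message above can be illustrated with a small, self-contained sketch. It is not the llama-stack implementation: `CURRENT_TRACE`, `TraceContext`, and `run_shield` are hypothetical names chosen for illustration. It only shows that a value stored in a `contextvars.ContextVar` is isolated per asyncio task, so concurrently running coroutines do not overwrite each other's trace state.

```python
import asyncio
import contextvars

# Hypothetical variable name; the real llama-stack code may use a different one.
CURRENT_TRACE = contextvars.ContextVar("current_trace", default=None)


class TraceContext:
    """Toy trace context that records the names of spans opened in it."""

    def __init__(self, trace_id):
        self.trace_id = trace_id
        self.spans = []


async def run_shield(name):
    # asyncio.gather wraps each coroutine in a Task, and every Task runs in
    # its own copy of the current context, so this set() is task-local.
    ctx = TraceContext(trace_id=f"trace-{name}")
    CURRENT_TRACE.set(ctx)
    ctx.spans.append(f"shield:{name}")

    # Yield control so the other coroutines run before we read the value back.
    await asyncio.sleep(0)

    current = CURRENT_TRACE.get()
    return current.trace_id, list(current.spans)


async def main():
    return await asyncio.gather(*(run_shield(n) for n in ("a", "b", "c")))


def demo():
    """Run three concurrent coroutines and return what each one observed."""
    return asyncio.run(main())
```

Calling `demo()` returns one `(trace_id, spans)` pair per coroutine, each containing only that coroutine's own data. With a plain module-level global instead of a `ContextVar`, the last writer would win and every coroutine would read the same trace context after the `await`.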
agents	fix: tracing fixes for trace context propogation across coroutines (#1522)	2025-03-11 07:12:48 -07:00
datasetio	build: format codebase imports using ruff linter (#1028)	2025-02-13 10:06:21 -08:00
eval	chore: rename task_config to benchmark_config (#1397)	2025-03-04 12:44:04 -08:00
inference	feat: updated inline vllm inference provider (#880)	2025-03-07 13:38:23 -08:00
ios/inference	chore: removed executorch submodule (#1265)	2025-02-25 21:57:21 -08:00
post_training	fix: replace eval with json decoding for format_adapter (#1328)	2025-02-28 11:25:23 -08:00
safety	chore: move all Llama Stack types from llama-models to llama-stack (#1098)	2025-02-14 09:10:59 -08:00
scoring	feat: [new open benchmark] Math 500 (#1538)	2025-03-10 20:38:28 -07:00
telemetry	fix: Revert "feat: record token usage for inference API (#1300)" (#1476)	2025-03-07 10:16:47 -08:00
tool_runtime	chore: remove dependency on llama_models completely (#1344)	2025-03-01 12:48:08 -08:00
vector_io	chore: made inbuilt tools blocking calls into async non blocking calls (#1509)	2025-03-09 16:59:24 -07:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00