llama-stack/llama_stack/distribution
Xi Yan 194d12b304
[bugfix] fix streaming GeneratorExit exception with LlamaStackAsLibraryClient (#760)
# What does this PR do?

#### Issue
- Using a Jupyter notebook with `LlamaStackAsLibraryClient` + streaming
raises an exception:
```
Exception ignored in: <async_generator object HTTP11ConnectionByteStream.__aiter__ at 0x32a95a740>
Traceback (most recent call last):
  File "/opt/anaconda3/envs/fresh/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 404, in _aiter_
    yield part
RuntimeError: async generator ignored GeneratorExit
```

- Reproduce with
https://github.com/meta-llama/llama-stack/blob/notebook-streaming-debug/inline.ipynb
(a minimal illustration of what this error means follows below)
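For context, this `RuntimeError` is Python's complaint when an async generator is closed (or garbage-collected after its event loop is gone, as likely happens here) but does not finish cleanly. A minimal, self-contained illustration of the error class, not the library's actual code path:

```python
import asyncio

async def stubborn():
    try:
        yield 1
    except GeneratorExit:
        # Swallowing GeneratorExit and yielding again is illegal; closing
        # this generator raises "async generator ignored GeneratorExit".
        yield 2

async def main():
    gen = stubborn()
    await gen.__anext__()  # start the generator, suspend at the first yield
    await gen.aclose()     # RuntimeError: async generator ignored GeneratorExit

asyncio.run(main())
```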

#### Fix
- The issue likely comes from `stream_across_asyncio_run_boundary` closing
the connection too soon when used in a Jupyter environment.
- This PR takes an alternative route for converting the `AsyncStream` return
type to a `SyncStream`: the sync `LlamaStackAsLibraryClient` delegates to
`AsyncLlamaStackAsLibraryClient`, which calls the async impls under the
hood. A sketch of the conversion pattern follows this list.
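A minimal sketch of that conversion pattern, assuming a hypothetical wrapper (`SyncStreamFromAsync` is illustrative, not the actual `library_client.py` code): keep a single event loop alive for the life of the stream, pull chunks from the async generator one at a time, and finalize the generator on the same loop it ran on so `GeneratorExit` can propagate cleanly.

```python
import asyncio
from typing import Any, AsyncGenerator, Iterator

class SyncStreamFromAsync:
    """Hypothetical wrapper that drives an async generator from sync code."""

    def __init__(self, async_gen: AsyncGenerator[Any, None]) -> None:
        self._async_gen = async_gen
        # One loop for the whole stream, instead of tearing a loop down per chunk.
        self._loop = asyncio.new_event_loop()

    def __iter__(self) -> Iterator[Any]:
        try:
            while True:
                try:
                    yield self._loop.run_until_complete(self._async_gen.__anext__())
                except StopAsyncIteration:
                    break
        finally:
            # Close the async generator on the same loop it ran on, so its
            # cleanup actually runs instead of being "ignored" at teardown.
            self._loop.run_until_complete(self._async_gen.aclose())
            self._loop.close()

async def counter() -> AsyncGenerator[int, None]:
    for i in range(3):
        yield i

for item in SyncStreamFromAsync(counter()):
    print(item)  # 0, 1, 2 -- consumed from plain synchronous code
```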

#### Additional changes
- Moved the tracing logic into `AsyncLlamaStackAsLibraryClient.request` so
that streaming and non-streaming requests through `LlamaStackAsLibraryClient`
share the same code path; see the sketch below.
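The shape of that unification, sketched with stand-in names (`start_trace`, `_call_impl`, and `AsyncClientSketch` are illustrative assumptions, not the actual code): the trace context is opened once inside the async request path, and for streaming it lives inside the generator so it stays open until the last chunk.

```python
import asyncio
from contextlib import contextmanager

@contextmanager
def start_trace(name: str):
    # Stand-in for the real tracing setup/teardown.
    print(f"trace start: {name}")
    try:
        yield
    finally:
        print(f"trace end: {name}")

class AsyncClientSketch:
    async def request(self, path: str, *, stream: bool = False):
        if stream:
            # Hand back the generator; its trace opens when iteration
            # begins and closes only after the stream is exhausted.
            return self._traced_stream(path)
        with start_trace(path):
            return await self._call_impl(path)

    async def _traced_stream(self, path: str):
        with start_trace(path):  # one tracing path covers the whole stream
            for chunk in ("a", "b", "c"):  # stand-in for the real async impl
                yield chunk

    async def _call_impl(self, path: str):
        return "response"  # stand-in for the real async impl

async def main():
    client = AsyncClientSketch()
    print(await client.request("/inference/chat_completion"))
    async for chunk in await client.request("/inference/chat_completion", stream=True):
        print(chunk)

asyncio.run(main())
```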

## Test Plan

- Tested with together, fireworks, and ollama, with both streaming and
non-streaming, using the notebook in:
https://github.com/meta-llama/llama-stack/blob/notebook-streaming-debug/inline.ipynb
- Note: you need to restart the kernel and run `pip install -e .` in the
Jupyter interpreter for local code changes to take effect.

<img width="826" alt="image"
src="https://github.com/user-attachments/assets/5f90985d-1aee-452c-a599-2157f5654fea"
/>


## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
2025-01-14 10:58:46 -08:00
routers remove conflicting default for tool prompt format in chat completion (#742) 2025-01-10 10:41:53 -08:00
server rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars (#744) 2025-01-10 11:09:49 -08:00
store Fixes; make inference tests pass with newer tool call types 2025-01-13 23:16:53 -08:00
tests Fix bedrock inference impl 2024-12-16 14:22:34 -08:00
ui Fix failing flake8 E226 check (#701) 2025-01-02 09:04:07 -08:00
utils Ensure model_local_dir does not mangle "C:\" on Windows 2024-11-24 14:18:59 -08:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
build.py Switch to use importlib instead of deprecated pkg_resources (#678) 2025-01-13 20:20:02 -08:00
build_conda_env.sh Fix to conda env build script 2024-12-17 12:19:34 -08:00
build_container.sh Fix incorrect Python binary path for UBI9 image (#757) 2025-01-13 20:17:21 -08:00
build_venv.sh Miscellaneous fixes around telemetry, library client and run yaml autogen 2024-12-08 20:40:22 -08:00
client.py use API version in "remote" stack client 2024-11-19 15:59:47 -08:00
common.sh API Updates (#73) 2024-09-17 19:51:35 -07:00
configure.py [remove import *] clean up import *'s (#689) 2024-12-27 15:45:44 -08:00
configure_container.sh docker: Check for selinux before using --security-opt (#167) 2024-10-02 10:37:41 -07:00
datatypes.py agents to use tools api (#673) 2025-01-08 19:01:00 -08:00
distribution.py Tools API with brave and MCP providers (#639) 2024-12-19 21:25:17 -08:00
inspect.py add --version to llama stack CLI & /version endpoint (#732) 2025-01-08 16:30:06 -08:00
library_client.py [bugfix] fix streaming GeneratorExit exception with LlamaStackAsLibraryClient (#760) 2025-01-14 10:58:46 -08:00
request_headers.py Add X-LlamaStack-Client-Version, rename ProviderData -> Provider-Data (#735) 2025-01-09 11:51:36 -08:00
resolver.py agents to use tools api (#673) 2025-01-08 19:01:00 -08:00
stack.py Switch to use importlib instead of deprecated pkg_resources (#678) 2025-01-13 20:20:02 -08:00
start_conda_env.sh Move to use argparse, fix issues with multiple --env cmdline options 2024-11-18 16:31:59 -08:00
start_container.sh rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars (#744) 2025-01-10 11:09:49 -08:00