mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-07 18:57:21 +00:00
PR #201 had made several changes while trying to fix issues with getting the stream=False branches of inference and agents API working. As part of this, it made a change which was slightly gratuitous. Namely, making chat_completion() and brethren "def" instead of "async def". The rationale was that this allowed the user (within llama-stack) of this to use it as: ``` async for chunk in api.chat_completion(params) ``` However, it causes unnecessary confusion for several folks. Given that clients (e.g., llama-stack-apps) anyway use the SDK methods (which are completely isolated) this choice was not ideal. Let's revert back so the call now looks like: ``` async for chunk in await api.chat_completion(params) ``` Bonus: Added a completion() implementation for the meta-reference provider. Technically should have been another PR :) |
||
|---|---|---|
| .. | ||
| docker | ||
| routers | ||
| server | ||
| templates | ||
| utils | ||
| __init__.py | ||
| build.py | ||
| build_conda_env.sh | ||
| build_container.sh | ||
| common.sh | ||
| configure.py | ||
| configure_container.sh | ||
| datatypes.py | ||
| distribution.py | ||
| inspect.py | ||
| request_headers.py | ||
| resolver.py | ||
| start_conda_env.sh | ||
| start_container.sh | ||