llama-stack-mirror/llama_stack
Ashwin Bharambe 0c9eb3341c Separate chat_completion stream and non-stream implementations
This is a pretty important requirement. The streaming response type is
an AsyncGenerator, while the non-stream one is a single object. So far
this has worked _sometimes_ thanks to various pre-existing hacks (and in
some cases it just failed).
2024-10-08 17:23:40 -07:00
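The commit subject only names the fix at a high level. As a minimal sketch of the pattern it describes (all class and method names below are hypothetical, not taken from the llama-stack codebase), separating the two paths might look like this:

```python
import asyncio
from typing import AsyncGenerator


class ChatCompletionResponse:
    """Single response object returned by the non-stream path (hypothetical)."""

    def __init__(self, text: str) -> None:
        self.text = text


class ChatCompletionChunk:
    """Incremental delta yielded by the stream path (hypothetical)."""

    def __init__(self, delta: str) -> None:
        self.delta = delta


class InferenceImpl:
    def chat_completion(self, prompt: str, stream: bool = False):
        # Dispatch up front: the non-stream branch returns a coroutine,
        # the stream branch returns an async generator. The two never mix,
        # so callers always know whether to `await` or `async for`.
        if stream:
            return self._chat_completion_stream(prompt)
        return self._chat_completion_non_stream(prompt)

    async def _chat_completion_non_stream(self, prompt: str) -> ChatCompletionResponse:
        # Plain coroutine: callers `await` it and get one object back.
        return ChatCompletionResponse(text=f"echo: {prompt}")

    async def _chat_completion_stream(
        self, prompt: str
    ) -> AsyncGenerator[ChatCompletionChunk, None]:
        # Async generator: callers iterate it with `async for`;
        # it is never awaited directly.
        for token in prompt.split():
            yield ChatCompletionChunk(delta=token)


async def main() -> None:
    impl = InferenceImpl()

    # Non-stream: await a single response object.
    resp = await impl.chat_completion("hello streaming world")
    print(resp.text)

    # Stream: iterate chunks (note: no await on the call itself).
    async for chunk in impl.chat_completion("hello streaming world", stream=True):
        print(chunk.delta)


asyncio.run(main())
```

The design point is that the public entry point is a plain `def` that dispatches to two separate `async` implementations, rather than one function that is sometimes a coroutine and sometimes an async generator; the latter is the kind of ambiguity the commit message says previously only worked via hacks.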
apis          Separate chat_completion stream and non-stream implementations   2024-10-08 17:23:40 -07:00
cli           A few bug fixes for covering corner cases                        2024-10-08 17:23:02 -07:00
distribution  Separate chat_completion stream and non-stream implementations   2024-10-08 17:23:40 -07:00
providers     Separate chat_completion stream and non-stream implementations   2024-10-08 17:23:40 -07:00
scripts       Add a test for CLI, but not fully done so disabled               2024-09-19 13:27:07 -07:00
__init__.py   API Updates (#73)                                                2024-09-17 19:51:35 -07:00