llama-stack-mirror/llama_stack/apis
Ashwin Bharambe 0c9eb3341c Separate chat_completion stream and non-stream implementations
This is a pretty important requirement. The streaming response type is
an AsyncGenerator while the non-stream one is a single object. So far
this has worked _sometimes_ due to various pre-existing hacks (and in
some cases, it just failed).
2024-10-08 17:23:40 -07:00
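
The distinction the commit describes, a single awaitable response object for the non-stream case versus an AsyncGenerator for the stream case, is easiest to see in a small sketch. The names below (ChatCompletionResponse, ChatCompletionResponseStreamChunk, _stream_chat_completion, _nonstream_chat_completion) are illustrative stand-ins and not the actual llama-stack API surface:

```python
from typing import AsyncGenerator, List, Union


# Illustrative stand-ins for the real response types; names are assumptions.
class ChatCompletionResponse:
    ...


class ChatCompletionResponseStreamChunk:
    ...


async def _nonstream_chat_completion(messages: List[dict]) -> ChatCompletionResponse:
    # One request, one fully materialized response object.
    return ChatCompletionResponse()


async def _stream_chat_completion(
    messages: List[dict],
) -> AsyncGenerator[ChatCompletionResponseStreamChunk, None]:
    # An async generator: chunks are yielded as they are produced.
    for _ in range(3):
        yield ChatCompletionResponseStreamChunk()


async def chat_completion(
    messages: List[dict], stream: bool = False
) -> Union[ChatCompletionResponse, AsyncGenerator]:
    # Dispatch to a dedicated implementation per mode. The streaming branch
    # returns the generator without awaiting it, so the caller receives
    # either a single object or something to `async for` over, never an
    # awkward hybrid of the two.
    if stream:
        return _stream_chat_completion(messages)
    return await _nonstream_chat_completion(messages)
```

A caller then picks one shape explicitly: `resp = await chat_completion(msgs)` for a single object, or `async for chunk in await chat_completion(msgs, stream=True)` for incremental chunks.
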
agents                     Push registration methods onto the backing providers            2024-10-08 17:23:02 -07:00
batch_inference            API Updates (#73)                                               2024-09-17 19:51:35 -07:00
common                     API Updates (#73)                                               2024-09-17 19:51:35 -07:00
dataset                    API Updates (#73)                                               2024-09-17 19:51:35 -07:00
evals                      API Updates (#73)                                               2024-09-17 19:51:35 -07:00
inference                  Separate chat_completion stream and non-stream implementations  2024-10-08 17:23:40 -07:00
inspect                    memory bank registration fixes                                  2024-10-08 17:23:02 -07:00
memory                     more memory related fixes; memory.client now works              2024-10-08 17:23:02 -07:00
memory_banks               more memory related fixes; memory.client now works              2024-10-08 17:23:02 -07:00
models                     Redo the { models, shields, memory_banks } typeset              2024-10-08 17:23:02 -07:00
post_training              API Updates (#73)                                               2024-09-17 19:51:35 -07:00
reward_scoring             API Updates (#73)                                               2024-09-17 19:51:35 -07:00
safety                     Introduce model_store, shield_store, memory_bank_store          2024-10-08 17:23:02 -07:00
shields                    Redo the { models, shields, memory_banks } typeset              2024-10-08 17:23:02 -07:00
synthetic_data_generation  API Updates (#73)                                               2024-09-17 19:51:35 -07:00
telemetry                  API Updates (#73)                                               2024-09-17 19:51:35 -07:00
__init__.py                API Updates (#73)                                               2024-09-17 19:51:35 -07:00