llama-stack

forked from phoenix-oss/llama-stack-mirror

History

ehhuang 047303e339 feat: introduce APIs for retrieving chat completion requests (#2145 ) # What does this PR do? This PR introduces APIs to retrieve past chat completion requests, which will be used in the LS UI. Our current `Telemetry` is ill-suited for this purpose as it's untyped so we'd need to filter by obscure attribute names, making it brittle. Since these APIs are 'provided by stack' and don't need to be implemented by inference providers, we introduce a new InferenceProvider class, containing the existing inference protocol, which is implemented by inference providers. The APIs are OpenAI-compliant, with an additional `input_messages` field. ## Test Plan This PR just adds the API and marks them provided_by_stack. S tart stack server -> doesn't crash		2025-05-18 21:43:19 -07:00
..
agents	fix: Responses API: handle type=None in streaming tool calls (#2166 )	2025-05-14 14:16:33 -07:00
datasetio	chore(refact): move paginate_records fn outside of datasetio (#2137 )	2025-05-12 10:56:14 -07:00
eval	feat: implementation for agent/session list and describe (#1606 )	2025-05-07 14:49:23 +02:00
inference	feat: introduce APIs for retrieving chat completion requests (#2145 )	2025-05-18 21:43:19 -07:00
ios/inference	chore: removed executorch submodule (#1265 )	2025-02-25 21:57:21 -08:00
post_training	feat: add huggingface post_training impl (#2132 )	2025-05-16 14:41:28 -07:00
safety	chore: enable pyupgrade fixes (#1806 )	2025-05-01 14:23:50 -07:00
scoring	chore: enable pyupgrade fixes (#1806 )	2025-05-01 14:23:50 -07:00
telemetry	feat: add metrics query API (#1394 )	2025-05-07 10:11:26 -07:00
tool_runtime	feat: Adding support for customizing chunk context in RAG insertion and querying (#2134 )	2025-05-14 21:56:20 -04:00
vector_io	feat: implementation for agent/session list and describe (#1606 )	2025-05-07 14:49:23 +02:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381 )	2024-11-06 14:54:05 -08:00