llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 12:07:34 +00:00

History

ehhuang 14a94e9894 Some checks failed Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s Details Python Package Build Test / build (3.12) (push) Failing after 2s Details Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 5s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details API Conformance Tests / check-schema-compatibility (push) Successful in 10s Details Vector IO Integration Tests / test-matrix (push) Failing after 5s Details Python Package Build Test / build (3.13) (push) Failing after 3s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 9s Details Test External API and Providers / test-external (venv) (push) Failing after 6s Details Unit Tests / unit-tests (3.12) (push) Failing after 5s Details Unit Tests / unit-tests (3.13) (push) Failing after 6s Details UI Tests / ui-tests (22) (push) Successful in 33s Details Pre-commit / pre-commit (push) Successful in 1m27s Details fix: responses <> chat completion input conversion (#3645 ) # What does this PR do? closes #3268 closes #3498 When resuming from previous response ID, currently we attempt to convert from the stored responses input to chat completion messages, which is not always possible, e.g. for tool calls where some data is lost once converted from chat completion message to repsonses input format. This PR stores the chat completion messages that correspond to the _last_ call to chat completion, which is sufficient to be resumed from in the next responses API call, where we load these saved messages and skip conversion entirely. Separate issue to optimize storage: https://github.com/llamastack/llama-stack/issues/3646 ## Test Plan existing CI tests		2025-10-02 16:01:08 -07:00
..
agents	fix: responses <> chat completion input conversion (#3645 )	2025-10-02 16:01:08 -07:00
batches	feat(batches, completions): add /v1/completions support to /v1/batches (#3309 )	2025-09-05 11:59:57 -07:00
datasetio	chore(misc): make tests and starter faster (#3042 )	2025-08-05 14:55:05 -07:00
eval	feat: update eval runner to use openai endpoints (#3588 )	2025-09-29 13:13:53 -07:00
files/localfs	fix(expires_after): make sure multipart/form-data is properly parsed (#3612 )	2025-09-30 16:14:03 -04:00
inference	chore: remove /v1/inference/completion and implementations (#3622 )	2025-10-01 11:36:53 -04:00
ios/inference	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 15:12:03 -07:00
post_training	chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061 )	2025-08-20 07:15:35 -04:00
safety	feat: use /v1/chat/completions for safety model inference (#3591 )	2025-09-30 11:01:44 -07:00
scoring	chore: use openai_chat_completion for llm as a judge scoring (#3635 )	2025-10-01 09:44:31 -04:00
telemetry	chore: Remove debug logging from telemetry adapter (#3643 )	2025-10-01 15:16:23 -07:00
tool_runtime	feat(tools)!: substantial clean up of "Tool" related datatypes (#3627 )	2025-10-02 15:12:03 -07:00
vector_io	refactor: use generic WeightedInMemoryAggregator for hybrid search in SQLiteVecIndex (#3303 )	2025-09-02 10:38:35 -07:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381 )	2024-11-06 14:54:05 -08:00