llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-17 23:37:15 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	2c43285e22	feat(stores)!: use backend storage references instead of configs (#3697 ) This PR changes configurations in a backward incompatible way. Run configs today repeat full SQLite/Postgres snippets everywhere a store is needed, which means duplicated credentials, extra connection pools, and lots of drift between files. This PR introduces named storage backends so the stack and providers can share a single catalog and reference those backends by name. ## Key Changes - Add `storage.backends` to `StackRunConfig`, register each KV/SQL backend once at startup, and validate that references point to the right family. - Move server stores under `storage.stores` with lightweight references (backend + namespace/table) instead of full configs. - Update every provider/config/doc to use the new reference style; docs/codegen now surface the simplified YAML. ## Migration Before: ```yaml metadata_store: type: sqlite db_path: ~/.llama/distributions/foo/registry.db inference_store: type: postgres host: ${env.POSTGRES_HOST} port: ${env.POSTGRES_PORT} db: ${env.POSTGRES_DB} user: ${env.POSTGRES_USER} password: ${env.POSTGRES_PASSWORD} conversations_store: type: postgres host: ${env.POSTGRES_HOST} port: ${env.POSTGRES_PORT} db: ${env.POSTGRES_DB} user: ${env.POSTGRES_USER} password: ${env.POSTGRES_PASSWORD} ``` After: ```yaml storage: backends: kv_default: type: kv_sqlite db_path: ~/.llama/distributions/foo/kvstore.db sql_default: type: sql_postgres host: ${env.POSTGRES_HOST} port: ${env.POSTGRES_PORT} db: ${env.POSTGRES_DB} user: ${env.POSTGRES_USER} password: ${env.POSTGRES_PASSWORD} stores: metadata: backend: kv_default namespace: registry inference: backend: sql_default table_name: inference_store max_write_queue_size: 10000 num_writers: 4 conversations: backend: sql_default table_name: openai_conversations ``` Provider configs follow the same pattern—for example, a Chroma vector adapter switches from: ```yaml providers: vector_io: - provider_id: chromadb provider_type: remote::chromadb config: url: ${env.CHROMADB_URL} kvstore: type: sqlite db_path: ~/.llama/distributions/foo/chroma.db ``` to: ```yaml providers: vector_io: - provider_id: chromadb provider_type: remote::chromadb config: url: ${env.CHROMADB_URL} persistence: backend: kv_default namespace: vector_io::chroma_remote ``` Once the backends are declared, everything else just points at them, so rotating credentials or swapping to Postgres happens in one place and the stack reuses a single connection pool.	2025-10-20 13:20:09 -07:00
ehhuang	14a94e9894	fix: responses <> chat completion input conversion (#3645 ) Some checks failed Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s Details Python Package Build Test / build (3.12) (push) Failing after 2s Details Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 5s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details API Conformance Tests / check-schema-compatibility (push) Successful in 10s Details Vector IO Integration Tests / test-matrix (push) Failing after 5s Details Python Package Build Test / build (3.13) (push) Failing after 3s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 9s Details Test External API and Providers / test-external (venv) (push) Failing after 6s Details Unit Tests / unit-tests (3.12) (push) Failing after 5s Details Unit Tests / unit-tests (3.13) (push) Failing after 6s Details UI Tests / ui-tests (22) (push) Successful in 33s Details Pre-commit / pre-commit (push) Successful in 1m27s Details # What does this PR do? closes #3268 closes #3498 When resuming from previous response ID, currently we attempt to convert from the stored responses input to chat completion messages, which is not always possible, e.g. for tool calls where some data is lost once converted from chat completion message to repsonses input format. This PR stores the chat completion messages that correspond to the _last_ call to chat completion, which is sufficient to be resumed from in the next responses API call, where we load these saved messages and skip conversion entirely. Separate issue to optimize storage: https://github.com/llamastack/llama-stack/issues/3646 ## Test Plan existing CI tests	2025-10-02 16:01:08 -07:00
ehhuang	8ab6684a94	chore: introduce write queue for response_store (#3497 ) # What does this PR do? Mirroring the same changes that was used for inference_store: https://github.com/llamastack/llama-stack/pull/3383 Will follow up with a shared internal API for managing these write queues. ## Test Plan existing tests	2025-09-29 10:36:16 -07:00
Matthew Farrellee	30b2e6a495	chore: default to pytest asyncio-mode=auto (#2730 ) # What does this PR do? previously, developers who ran `./scripts/unit-tests.sh` would get `asyncio-mode=auto`, which meant `@pytest.mark.asyncio` and `@pytest_asyncio.fixture` were redundent. developers who ran `pytest` directly would get pytest's default (strict mode), would run into errors leading them to add `@pytest.mark.asyncio` / `@pytest_asyncio.fixture` to their code. with this change - - `asyncio_mode=auto` is included in `pyproject.toml` making behavior consistent for all invocations of pytest - removes all redundant `@pytest_asyncio.fixture` and `@pytest.mark.asyncio` - for good measure, requires `pytest>=8.4` and `pytest-asyncio>=1.0` ## Test Plan - `./scripts/unit-tests.sh` - `uv run pytest tests/unit`	2025-07-11 13:00:24 -07:00
ehhuang	d3b60507d7	feat: support auth attributes in inference/responses stores (#2389 ) # What does this PR do? Inference/Response stores now store user attributes when inserting, and respects them when fetching. ## Test Plan pytest tests/unit/utils/test_sqlstore.py	2025-06-20 10:24:45 -07:00
ehhuang	15f630e5da	feat: support pagination in inference/responses stores (#2397 ) Some checks failed Integration Tests / test-matrix (http, 3.12, agents) (push) Failing after 23s Details Integration Tests / test-matrix (library, 3.11, datasets) (push) Failing after 5s Details Integration Tests / test-matrix (library, 3.10, vector_io) (push) Failing after 7s Details Integration Tests / test-matrix (http, 3.10, vector_io) (push) Failing after 27s Details Integration Tests / test-matrix (http, 3.12, vector_io) (push) Failing after 19s Details Integration Tests / test-matrix (library, 3.10, post_training) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.10, tool_runtime) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.11, inspect) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.11, inference) (push) Failing after 9s Details Integration Tests / test-matrix (library, 3.11, providers) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.11, tool_runtime) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.11, scoring) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.11, agents) (push) Failing after 11s Details Integration Tests / test-matrix (library, 3.11, post_training) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.11, vector_io) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 7s Details Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 8s Details Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 10s Details Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 44s Details Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 46s Details Test External Providers / test-external-providers (venv) (push) Failing after 41s Details Unit Tests / unit-tests (3.10) (push) Failing after 52s Details Unit Tests / unit-tests (3.12) (push) Failing after 18s Details Unit Tests / unit-tests (3.11) (push) Failing after 20s Details Unit Tests / unit-tests (3.13) (push) Failing after 16s Details Pre-commit / pre-commit (push) Successful in 2m0s Details # What does this PR do? ## Test Plan added unit tests	2025-06-16 22:43:35 -07:00

6 commits