Commit graph

5 commits

Author SHA1 Message Date
ehhuang
14a94e9894
fix: responses <> chat completion input conversion (#3645)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Python Package Build Test / build (3.12) (push) Failing after 2s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 5s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Vector IO Integration Tests / test-matrix (push) Failing after 5s
Python Package Build Test / build (3.13) (push) Failing after 3s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 9s
Test External API and Providers / test-external (venv) (push) Failing after 6s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
UI Tests / ui-tests (22) (push) Successful in 33s
Pre-commit / pre-commit (push) Successful in 1m27s
# What does this PR do?

closes #3268
closes #3498

When resuming from previous response ID, currently we attempt to convert
from the stored responses input to chat completion messages, which is
not always possible, e.g. for tool calls where some data is lost once
converted from chat completion message to repsonses input format.

This PR stores the chat completion messages that correspond to the
_last_ call to chat completion, which is sufficient to be resumed from
in the next responses API call, where we load these saved messages and
skip conversion entirely.

Separate issue to optimize storage:
https://github.com/llamastack/llama-stack/issues/3646

## Test Plan
existing CI tests
2025-10-02 16:01:08 -07:00
ehhuang
8ab6684a94
chore: introduce write queue for response_store (#3497)
# What does this PR do?
Mirroring the same changes that was used for inference_store:
https://github.com/llamastack/llama-stack/pull/3383

Will follow up with a shared internal API for managing these write
queues.

## Test Plan
existing tests
2025-09-29 10:36:16 -07:00
Matthew Farrellee
30b2e6a495
chore: default to pytest asyncio-mode=auto (#2730)
# What does this PR do?

previously, developers who ran `./scripts/unit-tests.sh` would get
`asyncio-mode=auto`, which meant `@pytest.mark.asyncio` and
`@pytest_asyncio.fixture` were redundent. developers who ran `pytest`
directly would get pytest's default (strict mode), would run into errors
leading them to add `@pytest.mark.asyncio` / `@pytest_asyncio.fixture`
to their code.

with this change -
- `asyncio_mode=auto` is included in `pyproject.toml` making behavior
consistent for all invocations of pytest
- removes all redundant `@pytest_asyncio.fixture` and
`@pytest.mark.asyncio`
 - for good measure, requires `pytest>=8.4` and `pytest-asyncio>=1.0`

## Test Plan

- `./scripts/unit-tests.sh`
- `uv run pytest tests/unit`
2025-07-11 13:00:24 -07:00
ehhuang
d3b60507d7
feat: support auth attributes in inference/responses stores (#2389)
# What does this PR do?
Inference/Response stores now store user attributes when inserting, and
respects them when fetching.

## Test Plan
pytest tests/unit/utils/test_sqlstore.py
2025-06-20 10:24:45 -07:00
ehhuang
15f630e5da
feat: support pagination in inference/responses stores (#2397)
Some checks failed
Integration Tests / test-matrix (http, 3.12, agents) (push) Failing after 23s
Integration Tests / test-matrix (library, 3.11, datasets) (push) Failing after 5s
Integration Tests / test-matrix (library, 3.10, vector_io) (push) Failing after 7s
Integration Tests / test-matrix (http, 3.10, vector_io) (push) Failing after 27s
Integration Tests / test-matrix (http, 3.12, vector_io) (push) Failing after 19s
Integration Tests / test-matrix (library, 3.10, post_training) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.10, tool_runtime) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.11, inspect) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, inference) (push) Failing after 9s
Integration Tests / test-matrix (library, 3.11, providers) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, tool_runtime) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.11, scoring) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.11, agents) (push) Failing after 11s
Integration Tests / test-matrix (library, 3.11, post_training) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, agents) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.11, vector_io) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, inference) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, datasets) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, inspect) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 44s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 46s
Test External Providers / test-external-providers (venv) (push) Failing after 41s
Unit Tests / unit-tests (3.10) (push) Failing after 52s
Unit Tests / unit-tests (3.12) (push) Failing after 18s
Unit Tests / unit-tests (3.11) (push) Failing after 20s
Unit Tests / unit-tests (3.13) (push) Failing after 16s
Pre-commit / pre-commit (push) Successful in 2m0s
# What does this PR do?


## Test Plan
added unit tests
2025-06-16 22:43:35 -07:00