llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 04:04:14 +00:00

Author	SHA1	Message	Date
Ashwin Bharambe	c3d3a0b833	feat(tests): auto-merge all model list responses and unify recordings (#3320 ) Some checks failed Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 3s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 4s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 7s Details Update ReadTheDocs / update-readthedocs (push) Failing after 3s Details Test External API and Providers / test-external (venv) (push) Failing after 5s Details Vector IO Integration Tests / test-matrix (push) Failing after 7s Details Python Package Build Test / build (3.13) (push) Failing after 8s Details Python Package Build Test / build (3.12) (push) Failing after 8s Details Unit Tests / unit-tests (3.13) (push) Failing after 14s Details Unit Tests / unit-tests (3.12) (push) Failing after 14s Details UI Tests / ui-tests (22) (push) Successful in 1m7s Details Pre-commit / pre-commit (push) Successful in 2m34s Details One needed to specify record-replay related environment variables for running integration tests. We could not use defaults because integration tests could be run against Ollama instances which could be running different models. For example, text vs vision tests needed separate instances of Ollama because a single instance typically cannot serve both of these models if you assume the standard CI worker configuration on Github. As a result, `client.list()` as returned by the Ollama client would be different between these runs and we'd end up overwriting responses. This PR "solves" it by adding a small amount of complexity -- we store model list responses specially, keyed by the hashes of the models they return. At replay time, we merge all of them and pretend that we have the union of all models available. ## Test Plan Re-recorded all the tests using `scripts/integration-tests.sh --inference-mode record`, including the vision tests.	2025-09-03 11:33:03 -07:00
ehhuang	bb6b6041d6	chore: fix: integration tests failures marked as successful (#3039 )	2025-08-04 17:06:28 -07:00
ehhuang	6ac710f3b0	fix(recording): endpoint resolution (#3013 ) Some checks failed Integration Tests (Replay) / discover-tests (push) Successful in 5s Details Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Details Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 7s Details Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 15s Details Integration Tests (Replay) / run-replay-mode-tests (push) Failing after 10s Details SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s Details Python Package Build Test / build (3.12) (push) Failing after 12s Details Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 15s Details Test External API and Providers / test-external (venv) (push) Failing after 13s Details Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 18s Details Python Package Build Test / build (3.13) (push) Failing after 14s Details Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 18s Details SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 23s Details Unit Tests / unit-tests (3.12) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 18s Details Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 19s Details Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 21s Details Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 17s Details Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 56s Details Unit Tests / unit-tests (3.13) (push) Failing after 52s Details Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 55s Details Pre-commit / pre-commit (push) Successful in 1m49s Details # What does this PR do? ## Test Plan	2025-08-01 16:23:54 -07:00
Ashwin Bharambe	27d866795c	feat(ci): add support for running vision inference tests (#2972 ) This PR significantly refactors the Integration Tests workflow. The main goal behind the PR was to enable recording of vision tests which were never run as part of our CI ever before. During debugging, I ended up making several other changes refactoring and hopefully increasing the robustness of the workflow. After doing the experiments, I have updated the trigger event to be `pull_request_target` so this workflow can get write permissions by default but it will run with source code from the base (main) branch in the source repository only. If you do change the workflow, you'd need to experiment using the `workflow_dispatch` triggers. This should not be news to anyone using Github Actions (except me!) It is likely to be a little rocky though while I learn more about GitHub Actions, etc. Please be patient :) --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2025-07-31 11:50:42 -07:00
Ashwin Bharambe	2e5ca3f15c	chore: move recordings one directory upwards	2025-07-29 12:46:19 -07:00

5 commits