llama-stack-mirror/llama_stack
Ben Browning 51d9fd4808
Some checks failed
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 43s
Unit Tests / unit-tests (3.12) (push) Failing after 45s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 4s
Integration Tests / discover-tests (push) Successful in 6s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Pre-commit / pre-commit (push) Successful in 2m8s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 5s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 11s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 12s
Test Llama Stack Build / build-single-provider (push) Failing after 7s
Python Package Build Test / build (3.13) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 7s
Unit Tests / unit-tests (3.13) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 13s
Test External Providers / test-external-providers (venv) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 11s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 12s
Update ReadTheDocs / update-readthedocs (push) Failing after 6s
Integration Tests / test-matrix (push) Failing after 6s
Test Llama Stack Build / build (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 16s
fix: Don't cache clients for passthrough auth providers (#2728)
# What does this PR do?

Some of our inference providers support passthrough authentication via
`x-llamastack-provider-data` header values. This fixes the providers
that support passthrough auth to not cache their clients to the backend
providers (mostly OpenAI client instances) so that the client connecting
to Llama Stack has to provide those auth values on each and every
request.

## Test Plan

I added some unit tests to ensure we're not caching clients across
requests for all the fixed providers in this PR.

```
uv run pytest -sv tests/unit/providers/inference/test_inference_client_caching.py
```


I also ran some of our OpenAI compatible API integration tests for each
of the changed providers, just to ensure they still work. Note that
these providers don't actually pass all these tests (for unrelated
reasons due to quirks of the Groq and Together SaaS services), but
enough of the tests passed to confirm the clients are still working as
intended.

### Together

```
ENABLE_TOGETHER="together" \
uv run llama stack run llama_stack/templates/starter/run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv \
  tests/integration/inference/test_openai_completion.py \
  --text-model "together/meta-llama/Llama-3.1-8B-Instruct"
```

### OpenAI

```
ENABLE_OPENAI="openai" \
uv run llama stack run llama_stack/templates/starter/run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv \
  tests/integration/inference/test_openai_completion.py \
  --text-model "openai/gpt-4o-mini"
```

### Groq

```
ENABLE_GROQ="groq" \
uv run llama stack run llama_stack/templates/starter/run.yaml

LLAMA_STACK_CONFIG=http://localhost:8321 \
uv run pytest -sv \
  tests/integration/inference/test_openai_completion.py \
  --text-model "groq/meta-llama/Llama-3.1-8B-Instruct"
```

---------

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-07-11 13:38:27 -07:00
..
apis chore(api): add mypy coverage to apis (#2648) 2025-07-09 12:55:16 +02:00
cli chore(api): add mypy coverage to cli/stack (#2650) 2025-07-10 16:53:38 +02:00
distribution fix: container build on podman (#2723) 2025-07-11 16:25:33 +02:00
models chore(api): add mypy coverage to prompts (#2657) 2025-07-09 10:07:00 +02:00
providers fix: Don't cache clients for passthrough auth providers (#2728) 2025-07-11 13:38:27 -07:00
strong_typing chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
templates chore: Adding unit tests for OpenAI vector stores and migrating SQLite-vec registry to kvstore (#2665) 2025-07-10 14:22:13 -04:00
ui feat(auth,ui): support github sign-in in the UI (#2545) 2025-07-08 11:02:57 -07:00
__init__.py export LibraryClient 2024-12-13 12:08:00 -08:00
env.py refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) 2025-03-04 14:53:47 -08:00
log.py chore: remove nested imports (#2515) 2025-06-26 08:01:05 +05:30
schema_utils.py chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00