Commit graph

3216 commits

Author SHA1 Message Date
Omar Abdelwahab
e6c6c36b70 Merge remote-tracking branch 'upstream/main' into add-mcp-authentication-param 2025-11-13 12:04:44 -08:00
Omar Abdelwahab
d913756844 updated test_tools_with_schemas 2025-11-13 11:54:09 -08:00
Charlie Doern
840ad75fe9
feat: split API and provider specs into separate llama-stack-api pkg (#3895)
# What does this PR do?

Extract API definitions and provider specifications into a standalone
llama-stack-api package that can be published to PyPI independently of
the main llama-stack server.


see: https://github.com/llamastack/llama-stack/pull/2978 and
https://github.com/llamastack/llama-stack/pull/2978#issuecomment-3145115942

Motivation

External providers currently import from llama-stack, which overrides
the installed version and causes dependency conflicts. This separation
allows external providers to:

- Install only the type definitions they need without server
dependencies
- Avoid version conflicts with the installed llama-stack package
- Be versioned and released independently

This enables us to re-enable external provider module tests that were
previously blocked by these import conflicts.

Changes

- Created llama-stack-api package with minimal dependencies (pydantic,
jsonschema)
- Moved APIs, providers datatypes, strong_typing, and schema_utils
- Updated all imports from llama_stack.* to llama_stack_api.*
- Configured local editable install for development workflow
- Updated linting and type-checking configuration for both packages

Next Steps

- Publish llama-stack-api to PyPI
- Update external provider dependencies
- Re-enable external provider module tests


Pre-cursor PRs to this one:

- #4093 
- #3954 
- #4064 

These PRs moved key pieces _out_ of the Api pkg, limiting the scope of
change here.


relates to #3237 

## Test Plan

Package builds successfully and can be imported independently. All
pre-commit hooks pass with expected exclusions maintained.

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-11-13 11:51:17 -08:00
Omar Abdelwahab
4b6bfbac8c Added comments and updated model_context_protocol.py 2025-11-13 11:49:24 -08:00
Omar Abdelwahab
9c484d12ae Updated some unit tests 2025-11-13 10:58:40 -08:00
Sébastien Han
ceb716b9a0
chore: set minimum pre-commit version (#4148)
# What does this PR do?

- force a min precommit version
- pin to >= 4.3.0 when installing

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-13 10:52:38 -08:00
Omar Abdelwahab
c1b63202be Updated the test cases to support the headers for now 2025-11-13 10:35:51 -08:00
Omar Abdelwahab
8783255bc3 feat(tool-runtime): Add authorization parameter with backward compatibility
Implement Phase 1 of MCP auth migration:
- Add authorization parameter to list_runtime_tools() and invoke_tool()
- Maintain backward compatibility with X-LlamaStack-Provider-Data header
- Tests use old header-based auth to avoid client SDK dependency
- New parameter takes precedence when both methods provided

Phase 2 will migrate tests to new parameter after Stainless SDK release.

Related: PR #4052
2025-11-13 10:26:39 -08:00
Ashwin Bharambe
fa2b361f46
Merge branch 'main' into add-mcp-authentication-param 2025-11-13 09:42:35 -08:00
Francisco Arceo
4442b24de7
chore: Fix docs so can be deployed (#4149)
# What does this PR do?
Building/Deploying docs is failing here:
5530320962 (step):8:49

Needs the playground file. Updated it to reflect current admin status.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-11-13 09:15:32 -08:00
Derek Higgins
aeaf4eb3dd
fix: remove_disabled_providers filtering models with None fields (#4132)
Fixed bug where models with No provider_model_id were incorrectly
filtered from the startup config display. The function was checking
multiple fields when it should only filter items with explicitly
disabled provider_id.

Changes:
o Modified remove_disabled_providers to only check provider_id field o
Changed condition from checking multiple fields with None to only
  checking provider_id for "__disabled__", None or empty string
o Added comprehensive unit tests

Closes: #4131

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-11-13 07:24:05 -08:00
Ashwin Bharambe
1e81056a22
feat(tests): enable MCP tests in server mode (#4146)
We would like to run all OpenAI compatibility tests using only the
openai-client library. This is most friendly for contributors since they
can run tests without needing to update the client-sdks (which is
getting easier but still a long pole.)

This is the first step in enabling that -- no using "library client" for
any of the Responses tests. This seems like a reasonable trade-off since
the usage of an embeddeble library client for Responses (or any
OpenAI-compatible) behavior seems to be not very common. To do this, we
needed to enable MCP tests (which only worked in library client mode)
for server mode.
2025-11-13 07:23:23 -08:00
Akram Ben Aissi
9eb81439d2
docs: Add comprehensive Files API and Vector Store integration doc (#3279)
docs: Add comprehensive Files API and Vector Store integration
documentation

- Add Files API documentation with OpenAI-compatible endpoints
- Create comprehensive guide for OpenAI-compatible file operations
- Reorganize documentation structure: move file operations to files/
directory
- Add vector store provider documentation for Milvus, SQLite-vec, FAISS
- Clean up redundant files and improve navigation
- Update cross-references and eliminate documentation duplication
- Support for release 0.2.14 FileResponse and Vector Store API features

# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
2025-11-13 08:50:06 -05:00
Omar Abdelwahab
1a6cb7041d precommit 2025-11-12 19:02:54 -08:00
Omar Abdelwahab
66ca51ac0d feat(tool-runtime): Add authorization parameter to list_runtime_tools
Add authorization parameter to list_runtime_tools() method to support
MCP servers that require authentication for listing tools.

Changes:
- Updated ToolRuntime protocol to include authorization parameter on list_runtime_tools()
- Updated all provider implementations (MCP, Tavily, Brave, Bing, Wolfram Alpha)
- Updated router and routing table to pass authorization through
- Updated API recorder patched methods to include authorization parameter

This enables authenticated tool listing for enterprise MCP deployments
where IT administrators pre-configure connectors requiring authentication.

Note: Client SDK will need to be regenerated from updated OpenAPI spec
to support passing this parameter from client code. Tests will pass once
client SDK is updated.
2025-11-12 17:27:03 -08:00
Omar Abdelwahab
e6ebbd8a7b fix(tool-runtime): Remove authorization from list_runtime_tools in all providers
Updated all tool runtime provider implementations to remove the authorization
parameter from list_runtime_tools():

- tavily_search.py
- brave_search.py
- wolfram_alpha.py
- bing_search.py

These providers were missing in the previous commit. Tool listing typically
doesn't require authentication - only invoke_tool() needs the authorization
parameter for authenticated tool execution.

This ensures all tool runtime providers have consistent signatures matching
the updated protocol definition.
2025-11-12 16:20:53 -08:00
Omar Abdelwahab
18f197763b fix(tool-runtime): Remove authorization from list_runtime_tools()
The authorization parameter should only be on invoke_tool(), not on
list_runtime_tools(). Tool listing typically doesn't require authentication,
and the client SDK doesn't have this parameter yet.

Changes:
1. Removed authorization parameter from ToolRuntime.list_runtime_tools() protocol method
2. Updated all implementations to remove the authorization parameter:
   - MCPProviderImpl.list_runtime_tools()
   - ToolRuntimeRouter.list_runtime_tools()
   - ToolGroupsRoutingTable.list_tools() and _index_tools()
3. Updated test to remove authorization from list_tools() call

This ensures compatibility with the llama-stack-client SDK which doesn't
support authorization on list_tools() yet. Only invoke_tool() requires
and accepts the authorization parameter for authenticated tool execution.
2025-11-12 16:17:53 -08:00
Omar Abdelwahab
c0295a2495 revert(debug): Remove temporary debug logging from resolver
Removing the debug logging that was added to diagnose signature mismatch errors.
The logging served its purpose - it helped us identify that the error was coming
from api_recorder.py patched methods, not the actual provider implementations.

With the root cause now fixed in api_recorder.py, this debug logging is no longer
needed and can be safely removed to keep the code clean.
2025-11-12 16:12:14 -08:00
Omar Abdelwahab
4a1fa139f1 revert(ci): Remove unnecessary CI workarounds from action.yml
Now that we've fixed the actual root cause (api_recorder.py missing the
authorization parameter), we can revert all the CI workarounds that were
added during troubleshooting:

Removed changes:
- Cache clearing (venv, pycache, UV cache)
- PYTHONDONTWRITEBYTECODE environment variable
- --no-install-project flag
- Force reinstalling llama-stack
- Installing ci-tests distribution dependencies via llama CLI
- Final bytecode cache cleanup

These were all based on incorrect diagnosis (missing dependencies or module
caching) and are no longer needed. The real fix was updating api_recorder.py
to include the authorization parameter in patched tool runtime methods.

Restoring the simpler, original CI setup that just runs 'uv sync --all-groups'.
2025-11-12 16:11:16 -08:00
Omar Abdelwahab
d156451890 fix(ci): Add authorization parameter to api_recorder tool runtime patches
The ACTUAL root cause of the signature mismatch errors was found!

The api_recorder.py module patches tool runtime invoke_tool methods for test
recording/replay, but the patched methods were missing the new 'authorization'
parameter. The debug logging revealed:

  Object method: patched_tavily_invoke_tool (from api_recorder module)
  Object method's module: llama_stack.testing.api_recorder

Changes made:
1. Updated _patched_tool_invoke_method() to accept authorization parameter
2. Updated patched_tavily_invoke_tool() signature to include authorization
3. Added debug logging to resolver to help identify similar issues in the future

This fix ensures that when tests run in record/replay mode, the patched methods
preserve the full signature including the authorization parameter, allowing the
protocol compliance checks to pass.
2025-11-12 16:06:29 -08:00
Omar Abdelwahab
bae5b14adf debug: Add detailed logging for signature mismatch errors
Adding comprehensive debug logging to understand what's causing the persistent
signature mismatch errors in CI. The logging will show:
- Provider class name and module
- Both protocol and object signatures
- The actual method object
- The method's source module

This will help us identify if the issue is:
1. A cached module being loaded
2. A parent class overriding the method
3. Some other source of the wrong signature

Once we see the debug output, we can pinpoint the exact root cause.
2025-11-12 16:01:13 -08:00
Omar Abdelwahab
166c37bbbe fix(ci): Prevent Python from caching old code during uv sync
The signature mismatch error persists because 'uv sync' installs and potentially imports
the llama-stack package, caching provider modules in memory BEFORE we do the editable
install with fresh source code.

This fix adds the --no-install-project flag to 'uv sync', which:
1. Installs all dependencies but skips installing the project itself
2. Prevents Python from importing and caching provider modules
3. Ensures the subsequent 'uv pip install -e .' loads fresh source code

This should finally resolve the persistent signature mismatch errors in CI where
the protocol has 'authorization' parameter but provider implementations appear not to.
2025-11-12 15:56:26 -08:00
Omar Abdelwahab
761a2a0ce3 fix(ci): Use 'uv run' to execute llama command in virtual environment
The previous commit tried to run 'llama stack list-deps' directly, but the 'llama' command
wasn't in PATH yet since the virtual environment hadn't been activated.

This fix uses 'uv run llama' instead, which executes the command within the uv virtual
environment context, ensuring the llama CLI is accessible.
2025-11-12 15:51:55 -08:00
Omar Abdelwahab
844a159219 fix(ci): Install ci-tests distribution dependencies to fix test failures
The CI integration tests were failing with a signature mismatch error, but the root cause was missing dependencies (specifically the 'together' package). The signature mismatch was a misleading error that occurred because the provider modules failed to load properly due to missing dependencies.

This fix adds a step to install all ci-tests distribution dependencies using:
  llama stack list-deps ci-tests | xargs -L1 uv pip install

This ensures all required provider dependencies are installed before running tests.
2025-11-12 15:49:57 -08:00
Omar Abdelwahab
0754d59999 fix(ci): Add final bytecode cache clear after installations
The issue was timing - we were clearing cache before installations,
but uv sync/pip install were creating new .pyc files. This commit:

1. Adds PYTHONDONTWRITEBYTECODE=1 to prevent .pyc generation
2. Clears bytecode cache AFTER all installations complete
3. Ensures no stale .pyc files exist before tests run

For editable installs (-e .), Python loads from source directory,
so clearing cache after installation ensures the resolver sees the
latest method signatures with the authorization parameter.
2025-11-12 15:28:49 -08:00
Omar Abdelwahab
6dc2d92232 fix(ci): Clear cached .venv directory to ensure fresh install
The GitHub Actions cache was restoring a cached virtual environment
(.venv) with old code. This commit clears all caching layers:

1. Removes cached .venv directory (the main culprit)
2. Clears Python bytecode cache (.pyc files)
3. Clears UV cache directory

This forces uv sync to create a completely fresh virtual environment
with the latest source code changes, ensuring the authorization
parameter is picked up across all tool runtime providers.
2025-11-12 15:25:51 -08:00
Omar Abdelwahab
8b6588dc1e fix(ci): Clear UV cache directory instead of lock file
The previous approach of removing uv.lock caused dependency resolution
failures. The real issue is the UV_CACHE_DIR that contains pre-built
wheels with old code. This commit:

1. Keeps uv.lock (it's part of the project)
2. Clears UV_CACHE_DIR (where compiled wheels are cached)
3. Forces uv to rebuild wheels from source

This ensures the latest source code changes are picked up without
breaking dependency resolution.
2025-11-12 15:23:06 -08:00
Omar Abdelwahab
6aaf4ad080 fix(ci): Remove uv.lock before sync to ensure fresh dependency resolution
The uv.lock file contains cached dependency resolutions that prevent
source code changes from being picked up. By removing it before uv sync,
we force a fresh resolution and rebuild of dependencies.

This should fix the 73 CI test failures where the resolver was loading
stale method signatures without the authorization parameter.
2025-11-12 15:20:48 -08:00
Omar Abdelwahab
1ea57b0a17 Fix CI: Clear Python bytecode cache before reinstall
The real issue was stale .pyc bytecode files in __pycache__ directories.
These cached files contained the old method signatures without the
authorization parameter, causing signature mismatch errors even though
the source .py files were correct.

Now clearing all __pycache__ directories and .pyc files before the
force-reinstall to ensure Python loads fresh bytecode from the updated
source files.
2025-11-12 15:16:34 -08:00
Omar Abdelwahab
025c301a9a Fix CI: Force reinstall llama-stack from source
The CI was using a cached/stale version of the package that didn't
include our authorization parameter changes. Add explicit force
reinstall step to ensure the latest source code is used.
2025-11-12 15:12:42 -08:00
Omar Abdelwahab
778b7de9cb fix: add authorization parameter to ToolRuntimeRouter and routing table
The auto-routing layer was missing the authorization parameter:
- ToolRuntimeRouter.invoke_tool() now accepts and passes authorization
- ToolRuntimeRouter.list_runtime_tools() now accepts and passes authorization
- ToolGroupsRoutingTable.list_tools() now accepts and forwards authorization
- ToolGroupsRoutingTable._index_tools() now accepts and uses authorization

This fixes the '__autorouted__' provider signature mismatch error in CI.
2025-11-12 15:08:00 -08:00
Omar Abdelwahab
bf28c215d1 chore: trigger CI - all provider signatures fixed
All ToolRuntime provider implementations now have 'authorization' parameter.
Verified locally that signatures are correct after fresh pip install.

CI note: Ensure pip install -e . runs to pick up latest code changes.
2025-11-12 15:02:13 -08:00
Omar Abdelwahab
607e3cc05c
Merge branch 'main' into add-mcp-authentication-param 2025-11-12 14:55:23 -08:00
Omar Abdelwahab
7a823bc280 fix: remove syntax errors from test files caused by sed
Fixed syntax errors in test files that were introduced by batch sed replacement:
- test_tools_with_schemas.py: Removed leftover broken comments and closing brace
- test_mcp_json_schema.py: Removed all instances of broken comment blocks

The sed command left remnants that broke Python syntax.
2025-11-12 14:54:38 -08:00
Omar Abdelwahab
d804e37e01 chore: trigger CI rebuild with fresh Python cache 2025-11-12 14:51:38 -08:00
Omar Abdelwahab
d0ec3b07b5 fix: add authorization parameter to all ToolRuntime provider implementations
Updated all ToolRuntime provider implementations to match the protocol signature:
- BraveSearchToolRuntimeImpl
- TavilySearchToolRuntimeImpl
- BingSearchToolRuntimeImpl
- WolframAlphaToolRuntimeImpl
- MemoryToolRuntimeImpl

This fixes the signature mismatch error in CI where protocol had 'authorization' parameter but implementations didn't.
2025-11-12 14:47:22 -08:00
Omar Abdelwahab
84baa5c406 feat: unify MCP authentication across Responses and Tool Runtime APIs
- Add authorization parameter to Tool Runtime API signatures (list_runtime_tools, invoke_tool)
- Update MCP provider implementation to use authorization from request body instead of provider-data
- Deprecate mcp_authorization and mcp_headers from provider-data (MCPProviderDataValidator now empty)
- Update all Tool Runtime tests to pass authorization as request body parameter
- Responses API already uses request body authorization (no changes needed)

This provides a single, consistent way to pass MCP authentication tokens across both APIs, addressing reviewer feedback about avoiding multiple configuration paths.
2025-11-12 14:41:00 -08:00
Ashwin Bharambe
fcf649b97a
feat(storage): share sql/kv instances and add upsert support (#4140)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test Llama Stack Build / generate-matrix (push) Successful in 2s
Python Package Build Test / build (3.12) (push) Failing after 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 11s
Python Package Build Test / build (3.13) (push) Failing after 17s
Test Llama Stack Build / build-single-provider (push) Successful in 31s
Test External API and Providers / test-external (venv) (push) Failing after 32s
Vector IO Integration Tests / test-matrix (push) Failing after 45s
Test Llama Stack Build / build (push) Successful in 47s
UI Tests / ui-tests (22) (push) Successful in 1m42s
Test Llama Stack Build / build-ubi9-container-distribution (push) Successful in 2m8s
Unit Tests / unit-tests (3.13) (push) Failing after 2m7s
Unit Tests / unit-tests (3.12) (push) Failing after 2m28s
Test Llama Stack Build / build-custom-container-distribution (push) Successful in 2m32s
Pre-commit / pre-commit (push) Successful in 3m20s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m33s
A few changes to the storage layer to ensure we reduce unnecessary
contention arising out of our design choices (and letting the database
layer do its correct thing):

- SQL stores now share a single `SqlAlchemySqlStoreImpl` per backend,
and `kvstore_impl` caches instances per `(backend, namespace)`. This
avoids spawning multiple SQLite connections for the same file, reducing
lock contention and aligning the cache story for all backends.

- Added an async upsert API (with SQLite/Postgres dialect inserts) and
routed it through `AuthorizedSqlStore`, then switched conversations and
responses to call it. Using native `ON CONFLICT DO UPDATE` eliminates
the insert-then-update retry window that previously caused long WAL lock
retries.

### Test Plan

Existing tests, added a unit test for `upsert()`
2025-11-12 12:14:26 -08:00
Ashwin Bharambe
492f79ca9b
fix: harden storage semantics (#4118)
Fixes issues in the storage system by guaranteeing immediate durability
for responses and ensuring background writers stay alive. Three related
fixes:

* Responses to the OpenAI-compatible API now write directly to
Postgres/SQLite inside the request instead of detouring through an async
queue that might never drain; this restores the expected
read-after-write behavior and removes the "response not found" races
reported by users.

* The access-control shim was stamping owner_principal/access_attributes
as SQL NULL, which Postgres interprets as non-public rows; fixing it to
use the empty-string/JSON-null pattern means conversations and responses
stored without an authenticated user stay queryable (matching SQLite).

* The inference-store queue remains for batching, but its worker tasks
now start lazily on the live event loop so server startup doesn't cancel
them—writes keep flowing even when the stack is launched via llama stack
run.

Closes #4115 

### Test Plan

Added a matrix entry to test our "base" suite against Postgres as the
store.
2025-11-12 10:35:39 -08:00
Derek Higgins
356f37b1ba
docs: clarify model identification uses provider_model_id not model_id (#4128)
Updated documentation to accurately reflect current behavior where
models are identified as provider_id/provider_model_id in the system.

Changes:
o Clarify that model_id is for configuration purposes only o Explain
models are accessed as provider_id/provider_model_id o Remove outdated
aliasing example that suggested model_id could be used
  as a custom identifier

This corrects the documentation which previously suggested model_id
could be used to create friendly aliases, which is not how the code
actually works.

Signed-off-by: Derek Higgins <derekh@redhat.com>
2025-11-12 10:13:26 -08:00
Ken Dreyer
94e977c257
fix(docs): link to test replay-record docs for discoverability (#4134)
Help users find the comprehensive integration testing docs by linking to
the record-replay documentation. This clarifies that the technical
README complements the main docs.
2025-11-12 10:04:56 -08:00
Francisco Arceo
eb3f9ac278
feat: allow returning embeddings and metadata from /vector_stores/ methods; disallow changing Provider ID (#4046)
# What does this PR do?

- Updates `/vector_stores/{vector_store_id}/files/{file_id}/content` to
allow returning `embeddings` and `metadata` using the `extra_query`
    -  Updates the UI accordingly to display them.

- Update UI to support CRUD operations in the Vector Stores section and
adds a new modal exposing the functionality.

- Updates Vector Store update to fail if a user tries to update Provider
ID (which doesn't make sense to allow)

```python
In  [1]: client.vector_stores.files.content(
    vector_store_id=vector_store.id, 
    file_id=file.id, 
    extra_query={"include_embeddings": True, "include_metadata": True}
)
Out [1]: FileContentResponse(attributes={}, content=[Content(text='This is a test document to check if embeddings are generated properly.\n', type='text', embedding=[0.33760684728622437, ...,], chunk_metadata={'chunk_id': '62a63ae0-c202-f060-1b86-0a688995b8d3', 'document_id': 'file-27291dbc679642ac94ffac6d2810c339', 'source': None, 'created_timestamp': 1762053437, 'updated_timestamp': 1762053437, 'chunk_window': '0-13', 'chunk_tokenizer': 'DEFAULT_TIKTOKEN_TOKENIZER', 'chunk_embedding_model': 'sentence-transformers/nomic
-ai/nomic-embed-text-v1.5', 'chunk_embedding_dimension': 768, 'content_token_count': 13, 'metadata_token_count': 9}, metadata={'filename': 'test-embedding.txt', 'chunk_id': '62a63ae0-c202-f060-1b86-0a688995b8d3', 'document_id': 'file-27291dbc679642ac94ffac6d2810c339', 'token_count': 13, 'metadata_token_count': 9})], file_id='file-27291dbc679642ac94ffac6d2810c339', filename='test-embedding.txt')
```

Screenshots of UI are displayed below:

### List Vector Store with Added "Create New Vector Store"
<img width="1912" height="491" alt="Screenshot 2025-11-06 at 10 47
25 PM"
src="https://github.com/user-attachments/assets/a3a3ddd9-758d-4005-ac9c-5047f03916f3"
/>

### Create New Vector Store
<img width="1918" height="1048" alt="Screenshot 2025-11-06 at 10 47
49 PM"
src="https://github.com/user-attachments/assets/b4dc0d31-696f-4e68-b109-27915090f158"
/>

### Edit Vector Store
<img width="1916" height="1355" alt="Screenshot 2025-11-06 at 10 48
32 PM"
src="https://github.com/user-attachments/assets/ec879c63-4cf7-489f-bb1e-57ccc7931414"
/>


### Vector Store Files Contents page (with Embeddings)
<img width="1914" height="849" alt="Screenshot 2025-11-06 at 11 54
32 PM"
src="https://github.com/user-attachments/assets/3095520d-0e90-41f7-83bd-652f6c3fbf27"
/>

### Vector Store Files Contents Details page (with Embeddings)
<img width="1916" height="1221" alt="Screenshot 2025-11-06 at 11 55
00 PM"
src="https://github.com/user-attachments/assets/e71dbdc5-5b49-472b-a43a-5785f58d196c"
/>

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
Tests added for Middleware extension and Provider failures.

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-11-12 09:59:48 -08:00
Charlie Doern
37853ca558
fix(tests): add OpenAI client connection cleanup to prevent CI hangs (#4119)
# What does this PR do?

Add explicit connection cleanup and shorter timeouts to OpenAI client
fixtures. Fixes CI deadlock after 25+ tests due to connection pool
exhaustion. Also adds 60s timeout to test_conversation_context_loading
as safety net.

## Test Plan

tests pass

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-11-12 12:17:13 -05:00
Sam El-Borai
63137f9af1
chore(stainless): add config for file header (#4126)
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

This PR adds Stainless config to specify the Meta copyright file header
for generated files.

Doing it via config instead of custom code will reduce the probability
of git conflict.

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

- review preview builds
2025-11-12 11:39:21 -05:00
Akshay Ghodake
539b9c08f3
chore(deps): update pypdf to fix DoS vulnerabilities (#4121)
Some checks failed
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Integration Tests (Replay) / generate-matrix (push) Successful in 5s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 6s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Test llama stack list-deps / generate-matrix (push) Successful in 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 13s
Python Package Build Test / build (3.12) (push) Failing after 17s
Python Package Build Test / build (3.13) (push) Failing after 17s
Test llama stack list-deps / show-single-provider (push) Successful in 50s
Test Llama Stack Build / build-single-provider (push) Successful in 53s
UI Tests / ui-tests (22) (push) Successful in 53s
Test Llama Stack Build / build (push) Successful in 52s
Test llama stack list-deps / list-deps-from-config (push) Successful in 1m18s
Test External API and Providers / test-external (venv) (push) Failing after 1m19s
Test llama stack list-deps / list-deps (push) Failing after 1m1s
Vector IO Integration Tests / test-matrix (push) Failing after 1m44s
Unit Tests / unit-tests (3.13) (push) Failing after 1m53s
Unit Tests / unit-tests (3.12) (push) Failing after 2m6s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3m7s
Test Llama Stack Build / build-custom-container-distribution (push) Successful in 3m8s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3m30s
Pre-commit / pre-commit (push) Successful in 4m1s
Update pypdf dependency to address vulnerabilities causing potential
denial of service through infinite loops or excessive memory usage when
handling malicious PDFs. The update remains fully backward compatible,
with no changes to the PdfReader API.


# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Fixes #4120

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->

Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-11-12 10:24:19 +01:00
Charlie Doern
6ca2a67a9f
chore: remove dead code (#4125)
# What does this PR do?

build_image is not used because `llama stack build` is gone. Remove it.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-11-12 10:09:14 +01:00
Omar Abdelwahab
893e186d5c
Merge branch 'main' into add-mcp-authentication-param 2025-11-11 11:21:09 -08:00
ehhuang
71b328fc4b
chore(ui): add npm package and dockerfile (#4100)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Python Package Build Test / build (3.12) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Pre-commit / pre-commit (push) Failing after 2s
Integration Tests (Replay) / generate-matrix (push) Successful in 2s
Python Package Build Test / build (3.13) (push) Failing after 1s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Test External API and Providers / test-external (venv) (push) Failing after 4s
API Conformance Tests / check-schema-compatibility (push) Successful in 9s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 3s
UI Tests / ui-tests (22) (push) Successful in 53s
# What does this PR do?
- sets up package.json for npm `llama-stack-ui` package (will update
llama-stack-ops)
- adds dockerfile for UI docker image

## Test Plan
npx:
npm build && npm pack
LLAMA_STACK_UI_PORT=8322 npx
/Users/erichuang/projects/ui/src/llama_stack_ui/llama-stack-ui-0.4.0-alpha.2.tgz

docker:
cd src/llama_stack_ui
docker build . -f Dockerfile  --tag test_ui --no-cache

❯ docker run -p 8322:8322 \
      -e LLAMA_STACK_UI_PORT=8322 \
      test_ui:latest
2025-11-11 10:40:31 -08:00
Omar Abdelwahab
945a2883de
Merge branch 'main' into add-mcp-authentication-param 2025-11-11 09:18:50 -08:00
paulengineer
e5a55f3677
docs: use 'uv pip' to avoid pitfalls of using 'pip' in virtual environment (#4122)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 2s
Python Package Build Test / build (3.12) (push) Failing after 1s
Integration Tests (Replay) / generate-matrix (push) Successful in 3s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 2s
Pre-commit / pre-commit (push) Failing after 2s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 6s
API Conformance Tests / check-schema-compatibility (push) Successful in 9s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 3s
Unit Tests / unit-tests (3.13) (push) Failing after 5s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 25s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 2s
UI Tests / ui-tests (22) (push) Successful in 53s
# What does this PR do?
In the **Detailed Tutorial**, at **Step 3**, the **Install with venv**
option creates a new virtual environment `client`, activates it then
attempts to install the llama-stack-client using pip.
```
uv venv client --python 3.12
source client/bin/activate
pip install llama-stack-client    <- this is the problematic line
```
However, the pip command will likely fail because the `uv venv` command
doesn't, by default, include adding the pip command to the virtual
environment that is created. The pip command will error either because
pip doesn't exist at all, or, if the pip command does exist outside of
the virtual environment, return a different error message. The latter
may be unclear to the user why it is failing.

This PR changes 'pip' to 'uv pip', allowing the install action to
function in the virtual environment as intended, and without the need
for pip to be installed.

<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->

## Test Plan
1. Use linux or WSL (virtual environments on Windows use `Scripts`
folder instead of `bin` [virtualenv
#993ba13](993ba1316a)
which doesn't align with the tutorial)
2. Clone the `llama-stack` repo
3. Run the following and verify success:
```
uv venv client --python 3.12
source client/bin/activate
```
5. Run the updated command:
```
uv pip install llama-stack-client
```
6. Observe the console output confirms that the virtual environment
`client` was used:

> Using Python 3.12.3 environment at: **client**
2025-11-11 07:49:03 -05:00