# What does this PR do?
During env var replacement, we're implicitly converting all config types
to their apparent types (e.g., "true" to True, "123" to 123). This may
be arguably useful for when doing an env var substitution, as those are
always strings, but we should definitely avoid touching config values
that have explicit types and are uninvolved in env var substitution.
## Test Plan
Unit
Recording files use a predictable naming format, making the SQLite index
redundant. The binary SQLite file was causing frequent git conflicts.
Simplify by calculating file paths directly from request hashes.
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
As described in #3134 a langchain example works against openai's
responses impl, but not against llama stack's. This turned out to be due
to the order of the inputs. The langchain example has the two function
call outputs first, followed by each call result in turn. This seems to
be valid as it is accepted by openai's impl. However in llama stack,
these inputs are converted to chat completion inputs and the resulting
order for that api is not accpeted by openai.
This PR fixes the issue by ensuring that the converted chat completions
inputs are in the expected order.
Closes#3134
## Test Plan
Added unit and integration tests. Verified this fixes original issue as
reported.
---------
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR removes `init()` from `LlamaStackAsLibrary`
Currently client.initialize() had to be invoked by user.
To improve dev experience and to avoid runtime errors, this PR init
LlamaStackAsLibrary implicitly upon using the client.
It prevents also multiple init of the same client, while maintaining
backward ccompatibility.
This PR does the following
- Automatic Initialization: Constructor calls initialize_impl()
automatically.
- Client is fully initialized after __init__ completes.
- Prevents consecutive initialization after the client has been
successfully initialized.
- initialize() method still exists but is now a no-op.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
fixes https://github.com/meta-llama/llama-stack/issues/2946
---------
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Adds flexible CORS (Cross-Origin Resource Sharing) configuration support
to the FastAPI
server with both local development and explicit configuration modes:
- **Local development mode**: `cors: true` enables localhost-only access
with regex
pattern `https?://localhost:\d+`
- **Explicit configuration mode**: Specific origins configuration with
credential support
and validation
- Prevents insecure combinations (wildcards with credentials)
- FastAPI CORSMiddleware integration via `model_dump()`
Addresses the need for configurable CORS policies to support web
frontends and
cross-origin API access while maintaining security.
Closes#2119
## Test Plan
1. Ran Unit Tests.
2. Manual tests: FastAPI middleware integration with actual HTTP
requests
- Local development mode localhost access validation
- Explicit configuration mode origins validation
- Preflight OPTIONS request handling
Some screenshots of manual tests.
<img width="1920" height="927" alt="image"
src="https://github.com/user-attachments/assets/79322338-40c7-45c9-a9ea-e3e8d8e2f849"
/>
<img width="1911" height="1037" alt="image"
src="https://github.com/user-attachments/assets/1683524e-b0c9-48c9-a0a5-782e949cde01"
/>
cc: @leseb @rhuss @franciscojavierarceo
# What does this PR do?
Handles MCP tool calls in a previous response
Closes#3105
## Test Plan
Made call to create response with tool call, then made second call with
the first linked through previous_response_id. Did not get error.
Also added unit test.
Signed-off-by: Gordon Sim <gsim@redhat.com>
# What does this PR do?
This PR adds a step in pre-commit to enforce using `llama_stack` logger.
Currently, various parts of the code base uses different loggers. As a
custom `llama_stack` logger exist and used in the codebase, it is better
to standardize its utilization.
Signed-off-by: Mustafa Elbehery <melbeher@redhat.com>
Co-authored-by: Matthew Farrellee <matt@cs.wisc.edu>
# What does this PR do?
Refactors the OpenAI response conversion utilities by moving helper functions from `openai_responses.py` to `utils.py`. Adds unit tests.
# What does this PR do?
Refactors the OpenAI responses implementation by extracting streaming and tool execution logic into separate modules. This improves code organization by:
1. Creating a new `StreamingResponseOrchestrator` class in `streaming.py` to handle the streaming response generation logic
2. Moving tool execution functionality to a dedicated `ToolExecutor` class in `tool_executor.py`
## Test Plan
Existing tests
The OpenAI compatibility layer was incorrectly importing
ChatCompletionMessageToolCallParam instead of the
ChatCompletionMessageFunctionToolCall class. This caused "Cannot
instantiate typing.Union" errors when processing agent requests with
tool calls.
Closes: #3141
Signed-off-by: Derek Higgins <derekh@redhat.com>
# What does this PR do?
Adds content part streaming events to the OpenAI-compatible Responses API to support more granular streaming of response content. This introduces:
1. New schema types for content parts: `OpenAIResponseContentPart` with variants for text output and refusals
2. New streaming event types:
- `OpenAIResponseObjectStreamResponseContentPartAdded` for when content parts begin
- `OpenAIResponseObjectStreamResponseContentPartDone` for when content parts complete
3. Implementation in the reference provider to emit these events during streaming responses. Also emits MCP arguments just like function call ones.
## Test Plan
Updated existing streaming tests to verify content part events are properly emitted
# What does this PR do?
1. Updates `AgentPersistence.list_sessions()` to properly filter out
`Turn` keys from `Session` keys.
2. Adds a suite of unit tests to confirm the `list_sessions()` behavior
and tests the failed sample in
https://github.com/meta-llama/llama-stack/issues/3048
## Fixes https://github.com/meta-llama/llama-stack/issues/3048
## Test Plan
Unit tests added.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
This PR implements hybrid search for Milvus DB based on the inbuilt
milvus support.
To test:
```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s
--tb=long --disable-warnings --asyncio-mode=auto
```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
Extend the Shields Protocol and implement the capability to unregister
previously registered shields and CLI for shields management.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#2581
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
First of, test API for shields
1. Install and start Ollama:
`ollama serve`
2. Pull Llama Guard Model in Ollama:
`ollama pull llama-guard3:8b`
3. Configure env variables:
```
export ENABLE_OLLAMA=ollama
export OLLAMA_URL=http://localhost:11434
```
4. Build Llama Stack distro:
`llama stack build --template starter --image-type venv `
5. Start Llama Stack server:
`llama stack run starter --port 8321`
6. Check if Ollama model is available:
`curl -X GET http://localhost:8321/v1/models | jq '.data[] |
select(.provider_id=="ollama")'`
7. Register a new Shield using Ollama provider:
```
curl -X POST http://localhost:8321/v1/shields \
-H "Content-Type: application/json" \
-d '{
"shield_id": "test-shield",
"provider_id": "llama-guard",
"provider_shield_id": "ollama/llama-guard3:8b",
"params": {}
}'
```
`{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}%
`
8. Check if shield was registered:
`curl -X GET http://localhost:8321/v1/shields/test-shield`
`{"identifier":"test-shield","provider_resource_id":"ollama/llama-guard3:8b","provider_id":"llama-guard","type":"shield","owner":{"principal":"","attributes":{}},"params":{}}%
`
9. Run shield:
```
curl -X POST http://localhost:8321/v1/safety/run-shield \
-H "Content-Type: application/json" \
-d '{
"shield_id": "test-shield",
"messages": [
{
"role": "user",
"content": "How can I hack into someone computer?"
}
],
"params": {}
}'
```
`{"violation":{"violation_level":"error","user_message":"I can't answer
that. Can I help with something
else?","metadata":{"violation_type":"S2"}}}% `
10. Unregister shield:
`curl -X DELETE http://localhost:8321/v1/shields/test-shield`
`null% `
11. Verify shield was deleted:
`curl -X GET http://localhost:8321/v1/shields/test-shield`
`{"detail":"Invalid value: Shield 'test-shield' not found"}%`
All tests passed ✅
```
========================================================================== 430 passed, 194 warnings in 19.54s ==========================================================================
/Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/litellm/llms/custom_httpx/async_client_cleanup.py:78: RuntimeWarning: coroutine 'close_litellm_async_clients' was never awaited
loop.close()
RuntimeWarning: Enable tracemalloc to get the object allocation traceback
Wrote HTML report to htmlcov-3.12/index.html
```
As the title says. Distributions is in, Templates is out.
`llama stack build --template` --> `llama stack build --distro`. For
backward compatibility, the previous option is kept but results in a
warning.
Updated `server.py` to remove the "config_or_template" backward
compatibility since it has been a couple releases since that change.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR is responsible for removal of Conda support in Llama Stack
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#2539
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
Adds support to Vector store Open AI APIs in Qdrant.
<!-- If resolving an issue, uncomment and update the line below -->
Closes#2463
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
This PR (1) enables the files API for Weaviate and (2) enables
integration tests for Weaviate, which adds a docker container to the
github action.
This PR also handles a couple of edge cases for in creating the
collection and ensuring the tests all pass.
## Test Plan
CI enabled
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Improve user experience by providing specific guidance when no API key
is available, showing both provider data header and config options with
the correct field name for each provider.
Also adds comprehensive test coverage for API key resolution scenarios.
addresses #2990 for providers using litellm openai mixin
## Test Plan
`./scripts/unit-tests.sh
tests/unit/providers/inference/test_litellm_openai_mixin.py`
# What does this PR do?
- Initialize route_impls to None in constructor to prevent
AttributeError
- Consolidate initialization checks to single point in request() method
- Improve error message to be more helpful ("Please call initialize()
first")
- Add comprehensive test suite to prevent regressions
The library client now has better error handling when users forget to
call initialize(), showing a clear ValueError instead of confusing
AttributeError. All initialization validation is now centralized in the
request() method, with internal methods (_call_non_streaming,
_call_streaming, _convert_body) relying on this single check for
cleaner, more maintainable code.
closes#2943
## Test Plan
`./scripts/unit-tests.sh`
Implements a comprehensive recording and replay system for inference API
calls that eliminates dependency on online inference providers during
testing. The system treats inference as deterministic by recording real
API responses and replaying them in subsequent test runs. Applies to
OpenAI clients (which should cover many inference requests) as well as
Ollama AsyncClient.
For storing, we use a hybrid system: Sqlite for fast lookups and JSON
files for easy greppability / debuggability.
As expected, tests become much much faster (more than 3x in just
inference testing.)
```bash
LLAMA_STACK_TEST_INFERENCE_MODE=record LLAMA_STACK_TEST_RECORDING_DIR=<...> \
uv run pytest -s -v tests/integration/inference \
--stack-config=starter \
-k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \
--text-model="ollama/llama3.2:3b-instruct-fp16" \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2
```
```bash
LLAMA_STACK_TEST_INFERENCE_MODE=replay LLAMA_STACK_TEST_RECORDING_DIR=<...> \
uv run pytest -s -v tests/integration/inference \
--stack-config=starter \
-k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \
--text-model="ollama/llama3.2:3b-instruct-fp16" \
--embedding-model=sentence-transformers/all-MiniLM-L6-v2
```
- `LLAMA_STACK_TEST_INFERENCE_MODE`: `live` (default), `record`, or
`replay`
- `LLAMA_STACK_TEST_RECORDING_DIR`: Storage location (must be specified
for record or replay modes)
**What:**
- Added OpenAIChatCompletionTextOnlyMessageContent type for text-only
content validation
- Modified OpenAISystemMessageParam, OpenAIAssistantMessageParam,
OpenAIDeveloperMessageParam, and OpenAIToolMessageParam to use text-only
content type instead of mixed content
- OpenAIUserMessageParam unchanged - still accepts both text and images
- Updated OpenAPI spec files to reflect text-only content restrictions
in schemas
closes#2894
**Why:**
- Enforces OpenAI API compatibility by restricting image content to user
messages only
- Prevents API misuse where images might be sent in message types that
don't support them
- Aligns with OpenAI's actual API behavior where only user messages can
contain multimodal content
- Improves type safety and validation at the API boundary
**Test plan:**
- Added comprehensive parametrized tests covering all 5 OpenAI message
types
- Tests verify text string acceptance for all message types
- Tests verify text list acceptance for all message types
- Tests verify image rejection for system/assistant/developer/tool
messages (ValidationError expected)
- Tests verify user messages still accept images (backward compatibility
maintained)
# What does this PR do?
- Add base_url field to OpenAIConfig with default
"https://api.openai.com/v1"
- Update sample_run_config to support OPENAI_BASE_URL environment
variable
- Modify get_base_url() to return configured base_url instead of
hardcoded value
- Add comprehensive test suite covering:
- Default base URL behavior
- Custom base URL from config
- Environment variable override
- Config precedence over environment variables
- Client initialization with configured URL
- Model availability checks using configured URL
This enables users to configure custom OpenAI-compatible API endpoints
via environment variables or configuration files.
Closes#2910
## Test Plan
run unit tests
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Added `set -e` to the beginning of the unit test script to ensure the
script exits on failure and correctly fails the CI when tests do not
pass.
- Fixed all unit tests that were silently failing in the CI.
- Fixed Python 3.13 unit test CI failing silently.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#2877
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
- **Previously:** Unit tests passing in CI eventhough it failed 11 tests
->
[CI-run](4683681501 (step):4:2097)
- **Made the fix. Now, ensuring CI fails as expected on test failures:**
Unit tests failing in CI with 1 failed test ->
[CI-run](4684234247 (step):4:1506)
- This PR shows the CI passing and all unit tests passing.
# What does this PR do?
Enable Chroma inline unit tests and fix integration tests.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
Today, external providers are installed via the `external_providers_dir`
in the config. This necessitates users to understand the `ProviderSpec`
and set up their directories accordingly. This process splits up the
config for the stack across multiple files, directories, and formats.
Most (if not all) external providers today have a
[get_provider_spec](559cb18fbb/src/ramalama_stack/provider.py (L9))
method that sits unused. Utilizing this method rather than the
providers.d route allows for a much easier installation process for
external providers and limits the amount of extra configuration a
regular user has to do to get their stack off the ground.
To accomplish this and wire it throughout the build process, Introduce
the concept of a `module` for users to specify for an external provider
upon build time. In order to facilitate this, align the build and run
spec to use `Provider` class rather than the stringified provider_type
that build currently uses.
For example, say this is in your build config:
```
- provider_id: ramalama
provider_type: remote::ramalama
module: ramalama_stack
```
during build (in the various `build_...` scripts), additionally to
installing any pip dependencies we will also install this module and use
the `get_provider_spec` method to retrieve the ProviderSpec that is
currently specified using `providers.d`.
In production so far, providing instructions for installing external
providers for users has been difficult: they need to install the module
as a pre-req, create the providers.d directory, copy in the provider
spec, and also copy in the necessary build/run yaml files. Accessing an
external provider should be as easy as possible, and pointing to its
installable module aligns more with the rest of our build and dependency
management process.
For now, `external_providers_dir` still exists as an alternate more
declarative method of using external providers.
## Test Plan
added an integration test installing an external provider from module
and more unit test coverage for `get_provider_registry`
( the warning in yellow is expected, the module is installed inside of
the build env, not where we are running the command)
<img width="1119" height="400" alt="Screenshot 2025-07-24 at 11 30
48 AM"
src="https://github.com/user-attachments/assets/1efbaf45-b9e8-451a-bd63-264ed664706d"
/>
<img width="1154" height="618" alt="Screenshot 2025-07-24 at 11 31
14 AM"
src="https://github.com/user-attachments/assets/feb2b3ea-c5dd-418e-9662-9a3bd5dd6bdc"
/>
---------
Signed-off-by: Charlie Doern <cdoern@redhat.com>
# What does this PR do?
- Added ability to specify `required_scope` when declaring an API. This
is part of the `@webmethod` decorator.
- If auth is enabled, a user can access an API only if
`user.attributes['scope']` includes the `required_scope`
- We add `required_scope='telemetry.read'` to the telemetry read APIs.
## Test Plan
CI with added tests
1. Enable server.auth with github token
2. Observe `client.telemetry.query_traces()` returns 403
This flips #2823 and #2805 by making the Stack periodically query the
providers for models rather than the providers going behind the back and
calling "register" on to the registry themselves. This also adds support
for model listing for all other providers via `ModelRegistryHelper`.
Once this is done, we do not need to manually list or register models
via `run.yaml` and it will remove both noise and annoyance (setting
`INFERENCE_MODEL` environment variables, for example) from the new user
experience.
In addition, it adds a configuration variable `allowed_models` which can
be used to optionally restrict the set of models exposed from a
provider.
# What does this PR do?
This PR implements the openai compatible endpoints for chromadb
Closes#2462
## Test Plan
Ran ollama llama stack server and ran the command
`pytest -sv --stack-config=http://localhost:8321
tests/integration/vector_io/test_openai_vector_stores.py
--embedding-model all-MiniLM-L6-v2`
8 failed, 27 passed, 8 skipped, 1 xfailed
The failed ones are regarding files api
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: sarthakdeshpande <sarthak.deshpande@engati.com>
Co-authored-by: Francisco Javier Arceo <farceo@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I fixed test_access_policy() function providing provider_model_id in
each register model endpoint to pass assertions.
Initially I faced this issue:
```
tests/unit/server/test_quota.py::test_authenticated_quota_allows_up_to_limit
tests/unit/server/test_quota.py::test_authenticated_quota_blocks_after_limit
tests/unit/server/test_quota.py::test_anonymous_quota_allows_up_to_limit
tests/unit/server/test_quota.py::test_anonymous_quota_blocks_after_limit
/Users/iamiller/GitHub/llama-stack/.venv/lib/python3.12/site-packages/aiosqlite/core.py:105: DeprecationWarning: The default datetime adapter is deprecated as of Python 3.12; see the sqlite3 documentation for suggested replacement recipes
result = function()
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================================== short test summary info ===============================================================================
FAILED tests/unit/server/test_access_control.py::test_access_policy - AssertionError: assert 'test_provider/model-1' == 'model-1'
==================================================================== 1 failed, 436 passed, 194 warnings in 20.09s ====================================================================
```
After resolved, all works:
```
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
========================================================================= 437 passed, 194 warnings in 19.41s =========================================================================
```
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run ` ./scripts/unit-tests.sh`
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
I noticed a few issues with my implementation of the search mode
validation for RagQuery.
This PR replaces the check for search mode in RagQuery with a Literal.
There were issues before with
```
TypeError: Object of type RAGSearchMode is not JSON serializable
```
When using
```
query_config = RAGQueryConfig(max_chunks=6, mode="vector").model_dump()
```
It also fixes the fact that despite user input "vector" was always the
used search mode.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Verify that a chosen search mode works when using Rag Query or use below
agent config:
```
agent = Agent(
client,
model=model_id,
instructions="You are a helpful assistant",
tools=[
{
"name": "builtin::rag/knowledge_search",
"args": {
"vector_db_ids": [vector_db_id],
"query_config": {
"mode": "keyword",
"max_chunks": 6
}
},
}
],
)
```
Running Unit Tests:
```
uv sync --extra dev
uv run pytest tests/unit/rag/test_rag_query.py -v
```
# What does this PR do?
add an `OpenAIMixin` for use by inference providers who remote endpoints
support an OpenAI compatible API.
use is demonstrated by refactoring
- OpenAIInferenceAdapter
- NVIDIAInferenceAdapter (adds embedding support)
- LlamaCompatInferenceAdapter
## Test Plan
existing unit and integration tests
This PR updates model registration and lookup behavior to be slightly
more general / flexible. See
https://github.com/meta-llama/llama-stack/issues/2843 for more details.
Note that this change is backwards compatible given the design of the
`lookup_model()` method.
## Test Plan
Added unit tests
# What does this PR do?
Refactors the vector store routing logic by moving OpenAI-compatible
vector store operations from the `VectorIORouter` to the
`VectorDBsRoutingTable`.
Closes https://github.com/meta-llama/llama-stack/issues/2761
## Test Plan
Added unit tests to cover new routing logic and ACL checks.
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The pre-commit workflow was failing in the main branch and removing
`@pytest.mark.asyncio `from `test_get_raw_document_text.py` fixed that.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
- Added coverage badge to README. - [See my
fork](https://github.com/ChristianZaccaria/llama-stack)
- Added a GitHub Actions workflow that runs the tests and updates the
coverage badge. - [See
run](4574811323)
- Documented steps in `testing.md` for running the tests locally, and
viewing the `html` report.
- Excluded non-essential files from coverage reporting to provide a more
accurate measurement.
Automatically created PR to update coverage badge:
https://github.com/ChristianZaccaria/llama-stack/pull/9
# Note for reviewers
1. Currently the coverage report shows a 45% coverage. Wondering if
there are other files or directories that should also be excluded from
the report to increase the percentage. The directories with the least
test coverage are `llama_stack/cli`, `llama_stack/models`, and
`llama_stack/ui`. - Should we exclude these?
2. **[Required]** The `GITHUB_TOKEN` should have write permissions to
open a PR to update the coverage badge.
# GitHub Issue
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes#2355
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
The `testing.md` file describes how to run the unit tests locally.