- streaming.py: Extract has_tool_calls boolean for readability
- streaming.py: Replace nested function checks with assertions
- streaming.py: Fix AllowedToolsFilter to use tool_names instead of allowed/disallowed
- streaming.py: Add comment explaining tool_context can be None
- streaming.py, utils.py: Clarify Pydantic/dict compatibility comments
- utils.py: Document list invariance vs Sequence covariance in type signature
- utils.py: Clarify list_shields runtime availability comment
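A minimal sketch of the list invariance vs Sequence covariance point noted above; the message classes are hypothetical stand-ins, not the actual llama_stack types.
```python
from collections.abc import Sequence


class Message: ...


class UserMessage(Message): ...


def takes_list(messages: list[Message]) -> None: ...


def takes_sequence(messages: Sequence[Message]) -> None: ...


user_messages: list[UserMessage] = [UserMessage()]

# Rejected by mypy: list is invariant, so list[UserMessage] is not a list[Message].
# takes_list(user_messages)

# Accepted: Sequence is covariant (read-only), so Sequence[UserMessage] is a Sequence[Message].
takes_sequence(user_messages)
```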
Resolved all remaining mypy errors in streaming.py by adding proper None checks, union type narrowing, and type annotations:
- Fixed ToolContext None checks before accessing attributes
- Added isinstance checks for OpenAIAssistantMessageParam before accessing tool_calls
- Added None checks for response_tool_call and tool_call.function
- Fixed AllowedToolsFilter attribute access (allowed/disallowed)
- Added explicit type annotations to ensure str types (not str | None)
- Added type ignore comments for dict/TypedDict compatibility issues
All meta reference agent files now pass mypy with 0 errors (down from 280 errors).
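A minimal sketch of the narrowing pattern described above, assuming a simplified helper around the SDK message type; the import path is an assumption and this is not the actual streaming.py code.
```python
from llama_stack.apis.inference import OpenAIAssistantMessageParam  # assumed import path


def collect_tool_call_names(message: object) -> list[str]:
    names: list[str] = []
    # Only assistant messages carry tool_calls, so narrow the union first.
    if not isinstance(message, OpenAIAssistantMessageParam):
        return names
    if not message.tool_calls:  # covers both None and an empty list
        return names
    for tool_call in message.tool_calls:
        # function and name are optional in the SDK types; skip incomplete entries.
        if tool_call.function is None or tool_call.function.name is None:
            continue
        names.append(tool_call.function.name)  # mypy now sees a plain str
    return names
```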
Co-Authored-By: Claude <noreply@anthropic.com>
- Fixed openai_responses.py: proper type narrowing with match statements,
assertions for None checks, explicit list typing
- Fixed utils.py: added Sequence support, union type narrowing, None handling
- Fixed streaming.py signature: accept optional instructions parameter
- tool_executor.py and agent_instance.py: automatically fixed by API changes
Remaining: 30 errors in streaming.py and one other file
Co-Authored-By: Claude <noreply@anthropic.com>
Added support for collections.abc.Sequence types in the schema generator to fix OpenAPI spec generation after changing API types from list to Sequence.
Changes:
- Added is_generic_sequence() and unwrap_generic_sequence() helper functions in inspection.py
- Updated python_type_to_name() in name.py to handle Sequence types (treats them as List for schema purposes)
- Fixed force parameter propagation in recursive python_type_to_name() calls to ensure unions within generic types are handled correctly
- Updated docstring_to_schema() to call python_type_to_name() with force=True
- Regenerated OpenAPI specs with updated type handling
This enables using Sequence instead of list in API definitions while maintaining schema generation compatibility.
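The new helpers might look roughly like this; a sketch built on typing.get_origin/get_args, not the actual inspection.py implementation.
```python
import collections.abc
import typing


def is_generic_sequence(typ: object) -> bool:
    # True for Sequence[...] aliases (typing.Sequence or collections.abc.Sequence),
    # False for list[...], str, bytes, and non-generic types.
    return typing.get_origin(typ) is collections.abc.Sequence


def unwrap_generic_sequence(typ: object) -> object:
    # Sequence[int] -> int, so the schema generator can treat it like List[int].
    (element_type,) = typing.get_args(typ)
    return element_type


assert is_generic_sequence(collections.abc.Sequence[int])
assert unwrap_generic_sequence(typing.Sequence[int]) is int
```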
Fixed all 76 type errors in openai_responses.py through proper type-safe refactoring:
Changes made:
- Fixed return type signature for _process_input_with_previous_response to include ToolContext
- Replaced .copy() with list() for Sequence types (line 94)
- Added assertions for text and max_infer_iters to narrow types from None
- Properly typed input_items_data as list[OpenAIResponseInput] to avoid list invariance
- Properly typed conversation_items as list[ConversationItem] to avoid list invariance
- Properly typed output_items as list[ConversationItem] to avoid list invariance
- Fixed StreamingResponseOrchestrator signature to accept str | None for instructions
All fixes use proper type narrowing and union typing without any type: ignore comments.
- Change list to Sequence in OpenAI response API types to fix list invariance issues
- Use match statements for proper union type narrowing in stream chunk handling
- Reduces errors in openai_responses.py from 76 to 12 (84% reduction)
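A small sketch of the match-statement narrowing applied to stream chunks; the chunk classes here are hypothetical stand-ins for the real union members.
```python
from dataclasses import dataclass


@dataclass
class TextDeltaChunk:
    text: str


@dataclass
class ToolCallDeltaChunk:
    tool_name: str
    arguments: str


def describe_chunk(chunk: TextDeltaChunk | ToolCallDeltaChunk) -> str:
    # Each case arm narrows the union to a concrete type, so attribute access
    # is checked without isinstance boilerplate or type: ignore comments.
    match chunk:
        case TextDeltaChunk(text=text):
            return f"text delta: {text}"
        case ToolCallDeltaChunk(tool_name=name, arguments=args):
            return f"tool call delta: {name}({args})"
        case _:
            raise AssertionError(f"unexpected chunk type: {type(chunk)}")
```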
Changed comment from "message.content uses list[AliasType] but mypy expects Iterable[BaseType]" to "OpenAI SDK uses aliased types internally that mypy sees as incompatible with base types".
This is more accurate - the OpenAI SDK's message parameter types use aliased names (like OpenAIChatCompletionContentPartTextParam) internally in their type annotations, and mypy cannot match these with base type names (ChatCompletionContentPartTextParam) even though they're the same types at runtime.
Verified that importing and using base types directly doesn't resolve the issue because the SDK's internal type annotations still use the aliased names.
Co-Authored-By: Claude <noreply@anthropic.com>
- Simplify provider_resource_id assignment with assertion (review comment 1)
- Fix comment placement order (review comment 2)
- Refactor tool_calls list building to avoid union-attr suppression (review comment 3)
- Rename response_format to response_format_dict to avoid shadowing (review comment 4)
- Update type: ignore comments for message.content with accurate explanation of OpenAI SDK type alias resolution issue (review comment 5)
- Add assertions in litellm_openai_mixin to validate provider_resource_id is not None
Co-Authored-By: Claude <noreply@anthropic.com>
Resolves 111 mypy errors across OpenAI inference compatibility utilities:
litellm_openai_mixin.py (23 errors → 1 unavoidable):
- Add type annotation for input_dict parameter
- Fix JsonSchemaResponseFormat dict conversion and manipulation
- Add None checks for tool_config access with walrus operator (see the sketch after this list)
- Fix get_api_key() to properly handle None key_field
- Add model_store None checks in all three OpenAI methods
- Add type annotations for provider_resource_id variable
- Add type: ignore for litellm external library returns
openai_compat.py (88 errors → 0):
- Add None checks for TopPSamplingStrategy temperature and top_p
- Add type: ignore for no-any-return in text_from_choice
- Fix union-attr errors in logprobs iteration with None checks
- Add None checks for choice.text and finish_reason
- Fix OpenAICompatCompletionChoice.message attribute access
- Filter tool_calls to only valid ToolCall objects
- Add type annotations for converted_messages and choices lists
- Fix TypedDict ** expansion issues in message conversions
- Add type: ignore for function dict index operations
- Fix tool_choice and response_format type conversions
- Add type annotations for lls_tools variable
- Fix sampling strategy assignment with proper ordering
- Add None checks for buffer string concatenations
- Rename shadowed tool_call variable to parsed_tool_call
- Fix message content and tool_calls type conversions
- Add isinstance checks before attribute access on content
- Fix OpenAI finish_reason type literal conversions
Code remains clean and readable with strategic use of type: ignore
pragmas only where necessary for external library compatibility.
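A rough sketch of the walrus-operator None check mentioned for tool_config above; the request and attribute names are illustrative, not the mixin's actual fields.
```python
from typing import Any


def build_tool_kwargs(request: Any) -> dict[str, Any]:
    kwargs: dict[str, Any] = {}
    # Bind and narrow the optional tool_config in one expression.
    if (tool_config := getattr(request, "tool_config", None)) is not None:
        if tool_config.tool_choice is not None:
            kwargs["tool_choice"] = tool_config.tool_choice
    return kwargs
```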
Co-Authored-By: Claude <noreply@anthropic.com>
- Add None checks for optional shield and client_tools lists
- Convert StepType.X.value to StepType.X enum values
- Convert ISO timestamp strings to datetime objects (see the sketch after this list)
- Add type annotations (output_attachments, tool_name_to_def)
- Fix union type discrimination with isinstance checks
- Fix max_infer_iters optional comparison
- Filter tool_calls to exclude strings, keep only ToolCall objects
- Fix identifier handling for BuiltinTool enum conversion
- Fix Attachment API parameter (url → content)
- Add type: ignore for OpenAI response format compatibility
Fixes all 81 mypy errors in agent_instance.py.
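One of the conversions above in sketch form; a minimal helper under the assumption that persisted steps may carry either ISO strings or datetime objects.
```python
from datetime import datetime


def normalize_started_at(started_at: str | datetime) -> datetime:
    # Persisted turns may store timestamps as ISO strings; convert them so the
    # step objects always receive a datetime.
    if isinstance(started_at, str):
        return datetime.fromisoformat(started_at)
    return started_at
```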
Co-Authored-By: Claude <noreply@anthropic.com>
Fixed 35 errors, 46 remaining:
- Add isinstance() checks for union type discrimination
- Fix list type annotations for Message types
- Convert strings to datetime/StepType where needed
- Use assert to narrow AgentTurnCreateRequest vs AgentTurnResumeRequest
- Add explicit type annotations to avoid inference issues
Still to fix:
- Remaining str to datetime/StepType conversions
- Optional list handling for shields
- Type annotations for tool maps
- List variance issues for input_messages
- Fix turn_id variable redefinition
- Add proper type annotations using Any where needed
- Fix union-attr errors with getattr and walrus operator
- Fix arg-type errors for datetime/enum conversions
- Add type: ignore for list invariance issues
- Remove event variable reuse to satisfy type checker
- Use proper type narrowing for tool execution paths
Patterns established:
- Use getattr() with walrus operator for optional attributes
- Use type: ignore for runtime-correct but mypy-incompatible cases
- Separate event variables by type to avoid union conflicts
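A small sketch of the "separate event variables by type" pattern; the event classes are hypothetical stand-ins.
```python
from dataclasses import dataclass


@dataclass
class StepStartEvent:
    step_id: str


@dataclass
class StepCompleteEvent:
    step_id: str
    output: str


def emit_step_events(step_id: str, output: str) -> list[StepStartEvent | StepCompleteEvent]:
    # Reusing one `event` name for both types forces mypy to juggle a union at
    # every assignment; distinct names keep each binding a concrete type.
    start_event = StepStartEvent(step_id=step_id)
    complete_event = StepCompleteEvent(step_id=step_id, output=output)
    return [start_event, complete_event]
```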
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.
Here's the situation: assume a remote inference provider that works only
when users provide their own API keys via the
`X-LlamaStack-Provider-Data` header. By definition, we cannot list its
models and hence cannot update our routing registry. But because model
IDs now _require_ a provider ID prefix, we can identify which provider
to route to and let that provider decide.
Note that we still consult the registry, since it may hold a
pre-registered alias; we just no longer fail outright when the lookup
misses.
Also updated the inference router so that responses echo the _exact_
model string from the request.
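A rough sketch of the lookup-then-fallback routing described above; the registry shape and function name are illustrative stand-ins, not the actual router code.
```python
def resolve_provider(model_id: str, registry: dict[str, str]) -> str:
    """Return the provider_id to route to: registry first, then the prefix."""
    if model_id in registry:
        # A pre-registered alias wins.
        return registry[model_id]
    if "/" in model_id:
        # Fully qualified `provider_id/model_id`: trust the prefix and let the
        # provider validate the rest (it may need per-request API keys).
        provider_id, _, _ = model_id.partition("/")
        return provider_id
    raise ValueError(f"unknown model and no provider prefix: {model_id}")
```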
## Test Plan
Added an integration test
Closes #3929
---------
Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
Adds type stubs and fixes mypy errors for better type coverage.
Changes:
- Added type_checking dependency group with type stubs (torchtune, trl,
etc.)
- Added lm-format-enforcer to pre-commit hook
- Created HFAutoModel Protocol for type-safe HuggingFace model handling
(sketched below)
- Added mypy.overrides for untyped libraries (torchtune, fairscale,
etc.)
- Fixed type issues in post-training providers, databricks, and
api_recorder
Note: ~1,200 errors remain in excluded files (see pyproject.toml exclude
list).
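The HFAutoModel Protocol mentioned above might look roughly like this; the method names are assumptions for illustration, not the actual definition.
```python
from typing import Any, Protocol


class HFAutoModel(Protocol):
    """Structural type for the slice of HuggingFace model behavior we rely on."""

    def save_pretrained(self, save_directory: str, **kwargs: Any) -> None: ...

    def to(self, device: Any) -> "HFAutoModel": ...
```
Any object exposing these methods (for example, a transformers model) satisfies the protocol without inheriting from it.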
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
- Fix OpenAI SDK NotGiven/Omit type mismatches in embeddings calls
- Fix incorrect OpenAIChatCompletionChunk import in vllm provider
- Refactor to avoid type:ignore comments by using conditional kwargs
## Changes
**openai_mixin.py (9 errors fixed):**
- Build kwargs conditionally for embeddings.create() to avoid
NotGiven/Omit mismatch
- Only include parameters when they have actual values (not None)
**gemini.py (9 errors fixed):**
- Apply same conditional kwargs pattern
- Add missing Any import
**vllm.py (2 errors fixed):**
- Use correct OpenAIChatCompletionChunk from llama_stack.apis.inference
- Remove incorrect alias from openai package
## Technical Notes
The OpenAI SDK has a type system quirk where `NOT_GIVEN` has type
`NotGiven` but parameter signatures expect `Omit`. By only passing
parameters with actual values, we avoid this mismatch entirely without
needing `# type: ignore` comments.
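A sketch of the conditional-kwargs pattern against the embeddings endpoint; the wrapper function is illustrative, while `dimensions` and `user` are real optional parameters of `embeddings.create()`.
```python
from typing import Any

from openai import AsyncOpenAI


async def create_embeddings(
    client: AsyncOpenAI,
    model: str,
    texts: list[str],
    dimensions: int | None = None,
    user: str | None = None,
) -> Any:
    # Only forward parameters that have real values; omitted keys never hit the
    # NotGiven/Omit mismatch in the SDK's signatures.
    kwargs: dict[str, Any] = {}
    if dimensions is not None:
        kwargs["dimensions"] = dimensions
    if user is not None:
        kwargs["user"] = user
    return await client.embeddings.create(model=model, input=texts, **kwargs)
```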
---------
Co-authored-by: Claude <noreply@anthropic.com>
Fixes mypy type errors in provider utilities and testing infrastructure:
- `mcp.py`: Cast incompatible client types, wrap image data properly
- `batches.py`: Rename walrus variable to avoid shadowing
- `api_recorder.py`: Use cast for Pydantic field annotation
No functional changes.
---------
Co-authored-by: Claude <noreply@anthropic.com>
## Summary
Fixes mypy type errors across 4 model implementation files (Phase 2d of
mypy suppression removal plan):
- `src/llama_stack/models/llama/llama3/multimodal/image_transform.py`
(10 errors fixed)
- `src/llama_stack/models/llama/checkpoint.py` (2 errors fixed)
- `src/llama_stack/models/llama/hadamard_utils.py` (1 error fixed)
- `src/llama_stack/models/llama/llama3/multimodal/encoder_utils.py` (1
error fixed)
## Changes
### image_transform.py
- Fixed return type annotation for `find_supported_resolutions` from
`Tensor` to `list[tuple[int, int]]`
- Fixed parameter and return type annotations for
`resize_without_distortion` from `Tensor` to `Image.Image`
- Resolved variable shadowing by using separate names:
`possible_resolutions_list` for the list and
`possible_resolutions_tensor` for the tensor
### checkpoint.py
- Replaced deprecated `torch.BFloat16Tensor` and
`torch.cuda.BFloat16Tensor` with
`torch.set_default_dtype(torch.bfloat16)`
- Fixed variable shadowing by renaming numpy array to `ckpt_paths_array`
to distinguish from the parameter `ckpt_paths: list[Path]`
### hadamard_utils.py
- Added `isinstance` assertion to narrow type from `nn.Module` to
`nn.Linear` before accessing `in_features` attribute
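The narrowing assertion in sketch form; a minimal standalone example rather than the actual hadamard_utils.py change.
```python
import torch.nn as nn


def input_features(module: nn.Module) -> int:
    # Narrow nn.Module to nn.Linear before touching Linear-only attributes.
    assert isinstance(module, nn.Linear), f"expected nn.Linear, got {type(module)}"
    return module.in_features
```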
### encoder_utils.py
- Fixed variable shadowing by using `masks_list` for list accumulation
and `masks` for the final Tensor result
## Test plan
- Verified all files pass mypy type checking (only optional dependency
import warnings remain)
- No functional changes - only type annotations and variable naming
improvements
Stacks on PR #3933
Co-authored-by: Claude <noreply@anthropic.com>
Fixes mypy type errors in OpenTelemetry integration:
- Add type aliases for AttributeValue and Attributes
- Add helper to filter None values from attributes (OpenTelemetry
doesn't accept None)
- Cast metric and tracer objects to proper types
- Update imports after refactoring
No functional changes.
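The None-filtering helper could be as small as this sketch; the alias names follow the commit, while the function name is illustrative.
```python
from collections.abc import Mapping, Sequence

# Values OpenTelemetry accepts for attributes: primitives or sequences of
# primitives. None is not allowed, hence the filter below.
AttributeValue = (
    str | bool | int | float
    | Sequence[str] | Sequence[bool] | Sequence[int] | Sequence[float]
)
Attributes = Mapping[str, AttributeValue]


def drop_none_attributes(raw: Mapping[str, AttributeValue | None]) -> dict[str, AttributeValue]:
    return {key: value for key, value in raw.items() if value is not None}
```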
# What does this PR do?
This PR updates the Responses API schema to introduce OpenAI-compatible
prompts. It is an API-only change, so there is no implementation yet;
the follow-up PR with the actual implementation will be submitted after
this one lands. The need for this functionality was raised in #3514.
> Note: #3514 is split into three separate PRs; this is the second of
the three.
## Test Plan
CI
## Summary
This PR adds mypy and essential type stub packages to dev dependencies
as Phase 1 of the mypy suppression removal plan.
**Changes:**
- Add `mypy` to dev dependencies
- Add type stubs: `types-jsonschema`, `pandas-stubs`, `types-psutil`,
`types-tqdm`, `boto3-stubs`
**Impact:**
- Enables static type checking across the codebase
- Eliminates ~30 type checking errors related to missing type
information for third-party packages
- Provides foundation for subsequent PRs to remove type suppressions
**Part of:** Mypy suppression removal plan (Phase 1/4)
**Testing:**
```bash
uv sync --group dev
uv run mypy
```
# What does this PR do?
To match https://github.com/llamastack/llama-stack/pull/3847, we must
not update the lock file manually but always reflect the change in
pyproject.toml. The lock file captures the state at build time.
Signed-off-by: Sébastien Han <seb@redhat.com>
## Summary
- `preserve_contexts_async_generator` left `PROVIDER_DATA_VAR` (and
other context vars) populated after a streaming generator completed on
HEAD~1, so the asyncio context for request N+1 started with request N's
provider payload.
- FastAPI dependencies and middleware execute before
`request_provider_data_context` rebinds the header data, meaning
auth/logging hooks could observe a prior tenant's credentials or treat
them as authenticated. Traces and any background work that inspects the
context outside the `with` block leak as well—this is a real security
regression, not just a CLI artifact.
- The wrapper now restores each tracked `ContextVar` to the value it
held before the iteration (falling back to clearing when necessary)
after every yield and when the generator terminates, so provider data is
wiped while callers that set their own defaults keep them.
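A minimal sketch of the restore-after-yield idea, assuming hypothetical names and a simplified clearing fallback; this is not the project's actual `preserve_contexts_async_generator` implementation.
```python
import contextvars
from collections.abc import AsyncGenerator
from typing import Any, TypeVar

T = TypeVar("T")
_UNSET = object()


async def restore_tracked_context_vars(
    gen: AsyncGenerator[T, None],
    tracked: list[contextvars.ContextVar[Any]],
) -> AsyncGenerator[T, None]:
    # Remember what each tracked var held before iteration started.
    saved = {var: var.get(_UNSET) for var in tracked}

    def _restore() -> None:
        for var, value in saved.items():
            # Approximate "clearing" by setting None when there was no prior value.
            var.set(None if value is _UNSET else value)

    try:
        async for item in gen:
            yield item
            _restore()  # wipe per-request data before the next consumer step
    finally:
        _restore()  # also restore when the generator terminates or is closed
```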
## Test Plan
- `uv run pytest tests/unit/core/test_provider_data_context.py -q`
- `uv run pytest tests/unit/distribution/test_context.py -q`
Both suites fail on HEAD~1 and pass with this change.
# What does this PR do?
Add provider-data key passing support to Cerebras, Databricks, NVIDIA,
and RunPod. Also add missing tests for Fireworks, Anthropic, Gemini,
SambaNova, and vLLM.
Addresses #3517
## Test Plan
ci w/ new tests
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Migrates package structure to src/ layout following Python packaging
best practices.
All code moved from `llama_stack/` to `src/llama_stack/`. Public API
unchanged - imports remain `import llama_stack.*`.
Updated build configs, pre-commit hooks, scripts, and GitHub workflows
accordingly. All hooks pass, package builds cleanly.
**Developer note**: Reinstall after pulling: `pip install -e .`
# What does this PR do?
Introduces two fixes to improve the stability of the Responses API when
dealing with tool-calling responses and structured outputs.
### Changes Made
1. Adds OpenAIResponseOutputMessageMCPCall and ListTools to
OpenAIResponseInput. https://github.com/llamastack/llama-stack/pull/3810
was merged and did the same in a different way, but this PR does it in a
way that keeps OpenAIResponseOutput and the allowed objects in
OpenAIResponseInput in sync.
2. Adds protection in case self.ctx.response_format does not have a
`type` attribute (see the sketch after this note).
BREAKING CHANGE: OpenAIResponseInput now uses OpenAIResponseOutput union
type.
This is semantically equivalent - all previously accepted types are
still supported
via the OpenAIResponseOutput union. This improves type consistency and
maintainability.
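A minimal sketch of the guard from item 2 above; `ctx` and the attribute names mirror the description, but the helper itself is illustrative.
```python
from typing import Any


def resolve_format_type(ctx: Any) -> str | None:
    # Tolerate both a missing response_format and one without a `type` attribute.
    response_format = getattr(ctx, "response_format", None)
    return getattr(response_format, "type", None)
```
Callers can then branch on the returned value (for example, "json_schema" vs None) without risking an AttributeError.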
This patch ensures that if max_tokens is not defined, it is set to None
instead of 0 when calling openai_chat_completion. This way, providers
(like Gemini) that cannot handle `max_tokens = 0` will not fail.
Issue: #3666
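A one-line sketch of the normalization, with a hypothetical helper name.
```python
def normalize_max_tokens(max_tokens: int | None) -> int | None:
    # Treat "not defined" (None or 0) as no limit rather than passing 0 through.
    return max_tokens if max_tokens else None
```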
The vector_provider_wrapper limited providers to faiss/sqlite-vec only
in replay mode, but CI also runs the tests in record mode with the same
limited set of providers. This caused test failures when trying to test
against milvus, chromadb, pgvector, weaviate, and qdrant, which aren't
configured in the record job.
Bumps
[@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node)
from 24.8.1 to 24.9.1.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[@types/react-dom](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react-dom)
from 19.2.1 to 19.2.2.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>