Add 'from __future__ import annotations' to enable postponed evaluation
of annotations. This allows S3Client type hints (imported only under
TYPE_CHECKING) to work correctly at runtime without NameError.
Fixes unit test collection error:NameError: name 'S3Client' is not defined
Add boto3-stubs[s3] dev dependency and use TYPE_CHECKING with cast() to
provide proper S3Client type hints while avoiding boto3-stubs' overly
strict overload definitions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Import OpenAIChatCompletionChunk from llama_stack.apis.inference
instead of aliasing from openai package to match parent class signature.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Refactor embeddings API calls to avoid NotGiven/Omit type incompatibility
by conditionally building kwargs dict with only non-None parameters.
- openai_mixin.py: Build kwargs conditionally for embeddings.create()
- gemini.py: Apply same pattern + add Any import
This approach avoids type:ignore comments by not passing NOT_GIVEN sentinel
values that conflict with Omit type annotations in OpenAI SDK.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Fixes mypy type errors in provider utilities and testing infrastructure:
- `mcp.py`: Cast incompatible client types, wrap image data properly
- `batches.py`: Rename walrus variable to avoid shadowing
- `api_recorder.py`: Use cast for Pydantic field annotation
No functional changes.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Fixes mypy type errors in OpenTelemetry integration:
- Add type aliases for AttributeValue and Attributes
- Add helper to filter None values from attributes (OpenTelemetry
doesn't accept None)
- Cast metric and tracer objects to proper types
- Update imports after refactoring
No functional changes.
# What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
This PR is responsible for making changes to Responses API scheme to
introduce OpenAI compatible prompts there. Change to the API only,
therefore currently no implementation at all. However, the follow up PR
with actual implementation will be submitted after current PR lands.
The need of this functionality was initiated in #3514.
> Note, #3514 is divided on three separate PRs. Current PR is the second
of three.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
CI
# What does this PR do?
add provider-data key passing support to Cerebras, Databricks, NVIDIA
and RunPod
also, added missing tests for Fireworks, Anthropic, Gemini, SambaNova,
and vLLM
addresses #3517
## Test Plan
ci w/ new tests
---------
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Migrates package structure to src/ layout following Python packaging
best practices.
All code moved from `llama_stack/` to `src/llama_stack/`. Public API
unchanged - imports remain `import llama_stack.*`.
Updated build configs, pre-commit hooks, scripts, and GitHub workflows
accordingly. All hooks pass, package builds cleanly.
**Developer note**: Reinstall after pulling: `pip install -e .`