Commit graph

34 commits

Author SHA1 Message Date
Ashwin Bharambe
ec1bae78e6 fix 2025-10-28 16:22:25 -07:00
Ashwin Bharambe
1f5adff5a7 Simplify type signature using only Sequence covariance 2025-10-28 15:53:32 -07:00
Ashwin Bharambe
53c6f846d4 Address PR feedback: improve code clarity and fix AllowedToolsFilter bug
- streaming.py: Extract has_tool_calls boolean for readability
- streaming.py: Replace nested function checks with assertions
- streaming.py: Fix AllowedToolsFilter to use tool_names instead of allowed/disallowed
- streaming.py: Add comment explaining tool_context can be None
- streaming.py, utils.py: Clarify Pydantic/dict compatibility comments
- utils.py: Document list invariance vs Sequence covariance in type signature
- utils.py: Clarify list_shields runtime availability comment
2025-10-28 15:47:31 -07:00
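The "list invariance vs Sequence covariance" note in the commit above boils down to the following minimal sketch; `Message` and `UserMessage` are hypothetical stand-ins, not the llama_stack types.

```python
# Minimal sketch: why Sequence is accepted where list is rejected.
from collections.abc import Sequence


class Message: ...


class UserMessage(Message): ...


def broadcast_list(messages: list[Message]) -> None: ...


def broadcast_seq(messages: Sequence[Message]) -> None: ...


user_messages: list[UserMessage] = [UserMessage()]
broadcast_seq(user_messages)   # ok: Sequence is covariant (read-only view)
broadcast_list(user_messages)  # mypy error: list is invariant, list[UserMessage] is not list[Message]
```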
Ashwin Bharambe
84d78ff48a fix(mypy): complete streaming.py type fixes (24→0 errors)
Resolved all remaining mypy errors in streaming.py by adding proper None checks, union type narrowing, and type annotations:

- Fixed ToolContext None checks before accessing attributes
- Added isinstance checks for OpenAIAssistantMessageParam before accessing tool_calls
- Added None checks for response_tool_call and tool_call.function
- Fixed AllowedToolsFilter attribute access (allowed/disallowed)
- Added explicit type annotations to ensure str types (not str | None)
- Added type ignore comments for dict/TypedDict compatibility issues

All meta reference agent files now pass mypy with 0 errors (down from 280 errors).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 15:24:41 -07:00
Ashwin Bharambe
693e99c4ba fix(mypy): resolve OpenAI responses type issues (280→30 errors)
- Fixed openai_responses.py: proper type narrowing with match statements,
  assertions for None checks, explicit list typing
- Fixed utils.py: added Sequence support, union type narrowing, None handling
- Fixed streaming.py signature: accept optional instructions parameter
- tool_executor.py and agent_instance.py: automatically fixed by API changes

Remaining: 30 errors in streaming.py and one other file

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 15:24:41 -07:00
Ashwin Bharambe
35e251090b fix(schema): add Sequence type support to schema generator
Added support for collections.abc.Sequence types in the schema generator to fix OpenAPI spec generation after changing API types from list to Sequence.

Changes:
- Added is_generic_sequence() and unwrap_generic_sequence() helper functions in inspection.py
- Updated python_type_to_name() in name.py to handle Sequence types (treats them as List for schema purposes)
- Fixed force parameter propagation in recursive python_type_to_name() calls to ensure unions within generic types are handled correctly
- Updated docstring_to_schema() to call python_type_to_name() with force=True
- Regenerated OpenAPI specs with updated type handling

This enables using Sequence instead of list in API definitions while maintaining schema generation compatibility.
2025-10-28 15:24:41 -07:00
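A rough sketch of what the Sequence detection described above can look like; the helper names mirror the commit, but the bodies here are illustrative, not the actual inspection.py implementation.

```python
import collections.abc
import typing


def is_generic_sequence(tp: object) -> bool:
    # Sequence[int] reports collections.abc.Sequence as its origin.
    return typing.get_origin(tp) is collections.abc.Sequence


def unwrap_generic_sequence(tp: object) -> object:
    # Sequence[int] -> int, so the schema generator can treat it like List[int].
    (element_type,) = typing.get_args(tp)
    return element_type


assert is_generic_sequence(collections.abc.Sequence[int])
assert unwrap_generic_sequence(collections.abc.Sequence[int]) is int
assert not is_generic_sequence(list[int])
```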
Ashwin Bharambe
aba98f49db fix(mypy): complete openai_responses.py type fixes (76 errors resolved)
Fixed all 76 type errors in openai_responses.py through proper type-safe refactoring:

Changes made:
- Fixed return type signature for _process_input_with_previous_response to include ToolContext
- Replaced .copy() with list() for Sequence types (line 94)
- Added assertions for text and max_infer_iters to narrow types from None
- Properly typed input_items_data as list[OpenAIResponseInput] to avoid list invariance
- Properly typed conversation_items as list[ConversationItem] to avoid list invariance
- Properly typed output_items as list[ConversationItem] to avoid list invariance
- Fixed StreamingResponseOrchestrator signature to accept str | None for instructions

All fixes use proper type narrowing and union typing without any type: ignore comments.
2025-10-28 15:24:40 -07:00
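A small sketch of the `.copy()` → `list()` change mentioned above; the `snapshot` function is hypothetical, but the point is general: `Sequence` has no `.copy()`.

```python
from collections.abc import Sequence


def snapshot(items: Sequence[str]) -> list[str]:
    # items.copy() fails type checking (and at runtime for e.g. tuples);
    # list(items) works for any Sequence and yields an independent, mutable list.
    return list(items)


print(snapshot(("a", "b")))  # ['a', 'b']
```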
Ashwin Bharambe
d4d55bc0fe Improve OpenAI responses type safety with Sequence and match statements
- Change list to Sequence in OpenAI response API types to fix list invariance issues
- Use match statements for proper union type narrowing in stream chunk handling
- Reduces errors in openai_responses.py from 76 to 12 (84% reduction)
2025-10-28 15:24:40 -07:00
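A hedged sketch of the match-statement narrowing mentioned above; `TextDelta` and `ToolCallDelta` are stand-ins, not the real OpenAI chunk types.

```python
from dataclasses import dataclass


@dataclass
class TextDelta:
    text: str


@dataclass
class ToolCallDelta:
    tool_name: str


def render(chunk: TextDelta | ToolCallDelta) -> str:
    match chunk:
        case TextDelta(text=text):
            # mypy narrows chunk to TextDelta in this branch
            return text
        case ToolCallDelta(tool_name=name):
            return f"[tool call: {name}]"
        case _:
            raise AssertionError(f"unexpected chunk type: {type(chunk)}")


print(render(ToolCallDelta(tool_name="web_search")))  # [tool call: web_search]
```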
Ashwin Bharambe
c3f817f344 Update type: ignore comments with clearer explanation
Changed comment from "message.content uses list[AliasType] but mypy expects Iterable[BaseType]" to "OpenAI SDK uses aliased types internally that mypy sees as incompatible with base types".

This is more accurate - the OpenAI SDK's message parameter types use aliased names (like OpenAIChatCompletionContentPartTextParam) internally in their type annotations, and mypy cannot match these with base type names (ChatCompletionContentPartTextParam) even though they're the same types at runtime.

Verified that importing and using base types directly doesn't resolve the issue because the SDK's internal type annotations still use the aliased names.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 13:53:02 -07:00
Ashwin Bharambe
dbd036e7b4 Address PR review feedback
- Simplify provider_resource_id assignment with assertion (review comment 1)
- Fix comment placement order (review comment 2)
- Refactor tool_calls list building to avoid union-attr suppression (review comment 3)
- Rename response_format to response_format_dict to avoid shadowing (review comment 4)
- Update type: ignore comments for message.content with accurate explanation of OpenAI SDK type alias resolution issue (review comment 5)
- Add assertions in litellm_openai_mixin to validate provider_resource_id is not None

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 13:49:34 -07:00
Ashwin Bharambe
9032ba9097 add comments explaining the suppressions 2025-10-28 13:32:21 -07:00
Ashwin Bharambe
a8d51a1a8b fix(mypy): resolve OpenAI compatibility layer type issues
Resolves 111 mypy errors across OpenAI inference compatibility utilities:

litellm_openai_mixin.py (23 errors → 1 unavoidable):
- Add type annotation for input_dict parameter
- Fix JsonSchemaResponseFormat dict conversion and manipulation
- Add None checks for tool_config access with walrus operator
- Fix get_api_key() to properly handle None key_field
- Add model_store None checks in all three OpenAI methods
- Add type annotations for provider_resource_id variable
- Add type: ignore for litellm external library returns

openai_compat.py (88 errors → 0):
- Add None checks for TopPSamplingStrategy temperature and top_p
- Add type: ignore for no-any-return in text_from_choice
- Fix union-attr errors in logprobs iteration with None checks
- Add None checks for choice.text and finish_reason
- Fix OpenAICompatCompletionChoice.message attribute access
- Filter tool_calls to only valid ToolCall objects
- Add type annotations for converted_messages and choices lists
- Fix TypedDict ** expansion issues in message conversions
- Add type: ignore for function dict index operations
- Fix tool_choice and response_format type conversions
- Add type annotations for lls_tools variable
- Fix sampling strategy assignment with proper ordering
- Add None checks for buffer string concatenations
- Rename shadowed tool_call variable to parsed_tool_call
- Fix message content and tool_calls type conversions
- Add isinstance checks before attribute access on content
- Fix OpenAI finish_reason type literal conversions

Code remains clean and readable with strategic use of type: ignore
pragmas only where necessary for external library compatibility.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 13:12:43 -07:00
Ashwin Bharambe
dd74b2033c fix(mypy): resolve litellm_openai_mixin type issues (23 errors fixed) 2025-10-28 13:12:43 -07:00
Ashwin Bharambe
2882ae39b9 small stylistic fixes 2025-10-28 13:12:29 -07:00
Ashwin Bharambe
ce1392b3a8 fix(mypy): resolve agent_instance.py type issues (81 errors)
- Add None checks for optional shield and client_tools lists
- Convert StepType.X.value to StepType.X enum values
- Convert ISO timestamp strings to datetime objects
- Add type annotations (output_attachments, tool_name_to_def)
- Fix union type discrimination with isinstance checks
- Fix max_infer_iters optional comparison
- Filter tool_calls to exclude strings, keep only ToolCall objects
- Fix identifier handling for BuiltinTool enum conversion
- Fix Attachment API parameter (url → content)
- Add type: ignore for OpenAI response format compatibility

Fixes all 81 mypy errors in agent_instance.py.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-28 13:12:29 -07:00
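A sketch of the "keep only ToolCall objects" filtering listed above; `ToolCall` here is a stand-in dataclass, not the llama_stack type.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool_name: str
    arguments: str


def complete_tool_calls(items: list[ToolCall | str]) -> list[ToolCall]:
    # Streamed fragments arrive as plain strings; drop them so mypy (and the
    # caller) sees a clean list[ToolCall].
    return [item for item in items if isinstance(item, ToolCall)]


print(complete_tool_calls(["partial js", ToolCall("get_weather", '{"city": "SF"}')]))
```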
Ashwin Bharambe
3cf36e665b fix(mypy): resolve agent_instance type issues (part 1 of 2)
Fixed 35 errors, 46 remaining:
- Add isinstance() checks for union type discrimination
- Fix list type annotations for Message types
- Convert strings to datetime/StepType where needed
- Use assert to narrow AgentTurnCreateRequest vs AgentTurnResumeRequest
- Add explicit type annotations to avoid inference issues

Still to fix:
- Remaining str to datetime/StepType conversions
- Optional list handling for shields
- Type annotations for tool maps
- List variance issues for input_messages
- Fix turn_id variable redefinition
2025-10-28 13:12:29 -07:00
Ashwin Bharambe
3a437d80af fix(mypy): resolve tool_executor type issues (45 errors fixed)
- Add proper type annotations using Any where needed
- Fix union-attr errors with getattr and walrus operator
- Fix arg-type errors for datetime/enum conversions
- Add type: ignore for list invariance issues
- Remove event variable reuse to satisfy type checker
- Use proper type narrowing for tool execution paths

Patterns established:
- Use getattr() with walrus operator for optional attributes
- Use type: ignore for runtime-correct but mypy-incompatible cases
- Separate event variables by type to avoid union conflicts
2025-10-28 13:12:29 -07:00
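A sketch of the getattr()-plus-walrus pattern named above; `ToolResult` and its `metadata` attribute are hypothetical.

```python
class ToolResult:
    def __init__(self, metadata: dict[str, str] | None = None) -> None:
        self.metadata = metadata


def describe(result: ToolResult) -> str:
    # One expression both fetches the optional attribute and guards against None.
    if (metadata := getattr(result, "metadata", None)) is not None:
        return f"{len(metadata)} metadata entries"
    return "no metadata"


print(describe(ToolResult({"source": "rag"})))  # 1 metadata entries
print(describe(ToolResult()))                   # no metadata
```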
Ashwin Bharambe
f88416ef87
fix(inference): enable routing of models with provider_data alone (#3928)
This PR enables routing of fully qualified model IDs of the form
`provider_id/model_id` even when the models are not registered with the
Stack.

Here's the situation: assume a remote inference provider that works only when
users provide their own API keys via the `X-LlamaStack-Provider-Data` header.
By definition, we cannot list its models and hence cannot update our routing
registry. But because we now _require_ a provider ID in model IDs, we can
identify which provider to route to and let that provider decide.

Note that we still try to look up the registry, since it may have a
pre-registered alias; we just no longer fail outright when the lookup misses.

Also updated the inference router so that responses carry the _exact_ model ID
that the request used.

## Test Plan

Added an integration test

Closes #3929

---------

Co-authored-by: ehhuang <ehhuang@users.noreply.github.com>
2025-10-28 11:16:37 -07:00
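A conceptual sketch of the routing rule described in the PR above; the registry dict and function shape are hypothetical, and the real router carries much more state and error handling.

```python
def resolve_provider(model_id: str, registry: dict[str, str]) -> str:
    # Prefer a pre-registered alias when the registry knows this model.
    if model_id in registry:
        return registry[model_id]
    # A fully qualified "provider_id/model_id" can still be routed: the prefix
    # names the provider, and that provider decides whether the model is valid.
    if "/" in model_id:
        provider_id, _ = model_id.split("/", maxsplit=1)
        return provider_id
    raise ValueError(f"cannot route unregistered model {model_id!r}")


print(resolve_provider("together/meta-llama/Llama-3.3-70B-Instruct", registry={}))  # together
```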
Ashwin Bharambe
94b0592240
fix(mypy): add type stubs and fix typing issues (#3938)
Adds type stubs and fixes mypy errors for better type coverage.

Changes:
- Added type_checking dependency group with type stubs (torchtune, trl,
etc.)
- Added lm-format-enforcer to pre-commit hook
- Created HFAutoModel Protocol for type-safe HuggingFace model handling
- Added mypy.overrides for untyped libraries (torchtune, fairscale,
etc.)
- Fixed type issues in post-training providers, databricks, and
api_recorder

Note: ~1,200 errors remain in excluded files (see pyproject.toml exclude
list).

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-28 11:00:09 -07:00
Ashwin Bharambe
1d385b5b75
fix(mypy): resolve OpenAI SDK and provider type issues (#3936)
## Summary
- Fix OpenAI SDK NotGiven/Omit type mismatches in embeddings calls
- Fix incorrect OpenAIChatCompletionChunk import in vllm provider
- Refactor to avoid type:ignore comments by using conditional kwargs

## Changes
**openai_mixin.py (9 errors fixed):**
- Build kwargs conditionally for embeddings.create() to avoid
NotGiven/Omit mismatch
- Only include parameters when they have actual values (not None)

**gemini.py (9 errors fixed):**
- Apply same conditional kwargs pattern
- Add missing Any import

**vllm.py (2 errors fixed):**
- Use correct OpenAIChatCompletionChunk from llama_stack.apis.inference
- Remove incorrect alias from openai package

## Technical Notes
The OpenAI SDK has a type system quirk where `NOT_GIVEN` has type
`NotGiven` but parameter signatures expect `Omit`. By only passing
parameters with actual values, we avoid this mismatch entirely without
needing `# type: ignore` comments.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-28 10:54:29 -07:00
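A sketch of the conditional-kwargs pattern described above: optional parameters are added only when they carry real values, so the NOT_GIVEN/Omit sentinels never enter the call. Parameter names follow the public embeddings API, but the wrapper function itself is illustrative.

```python
from typing import Any


async def create_embeddings(
    client: Any,
    model: str,
    text: str,
    dimensions: int | None = None,
    user: str | None = None,
) -> Any:
    # Required arguments always go in; optional ones only when set.
    kwargs: dict[str, Any] = {"model": model, "input": text}
    if dimensions is not None:
        kwargs["dimensions"] = dimensions
    if user is not None:
        kwargs["user"] = user
    return await client.embeddings.create(**kwargs)
```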
Ashwin Bharambe
d009dc29f7
fix(mypy): resolve provider utility and testing type issues (#3935)
Fixes mypy type errors in provider utilities and testing infrastructure:
- `mcp.py`: Cast incompatible client types, wrap image data properly
- `batches.py`: Rename walrus variable to avoid shadowing
- `api_recorder.py`: Use cast for Pydantic field annotation

No functional changes.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-28 10:37:27 -07:00
Ashwin Bharambe
fcf07790c8
fix(mypy): resolve model implementation typing issues (#3934)
## Summary

Fixes mypy type errors across 4 model implementation files (Phase 2d of
mypy suppression removal plan):
- `src/llama_stack/models/llama/llama3/multimodal/image_transform.py`
(10 errors fixed)
- `src/llama_stack/models/llama/checkpoint.py` (2 errors fixed)
- `src/llama_stack/models/llama/hadamard_utils.py` (1 error fixed)
- `src/llama_stack/models/llama/llama3/multimodal/encoder_utils.py` (1
error fixed)

## Changes

### image_transform.py
- Fixed return type annotation for `find_supported_resolutions` from
`Tensor` to `list[tuple[int, int]]`
- Fixed parameter and return type annotations for
`resize_without_distortion` from `Tensor` to `Image.Image`
- Resolved variable shadowing by using separate names:
`possible_resolutions_list` for the list and
`possible_resolutions_tensor` for the tensor

### checkpoint.py
- Replaced deprecated `torch.BFloat16Tensor` and
`torch.cuda.BFloat16Tensor` with
`torch.set_default_dtype(torch.bfloat16)`
- Fixed variable shadowing by renaming numpy array to `ckpt_paths_array`
to distinguish from the parameter `ckpt_paths: list[Path]`

### hadamard_utils.py
- Added `isinstance` assertion to narrow type from `nn.Module` to
`nn.Linear` before accessing `in_features` attribute

### encoder_utils.py
- Fixed variable shadowing by using `masks_list` for list accumulation
and `masks` for the final Tensor result

## Test plan

- Verified all files pass mypy type checking (only optional dependency
import warnings remain)
- No functional changes - only type annotations and variable naming
improvements

Stacks on PR #3933

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-28 10:28:29 -07:00
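A standalone sketch of the isinstance-assertion narrowing described for hadamard_utils.py above, assuming torch is installed; the real code operates on the model's own modules.

```python
import torch.nn as nn


def linear_input_dim(module: nn.Module) -> int:
    # Narrow nn.Module -> nn.Linear so mypy allows the in_features access.
    assert isinstance(module, nn.Linear), f"expected nn.Linear, got {type(module)}"
    return module.in_features


print(linear_input_dim(nn.Linear(16, 4)))  # 16
```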
Ashwin Bharambe
6ce59b5df8
fix(mypy): resolve type issues in MongoDB, batches, and auth providers (#3933)
Fixes mypy type errors in provider utilities:
- MongoDB: Fix AsyncMongoClient parameters, use async iteration for
cursor
- Batches: Handle memoryview|bytes union for file decoding
- Auth: Add missing imports, validate JWKS URI, conditionally pass
parameters

Fixes 11 type errors. No functional changes.
2025-10-28 10:23:39 -07:00
Ashwin Bharambe
4a2ea278c5
fix(mypy): resolve OpenTelemetry typing issues in telemetry.py (#3943)
Fixes mypy type errors in OpenTelemetry integration:
- Add type aliases for AttributeValue and Attributes
- Add helper to filter None values from attributes (OpenTelemetry
doesn't accept None)
- Cast metric and tracer objects to proper types
- Update imports after refactoring

No functional changes.
2025-10-28 10:10:18 -07:00
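A sketch of the None-filtering helper described above; the type aliases are illustrative (OpenTelemetry attribute values are str/bool/int/float or sequences of those), not the exact ones added in telemetry.py.

```python
AttributeValue = str | bool | int | float
Attributes = dict[str, AttributeValue]


def clean_attributes(raw: dict[str, AttributeValue | None]) -> Attributes:
    # OpenTelemetry rejects None attribute values, so drop those keys entirely.
    return {key: value for key, value in raw.items() if value is not None}


print(clean_attributes({"model": "llama-3", "user_id": None}))  # {'model': 'llama-3'}
```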
Ashwin Bharambe
85887d724f Revert "fix(mypy): resolve OpenTelemetry typing issues in telemetry.py (#3931)"
This reverts commit 9afc52a36a.
2025-10-28 09:48:46 -07:00
Ashwin Bharambe
9afc52a36a
fix(mypy): resolve OpenTelemetry typing issues in telemetry.py (#3931)
## Summary

Fix all 11 mypy type checking errors in `telemetry.py` without using any
type suppressions.

**Changes:**
- Add type aliases for OpenTelemetry attribute types (`AttributeValue`,
`Attributes`)
- Create `_clean_attributes()` helper to filter None values from
attribute dicts
- Use `cast()` for TracerProvider methods (`add_span_processor`,
`force_flush`)
- Use `cast()` for metric creation methods returning from global storage
- Fix variable reuse by renaming `span` to `end_span` in SpanEndPayload
branch
- Add None check for `parent_span` before `set_span_in_context`

**Errors Fixed:**
- TracerProvider attribute access: 2 errors
- Counter/UpDownCounter/ObservableGauge return types: 3 errors
- Attribute dict type mismatches: 4 errors
- Span assignment type conflicts: 2 errors

**Testing:**
```bash
uv run mypy src/llama_stack/core/telemetry/telemetry.py
# Success: no issues found
```

**Part of:** Mypy suppression removal plan (Phase 2a/4)

**Stack:**
- [Phase 1] Add type stubs (#3930)
- [Phase 2a] Fix OpenTelemetry types (this PR)
- [Phase 2b+] Fix remaining errors (upcoming)
- [Phase 3] Remove inline suppressions (upcoming)
- [Phase 4] Un-exclude files from mypy (upcoming)
2025-10-28 09:47:20 -07:00
Ian Miller
5598f61e12
feat(responses)!: introduce OpenAI compatible prompts to Responses API (#3942)
# What does this PR do?
This PR changes the Responses API schema to introduce OpenAI-compatible
prompts. It is an API-only change with no implementation yet; a follow-up PR
with the actual implementation will be submitted after this one lands.

The need for this functionality was raised in #3514.

> Note: #3514 is split into three separate PRs; this is the second of the
three.


## Test Plan
CI
2025-10-28 09:31:27 -07:00
Sébastien Han
d10bfb5121
chore: remove leftover llama_stack directory (#3940)
# What does this PR do?

Followup on https://github.com/llamastack/llama-stack/pull/3920 where
the llama_stack directory was moved under src.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-10-28 05:09:08 -07:00
Ashwin Bharambe
4e6c769cc4
fix(context): prevent provider data leak between streaming requests (#3924)
## Summary

- `preserve_contexts_async_generator` left `PROVIDER_DATA_VAR` (and
other context vars) populated after a streaming generator completed on
HEAD~1, so the asyncio context for request N+1 started with request N's
provider payload.
- FastAPI dependencies and middleware execute before
`request_provider_data_context` rebinds the header data, meaning
auth/logging hooks could observe a prior tenant's credentials or treat
them as authenticated. Traces and any background work that inspects the
context outside the `with` block leak as well—this is a real security
regression, not just a CLI artifact.
- The wrapper now restores each tracked `ContextVar` to the value it
held before the iteration (falling back to clearing when necessary)
after every yield and when the generator terminates, so provider data is
wiped while callers that set their own defaults keep them.

## Test Plan

- `uv run pytest tests/unit/core/test_provider_data_context.py -q`
- `uv run pytest tests/unit/distribution/test_context.py -q`

Both suites fail on HEAD~1 and pass with this change.
2025-10-27 23:01:12 -07:00
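A heavily simplified sketch of the leak being fixed above: request-scoped provider data must be restored to its prior value once the (streaming) response is done, rather than left in the context for the next request. The handler and header handling here are illustrative, not the real llama_stack implementation, which also restores the tracked vars after every yield of the streaming generator.

```python
from contextvars import ContextVar
from typing import Any

PROVIDER_DATA_VAR: ContextVar[dict[str, Any] | None] = ContextVar("provider_data", default=None)


async def handle_request(headers: dict[str, str]) -> None:
    token = PROVIDER_DATA_VAR.set({"raw": headers.get("X-LlamaStack-Provider-Data", "")})
    try:
        ...  # run the (possibly streaming) request with this tenant's credentials
    finally:
        # Restore whatever the var held before this request; without this step,
        # request N+1 would start with request N's provider payload.
        PROVIDER_DATA_VAR.reset(token)
```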
ehhuang
c077d01ddf
chore(telemetry): more cleanup: remove apis.telemetry (#3919)
# What does this PR do?


## Test Plan
CI
2025-10-27 22:20:15 -07:00
ehhuang
b7dd3f5c56
chore!: BREAKING CHANGE: vector_db_id -> vector_store_id (#3923)
# What does this PR do?


## Test Plan
CI
vector_io tests will fail until next client sync

passed with
https://github.com/llamastack/llama-stack-client-python/pull/286 checked
out locally
2025-10-27 14:26:06 -07:00
Nathan Weinberg
b6954c9882
fix: add missing shutdown methods to PromptServiceImpl and ConversationServiceImpl (#3925)
Change is visible in server shutdown logs, changes `WARNING` loglines to
`INFO`

Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-10-27 13:41:38 -07:00
Matthew Farrellee
a9b00db421
feat: add provider data keys for Cerebras, Databricks, NVIDIA, and RunPod (#3734)
# What does this PR do?

Add provider-data key passing support to Cerebras, Databricks, NVIDIA, and
RunPod.

Also add missing tests for Fireworks, Anthropic, Gemini, SambaNova, and vLLM.

addresses #3517 

## Test Plan

ci w/ new tests

---------

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-10-27 13:09:35 -07:00
Ashwin Bharambe
471b1b248b
chore(package): migrate to src/ layout (#3920)
Migrates package structure to src/ layout following Python packaging
best practices.

All code moved from `llama_stack/` to `src/llama_stack/`. Public API
unchanged - imports remain `import llama_stack.*`.

Updated build configs, pre-commit hooks, scripts, and GitHub workflows
accordingly. All hooks pass, package builds cleanly.

**Developer note**: Reinstall after pulling: `pip install -e .`
2025-10-27 12:02:21 -07:00