llama-stack-mirror/scripts/openapi_generator/schema_collection.py
Sébastien Han 97f535c4f1
feat(openapi): switch to fastapi-based generator (#3944)
# What does this PR do?
This replaces the legacy "pyopenapi + strong_typing" pipeline with a
FastAPI-backed generator that has an explicit schema registry inside
`llama_stack_api`. The key changes:

1. **New generator architecture.** FastAPI now builds the OpenAPI schema
directly from the real routes, while helper modules
(`schema_collection`, `endpoints`, `schema_transforms`, etc.)
post-process the result. The old pyopenapi stack and its strong_typing
helpers are removed entirely, so we no longer rely on fragile AST
analysis or top-level import side effects.

2. **Schema registry in `llama_stack_api`.** `schema_utils.py` keeps a
`SchemaInfo` record for every `@json_schema_type`, `register_schema`,
and dynamically created request model. The OpenAPI generator and other
tooling query this registry instead of scanning the package tree,
producing deterministic names (e.g., `{MethodName}Request`), capturing
all optional/nullable fields, and making schema discovery testable. A
new unit test covers the registry behavior.

3. **Regenerated specs + CI alignment.** All docs/Stainless specs are
regenerated from the new pipeline, so optional/nullable fields now match
reality (expect the API Conformance workflow to report breaking
changes—this PR establishes the new baseline). The workflow itself is
back to the stock oasdiff invocation so future regressions surface
normally.

*Conformance will be RED on this PR; we choose to accept the
deviations.*

## Test Plan
- `uv run pytest tests/unit/server/test_schema_registry.py`
- `uv run python -m scripts.openapi_generator.main docs/static`

---------

Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-11-14 15:53:53 -08:00


# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.
"""
Schema discovery and collection for OpenAPI generation.
"""
import importlib
from typing import Any


def _ensure_components_schemas(openapi_schema: dict[str, Any]) -> None:
    """Ensure components.schemas exists in the schema."""
    if "components" not in openapi_schema:
        openapi_schema["components"] = {}
    if "schemas" not in openapi_schema["components"]:
        openapi_schema["components"]["schemas"] = {}


def _load_extra_schema_modules() -> None:
    """
    Import modules outside llama_stack_api that use schema_utils to register schemas.

    The API package already imports its submodules via __init__, but server-side modules
    like telemetry need to be imported explicitly so their decorator side effects run.
    """
    extra_modules = [
        "llama_stack.core.telemetry.telemetry",
    ]
    for module_name in extra_modules:
        try:
            importlib.import_module(module_name)
        except ImportError:
            # The module is optional here; skip it if it cannot be imported.
            continue


def _extract_and_fix_defs(schema: dict[str, Any], openapi_schema: dict[str, Any]) -> None:
    """
    Extract $defs from a schema, move them to components/schemas, and fix references.

    This handles both TypeAdapter-generated schemas and model_json_schema() schemas.
    """
    if "$defs" in schema:
        defs = schema.pop("$defs")
        for def_name, def_schema in defs.items():
            if def_name not in openapi_schema["components"]["schemas"]:
                openapi_schema["components"]["schemas"][def_name] = def_schema
                # Recursively handle $defs in nested schemas
                _extract_and_fix_defs(def_schema, openapi_schema)

    # Fix any references in the main schema that point to $defs
    def fix_refs_in_schema(obj: Any) -> None:
        if isinstance(obj, dict):
            if "$ref" in obj and obj["$ref"].startswith("#/$defs/"):
                obj["$ref"] = obj["$ref"].replace("#/$defs/", "#/components/schemas/")
            for value in obj.values():
                fix_refs_in_schema(value)
        elif isinstance(obj, list):
            for item in obj:
                fix_refs_in_schema(item)

    fix_refs_in_schema(schema)


def _ensure_json_schema_types_included(openapi_schema: dict[str, Any]) -> dict[str, Any]:
    """
    Ensure all registered schemas (decorated, explicit, and dynamic) are included in the OpenAPI schema.

    Relies on llama_stack_api's registry instead of recursively importing every module.
    """
    _ensure_components_schemas(openapi_schema)

    from pydantic import TypeAdapter

    from llama_stack_api.schema_utils import (
        iter_dynamic_schema_types,
        iter_json_schema_types,
        iter_registered_schema_types,
    )

    # Import extra modules (e.g., telemetry) whose schema registrations live outside llama_stack_api
    _load_extra_schema_modules()

    # Handle explicitly registered schemas first (union types, Annotated structs, etc.)
    for registration_info in iter_registered_schema_types():
        schema_type = registration_info.type
        schema_name = registration_info.name
        if schema_name not in openapi_schema["components"]["schemas"]:
            try:
                adapter = TypeAdapter(schema_type)
                schema = adapter.json_schema(ref_template="#/components/schemas/{model}")
                _extract_and_fix_defs(schema, openapi_schema)
                openapi_schema["components"]["schemas"][schema_name] = schema
            except Exception as e:
                print(f"Warning: Failed to generate schema for registered type {schema_name}: {e}")
                import traceback

                traceback.print_exc()
                continue

    # Add @json_schema_type decorated models
    for model in iter_json_schema_types():
        schema_name = getattr(model, "_llama_stack_schema_name", None) or getattr(model, "__name__", None)
        if not schema_name:
            continue
        if schema_name not in openapi_schema["components"]["schemas"]:
            try:
                if hasattr(model, "model_json_schema"):
                    schema = model.model_json_schema(ref_template="#/components/schemas/{model}")
                else:
                    adapter = TypeAdapter(model)
                    schema = adapter.json_schema(ref_template="#/components/schemas/{model}")
                _extract_and_fix_defs(schema, openapi_schema)
                openapi_schema["components"]["schemas"][schema_name] = schema
            except Exception as e:
                print(f"Warning: Failed to generate schema for {schema_name}: {e}")
                continue

    # Include any dynamic models generated while building endpoints
    for model in iter_dynamic_schema_types():
        try:
            schema_name = model.__name__
            if schema_name not in openapi_schema["components"]["schemas"]:
                schema = model.model_json_schema(ref_template="#/components/schemas/{model}")
                _extract_and_fix_defs(schema, openapi_schema)
                openapi_schema["components"]["schemas"][schema_name] = schema
        except Exception:
            continue

    return openapi_schema
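
The `$defs` handling in `_extract_and_fix_defs` can be tried in isolation. The sketch below mirrors that logic in a standalone helper with the hypothetical name `hoist_defs`, and runs it on a hand-written dict that stands in for a generated schema (it is not real Pydantic output), so it needs only the standard library.

```python
from typing import Any


def hoist_defs(schema: dict[str, Any], components: dict[str, Any]) -> None:
    """Move local $defs into `components` and retarget "#/$defs/..." refs."""
    for name, sub_schema in schema.pop("$defs", {}).items():
        if name not in components:
            components[name] = sub_schema
            # A hoisted definition may carry its own $defs and local refs.
            hoist_defs(sub_schema, components)

    def retarget(node: Any) -> None:
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                node["$ref"] = "#/components/schemas/" + ref[len("#/$defs/"):]
            for value in node.values():
                retarget(value)
        elif isinstance(node, list):
            for item in node:
                retarget(item)

    retarget(schema)


# Hand-written stand-in for a generated schema; the names are illustrative only.
order_schema: dict[str, Any] = {
    "type": "object",
    "properties": {
        "items": {"type": "array", "items": {"$ref": "#/$defs/Item"}},
    },
    "$defs": {
        "Item": {"type": "object", "properties": {"sku": {"type": "string"}}},
    },
}

components: dict[str, Any] = {}
hoist_defs(order_schema, components)
```

After the call, `order_schema` no longer has a `$defs` key, its nested reference points at `#/components/schemas/Item`, and `components` holds the `Item` definition. A name that already exists in `components` is left untouched, which matches the first-writer-wins behavior of the function above.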