feat: remove core.telemetry as a dependency of llama_stack.apis

  Remove a circular dependency by moving tracing from the API protocol
  definitions to the router implementation layer.

  Changes:
  - Create apis/common/tracing.py with a marker decorator (zero core dependencies); a sketch follows this list
  - Add @trace_protocol marker decorator to 11 protocol classes
  - Apply actual tracing in core/routers/__init__.py based on protocol marker
  - Move MetricResponseMixin from core to apis (it's an API response type)
  - APIs package is now self-contained with zero core dependencies
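
  A minimal sketch of the marker decorator in apis/common/tracing.py, assuming
  a pure-marker design (only trace_protocol and __trace_protocol__ are named by
  this commit; everything else here is illustrative):

  ```python
  # apis/common/tracing.py -- sketch only. The decorator imports nothing from
  # llama_stack.core; it merely tags the protocol class so the router layer
  # can detect the opt-in later.
  from typing import TypeVar

  T = TypeVar("T")


  def trace_protocol(cls: type[T]) -> type[T]:
      """Mark a protocol class for tracing without instrumenting it."""
      cls.__trace_protocol__ = True  # marker checked in core/routers at runtime
      return cls
  ```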

  The tracing functionality is unchanged: the actual trace_protocol from core
  is applied to router implementations at runtime when telemetry is enabled
  and the protocol carries the __trace_protocol__ marker.
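
  A sketch of the corresponding router-side check (the helper name and
  signature are assumptions; the import path comes from this change):

  ```python
  # core/routers/__init__.py -- sketch only; maybe_trace is a hypothetical
  # helper. The real trace_protocol decorator remains in core.telemetry.
  from llama_stack.core.telemetry.trace_protocol import trace_protocol


  def maybe_trace(impl_cls: type, protocol_cls: type, telemetry_enabled: bool) -> type:
      """Wrap a router implementation with tracing only if the protocol opted in."""
      if telemetry_enabled and getattr(protocol_cls, "__trace_protocol__", False):
          return trace_protocol(impl_cls)
      return impl_cls
  ```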

  ## Test Plan

  Manual integration test confirms behavior identical to the main branch:

  ```bash
  llama stack list-deps --format uv starter | sh
  export OLLAMA_URL=http://localhost:11434
  llama stack run starter

  curl -X POST http://localhost:8321/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "ollama/gpt-oss:20b",
         "messages": [{"role": "user", "content": "Say hello"}],
         "max_tokens": 10}'

  Verified identical between main and this branch (see the sketch after this list):
  - trace_id present in response
  - metrics array with prompt_tokens, completion_tokens, total_tokens
  - Server logs show trace_protocol applied to all routers
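
  A hedged check for the first two points (the exact response shape, including
  the metric entry keys, is an assumption here):

  ```python
  # Sketch: re-run the curl request from Python and assert on the fields
  # listed above. Field names beyond those in this test plan are guesses.
  import requests

  resp = requests.post(
      "http://localhost:8321/v1/chat/completions",
      json={
          "model": "ollama/gpt-oss:20b",
          "messages": [{"role": "user", "content": "Say hello"}],
          "max_tokens": 10,
      },
  ).json()

  assert "trace_id" in str(resp)  # trace id surfaces in the response
  names = {m.get("metric") for m in resp.get("metrics", [])}
  assert {"prompt_tokens", "completion_tokens", "total_tokens"} <= names
  ```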

  Existing telemetry integration tests (tests/integration/telemetry/) validate
  trace context propagation and span attributes.

Signed-off-by: Charlie Doern <cdoern@redhat.com>

```diff
@@ -20,8 +20,8 @@ from llama_stack.apis.agents.openai_responses import (
     OpenAIResponseOutputMessageMCPListTools,
     OpenAIResponseOutputMessageWebSearchToolCall,
 )
+from llama_stack.apis.common.tracing import trace_protocol
 from llama_stack.apis.version import LLAMA_STACK_API_V1
-from llama_stack.core.telemetry.trace_protocol import trace_protocol
 from llama_stack.schema_utils import json_schema_type, register_schema, webmethod
 
 Metadata = dict[str, str]
```