feat: Add opt-in OpenTelemetry auto-instrumentation to Docker images (#4281)

# What does this PR do?

This allows llama-stack users of the Docker image to use OpenTelemetry like previous versions. #4127 migrated to automatic instrumentation, but unless we add those libraries to the image, everyone needs to build a custom image to enable otel. Also, unless we establish a convention for enabling it, users who formerly just set config now need to override the entrypoint.

This PR bootstraps OTEL packages, so they are available (only +10MB). It also prefixes `llama stack run` with `opentelemetry-instrument` when any `OTEL_*` environment variable is set. The result is implicit tracing like before, where you don't need a custom image to use traces or metrics.

## Test Plan

```bash
# Build image
docker build -f containers/Containerfile \
  --build-arg DISTRO_NAME=starter \
  --build-arg INSTALL_MODE=editable \
  --tag llamastack/distribution-starter:otel-test .

# Run with OTEL env to implicitly use `opentelemetry-instrument`. The
# settings below ensure inbound traces are honored, but no
# "junk traces" like SQL connects are created.
docker run -p 8321:8321 \
  -e OTEL_EXPORTER_OTLP_ENDPOINT=http://host.docker.internal:4318 \
  -e OTEL_SERVICE_NAME=llama-stack \
  -e OTEL_TRACES_SAMPLER=parentbased_traceidratio \
  -e OTEL_TRACES_SAMPLER_ARG=0.0 \
  llamastack/distribution-starter:otel-test
```

Ran a sample flight search agent which is instrumented on the client side. Both the agent and llama-stack export to [otel-tui](https://github.com/ymtdzzz/otel-tui).

I verified that no root database spans are created, yet database spans are attached to incoming traces.

<img width="1608" height="742" alt="screenshot" src="https://github.com/user-attachments/assets/69f59b74-3054-42cd-947d-a6c0d9472a7c" />

Signed-off-by: Adrian Cole <adrian@tetrate.io>
2025-12-03 01:48:05 +00:00 · 2025-12-03 09:03:27 +08:00 · 4237eb4aaa
commit 4237eb4aaa
parent e243892ef0
1 changed file with 14 additions and 3 deletions
--- a/containers/Containerfile
+++ b/containers/Containerfile
@@ -120,6 +120,11 @@ RUN set -eux; \
        printf '%s\n' "$deps" | xargs -L1 uv pip install --no-cache; \
    fi

+# Install OpenTelemetry auto-instrumentation support
+RUN set -eux; \
+    pip install --no-cache opentelemetry-distro opentelemetry-exporter-otlp; \
+    opentelemetry-bootstrap -a install
+
 # Cleanup
 RUN set -eux; \
    pip uninstall -y uv; \
@@ -135,15 +140,21 @@ RUN cat <<'EOF' >/usr/local/bin/llama-stack-entrypoint.sh
 #!/bin/sh
 set -e

+# Enable OpenTelemetry auto-instrumentation if any OTEL_* variable is set
+CMD_PREFIX=""
+if env | grep -q '^OTEL_'; then
+  CMD_PREFIX="opentelemetry-instrument"
+fi
+
 if [ -n "$RUN_CONFIG_PATH" ] && [ -f "$RUN_CONFIG_PATH" ]; then
-  exec llama stack run "$RUN_CONFIG_PATH" "$@"
+  exec $CMD_PREFIX llama stack run "$RUN_CONFIG_PATH" "$@"
 fi

 if [ -n "$DISTRO_NAME" ]; then
-  exec llama stack run "$DISTRO_NAME" "$@"
+  exec $CMD_PREFIX llama stack run "$DISTRO_NAME" "$@"
 fi

-exec llama stack run "$@"
+exec $CMD_PREFIX llama stack run "$@"
 EOF
 RUN chmod +x /usr/local/bin/llama-stack-entrypoint.sh
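
The prefix selection in the entrypoint can be tried outside a container. The following is only a minimal sketch: `resolve_command` is a hypothetical helper that is not part of this change, it reads an `env`-style listing from stdin instead of the real environment, and it prints the command line rather than `exec`-ing it.

```shell
#!/bin/sh
# Hypothetical helper (not in the Containerfile): given an `env`-style
# listing on stdin, print the command line the entrypoint would exec.
resolve_command() {
  prefix=""
  # Same check as the entrypoint: any line that starts with OTEL_ enables
  # the wrapper. Here grep reads the function's stdin, not the real env.
  if grep -q '^OTEL_'; then
    prefix="opentelemetry-instrument"
  fi
  # $prefix is unquoted on purpose, as in the entrypoint: when it is empty
  # it expands to no word at all instead of an empty argument.
  echo $prefix llama stack run "$@"
}

printf 'PATH=/usr/bin\n' | resolve_command starter
# → llama stack run starter

printf 'OTEL_SERVICE_NAME=llama-stack\n' | resolve_command starter
# → opentelemetry-instrument llama stack run starter
```

Because the check looks only at the variable name prefix, any `OTEL_*` variable switches the wrapper on, whatever its value.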