From 2a1a8133087017f4d84f0ec22a4d4874ab9cf8da Mon Sep 17 00:00:00 2001
From: ehhuang
Date: Fri, 24 Oct 2025 13:57:28 -0700
Subject: [PATCH] chore: update docs for telemetry api removal (#3900)

# What does this PR do?
Telemetry is no longer an API/provider.

## Test Plan

---
 README.md | 58 +++++++++----------
 docs/docs/building_applications/safety.mdx | 1 -
 docs/docs/concepts/apis/index.mdx | 1 -
 docs/docs/distributions/configuration.mdx | 21 +------
 .../remote_hosted_distro/index.mdx | 6 +-
 .../remote_hosted_distro/watsonx.md | 1 -
 .../self_hosted_distro/dell-tgi.md | 6 +-
 .../distributions/self_hosted_distro/dell.md | 1 -
 .../self_hosted_distro/passthrough.md | 1 -
 .../self_hosted_distro/starter.md | 3 +-
 docs/docs/index.mdx | 2 +-
 docs/docs/providers/index.mdx | 1 -
 docs/docs/providers/telemetry/index.mdx | 10 ----
 .../telemetry/inline_meta-reference.mdx | 27 ---------
 .../llama_stack_client_cli_reference.md | 2 -
 15 files changed, 39 insertions(+), 102 deletions(-)
 delete mode 100644 docs/docs/providers/telemetry/index.mdx
 delete mode 100644 docs/docs/providers/telemetry/inline_meta-reference.mdx

diff --git a/README.md b/README.md
index bb8587855..639e7280d 100644
--- a/README.md
+++ b/README.md
@@ -99,7 +99,7 @@ curl -LsSf https://github.com/llamastack/llama-stack/raw/main/scripts/install.sh
Llama Stack standardizes the core building blocks that simplify AI application development. It codifies best practices across the Llama ecosystem. More specifically, it provides

-- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
+- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, and Evals.
- **Plugin architecture** to support the rich ecosystem of different API implementations in various environments, including local development, on-premises, cloud, and mobile.
- **Prepackaged verified distributions** which offer a one-stop solution for developers to get started quickly and reliably in any environment.
- **Multiple developer interfaces** like CLI and SDKs for Python, Typescript, iOS, and Android.
@@ -125,34 +125,34 @@ By reducing friction and complexity, Llama Stack empowers developers to focus on
Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.
Please check out the [full list](https://llamastack.github.io/docs/providers)
-| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Telemetry | Post Training | Eval | DatasetIO |
-|:--------------------:|:------------:|:------:|:---------:|:--------:|:------:|:---------:|:-------------:|:----:|:--------:|
-| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
-| SambaNova | Hosted | | ✅ | | ✅ | | | | |
-| Cerebras | Hosted | | ✅ | | | | | | |
-| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | | |
-| AWS Bedrock | Hosted | | ✅ | | ✅ | | | | |
-| Together | Hosted | ✅ | ✅ | | ✅ | | | | |
-| Groq | Hosted | | ✅ | | | | | | |
-| Ollama | Single Node | | ✅ | | | | | | |
-| TGI | Hosted/Single Node | | ✅ | | | | | | |
-| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | | |
-| ChromaDB | Hosted/Single Node | | | ✅ | | | | | |
-| Milvus | Hosted/Single Node | | | ✅ | | | | | |
-| Qdrant | Hosted/Single Node | | | ✅ | | | | | |
-| Weaviate | Hosted/Single Node | | | ✅ | | | | | |
-| SQLite-vec | Single Node | | | ✅ | | | | | |
-| PG Vector | Single Node | | | ✅ | | | | | |
-| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | | |
-| vLLM | Single Node | | ✅ | | | | | | |
-| OpenAI | Hosted | | ✅ | | | | | | |
-| Anthropic | Hosted | | ✅ | | | | | | |
-| Gemini | Hosted | | ✅ | | | | | | |
-| WatsonX | Hosted | | ✅ | | | | | | |
-| HuggingFace | Single Node | | | | | | ✅ | | ✅ |
-| TorchTune | Single Node | | | | | | ✅ | | |
-| NVIDIA NEMO | Hosted | | ✅ | ✅ | | | ✅ | ✅ | ✅ |
-| NVIDIA | Hosted | | | | | | ✅ | ✅ | ✅ |
+| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Post Training | Eval | DatasetIO |
+|:--------------------:|:------------:|:------:|:---------:|:--------:|:------:|:-------------:|:----:|:--------:|
+| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| SambaNova | Hosted | | ✅ | | ✅ | | | |
+| Cerebras | Hosted | | ✅ | | | | | |
+| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | |
+| AWS Bedrock | Hosted | | ✅ | | ✅ | | | |
+| Together | Hosted | ✅ | ✅ | | ✅ | | | |
+| Groq | Hosted | | ✅ | | | | | |
+| Ollama | Single Node | | ✅ | | | | | |
+| TGI | Hosted/Single Node | | ✅ | | | | | |
+| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | |
+| ChromaDB | Hosted/Single Node | | | ✅ | | | | |
+| Milvus | Hosted/Single Node | | | ✅ | | | | |
+| Qdrant | Hosted/Single Node | | | ✅ | | | | |
+| Weaviate | Hosted/Single Node | | | ✅ | | | | |
+| SQLite-vec | Single Node | | | ✅ | | | | |
+| PG Vector | Single Node | | | ✅ | | | | |
+| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | |
+| vLLM | Single Node | | ✅ | | | | | |
+| OpenAI | Hosted | | ✅ | | | | | |
+| Anthropic | Hosted | | ✅ | | | | | |
+| Gemini | Hosted | | ✅ | | | | | |
+| WatsonX | Hosted | | ✅ | | | | | |
+| HuggingFace | Single Node | | | | | ✅ | | ✅ |
+| TorchTune | Single Node | | | | | ✅ | | |
+| NVIDIA NEMO | Hosted | | ✅ | ✅ | | ✅ | ✅ | ✅ |
+| NVIDIA | Hosted | | | | | ✅ | ✅ | ✅ |

> **Note**: Additional providers are available through external packages. See [External Providers](https://llamastack.github.io/docs/providers/external) documentation.
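As a minimal sketch of what the unified API layer means in practice, assuming a distribution is already running on the default local port (8321), the same `llama-stack-client` commands work regardless of which providers from the table above back each API:

```bash
# Sketch: point the CLI at a locally running distribution and list the
# models its configured providers expose. The endpoint (default port 8321)
# is an assumption; adjust it to wherever your server is listening.
llama-stack-client configure --endpoint http://localhost:8321
llama-stack-client models list
```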
diff --git a/docs/docs/building_applications/safety.mdx b/docs/docs/building_applications/safety.mdx index 16fe5f6f8..998c02b20 100644 --- a/docs/docs/building_applications/safety.mdx +++ b/docs/docs/building_applications/safety.mdx @@ -391,5 +391,4 @@ client.shields.register( - **[Agents](./agent)** - Integrating safety shields with intelligent agents - **[Agent Execution Loop](./agent_execution_loop)** - Understanding safety in the execution flow - **[Evaluations](./evals)** - Evaluating safety shield effectiveness -- **[Telemetry](./telemetry)** - Monitoring safety violations and metrics - **[Llama Guard Documentation](https://github.com/meta-llama/PurpleLlama/tree/main/Llama-Guard3)** - Advanced safety model details diff --git a/docs/docs/concepts/apis/index.mdx b/docs/docs/concepts/apis/index.mdx index 6e699d137..11b8b2e08 100644 --- a/docs/docs/concepts/apis/index.mdx +++ b/docs/docs/concepts/apis/index.mdx @@ -16,7 +16,6 @@ A Llama Stack API is described as a collection of REST endpoints. We currently s - **Scoring**: evaluate outputs of the system - **Eval**: generate outputs (via Inference or Agents) and perform scoring - **VectorIO**: perform operations on vector stores, such as adding documents, searching, and deleting documents -- **Telemetry**: collect telemetry data from the system - **Post Training**: fine-tune a model - **Tool Runtime**: interact with various tools and protocols - **Responses**: generate responses from an LLM using this OpenAI compatible API. diff --git a/docs/docs/distributions/configuration.mdx b/docs/docs/distributions/configuration.mdx index bf3156865..910a0ed05 100644 --- a/docs/docs/distributions/configuration.mdx +++ b/docs/docs/distributions/configuration.mdx @@ -21,7 +21,6 @@ apis: - inference - vector_io - safety -- telemetry providers: inference: - provider_id: ollama @@ -51,10 +50,6 @@ providers: responses: backend: sql_default table_name: responses - telemetry: - - provider_id: meta-reference - provider_type: inline::meta-reference - config: {} storage: backends: kv_default: @@ -92,7 +87,6 @@ apis: - inference - vector_io - safety -- telemetry ``` ## Providers @@ -589,24 +583,13 @@ created by users sharing a team with them: In addition to resource-based access control, Llama Stack supports endpoint-level authorization using OAuth 2.0 style scopes. When authentication is enabled, specific API endpoints require users to have particular scopes in their authentication token. 
-**Scope-Gated APIs:**
-The following APIs are currently gated by scopes:
-
-- **Telemetry API** (scope: `telemetry.read`):
-  - `POST /telemetry/traces` - Query traces
-  - `GET /telemetry/traces/{trace_id}` - Get trace by ID
-  - `GET /telemetry/traces/{trace_id}/spans/{span_id}` - Get span by ID
-  - `POST /telemetry/spans/{span_id}/tree` - Get span tree
-  - `POST /telemetry/spans` - Query spans
-  - `POST /telemetry/metrics/{metric_name}` - Query metrics
-
**Authentication Configuration:**

For **JWT/OAuth2 providers**, scopes should be included in the JWT's claims:
```json
{
  "sub": "user123",
-  "scope": "telemetry.read",
+  "scope": "",
  "aud": "llama-stack"
}
```
@@ -616,7 +599,7 @@ For **custom authentication providers**, the endpoint must return user attribute
{
  "principal": "user123",
  "attributes": {
-    "scopes": ["telemetry.read"]
+    "scopes": [""]
  }
}
```
diff --git a/docs/docs/distributions/remote_hosted_distro/index.mdx b/docs/docs/distributions/remote_hosted_distro/index.mdx
index ef5a83d8a..7fa9d1bf6 100644
--- a/docs/docs/distributions/remote_hosted_distro/index.mdx
+++ b/docs/docs/distributions/remote_hosted_distro/index.mdx
@@ -2,10 +2,10 @@

Remote-Hosted distributions are available endpoints serving Llama Stack API that you can directly connect to.

-| Distribution | Endpoint | Inference | Agents | Memory | Safety | Telemetry |
-|-------------|----------|-----------|---------|---------|---------|------------|
-| Together | [https://llama-stack.together.ai](https://llama-stack.together.ai) | remote::together | meta-reference | remote::weaviate | meta-reference | meta-reference |
-| Fireworks | [https://llamastack-preview.fireworks.ai](https://llamastack-preview.fireworks.ai) | remote::fireworks | meta-reference | remote::weaviate | meta-reference | meta-reference |
+| Distribution | Endpoint | Inference | Agents | Memory | Safety |
+|-------------|----------|-----------|---------|---------|---------|
+| Together | [https://llama-stack.together.ai](https://llama-stack.together.ai) | remote::together | meta-reference | remote::weaviate | meta-reference |
+| Fireworks | [https://llamastack-preview.fireworks.ai](https://llamastack-preview.fireworks.ai) | remote::fireworks | meta-reference | remote::weaviate | meta-reference |

## Connecting to Remote-Hosted Distributions
diff --git a/docs/docs/distributions/remote_hosted_distro/watsonx.md b/docs/docs/distributions/remote_hosted_distro/watsonx.md
index 5add678f3..2ec7fe965 100644
--- a/docs/docs/distributions/remote_hosted_distro/watsonx.md
+++ b/docs/docs/distributions/remote_hosted_distro/watsonx.md
@@ -21,7 +21,6 @@ The `llamastack/distribution-watsonx` distribution consists of the following pro
| inference | `remote::watsonx`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss` |

diff --git a/docs/docs/distributions/self_hosted_distro/dell-tgi.md b/docs/docs/distributions/self_hosted_distro/dell-tgi.md
index 5fca297b0..a49bab4e6 100644
--- a/docs/docs/distributions/self_hosted_distro/dell-tgi.md
+++ b/docs/docs/distributions/self_hosted_distro/dell-tgi.md
@@ -13,9 +13,9 @@ self
The `llamastack/distribution-tgi` distribution consists of the following provider configurations.
-| **API** | **Inference** | **Agents** | **Memory** | **Safety** | **Telemetry** |
-|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |---------------- |
-| **Provider(s)** | remote::tgi | meta-reference | meta-reference, remote::pgvector, remote::chroma | meta-reference | meta-reference |
+| **API** | **Inference** | **Agents** | **Memory** | **Safety** |
+|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |
+| **Provider(s)** | remote::tgi | meta-reference | meta-reference, remote::pgvector, remote::chroma | meta-reference |

The only difference vs. the `tgi` distribution is that it runs the Dell-TGI server for inference.
diff --git a/docs/docs/distributions/self_hosted_distro/dell.md b/docs/docs/distributions/self_hosted_distro/dell.md
index 040eb4a12..e30df5164 100644
--- a/docs/docs/distributions/self_hosted_distro/dell.md
+++ b/docs/docs/distributions/self_hosted_distro/dell.md
@@ -22,7 +22,6 @@ The `llamastack/distribution-dell` distribution consists of the following provid
| inference | `remote::tgi`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime` |
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |

diff --git a/docs/docs/distributions/self_hosted_distro/passthrough.md b/docs/docs/distributions/self_hosted_distro/passthrough.md
index 39f076be4..13e78a1ee 100644
--- a/docs/docs/distributions/self_hosted_distro/passthrough.md
+++ b/docs/docs/distributions/self_hosted_distro/passthrough.md
@@ -21,7 +21,6 @@ The `llamastack/distribution-passthrough` distribution consists of the following
| inference | `remote::passthrough`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `remote::wolfram-alpha`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |

diff --git a/docs/docs/distributions/self_hosted_distro/starter.md b/docs/docs/distributions/self_hosted_distro/starter.md
index e04c5874b..f6786a95c 100644
--- a/docs/docs/distributions/self_hosted_distro/starter.md
+++ b/docs/docs/distributions/self_hosted_distro/starter.md
@@ -26,7 +26,6 @@ The starter distribution consists of the following provider configurations:
| inference | `remote::openai`, `remote::fireworks`, `remote::together`, `remote::ollama`, `remote::anthropic`, `remote::gemini`, `remote::groq`, `remote::sambanova`, `remote::vllm`, `remote::tgi`, `remote::cerebras`, `remote::llama-openai-compat`, `remote::nvidia`, `remote::hf::serverless`, `remote::hf::endpoint`, `inline::sentence-transformers` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
-| telemetry | `inline::meta-reference` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss`, `inline::sqlite-vec`, `inline::milvus`, `remote::chromadb`, `remote::pgvector` |

@@ -119,7 +118,7 @@ The following environment variables can be configured:
### Telemetry Configuration

- `OTEL_SERVICE_NAME`: OpenTelemetry service name
-- `TELEMETRY_SINKS`: Telemetry sinks (default: `[]`)
+- `OTEL_EXPORTER_OTLP_ENDPOINT`: OpenTelemetry collector endpoint URL

## Enabling Providers
diff --git a/docs/docs/index.mdx b/docs/docs/index.mdx
index 80b288872..8c17283f9 100644
--- a/docs/docs/index.mdx
+++ b/docs/docs/index.mdx
@@ -29,7 +29,7 @@ Llama Stack is now available! See the [release notes](https://github.com/llamast

Llama Stack defines and standardizes the core building blocks needed to bring generative AI applications to market. It provides a unified set of APIs with implementations from leading service providers, enabling seamless transitions between development and production environments. More specifically, it provides:

-- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
+- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, and Evals.
- **Plugin architecture** to support the rich ecosystem of implementations of the different APIs in different environments like local development, on-premises, cloud, and mobile.
- **Prepackaged verified distributions** which offer a one-stop solution for developers to get started quickly and reliably in any environment
- **Multiple developer interfaces** like CLI and SDKs for Python, Node, iOS, and Android
diff --git a/docs/docs/providers/index.mdx b/docs/docs/providers/index.mdx
index 2ca2b2697..bfc16b29a 100644
--- a/docs/docs/providers/index.mdx
+++ b/docs/docs/providers/index.mdx
@@ -26,7 +26,6 @@ Importantly, Llama Stack always strives to provide at least one fully inline pro
- **[Agents](agents/index.mdx)** - Agentic system providers
- **[DatasetIO](datasetio/index.mdx)** - Dataset and data loader providers
- **[Safety](safety/index.mdx)** - Content moderation and safety providers
-- **[Telemetry](telemetry/index.mdx)** - Monitoring and observability providers
- **[Vector IO](vector_io/index.mdx)** - Vector database providers
- **[Tool Runtime](tool_runtime/index.mdx)** - Tool and protocol providers
- **[Files](files/index.mdx)** - File system and storage providers
diff --git a/docs/docs/providers/telemetry/index.mdx b/docs/docs/providers/telemetry/index.mdx
deleted file mode 100644
index 07190d625..000000000
--- a/docs/docs/providers/telemetry/index.mdx
+++ /dev/null
@@ -1,10 +0,0 @@
----
-sidebar_label: Telemetry
-title: Telemetry
----
-
-# Telemetry
-
-## Overview
-
-This section contains documentation for all available providers for the **telemetry** API.
diff --git a/docs/docs/providers/telemetry/inline_meta-reference.mdx b/docs/docs/providers/telemetry/inline_meta-reference.mdx
deleted file mode 100644
index d8b3157d1..000000000
--- a/docs/docs/providers/telemetry/inline_meta-reference.mdx
+++ /dev/null
@@ -1,27 +0,0 @@
----
-description: "Meta's reference implementation of telemetry and observability using OpenTelemetry."
-sidebar_label: Meta-Reference
-title: inline::meta-reference
----
-
-# inline::meta-reference
-
-## Description
-
-Meta's reference implementation of telemetry and observability using OpenTelemetry.
-
-## Configuration
-
-| Field | Type | Required | Default | Description |
-|-------|------|----------|---------|-------------|
-| `otel_exporter_otlp_endpoint` | `str \| None` | No | | The OpenTelemetry collector endpoint URL (base URL for traces, metrics, and logs). If not set, the SDK will use OTEL_EXPORTER_OTLP_ENDPOINT environment variable. |
-| `service_name` | `` | No | ​ | The service name to use for telemetry |
-| `sinks` | `list[inline.telemetry.meta_reference.config.TelemetrySink` | No | [] | List of telemetry sinks to enable (possible values: otel_trace, otel_metric, console) |
-
-## Sample Configuration
-
-```yaml
-service_name: "${env.OTEL_SERVICE_NAME:=\u200B}"
-sinks: ${env.TELEMETRY_SINKS:=}
-otel_exporter_otlp_endpoint: ${env.OTEL_EXPORTER_OTLP_ENDPOINT:=}
-```
diff --git a/docs/docs/references/llama_stack_client_cli_reference.md b/docs/docs/references/llama_stack_client_cli_reference.md
index a4321938a..fd87e7dbd 100644
--- a/docs/docs/references/llama_stack_client_cli_reference.md
+++ b/docs/docs/references/llama_stack_client_cli_reference.md
@@ -78,8 +78,6 @@ llama-stack-client providers list
+-----------+----------------+-----------------+
| agents | meta-reference | meta-reference |
+-----------+----------------+-----------------+
-| telemetry | meta-reference | meta-reference |
-+-----------+----------------+-----------------+
| safety | meta-reference | meta-reference |
+-----------+----------------+-----------------+
```
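With the telemetry provider block gone from run.yaml, observability is configured entirely through the standard OpenTelemetry environment variables documented above. A minimal sketch, assuming an OTLP-compatible collector on its default HTTP port (the endpoint value is illustrative):

```bash
# Sketch: enable tracing via standard OTel environment variables instead of
# the removed telemetry provider section. The endpoint below assumes a local
# OTLP collector listening on its default HTTP port.
export OTEL_SERVICE_NAME=llama-stack
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

llama stack run starter
```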