Fix wrong import and use space_id instead of space_key for Arize integration

This commit is contained in:
Nate Mar 2025-03-17 20:37:28 -07:00
parent f8ac675321
commit 0dfc21e80a
11 changed files with 221 additions and 186 deletions
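In practice, the rename means Arize setup code now reads a space *id* rather than a space *key*. Below is a minimal sketch of the post-change configuration, assembled from the hunks in this commit (the values are placeholders and the model choice is illustrative):

```python
import os

import litellm

# Post-change configuration: ARIZE_SPACE_ID replaces ARIZE_SPACE_KEY.
os.environ["ARIZE_SPACE_ID"] = "your-arize-space-id"  # placeholder
os.environ["ARIZE_API_KEY"] = "your-arize-api-key"    # placeholder
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"  # placeholder

litellm.callbacks = ["arize"]  # log responses to Arize

response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
```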

View file

@@ -105,11 +105,16 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"ARIZE_SPACE_KEY\"] = getpass(\"Enter your Arize space key: \")\n",
"os.environ[\"ARIZE_SPACE_ID\"] = getpass(\"Enter your Arize space id: \")\n",
"os.environ[\"ARIZE_API_KEY\"] = getpass(\"Enter your Arize API key: \")\n",
"os.environ['OPENAI_API_KEY']= getpass(\"Enter your OpenAI API key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},

View file

@@ -11,12 +11,12 @@ https://github.com/BerriAI/litellm
:::
## Pre-Requisites
Make an account on [Arize AI](https://app.arize.com/auth/login)
## Quick Start
Use just 2 lines of code to instantly log your responses **across all providers** with Arize
You can also use the instrumentor option instead of the callback, which you can find [here](https://docs.arize.com/arize/llm-tracing/tracing-integrations-auto/litellm).
@@ -24,11 +24,12 @@ You can also use the instrumentor option instead of the callback, which you can
```python
litellm.callbacks = ["arize"]
```
```python
import litellm
import os
os.environ["ARIZE_SPACE_KEY"] = ""
os.environ["ARIZE_SPACE_ID"] = ""
os.environ["ARIZE_API_KEY"] = ""
# LLM API Keys
@@ -48,7 +49,6 @@ response = litellm.completion(
### Using with LiteLLM Proxy
```yaml
model_list:
  - model_name: gpt-4
@@ -61,7 +61,7 @@ litellm_settings:
  callbacks: ["arize"]
environment_variables:
  ARIZE_SPACE_KEY: "d0*****"
  ARIZE_SPACE_ID: "d0*****"
  ARIZE_API_KEY: "141a****"
  ARIZE_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize GRPC api endpoint
  ARIZE_HTTP_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize HTTP api endpoint. Set either this or ARIZE_ENDPOINT

View file

@@ -302,7 +302,7 @@ router_settings:
| AISPEND_API_KEY | API Key for AI Spend
| ALLOWED_EMAIL_DOMAINS | List of email domains allowed for access
| ARIZE_API_KEY | API key for Arize platform integration
| ARIZE_SPACE_KEY | Space key for Arize platform
| ARIZE_SPACE_ID | Space ID for Arize platform
| ARGILLA_BATCH_SIZE | Batch size for Argilla logging
| ARGILLA_API_KEY | API key for Argilla platform
| ARGILLA_SAMPLING_RATE | Sampling rate for Argilla logging

View file

@@ -17,8 +17,6 @@ Log Proxy input, output, and exceptions using:
- DynamoDB
- etc.
## Getting the LiteLLM Call ID
LiteLLM generates a unique `call_id` for each request. This `call_id` can be
@@ -52,18 +50,17 @@ A number of these headers could be useful for troubleshooting, but the
`x-litellm-call-id` is the one that is most useful for tracking a request across
components in your system, including in logging tools.
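As a quick illustration, the id can be read off the response headers (a sketch, assuming a proxy on the default port and a hypothetical `sk-1234` virtual key):

```python
import requests

resp = requests.post(
    "http://0.0.0.0:4000/chat/completions",
    headers={
        "Authorization": "Bearer sk-1234",  # hypothetical virtual key
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "hi"}],
    },
)
# Use this id to correlate the request across your logging tools.
print(resp.headers.get("x-litellm-call-id"))
```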
## Logging Features
### Conditional Logging by Virtual Keys, Teams
Use this to:
1. Conditionally enable logging for some virtual keys/teams
2. Set different logging providers for different virtual keys/teams
[👉 **Get Started** - Team/Key Based Logging](team_logging)
### Redacting UserAPIKeyInfo
Redact information about the user api key (hashed token, user_id, team id, etc.), from logs.
@@ -76,7 +73,6 @@ litellm_settings:
  redact_user_api_key_info: true
```
### Redact Messages, Response Content
Set `litellm.turn_off_message_logging=True`. This will prevent the messages and responses from being logged to your logging provider, but request metadata, e.g. spend, will still be tracked.
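In the SDK this is a single module-level flag (a minimal sketch; spend and other metadata are still tracked):

```python
import litellm

# Redact message content and responses from logging integrations;
# request metadata such as spend is still tracked.
litellm.turn_off_message_logging = True
```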
@@ -86,6 +82,7 @@ Set `litellm.turn_off_message_logging=True` This will prevent the messages and r
<TabItem value="global" label="Global">
**1. Setup config.yaml**
```yaml
model_list:
  - model_name: gpt-3.5-turbo
@@ -97,6 +94,7 @@ litellm_settings:
```
**2. Send request**
```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
@@ -111,8 +109,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
}'
```
</TabItem>
<TabItem value="dynamic" label="Per Request">
@@ -170,13 +166,11 @@ curl -L -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
<Image img={require('../../img/message_redaction_spend_logs.png')} />
### Disable Message Redaction
If you have `litellm.turn_off_message_logging` turned on, you can override it for specific requests by
setting a request header `LiteLLM-Disable-Message-Redaction: true`.
```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
@@ -192,7 +186,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
}'
```
### Turn off all tracking/logging
For some use cases, you may want to turn off all tracking/logging. You can do this by passing `no-log=True` in the request body.
@@ -205,6 +198,7 @@ Disable this by setting `global_disable_no_log_param:true` in your config.yaml f
litellm_settings:
  global_disable_no_log_param: True
```
:::
<Tabs>
@@ -268,7 +262,6 @@ print(response)
LiteLLM.Info: "no-log request, skipping logging"
```
## What gets logged?
Found under `kwargs["standard_logging_object"]`. This is a standard payload, logged for every response.
@@ -431,10 +424,8 @@ print(response)
Set `tags` as part of your request body
<Tabs>
<TabItem value="openai" label="OpenAI Python v1.0.0+">
```python
@@ -462,6 +453,7 @@ response = client.chat.completions.create(
print(response)
```
</TabItem>
<TabItem value="Curl" label="Curl Request">
@@ -486,6 +478,7 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
}
}'
```
</TabItem>
<TabItem value="langchain" label="Langchain">
@@ -528,14 +521,12 @@ print(response)
</TabItem>
</Tabs>
### LiteLLM Tags - `cache_hit`, `cache_key`
Use this if you want to control which LiteLLM-specific fields are logged as tags by the LiteLLM proxy. By default, the LiteLLM proxy logs no LiteLLM-specific fields.
| LiteLLM specific field | Description | Example Value |
|------------------------|-------------------------------------------------------|------------------------------------------------|
| ------------------------- | --------------------------------------------------------------------------------------- | --------------------------------------- |
| `cache_hit` | Indicates whether a cache hit occurred (True) or not (False) | `true`, `false` |
| `cache_key` | The Cache key used for this request | `d2b758c****` |
| `proxy_base_url` | The base URL for the proxy server, the value of env var `PROXY_BASE_URL` on your server | `https://proxy.example.com` |
@@ -544,12 +535,12 @@ Use this if you want to control which LiteLLM-specific fields are logged as tags
| `user_api_key_user_email` | The email associated with a user's API key. | `user@example.com`, `admin@example.com` |
| `user_api_key_team_alias` | An alias for a team associated with an API key. | `team_alpha`, `dev_team` |
**Usage**
Specify `langfuse_default_tags` to control which LiteLLM fields get logged to Langfuse.
Example config.yaml
```yaml
model_list:
  - model_name: gpt-4
@@ -562,7 +553,18 @@ litellm_settings:
  success_callback: ["langfuse"]
  # 👇 Key Change
  langfuse_default_tags: ["cache_hit", "cache_key", "proxy_base_url", "user_api_key_alias", "user_api_key_user_id", "user_api_key_user_email", "user_api_key_team_alias", "semantic-similarity", "proxy_base_url"]
  langfuse_default_tags:
    [
      "cache_hit",
      "cache_key",
      "proxy_base_url",
      "user_api_key_alias",
      "user_api_key_user_id",
      "user_api_key_user_email",
      "user_api_key_team_alias",
      "semantic-similarity",
      "proxy_base_url",
    ]
```
### View POST sent from LiteLLM to provider
@@ -965,6 +967,7 @@ callback_settings:
```
### Traceparent Header
##### Context propagation across Services `Traceparent HTTP Header`
❓ Use this when you want to **pass information about the incoming request in a distributed tracing system**
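For example, a sketch of propagating a W3C trace context into the proxy (the `traceparent` value follows the `00-<trace-id>-<span-id>-<flags>` format and is made up here; the key is a hypothetical placeholder):

```python
import requests

headers = {
    "Authorization": "Bearer sk-1234",  # hypothetical virtual key
    "Content-Type": "application/json",
    # W3C trace context: 00-<32-hex trace id>-<16-hex parent span id>-<flags>
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
}
requests.post(
    "http://0.0.0.0:4000/chat/completions",
    headers=headers,
    json={"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "hi"}]},
)
```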
@@ -1042,18 +1045,16 @@ Log LLM Logs to [Google Cloud Storage Buckets](https://cloud.google.com/storage?
:::
| Property | Details |
|----------|---------|
| ---------------------------- | -------------------------------------------------------------- |
| Description | Log LLM Input/Output to cloud storage buckets |
| Load Test Benchmarks | [Benchmarks](https://docs.litellm.ai/docs/benchmarks) |
| Google Docs on Cloud Storage | [Google Cloud Storage](https://cloud.google.com/storage?hl=en) |
#### Usage
1. Add `gcs_bucket` to LiteLLM Config.yaml
```yaml
model_list:
  - litellm_params:
@@ -1096,7 +1097,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
'
```
#### Expected Logs on GCS Buckets
<Image img={require('../../img/gcs_bucket.png')} />
@@ -1105,7 +1105,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
[**The standard logging object is logged on GCS Bucket**](../proxy/logging_spec)
#### Getting `service_account.json` from Google Cloud Console
1. Go to [Google Cloud Console](https://console.cloud.google.com/)
@@ -1115,8 +1114,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
5. Click on 'Keys' -> Add Key -> Create New Key -> JSON
6. Save the JSON file and add the path to `GCS_PATH_SERVICE_ACCOUNT`
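For instance (a sketch; the path is a placeholder):

```python
import os

# Point LiteLLM at the service account key downloaded in step 6.
os.environ["GCS_PATH_SERVICE_ACCOUNT"] = "/path/to/service_account.json"
```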
## Google Cloud Storage - PubSub Topic
Log LLM Logs/SpendLogs to [Google Cloud Storage PubSub Topic](https://cloud.google.com/pubsub/docs/reference/rest)
@@ -1127,19 +1124,18 @@ Log LLM Logs/SpendLogs to [Google Cloud Storage PubSub Topic](https://cloud.goog
:::
| Property | Details |
|----------|---------|
| ----------- | ------------------------------------------------------------------ |
| Description | Log LiteLLM `SpendLogs Table` to Google Cloud Storage PubSub Topic |
When to use `gcs_pubsub`?
- If your LiteLLM Database has crossed 1M+ spend logs and you want to send `SpendLogs` to a PubSub Topic that can be consumed by GCS BigQuery
#### Usage
1. Add `gcs_pubsub` to LiteLLM Config.yaml
```yaml
model_list:
  - litellm_params:
@@ -1182,8 +1178,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
'
```
## s3 Buckets
We will use the `--config` to set
@@ -1276,17 +1270,15 @@ Log LLM Logs to [Azure Data Lake Storage](https://learn.microsoft.com/en-us/azur
:::
| Property | Details |
|----------|---------|
| ------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| Description | Log LLM Input/Output to Azure Blob Storage (Bucket) |
| Azure Docs on Data Lake Storage | [Azure Data Lake Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction) |
#### Usage
1. Add `azure_storage` to LiteLLM Config.yaml
```yaml
model_list:
  - model_name: fake-openai-endpoint
@@ -1339,7 +1331,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
'
```
#### Expected Logs on Azure Data Lake Storage
<Image img={require('../../img/azure_blob.png')} />
@@ -1348,11 +1339,10 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
[**The standard logging object is logged on Azure Data Lake Storage**](../proxy/logging_spec)
## DataDog
LiteLLM supports logging to the following Datadog integrations (a minimal SDK sketch follows this list):
- `datadog` [Datadog Logs](https://docs.datadoghq.com/logs/)
- `datadog_llm_observability` [Datadog LLM Observability](https://www.datadoghq.com/product/llm-observability/)
- `ddtrace-run` [Datadog Tracing](#datadog-tracing)
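A minimal SDK-side sketch, assuming the `datadog` callback string and the `DD_API_KEY`/`DD_SITE` variables from the table below (values are placeholders):

```python
import os

import litellm

os.environ["DD_API_KEY"] = "your-datadog-api-key"  # required, per table below
os.environ["DD_SITE"] = "us5.datadoghq.com"        # required, per table below

litellm.callbacks = ["datadog"]  # ship request/response logs to Datadog Logs

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
)
```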
@@ -1448,7 +1438,7 @@ docker run \
LiteLLM supports customizing the following Datadog environment variables
| Environment Variable | Description | Default Value | Required |
|---------------------|-------------|---------------|----------|
| -------------------- | ------------------------------------------------------------- | ---------------- | -------- |
| `DD_API_KEY` | Your Datadog API key for authentication | None | ✅ Yes |
| `DD_SITE` | Your Datadog site (e.g., "us5.datadoghq.com") | None | ✅ Yes |
| `DD_ENV` | Environment tag for your logs (e.g., "production", "staging") | "unknown" | ❌ No |
@@ -1458,15 +1448,18 @@ LiteLLM supports customizing the following Datadog environment variables
| `HOSTNAME` | Hostname tag for your logs | "" | ❌ No |
| `POD_NAME` | Pod name tag (useful for Kubernetes deployments) | "unknown" | ❌ No |
## Lunary
#### Step 1: Install dependencies and set your environment variables
Install the dependencies
```shell
pip install litellm lunary
```
Get your Lunary public key from https://app.lunary.ai/settings
```shell
export LUNARY_PUBLIC_KEY="<your-public-key>"
```
@@ -1484,6 +1477,7 @@ litellm_settings:
```
#### Step 3: Start the LiteLLM proxy
```shell
litellm --config config.yaml
```
@@ -1510,8 +1504,8 @@ curl -X POST 'http://0.0.0.0:4000/chat/completions' \
## MLflow
#### Step 1: Install dependencies
Install the dependencies.
```shell
@@ -1531,6 +1525,7 @@ litellm_settings:
```
#### Step 3: Start the LiteLLM proxy
```shell
litellm --config config.yaml
```
@@ -1559,8 +1554,6 @@ Run the following command to start MLflow UI and review recorded traces.
mlflow ui
```
## Custom Callback Class [Async]
Use this when you want to run custom callbacks in `python`
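For orientation, a minimal sketch of such a handler (the hook names follow LiteLLM's `CustomLogger` interface; the file, class, and instance names are whatever your `config.yaml` references):

```python
# custom_callbacks.py
from litellm.integrations.custom_logger import CustomLogger


class MyCustomHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        # kwargs carries the request payload plus logging metadata
        print(f"On async success: model={kwargs.get('model')}")

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print(f"On async failure: {kwargs.get('exception')}")


proxy_handler_instance = MyCustomHandler()
```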
@@ -1685,7 +1678,6 @@ model_list:
litellm_settings:
  callbacks: custom_callbacks.proxy_handler_instance # sets litellm.callbacks = [proxy_handler_instance]
```
#### Step 3 - Start proxy + test request
@@ -1783,7 +1775,7 @@ class MyCustomHandler(CustomLogger):
**Expected Output**
```json
{'mode': 'embedding', 'input_cost_per_token': 0.002}
{ "mode": "embedding", "input_cost_per_token": 0.002 }
```
##### Logging responses from proxy
@@ -1966,7 +1958,6 @@ environment_variables:
  LANGSMITH_BASE_URL: "https://api.smith.langchain.com" # (Optional - only needed if you have a custom Langsmith instance)
```
2. Start Proxy
```
@@ -1989,10 +1980,10 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
}
'
```
Expect to see your log on Langsmith
<Image img={require('../../img/langsmith_new.png')} />
## Arize AI
1. Set `success_callback: ["arize"]` on litellm config.yaml
@@ -2009,7 +2000,7 @@ litellm_settings:
  callbacks: ["arize"]
environment_variables:
  ARIZE_SPACE_KEY: "d0*****"
  ARIZE_SPACE_ID: "d0*****"
  ARIZE_API_KEY: "141a****"
  ARIZE_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize GRPC api endpoint
  ARIZE_HTTP_ENDPOINT: "https://otlp.arize.com/v1" # OPTIONAL - your custom arize HTTP api endpoint. Set either this or ARIZE_ENDPOINT
@@ -2037,10 +2028,10 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
}
'
```
Expect to see your log on Arize
<Image img={require('../../img/langsmith_new.png')} />
## Langtrace
1. Set `success_callback: ["langtrace"]` on litellm config.yaml
@@ -2413,7 +2404,6 @@ curl --location 'http://0.0.0.0:4000/chat/completions' \
}'
```
<!-- ## (BETA) Moderation with Azure Content Safety
Note: This page is for logging callbacks and this is a moderation service. Commenting until we find a better location for this.

View file

@@ -12,7 +12,7 @@ else:
def set_attributes(span: Span, kwargs, response_obj):
    from openinference.semconv.trace import (
    from litellm.integrations._types.open_inference import (
        MessageAttributes,
        OpenInferenceSpanKindValues,
        SpanAttributes,

View file

@@ -40,11 +40,11 @@ class ArizeLogger:
        Raises:
            ValueError: If required environment variables are not set.
        """
        space_key = os.environ.get("ARIZE_SPACE_KEY")
        space_id = os.environ.get("ARIZE_SPACE_ID")
        api_key = os.environ.get("ARIZE_API_KEY")
        if not space_key:
            raise ValueError("ARIZE_SPACE_KEY not found in environment variables")
        if not space_id:
            raise ValueError("ARIZE_SPACE_ID not found in environment variables")
        if not api_key:
            raise ValueError("ARIZE_API_KEY not found in environment variables")
@@ -65,7 +65,7 @@ class ArizeLogger:
        endpoint = "https://otlp.arize.com/v1"
        return ArizeConfig(
            space_key=space_key,
            space_id=space_id,
            api_key=api_key,
            protocol=protocol,
            endpoint=endpoint,

View file

@@ -2654,7 +2654,7 @@ def _init_custom_logger_compatible_class( # noqa: PLR0915
        )
        os.environ["OTEL_EXPORTER_OTLP_TRACES_HEADERS"] = (
            f"space_key={arize_config.space_key},api_key={arize_config.api_key}"
            f"space_id={arize_config.space_id},api_key={arize_config.api_key}"
        )
        for callback in _in_memory_loggers:
            if (
@@ -2899,8 +2899,8 @@ def get_custom_logger_compatible_class( # noqa: PLR0915
    elif logging_integration == "arize":
        from litellm.integrations.opentelemetry import OpenTelemetry
        if "ARIZE_SPACE_KEY" not in os.environ:
            raise ValueError("ARIZE_SPACE_KEY not found in environment variables")
        if "ARIZE_SPACE_ID" not in os.environ:
            raise ValueError("ARIZE_SPACE_ID not found in environment variables")
        if "ARIZE_API_KEY" not in os.environ:
            raise ValueError("ARIZE_API_KEY not found in environment variables")
        for callback in _in_memory_loggers:

View file

@@ -1,4 +1,4 @@
from typing import TYPE_CHECKING, Literal, Any
from typing import TYPE_CHECKING, Literal, Any, Optional
from pydantic import BaseModel
@@ -8,7 +8,7 @@ else:
    Protocol = Any


class ArizeConfig(BaseModel):
    space_key: str
    space_id: str
    api_key: str
    protocol: Protocol
    endpoint: str

View file

@@ -1,11 +1,13 @@
import asyncio
import logging
from litellm import Choices
import pytest
from dotenv import load_dotenv
import litellm
from litellm._logging import verbose_logger, verbose_proxy_logger
from litellm.integrations._types.open_inference import SpanAttributes
from litellm.integrations.arize.arize import ArizeConfig, ArizeLogger
load_dotenv()
@@ -32,7 +34,7 @@ async def test_async_otel_callback():
@pytest.fixture
def mock_env_vars(monkeypatch):
    monkeypatch.setenv("ARIZE_SPACE_KEY", "test_space_key")
    monkeypatch.setenv("ARIZE_SPACE_ID", "test_space_id")
    monkeypatch.setenv("ARIZE_API_KEY", "test_api_key")
@@ -42,7 +44,7 @@ def test_get_arize_config(mock_env_vars):
"""
config = ArizeLogger.get_arize_config()
assert isinstance(config, ArizeConfig)
assert config.space_key == "test_space_key"
assert config.space_id == "test_space_id"
assert config.api_key == "test_api_key"
assert config.endpoint == "https://otlp.arize.com/v1"
assert config.protocol == "otlp_grpc"
@@ -58,3 +60,41 @@ def test_get_arize_config_with_endpoints(mock_env_vars, monkeypatch):
    config = ArizeLogger.get_arize_config()
    assert config.endpoint == "grpc://test.endpoint"
    assert config.protocol == "otlp_grpc"
def test_arize_set_attributes():
    """
    Test setting attributes for Arize
    """
    from unittest.mock import MagicMock

    from litellm.types.utils import ModelResponse

    span = MagicMock()
    kwargs = {
        "role": "user",
        "content": "simple arize test",
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "basic arize test"}],
        "litellm_params": {"metadata": {"key": "value"}},
        "standard_logging_object": {"model_parameters": {"user": "test_user"}},
    }
    response_obj = ModelResponse(
        usage={"total_tokens": 100, "completion_tokens": 60, "prompt_tokens": 40},
        choices=[Choices(message={"role": "assistant", "content": "response content"})],
    )

    ArizeLogger.set_arize_attributes(span, kwargs, response_obj)
    assert span.set_attribute.call_count == 14
    span.set_attribute.assert_any_call(SpanAttributes.METADATA, str({"key": "value"}))
    span.set_attribute.assert_any_call(SpanAttributes.LLM_MODEL_NAME, "gpt-4o")
    span.set_attribute.assert_any_call(SpanAttributes.OPENINFERENCE_SPAN_KIND, "LLM")
    span.set_attribute.assert_any_call(SpanAttributes.INPUT_VALUE, "basic arize test")
    span.set_attribute.assert_any_call("llm.input_messages.0.message.role", "user")
    span.set_attribute.assert_any_call("llm.input_messages.0.message.content", "basic arize test")
    span.set_attribute.assert_any_call(SpanAttributes.LLM_INVOCATION_PARAMETERS, '{"user": "test_user"}')
    span.set_attribute.assert_any_call(SpanAttributes.USER_ID, "test_user")
    span.set_attribute.assert_any_call(SpanAttributes.OUTPUT_VALUE, "response content")
    span.set_attribute.assert_any_call("llm.output_messages.0.message.role", "assistant")
    span.set_attribute.assert_any_call("llm.output_messages.0.message.content", "response content")
    span.set_attribute.assert_any_call(SpanAttributes.LLM_TOKEN_COUNT_TOTAL, 100)
    span.set_attribute.assert_any_call(SpanAttributes.LLM_TOKEN_COUNT_COMPLETION, 60)
    span.set_attribute.assert_any_call(SpanAttributes.LLM_TOKEN_COUNT_PROMPT, 40)

View file

@@ -89,7 +89,7 @@ expected_env_vars = {
"OPIK_API_KEY": "opik_api_key",
"LANGTRACE_API_KEY": "langtrace_api_key",
"LOGFIRE_TOKEN": "logfire_token",
"ARIZE_SPACE_KEY": "arize_space_key",
"ARIZE_SPACE_ID": "arize_space_id",
"ARIZE_API_KEY": "arize_api_key",
"PHOENIX_API_KEY": "phoenix_api_key",
"ARGILLA_API_KEY": "argilla_api_key",