feat(proxy_cli.py): add new 'log_config' cli param (#6352)

* feat(proxy_cli.py): add new 'log_config' cli param

Allows passing a logging.conf file to uvicorn on startup
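
As a rough sketch of how a flag like this can be forwarded (illustrative only; the function signature and app import path below are assumptions, not the actual proxy_cli.py code):

```python
# Illustrative sketch, not the actual proxy_cli.py implementation.
# uvicorn.run() accepts a log_config path: .json/.yaml files are applied
# with logging.config.dictConfig, other files with logging.config.fileConfig.
from typing import Optional

import uvicorn


def run_server(host: str = "0.0.0.0", port: int = 4000, log_config: Optional[str] = None):
    uvicorn.run(
        "litellm.proxy.proxy_server:app",  # assumed app import path
        host=host,
        port=port,
        log_config=log_config,  # None keeps uvicorn's default logging setup
    )
```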

* docs(cli.md): document the new --log_config param in the CLI docs

* fix(get_llm_provider_logic.py): fix default api base for litellm_proxy

Fixes https://github.com/BerriAI/litellm/issues/6332
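
For context, `litellm_proxy/` models route calls through a running LiteLLM proxy. A hedged usage sketch (model name, URL, and key are placeholders):

```python
import litellm

# Calls are routed through a running LiteLLM proxy. This fix corrects the
# default api_base resolved when none is passed; supplying it explicitly,
# as below, works either way. All values here are placeholders.
response = litellm.completion(
    model="litellm_proxy/gpt-3.5-turbo",
    api_base="http://localhost:4000",
    api_key="sk-1234",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```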

* feat(openai_like/embedding): add support for Jina AI embeddings

Closes https://github.com/BerriAI/litellm/issues/6337
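
A hedged example of what this enables, assuming the `jina_ai/` provider prefix and `JINA_AI_API_KEY` env var (Jina exposes an OpenAI-compatible embeddings endpoint, which is why the openai_like handler can serve it):

```python
import os

import litellm

# Assumes a Jina AI API key is set; the provider prefix and model name
# follow Jina's published model list and may differ in your setup.
os.environ["JINA_AI_API_KEY"] = "jina_..."  # placeholder

response = litellm.embedding(
    model="jina_ai/jina-embeddings-v3",
    input=["good morning from litellm"],
)
print(len(response.data[0]["embedding"]))  # embedding vector length
```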

* docs(deploy.md): update entrypoint.sh filepath post-refactor

Fixes outdated docs

* feat(prometheus.py): emit time_to_first_token metric on prometheus

Closes https://github.com/BerriAI/litellm/issues/6334

* fix(prometheus.py): only emit time to first token metric if stream is True

Enables more accurate TTFT measurement (time to first token is only meaningful for streaming responses)
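
A minimal sketch of the pattern with `prometheus_client` (metric and label names mirror the docs table further down; this is not the actual prometheus.py code):

```python
from prometheus_client import Histogram

# Names mirror the documented metric; illustrative sketch only.
ttft_histogram = Histogram(
    "litellm_llm_api_time_to_first_token_metric",
    "Time to first token for LLM API call",
    labelnames=["model", "hashed_api_key", "api_key_alias", "team", "team_alias"],
)


def record_ttft(start_time: float, first_token_time: float, stream: bool, labels: dict) -> None:
    # Only observe for streaming calls: in a non-streaming response the
    # "first token" arrives with the whole payload, which would skew TTFT.
    if not stream:
        return
    ttft_histogram.labels(**labels).observe(first_token_time - start_time)
```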

* test: handle vertex api instability

* fix(get_llm_provider_logic.py): fix import

* fix(openai.py): fix deepinfra default api base

* fix(anthropic/transformation.py): remove anthropic beta header (#6361)


@@ -176,3 +176,11 @@ Cli arguments, --host, --port, --num_workers
```
## --log_config
- **Default:** `None`
- **Type:** `str`
- Specify a log configuration file for uvicorn.
- **Usage:**
```shell
litellm --log_config path/to/log_config.conf
```
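
For illustration, one way to produce a config uvicorn will accept (uvicorn loads `.json`/`.yaml` files via `logging.config.dictConfig` and other extensions via `fileConfig`; the filename and format below are just an example, not part of the shipped docs):

```python
import json

# Minimal dictConfig-style logging config; handler/formatter choices are
# illustrative. Any valid uvicorn log config works the same way.
log_config = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "default": {"format": "%(asctime)s %(levelname)s %(name)s %(message)s"}
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "default"}
    },
    "root": {"level": "INFO", "handlers": ["console"]},
}

with open("log_config.json", "w") as f:
    json.dump(log_config, f, indent=2)
```

The proxy can then be started with `litellm --log_config log_config.json`.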


@@ -125,7 +125,7 @@ WORKDIR /app
COPY config.yaml .
# Make sure your docker/entrypoint.sh is executable
-RUN chmod +x entrypoint.sh
+RUN chmod +x ./docker/entrypoint.sh
# Expose the necessary port
EXPOSE 4000/tcp
@@ -632,7 +632,7 @@ RUN rm -rf /app/litellm/proxy/_experimental/out/* && \
WORKDIR /app
# Make sure your entrypoint.sh is executable
-RUN chmod +x entrypoint.sh
+RUN chmod +x ./docker/entrypoint.sh
# Expose the necessary port
EXPOSE 4000/tcp


@@ -134,8 +134,9 @@ Use this for LLM API Error monitoring and tracking remaining rate limits and tok
| Metric Name | Description |
|----------------------|--------------------------------------|
-| `litellm_request_total_latency_metric` | Total latency (seconds) for a request to LiteLLM Proxy Server - tracked for labels `litellm_call_id`, `model`, `user_api_key`, `user_api_key_alias`, `user_api_team`, `user_api_team_alias` |
-| `litellm_llm_api_latency_metric` | Latency (seconds) for just the LLM API call - tracked for labels `litellm_call_id`, `model`, `user_api_key`, `user_api_key_alias`, `user_api_team`, `user_api_team_alias` |
+| `litellm_request_total_latency_metric` | Total latency (seconds) for a request to LiteLLM Proxy Server - tracked for labels `model`, `hashed_api_key`, `api_key_alias`, `team`, `team_alias` |
+| `litellm_llm_api_latency_metric` | Latency (seconds) for just the LLM API call - tracked for labels `model`, `hashed_api_key`, `api_key_alias`, `team`, `team_alias` |
+| `litellm_llm_api_time_to_first_token_metric` | Time to first token for LLM API call - tracked for labels `model`, `hashed_api_key`, `api_key_alias`, `team`, `team_alias` [Note: only emitted for streaming requests] |
## Virtual Key - Budget, Rate Limit Metrics