Commit graph

694 commits

Ishaan Jaff
91dd3e11c4 [Feat-Perf] Use Batching + Squashing (#5645)
* use folder for slack alerting

* clean up slack alerting

* fix test alerting
2024-09-12 18:37:53 -07:00
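
The batching + squashing idea referenced in this commit, in rough form: queue alerts as they arrive, collapse ("squash") duplicates into one line with a repeat count, and post a single Slack message per flush interval. Class and helper names below are hypothetical, not LiteLLM's actual ones.

```python
# Minimal sketch of batching + squashing alerts before a single Slack send.
import asyncio
from collections import Counter

class SlackAlertBatcher:
    def __init__(self, flush_interval: float = 10.0):
        self.flush_interval = flush_interval
        self.queue: list[str] = []
        self.lock = asyncio.Lock()

    async def add_alert(self, message: str):
        async with self.lock:
            self.queue.append(message)

    async def flush_loop(self):
        while True:
            await asyncio.sleep(self.flush_interval)
            async with self.lock:
                pending, self.queue = self.queue, []
            if not pending:
                continue
            # Squash duplicate alerts into "<msg> (x<count>)" lines.
            squashed = [
                msg if count == 1 else f"{msg} (x{count})"
                for msg, count in Counter(pending).items()
            ]
            await self.send_to_slack("\n".join(squashed))

    async def send_to_slack(self, text: str):
        print(f"POST to Slack webhook:\n{text}")  # stand-in for the webhook call
```
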
Ishaan Jaff
f7fc14ac34 Merge branch 'main' into litellm_otel_fixes 2024-09-11 18:06:29 -07:00
Ishaan Jaff
6f9d7a7df8 Merge pull request #5638 from BerriAI/litellm_langsmith_perf
[Langsmith Perf Improvement] Use /batch for Langsmith Logging
2024-09-11 17:43:26 -07:00
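
The perf win here comes from sending one request for many runs. A minimal sketch, assuming LangSmith's batch ingest endpoint (`POST /runs/batch` with a `post` array and an `x-api-key` header); treat the exact payload shape as an assumption:

```python
import httpx

async def flush_langsmith_batch(queue: list[dict], api_key: str) -> None:
    """Send all queued runs in a single POST instead of one request per run."""
    if not queue:
        return
    async with httpx.AsyncClient() as client:
        resp = await client.post(
            "https://api.smith.langchain.com/runs/batch",
            json={"post": queue},  # create every queued run in one call
            headers={"x-api-key": api_key},
        )
        resp.raise_for_status()
    queue.clear()  # clear only after a successful send (cf. 67b3ce8740 below)
```
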
Ishaan Jaff
73d838e7c8 fix move logic to custom_batch_logger 2024-09-11 16:19:24 -07:00
Ishaan Jaff
3e0c4448cd use vars for batch size and flush interval seconds 2024-09-11 14:40:58 -07:00
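
A sketch of the configuration pattern in this commit: read batch size and flush interval from environment variables with sensible defaults rather than hard-coding them. The variable names here are illustrative, not necessarily the ones LiteLLM reads.

```python
import os

DEFAULT_BATCH_SIZE = 512
DEFAULT_FLUSH_INTERVAL_SECONDS = 5

# Fall back to the defaults when the env vars are unset.
batch_size = int(os.getenv("LOGGER_BATCH_SIZE", DEFAULT_BATCH_SIZE))
flush_interval_seconds = int(
    os.getenv("LOGGER_FLUSH_INTERVAL_SECONDS", DEFAULT_FLUSH_INTERVAL_SECONDS)
)
```
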
Ishaan Jaff
c72c8c0383 fix otel use sensible defaults 2024-09-11 14:24:04 -07:00
Ishaan Jaff
53734dbcfc fix langsmith tenacity 2024-09-11 13:48:44 -07:00
Ishaan Jaff
d0ae85a7bb use lock to flush events to langsmith 2024-09-11 13:27:16 -07:00
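
Why a lock: without one, two concurrent flushes can snapshot the same queue and double-send events. A minimal sketch of the guarded flush (names are illustrative):

```python
import asyncio

flush_lock = asyncio.Lock()
log_queue: list[dict] = []

async def flush_events(send_batch) -> None:
    async with flush_lock:  # only one flush may snapshot-and-clear at a time
        if not log_queue:
            return
        batch = list(log_queue)
        log_queue.clear()
        await send_batch(batch)
```
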
Ishaan Jaff
c5f64ef99e add better debugging for flush interval 2024-09-11 13:02:34 -07:00
Ishaan Jaff
385286c089 use tenacity for langsmith 2024-09-11 12:41:22 -07:00
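
Retrying LangSmith requests with tenacity, roughly as the commit title suggests; the backoff parameters below are illustrative defaults, not LiteLLM's exact values:

```python
import httpx
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3), wait=wait_exponential(min=1, max=10))
async def post_with_retries(client: httpx.AsyncClient, url: str, payload: dict):
    resp = await client.post(url, json=payload)
    resp.raise_for_status()  # a raised error triggers the next tenacity attempt
    return resp
```
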
Ishaan Jaff
67b3ce8740 fix langsmith clear logged queue on success 2024-09-11 11:56:24 -07:00
Krish Dholakia
7f47c48b35 LiteLLM Minor Fixes and Improvements (09/10/2024) (#5618)
* fix(cost_calculator.py): move to debug for noisy warning message on cost calculation error

Fixes https://github.com/BerriAI/litellm/issues/5610

* fix(databricks/cost_calculator.py): Handles model name issues for databricks models

* fix(main.py): fix stream chunk builder for multiple tool calls

Fixes https://github.com/BerriAI/litellm/issues/5591

* fix: correctly set user_alias when passed in

Fixes https://github.com/BerriAI/litellm/issues/5612

* fix(types/utils.py): allow passing role for message object

https://github.com/BerriAI/litellm/issues/5621

* fix(litellm_logging.py): Fix langfuse logging across multiple projects

Fixes issue where langfuse logger was re-using the old logging object

* feat(proxy/_types.py): support adding key-based tags for tag-based routing

Enable tag based routing at key-level

* fix(proxy/_types.py): fix inheritance

* test(test_key_generate_prisma.py): fix test

* test: fix test

* fix(litellm_logging.py): return used callback object
2024-09-11 11:30:29 -07:00
Ishaan Jaff
a053464fc5 langsmith use batching for logging 2024-09-11 11:28:27 -07:00
Ishaan Jaff
530cc34866 stash - langsmith use batching for logging 2024-09-11 08:06:56 -07:00
Ishaan Jaff
4515f43976 Merge pull request #5623 from BerriAI/litellm_vertex_use_async_for_getting_token
[Feat-Vertex Perf] Use async func to get auth credentials
2024-09-10 18:53:48 -07:00
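
The Vertex perf issue: `google-auth` credential loading and token refresh are blocking I/O, so calling them directly from an async handler stalls the whole event loop. A sketch of the fix's shape, with illustrative helper names:

```python
import asyncio

import google.auth
import google.auth.transport.requests

def _load_credentials():
    """Blocking: resolve default credentials and refresh the access token."""
    credentials, project_id = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(google.auth.transport.requests.Request())
    return credentials, project_id

async def get_auth_credentials():
    # Offload the blocking refresh to a worker thread (Python 3.9+).
    return await asyncio.to_thread(_load_credentials)
```
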
Ishaan Jaff
c8fe600dbf fix case when gemini is used 2024-09-10 17:06:45 -07:00
Ishaan Jaff
7891b3742c fix vertex use async func to set auth creds 2024-09-10 16:12:18 -07:00
Ishaan Jaff
f481961429 use get async httpx client 2024-09-10 13:08:49 -07:00
Ishaan Jaff
5e15cc546a use get_async_httpx_client for logging httpx 2024-09-10 13:03:55 -07:00
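
`get_async_httpx_client` is LiteLLM's helper; the pattern it implements is roughly the one below: keep one shared `httpx.AsyncClient` per purpose so logging calls reuse a connection pool instead of paying TCP + TLS setup on every request. This is an approximation of the idea, not the helper's exact signature.

```python
import httpx

_clients: dict[str, httpx.AsyncClient] = {}

def get_async_httpx_client(purpose: str, timeout: float = 10.0) -> httpx.AsyncClient:
    """Return a cached client for this purpose, creating it on first use."""
    if purpose not in _clients:
        _clients[purpose] = httpx.AsyncClient(timeout=timeout)
    return _clients[purpose]
```

A long-lived shared client also avoids the early-closure problem called out further down the log ("use module level http client if none given"): a per-request client can be closed while a streaming response still needs it.
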
Ishaan Jaff
02325f33d7 Merge branch 'main' into litellm_allow_turning_off_message_logging_for_callbacks 2024-09-09 21:59:36 -07:00
Ishaan Jaff
4e51da6c16 Merge pull request #5576 from BerriAI/litellm_set_max_batch_size
[Fix - Otel logger] Set a max queue size of 100 logs for OTEL
2024-09-09 17:39:16 -07:00
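
Capping the OTEL queue bounds memory under log bursts: once the queue is full, new spans are dropped instead of accumulating. A sketch using the OpenTelemetry SDK's batch processor; the 100-span cap mirrors the PR title, the exporter is a stand-in:

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        ConsoleSpanExporter(),      # stand-in for the real OTLP exporter
        max_queue_size=100,         # spans buffered beyond this are dropped
        max_export_batch_size=100,  # must be <= max_queue_size
    )
)
```
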
Ishaan Jaff
2e021a6203 fix otel defaults 2024-09-09 16:18:55 -07:00
Ishaan Jaff
0f0a5c1758 fix init custom logger when init OTEL runs 2024-09-09 16:03:39 -07:00
Ishaan Jaff
af5b87a8de add message_logging on Custom Logger 2024-09-09 15:59:42 -07:00
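
The `message_logging` flag lets a callback ship metadata while redacting prompt and response content. A rough sketch of the behavior (field names are illustrative):

```python
class CustomLogger:
    message_logging: bool = True  # set False to redact message content

    def redact_if_needed(self, payload: dict) -> dict:
        if self.message_logging:
            return payload
        redacted = dict(payload)
        redacted["messages"] = "redacted-by-litellm"
        redacted["response"] = "redacted-by-litellm"
        return redacted
```
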
Ishaan Jaff
8286804649 fix slack alerting allow setting custom spend report frequency 2024-09-07 11:42:16 -07:00
Krish Dholakia
8d5cad0c39 fix(langsmith.py): support sampling langsmith traces (#5577) 2024-09-06 22:14:44 -07:00
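
Sampling keeps LangSmith volume and cost down by logging only a fraction of traces. The core of such a sampler is a one-liner; the env var name below is illustrative:

```python
import os
import random

sampling_rate = float(os.getenv("LANGSMITH_SAMPLING_RATE", "1.0"))

def should_log_trace() -> bool:
    # With rate 0.2, roughly 1 in 5 traces is forwarded to LangSmith.
    return random.random() < sampling_rate
```
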
Ishaan Jaff
22fe7705e7 fix otel set max_queue_size, max_export_batch_size 2024-09-06 17:31:43 -07:00
Krish Dholakia
2cab33b061 LiteLLM Minor Fixes and Improvements (09/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info`, `/model_group/info` (#5539)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model alias'es on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
Ishaan Jaff
5c7fbe0be2 fix otel max batch size 2024-09-06 17:12:01 -07:00
Ishaan Jaff
99853de886 fix datadog log exceptions 2024-09-06 14:36:35 -07:00
Ishaan Jaff
4bc0d297f7 fix otel type 2024-09-05 19:04:56 -07:00
Krish Dholakia
a074f5801e Update lago.py to accommodate API change (#5495) (#5543)
* Update lago.py to accommodate API change (#5495)

external_customer_id is deprecated. 

external_subscription_id is the replacement.

* fix(lago.py): fixes

---------

Co-authored-by: Raymond Weitekamp <19483938+rawwerks@users.noreply.github.com>
2024-09-05 17:27:40 -07:00
Krish Dholakia
6fdee99632 LiteLLM Minor fixes + improvements (09/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
Krish Dholakia
8eb7cb5300 LiteLLM Minor fixes + improvements (09/03/2024) (#5488)
* fix(internal_user_endpoints.py): set budget_reset_at for /user/update

* fix(vertex_and_google_ai_studio_gemini.py): handle accumulated json

Fixes https://github.com/BerriAI/litellm/issues/5479

* fix(vertex_ai_and_gemini.py): fix assistant message function call when content is not None

Fixes https://github.com/BerriAI/litellm/issues/5490

* fix(proxy_server.py): generic state uuid for okta sso

* fix(lago.py): improve debug logs

Debugging for https://github.com/BerriAI/litellm/issues/5477

* docs(bedrock.md): add bedrock cross-region inferencing to docs

* fix(azure.py): return azure response headers on aembedding call

* feat(azure.py): return azure response headers for `/audio/transcription`

* fix(types/utils.py): standardize deepseek / anthropic prompt caching usage information

Closes https://github.com/BerriAI/litellm/issues/5285

* docs(usage.md): add docs on litellm usage object

* test(test_completion.py): mark flaky test
2024-09-03 21:21:34 -07:00
Ishaan Jaff
4dff632905 add sync_construct_request_headers 2024-09-03 10:36:10 -07:00
Krish Dholakia
11f85d883f LiteLLM Minor Fixes + Improvements (#5474)
* feat(proxy/_types.py): add lago billing to callbacks ui

Closes https://github.com/BerriAI/litellm/issues/5472

* fix(anthropic.py): return anthropic prompt caching information

Fixes https://github.com/BerriAI/litellm/issues/5364

* feat(bedrock/chat.py): support 'json_schema' for bedrock models

Closes https://github.com/BerriAI/litellm/issues/5434

* fix(bedrock/embed/embeddings.py): support async embeddings for amazon titan models

* fix: linting fixes

* fix: handle key errors

* fix(bedrock/chat.py): fix bedrock ai21 streaming object

* feat(bedrock/embed): support bedrock embedding optional params

* fix(databricks.py): fix usage chunk

* fix(internal_user_endpoints.py): apply internal user defaults, if user role updated

Fixes issue where user update wouldn't apply defaults

* feat(slack_alerting.py): provide multiple slack channels for a given alert type

multiple channels might be interested in receiving an alert for a given type

* docs(alerting.md): add multiple channel alerting to docs
2024-09-02 14:29:57 -07:00
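
On the multi-channel alerting feature in this commit: the natural shape is a mapping from alert type to a list of webhooks, with one send per subscriber. A sketch with illustrative names and alert types:

```python
ALERT_TYPE_TO_WEBHOOKS: dict[str, list[str]] = {
    "budget_alerts": [
        "https://hooks.slack.com/services/T000/B000/XXXX",  # finance channel
        "https://hooks.slack.com/services/T000/B111/YYYY",  # platform channel
    ],
    "llm_exceptions": ["https://hooks.slack.com/services/T000/B222/ZZZZ"],
}

async def send_alert(alert_type: str, message: str, post) -> None:
    # Fan the alert out to every channel subscribed to this alert type.
    for webhook_url in ALERT_TYPE_TO_WEBHOOKS.get(alert_type, []):
        await post(webhook_url, message)
```
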
Krish Dholakia
ca4e746545 LiteLLM minor fixes + improvements (31/08/2024) (#5464)
* fix(vertex_endpoints.py): fix vertex ai pass through endpoints

* test(test_streaming.py): skip model due to end of life

* feat(custom_logger.py): add special callback for model hitting tpm/rpm limits

Closes https://github.com/BerriAI/litellm/issues/4096
2024-09-01 13:31:42 -07:00
Ishaan Jaff
3fae5eb94e feat prometheus add metric for failure / model 2024-08-31 10:05:23 -07:00
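
A per-model failure counter in `prometheus_client` looks roughly like this; the metric and label names are illustrative, not LiteLLM's exact ones:

```python
from prometheus_client import Counter

llm_failures = Counter(
    "litellm_deployment_failures_total",
    "Number of failed LLM calls, partitioned by model",
    labelnames=["model"],
)

def record_failure(model: str) -> None:
    llm_failures.labels(model=model).inc()
```
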
Ishaan Jaff
c60125d7be add gcs bucket base 2024-08-30 10:41:39 -07:00
Krish Dholakia
fe2a3c02e5 - merge - fix TypeError: 'CompletionUsage' object is not subscriptable #5441 (#5448)
* fix TypeError: 'CompletionUsage' object is not subscriptable (#5441)

* test(test_team_logging.py): mark flaky test

---------

Co-authored-by: yafei lee <yafei@dao42.com>
2024-08-30 08:54:42 -07:00
Ishaan Jaff
fddf10eeb8 prometheus - safe update start / end time 2024-08-28 16:13:56 -07:00
Ishaan Jaff
359a003ac8 v0 add rerank on litellm proxy 2024-08-27 17:28:39 -07:00
Ishaan Jaff
a8e192a868 fix use guardrail for pre call hook 2024-08-23 09:34:08 -07:00
Ishaan Jaff
be853d93da fix prom latency metrics 2024-08-23 06:59:19 -07:00
Ishaan Jaff
9476582fb7 update prometheus metric names 2024-08-22 14:03:00 -07:00
Ishaan Jaff
c719c375f7 track litellm_request_latency_metric 2024-08-22 13:58:10 -07:00
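
Request latency is a natural fit for a Prometheus histogram, which enables percentile queries downstream. A sketch reusing the metric name from the commit; the buckets and label set are assumptions:

```python
import time

from prometheus_client import Histogram

litellm_request_latency_metric = Histogram(
    "litellm_request_latency_seconds",
    "End-to-end LLM request latency in seconds",
    labelnames=["model"],
    buckets=(0.1, 0.5, 1, 2, 5, 10, 30, 60),
)

def observe_latency(model: str, start_time: float) -> None:
    litellm_request_latency_metric.labels(model=model).observe(time.time() - start_time)
```
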
Ishaan Jaff
0ccb1c17f7 fix init correct prometheus metrics 2024-08-22 13:29:35 -07:00
Krish Dholakia
41835d9397 Merge pull request #5323 from MarkRx/feature/langsmith-ids
Support LangSmith parent_run_id, trace_id, session_id
2024-08-21 15:38:50 -07:00
MarkRx
58529e2c9c Support LangSmith parent_run_id, trace_id, session_id 2024-08-21 16:09:30 -04:00
Ishaan Jaff
cdbd245c3d working lakera ai during call hook 2024-08-20 14:39:04 -07:00