Commit graph

252 commits

Author SHA1 Message Date
Krish Dholakia
d88e8922d4
Litellm dev 11 02 2024 (#6561)
* fix(dual_cache.py): update in-memory check for redis batch get cache

Fixes latency delay for async_batch_redis_cache

* fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set

* feat(user_api_key_auth.py): add parent otel component for auth

allows us to isolate how much latency is added by auth checks

* perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task)

reduces latency by 200ms

* feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter)

Reduces latency by 400-800ms

* fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls

reduces latency by 50-100ms

* fix: fix linting error

* fix(_service_logger.py): fix import

* fix(user_api_key_auth.py): fix service logging

* fix(dual_cache.py): don't pass 'self'

* fix: fix python3.8 error

* fix: fix init
2024-11-04 07:48:20 +05:30
Ishaan Jaff
2c37aad1c4 ui new build 2024-10-30 23:53:14 +05:30
Ishaan Jaff
03c56804a0
(UI) fix + test displaying number of keys an internal user owns (#6507)
* fix view internal user key count

* add test for /user/list

* fix test user list

* testing ui change

* ui new build
2024-10-30 20:44:15 +05:30
Krish Dholakia
4f8a3fd4cf
redis otel tracing + async support for latency routing (#6452)
* docs(exception_mapping.md): add missing exception types

Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183

* fix(main.py): register custom model pricing with specific key

Ensure custom model pricing is registered to the specific model+provider key combination

* test: make testing more robust for custom pricing

* fix(redis_cache.py): instrument otel logging for sync redis calls

ensures complete coverage for all redis cache calls

* refactor: pass parent_otel_span for redis caching calls in router

allows for more observability into what calls are causing latency issues

* test: update tests with new params

* refactor: ensure e2e otel tracing for router

* refactor(router.py): add more otel tracing across router

catch all latency issues for router requests

* fix: fix linting error

* fix(router.py): fix linting error

* fix: fix test

* test: fix tests

* fix(dual_cache.py): pass ttl to redis cache

* fix: fix param
2024-10-28 21:52:12 -07:00
Ishaan Jaff
b8d91b3f41 ui new build 2024-10-25 23:38:54 +04:00
Krish Dholakia
2acb0c0675
Litellm Minor Fixes & Improvements (10/12/2024) (#6179)
* build(model_prices_and_context_window.json): add bedrock llama3.2 pricing

* build(model_prices_and_context_window.json): add bedrock cross region inference pricing

* Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)"

This reverts commit 2a5624af47.

* add azure/gpt-4o-2024-05-13 (#6174)

* LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158)

* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>

* docs(custom_llm_server.md): update doc on passing custom params

* fix(pass_through_endpoints.py): don't require headers

Fixes https://github.com/BerriAI/litellm/issues/6128

* feat(utils.py): add support for caching rerank endpoints

Closes https://github.com/BerriAI/litellm/issues/6144

* feat(litellm_logging.py): add response headers for failed requests

Closes https://github.com/BerriAI/litellm/issues/6159

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-12 11:48:34 -07:00
Ishaan Jaff
fa1451af90 ui new build 2024-10-09 16:04:49 +05:30
Ishaan Jaff
285b589095 ui new build 2024-10-07 13:01:19 +05:30
Krrish Dholakia
3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00
Ishaan Jaff
36114f234c ui new build 2024-09-23 18:10:12 -07:00
Ishaan Jaff
c19592e502 ui new build 2024-09-23 13:17:40 -07:00
Ishaan Jaff
030b2e1bae ui new build 2024-09-23 07:56:23 -07:00
Ishaan Jaff
696fc387d2 ui new build 2024-09-20 08:11:05 -07:00
Ishaan Jaff
8dbb1f59d7 ui new build 2024-09-19 17:18:49 -07:00
Krrish Dholakia
cdd7cd4d69 build: bump from 1.44.28 -> 1.45.0 2024-09-12 23:10:29 -07:00
Krish Dholakia
98c34a7e27
LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache

fixes issue where ttl for redis client was not being set on increment_cache

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(router.py): support adding retry policy + allowed fails policy via config.yaml

* fix(router.py): don't cooldown single deployments

No point, as there's no other deployment to loadbalance with.

* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens

Closes https://github.com/BerriAI/litellm/issues/5605

* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs

* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set

Fixes issue where key logging would not be set if team metadata was not none

* fix(secret_managers/main.py): load environment variables correctly

Fixes issue where os.environ/ was not being loaded correctly

* test(test_router.py): fix test

* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek

* test: fix tests

* test: fix test

* test: fix test

* test: fix test

* test: fix test
2024-09-11 22:36:06 -07:00
Ishaan Jaff
c574c729cd ui new build 2024-09-07 16:24:06 -07:00
Ishaan Jaff
516a6b63e1 ui new build 2024-09-06 18:10:46 -07:00
Ishaan Jaff
8e25ba8de1 ui new build 2024-09-05 17:05:39 -07:00
Krrish Dholakia
18689c25e9 feat(team_endpoints.py): return team member budgets in /team/info call
Fixes https://github.com/BerriAI/litellm/issues/5390
2024-08-28 19:14:01 -07:00
Ishaan Jaff
5ad6d4cf77 new ui build 2024-08-28 14:43:33 -07:00
Krrish Dholakia
bd3057e495 test(test_proxy_exception_mapping): loosen assert 2024-08-27 16:14:30 -07:00
Ishaan Jaff
f3b3f39eb5 ui new build 2024-08-26 19:01:35 -07:00
Krrish Dholakia
64952ab044 fix: fix tests 2024-08-24 19:32:22 -07:00
Ishaan Jaff
d9769c393e ui new build 2024-08-24 16:45:53 -07:00
Krrish Dholakia
7fce6b0163 fix(health_check.py): return 'missing mode' error message, if error with health check, and mode is missing 2024-08-16 17:24:29 -07:00
Ishaan Jaff
9c3124c5a7 ui new build 2024-08-16 12:53:23 -07:00
Krrish Dholakia
1510daba4f bump: version 1.43.15 → 1.43.16 2024-08-15 23:04:30 -07:00
Krrish Dholakia
5fdbfcee44 fix(user_api_key_auth.py): more precisely expand scope to handle 'basic' tokens 2024-08-13 22:00:33 -07:00
Ishaan Jaff
c7804e1ea2 ui new build 2024-08-13 18:45:09 -07:00
Krrish Dholakia
0ea056971c docs(prefix.md): add prefix support to docs 2024-08-10 13:55:47 -07:00
Ishaan Jaff
2f8db40cab ui new build 2024-08-09 12:26:24 -07:00
Krrish Dholakia
f0f19a9457 bump: version 1.43.4 → 1.43.5 2024-08-08 23:47:01 -07:00
Krish Dholakia
3d259f1883
Merge branch 'main' into litellm_sso_team_member_add 2024-08-08 23:34:06 -07:00
Ishaan Jaff
30ab7191d6 ui new build 2024-08-08 23:30:51 -07:00
Ishaan Jaff
2245711e69 ui new build 2024-08-08 23:29:08 -07:00
Krrish Dholakia
81a0a8ab22 fix(internal_user_endpoints.py): expose new 'internal_user_budget_duration' flag
Relevant to - https://github.com/BerriAI/litellm/issues/5106
2024-08-08 23:26:01 -07:00
Ishaan Jaff
04037c5b0e ui new build 2024-08-08 23:25:27 -07:00
Ishaan Jaff
8c8e2f8c68 ui new build 2024-08-08 23:25:27 -07:00
Ishaan Jaff
b14b346e2e ui new build 2024-08-08 18:53:22 -07:00
Ishaan Jaff
3bb001f136 ui new build 2024-08-08 18:25:41 -07:00
Krrish Dholakia
f75a31cd11 fix(user_api_key_auth.py): Fixes https://github.com/BerriAI/litellm/issues/5111 2024-08-08 17:18:59 -07:00
Krrish Dholakia
5da4c27e8d fix(internal_user_endpoints.py): expose new 'internal_user_budget_duration' flag
Relevant to - https://github.com/BerriAI/litellm/issues/5106
2024-08-08 13:05:03 -07:00
Krrish Dholakia
ee8d2f25b9 build: ui - update to include max budget per team 2024-08-08 09:09:23 -07:00
Krrish Dholakia
540d5f318b feat(litellm_logging.py): log exception response headers to langfuse 2024-08-01 18:01:07 -07:00
Ishaan Jaff
cc9a597d37 ui new build 2024-08-01 09:59:24 -07:00
Krrish Dholakia
d02d3d9712 build(model_prices_and_context_window.json): add azure gpt-4o-mini regional + global standard pricing to model cost map 2024-08-01 09:44:40 -07:00
Krrish Dholakia
d8a8cd2961 feat(ui): add ability to enable traceloop + langsmith via ui 2024-07-31 21:40:29 -07:00
Krrish Dholakia
b77edc59ed fix(user_api_key_cache): fix check to not raise error if team object is missing 2024-07-30 18:25:04 -07:00
Ishaan Jaff
d5593a9106 ui new build 2024-07-30 13:33:34 -07:00