Commit graph

3623 commits

Author SHA1 Message Date
Krish Dholakia
d121fc4775 fix(proxy/utils.py): auto-update if required view missing from db. raise warning for optional views. (#5675)
Prevents missing optional views from blocking proxy startup.
2024-09-12 22:15:44 -07:00
Ishaan Jaff
e7c22f63e7 [Fix-Router] Don't cooldown when only 1 deployment exists (#5673)
* fix get model list

* fix test custom callback router

* fix embedding fallback test

* fix router retry policy on AuthErrors

* fix router test

* add test for single deployments no cooldown test prod

* add test test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
2024-09-12 19:14:58 -07:00
Ishaan Jaff
91dd3e11c4 [Feat-Perf] Use Batching + Squashing (#5645)
* use folder for slack alerting

* clean up slack alerting

* fix test alerting
2024-09-12 18:37:53 -07:00
Krish Dholakia
0b249278bb Refactor 'check_view_exists' logic (#5659)
* fix(proxy/utils.py): comment out auto-upsert logic in check_view_exists

Prevents proxy from failing on startup due to faulty logic

* fix(db/migration_scripts/create_views.py): fix 'DailyTagSpend' quotation on check

* fix(create_views.py): monthly global spend time period should be 30d not 20d

* fix(schema.prisma): index on startTime and endUser for efficient UI querying
2024-09-12 13:39:50 -07:00
Krish Dholakia
dec53961f7 LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache

fixes issue where ttl for redis client was not being set on increment_cache

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(router.py): support adding retry policy + allowed fails policy via config.yaml

* fix(router.py): don't cooldown single deployments

No point, as there's no other deployment to loadbalance with.

* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens

Closes https://github.com/BerriAI/litellm/issues/5605

* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs

* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set

Fixes issue where key logging would not be set if team metadata was not none

* fix(secret_managers/main.py): load environment variables correctly

Fixes issue where os.environ/ was not being loaded correctly

* test(test_router.py): fix test

* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek

* test: fix tests

* test: fix test

* test: fix test

* test: fix test

* test: fix test
2024-09-11 22:36:06 -07:00
Ishaan Jaff
f7fc14ac34 Merge branch 'main' into litellm_otel_fixes 2024-09-11 18:06:29 -07:00
Ishaan Jaff
6f9d7a7df8 Merge pull request #5638 from BerriAI/litellm_langsmith_perf
[Langsmith Perf Improvement] Use /batch for Langsmith Logging
2024-09-11 17:43:26 -07:00
steffen-sbt
357dd3cad5 Add the option to specify a schema in the postgres DB, also modify docs (#5640) 2024-09-11 14:53:52 -07:00
Ishaan Jaff
3e0c4448cd use vars for batch size and flush interval seconds 2024-09-11 14:40:58 -07:00
Ishaan Jaff
c72c8c0383 fix otel use sensible defaults 2024-09-11 14:24:04 -07:00
Krish Dholakia
7f47c48b35 LiteLLM Minor Fixes and Improvements (09/10/2024) (#5618)
* fix(cost_calculator.py): move to debug for noisy warning message on cost calculation error

Fixes https://github.com/BerriAI/litellm/issues/5610

* fix(databricks/cost_calculator.py): Handles model name issues for databricks models

* fix(main.py): fix stream chunk builder for multiple tool calls

Fixes https://github.com/BerriAI/litellm/issues/5591

* fix: correctly set user_alias when passed in

Fixes https://github.com/BerriAI/litellm/issues/5612

* fix(types/utils.py): allow passing role for message object

https://github.com/BerriAI/litellm/issues/5621

* fix(litellm_logging.py): Fix langfuse logging across multiple projects

Fixes issue where langfuse logger was re-using the old logging object

* feat(proxy/_types.py): support adding key-based tags for tag-based routing

Enable tag based routing at key-level

* fix(proxy/_types.py): fix inheritance

* test(test_key_generate_prisma.py): fix test

* test: fix test

* fix(litellm_logging.py): return used callback object
2024-09-11 11:30:29 -07:00
Ishaan Jaff
530cc34866 stash - langsmith use batching for logging 2024-09-11 08:06:56 -07:00
Ishaan Jaff
4515f43976 Merge pull request #5623 from BerriAI/litellm_vertex_use_async_for_getting_token
[Feat-Vertex Perf] Use async func to get auth credentials
2024-09-10 18:53:48 -07:00
Ishaan Jaff
c8fe600dbf fix case when gemini is used 2024-09-10 17:06:45 -07:00
Ishaan Jaff
7891b3742c fix vertex use async func to set auth creds 2024-09-10 16:12:18 -07:00
Ishaan Jaff
536ca7d516 Merge branch 'main' into litellm_use_helper_to_get_httpx_clients 2024-09-10 15:02:54 -07:00
Ishaan Jaff
852a2baa39 Merge pull request #5619 from BerriAI/litellm_vertex_use_get_httpx_client
[Fix-Perf] Vertex AI cache httpx clients
2024-09-10 13:59:39 -07:00
Ishaan Jaff
f481961429 use get async httpx client 2024-09-10 13:08:49 -07:00
Ishaan Jaff
0d6081e370 pass llm provider when creating async httpx clients 2024-09-10 11:51:42 -07:00
Ishaan Jaff
93c1db4a79 rename get_async_httpx_client 2024-09-10 10:38:01 -07:00
Ishaan Jaff
21c462cf56 fix vertex ai use _get_async_client 2024-09-10 10:33:19 -07:00
Ishaan Jaff
a4ccffefd7 fix regen keys when no duration is passed 2024-09-10 08:04:18 -07:00
Ishaan Jaff
02325f33d7 Merge branch 'main' into litellm_allow_turning_off_message_logging_for_callbacks 2024-09-09 21:59:36 -07:00
Krish Dholakia
09ca581620 LiteLLM Minor Fixes and Improvements (09/09/2024) (#5602)
* fix(main.py): pass default azure api version as alternative in completion call

Fixes api error caused due to api version

Closes https://github.com/BerriAI/litellm/issues/5584

* Fixed gemini-1.5-flash pricing (#5590)

* add /key/list endpoint

* bump: version 1.44.21 → 1.44.22

* docs architecture

* Fixed gemini-1.5-flash pricing

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* fix(bedrock/chat.py): fix converse api stop sequence param mapping

Fixes https://github.com/BerriAI/litellm/issues/5592

* fix(databricks/cost_calculator.py): handle databricks model name changes

Fixes https://github.com/BerriAI/litellm/issues/5597

* fix(azure.py): support azure api version 2024-08-01-preview

Closes https://github.com/BerriAI/litellm/issues/5377

* fix(proxy/_types.py): allow dev keys to call cohere /rerank endpoint

Fixes issue where only admin could call rerank endpoint

* fix(azure.py): check if model is gpt-4o

* fix(proxy/_types.py): support /v1/rerank on non-admin routes as well

* fix(cost_calculator.py): fix split on `/` logic in cost calculator

---------

Co-authored-by: F1bos <44951186+F1bos@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-09 21:56:12 -07:00
Krish Dholakia
52849e6422 LiteLLM Minor Fixes and Improvements (09/07/2024) (#5580)
* fix(litellm_logging.py): set completion_start_time_float to end_time_float if none

Fixes https://github.com/BerriAI/litellm/issues/5500

* feat(__init__.py): add new 'openai_text_completion_compatible_providers' list

Fixes https://github.com/BerriAI/litellm/issues/5558

Handles correctly routing fireworks ai calls when done via text completions

* fix: fix linting errors

* fix: fix linting errors

* fix(openai.py): fix exception raised

* fix(openai.py): fix error handling

* fix(_redis.py): allow all supported arguments for redis cluster (#5554)

* Revert "fix(_redis.py): allow all supported arguments for redis cluster (#5554)" (#5583)

This reverts commit f2191ef4cb.

* fix(router.py): return model alias w/ underlying deployment on router.get_model_list()

Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666

* test: handle flaky tests

---------

Co-authored-by: Jonas Dittrich <58814480+Kakadus@users.noreply.github.com>
2024-09-09 18:54:17 -07:00
Ishaan Jaff
3278da17cf Merge branch 'main' into litellm_tag_routing_fixes 2024-09-09 17:45:18 -07:00
Ishaan Jaff
d303a3d03c fix log failures for key based logging 2024-09-09 16:33:06 -07:00
Ishaan Jaff
7e8af27527 fix otel test 2024-09-09 16:20:47 -07:00
Ishaan Jaff
e07b2ce6ea use callback_settings when initializing otel 2024-09-09 16:05:48 -07:00
Ishaan Jaff
176397cfca Merge pull request #5599 from BerriAI/litellm_allow_mounting_prom_callbacks
[Feat] support using "callbacks" for prometheus
2024-09-09 15:00:43 -07:00
Ishaan Jaff
f49fdab804 fix debug statements 2024-09-09 14:00:17 -07:00
Ishaan Jaff
0f2b8e511c fix create script for pre-creating views 2024-09-09 11:03:27 -07:00
Ishaan Jaff
3369b4e41a support using "callbacks" for prometheus 2024-09-09 08:26:03 -07:00
Ishaan Jaff
0e0decd6b9 add /key/list endpoint 2024-09-07 16:52:28 -07:00
Ishaan Jaff
185579a8ef ui new build 2024-09-07 16:24:06 -07:00
Ishaan Jaff
e912d81b0c add doc on spend report frequency 2024-09-07 11:54:33 -07:00
Ishaan Jaff
15820c6b7b add spend_report_frequency as a general setting 2024-09-07 11:44:58 -07:00
Krish Dholakia
501b6f5bac Allow client-side credentials to be sent to proxy (accept only if complete credentials are given) (#5575)
* feat: initial commit

* fix(proxy/auth/auth_utils.py): Allow client-side credentials to be given to the proxy (accept only if complete credentials are given)
2024-09-06 19:21:54 -07:00
Ishaan Jaff
2b7580916e ui new build 2024-09-06 18:10:46 -07:00
Ishaan Jaff
4db821897d Merge pull request #5566 from BerriAI/litellm_ui_regen_keys
[Feat] Allow setting duration time when regenerating key
2024-09-06 18:05:51 -07:00
Ishaan Jaff
164d8696ca Merge pull request #5574 from BerriAI/litellm_tags_use_views
[Feat-Proxy] Use DB Views to Get spend per Tag (Usage endpoints)
2024-09-06 17:33:06 -07:00
Krish Dholakia
2cab33b061 LiteLLM Minor Fixes and Improvements (08/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info` , `/model_group/info` (#5539)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model aliases on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
Ishaan Jaff
16a3223474 fix linting 2024-09-06 16:54:43 -07:00
Ishaan Jaff
0c4022d848 fix use view for getting tag usage 2024-09-06 16:28:24 -07:00
Ishaan Jaff
b3629ebdc5 allow passing expiry time to /key/regenerate 2024-09-06 08:36:34 -07:00
Krish Dholakia
355f4a7c90 LiteLLM Minor Fixes and Improvements (#5537)
* fix(vertex_ai): Fixes issue where multimodal message without text was failing vertex calls

Fixes https://github.com/BerriAI/litellm/issues/5515

* fix(azure.py): move to using httphandler for oidc token calls

Fixes issue where ssl certificates weren't being picked up as expected

Closes https://github.com/BerriAI/litellm/issues/5522

* feat: Allows admin to set a default_max_internal_user_budget in config, and allow setting more specific values as env vars

* fix(proxy_server.py): fix read for max_internal_user_budget

* build(model_prices_and_context_window.json): add regional gpt-4o-2024-08-06 pricing

Closes https://github.com/BerriAI/litellm/issues/5540

* test: skip re-test
2024-09-05 18:03:34 -07:00
Ishaan Jaff
18e2169c40 ui new build 2024-09-05 17:05:39 -07:00
Ishaan Jaff
dd7d93fd54 Merge branch 'main' into litellm_allow_internal_user_view_usage 2024-09-05 16:46:06 -07:00
Ishaan Jaff
56835f77aa fix on /user/info show all keys - even expired ones 2024-09-05 15:31:41 -07:00
Ishaan Jaff
7ef1ac7996 fix allow internal user to view their own usage 2024-09-05 12:53:44 -07:00