Commit graph

12117 commits

Author | SHA1 | Message | Date
Ishaan Jaff
0cab9bee4e fix _read_request_body (#7706) 2025-01-11 21:54:51 -08:00
Ishaan Jaff
57d97dbee7 fix get_llm_provider for aiohttp openai 2025-01-11 21:45:06 -08:00
Ishaan Jaff
f4ba9c0b58 (perf sdk) - minor changes to cost calculator to run helpers only when necessary (#7704)
* use _get_model_info_helper

* fix _get_potential_model_names
2025-01-11 21:10:37 -08:00
Ishaan Jaff
bbed882eff use _get_model_info_helper (#7703) 2025-01-11 21:08:15 -08:00
Krish Dholakia
267be77720 Litellm dev 01 11 2025 p3 (#7702)
* fix(__init__.py): fix init to exclude pricing-only model cost values from real model names

prevents bad health checks on wildcard routes

* fix(get_llm_provider.py): fix to handle calling bedrock_converse models
2025-01-11 20:06:54 -08:00
Krish Dholakia
1ca69019d0 build: new ui build (#7685) 2025-01-10 22:12:17 -08:00
Krish Dholakia
953c021aa7 Litellm dev 01 10 2025 p3 (#7682)
* feat(langfuse.py): log the used prompt when prompt management used

* test: fix test

* docs(self_serve.md): add doc on restricting personal key creation on ui

* feat(s3.py): support s3 logging with team alias prefixes (if available)

New preview feature

* fix(main.py): remove old if block - simplify to just await if coroutine returned

fixes lm_studio async embedding error

* fix(langfuse.py): handle get prompt check
2025-01-10 21:56:42 -08:00
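The "simplify to just await if coroutine returned" fix in 953c021aa7 describes a common asyncio pattern; a minimal sketch, with hypothetical names rather than LiteLLM's actual helpers:

```python
import asyncio

async def run_success_callback(callback, payload):
    # Hypothetical call site: the callback may be sync or async.
    result = callback(payload)
    # The simplification: no up-front branching on callback type,
    # just await whatever comes back if it is a coroutine.
    if asyncio.iscoroutine(result):
        result = await result
    return result
```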
Krish Dholakia
e54d23c919 Litellm dev 01 10 2025 p2 (#7679)
* test(test_basic_python_version.py): assert all optional dependencies are marked as extras on poetry

Fixes https://github.com/BerriAI/litellm/issues/7677

* docs(secret.md): clarify 'read_and_write' secret manager usage on aws

* docs(secret.md): fix doc

* build(ui/teams.tsx): add edit/delete button for updating user / team membership on ui

allows updating user role to admin on ui

* build(ui/teams.tsx): display edit member component on ui, when edit button on member clicked

* feat(team_endpoints.py): support updating team member role to admin via api endpoints

allows team member to become admin post-add

* build(ui/user_dashboard.tsx): if team admin - show all team keys

Fixes https://github.com/BerriAI/litellm/issues/7650

* test(config.yml): add tomli to ci/cd

* test: don't call python_basic_testing in local testing (covered by python 3.13 testing)
2025-01-10 21:50:53 -08:00
Ishaan Jaff
d866b1beda [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes (#7684)
* fix route check azure realtime endpoints

* test_is_llm_api_route

* fix /realtime

* test_routes_on_litellm_proxy
2025-01-10 20:38:06 -08:00
Ishaan Jaff
a49ef48bcb fix proxy pre call hook - only use if user is using alerting (#7683) 2025-01-10 19:07:05 -08:00
Ishaan Jaff
76c81f97d7 uvicorn allow setting num workers (#7681) 2025-01-10 19:03:14 -08:00
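For context on 76c81f97d7: uvicorn supports multiple worker processes via its `workers` argument (`--workers` on the CLI). A sketch of the programmatic form; the env-var name below is illustrative, not necessarily what the commit reads:

```python
import os
import uvicorn

if __name__ == "__main__":
    # Env-var name is an assumption for illustration only.
    workers = int(os.getenv("UVICORN_NUM_WORKERS", "1"))
    # Multiple workers require passing the app as an import string.
    uvicorn.run(
        "litellm.proxy.proxy_server:app",
        host="0.0.0.0",
        port=4000,
        workers=workers,
    )
```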
Ishaan Jaff
7e2cf585f2 (performance improvement - litellm sdk + proxy) - ensure litellm does not create unnecessary threads when running async functions (#7680)
* fix handle_sync_success_callbacks_for_async_calls

* fix handle_sync_success_callbacks_for_async_calls

* fix linting / testing errors

* use handle_sync_success_callbacks_for_async_calls

* add unit testing for logging fixes
2025-01-10 17:57:22 -08:00
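One plausible shape of the thread-avoidance described in 7e2cf585f2: instead of spawning a fresh thread per sync callback from async code, reuse the event loop's shared default executor. A sketch under that assumption, not the actual implementation:

```python
import asyncio

def sync_success_callback(payload: dict) -> None:
    # Placeholder synchronous logging callback.
    print("logged:", payload)

async def handle_sync_success_callbacks_for_async_calls(payload: dict) -> None:
    # Run the sync callback on the loop's shared default executor
    # rather than creating a new thread for every call.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, sync_success_callback, payload)
```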
Krish Dholakia
ebc66c1e1e LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 (#7670)
* test(test_get_model_info.py): add unit test confirming router deployment updates global 'get_model_info'

* fix(get_supported_openai_params.py): fix custom llm provider 'get_supported_openai_params'

Fixes https://github.com/BerriAI/litellm/issues/7668

* docs(azure.md): clarify how azure ad token refresh on proxy works

Closes https://github.com/BerriAI/litellm/issues/7665
2025-01-10 17:49:05 -08:00
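Commit ebc66c1e1e touches `get_supported_openai_params`; usage looks roughly like this (the printed output shape is indicative only):

```python
import litellm

# Ask which OpenAI-style params a given model/provider pair accepts.
params = litellm.get_supported_openai_params(
    model="gpt-4o",
    custom_llm_provider="openai",
)
print(params)  # e.g. ["temperature", "max_tokens", "tools", ...]
```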
Hugues Chocart
9369129bf0 feat: allow passing a custom parent run id (#7651) 2025-01-10 17:04:46 -08:00
Ishaan Jaff
1c04ae6002 (litellm sdk - perf improvement) - optimize pre_call_check (#7673)
* latency fix - litellm sdk

* fix linting error

* fix litellm logging
2025-01-10 14:16:39 -08:00
Ishaan Jaff
9e2b1101c0 (litellm sdk - perf improvement) - use O(1) set lookups for checking llm providers / models (#7672)
* fix get model info logic to use O(1) lookups

* perf - use O(1) lookup for get llm provider
2025-01-10 14:16:30 -08:00
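The O(1) lookup change in 9e2b1101c0 is the classic list-to-set conversion; a minimal illustration (provider names abbreviated):

```python
# Membership tests on a list scan every element (O(n)); building a
# set once makes each subsequent check O(1).
provider_list = ["openai", "azure", "anthropic", "bedrock"]  # abbreviated
provider_set = set(provider_list)

def is_known_provider(name: str) -> bool:
    return name in provider_set  # hash lookup instead of a linear scan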
Ishaan Jaff
b7f336aef1 speed up use_custom_pricing_for_model (#7674) 2025-01-10 14:00:48 -08:00
Ishaan Jaff
76586f175d latency fix _cache_key_object (#7676) 2025-01-10 13:59:26 -08:00
vivek-athina
eba95acb2a Use environment variable for Athina logging URL (#7628)
* Use environment variable for Athina logging URL

* Added to docs as well

* Changed the env var name
2025-01-10 07:47:12 -08:00
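For eba95acb2a, the pattern is a standard env-var override with a default; the variable name below is a guess, since the commit notes the name was changed:

```python
import os

# Fall back to the public endpoint when the env var is unset.
# The exact variable name is an assumption, not confirmed by the commit.
ATHINA_BASE_URL = os.getenv("ATHINA_BASE_URL", "https://log.athina.ai")
```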
Krish Dholakia
75c3ddfc9e fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url (#7660)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url

* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic

deduplicates code

* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages

* docs(prompt_management.md): update prompt management to be in beta

given feedback - this still needs to be revised (e.g. passing in user message, not ignoring)

* refactor(prompt_management_base.py): introduce base class for prompt management

allows consistent behaviour across prompt management integrations

* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base

* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set

allows tracking what prompt was used for what purpose

* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse

allows logging prompt id / prompt variables to langfuse

* test: fix test

* fix(router.py): cleanup unused imports

* fix: fix linting error

* fix: fix trace param typing

* fix: fix linting errors

* fix: fix code qa check
2025-01-10 07:31:59 -08:00
Krish Dholakia
afdcbe3d64 fix(main.py): fix lm_studio/ embedding routing (#7658)
* fix(main.py): fix lm_studio/ embedding routing

adds the mapping + updates docs with example

* docs(self_serve.md): update doc to show how to auto-add sso users to teams

* fix(streaming_handler.py): simplify async iterator check, to just check if streaming response is an async iterable
2025-01-09 23:03:24 -08:00
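The simplified streaming check in afdcbe3d64 can be expressed with `collections.abc`; a sketch:

```python
from collections.abc import AsyncIterable

def is_async_streaming_response(response: object) -> bool:
    # Anything implementing __aiter__ qualifies; no provider-specific
    # type checks required.
    return isinstance(response, AsyncIterable)
```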
Krrish Dholakia
dee99babde build(ui/): update ui build 2025-01-09 22:44:05 -08:00
Krish Dholakia
921904fd29 feat(ui_sso.py): Allows users to use test key pane, and have team budget limits be enforced for their use-case (#7666) 2025-01-09 22:12:45 -08:00
Ishaan Jaff
b9dbc70a8e (minor latency fixes / proxy) - use verbose_proxy_logger.debug() instead of litellm.print_verbose (#7664)
* minor latency fixes

* fix code quality
2025-01-09 21:06:09 -08:00
Ishaan Jaff
f2854fefaf use asyncio tasks for logging db metrics (#7663) 2025-01-09 19:59:32 -08:00
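Fire-and-forget metric logging as in f2854fefaf typically means scheduling a task instead of awaiting inline; a sketch with placeholder names:

```python
import asyncio

async def log_db_metrics(duration_s: float) -> None:
    # Placeholder metrics sink.
    print(f"db call took {duration_s:.3f}s")

async def handle_request() -> None:
    # Schedule the metrics write as a background task so it adds no
    # latency to the request path.
    asyncio.create_task(log_db_metrics(0.042))
```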
Ishaan Jaff
19cac744f8 (Feat - Batches API) add support for retrieving vertex api batch jobs (#7661)
* add _async_retrieve_batch

* fix aretrieve_batch

* fix _get_batch_id_from_vertex_ai_batch_response

* fix batches docs
2025-01-09 18:35:03 -08:00
Ishaan Jaff
153740f6c9 (proxy perf improvement) - use uvloop for higher RPS (10%-20% higher RPS) (#7662)
* uvicorn use uvloop

* fix uvloop==0.21.0

* add uvloop to pyproject

* test_completion_response_ratelimit_headers
2025-01-09 18:11:20 -08:00
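Commit 153740f6c9 swaps the default asyncio loop for uvloop; uvicorn accepts this either as `--loop uvloop` on the CLI or programmatically:

```python
import uvicorn

# "uvloop" replaces the default asyncio event loop implementation;
# uvicorn imports and installs it at startup.
uvicorn.run("litellm.proxy.proxy_server:app", loop="uvloop")
```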
Ishaan Jaff
7d680d411d (proxy - RPS) - Get 2K RPS at 4 instances, minor fix aiohttp_openai/ (#7659)
* speed up transform_response

* use 2 workers

* undo changes to uvicorn

* ci/cd run again
2025-01-09 17:24:18 -08:00
Ishaan Jaff
6f6bf45684 fix 1 - latency fix (#7655) 2025-01-09 15:57:05 -08:00
Krish Dholakia
b77832a793 Litellm dev 01 08 2025 p1 (#7640)
* feat(ui_sso.py): support reading team ids from sso token

* feat(ui_sso.py): working upsert sso user teams membership in litellm - if team exists

Adds user to relevant teams, if user is part of teams and team exists on litellm

* fix(ui_sso.py): safely handle add team member task

* build(ui/): support setting team id when creating team on UI

* build(ui/): teams.tsx

allow setting team id on ui

* build(circle_ci/requirements.txt): add fastapi-sso to ci/cd testing

* fix: fix linting errors
2025-01-08 22:08:20 -08:00
Krish Dholakia
6d8cfeaf14 LiteLLM Minor Fixes & Improvements (01/08/2025) - p2 (#7643)
* fix(streaming_chunk_builder_utils.py): add test for groq tool calling + streaming + combine chunks

Addresses https://github.com/BerriAI/litellm/issues/7621

* fix(streaming_utils.py): fix modelresponseiterator for openai like chunk parser

ensures chunk parser uses the correct tool call id when translating the chunk

Fixes https://github.com/BerriAI/litellm/issues/7621

* build(model_hub.tsx): display cost pricing on model hub

* build(model_hub.tsx): show cost per token pricing + complete model information

* fix(types/utils.py): fix usage object handling
2025-01-08 19:45:19 -08:00
Krrish Dholakia
cbe9223dd0 build(model_prices_and_context_window.json): omni-moderation-latest-intents 2025-01-08 19:06:04 -08:00
Ishaan Jaff
818f5b0113 fix is llm api route check (#7631) 2025-01-08 18:45:59 -08:00
Ishaan Jaff
b7f92eeced ci/cd run again 2025-01-08 18:36:39 -08:00
Krish Dholakia
12a78fe05f Allow assigning teams to org on UI + OpenAI omni-moderation cost model tracking (#7566)
* feat(cost_calculator.py): add cost tracking ($0) for openai moderations endpoint

removes sentry cost tracking errors caused by this

* build(teams.tsx): allow assigning teams to orgs
2025-01-08 16:58:21 -08:00
Krish Dholakia
b769b826d0 Litellm dev 01 07 2025 p2 (#7622)
* build(ui/): update ui

* fix: drop unsupported non-whitespace characters for real when calling anthropic with stop sequences (#7484)

* fix: drop unsupported non-whitespace characters for real when calling anthropic with stop sequences

* test: add parameterized test for _map_stop_sequences method in AnthropicConfig

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
2025-01-08 16:56:39 -08:00
Ishaan Jaff
74873317c2 (feat) - allow building litellm proxy from pip package (#7633)
* fix working build from pip

* add tests for proxy_build_from_pip_tests

* doc clean up for deployment

* docs cleanup

* docs build from pip

* fix cd docker/build_from_pip
2025-01-08 16:36:57 -08:00
Krish Dholakia
d413f31fb7 Litellm dev 01 07 2025 p3 (#7635)
* fix(__init__.py): fix mistral large tool calling

map bedrock mistral large to converse endpoint

Fixes https://github.com/BerriAI/litellm/issues/7521

* braintrust logging: respect project_id, add more metrics + more (#7613)

* braintrust logging: respect project_id, add more metrics

* braintrust logger: improve json formatting

* braintrust logger: add test for passing specific project_id

* rm unneeded import

* braintrust logging: rm unneeded var in tests

* add project_name

* update docs

---------

Co-authored-by: H <no@email.com>

---------

Co-authored-by: hi019 <65871571+hi019@users.noreply.github.com>
Co-authored-by: H <no@email.com>
2025-01-08 11:46:24 -08:00
Krish Dholakia
0cf32ed2f3 fix(utils.py): fix select tokenizer for custom tokenizer (#7599)
* fix(utils.py): fix select tokenizer for custom tokenizer

* fix(router.py): fix 'utils/token_counter' endpoint
2025-01-07 22:37:09 -08:00
Krish Dholakia
3d0a044f7c Litellm dev 01 01 2025 p2 (#7615)
* fix(utils.py): prevent double logging when passing 'fallbacks=' to .completion()

Fixes https://github.com/BerriAI/litellm/issues/7477

* fix(utils.py): fix vertex anthropic check

* fix(utils.py): ensure supported params is always set

Fixes https://github.com/BerriAI/litellm/issues/7470

* test(test_optional_params.py): add unit testing to prevent mistranslation

Fixes https://github.com/BerriAI/litellm/issues/7470

* fix: fix linting error

* test: cleanup
2025-01-07 21:40:33 -08:00
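The double-logging fix in 3d0a044f7c concerns the client-side `fallbacks=` argument to `completion()`; usage looks like this (model names illustrative):

```python
import litellm

response = litellm.completion(
    model="gpt-4o",  # primary deployment
    messages=[{"role": "user", "content": "hi"}],
    fallbacks=["gpt-4o-mini"],  # tried in order if the primary call fails
)
```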
Ishaan Jaff
a4007e3294 (Feat) soft budget alerts on keys (#7623)
* add class WebhookEvent(CallInfo)

* handle soft budget alerts

* handle soft budget

* fix budget alerts

* fix CallInfo

* fix _get_user_info_str

* test_soft_budget_alerts

* test_soft_budget_alert
2025-01-07 21:36:34 -08:00
Krish Dholakia
73094873b2 Litellm dev 01 07 2025 p1 (#7618)
* fix(main.py): pass custom llm provider on litellm logging provider update

* fix(cost_calculator.py): don't append provider name to return model if existing llm provider

Fixes https://github.com/BerriAI/litellm/issues/7607

* fix(prometheus_services.py): fix prometheus system health error logging

Fixes https://github.com/BerriAI/litellm/issues/7611
2025-01-07 21:22:31 -08:00
Ishaan Jaff
c661b1f856 ci/cd run again 2025-01-07 10:01:29 -08:00
Krrish Dholakia
9146b4c7e7 docs: cleanup keys 2025-01-06 21:57:18 -08:00
Ishaan Jaff
5d3ae6da62 aiohttp_openai/ fixes - allow using aiohttp_openai/gpt-4o (#7598)
* fixes for get_complete_url

* update aiohttp tests

* fix event loop for aiohttp

* ci/cd run again

* test_aiohttp_openai
2025-01-06 21:39:11 -08:00
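Per 5d3ae6da62, the aiohttp-backed OpenAI route is selected with the `aiohttp_openai/` model prefix:

```python
import litellm

# Requires OPENAI_API_KEY in the environment.
response = litellm.completion(
    model="aiohttp_openai/gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```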
Krish Dholakia
4760693094 Litellm dev 01 06 2025 p1 (#7594)
* fix(custom_logger.py): expose new 'async_get_chat_completion_prompt' event hook

* fix(custom_logger.py): langfuse_prompt_management.py

remove 'headers' from custom logger 'async_get_chat_completion_prompt' and 'get_chat_completion_prompt' event hooks

* feat(router.py): expose new function for prompt management based routing

* feat(router.py): partial working router prompt factory logic

allows load balanced model to be used for model name w/ langfuse prompt management call

* feat(router.py): fix prompt management with load balanced model group

* feat(langfuse_prompt_management.py): support reading in openai params from langfuse

enables user to define optional params on langfuse vs. client code

* test(test_Router.py): add unit test for router based langfuse prompt management

* fix: fix linting errors
2025-01-06 21:26:21 -08:00
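Commit 4760693094 exposes an `async_get_chat_completion_prompt` hook on custom loggers; a hedged sketch of implementing it — the exact signature here is an assumption inferred from the hook name:

```python
from litellm.integrations.custom_logger import CustomLogger

class MyPromptManager(CustomLogger):
    async def async_get_chat_completion_prompt(
        self, model, messages, non_default_params,
        prompt_id, prompt_variables, dynamic_callback_params,
    ):
        # Assumed signature: rewrite the prompt here and return the
        # (possibly updated) call inputs.
        return model, messages, non_default_params
```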
Krrish Dholakia
bb633d0714 docs(prompt_management.md): update docs to show how to point to load balanced model name 2025-01-06 21:09:09 -08:00
Krish Dholakia
0af875c7df Litellm dev 01 06 2025 p2 (#7597)
* test(test_amazing_vertex_completion.py): fix test

* test: initial working code gecko test

* fix(vertex_ai_non_gemini.py): support vertex ai code gecko fake streaming

Fixes https://github.com/BerriAI/litellm/issues/7360

* test(test_get_model_info.py): add test for getting custom provider model info

Covers https://github.com/BerriAI/litellm/issues/7575

* fix(utils.py): fix get_provider_model_info check

Handle custom llm provider scenario

Fixes https://github.com/BerriAI/litellm/issues/7575
2025-01-06 21:04:49 -08:00
Krish Dholakia
b44aef5110 Litellm dev 01 06 2025 p3 (#7596)
* build(model_prices_and_context_window.json): add gemini-1.5-pro 'supports_vision' = true

Fixes https://github.com/BerriAI/litellm/issues/7592

* build(model_prices_and_context_window.json): add new mistral models pricing + model info
2025-01-06 20:44:04 -08:00
Ishaan Jaff
307969c3e4 (proxy perf improvement) - remove redundant .copy() operation (#7564)
* latency fix proxy

* remove useless copy in add_key_level_controls
2025-01-06 20:36:47 -08:00
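Finally, the redundant-copy removal in 307969c3e4 follows a general rule: only copy when you are about to mutate. A generic sketch (function name hypothetical):

```python
from typing import Optional

def merge_request_settings(base: dict, overrides: Optional[dict] = None) -> dict:
    # Hot path: nothing to merge, so return the original dict untouched
    # instead of paying for base.copy() on every request.
    if not overrides:
        return base
    merged = base.copy()  # copy only when a mutation follows
    merged.update(overrides)
    return merged
```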