Commit graph

4171 commits

Author SHA1 Message Date
Ishaan Jaff
0587528acb use set for public routes 2025-01-12 16:22:56 -08:00
Ishaan Jaff
21bacaf498 fix _read_request_body to re-use parsed body already (#7722) 2025-01-12 15:41:40 -08:00
Ishaan Jaff
0cab9bee4e fix _read_request_body (#7706) 2025-01-11 21:54:51 -08:00
Krish Dholakia
267be77720 Litellm dev 01 11 2025 p3 (#7702)
* fix(__init__.py): fix init to exclude pricing-only model cost values from real model names

prevents bad health checks on wildcard routes

* fix(get_llm_provider.py): fix to handle calling bedrock_converse models
2025-01-11 20:06:54 -08:00
Krish Dholakia
1ca69019d0 build: new ui build (#7685) 2025-01-10 22:12:17 -08:00
Krish Dholakia
953c021aa7 Litellm dev 01 10 2025 p3 (#7682)
* feat(langfuse.py): log the used prompt when prompt management used

* test: fix test

* docs(self_serve.md): add doc on restricting personal key creation on ui

* feat(s3.py): support s3 logging with team alias prefixes (if available)

New preview feature

* fix(main.py): remove old if block - simplify to just await if coroutine returned

fixes lm_studio async embedding error

* fix(langfuse.py): handle get prompt check
2025-01-10 21:56:42 -08:00
Krish Dholakia
e54d23c919 Litellm dev 01 10 2025 p2 (#7679)
* test(test_basic_python_version.py): assert all optional dependencies are marked as extras on poetry

Fixes https://github.com/BerriAI/litellm/issues/7677

* docs(secret.md): clarify 'read_and_write' secret manager usage on aws

* docs(secret.md): fix doc

* build(ui/teams.tsx): add edit/delete button for updating user / team membership on ui

allows updating user role to admin on ui

* build(ui/teams.tsx): display edit member component on ui, when edit button on member clicked

* feat(team_endpoints.py): support updating team member role to admin via api endpoints

allows team member to become admin post-add

* build(ui/user_dashboard.tsx): if team admin - show all team keys

Fixes https://github.com/BerriAI/litellm/issues/7650

* test(config.yml): add tomli to ci/cd

* test: don't call python_basic_testing in local testing (covered by python 3.13 testing)
2025-01-10 21:50:53 -08:00
Ishaan Jaff
d866b1beda [Bug fix]: Proxy Auth Layer - Allow Azure Realtime routes as llm_api_routes (#7684)
* fix route check azure realtime endpoints

* test_is_llm_api_route

* fix /realtime

* test_routes_on_litellm_proxy
2025-01-10 20:38:06 -08:00
Ishaan Jaff
a49ef48bcb fix proxy pre call hook - only use if user is using alerting (#7683) 2025-01-10 19:07:05 -08:00
Ishaan Jaff
76c81f97d7 uvicorn allow setting num workers (#7681) 2025-01-10 19:03:14 -08:00
Krish Dholakia
ebc66c1e1e LiteLLM Minor Fixes & Improvements (01/10/2025) - p1 (#7670)
* test(test_get_model_info.py): add unit test confirming router deployment updates global 'get_model_info'

* fix(get_supported_openai_params.py): fix custom llm provider 'get_supported_openai_params'

Fixes https://github.com/BerriAI/litellm/issues/7668

* docs(azure.md): clarify how azure ad token refresh on proxy works

Closes https://github.com/BerriAI/litellm/issues/7665
2025-01-10 17:49:05 -08:00
Ishaan Jaff
76586f175d latency fix _cache_key_object (#7676) 2025-01-10 13:59:26 -08:00
Krish Dholakia
75c3ddfc9e fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… (#7660)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url

* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic

deduplicates code

* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages

* docs(prompt_management.md): update prompt management to be in beta

given feedback - this still needs to be revised (e.g. passing in user message, not ignoring)

* refactor(prompt_management_base.py): introduce base class for prompt management

allows consistent behaviour across prompt management integrations

* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base

* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set

allows tracking what prompt was used for what purpose

* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse

allows logging prompt id / prompt variables to langfuse

* test: fix test

* fix(router.py): cleanup unused imports

* fix: fix linting error

* fix: fix trace param typing

* fix: fix linting errors

* fix: fix code qa check
2025-01-10 07:31:59 -08:00
Krish Dholakia
afdcbe3d64 fix(main.py): fix lm_studio/ embedding routing (#7658)
* fix(main.py): fix lm_studio/ embedding routing

adds the mapping + updates docs with example

* docs(self_serve.md): update doc to show how to auto-add sso users to teams

* fix(streaming_handler.py): simplify async iterator check, to just check if streaming response is an async iterable
2025-01-09 23:03:24 -08:00
Krrish Dholakia
dee99babde build(ui/): update ui build 2025-01-09 22:44:05 -08:00
Krish Dholakia
921904fd29 feat(ui_sso.py): Allows users to use test key pane, and have team budget limits be enforced for their use-case (#7666) 2025-01-09 22:12:45 -08:00
Ishaan Jaff
b9dbc70a8e (minor latency fixes / proxy) - use verbose_proxy_logger.debug() instead of litellm.print_verbose (#7664)
* minor latency fixes

* fix code quality
2025-01-09 21:06:09 -08:00
Ishaan Jaff
f2854fefaf use asyncio tasks for logging db metrics (#7663) 2025-01-09 19:59:32 -08:00
Ishaan Jaff
153740f6c9 (proxy perf improvement) - use uvloop for higher RPS (10%-20% higher RPS) (#7662)
* uvicorn use uvloop

* fix uvloop==0.21.0

* add uvloop to pyproject

* test_completion_response_ratelimit_headers
2025-01-09 18:11:20 -08:00
Krish Dholakia
b77832a793 Litellm dev 01 08 2025 p1 (#7640)
* feat(ui_sso.py): support reading team ids from sso token

* feat(ui_sso.py): working upsert sso user teams membership in litellm - if team exists

Adds user to relevant teams, if user is part of teams and team exists on litellm

* fix(ui_sso.py): safely handle add team member task

* build(ui/): support setting team id when creating team on UI

* build(ui/): teams.tsx

allow setting team id on ui

* build(circle_ci/requirements.txt): add fastapi-sso to ci/cd testing

* fix: fix linting errors
2025-01-08 22:08:20 -08:00
Krish Dholakia
6d8cfeaf14 LiteLLM Minor Fixes & Improvements (01/08/2025) - p2 (#7643)
* fix(streaming_chunk_builder_utils.py): add test for groq tool calling + streaming + combine chunks

Addresses https://github.com/BerriAI/litellm/issues/7621

* fix(streaming_utils.py): fix modelresponseiterator for openai like chunk parser

ensures chunk parser uses the correct tool call id when translating the chunk

 Fixes https://github.com/BerriAI/litellm/issues/7621

* build(model_hub.tsx): display cost pricing on model hub

* build(model_hub.tsx): show cost per token pricing + complete model information

* fix(types/utils.py): fix usage object handling
2025-01-08 19:45:19 -08:00
Ishaan Jaff
818f5b0113 fix is llm api route check (#7631) 2025-01-08 18:45:59 -08:00
Krish Dholakia
b769b826d0 Litellm dev 01 07 2025 p2 (#7622)
* build(ui/): update ui

* fix: drop unsupported non-whitespace characters for real when calling… (#7484)

* fix: drop unsupported non-whitespace characters for real when calling anthropic with stop sequences

* test: add parameterized test for _map_stop_sequences method in AnthropicConfig

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
2025-01-08 16:56:39 -08:00
Ishaan Jaff
74873317c2 (feat) - allow building litellm proxy from pip package (#7633)
* fix working build from pip

* add tests for proxy_build_from_pip_tests

* doc clean up for deployment

* docs cleanup

* docs build from pip

* fix cd docker/build_from_pip
2025-01-08 16:36:57 -08:00
Krish Dholakia
d413f31fb7 Litellm dev 01 07 2025 p3 (#7635)
* fix(__init__.py): fix mistral large tool calling

map bedrock mistral large to converse endpoint

Fixes https://github.com/BerriAI/litellm/issues/7521

* braintrust logging: respect project_id, add more metrics + more (#7613)

* braintrust logging: respect project_id, add more metrics

* braintrust logger: improve json formatting

* braintrust logger: add test for passing specific project_id

* rm unneeded import

* braintrust logging: rm unneeded var in tets

* add project_name

* update docs

---------

Co-authored-by: H <no@email.com>

---------

Co-authored-by: hi019 <65871571+hi019@users.noreply.github.com>
Co-authored-by: H <no@email.com>
2025-01-08 11:46:24 -08:00
Krish Dholakia
0cf32ed2f3 fix(utils.py): fix select tokenizer for custom tokenizer (#7599)
* fix(utils.py): fix select tokenizer for custom tokenizer

* fix(router.py): fix 'utils/token_counter' endpoint
2025-01-07 22:37:09 -08:00
Ishaan Jaff
a4007e3294 (Feat) soft budget alerts on keys (#7623)
* class WebhookEvent(CallInfo):
Add

* handle soft budget alerts

* handle soft budget

* fix budget alerts

* fix CallInfo

* fix _get_user_info_str

* test_soft_budget_alerts

* test_soft_budget_alert
2025-01-07 21:36:34 -08:00
Krish Dholakia
73094873b2 Litellm dev 01 07 2025 p1 (#7618)
* fix(main.py): pass custom llm provider on litellm logging provider update

* fix(cost_calculator.py): don't append provider name to return model if existing llm provider

Fixes https://github.com/BerriAI/litellm/issues/7607

* fix(prometheus_services.py): fix prometheus system health error logging

Fixes https://github.com/BerriAI/litellm/issues/7611
2025-01-07 21:22:31 -08:00
Krish Dholakia
4760693094 Litellm dev 01 06 2025 p1 (#7594)
* fix(custom_logger.py): expose new 'async_get_chat_completion_prompt' event hook

* fix(custom_logger.py): langfuse_prompt_management.py

remove 'headers' from custom logger 'async_get_chat_completion_prompt' and 'get_chat_completion_prompt' event hooks

* feat(router.py): expose new function for prompt management based routing

* feat(router.py): partial working router prompt factory logic

allows load balanced model to be used for model name w/ langfuse prompt management call

* feat(router.py): fix prompt management with load balanced model group

* feat(langfuse_prompt_management.py): support reading in openai params from langfuse

enables user to define optional params on langfuse vs. client code

* test(test_Router.py): add unit test for router based langfuse prompt management

* fix: fix linting errors
2025-01-06 21:26:21 -08:00
Ishaan Jaff
307969c3e4 (proxy perf improvement) - remove redundant .copy() operation (#7564)
* latency fix proxy

* remove useless copy in add_key_level_controls
2025-01-06 20:36:47 -08:00
Ishaan Jaff
89249fcf09 (Feat) - allow including dd-trace in litellm base image (#7587)
* introduce USE_DDTRACE=true

* update dd tracer

* update

* bump dd trace

* use og slim image

* DD tracing

* fix _init_dd_tracer
2025-01-06 17:27:09 -08:00
Ishaan Jaff
1bea935889 fix _return_user_api_key_auth_obj (#7591) 2025-01-06 16:43:14 -08:00
Ishaan Jaff
52d0c29087 latency fix proxy (#7563) 2025-01-04 20:18:32 -08:00
Krish Dholakia
5cf223c66a Support deleting keys by key_alias (#7552)
* feat(key_management_endpoints.py): allow deleting keys based on key alias

easier for proxy admin to delete known bad key

* fix(key_management_event_hooks.py): fix linting error

* docs(key_management_endpoints.py): document new key_aliases param

* fix(key_management_endpoints.py): return deleted keys to user

fixes return when passing key aliases
2025-01-04 19:41:48 -08:00
Ishaan Jaff
b91d7195a1 fix [PROXY] returned data from litellm_pre_call_util (#7558) 2025-01-04 18:47:36 -08:00
Krish Dholakia
ebe113810b Create and view organizations + assign org admins on the Proxy UI (#7557)
* feat: initial commit for new 'organizations' tab on ui

* build(ui/): create generic card for rendering complete org data table

can be reused in teams as well

simplifies things

* build(ui/): display created orgs on ui

* build(ui/): support adding orgs via UI

* build(ui/): add org in selection dropdown

* build(organizations.tsx): allow assigning org admins

* build(ui/): show org members on ui

* build(ui/): cleanup + show actual models on org dropdown

* build(ui/): explain user roles within organization
2025-01-04 17:31:24 -08:00
Ishaan Jaff
256a2d7847 (Feat) Hashicorp Secret Manager - Allow storing virtual keys in secret manager (#7549)
* use a base abstract class

* async_write_secret for hcorp

* fix hcorp

* async_write_secret for hashicopr secret manager

* store virtual keys in hcorp

* add delete secret

* test_hashicorp_secret_manager_write_secret

* test_hashicorp_secret_manager_delete_secret

* docs Supported Secret Managers

* docs storing keys in hcorp

* docs hcorp

* docs secret managers

* test_key_generate_with_secret_manager_call

* fix unused imports
2025-01-04 11:35:59 -08:00
Krish Dholakia
db82b3bb2a feat(router.py): support request prioritization for text completion c… (#7540)
* feat(router.py): support request prioritization for text completion calls

* fix(internal_user_endpoints.py): fix sql query to return all keys, including null team id keys on `/user/info`

Fixes https://github.com/BerriAI/litellm/issues/7485

* fix: fix linting errors

* fix: fix linting error

* test(test_router_helper_utils.py): add direct test for '_schedule_factory'

Fixes code qa test
2025-01-03 19:35:44 -08:00
Ishaan Jaff
df677ab073 (fix proxy perf) use _read_request_body instead of ast.literal_eval to get better performance (#7545)
* fix ast literal eval

* run ci/cd again
2025-01-03 17:48:32 -08:00
Ishaan Jaff
81d1826c25 [Feature]: - allow print alert log to console (#7534)
* update send_to_webhook

* test_print_alerting_payload_warning

* add alerting_args spec

* test_alerting.py
2025-01-03 17:48:13 -08:00
Krish Dholakia
5d4ab0a123 Revert "fix: add missing parameters order, limit, before, and after in get_as…" (#7542)
This reverts commit 4b0505dffd.
2025-01-03 16:32:12 -08:00
Ishaan Jaff
6e2f712b06 (fix) aiohttp_openai/ route - get to 1K RPS on single instance (#7539)
* ClientSession

* re use client_session

* _init_client_session

* fix aiohttp
2025-01-03 15:12:17 -08:00
Jean Carlo de Souza
08056b9aef fix: add missing parameters order, limit, before, and after in get_assistants method for openai (#7537)
- Ensured that `before` and `after` parameters are only passed when provided to avoid AttributeError.
- Implemented safe access using default values for `before` and `after` to prevent missing attribute issues.
- Added consistent handling of `order` and `limit` to improve flexibility and robustness in API calls.
2025-01-03 14:41:54 -08:00
Krish Dholakia
43fc21413f Litellm dev 01 02 2025 p1 (#7516)
* fix(redact_messages.py): fix redact messages for non-model response input to be dictionary

fixes issue with otel logging when message redaction is enabled

* fix(proxy_server.py): fix langfuse key leak in exception string

* test: fix test

* test: fix test

* test: fix tests
2025-01-03 14:40:57 -08:00
Krish Dholakia
796913fb30 Fix langfuse prompt management on proxy (#7535)
* fix(types/utils.py): support langfuse + humanloop routes on llm router

* fix(main.py): remove acompletion elif block

just await if coroutine returned
2025-01-03 12:42:37 -08:00
Ishaan Jaff
f07613593b fix - access metadata (#7523) 2025-01-03 10:02:10 -08:00
Ishaan Jaff
3a454ee2ce (perf) use aiohttp for custom_openai (#7514)
* use aiohttp handler

* BaseLLMAIOHTTPHandler

* use CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* fix linting

* AiohttpOpenAIChatConfig

* fix order

* aiohttp_openai
2025-01-02 22:15:17 -08:00
Krish Dholakia
02ff7b0a8a Litellm dev 01 01 2025 p1 (#7498)
* refactor(prometheus.py): refactor to remove `_tag` metrics and incorporate in regular metrics

* fix(prometheus.py): handle label values not set in enum values

* feat(prometheus.py): working e2e custom metadata labels

* docs(prometheus.md): update docs to clarify how custom metrics would work

* test(test_prometheus_unit_tests.py): fix test

* test: add unit testing
2025-01-01 18:59:28 -08:00
Krish Dholakia
b0f570ee16 Litellm dev 12 30 2024 p2 (#7495)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic alr. implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* fix(types/utils.py): handle none logprobs

Fixes https://github.com/BerriAI/litellm/issues/328

* fix(exception_mapping_utils.py): fix error str unbound error

* refactor(azure_ai/): move to openai_like chat completion handler

allows for easy swapping of api base url's (e.g. ai.services.com)

Fixes https://github.com/BerriAI/litellm/issues/7275

* refactor(azure_ai/): move to base llm http handler

* fix(azure_ai/): handle differing api endpoints

* fix(azure_ai/): make sure all unit tests are passing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(azure_ai/transformation.py): handle extra body param

* fix(azure_ai/transformation.py): fix max retries param handling

* fix: fix test

* test(test_azure_o1.py): fix test

* fix(llm_http_handler.py): support handling azure ai unprocessable entity error

* fix(llm_http_handler.py): handle sync invalid param error for azure ai

* fix(azure_ai/): streaming support with base_llm_http_handler

* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai

* fix: fix linting errors

* fix(llm_http_handler.py): fix linting error

* fix(azure_ai/): handle cohere tool call invalid index param error
2025-01-01 18:57:29 -08:00
Ishaan Jaff
1180edac18 (Feat) Add support for reading secrets from Hashicorp vault (#7497)
* HashicorpSecretManager

* test_hashicorp_secret_managerv

* use 1 helper initialize_secret_manager

* add HASHICORP_VAULT

* working config

* hcorp read_secret

* HashicorpSecretManager

* add secret_manager_testing

* use 1 folder for secret manager testing

* test_hashicorp_secret_manager_get_secret

* HashicorpSecretManager

* docs HCP secrets

* update folder name

* docs hcorp secret manager

* remove unused imports

* add conftest.py

* fix tests

* docs document env vars
2025-01-01 18:35:05 -08:00