Commit graph

3088 commits

Author SHA1 Message Date
Krish Dholakia
2340f1b31f
Pass router tags in request headers - x-litellm-tags (#8609)
* feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header

allow tag based routing + spend tracking via request headers

* docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking

* docs(tag_routing.md): add to docs

* fix(utils.py): only pass str values for openai metadata param

* fix(utils.py): drop non-str values for metadata param to openai

preview-feature, otel span was being sent in
2025-02-18 08:26:22 -08:00
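The last two bullets of this commit describe dropping non-string values from the metadata sent to OpenAI. A minimal sketch of that filtering idea follows; the helper name `keep_str_metadata` is hypothetical and is not taken from `utils.py`, whose actual implementation may differ.

```python
from typing import Any, Dict


def keep_str_metadata(metadata: Dict[str, Any]) -> Dict[str, str]:
    """Return a copy of `metadata` containing only string-valued entries.

    Hypothetical helper: illustrates "drop non-str values", not the
    actual code in litellm's utils.py.
    """
    return {key: value for key, value in metadata.items() if isinstance(value, str)}


# A non-string value (for example a tracing span object) is dropped
# instead of being forwarded with the request.
cleaned = keep_str_metadata({"team": "search", "attempt": 2, "span": object()})
print(cleaned)
```

Filtering rather than raising keeps the request usable when internal objects end up in the metadata dict, which matches the situation the commit message describes.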
Krrish Dholakia
7bfd816d3b build: merge commit 1b15568af7
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Feb 17 21:37:36 2025 -0800

    fix(proxy/_types.py): fix linting error

commit dc4d5cffa6
Author: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-17 21:56:00 -08:00
Krrish Dholakia
d0413ec96b docs(routing.md): add section on weighted deployments 2025-02-17 17:02:06 -08:00
Krrish Dholakia
048dd995dc docs: update litellm user management hierarchy doc
2025-02-16 00:31:13 -08:00
Krrish Dholakia
c2e0c2f0bc docs(request_headers.md): document openai org id header handling in request_headers.md 2025-02-16 00:04:38 -08:00
Ishaan Jaff
6b3bfa2b42
(Feat) - return x-litellm-attempted-fallbacks in responses from litellm proxy (#8558)
* add_fallback_headers_to_response

* test x-litellm-attempted-fallbacks

* unit test attempted fallbacks

* fix add_fallback_headers_to_response

* docs document response headers

* fix file name
2025-02-15 14:54:23 -08:00
miraclebakelaser
3c197b9925
docs(perplexity.md): removing return_citations documentation (#8527)
Deprecation Notice:

Effective immediately, all API users will see citations returned as part of their requests by default. This is not a breaking change. The return_citations parameter will no longer have any effect.

[source](https://docs.perplexity.ai/changelog/changelog#citations-public-release-and-increased-default-rate-limits)
2025-02-13 22:09:54 -08:00
Krish Dholakia
58141df65d
Litellm dev 02 13 2025 p2 (#8525)
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a supported azure param

Closes https://github.com/BerriAI/litellm/issues/8500

* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model

* style: cleanup invalid json trailing comma

* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template

enables passing complete tokenizer config of model to litellm

 Allows calling deepseek on bedrock with the correct prompt template

* fix(utils.py): fix register_prompt_template for custom model names

* test(test_prompt_factory.py): fix test

* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model

* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls

enables proxy admin to set base model for ft bedrock deepseek model

* feat(bedrock/invoke): support deepseek_r1 route for bedrock

makes it easy to apply the right chat template to that call

* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work

* test(test_completion.py): add e2e mock test for bedrock deepseek

* docs(bedrock.md): document new deepseek_r1 route for bedrock

allows us to use the right config

* fix(exception_mapping_utils.py): catch read operation timeout
2025-02-13 20:28:42 -08:00
vivek-athina
fd0769f2ed
Added custom_attributes to additional_keys which can be sent to athina (#8518) 2025-02-13 13:19:24 -08:00
exiao
fa3136c391
add phoenix docs for observability integration (#8522)
* Add files via upload

* Update arize_integration.md

* Update arize_integration.md

* add Phoenix docs
2025-02-13 13:18:37 -08:00
Krish Dholakia
305049a968
Litellm dev 02 12 2025 p1 (#8494)
* Resolves https://github.com/BerriAI/litellm/issues/6625 (#8459)

- enables no auth for SMTP

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* test: fix test

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
2025-02-12 22:39:29 -08:00
Krrish Dholakia
9f93ed110a docs: fix docs
2025-02-12 07:28:21 -08:00
Krrish Dholakia
c4a5e2c5c7 docs(token_auth.md): clarify scopes can be a list or comma separated string 2025-02-12 07:26:47 -08:00
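The commit above notes that scopes may arrive as either a list or a comma-separated string. One way to normalize both shapes is sketched here; `normalize_scopes` is a hypothetical name for illustration and is not claimed to be the function the proxy uses.

```python
from typing import List, Optional, Union


def normalize_scopes(scopes: Optional[Union[str, List[str]]]) -> List[str]:
    """Turn a list or a comma-separated string of scopes into a clean list.

    Hypothetical helper for illustration only.
    """
    if scopes is None:
        return []
    # A string is split on commas; a list is used as given.
    parts = scopes.split(",") if isinstance(scopes, str) else list(scopes)
    # Strip surrounding whitespace and drop empty entries.
    return [part.strip() for part in parts if part.strip()]


print(normalize_scopes("read, write,,admin"))  # ['read', 'write', 'admin']
print(normalize_scopes(["read", "write"]))     # ['read', 'write']
```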
Krish Dholakia
9c4c7813fb
Allow org admin to create teams on UI (#8407)
* fix(client_initialization_utils.py): handle custom llm provider set with valid value not from model name

* fix(handle_jwt.py): handle groups not existing in jwt token

if user not in group, this won't exist

* fix(handle_jwt.py): add new `enforce_team_based_model_access` flag to jwt auth

allows proxy admin to enforce user can only call model if team has access

* feat(navbar.tsx): expose new dropdown in navbar - allow org admin to create teams within org context

* fix(navbar.tsx): remove non-functional cogicon

* fix(proxy/utils.py): include user-org memberships in `/user/info` response

return orgs user is a member of and the user role within org

* feat(organization_endpoints.py): allow internal user to query `/organizations/list` and get all orgs they belong to

enables org admin to select org they belong to, to create teams

* fix(navbar.tsx): show change in ui when org switcher clicked

* feat(page.tsx): update user role based on org they're in

allows org admin to create teams in the org context

* feat(teams.tsx): working e2e flow for allowing org admin to add new teams

* style(navbar.tsx): clarify switching orgs on UI is in BETA

* fix(organization_endpoints.py): handle getting but not setting members

* test: fix test

* fix(client_initialization_utils.py): revert custom llm provider handling fix - causing unintended issues

* docs(token_auth.md): cleanup docs
2025-02-09 00:07:15 -08:00
Mubashir Osmani
bc2ac8264e
added gemini 2.0 models (#8412) 2025-02-08 22:34:22 -08:00
Krish Dholakia
1dd3713f1a
Anthropic Citations API Support (#8382)
* test(test_anthropic_completion.py): add test ensuring anthropic structured output response is consistent

Resolves https://github.com/BerriAI/litellm/issues/8291

* feat(anthropic.py): support citations api with new user document message format

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(anthropic/chat/transformation.py): return citations as a provider-specific-field

Resolves https://github.com/BerriAI/litellm/issues/7970

* feat(anthropic/chat/handler.py): add streaming citations support

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(handler.py): fix code qa error

* fix(handler.py): only set provider specific fields if non-empty dict

* docs(anthropic.md): add citations api to anthropic docs
2025-02-07 22:27:01 -08:00
Krish Dholakia
d720744656
Litellm dev 02 06 2025 p3 (#8343)
* feat(handle_jwt.py): initial commit to allow scope based model access

* feat(handle_jwt.py): allow model access based on token scopes

allow admin to control model access from IDP

* test(test_jwt.py): add unit testing for scope based model access

* docs(token_auth.md): add scope based model access to docs

* docs(token_auth.md): update docs

* docs(token_auth.md): update docs

* build: add gemini commercial rate limits

* fix: fix linting error
2025-02-06 23:15:33 -08:00
Ishaan Jaff
e3aab50ab3 docs assembly ai
2025-02-06 21:30:36 -08:00
Ishaan Jaff
229f270dd6 docs assembly ai eu endpoints 2025-02-06 21:13:40 -08:00
Krish Dholakia
f031926b82
fix(utils.py): handle key error in msg validation (#8325)
* fix(utils.py): handle key error in msg validation

* Support running Aim Guard during LLM call (#7918)

* support running Aim Guard during LLM call

* Rename header

* adjust docs and fix type annotations

* fix(timeout.md): doc fix for openai example on dynamic timeouts

---------

Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
2025-02-06 18:13:46 -08:00
Rok Benko
3ec9c28fb7
Update local_debugging.md (#8308) 2025-02-06 16:19:32 -08:00
exiao
85491a0bab
Add Arize Cookbook for Turning on LiteLLM Proxy (#8336)
* Add files via upload

* Update arize_integration.md
2025-02-06 16:16:28 -08:00
Tyler Wagner
5e921804b9
fix: docs links (#8294)
Fixed the docs links in the enterprise md.
2025-02-05 20:41:20 -08:00
Zhaohan Dong
88e7046165
Added compatibility guidance, etc. for xAI Grok model (#8282)
* Various updates

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>

* Update xAI branding

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>

* Revert changes

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>

---------

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>
2025-02-05 17:21:47 -08:00
waterstark
fbe3c58372
Added a guide for users who want to use LiteLLM with AI/ML API. (#7058)
* Added a guide for users who want to use LiteLLM with AI/ML.

* Minor changes

* Minor changes

* Fix sidebars.js

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-02-05 06:20:35 -08:00
Krish Dholakia
8d3a942fbd
Litellm staging (#8270)
* fix(opik.py): cleanup

* docs(opik_integration.md): cleanup opik integration docs

* fix(redact_messages.py): fix redact messages check header logic

ensures stringified bool value in header is still asserted to true

 allows dynamic message redaction

* feat(redact_messages.py): support `x-litellm-enable-message-redaction` request header

allows dynamic message redaction
2025-02-04 22:35:48 -08:00
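The redaction fix in this commit concerns a boolean that arrives as a string in a request header. A minimal sketch of such a check follows, assuming the header value is the text `true` in any casing; the function name `header_flag_enabled` is hypothetical and the real logic in `redact_messages.py` may accept other values.

```python
from typing import Mapping


def header_flag_enabled(headers: Mapping[str, str], name: str) -> bool:
    """Return True when header `name` is present with a stringified true value.

    Hypothetical helper: header names are compared case-insensitively and
    only the text "true" (any casing) counts as enabled.
    """
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return str(value).strip().lower() == "true"
    return False


request_headers = {"X-LiteLLM-Enable-Message-Redaction": "True"}
print(header_flag_enabled(request_headers, "x-litellm-enable-message-redaction"))  # True
```

Comparing the lowercased text avoids the common mistake of testing the raw string for truthiness, where the value `"false"` would also count as enabled.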
Krish Dholakia
4e34fc3bf8
[BETA] Support OIDC role based access to proxy (#8260)
* feat(proxy/_types.py): add new jwt field params

allows users + services to auth into proxy

* feat(handle_jwt.py): allow team role proxy access

allows proxy admin to set allowed team roles

* fix(proxy/_types.py): add 'routes' to role based permissions

allow proxy admin to restrict what routes a team can access easily

* feat(handle_jwt.py): support more flexible role based route access

v2 on role based 'allowed_routes'

* test(test_jwt.py): add unit test for rbac for proxy routes

* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`

* docs(token_auth.md): add documentation on controlling model access via OIDC Roles

* test: increase time delay before retrying

* test: handle model overloaded for test
2025-02-04 21:59:39 -08:00
Ishaan Jaff
8fd60a420d
(Feat) - New pass through add assembly ai passthrough endpoints (#8220)
* add assembly ai pass through request

* fix assembly pass through

* fix test_assemblyai_basic_transcribe

* fix assemblyai auth check

* test_assemblyai_transcribe_with_non_admin_key

* working assembly ai test

* working assembly ai proxy route

* use helper func to pass through logging

* clean up logging assembly ai

* test: update test to handle gemini token counter change

* fix(factory.py): fix bedrock http:// handling

* add unit testing for assembly pt handler

* docs assembly ai pass through endpoint

* fix proxy_pass_through_endpoint_tests

* fix standard_passthrough_logging_object

* fix ASSEMBLYAI_API_KEY

* test test_assemblyai_proxy_route_basic_post

* test_assemblyai_proxy_route_get_transcript

* fix is_assemblyai_route

* test_is_assemblyai_route

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-03 21:54:32 -08:00
foreign-sub
aa6a18ecc2
docs: fix typo in lm_studio.md (#8222)
2025-02-03 18:37:31 -08:00
Krish Dholakia
c8494abdea
test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Zhaohan Dong
d60d3ee970
Add xAI and fix some old model config (#8218)
Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>
2025-02-03 15:29:19 -08:00
fzowl
d1d9c1e95a
docs: Updating the available VoyageAI models in the docs (#8215)
* Refresh VoyageAI models and prices and context

* Refresh VoyageAI models and prices and context

* Refresh VoyageAI models and prices and context

* Updating the available VoyageAI models in the docs
2025-02-03 07:26:33 -08:00
Ishaan Jaff
4e9c2d5b21 docs update log stream event 2025-02-01 16:33:28 -08:00
Krish Dholakia
23f458d2da
Improved O3 + Azure O3 support (#8181)
* fix: support azure o3 model family for fake streaming workaround (#8162)

* fix: support azure o3 model family for fake streaming workaround

* refactor: rename helper to is_o_series_model for clarity

* update function calling parameters for o3 models (#8178)

* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3

ensures max_tokens is correctly translated for o3

* feat(openai/): refactor o1 files to be 'o_series' files

expands naming to cover o3

* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets

* test(test_azure_o_series.py): assert stream faked for azure o3 mini

Resolves https://github.com/BerriAI/litellm/pull/8162

* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing

* docs(azure.md): update doc with `o_series/` model name

---------

Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
Krish Dholakia
2147cad307
Litellm dev 01 31 2025 p2 (#8164)
* docs(token_auth.md): clarify title

* refactor(handle_jwt.py): add jwt auth manager + refactor to handle groups

allows user to call model if user belongs to group with model access

* refactor(handle_jwt.py): refactor to first check if service call then check user call

* feat(handle_jwt.py): new `enforce_team_access` param

only allows user to call model if a team they belong to has model access

allows controlling user model access by team

* fix(handle_jwt.py): fix error string, remove unnecessary param

* docs(token_auth.md): add controlling model access for jwt tokens via teams to docs

* test: fix tests post refactor

* fix: fix linting errors

* fix: fix linting error

* test: fix import error
2025-01-31 22:52:35 -08:00
Ishaan Jaff
1d9ccb7fbe doc fix 2025-01-31 21:19:39 -08:00
Ishaan Jaff
9ff27809b2
(Feat) add bedrock/deepseek custom import models (#8132)
* add support for using llama spec with bedrock

* fix get_bedrock_invoke_provider

* add support for using bedrock provider in mappings

* working request

* test_bedrock_custom_deepseek

* test_bedrock_custom_deepseek

* fix _get_model_id_for_llama_like_model

* test_bedrock_custom_deepseek

* doc DeepSeek-R1-Distill-Llama-70B

* test_bedrock_custom_deepseek
2025-01-31 18:40:44 -08:00
Krrish Dholakia
4b6fc4bba3 docs: fix dead links
2025-01-31 10:09:49 -08:00
Krish Dholakia
a699000a4b
New stable release - release notes (#8148)
* docs(v1.59.8-stable): add release note

* docs(index.md): cleanup new stable release, release notes
2025-01-31 10:02:59 -08:00
Krish Dholakia
de261e2120
Doc updates + management endpoint fixes (#8138)
* Litellm dev 01 29 2025 p4 (#8107)

* fix(key_management_endpoints.py): always get db team

Fixes https://github.com/BerriAI/litellm/issues/7983

* test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks

* test: fix test

* test: skip gemini thinking

* Litellm dev 01 29 2025 p3 (#8106)

* fix(__init__.py): reduces size of __init__.py and reduces scope for errors by using correct param

* refactor(__init__.py): refactor init by cleaning up redundant params

* refactor(__init__.py): move more constants into constants.py

cleanup root

* refactor(__init__.py): more cleanup

* feat(__init__.py): expose new 'disable_hf_tokenizer_download' param

enables hf model usage in offline env

* docs(config_settings.md): document new disable_hf_tokenizer_download param

* fix: fix linting error

* fix: fix unsafe comparison

* test: fix test

* docs(public_teams.md): add doc showing how to expose public teams for users to join

* docs: add beta disclaimer on public teams

* test: update tests
2025-01-30 22:56:41 -08:00
Krish Dholakia
69a6da4727
Litellm dev 01 30 2025 p2 (#8134)
* feat(lowest_tpm_rpm_v2.py): fix redis cache check to use >= instead of >

makes it consistent

* test(test_custom_guardrails.py): add more unit testing on default on guardrails

ensure it runs if user sent guardrail list is empty

* docs(quick_start.md): clarify default on guardrails run even if user guardrails list contains other guardrails

* refactor(litellm_logging.py): refactor no-log to helper util

allows for more consistent behavior

* feat(litellm_logging.py): add event hook to verbose logs

* fix(litellm_logging.py): add unit testing to ensure `litellm.disable_no_log_param` is respected

* docs(logging.md): document how to disable 'no-log' param

* test: fix test to handle feb

* test: cleanup old bedrock model

* fix: fix router check
2025-01-30 22:18:53 -08:00
Krish Dholakia
41407f7be1
Doc updates - add key rotations to docs (#8136)
* docs(virtual_keys.md): add key rotations to virtual keys doc

* docs(enterprise.md): add key rotations to enterprise docs
2025-01-30 22:17:00 -08:00
Krrish Dholakia
9fa44a4fbe docs(bedrock.md): update docs to show how to use converse like route for internal proxy usage
Resolves https://github.com/BerriAI/litellm/issues/8085
2025-01-29 21:00:45 -08:00
Krish Dholakia
dad24f2b52
Litellm dev 01 29 2025 p2 (#8102)
* docs: cleanup doc

* feat(bedrock/): initial commit adding bedrock/converse_like/<model> route support

allows routing to a converse like endpoint

Resolves https://github.com/BerriAI/litellm/issues/8085

* feat(bedrock/chat/converse_transformation.py): make converse config base config compatible

enables new 'converse_like' route

* feat(converse_transformation.py): enables using the proxy with converse like api endpoint

Resolves https://github.com/BerriAI/litellm/issues/8085
2025-01-29 20:53:37 -08:00
Krish Dholakia
d9eb8f42ff
Litellm dev 01 27 2025 p3 (#8047)
* docs(reliability.md): add doc on disabling fallbacks per request

* feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param

Allows setting dynamic model timeouts from vercel's AI sdk

* test(test_proxy_server.py): add simple unit test for reading request timeout

* test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read

* feat(main.py): support passing metadata to openai in preview

Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371

* fix(main.py): fix passing openai metadata

* docs(request_headers.md): document new request headers

* build: Merge branch 'main' into litellm_dev_01_27_2025_p3

* test: loosen test
2025-01-28 18:01:27 -08:00
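This commit adds reading a request timeout from the `x-litellm-timeout` header. The sketch below shows one way a numeric timeout could be parsed from request headers. The function name `read_timeout_header` is hypothetical, and treating the value as a positive number of seconds is an assumption rather than something stated in the commit message.

```python
import math
from typing import Mapping, Optional


def read_timeout_header(
    headers: Mapping[str, str], name: str = "x-litellm-timeout"
) -> Optional[float]:
    """Parse a timeout from request headers, or return None if absent or invalid.

    Hypothetical helper: assumes the header carries a positive, finite number
    (interpreted here as seconds, which is an assumption).
    """
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() != wanted:
            continue
        try:
            timeout = float(value)
        except (TypeError, ValueError):
            return None  # not a number
        if math.isfinite(timeout) and timeout > 0:
            return timeout
        return None  # zero, negative, NaN, or infinite
    return None  # header not present


print(read_timeout_header({"X-LiteLLM-Timeout": "2.5"}))  # 2.5
print(read_timeout_header({"x-litellm-timeout": "soon"}))  # None
```

Returning `None` for a missing or malformed value lets the caller fall back to its configured default instead of failing the request.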
Krish Dholakia
8eaa5dc797
Bedrock document processing fixes (#8005)
* refactor(factory.py): refactor async bedrock message transformation to use async get request for image url conversion

improve latency of bedrock call

* test(test_bedrock_completion.py): add unit testing to ensure async image url get called for async bedrock call

* refactor(factory.py): refactor bedrock translation to use BedrockImageProcessor

reduces duplicate code

* fix(factory.py): fix bug not allowing pdf's to be processed

* fix(factory.py): fix bedrock converse document understanding with image url

* docs(bedrock.md): clarify all bedrock document types are supported

* refactor: cleanup redundant test + unused imports

* perf: improve perf with reusable clients

* test: fix test
2025-01-28 17:48:32 -08:00
Krish Dholakia
2eaa0079f2
feat(handle_jwt.py): initial commit adding custom RBAC support on jwt… (#8037)
* feat(handle_jwt.py): initial commit adding custom RBAC support on jwt auth

allows admin to define user role field and allowed roles which map to 'internal_user' on litellm

* fix(auth_checks.py): ensure user allowed to access model, when calling via personal keys

Fixes https://github.com/BerriAI/litellm/issues/8029

* feat(handle_jwt.py): support role based access with model permission control on proxy

Allows admin to just grant users roles on IDP (e.g. Azure AD/Keycloak) and user can immediately start calling models

* docs(rbac): add docs on rbac for model access control

make it clear how admin can use roles to control model access on proxy

* fix: fix linting errors

* test(test_user_api_key_auth.py): add unit testing to ensure rbac role is correctly enforced

* test(test_user_api_key_auth.py): add more testing

* test(test_users.py): add unit testing to ensure user model access is always checked for new keys

Resolves https://github.com/BerriAI/litellm/issues/8029

* test: fix unit test

* fix(dot_notation_indexing.py): fix typing to work with python 3.8
2025-01-28 16:27:06 -08:00
Rashmi Pawar
986c463983
(doc) Add nvidia as provider (#8023)
* add nvidia as provider in docs

* fixes for closing tag

* review changes
2025-01-27 21:18:34 -08:00
Ishaan Jaff
1255772547 docs smol agents 2025-01-27 18:12:23 -08:00
Ishaan Jaff
e845675773 fix smol agents doc 2025-01-27 18:10:09 -08:00