Commit graph

3426 commits

Author SHA1 Message Date
Rashmi Pawar
986c463983
(doc) Add nvidia as provider (#8023)
* add nvidia as provider in docs

* fixes for closing tag

* review changes
2025-01-27 21:18:34 -08:00
Ishaan Jaff
1255772547 docs smol agents 2025-01-27 18:12:23 -08:00
Ishaan Jaff
e845675773 fix smol agents doc 2025-01-27 18:10:09 -08:00
Aymeric Roucher
7caa05505c
Add smolagents (#8026)
* Add smolagents

* Try removing emoji from url

* Revert "Try removing emoji from url"

This reverts commit 7fdc1e88e4.
2025-01-27 18:09:37 -08:00
Krrish Dholakia
395e903346 docs(bedrock.md): cleanup doc
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
2025-01-27 07:39:15 -08:00
Ishaan Jaff
669b4fc955
(Prometheus) - emit key budget metrics on startup (#8002)
* add UI_SESSION_TOKEN_TEAM_ID

* add type KeyListResponseObject

* add _list_key_helper

* _initialize_api_key_budget_metrics

* key / budget metrics

* init key budget metrics on startup

* test_initialize_api_key_budget_metrics

* fix linting

* test_list_key_helper

* test_initialize_remaining_budget_metrics_exception_handling
2025-01-25 10:37:52 -08:00
Krrish Dholakia
cb5da90a1a docs(bedrock.md): add doc on calling bedrock via internal gateway
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 36s
2025-01-24 22:00:59 -08:00
Ishaan Jaff
74caef0843
(Feat) - Add GCS Pub/Sub Logging integration for sending DB SpendLogs to BigQuery (#7976)
* add pub_sub

* fix custom batch logger for GCS PUB/SUB

* GCS_PUBSUB_PROJECT_ID

* e2e gcs pub sub

* add gcs pub sub

* fix logging

* add GcsPubSubLogger

* fix pub sub

* add pub sub

* docs gcs pub / sub

* docs on pub sub controls

* test_gcs_pub_sub

* fix publish_message

* test_async_gcs_pub_sub

* test_async_gcs_pub_sub
2025-01-24 20:57:20 -08:00
Ishaan Jaff
ed283bc5b4
(Feat) - allow setting default_on guardrails (#7973)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 12s
* test_default_on_guardrail

* update debug on custom guardrail

* refactor guardrails init

* guardrail registry

* allow switching guardrails default_on

* fix circle import issue

* fix bedrock applying guardrails where content is a list

* fix unused import

* docs default on guardrail

* docs fix per api key
2025-01-24 10:14:05 -08:00
Ishaan Jaff
bedd48e83e docs - store_prompts_in_spend_logs 2025-01-24 07:14:54 -08:00
Tyler Wagner
7d320d31bd
docs: fix typo (#7953) 2025-01-23 16:59:17 -08:00
Krish Dholakia
e3bacf7196
Litellm dev 01 22 2025 p1 (#7933)
* docs(docker_quick_start.md): add more troubleshooting guides

* test(test_fallbacks.py): add e2e test for proxy with fallbacks + custom fallback message

* test(test_bedrock_completion.py): skip test now that bedrock supports this behaviour

* test(test_fireworks_ai_translation.py): mock fireworks ai test
2025-01-22 19:55:32 -08:00
dependabot[bot]
5c88644b87
build(deps): bump undici from 6.21.0 to 6.21.1 in /docs/my-website (#7902)
Bumps [undici](https://github.com/nodejs/undici) from 6.21.0 to 6.21.1.
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](https://github.com/nodejs/undici/compare/v6.21.0...v6.21.1)

---
updated-dependencies:
- dependency-name: undici
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-22 16:56:23 -08:00
Ishaan Jaff
0eec1d860e docs on custom tags with langfuse 2025-01-22 08:48:56 -08:00
Ishaan Jaff
9e0ca7da94 docs fix order of logging integrations 2025-01-22 08:45:22 -08:00
Krish Dholakia
76795dba39
Deepseek r1 support + watsonx qa improvements (#7907)
* fix(types/utils.py): support returning 'reasoning_content' for deepseek models

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(convert_dict_to_response.py): return deepseek response in provider_specific_field

allows for separating openai vs. non-openai params in model response

* fix(utils.py): support 'provider_specific_field' in delta chunk as well

allows deepseek reasoning content chunk to be returned to user from stream as well

Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218

* fix(watsonx/chat/handler.py): fix passing space id to watsonx on chat route

* fix(watsonx/): fix watsonx_text/ route with space id

* fix(watsonx/): qa item - also adds better unit testing for watsonx embedding calls

* fix(utils.py): rename to '..fields'

* fix: fix linting errors

* fix(utils.py): fix typing - don't show provider-specific field if none or empty - prevents default respons
e from being non-oai compatible

* fix: cleanup unused imports

* docs(deepseek.md): add docs for deepseek reasoning model
2025-01-21 23:13:15 -08:00
Ishaan Jaff
a57b8f6802 fix litellm_overhead_latency_metric 2025-01-21 21:33:31 -08:00
Yuki Watanabe
3f053fc99c
Update MLflow calllback and documentation (#7809)
* Update MLlfow tracer

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* doc update

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* doc update

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* image rename

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2025-01-21 20:56:48 -08:00
Krish Dholakia
64e1df1f14
Litellm dev 01 20 2025 p3 (#7890)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* fix(router.py): pass stream timeout correctly for non openai / azure models

Fixes https://github.com/BerriAI/litellm/issues/7870

* test(test_router_timeout.py): add test for streaming

* test(test_router_timeout.py): add unit testing for new router functions

* docs(ollama.md): link to section on calling ollama within docker container

* test: remove redundant test

* test: fix test to include timeout value

* docs(config_settings.md): document new router settings param
2025-01-20 21:46:36 -08:00
Krish Dholakia
4b23420a20
Litellm dev 01 20 2025 p1 (#7884)
* fix(initial-test-to-return-api-timeout-value-in-openai-timeout-exception): Makes it easier for user to debug why request timed out

* feat(openai.py): return timeout value + time taken on openai timeout errors

helps debug timeout errors

* fix(utils.py): fix num retries extraction logic when num_retries = 0

* fix(config_settings.md): litellm_logging.py

support printing payload to console if 'LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD' is true

 Enables easier debug

* test(test_auth_checks.py'): remove common checks userapikeyauth enforcement check

* fix(litellm_logging.py): fix linting error
2025-01-20 21:45:48 -08:00
King
f654567aa2
Fix typo Update alerting.md (#7880) 2025-01-20 08:04:03 -08:00
Ishaan Jaff
baa9fda9ce docs - Custom Retention Policies 2025-01-20 07:29:48 -08:00
Ishaan Jaff
803da333bf docs Data Retention Policy 2025-01-20 07:00:38 -08:00
Krish Dholakia
dca6904937
JWT Auth - enforce_rbac support + UI team view, spend calc fix (#7863)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 12s
* fix(user_dashboard.tsx): fix spend calculation when team selected

sum all team keys, not user keys

* docs(admin_ui_sso.md): fix docs tabbing

* feat(user_api_key_auth.py): introduce new 'enforce_rbac' param on jwt auth

allows proxy admin to prevent any unmapped yet authenticated jwt tokens from calling proxy

Fixes https://github.com/BerriAI/litellm/issues/6793

* test: more unit testing + refactoring

* fix: fix returning id when obj not found in db

* fix(user_api_key_auth.py): add end user id tracking from jwt auth

* docs(token_auth.md): add doc on rbac with JWTs

* fix: fix unused params

* test: remove old test
2025-01-19 21:28:55 -08:00
Krish Dholakia
3a7b13efa2
feat(health_check.py): set upperbound for api when making health check call (#7865)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 10s
* feat(health_check.py): set upperbound for api when making health check call

prevent bad model from health check to hang and cause pod restarts

* fix(health_check.py): cleanup task once completed

* fix(constants.py): bump default health check timeout to 1min

* docs(health.md): add 'health_check_timeout' to health docs on litellm

* build(proxy_server_config.yaml): add bad model to health check
2025-01-18 19:47:43 -08:00
Ishaan Jaff
f8b059bfa1 docs data sec 2025-01-18 17:44:02 -08:00
Ishaan Jaff
c458c7c801 litellm security page 2025-01-18 17:24:39 -08:00
Ishaan Jaff
c0253e17af docs Security Certifications 2025-01-18 17:12:42 -08:00
Ishaan Jaff
40a78253a0 docs data privacy 2025-01-18 17:01:57 -08:00
Ishaan Jaff
a2762fb273 ui release note 2025-01-17 20:27:53 -08:00
Ishaan Jaff
bc311b7a47 ui logs - view messages / responses 2025-01-17 20:20:49 -08:00
Ishaan Jaff
2c117264a2
[Hashicorp - secret manager] - use vault namespace for tls auth (#7834)
* hcorp - use x-vault-namespace

* _get_tls_cert_auth_body

* HCP_VAULT_CERT_ROLE

* test_hashicorp_secret_manager_tls_cert_auth

* HCP_VAULT_CERT_ROLE
2025-01-17 19:27:56 -08:00
Nikolaiev Dmytro
7b45349145
Update instructor tutorial (#7784)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
2025-01-15 15:10:50 -08:00
Hugues Chocart
6fff77d131
[integrations/lunary] Improve Lunary documentaiton (#7770)
* update lunary doc

* better title

* tweaks

* Update langchain.md

* Update lunary_integration.md
2025-01-15 15:00:25 -08:00
Ishaan Jaff
df7d500d42
docs iam role based access for bedrock (#7774) 2025-01-14 19:02:02 -08:00
Krish Dholakia
7b27cfb0ae
Support temporary budget increases on keys (#7754)
* fix(gpt_transformation.py): fix response_format translation check for 4o models

Fixes https://github.com/BerriAI/litellm/issues/7616

* feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields

Allow proxy admin to grant temporary budget increases to keys

* fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together

* feat(user_api_key_auth.py): initial working temp budget increase logic

ensures key budget exceeded error checks for temp budget in key metadata

* feat(proxy_server.py): return the key max budget and key spend in the response headers

Allows clientside user to know their remaining limits

* test: add unit testing for new proxy utils

Ensures new key budget is correctly handled

* docs(temporary_budget_increase.md): add doc on temporary budget increase

* fix(utils.py): remove 3.5 from response_format check for now

not all azure  3.5 models support response_format

* fix(user_api_key_auth.py): return valid user api key auth object on all paths
2025-01-14 17:03:11 -08:00
Krish Dholakia
29663c2db5
Litellm dev 01 14 2025 p1 (#7771)
* First-class Aim Guardrails support (#7738)

* initial aim support

* add tests

* docs(langsmith_integration.md): cleanup

* style: cleanup unused imports

---------

Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
2025-01-14 16:18:21 -08:00
Ishaan Jaff
8c016d0184 docs benchmark 2025-01-14 10:48:43 -08:00
Ishaan Jaff
eb2770fee2 update benchmarks 2025-01-14 10:45:28 -08:00
Ishaan Jaff
d510f1d517
(fix) health check - allow setting health_check_model (#7752)
* use _update_litellm_params_for_health_check

* fix Wildcard Routes

* test_update_litellm_params_for_health_check

* test_perform_health_check_with_health_check_model

* fix doc string

* huggingface/mistralai/Mistral-7B-Instruct-v0.3
2025-01-13 20:16:44 -08:00
ProphetJeremy
ecf6d22dc2
(docs) Update vertex.md old code example
Complete imports
Remove invalid parameter `disable_atributon`
2025-01-13 14:27:27 +01:00
Krish Dholakia
ec5a354eac
add azure o1 pricing (#7715)
* build(model_prices_and_context_window.json): add azure o1 pricing

Closes https://github.com/BerriAI/litellm/issues/7712

* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)

* Allows overriding keep_alive time in ollama (#7079)

* Allows overriding keep_alive time in ollama

* Also adds to ollama_chat

* Adds some info on the docs about this parameter

* fix: together ai warning (#7688)

Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>

* fix(proxy_server.py): handle config containing thread locked objects when using get_config_state

* fix(proxy_server.py): add exception to debug

* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
2025-01-12 18:15:35 -08:00
Krrish Dholakia
3062564488 docs(enterprise.md): cleanup docs and add faq
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 12s
2025-01-11 10:46:55 -08:00
Krrish Dholakia
d988bfb6f8 docs(enterprise.md): clarify sla for patching vulnerabilities 2025-01-11 10:42:32 -08:00
Krish Dholakia
5e537fbdb1
fix(model_hub.tsx): clarify cost in model hub is per 1m tokens (#7687)
* fix(model_hub.tsx): clarify cost in model hub is per 1m tokens

* docs: test blog

* docs: improve release note docs

* docs(docs/): new stable release doc

* docs(docs/): specify date in all posts

* docs(docs/): add git diff to stable release docs
2025-01-11 09:57:09 -08:00
Krrish Dholakia
9a1c050cf7 docs: new release notes
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 41s
2025-01-10 22:49:20 -08:00
Krrish Dholakia
f2ca244766 docs(logging.md): docs(logging.md): add docs on s3 bucket logging with team alias prefix 2025-01-10 22:28:05 -08:00
Krish Dholakia
27892acdfc
Litellm dev 01 10 2025 p3 (#7682)
* feat(langfuse.py): log the used prompt when prompt management used

* test: fix test

* docs(self_serve.md): add doc on restricting personal key creation on ui

* feat(s3.py): support s3 logging with team alias prefixes (if available)

New preview feature

* fix(main.py): remove old if block - simplify to just await if coroutine returned

fixes lm_studio async embedding error

* fix(langfuse.py): handle get prompt check
2025-01-10 21:56:42 -08:00
Krish Dholakia
c4780479a9
Litellm dev 01 10 2025 p2 (#7679)
* test(test_basic_python_version.py): assert all optional dependencies are marked as extras on poetry

Fixes https://github.com/BerriAI/litellm/issues/7677

* docs(secret.md): clarify 'read_and_write' secret manager usage on aws

* docs(secret.md): fix doc

* build(ui/teams.tsx): add edit/delete button for updating user / team membership on ui

allows updating user role to admin on ui

* build(ui/teams.tsx): display edit member component on ui, when edit button on member clicked

* feat(team_endpoints.py): support updating team member role to admin via api endpoints

allows team member to become admin post-add

* build(ui/user_dashboard.tsx): if team admin - show all team keys

Fixes https://github.com/BerriAI/litellm/issues/7650

* test(config.yml): add tomli to ci/cd

* test: don't call python_basic_testing in local testing (covered by python 3.13 testing)
2025-01-10 21:50:53 -08:00
Ishaan Jaff
49d74748b0 fix showing release notes 2025-01-10 20:40:50 -08:00