litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 11:14:04 +00:00

Author	SHA1	Message	Date
Krish Dholakia	816f0ef8d2	LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051 ) * fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations ensures cost tracking is reliable - handles edge cases of parsing model cost map * build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329 * build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html * fix(converse_transformation.py): support amazon nova tool use * fix(opentelemetry): Add missing LLM request type attribute to spans (#7041) * feat(opentelemetry): add LLM request type attribute to spans * lint * fix: curl usage (#7038) curl -d, --data <data> is lowercase d curl -D, --dump-header <filename> is uppercase D references: https://curl.se/docs/manpage.html#-d https://curl.se/docs/manpage.html#-D * fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log Fixes https://github.com/BerriAI/litellm/issues/7023 * fix(streaming_chunk_builder.py): handle initial id being empty string Fixes https://github.com/BerriAI/litellm/issues/7023 * fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint * docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints * feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk * docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk * fix(litellm_logging.py): use standard logging payload if present in kwargs prevent datadog logging error for pass through endpoints * docs(bedrock.md): add rerank api usage example to docs * 
bugfix/change dummy tool name format (#7053) * fix viewing keys (#7042) * ui new build * build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044) * bye (#6982) * (fix) litellm router.aspeech (#6962) * doc Migrating Databases * fix aspeech on router * test_audio_speech_router * test_audio_speech_router * docs show supported providers on batches api doc * change dummy tool name format --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> * fix: fix linting errors * test: update test * fix(litellm_logging.py): fix pass through check * fix(test_otel_logging.py): fix test * fix(cost_calculator.py): update handling for cost per second * fix(cost_calculator.py): fix cost check * test: fix test * (fix) adding public routes when using custom header (#7045) * get_api_key_from_custom_header * add test_get_api_key_from_custom_header * fix testing use 1 file for test user api key auth * fix test user api key auth * test_custom_api_key_header_name * build: update ui build --------- Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com> Co-authored-by: lloydchang <lloydchang@gmail.com> Co-authored-by: hgulersen <haymigulersen@gmail.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com>	2024-12-06 14:29:53 -08:00
Ishaan Jaff	0c40dd7199	docs show supported providers on batches api doc	2024-12-05 14:04:36 -08:00
Ishaan Jaff	fb1ec4b6be	(fix) litellm router.aspeech (#6962) * doc Migrating Databases * fix aspeech on router * test_audio_speech_router * test_audio_speech_router	2024-12-05 13:39:50 -08:00
yujonglee	1eb9b4ee13	bye (#6982)	2024-12-05 13:38:10 -08:00
Krish Dholakia	61b35c12bb	LiteLLM Minor Fixes & Improvements (12/05/2024) (#7037 ) * fix(together_ai/chat): only return response_format + tools for supported models Fixes https://github.com/BerriAI/litellm/issues/6972 * feat(bedrock/rerank): initial working commit for bedrock rerank api support Closes https://github.com/BerriAI/litellm/issues/7021 * feat(bedrock/rerank): async bedrock rerank api support Addresses https://github.com/BerriAI/litellm/issues/7021 * build(model_prices_and_context_window.json): add 'supports_prompt_caching' for bedrock models + cleanup cross-region from model list (duplicate information - lead to inconsistencies ) * docs(json_mode.md): clarify model support for json schema Closes https://github.com/BerriAI/litellm/issues/6998 * fix(_service_logger.py): handle dd callback in list ensure failed spend tracking is logged to datadog * feat(converse_transformation.py): translate from anthropic format to bedrock format Closes https://github.com/BerriAI/litellm/issues/7030 * fix: fix linting errors * test: fix test	2024-12-05 00:02:31 -08:00
Ishaan Jaff	84db69d4c4	(feat) add Vertex Batches API support in OpenAI format (#7032 ) * working request * working transform * working request * transform vertex batch response * add _async_create_batch * move gcs functions to base * fix _get_content_from_openai_file * transform_openai_file_content_to_vertex_ai_file_content * fix transform vertex gcs bucket upload to OAI files format * working e2e test * _get_gcs_object_name * fix linting * add doc string * fix transform_gcs_bucket_response_to_openai_file_object * use vertex for batch endpoints * add batches support for vertex * test_vertex_batches_endpoint * test_vertex_batch_prediction * fix gcs bucket base auth * docs clean up batches * docs Batch API * docs vertex batches api * test_get_gcs_logging_config_without_service_account * undo change * fix vertex md * test_get_gcs_logging_config_without_service_account * ci/cd run again	2024-12-04 19:40:28 -08:00
Krrish Dholakia	d37fbc2d59	docs(bedrock.md): clarify what version supports converse/invoke routes	2024-12-04 15:42:21 -08:00
Krrish Dholakia	61192d1fea	docs(bedrock.md): add nova model notice	2024-12-04 15:36:48 -08:00
Ishaan Jaff	3262cd1a61	docs add FAq to life of a request	2024-12-04 09:36:55 -08:00
Krrish Dholakia	ec2503b7cb	docs(azure.md): add proxy spend tracking for azure models to azure docs Fixes https://github.com/BerriAI/litellm/issues/7004#issuecomment-2516644243	2024-12-04 09:08:56 -08:00
Krish Dholakia	6bb934c0ac	fix(key_management_endpoints.py): override metadata field value on up… (#7008 ) * fix(key_management_endpoints.py): override metadata field value on update allow user to override tags * feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric allow disabling end user cost tracking on prometheus - fixes cardinality issue * fix(litellm_pre_call_utils.py): add key/team level enforced params Fixes https://github.com/BerriAI/litellm/issues/6652 * fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update * docs(enterprise.md): add docs on enforcing required params for llm requests * Add support of Galadriel API (#7005) * fix(router.py): robust retry after handling set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment * test(test_router.py): fix test * feat(bedrock/): add support for 'nova' models also adds explicit 'converse/' route for simpler routing * fix: fix 'supports_pdf_input' return if model supports pdf input on get_model_info * feat(converse_transformation.py): support bedrock pdf input * docs(document_understanding.md): add document understanding to docs * fix(litellm_pre_call_utils.py): fix linting error * fix(init.py): fix passing of bedrock converse models * feat(bedrock/converse): support 'response_format={"type": "json_object"}' * fix(converse_handler.py): fix linting error * fix(base_llm_unit_tests.py): fix test * fix: fix test * test: fix test * test: fix test * test: remove duplicate test --------- Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>	2024-12-03 23:03:50 -08:00
Krrish Dholakia	5a430d3c69	docs(json_mode.md): update json docs	2024-12-02 23:08:19 -08:00
Krish Dholakia	859b47f08b	LiteLLM Minor Fixes & Improvements (11/29/2024) (#6965 ) * fix(factory.py): ensure tool call converts image url Fixes https://github.com/BerriAI/litellm/issues/6953 * fix(transformation.py): support mp4 + pdf url's for vertex ai Fixes https://github.com/BerriAI/litellm/issues/6936 * fix(http_handler.py): mask gemini api key in error logs Fixes https://github.com/BerriAI/litellm/issues/6963 * docs(prometheus.md): update prometheus FAQs * feat(auth_checks.py): ensure specific model access > wildcard model access if wildcard model is in access group, but specific model is not - deny access * fix(auth_checks.py): handle auth checks for team based model access groups handles scenario where model access group used for wildcard models * fix(internal_user_endpoints.py): support adding guardrails on `/user/update` Fixes https://github.com/BerriAI/litellm/issues/6942 * fix(key_management_endpoints.py): fix prepare_metadata_fields helper * fix: fix tests * build(requirements.txt): bump openai dep version fixes proxies argument * test: fix tests * fix(http_handler.py): fix error message masking * fix(bedrock_guardrails.py): pass in prepped data * test: fix test * test: fix nvidia nim test * fix(http_handler.py): return original response headers * fix: revert maskedhttpstatuserror * test: update tests * test: cleanup test * fix(key_management_endpoints.py): fix metadata field update logic * fix(key_management_endpoints.py): maintain initial order of guardrails in key update * fix(key_management_endpoints.py): handle prepare metadata * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix: fix key management errors * fix(key_management_endpoints.py): update metadata * test: update test * refactor: add more debug statements * test: skip flaky test * test: fix test * fix: fix test * fix: fix update metadata logic * fix: fix test * ci(config.yml): change db url for e2e ui testing	2024-12-01 05:24:11 -08:00
Ishaan Jaff	05f810922c	(feat) Allow disabling ErrorLogs written to the DB (#6940) * fix - allow disabling logging error logs * docs on disabling error logs * doc string for _PROXY_failure_handler * test_disable_error_logs * rename file * fix rename file * increase test coverage for test_enable_error_logs	2024-11-27 19:34:51 -08:00
Ishaan Jaff	4ebb7c8a7f	(docs + fix) Add docs on Moderations endpoint, Text Completion (#6947) * fix _pass_through_moderation_endpoint_factory * fix route_llm_request * doc moderations api * docs on /moderations * add e2e tests for moderations api * docs moderations api * test_pass_through_moderation_endpoint_factory * docs text completion	2024-11-27 16:30:48 -08:00
Ishaan Jaff	eba700a491	Revert "Revert "(feat) Allow using include to include external YAML files in a config.yaml (#6922)"" This reverts commit `5d13302e6b`.	2024-11-27 16:08:59 -08:00
Sara Han	8af5b11f54	docs: update the docs (#6923)	2024-11-28 03:43:20 +05:30
Krish Dholakia	2d2931a215	LiteLLM Minor Fixes & Improvements (11/26/2024) (#6913 ) * docs(config_settings.md): document all router_settings * ci(config.yml): add router_settings doc test to ci/cd * test: debug test on ci/cd * test: debug ci/cd test * test: fix test * fix(team_endpoints.py): skip invalid team object. don't fail `/team/list` call Causes downstream errors if ui just fails to load team list * test(base_llm_unit_tests.py): add 'response_format={"type": "text"}' test to base_llm_unit_tests adds complete coverage for all 'response_format' values to ci/cd * feat(router.py): support wildcard routes in `get_router_model_info()` Addresses https://github.com/BerriAI/litellm/issues/6914 * build(model_prices_and_context_window.json): add tpm/rpm limits for all gemini models Allows for ratelimit tracking for gemini models even with wildcard routing enabled Addresses https://github.com/BerriAI/litellm/issues/6914 * feat(router.py): add tpm/rpm tracking on success/failure to global_router Addresses https://github.com/BerriAI/litellm/issues/6914 * feat(router.py): support wildcard routes on router.get_model_group_usage() * fix(router.py): fix linting error * fix(router.py): implement get_remaining_tokens_and_requests Addresses https://github.com/BerriAI/litellm/issues/6914 * fix(router.py): fix linting errors * test: fix test * test: fix tests * docs(config_settings.md): add missing dd env vars to docs * fix(router.py): check if hidden params is dict	2024-11-28 00:01:38 +05:30
Ishaan Jaff	5d13302e6b	Revert "(feat) Allow using include to include external YAML files in a config.yaml (#6922)" This reverts commit `68e59824a3`.	2024-11-27 10:17:09 -08:00
Ishaan Jaff	68e59824a3	(feat) Allow using include to include external YAML files in a config.yaml (#6922 ) * add helper to process inlcudes directive on yaml * add doc on config management * unit tests for `include` on config.yaml	2024-11-26 20:27:12 -08:00
Ishaan Jaff	c285132ad6	(docs) Simplify `/vertex_ai/` pass through docs (#6910) * simplify vertex pass through docs * allow using known path for setting up pass throughs * add unit testing for vtx pass through auth	2024-11-25 23:57:50 -08:00
Krrish Dholakia	d26ad42f86	docs(router_architecture.md): add router architecture docs	2024-11-26 12:54:38 +05:30
Ishaan Jaff	c60261c3bc	(feat) Add support for using @google/generative-ai JS with LiteLLM Proxy (#6899 ) * feat - allow using gemini js SDK with LiteLLM * add auth for gemini_proxy_route * basic local test for js * test cost tagging gemini js requests * add js sdk test for gemini with litellm * add docs on gemini JS SDK * run node.js tests * fix google ai studio tests * fix vertex js spend test	2024-11-25 13:13:03 -08:00
Ishaan Jaff	f77bf49772	feat - allow sending `tags` on vertex pass through requests (#6876 ) * feat - allow tagging vertex JS SDK request * add unit testing for passing headers for pass through endpoints * fix allow using vertex_ai as the primary way for pass through vertex endpoints * docs on vertex js pass tags * add e2e test for vertex pass through with spend tags * add e2e tests for streaming vertex JS with tags * fix vertex ai testing	2024-11-25 12:12:09 -08:00
Ishaan Jaff	c73ce95c01	(feat) - provider budget improvements - ensure provider budgets work with multiple proxy instances + improve latency to ~90ms (#6886 ) * use 1 file for duration_in_seconds * add to readme.md * re use duration_in_seconds * fix importing _extract_from_regex, get_last_day_of_month * fix import * update provider budget routing * fix - remove dup test * add support for using in multi instance environments * test_in_memory_redis_sync_e2e * test_in_memory_redis_sync_e2e * fix test_in_memory_redis_sync_e2e * fix code quality check * fix test provider budgets * working provider budget tests * add fixture for provider budget routing * fix router testing for provider budgets * add comments on provider budget routing * use RedisPipelineIncrementOperation * add redis async_increment_pipeline * use redis async_increment_pipeline * use lower value for testing * use redis async_increment_pipeline * use consistent key name for increment op * add handling for budget windows * fix typing async_increment_pipeline * fix set attr * add clear doc strings * unit testing for provider budgets * test_redis_increment_pipeline	2024-11-24 16:36:19 -08:00
Ishaan Jaff	34bfebe470	(QOL improvement) Provider budget routing - allow using 1s, 1d, 1mo, 2mo etc (#6885) * use 1 file for duration_in_seconds * add to readme.md * re use duration_in_seconds * fix importing _extract_from_regex, get_last_day_of_month * fix import * update provider budget routing * fix - remove dup test	2024-11-23 16:59:46 -08:00
Ishaan Jaff	e69678a9b3	update doc title	2024-11-23 16:25:00 -08:00
Ishaan Jaff	afc69761de	docs - have 1 section for routing +load balancing (#6884) * docs - have 1 section for routing +load balancing * remove emoji	2024-11-23 15:56:57 -08:00
Krish Dholakia	424b8b0231	Litellm dev 11 23 2024 (#6881 ) * build(ui/create_key_button.tsx): support adding tags for cost tracking/routing when making key * LiteLLM Minor Fixes & Improvements (11/23/2024) (#6870) * feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc. * fix(utils.py): allow disabling end user cost tracking with new param Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small * docs(configs.md): add disable_end_user_cost_tracking reference to docs * feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role Enables admin to restrict key creation, and assign team admins to handle distributing keys * test(test_key_management.py): add unit testing for personal / team key restriction checks * docs: add docs on restricting key creation * docs(finetuned_models.md): add new guide on calling finetuned models * docs(input.md): cleanup anthropic supported params Closes https://github.com/BerriAI/litellm/issues/6856 * test(test_embedding.py): add test for passing extra headers via embedding * feat(cohere/embed): pass client to async embedding * feat(rerank.py): add `/v1/rerank` if missing for cohere base url Closes https://github.com/BerriAI/litellm/issues/6844 * fix(main.py): pass extra_headers param to openai Fixes https://github.com/BerriAI/litellm/issues/6836 * fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set Fixes issue where global callbacks - e.g. 
prometheus were overriden when langfuse was set dynamically * fix(handler.py): fix linting error * fix: fix typing * build: add conftest to proxy_admin_ui_tests/ * test: fix test * fix: fix linting errors * test: fix test * fix: fix pass through testing * feat(key_management_endpoints.py): allow proxy_admin to enforce params on key creation allows admin to force team keys to have tags * build(ui/): show teams in leftnav + allow team admin to add new members * build(ui/): show created tags in dropdown makes it easier for admin to add tags to keys * test(test_key_management.py): fix test * test: fix test * fix playwright e2e ui test * fix e2e ui testing deps * fix: fix linting errors * fix e2e ui testing * fix e2e ui testing, only run e2e ui testing in playwright --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-11-23 22:37:16 +05:30
Krish Dholakia	7e9d8b58f6	LiteLLM Minor Fixes & Improvements (11/23/2024) (#6870 ) * feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc. * fix(utils.py): allow disabling end user cost tracking with new param Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small * docs(configs.md): add disable_end_user_cost_tracking reference to docs * feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role Enables admin to restrict key creation, and assign team admins to handle distributing keys * test(test_key_management.py): add unit testing for personal / team key restriction checks * docs: add docs on restricting key creation * docs(finetuned_models.md): add new guide on calling finetuned models * docs(input.md): cleanup anthropic supported params Closes https://github.com/BerriAI/litellm/issues/6856 * test(test_embedding.py): add test for passing extra headers via embedding * feat(cohere/embed): pass client to async embedding * feat(rerank.py): add `/v1/rerank` if missing for cohere base url Closes https://github.com/BerriAI/litellm/issues/6844 * fix(main.py): pass extra_headers param to openai Fixes https://github.com/BerriAI/litellm/issues/6836 * fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically * fix(handler.py): fix linting error * fix: fix typing * build: add conftest to proxy_admin_ui_tests/ * test: fix test * fix: fix linting errors * test: fix test * fix: fix pass through testing	2024-11-23 15:17:40 +05:30
Ishaan Jaff	b2b3e40d13	(feat) use `@google-cloud/vertexai` js sdk with litellm (#6873 ) * stash gemini JS test * add vertex js sdj example * handle vertex pass through separately * tes vertex JS sdk * fix vertex_proxy_route * use PassThroughStreamingHandler * fix PassThroughStreamingHandler * use common _create_vertex_response_logging_payload_for_generate_content * test vertex js * add working vertex jest tests * move basic bass through test * use good name for test * test vertex * test_chunk_processor_yields_raw_bytes * unit tests for streaming * test_convert_raw_bytes_to_str_lines * run unit tests 1st * simplify local * docs add usage example for js * use get_litellm_virtual_key * add unit tests for vertex pass through	2024-11-22 16:50:10 -08:00
Ishaan Jaff	8856256730	fix doc format	2024-11-21 23:29:40 -08:00
Ishaan Jaff	14124bab45	docs - Send `litellm_metadata` (tags)	2024-11-21 21:46:49 -08:00
Ishaan Jaff	6717929206	(Feat) Allow passing `litellm_metadata` to pass through endpoints + Add e2e tests for /anthropic/ usage tracking (#6864) * allow passing _litellm_metadata in pass through endpoints * fix _create_anthropic_response_logging_payload * include litellm_call_id in logging * add e2e testing for anthropic spend logs * add testing for spend logs payload * add example with anthropic python SDK	2024-11-21 21:41:05 -08:00
Krrish Dholakia	2903fd4164	docs: update json mode docs	2024-11-22 03:00:45 +05:30
Krish Dholakia	7e5085dc7b	Litellm dev 11 21 2024 (#6837 ) * Fix Vertex AI function calling invoke: use JSON format instead of protobuf text format. (#6702) * test: test tool_call conversion when arguments is empty dict Fixes https://github.com/BerriAI/litellm/issues/6833 * fix(openai_like/handler.py): return more descriptive error message Fixes https://github.com/BerriAI/litellm/issues/6812 * test: skip overloaded model * docs(anthropic.md): update anthropic docs to show how to route to any new model * feat(groq/): fake stream when 'response_format' param is passed Groq doesn't support streaming when response_format is set * feat(groq/): add response_format support for groq Closes https://github.com/BerriAI/litellm/issues/6845 * fix(o1_handler.py): remove fake streaming for o1 Closes https://github.com/BerriAI/litellm/issues/6801 * build(model_prices_and_context_window.json): add groq llama3.2b model pricing Closes https://github.com/BerriAI/litellm/issues/6807 * fix(utils.py): fix handling ollama response format param Fixes https://github.com/BerriAI/litellm/issues/6848#issuecomment-2491215485 * docs(sidebars.js): refactor chat endpoint placement * fix: fix linting errors * test: fix test * test: fix test * fix(openai_like/handler): handle max retries * fix(streaming_handler.py): fix streaming check for openai-compatible providers * test: update test * test: correctly handle model is overloaded error * test: update test * test: fix test * test: mark flaky test --------- Co-authored-by: Guowang Li <Guowang@users.noreply.github.com>	2024-11-22 01:53:52 +05:30
Krish Dholakia	b11bc0374e	Litellm dev 11 20 2024 (#6838 ) * feat(customer_endpoints.py): support passing budget duration via `/customer/new` endpoint Closes https://github.com/BerriAI/litellm/issues/5651 * docs: add missing params to swagger + api documentation test * docs: add documentation for all key endpoints documents all params on swagger * docs(internal_user_endpoints.py): document all /user/new params Ensures all params are documented * docs(team_endpoints.py): add missing documentation for team endpoints Ensures 100% param documentation on swagger * docs(organization_endpoints.py): document all org params Adds documentation for all params in org endpoint * docs(customer_endpoints.py): add coverage for all params on /customer endpoints ensures all /customer/* params are documented * ci(config.yml): add endpoint doc testing to ci/cd * fix: fix internal_user_endpoints.py * fix(internal_user_endpoints.py): support 'duration' param * fix(partner_models/main.py): fix anthropic re-raise exception on vertex * fix: fix pydantic obj * build(model_prices_and_context_window.json): add new vertex claude model names vertex claude changed model names - causes cost tracking errors	2024-11-21 05:20:37 +05:30
Krish Dholakia	b0be5bf3a1	LiteLLM Minor Fixes & Improvements (11/19/2024) (#6820 ) * fix(anthropic/chat/transformation.py): add json schema as values: json_schema fixes passing pydantic obj to anthropic Fixes https://github.com/BerriAI/litellm/issues/6766 * (feat): Add timestamp_granularities parameter to transcription API (#6457) * Add timestamp_granularities parameter to transcription API * add param to the local test * fix(databricks/chat.py): handle max_retries optional param handling for openai-like calls Fixes issue with calling finetuned vertex ai models via databricks route * build(ui/): add team admins via proxy ui * fix: fix linting error * test: fix test * docs(vertex.md): refactor docs * test: handle overloaded anthropic model error * test: remove duplicate test * test: fix test * test: update test to handle model overloaded error --------- Co-authored-by: Show <35062952+BrunooShow@users.noreply.github.com>	2024-11-21 00:57:58 +05:30
Ishaan Jaff	7463dab9c6	(feat) provider budget routing improvements (#6827) * minor fix for provider budget * fix raise good error message when budget crossed for provider budget * fix test provider budgets * test provider budgets * feat - emit llm provider spend on prometheus * test_prometheus_metric_tracking * doc provider budgets	2024-11-19 21:25:08 -08:00
Ishaan Jaff	3c6fe21935	(Feat) Add provider specific budget routing (#6817 ) * add ProviderBudgetConfig * working test_provider_budgets_e2e_test * test_provider_budgets_e2e_test_expect_to_fail * use 1 cache read for getting provider spend * test_provider_budgets_e2e_test * add doc on provider budgets * clean up provider budgets * unit testing for provider budget routing * use as flag, not routing strat * fix init provider budget routing * use async_filter_deployments * fix test provider budgets * doc provider budget routing * doc provider budget routing * fix docs changes * fix comment	2024-11-19 20:25:27 -08:00
Ishaan Jaff	1890fde3f3	(Proxy) add support for DOCS_URL and REDOC_URL (#6806) * add support for DOCS_URL and REDOC_URL * document env vars * add unit tests for docs url and redocs url	2024-11-19 07:02:12 -08:00
Krrish Dholakia	7550aba474	docs(gemini.md): add embeddings as a supported endpoint for gemini models	2024-11-19 10:27:02 +05:30
Krish Dholakia	ba28e52ee8	Litellm lm studio embedding params (#6746 ) * fix(ollama.py): fix get model info request Fixes https://github.com/BerriAI/litellm/issues/6703 * feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param * docs(anthropic.md): document all supported openai params for anthropic * test: fix tests * fix: fix tests * feat(jina_ai/): add rerank support Closes https://github.com/BerriAI/litellm/issues/6691 * test: handle service unavailable error * fix(handler.py): refactor together ai rerank call * test: update test to handle overloaded error * test: fix test * Litellm router trace (#6742) * feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks * feat(router.py): log trace id across retry/fallback logic allows grouping llm logs for the same request * test: fix tests * fix: fix test * fix(transformation.py): only set non-none stop_sequences * Litellm router disable fallbacks (#6743) * bump: version 1.52.6 → 1.52.7 * feat(router.py): enable dynamically disabling fallbacks Allows for enabling/disabling fallbacks per key * feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key * test: fix test * fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error * fix(lm_studio/embed): support translating lm studio optional params ' * feat(auth_checks.py): fix auth check inside route - `/team/list` Fixes regression where non-admin w/ user_id=None able to query all teams * docs proxy_budget_rescheduler_min_time * helm run DISABLE_SCHEMA_UPDATE * docs helm pre sync hook * fix migration job.yaml * fix DATABASE_URL * use existing spec for migrations job * fix yaml on migrations job * fix migration job * update doc on pre sync hook * fix migrations-job.yaml * fix migration job * fix prisma migration * test - handle eol model claude-2, use claude-2.1 instead * (docs) add instructions on how to contribute to docker image * Update code blocks 
huggingface.md (#6737) * Update prefix.md (#6734) * fix test_supports_response_schema * mark Helm PreSyn as BETA * (Feat) Add support for storing virtual keys in AWS SecretManager (#6728) * add SecretManager to httpxSpecialProvider * fix importing AWSSecretsManagerV2 * add unit testing for writing keys to AWS secret manager * use KeyManagementEventHooks for key/generated events * us event hooks for key management endpoints * working AWSSecretsManagerV2 * fix write secret to AWS secret manager on /key/generate * fix KeyManagementSettings * use tasks for key management hooks * add async_delete_secret * add test for async_delete_secret * use _delete_virtual_keys_from_secret_manager * fix test secret manager * test_key_generate_with_secret_manager_call * fix check for key_management_settings * sync_read_secret * test_aws_secret_manager * fix sync_read_secret * use helper to check when _should_read_secret_from_secret_manager * test_get_secret_with_access_mode * test - handle eol model claude-2, use claude-2.1 instead * docs AWS secret manager * fix test_read_nonexistent_secret * fix test_supports_response_schema * ci/cd run again * LiteLLM Minor Fixes & Improvement (11/14/2024) (#6730) * fix(ollama.py): fix get model info request Fixes https://github.com/BerriAI/litellm/issues/6703 * feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param * docs(anthropic.md): document all supported openai params for anthropic * test: fix tests * fix: fix tests * feat(jina_ai/): add rerank support Closes https://github.com/BerriAI/litellm/issues/6691 * test: handle service unavailable error * fix(handler.py): refactor together ai rerank call * test: update test to handle overloaded error * test: fix test * Litellm router trace (#6742) * feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks * feat(router.py): log trace id across retry/fallback logic allows grouping llm logs for the same request * test: fix tests * 
fix: fix test * fix(transformation.py): only set non-none stop_sequences * Litellm router disable fallbacks (#6743) * bump: version 1.52.6 → 1.52.7 * feat(router.py): enable dynamically disabling fallbacks Allows for enabling/disabling fallbacks per key * feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key * test: fix test * fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error * test: handle gemini error * test: fix test * fix: new run * bump: version 1.52.7 → 1.52.8 * docs: add docs on jina ai rerank support * docs(reliability.md): add tutorial on disabling fallbacks per key * docs(logging.md): add 'trace_id' param to standard logging payload * (feat) add bedrock/stability.stable-image-ultra-v1:0 (#6723) * add stability.stable-image-ultra-v1:0 * add pricing for stability.stable-image-ultra-v1:0 * fix test_supports_response_schema * ci/cd run again * [Feature]: Stop swallowing up AzureOpenAi exception responses in litellm's implementation for a BadRequestError (#6745) * fix azure exceptions * test_bad_request_error_contains_httpx_response * test_bad_request_error_contains_httpx_response * use safe access to get exception response * fix get attr * [Feature]: json_schema in response support for Anthropic (#6748) * _convert_tool_response_to_message * fix ModelResponseIterator * fix test_json_response_format * test_json_response_format_stream * fix _convert_tool_response_to_message * use helper _handle_json_mode_chunk * fix _process_response * unit testing for test_convert_tool_response_to_message_no_arguments * update doc for JSON mode * fix: import audio check (#6740) * fix imagegeneration output_cost_per_image on model cost map (#6752) * (feat) Vertex AI - add support for fine tuned embedding models (#6749) * fix use fine tuned vertex embedding models * test_vertex_embedding_url * add _transform_openai_request_to_fine_tuned_embedding_request * add _transform_openai_request_to_fine_tuned_embedding_request 
* add transform_openai_request_to_vertex_embedding_request * add _transform_vertex_response_to_openai_for_fine_tuned_models * test_vertexai_embedding for ft models * fix test_vertexai_embedding_finetuned * doc fine tuned / custom embedding models * fix test test_partner_models_httpx * bump: version 1.52.8 → 1.52.9 * LiteLLM Minor Fixes & Improvements (11/13/2024) (#6729) * fix(utils.py): add logprobs support for together ai Fixes https://github.com/BerriAI/litellm/issues/6724 * feat(pass_through_endpoints/): add anthropic/ pass-through endpoint adds new `anthropic/` pass-through endpoint + refactors docs * feat(spend_management_endpoints.py): allow /global/spend/report to query team + customer id enables seeing spend for a customer in a team * Add integration with MLflow Tracing (#6147) * Add MLflow logger Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Streaming handling Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * lint Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * address comments and fix issues Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * address comments and fix issues Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Move logger construction code Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Add docs Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * async handlers Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * new picture Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * fix(mlflow.py): fix ruff linting errors * ci(config.yml): add mlflow to ci testing * fix: fix test * test: fix test * Litellm key update fix (#6710) * fix(caching): convert arg to equivalent kwargs in llm caching handler prevent unexpected errors * fix(caching_handler.py): don't pass args to caching * fix(caching): remove all args from caching.py fix(caching): consistent function signatures + abc method * test(caching_unit_tests.py): add unit 
tests for llm caching ensures coverage for common caching scenarios across different implementations * refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one * fix(router.py): drop redis password requirement * fix(proxy_server.py): fix faulty slack alerting check * fix(langfuse.py): avoid copying functions/thread lock objects in metadata fixes metadata copy error when parent otel span in metadata * test: update test * fix(key_management_endpoints.py): fix /key/update with metadata update * fix(key_management_endpoints.py): fix key_prepare_update helper * fix(key_management_endpoints.py): reset value to none if set in key update * fix: update test ' * Litellm dev 11 11 2024 (#6693) * fix(__init__.py): add 'watsonx_text' as mapped llm api route Fixes https://github.com/BerriAI/litellm/issues/6663 * fix(opentelemetry.py): fix passing parallel tool calls to otel Fixes https://github.com/BerriAI/litellm/issues/6677 * refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling reduces bugs in repo * fix(__init__.py): update provider-model mapping to include all known provider-model mappings Fixes https://github.com/BerriAI/litellm/issues/6669 * feat(anthropic): support passing document in llm api call * docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function * fix(factory.py): fix linting error * add clear doc string for GCS bucket logging * Add docs to export logs to Laminar (#6674) * Add docs to export logs to Laminar * minor fix: newline at end of file * place laminar after http and grpc * (Feat) Add langsmith key based logging (#6682) * add langsmith_api_key to StandardCallbackDynamicParams * create a file for langsmith types * langsmith add key / team based logging * add key based logging for langsmith * fix langsmith key based logging * fix linting langsmith * remove NOQA violation * add unit test 
coverage for all helpers in test langsmith * test_langsmith_key_based_logging * docs langsmith key based logging * run langsmith tests in logging callback tests * fix logging testing * test_langsmith_key_based_logging * test_add_callback_via_key_litellm_pre_call_utils_langsmith * add debug statement langsmith key based logging * test_langsmith_key_based_logging * (fix) OpenAI's optional messages[].name does not work with Mistral API (#6701) * use helper for _transform_messages mistral * add test_message_with_name to base LLMChat test * fix linting * add xAI on Admin UI (#6680) * (docs) add benchmarks on 1K RPS (#6704) * docs litellm proxy benchmarks * docs GCS bucket * doc fix - reduce clutter on logging doc title * (feat) add cost tracking stable diffusion 3 on Bedrock (#6676) * add cost tracking for sd3 * test_image_generation_bedrock * fix get model info for image cost * add cost_calculator for stability 1 models * add unit testing for bedrock image cost calc * test_cost_calculator_with_no_optional_params * add test_cost_calculator_basic * correctly allow size Optional * fix cost_calculator * sd3 unit tests cost calc * fix raise correct error 404 when /key/info is called on non-existent key (#6653) * fix raise correct error on /key/info * add not_found_error error * fix key not found in DB error * use 1 helper for checking token hash * fix error code on key info * fix test key gen prisma * test_generate_and_call_key_info * test fix test_call_with_valid_model_using_all_models * fix key info tests * bump: version 1.52.4 → 1.52.5 * add defaults used for GCS logging * LiteLLM Minor Fixes & Improvements (11/12/2024) (#6705) * fix(caching): convert arg to equivalent kwargs in llm caching handler prevent unexpected errors * fix(caching_handler.py): don't pass args to caching * fix(caching): remove all args from caching.py fix(caching): consistent function signatures + abc method * test(caching_unit_tests.py): add unit tests for llm caching ensures coverage for common 
caching scenarios across different implementations * refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one * fix(router.py): drop redis password requirement * fix(proxy_server.py): fix faulty slack alerting check * fix(langfuse.py): avoid copying functions/thread lock objects in metadata fixes metadata copy error when parent otel span in metadata * test: update test * bump: version 1.52.5 → 1.52.6 * (feat) helm hook to sync db schema (#6715) * v0 migration job * fix job * fix migrations job.yml * handle standalone DB on helm hook * fix argo cd annotations * fix db migration helm hook * fix migration job * doc fix Using Http/2 with Hypercorn * (fix proxy redis) Add redis sentinel support (#6154) * add sentinel_password support * add doc for setting redis sentinel password * fix redis sentinel - use sentinel password * Fix: Update gpt-4o costs to that of gpt-4o-2024-08-06 (#6714) Fixes #6713 * (fix) using Anthropic `response_format={"type": "json_object"}` (#6721) * add support for response_format=json anthropic * add test_json_response_format to baseLLM ChatTest * fix test_litellm_anthropic_prompt_caching_tools * fix test_anthropic_function_call_with_no_schema * test test_create_json_tool_call_for_response_format * (feat) Add cost tracking for Azure Dall-e-3 Image Generation + use base class to ensure basic image generation tests pass (#6716) * add BaseImageGenTest * use 1 class for unit testing * add debugging to BaseImageGenTest * TestAzureOpenAIDalle3 * fix response_cost_calculator * test_basic_image_generation * fix img gen basic test * fix _select_model_name_for_cost_calc * fix test_aimage_generation_bedrock_with_optional_params * fix undo changes cost tracking * fix response_cost_calculator * fix test_cost_azure_gpt_35 * fix remove dup test (#6718) * (build) update db helm hook * (build) helm db pre sync hook * (build) helm db sync hook * test: run test_team_logging firdst --------- Co-authored-by: Ishaan Jaff 
<ishaanjaffer0324@gmail.com> Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com> Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de> * test: update test * test: skip anthropic overloaded error * test: cleanup test * test: update tests * test: fix test * test: handle gemini overloaded model error * test: handle internal server error * test: handle anthropic overloaded error * test: handle claude instability --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com> Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de> --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Jongseob Jeon <aiden.jongseob@gmail.com> Co-authored-by: Camden Clark <camdenaws@gmail.com> Co-authored-by: Rasswanth <61219215+IamRash-7@users.noreply.github.com> Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com> Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com> Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>	2024-11-19 09:54:50 +05:30
Ishaan Jaff	994fb51016	Docs - use 1 page for all logging integrations on proxy + add logging features at top level (#6805) * use 1 page for bucket logging * docs logging proxy * remove dup doc * docs fix emoji * docs team logging	2024-11-18 18:35:52 -08:00
Ishaan Jaff	7bb5304323	(docs) simplify left nav names + use a section for `making llm requests` (#6799) * fix emojis on docs * add section on making LLM requests * docs simplify sidebar	2024-11-18 12:53:43 -08:00
Ishaan Jaff	bbdec2995a	(docs improvement) remove emojis, use `guides` section, categorize uncategorized docs (#6796) * proxy - use Setup & Deployment category * fix emoji * use guides section to user facing usage * docs - remove emojis * use 1 quick start	2024-11-18 12:23:54 -08:00
Ishaan Jaff	f43768d617	(fix) httpx handler - bind to ipv4 for httpx handler (#6785) * bind to ipv4 on httpx handler * add force_ipv4 * use helper for _create_async_transport * fix circular import * document force_ipv4 * test_async_http_handler_force_ipv4	2024-11-18 12:22:51 -08:00
Ishaan Jaff	627d6a6614	(Doc) Add section on what is stored in the DB + Add clear section on key/team based logging (#6769) * fix clean up logging * add Default Login, Logout URLs * docs on tables in DB schema * docs add section on What is stored in the DB	2024-11-16 10:28:44 -08:00
Emmanuel Ferdman	9cf3dcbbf3	Update routing references (#6758) * Update routing references Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> * Update routing references Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com> --------- Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>	2024-11-16 08:28:44 -08:00
Ishaan Jaff	41aade2cc0	(feat) Use `litellm/` prefix when storing virtual keys in AWS secret manager (#6765) * fix - storing AWS keys in secret manager * fix test_key_generate_with_secret_manager_call * allow using prefix_for_stored_virtual_keys * add prefix_for_stored_virtual_keys * test_key_generate_with_secret_manager_call	2024-11-15 18:07:43 -08:00

3415 commits