Commit graph

1901 commits

Krish Dholakia
6005450c8f
LiteLLM Minor Fixes & Improvements (10/09/2024) (#6139)
* fix(utils.py): don't return 'none' response headers

Fixes https://github.com/BerriAI/litellm/issues/6123

* fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls

Fixes https://github.com/BerriAI/litellm/issues/6136

* fix(cost_calculator.py): set default character value to none

Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196

* fix(google.py): fix cost per token / cost per char conversion

Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287

* build(model_prices_and_context_window.json): update gemini pricing

Fixes https://github.com/BerriAI/litellm/issues/6133

* build(model_prices_and_context_window.json): update gemini pricing

* fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled

Stores unredacted response in cache

* build(model_prices_and_context_window.json): update gemini-1.5-flash pricing

* fix(cost_calculator.py): fix default prompt_character count logic

Fixes error in gemini cost calculation

* fix(cost_calculator.py): fix cost calc for tts models
2024-10-10 00:42:11 -07:00
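
The Gemini fixes in the entry above center on LiteLLM's character-vs-token pricing conversion. A minimal sketch of checking spend after a Gemini call with the public `litellm.completion_cost()` helper, assuming a valid `GEMINI_API_KEY` is set (the model name is illustrative):

```python
import litellm

# Any Gemini call; pricing comes from model_prices_and_context_window.json.
response = litellm.completion(
    model="gemini/gemini-1.5-flash",
    messages=[{"role": "user", "content": "Hello!"}],
)

# For Gemini, per-character prices in the pricing map are converted to a
# per-token figure internally before the total is computed.
cost = litellm.completion_cost(completion_response=response)
print(f"request cost: ${cost:.8f}")
```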
Krish Dholakia
9695c1af10
LiteLLM Minor Fixes & Improvements (10/08/2024) (#6119)
* refactor(cost_calculator.py): move error line to debug - https://github.com/BerriAI/litellm/issues/5683#issuecomment-2398599498

* fix(migrate-hidden-params-to-read-from-standard-logging-payload): Fixes https://github.com/BerriAI/litellm/issues/5546#issuecomment-2399994026

* fix(types/utils.py): mark weight as a litellm param

Fixes https://github.com/BerriAI/litellm/issues/5781

* feat(internal_user_endpoints.py): fix /user/info + show user max budget as default max budget

Fixes https://github.com/BerriAI/litellm/issues/6117

* feat: support returning team member budget in `/user/info`

Sets user max budget in team as max budget on ui

  Closes https://github.com/BerriAI/litellm/issues/6117

* bug fix for optional parameter passing to replicate (#6067)

Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>

* fix(o1_transformation.py): handle o1 temperature=0

o1 doesn't support temp=0, allow admin to drop this param

* test: fix test

---------

Signed-off-by: Mandana Vaziri <mvaziri@us.ibm.com>
Co-authored-by: Mandana Vaziri <mvaziri@us.ibm.com>
2024-10-08 21:57:03 -07:00
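
A sketch of the o1 temperature handling from the entry above: o1 rejects `temperature=0`, so `drop_params=True` asks LiteLLM to drop unsupported params instead of forwarding them (assumes `OPENAI_API_KEY` is set):

```python
import litellm

response = litellm.completion(
    model="o1-preview",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
    temperature=0,     # unsupported by o1
    drop_params=True,  # drop it rather than raise or forward it
)
print(response.choices[0].message.content)
```

The same behavior can be enabled globally with `litellm.drop_params = True`.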
Krish Dholakia
6729c9ca7f
LiteLLM Minor Fixes & Improvements (10/07/2024) (#6101)
* fix(utils.py): support dropping temperature param for azure o1 models

* fix(main.py): handle azure o1 streaming requests

o1 doesn't support streaming, fake it to ensure code works as expected

* feat(utils.py): expose `hosted_vllm/` endpoint, with tool handling for vllm

Fixes https://github.com/BerriAI/litellm/issues/6088

* refactor(internal_user_endpoints.py): cleanup unused params + update docstring

Closes https://github.com/BerriAI/litellm/issues/6100

* fix(main.py): expose custom image generation api support

Fixes https://github.com/BerriAI/litellm/issues/6097

* fix: fix linting errors

* docs(custom_llm_server.md): add docs on custom api for image gen calls

* fix(types/utils.py): handle dict type

* fix(types/utils.py): fix linting errors
2024-10-07 22:17:22 -07:00
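
The `hosted_vllm/` route exposed in the entry above points LiteLLM at any self-hosted vLLM server's OpenAI-compatible endpoint. A hedged sketch; the `api_base` URL and model name are placeholders for your own deployment:

```python
import litellm

response = litellm.completion(
    model="hosted_vllm/meta-llama/Llama-3.1-8B-Instruct",  # your served model
    api_base="http://localhost:8000/v1",                   # your vLLM server
    messages=[{"role": "user", "content": "Summarize vLLM in one line."}],
)
print(response.choices[0].message.content)
```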
Krish Dholakia
49d8b2be46
fix(utils.py): fix pydantic obj to schema creation for vertex en… (#6071)
* fix(utils.py): fix pydantic obj to schema creation for vertex endpoints

Fixes https://github.com/BerriAI/litellm/issues/6027

* test(test_completion.py): skip test - avoid hitting gemini rate limits

* fix(common_utils.py): fix ruff linting error
2024-10-06 00:25:55 -04:00
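
The fix above concerns converting a Pydantic model into a Vertex/Gemini-compatible JSON schema. A sketch of the calling pattern it repairs, assuming `GEMINI_API_KEY` is set (model name illustrative):

```python
from pydantic import BaseModel
import litellm

class CalendarEvent(BaseModel):
    name: str
    day: str
    participants: list[str]

response = litellm.completion(
    model="gemini/gemini-1.5-pro",
    messages=[{"role": "user", "content": "Alice and Bob meet Friday for standup."}],
    # LiteLLM converts the Pydantic class into a Gemini response schema.
    response_format=CalendarEvent,
)
print(response.choices[0].message.content)  # JSON matching CalendarEvent
```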
Krish Dholakia
f2c0a31e3c
LiteLLM Minor Fixes & Improvements (10/05/2024) (#6083)
* docs(prompt_caching.md): add prompt caching cost calc example to docs

* docs(prompt_caching.md): add proxy examples to docs

* feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching

* docs(prompt_caching.md): add docs on checking model support for prompt caching

* build: fix invalid json
2024-10-05 18:59:11 -04:00
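
The `supports_prompt_caching()` helper introduced above can be exercised directly; it consults LiteLLM's model metadata map, so results depend on the shipped pricing file:

```python
import litellm

for model in ["anthropic/claude-3-5-sonnet-20240620", "gpt-3.5-turbo"]:
    print(model, "->", litellm.supports_prompt_caching(model=model))
```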
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors

* fix: fix additional type-checking errors

* fix: additional type-checking error fixes

* fix: fix additional type-checking errors

* fix: additional type-check fixes

* fix: fix all type-checking errors + add pyright to ci/cd

* fix: fix incorrect import

* ci(config.yml): use mypy on ci/cd

* fix: fix type-checking errors in utils.py

* fix: fix all type-checking errors on main.py

* fix: fix mypy linting errors

* fix(anthropic/cost_calculator.py): fix linting errors

* fix: fix mypy linting errors

* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff
7267852511 linting error fix 2024-10-05 15:03:39 +05:30
Ishaan Jaff
3682f661d8
(feat) add cost tracking for OpenAI prompt caching (#6055)
* add cache_read_input_token_cost for prompt caching models

* add prompt caching for latest models

* add openai cost calculator

* add openai prompt caching test

* fix lint check

* add note on how usage._cache_read_input_tokens is used

* fix cost calc whisper openai

* use output_cost_per_second

* add input_cost_per_second
2024-10-05 14:20:15 +05:30
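
The cost split behind `cache_read_input_token_cost` above, as back-of-the-envelope arithmetic (all prices here are illustrative placeholders, not the shipped pricing-map values):

```python
cache_read_input_token_cost = 1.25e-06  # $/cached prompt token (example)
input_cost_per_token = 2.50e-06         # $/uncached prompt token (example)
output_cost_per_token = 1.00e-05        # $/completion token (example)

prompt_tokens, cached_tokens, completion_tokens = 2048, 1024, 256

cost = (
    (prompt_tokens - cached_tokens) * input_cost_per_token
    + cached_tokens * cache_read_input_token_cost  # discounted cache reads
    + completion_tokens * output_cost_per_token
)
print(f"${cost:.6f}")  # $0.006400
```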
Ishaan Jaff
670ecda4e2
(fixes) gcs bucket key based logging (#6044)
* fixes for gcs bucket logging

* fix StandardCallbackDynamicParams

* fix - gcs logging when payload is not serializable

* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket

* working success callbacks

* linting fixes

* fix linting error

* add type hints to functions

* fixes for dynamic success and failure logging

* fix for test_async_chat_openai_stream
2024-10-04 11:56:10 +05:30
Krish Dholakia
09f0c09ba4
fix(utils.py): return openai streaming prompt caching tokens (#6051)
* fix(utils.py): return openai streaming prompt caching tokens

Closes https://github.com/BerriAI/litellm/issues/6038

* fix(main.py): fix error in finish_reason updates
2024-10-03 22:20:13 -04:00
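
To observe the streaming prompt-caching tokens returned by the fix above, request usage on the stream; per OpenAI semantics, only the final chunk carries a usage block (assumes `OPENAI_API_KEY` is set):

```python
import litellm

stream = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
    stream_options={"include_usage": True},  # ask for usage on the stream
)

usage = None
for chunk in stream:
    if getattr(chunk, "usage", None):  # only the last chunk has usage
        usage = chunk.usage

if usage and getattr(usage, "prompt_tokens_details", None):
    print("cached prompt tokens:", usage.prompt_tokens_details.cached_tokens)
```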
Ishaan Jaff
d92696a303
(feat) add nvidia nim embeddings (#6032)
* nvidia nim support embedding config

* add nvidia config in init

* nvidia nim embeddings

* docs nvidia nim embeddings

* docs embeddings on nvidia nim

* fix llm translation test
2024-10-03 17:12:14 +05:30
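
A sketch of the NVIDIA NIM embeddings support added above; the model name is one of NVIDIA's retrieval embedding models and stands in for whatever your NIM endpoint serves (assumes `NVIDIA_NIM_API_KEY` is configured):

```python
import litellm

response = litellm.embedding(
    model="nvidia_nim/nvidia/nv-embedqa-e5-v5",
    input=["What is the capital of France?"],
)
print(len(response.data[0]["embedding"]))  # embedding dimension
```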
Ishaan Jaff
21e05a0f3e
(feat proxy) add key based logging for GCS bucket (#6031)
* init litellm langfuse / gcs credentials in litellm logging obj

* add gcs key based test

* rename vars

* save standard_callback_dynamic_params in model call details

* add working gcs bucket key based logging

* test_basic_gcs_logging_per_request

* linting fix

* add doc on gcs bucket team based logging
2024-10-03 15:24:31 +05:30
Krish Dholakia
14165d3648
LiteLLM Minor Fixes & Improvements (10/02/2024) (#6023)
* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettysburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-02 22:00:28 -04:00
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Krrish Dholakia
5fbcdd8b11 fix(utils.py): fix updating hidden params 2024-09-28 21:08:15 -07:00
Krrish Dholakia
b0eff0b84f fix(router.py): handle setting response headers during retries 2024-09-28 21:08:15 -07:00
Krrish Dholakia
d64e971d8c fix(azure): return response headers for sync embedding calls 2024-09-28 21:08:15 -07:00
Krrish Dholakia
55d7bc7f32 fix(utils.py): guarantee openai-compatible headers always exist in response
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
498e14ba59 fix(return-openai-compatible-headers): v0 is openai, azure, anthropic
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krish Dholakia
0b30e212da
LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938)
* fix(langfuse.py): prevent double logging requester metadata

Fixes https://github.com/BerriAI/litellm/issues/5935

* build(model_prices_and_context_window.json): add mistral pixtral cost tracking

Closes https://github.com/BerriAI/litellm/issues/5837

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839

* feat(anthropic/chat.py): return 'retry-after' headers from anthropic

Fixes https://github.com/BerriAI/litellm/issues/4387

* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock

Closes https://github.com/BerriAI/litellm/issues/5747

* [Feature]#5940, add max_workers parameter for the batch_completion (#5947)

* handle streaming for azure ai studio error

* bump: version 1.48.2 → 1.48.3

* docs(data_security.md): add legal/compliance faq's

Make it easier for companies to use litellm

* docs: resolve imports

* [Feature]#5940, add max_workers parameter for the batch_completion method

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>

* fix(converse_transformation.py): fix default message value

* fix(utils.py): fix get_model_info to handle finetuned models

Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models

* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks

* fix: fix linting errors

* fix(anthropic/chat/handler.py): fix cache creation input tokens

* fix(exception_mapping_utils.py): fix missing imports

* fix(anthropic/chat/handler.py): fix usage block translation

* test: fix test

* test: fix tests

* style(types/utils.py): trigger new build

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
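
Among the changes folded into the entry above is #5947's `max_workers` parameter for `batch_completion`, which fans prompts out over a thread pool. A hedged sketch (assumes `OPENAI_API_KEY` is set):

```python
import litellm

responses = litellm.batch_completion(
    model="gpt-4o-mini",
    messages=[
        [{"role": "user", "content": f"Say the number {i}"}] for i in range(8)
    ],
    max_workers=4,  # cap concurrent worker threads
)
print(len(responses))  # one response per message list
```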
Krish Dholakia
bd17424c4b
LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) (#5937)
* LiteLLM Minor Fixes & Improvements (09/26/2024)  (#5925)

* fix(litellm_logging.py): don't initialize prometheus_logger if non premium user

Prevents bad error messages in logs

Fixes https://github.com/BerriAI/litellm/issues/5897

* Add Support for Custom Providers in Vision and Function Call Utils (#5688)

* Add Support for Custom Providers in Vision and Function Call Utils Lookup

* Remove parallel function call due to missing model info param

* Add Unit Tests for Vision and Function Call Changes

* fix-#5920: set header value to string to fix "'int' object has no att… (#5922)

* LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)

* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util

* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'"

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926)

This reverts commit a554ae2695.

* build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing

Enables cost tracking for azure ai cohere rerank models

* fix(litellm_logging.py): fix debug log to be clearer

Closes https://github.com/BerriAI/litellm/issues/5909

* test(test_utils.py): fix test name

* fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models

* fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints

* fix(converse_handler.py): support new llama 3-2 models

Fixes https://github.com/BerriAI/litellm/issues/5901

* fix(litellm_logging.py): ensure response is redacted for standard message logging

Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360

* fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation

allows user to set custom cost for model

* fix(config.yml): fix docker hub auth

* build(config.yml): add docker auth to all tests

* fix(db/create_views.py): fix linting error

* fix(main.py): fix circular import

* fix(azure_ai/__init__.py): fix circular import

* fix(main.py): fix import

* fix: fix linting errors

* test: fix test

* fix(proxy_server.py): pass premium user value on startup

used for prometheus init

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* test: fix test

* test(test_rerank.py): fix test

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-27 17:54:13 -07:00
Krish Dholakia
a1d9e96b31
LiteLLM Minor Fixes & Improvements (09/25/2024) (#5893)
* fix(langfuse.py): support new langfuse prompt_chat class init params

* fix(langfuse.py): handle new init values on prompt chat + prompt text templates

fixes error caused during langfuse logging

* docs(openai_compatible.md): clarify `openai/` handles correct routing for `/v1/completions` route

Fixes https://github.com/BerriAI/litellm/issues/5876

* fix(utils.py): handle unmapped gemini model optional param translation

Fixes https://github.com/BerriAI/litellm/issues/5888

* fix(o1_transformation.py): fix o-1 validation, to not raise error if temperature=1

Fixes https://github.com/BerriAI/litellm/issues/5884

* fix(prisma_client.py): refresh iam token

Fixes https://github.com/BerriAI/litellm/issues/5896

* fix: pass drop params where required

* fix(utils.py): pass drop_params correctly

* fix(types/vertex_ai.py): fix generation config

* test(test_max_completion_tokens.py): fix test

* fix(vertex_and_google_ai_studio_gemini.py): fix map openai params
2024-09-26 16:41:44 -07:00
Krish Dholakia
16c0307eab
LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)
* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util
2024-09-25 22:11:57 -07:00
Krish Dholakia
8039b95aaf
LiteLLM Minor Fixes & Improvements (09/21/2024) (#5819)
* fix(router.py): fix error message

* Litellm disable keys (#5814)

* build(schema.prisma): allow blocking/unblocking keys

Fixes https://github.com/BerriAI/litellm/issues/5328

* fix(key_management_endpoints.py): fix pop

* feat(auth_checks.py): allow admin to enable/disable virtual keys

Closes https://github.com/BerriAI/litellm/issues/5328

* docs(vertex.md): add auth section for vertex ai

Addresses - https://github.com/BerriAI/litellm/issues/5768#issuecomment-2365284223

* build(model_prices_and_context_window.json): show which models support prompt_caching

Closes https://github.com/BerriAI/litellm/issues/5776

* fix(router.py): allow setting default priority for requests

* fix(router.py): add 'retry-after' header for concurrent request limit errors

Fixes https://github.com/BerriAI/litellm/issues/5783

* fix(router.py): correctly raise and use retry-after header from azure+openai

Fixes https://github.com/BerriAI/litellm/issues/5783

* fix(user_api_key_auth.py): fix valid token being none

* fix(auth_checks.py): fix model dump for cache management object

* fix(user_api_key_auth.py): pass prisma_client to obj

* test(test_otel.py): update test for new key check

* test: fix test
2024-09-21 18:51:53 -07:00
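
The key blocking added in "Litellm disable keys" above is exposed through the proxy's management API. A hedged sketch; the endpoint paths and payload shape are inferred from the commit bullets, and the URL and keys are placeholders:

```python
import requests

PROXY = "http://localhost:4000"
HEADERS = {"Authorization": "Bearer sk-admin-key"}  # proxy admin key

# Block a virtual key, then unblock it.
requests.post(f"{PROXY}/key/block", headers=HEADERS, json={"key": "sk-user-key"})
requests.post(f"{PROXY}/key/unblock", headers=HEADERS, json={"key": "sk-user-key"})
```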
Ishaan Jaff
b987d99486 handle hf rate limit error 2024-09-21 18:00:18 -07:00
Ishaan Jaff
1d630b61ad
[Feat] Add fireworks AI embedding (#5812)
* add fireworks embedding models

* add fireworks ai

* fireworks ai embeddings support

* is_fireworks_embedding_model

* working fireworks embeddings

* fix health check * models

* fix embedding get optional params

* fix linting errors

* fix pick_cheapest_chat_model_from_llm_provider

* add fireworks ai litellm provider

* docs fireworks embedding models

* fixes for when azure ad token is passed
2024-09-20 22:23:28 -07:00
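
A sketch of the Fireworks AI embeddings support above; assumes `FIREWORKS_AI_API_KEY` is set and the model name matches one available to your account:

```python
import litellm

response = litellm.embedding(
    model="fireworks_ai/nomic-ai/nomic-embed-text-v1.5",
    input=["hello from litellm"],
)
print(len(response.data[0]["embedding"]))
```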
Krish Dholakia
3933fba41f
LiteLLM Minor Fixes & Improvements (09/19/2024) (#5793)
* fix(model_prices_and_context_window.json): add cost tracking for more vertex llama3.1 model

8b and 70b models

* fix(proxy/utils.py): handle data being none on pre-call hooks

* fix(proxy/): create views on initial proxy startup

fixes base case, where user starts proxy for first time

 Fixes https://github.com/BerriAI/litellm/issues/5756

* build(config.yml): fix vertex version for test

* feat(ui/): support enabling/disabling slack alerting

Allows admin to turn on/off slack alerting through ui

* feat(rerank/main.py): support langfuse logging

* fix(proxy/utils.py): fix linting errors

* fix(langfuse.py): log clean metadata

* test(tests): replace deprecated openai model
2024-09-20 08:19:52 -07:00
Krish Dholakia
d46660ea0f
LiteLLM Minor Fixes & Improvements (09/18/2024) (#5772)
* fix(proxy_server.py): fix azure key vault logic to not require client id/secret

* feat(cost_calculator.py): support fireworks ai cost tracking

* build(docker-compose.yml): add lines for mounting config.yaml to docker compose

Closes https://github.com/BerriAI/litellm/issues/5739

* fix(input.md): update docs to clarify litellm supports content as a list of dictionaries

Fixes https://github.com/BerriAI/litellm/issues/5755

* fix(input.md): update input.md to include all message values

* fix(image_handling.py): follow image url redirects

Fixes https://github.com/BerriAI/litellm/issues/5763

* fix(router.py): Fix model key/base leak in error message

Fixes https://github.com/BerriAI/litellm/issues/5762

* fix(http_handler.py): fix linting error

* fix(azure.py): fix logging to show azure_ad_token being used

Fixes https://github.com/BerriAI/litellm/issues/5767

* fix(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* feat(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* test(test_completion_cost.py): fix test

* Databricks Integration: Integrate Databricks SDK as optional mechanism for fetching API base and token, if unspecified (#5746)

* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view

Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix: fix import

* Fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* DB test

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Coverage

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* progress

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix test name

Signed-off-by: dbczumar <corey.zumar@databricks.com>

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* test: fix test

* test(test_databricks.py): fix test

* fix(databricks/chat.py): handle custom endpoint (e.g. sagemaker)

* Apply code scanning fix for clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix(__init__.py): fix known fireworks ai models

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2024-09-19 13:25:29 -07:00
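
What the Redis Sentinel support in the entry above buys you, illustrated with redis-py's own `Sentinel` API (LiteLLM wires this up from its cache config; the hostnames and the `mymaster` service name are placeholders):

```python
from redis.sentinel import Sentinel

sentinel = Sentinel([("sentinel-1", 26379), ("sentinel-2", 26379)])
master = sentinel.master_for("mymaster")   # writes go to the elected master
replica = sentinel.slave_for("mymaster")   # reads may be served by replicas

master.set("litellm:healthcheck", "ok")
print(replica.get("litellm:healthcheck"))
```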
Krish Dholakia
dd602753c0
Litellm fix router testing (#5748)
* test: fix testing - azure changed content policy error logic

* test: fix tests to use mock responses

* test(test_image_generation.py): handle api instability

* test(test_image_generation.py): handle azure api instability

* fix(utils.py): fix unbounded variable error

* fix(utils.py): fix unbounded variable error

* test: refactor test to use mock response

* test: mark flaky azure tests
2024-09-17 18:02:23 -07:00
Krish Dholakia
234185ec13
LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723) (#5731)
* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view

Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix: fix import

* test(test_databricks.py): fix databricks tests

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
Krish Dholakia
da77706c26
Litellm stable dev (#5711)
* feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic

Leads to 100ms perf boost in local testing

* fix(base_aws_llm.py): fix credential caching check to see if token is set

* refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic

Make it easier to see how requests are transformed for /converse

* fix: fix imports

* fix(bedrock/embed): fix reordering of headers

* fix(base_aws_llm.py): fix get credential logic

* fix(converse_handler.py): fix ai21 streaming response
2024-09-14 23:22:59 -07:00
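
The boto3 credential caching in the entry above (the ~100ms win) follows a generic memoize-with-TTL pattern. A sketch under stated assumptions; the TTL and cache key are illustrative, not LiteLLM's exact implementation:

```python
import time
import boto3

_CRED_CACHE = {}  # (region, profile) -> (credentials, expiry_ts)

def get_cached_credentials(region, profile=None, ttl=3600):
    key = (region, profile)
    cached = _CRED_CACHE.get(key)
    if cached and cached[1] > time.time():
        return cached[0]  # reuse instead of re-resolving on every request
    session = boto3.Session(region_name=region, profile_name=profile)
    creds = session.get_credentials()
    _CRED_CACHE[key] = (creds, time.time() + ttl)
    return creds
```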
Ishaan Jaff
c8d15544c8
[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698)
* move cooldown logic to its own helper

* add new track deployment metrics folder

* increment success, fails for deployment in current minute

* fix cooldown logic

* fix test_aaarouter_dynamic_cooldown_message_retry_time

* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls

* clean up get from deployment test

* fix _async_get_healthy_deployments

* add mock InternalServerError

* test deployment failing 25% requests

* add test_high_traffic_cooldowns_one_bad_deployment

* fix vertex load test

* add test for rate limit error models in cool down

* change default cooldown time

* fix cooldown message time

* fix cooldown on 429 error

* fix doc string for _should_cooldown_deployment

* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
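
The percent-threshold cooldown rule described above, reduced to its arithmetic: a deployment cools down when its failure rate in the current minute crosses a threshold, not after a fixed count of allowed fails. Numbers are illustrative:

```python
def should_cooldown(fails: int, successes: int, threshold: float = 0.25) -> bool:
    total = fails + successes
    if total == 0:
        return False
    return (fails / total) >= threshold

print(should_cooldown(fails=1, successes=99))   # False: 1% failure rate
print(should_cooldown(fails=25, successes=75))  # True: 25% failure rate
```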
Ishaan Jaff
85acdb9193
[Feat] Add max_completion_tokens param (#5691)
* add max_completion_tokens

* add max_completion_tokens

* add max_completion_tokens support for OpenAI models

* add max_completion_tokens param

* add max_completion_tokens for bedrock converse models

* add test for converse maxTokens

* fix openai o1 param mapping test

* move test optional params

* add max_completion_tokens for anthropic api

* fix conftest

* add max_completion tokens for vertex ai partner models

* add max_completion_tokens for fireworks ai

* add max_completion_tokens for hf rest api

* add test for param mapping

* add param mapping for vertex, gemini + testing

* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd

* add max_completion_tokens to openai supported params

* fix fireworks ai param mapping
2024-09-14 14:57:01 -07:00
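
The `max_completion_tokens` parameter added above is accepted at the top level and mapped per provider (e.g. to `maxTokens` for Bedrock Converse, per the bullets). A sketch assuming `OPENAI_API_KEY` is set:

```python
import litellm

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about proxies."}],
    max_completion_tokens=64,  # newer OpenAI name; mapped for other providers
)
print(response.choices[0].message.content)
```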
Krish Dholakia
60709a0753
LiteLLM Minor Fixes and Improvements (09/13/2024) (#5689)
* refactor: cleanup unused variables + fix pyright errors

* feat(health_check.py): Closes https://github.com/BerriAI/litellm/issues/5686

* fix(o1_reasoning.py): add stricter check for o-1 reasoning model

* refactor(mistral/): make it easier to see mistral transformation logic

* fix(openai.py): fix openai o-1 model param mapping

Fixes https://github.com/BerriAI/litellm/issues/5685

* feat(main.py): infer finetuned gemini model from base model

Fixes https://github.com/BerriAI/litellm/issues/5678

* docs(vertex.md): update docs to call finetuned gemini models

* feat(proxy_server.py): allow admin to hide proxy model aliases

Closes https://github.com/BerriAI/litellm/issues/5692

* docs(load_balancing.md): add docs on hiding alias models from proxy config

* fix(base.py): don't raise notimplemented error

* fix(user_api_key_auth.py): fix model max budget check

* fix(router.py): fix elif

* fix(user_api_key_auth.py): don't set team_id to empty str

* fix(team_endpoints.py): fix response type

* test(test_completion.py): handle predibase error

* test(test_proxy_server.py): fix test

* fix(o1_transformation.py): fix max_completion_token mapping

* test(test_image_generation.py): mark flaky test
2024-09-14 10:02:55 -07:00
Ishaan Jaff
64a828b455
Merge pull request #5573 from BerriAI/litellm_add_rerank_spend_tracking
[Feat] Add cost tracking for cohere rerank
2024-09-06 18:11:30 -07:00
Krish Dholakia
72e961af3c
LiteLLM Minor Fixes and Improvements (09/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info` , `/model_group/info` (#5539)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model alias'es on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
Ishaan Jaff
1852e1cd9a basic cohere rerank logging 2024-09-06 15:18:16 -07:00
Ishaan Jaff
795db8672a fix log /audio to langfuse 2024-09-05 16:26:37 -07:00
Ishaan Jaff
25887c1846 fix import error 2024-09-05 10:09:44 -07:00
Ishaan Jaff
81ee1653af use correct type hints for audio transcriptions 2024-09-05 09:12:27 -07:00
Krish Dholakia
1e7e538261
LiteLLM Minor fixes + improvements (09/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
Ishaan Jaff
9ed4dad5ac test get llm provider 2024-09-04 09:22:19 -07:00
Krish Dholakia
be3c7b401e
LiteLLM Minor fixes + improvements (09/03/2024) (#5488)
* fix(internal_user_endpoints.py): set budget_reset_at for /user/update

* fix(vertex_and_google_ai_studio_gemini.py): handle accumulated json

Fixes https://github.com/BerriAI/litellm/issues/5479

* fix(vertex_ai_and_gemini.py): fix assistant message function call when content is not None

Fixes https://github.com/BerriAI/litellm/issues/5490

* fix(proxy_server.py): generic state uuid for okta sso

* fix(lago.py): improve debug logs

Debugging for https://github.com/BerriAI/litellm/issues/5477

* docs(bedrock.md): add bedrock cross-region inferencing to docs

* fix(azure.py): return azure response headers on aembedding call

* feat(azure.py): return azure response headers for `/audio/transcription`

* fix(types/utils.py): standardize deepseek / anthropic prompt caching usage information

Closes https://github.com/BerriAI/litellm/issues/5285

* docs(usage.md): add docs on litellm usage object

* test(test_completion.py): mark flaky test
2024-09-03 21:21:34 -07:00
Ishaan Jaff
b0178a85cf refactor get_secret 2024-09-03 10:42:12 -07:00
Krish Dholakia
9f3fa29624
feat(router.py): Support Loadbalancing batch azure api endpoints (#5469)
* feat(router.py): initial commit for loadbalancing azure batch api endpoints

Closes https://github.com/BerriAI/litellm/issues/5396

* fix(router.py): working `router.acreate_file()`

* feat(router.py): working router.acreate_batch endpoint

* feat(router.py): expose router.aretrieve_batch function

Make it easy for user to retrieve the batch information

* feat(router.py): support 'router.alist_batches' endpoint

Adds support for getting all batches across all endpoints

* feat(router.py): working loadbalancing on `/v1/files`

* feat(proxy_server.py): working loadbalancing on `/v1/batches`

* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
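
A hedged sketch of the load-balanced batch methods named above (`acreate_file`, `acreate_batch`, `aretrieve_batch`); the model-list entries are placeholders and the parameter names are inferred from the commit bullets plus the OpenAI batches API, so treat the exact signatures as assumptions:

```python
import asyncio
from litellm import Router

router = Router(model_list=[{
    "model_name": "azure-batch",
    "litellm_params": {"model": "azure/gpt-4o", "api_key": "...", "api_base": "..."},
}])

async def main():
    file_obj = await router.acreate_file(
        model="azure-batch", file=open("batch_input.jsonl", "rb"), purpose="batch"
    )
    batch = await router.acreate_batch(
        model="azure-batch",
        input_file_id=file_obj.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    print(await router.aretrieve_batch(model="azure-batch", batch_id=batch.id))

asyncio.run(main())
```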
Ishaan Jaff
c1adb0b7f2
Merge branch 'main' into litellm_track_imagen_spend_logs 2024-09-02 21:21:15 -07:00
Ishaan Jaff
9fcab392a4 fix get llm provider for imagen 2024-09-02 17:49:36 -07:00
Ishaan Jaff
666dd6aca2
Merge pull request #5478 from BerriAI/litellm_Add_ai21
[Feat] Add AI21 /chat API
2024-09-02 16:20:37 -07:00
Krish Dholakia
f9e6507cd1
LiteLLM Minor Fixes + Improvements (#5474)
* feat(proxy/_types.py): add lago billing to callbacks ui

Closes https://github.com/BerriAI/litellm/issues/5472

* fix(anthropic.py): return anthropic prompt caching information

Fixes https://github.com/BerriAI/litellm/issues/5364

* feat(bedrock/chat.py): support 'json_schema' for bedrock models

Closes https://github.com/BerriAI/litellm/issues/5434

* fix(bedrock/embed/embeddings.py): support async embeddings for amazon titan models

* fix: linting fixes

* fix: handle key errors

* fix(bedrock/chat.py): fix bedrock ai21 streaming object

* feat(bedrock/embed): support bedrock embedding optional params

* fix(databricks.py): fix usage chunk

* fix(internal_user_endpoints.py): apply internal user defaults, if user role updated

Fixes issue where user update wouldn't apply defaults

* feat(slack_alerting.py): provide multiple slack channels for a given alert type

multiple channels might be interested in receiving an alert for a given type

* docs(alerting.md): add multiple channel alerting to docs
2024-09-02 14:29:57 -07:00
Ishaan Jaff
f28e94827c docs update ai21 doc 2024-09-02 12:26:53 -07:00