Commit graph

17913 commits

Author SHA1 Message Date
Krrish Dholakia
ec6ec32bf8 bump: version 1.48.6 → 1.48.7 2024-09-28 23:11:10 -07:00
Krish Dholakia
12cb4ee05c
LiteLLM Minor Fixes & Improvements (09/24/2024) (#5963)
* fix(batch_redis_get.py): handle custom namespace

Fix https://github.com/BerriAI/litellm/issues/5917

* fix(litellm_logging.py): fix linting error

* refactor(test_proxy_utils.py): place at root level test folder

* refactor: move all testing to top-level of repo

Closes https://github.com/BerriAI/litellm/issues/486

* refactor: fix imports

* refactor(test_stream_chunk_builder.py): fix import

* build(config.yml): fix build_and_test part of tests

* fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(return-openai-compatible-headers): v0 is openai, azure, anthropic

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(utils.py): guarantee openai-compatible headers always exist in response

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(azure): return response headers for sync embedding calls

* fix(router.py): handle setting response headers during retries

* fix(utils.py): fix updating hidden params

* fix(router.py): skip setting model_group response headers for now

current implementation increases redis cache calls by 3x

* docs(reliability.md): add tutorial on setting wildcard models as fallbacks

* fix(caching.py): cleanup print_stack()

* fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing

* test: refactor test

* test: run test first

* fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set)

* (perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement)  (#5960)

* use one set op to update spend in db

* fix test_team_cache_update_called

* fix redis async_set_cache_pipeline when empty list passed to it (#5962)

* [Feat Proxy] Allow using hypercorn for http v2  (#5950)

* use run_hypercorn

* add docs on using hypercorn

* docs clean up langfuse.md

* (feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus  (#5968)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus

* (proxy prometheus) track api key and team in latency metrics (#5966)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* (feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* bump: version 1.48.5 → 1.48.6

* fix sso sign in tests

* ci/cd run again

* add sentry sdk to litellm docker (#5965)

* ci/cd run again

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-28 21:09:48 -07:00
Ishaan Jaff
b4f8f170e7 ci/cd run again 2024-09-28 21:08:15 -07:00
Ishaan Jaff
ad4488d691 ci/cd run again 2024-09-28 21:08:15 -07:00
Ishaan Jaff
357bb53e9d (feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total
2024-09-28 21:08:15 -07:00
Krrish Dholakia
6c7d1d5c96 fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set) 2024-09-28 21:08:15 -07:00
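The "only update, don't set" fix above (and the companion is-it-a-dict guard a few commits below) boils down to mutating an existing dict in place instead of assigning a new attribute. A minimal sketch of the pattern, with `_hidden_params` standing in for litellm's internal field:

```python
# Sketch: some response objects (e.g. pydantic models with fixed fields)
# raise when you assign a new attribute, so only mutate what's already there.
def update_hidden_params(response, new_params: dict) -> None:
    existing = getattr(response, "_hidden_params", None)
    if isinstance(existing, dict):   # guard before dereferencing
        existing.update(new_params)  # in-place update never calls __setattr__
    # If the attribute is missing or not a dict, skip rather than setattr(),
    # which is exactly what can fail on restricted response types.
```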
Krrish Dholakia
fa64b6ca24 test: run test first 2024-09-28 21:08:15 -07:00
Krrish Dholakia
392e5c538e test: refactor test 2024-09-28 21:08:15 -07:00
Krrish Dholakia
3f8a5b3ef6 fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing 2024-09-28 21:08:15 -07:00
Krrish Dholakia
575b7911b2 fix(caching.py): cleanup print_stack() 2024-09-28 21:08:15 -07:00
Krrish Dholakia
c9d6925a42 docs(reliability.md): add tutorial on setting wildcard models as fallbacks 2024-09-28 21:08:15 -07:00
Krrish Dholakia
81d6c5e5a5 fix(router.py): skip setting model_group response headers for now
current implementation increases redis cache calls by 3x
2024-09-28 21:08:15 -07:00
Krrish Dholakia
5fbcdd8b11 fix(utils.py): fix updating hidden params 2024-09-28 21:08:15 -07:00
Krrish Dholakia
b0eff0b84f fix(router.py): handle setting response headers during retries 2024-09-28 21:08:15 -07:00
Krrish Dholakia
d64e971d8c fix(azure): return response headers for sync embedding calls 2024-09-28 21:08:15 -07:00
Krrish Dholakia
55d7bc7f32 fix(utils.py): guarantee openai-compatible headers always exist in response
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
498e14ba59 fix(return-openai-compatible-headers): v0 is openai, azure, anthropic
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
5222fc8e1b fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
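OpenAI clients read remaining limits from the `x-ratelimit-remaining-requests` and `x-ratelimit-remaining-tokens` response headers, so "openai-compatible" here means mapping the limiter's internal rpm/tpm counters onto those names. A hedged sketch (function and parameter names are illustrative, not litellm's internals):

```python
# Sketch: surface internal rpm/tpm counters under OpenAI's header names so
# OpenAI SDK clients can parse rate limit state unchanged.
def openai_ratelimit_headers(remaining_rpm: int, remaining_tpm: int) -> dict:
    return {
        "x-ratelimit-remaining-requests": str(remaining_rpm),  # requests left this minute
        "x-ratelimit-remaining-tokens": str(remaining_tpm),    # tokens left this minute
    }
```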
Krrish Dholakia
c0cdc6e496 build(config.yml): fix build_and_test part of tests 2024-09-28 21:08:14 -07:00
Krrish Dholakia
dd2c0abd33 refactor(test_stream_chunk_builder.py): fix import 2024-09-28 21:08:14 -07:00
Krrish Dholakia
5ad01e59f6 refactor: fix imports 2024-09-28 21:08:14 -07:00
Krrish Dholakia
3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00
Krrish Dholakia
5403c5828c refactor(test_proxy_utils.py): place at root level test folder 2024-09-28 21:08:14 -07:00
Krrish Dholakia
c1036001fa fix(litellm_logging.py): fix linting error 2024-09-28 21:08:14 -07:00
Krrish Dholakia
efc06d4a03 fix(batch_redis_get.py): handle custom namespace
Fix https://github.com/BerriAI/litellm/issues/5917
2024-09-28 21:08:14 -07:00
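A Redis cache "namespace" is in practice a key prefix, and the batch-get path has to apply it exactly as the write path does or every lookup misses. A minimal sketch of the idea (the key format is an assumption, not litellm's exact scheme):

```python
from typing import Optional

# Sketch: apply the configured namespace prefix consistently on both the
# write path and the batch-read path; asymmetry becomes silent cache misses.
def namespaced_key(key: str, namespace: Optional[str]) -> str:
    return f"{namespace}:{key}" if namespace else key

# e.g. namespaced_key("chat:abc123", "tenant-a") -> "tenant-a:chat:abc123"
```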
Ishaan Jaff
e9e086a0b6 ci/cd run again 2024-09-28 19:35:24 -07:00
Ishaan Jaff
e70d1a2808
add sentry sdk to litellm docker (#5965) 2024-09-28 19:33:41 -07:00
Ishaan Jaff
4251375db3 ci/cd run again 2024-09-28 19:23:10 -07:00
Ishaan Jaff
8c38cfff9d fix sso sign in tests 2024-09-28 19:11:28 -07:00
Ishaan Jaff
67d2989d1a bump: version 1.48.5 → 1.48.6 2024-09-28 19:06:28 -07:00
Ishaan Jaff
9d8c215d38
(feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total
2024-09-28 19:05:42 -07:00
Ishaan Jaff
7e69cdd1b4
(proxy prometheus) track api key and team in latency metrics (#5966)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency
2024-09-28 19:04:42 -07:00
Ishaan Jaff
49ec40b1cb
(feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus (#5968)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus
2024-09-28 19:00:21 -07:00
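With `prometheus_client`, tracking virtual key, key alias, and team comes down to declaring them as metric labels up front and attaching them per observation. A hedged sketch of the shape (metric and label names are illustrative, not litellm's exact ones):

```python
from prometheus_client import Histogram

# Sketch: declare the labels once; dashboards can then slice latency by
# virtual key, key alias, or team.
REQUEST_LATENCY = Histogram(
    "litellm_request_latency_seconds",
    "End-to-end proxy request latency",
    labelnames=["hashed_api_key", "key_alias", "team"],
)

def observe_latency(seconds: float, hashed_api_key: str, key_alias: str, team: str) -> None:
    REQUEST_LATENCY.labels(
        hashed_api_key=hashed_api_key, key_alias=key_alias, team=team
    ).observe(seconds)
```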
Ishaan Jaff
b817974c8e docs clean up langfuse.md 2024-09-28 18:59:02 -07:00
Ishaan Jaff
0d0f46a826
[Feat Proxy] Allow using hypercorn for http v2 (#5950)
* use run_hypercorn

* add docs on using hypercorn
2024-09-28 15:03:50 -07:00
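Hypercorn is one of the few Python ASGI servers with HTTP/2 support, and running it programmatically is a small amount of glue. A minimal sketch, assuming an ASGI `app` (the import path is a placeholder):

```python
import asyncio

from hypercorn.asyncio import serve
from hypercorn.config import Config

from my_proxy import app  # placeholder: any ASGI application

config = Config()
config.bind = ["0.0.0.0:4000"]

if __name__ == "__main__":
    # hypercorn.asyncio.serve drives the ASGI app; HTTP/2 is negotiated via
    # ALPN once TLS is configured on the bind.
    asyncio.run(serve(app, config))
```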
Ishaan Jaff
7500855654
fix redis async_set_cache_pipeline when empty list passed to it (#5962) 2024-09-28 13:32:00 -07:00
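The empty-list fix is the classic guard-before-pipeline pattern: a zero-item batch should be a no-op rather than an error or a wasted round trip. A sketch of the shape, assuming a `redis.asyncio.Redis` client (the function signature is an assumption):

```python
# Sketch: bail out before touching redis when the batch is empty.
async def async_set_cache_pipeline(redis_client, items: list) -> None:
    if not items:  # the fix: an empty batch is a no-op
        return
    pipe = redis_client.pipeline()
    for key, value in items:  # items: [(key, value), ...]
        pipe.set(key, value)  # commands are buffered, not sent yet
    await pipe.execute()      # one round trip for the whole batch
```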
Ishaan Jaff
eb325cce7d
(perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement) (#5960)
* use one set op to update spend in db

* fix test_team_cache_update_called
2024-09-28 13:00:31 -07:00
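The 30-40% figure comes from collapsing many awaited per-entity writes into one buffered flush. A hedged sketch of the batching idea (key naming and the use of `INCRBYFLOAT` are illustrative):

```python
# Sketch: buffer every spend counter update and flush them together,
# instead of one awaited write per user/key/team.
async def update_spend_batched(redis_client, spend_updates: dict) -> None:
    pipe = redis_client.pipeline()
    for entity_key, spend in spend_updates.items():
        # entity_key e.g. "spend:team:prod" (illustrative naming)
        pipe.incrbyfloat(entity_key, spend)
    await pipe.execute()  # single network round trip for all updates
```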
Ishaan Jaff
8bf7573fd8
(fix proxy) model_group/info support rerank models (#5955)
* fix /model_group/info on rerank

* add test test_proxy_model_group_info_rerank
2024-09-28 10:54:43 -07:00
Ishaan Jaff
088d906276
fix: use one async_batch_set_cache call (#5956) 2024-09-28 09:59:38 -07:00
Krrish Dholakia
1f51159ed2 bump: version 1.48.4 → 1.48.5 2024-09-27 22:58:58 -07:00
Krish Dholakia
0b30e212da
LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938)
* fix(langfuse.py): prevent double logging requester metadata

Fixes https://github.com/BerriAI/litellm/issues/5935

* build(model_prices_and_context_window.json): add mistral pixtral cost tracking

Closes https://github.com/BerriAI/litellm/issues/5837

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839

* feat(anthropic/chat.py): return 'retry-after' headers from anthropic

Fixes https://github.com/BerriAI/litellm/issues/4387

* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock

Closes https://github.com/BerriAI/litellm/issues/5747

* [Feature]#5940, add max_workers parameter for the batch_completion (#5947)

* handle streaming for azure ai studio error

* bump: version 1.48.2 → 1.48.3

* docs(data_security.md): add legal/compliance faq's

Make it easier for companies to use litellm

* docs: resolve imports

* [Feature]#5940, add max_workers parameter for the batch_completion method

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>

* fix(converse_transformation.py): fix default message value

* fix(utils.py): fix get_model_info to handle finetuned models

Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models

* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks

* fix: fix linting errors

* fix(anthropic/chat/handler.py): fix cache creation input tokens

* fix(exception_mapping_utils.py): fix missing imports

* fix(anthropic/chat/handler.py): fix usage block translation

* test: fix test

* test: fix tests

* style(types/utils.py): trigger new build

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
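Among the items above, the `max_workers` parameter for `batch_completion` maps naturally onto a bounded thread pool. A sketch of the idea only, not litellm's actual implementation (`completion_fn` is a placeholder for the blocking completion call):

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: fan a batch of blocking completion calls out over a thread pool,
# with max_workers capping how many are in flight at once.
def batch_completion(requests: list, completion_fn, max_workers: int = 10) -> list:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order and bounds concurrency at max_workers
        return list(pool.map(lambda kwargs: completion_fn(**kwargs), requests))
```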
Krish Dholakia
754981a78f
fix(proxy/utils.py): fix create missing views check (#5953) 2024-09-27 20:32:46 -07:00
Ishaan Jaff
39b5d8f383 fix test_vertexai_multimodal_embedding_base64image_in_input 2024-09-27 20:17:08 -07:00
Ishaan Jaff
9fb1ee2294 bump 1.48.3 → 1.48.4 2024-09-27 18:17:56 -07:00
Ishaan Jaff
e15b0f2cf7 fix merge conflicts 2024-09-27 18:07:42 -07:00
Ishaan Jaff
a5ffe21f11 bump: version 1.48.3 → 1.48.4 2024-09-27 18:05:53 -07:00
Ishaan Jaff
353faeeccd bump: version 1.49.0 → 1.49.1 2024-09-27 18:04:52 -07:00
Ishaan Jaff
627504d054 bump: version 1.48.3 → 1.49.0 2024-09-27 18:04:47 -07:00
Ishaan Jaff
fd87ae69b8
[Vertex Multimodal embeddings] Fixes to work with Langchain OpenAI Embedding (#5949)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* add InstanceImage type

* fix vertex image transform

* add langchain vertex test request

* add new vertex test

* update multimodal embedding tests

* add test_vertexai_multimodal_embedding_base64image_in_input

* simplify langchain mm embedding usage

* add langchain example for multimodal embeddings on vertex

* fix linting error
2024-09-27 18:04:03 -07:00
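The point of the Langchain compatibility fix is that `OpenAIEmbeddings` can talk to the litellm proxy as if it were OpenAI. A hedged usage sketch, assuming a proxy on localhost:4000 with a Vertex multimodal model configured under the alias used below (the alias and key are placeholders):

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="multimodal-embedding",      # placeholder: proxy model alias
    api_key="sk-1234",                 # placeholder: proxy virtual key
    base_url="http://localhost:4000",  # litellm proxy, OpenAI-compatible API
)

# Text goes through unchanged; per the fixes above, base64-encoded images in
# the embedding input should ride through the same OpenAI-shaped endpoint.
vector = embeddings.embed_query("a photo of a golden retriever")
```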
Krish Dholakia
bd17424c4b
LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) (#5937)
* LiteLLM Minor Fixes & Improvements (09/26/2024)  (#5925)

* fix(litellm_logging.py): don't initialize prometheus_logger if non premium user

Prevents bad error messages in logs

Fixes https://github.com/BerriAI/litellm/issues/5897

* Add Support for Custom Providers in Vision and Function Call Utils (#5688)

* Add Support for Custom Providers in Vision and Function Call Utils Lookup

* Remove parallel function call due to missing model info param

* Add Unit Tests for Vision and Function Call Changes

* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'" (#5922)

* LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)

* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util

* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'"

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926)

This reverts commit a554ae2695.

* build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing

Enables cost tracking for azure ai cohere rerank models

* fix(litellm_logging.py): fix debug log to be clearer

Closes https://github.com/BerriAI/litellm/issues/5909

* test(test_utils.py): fix test name

* fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models

* fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints

* fix(converse_handler.py): support new llama 3-2 models

Fixes https://github.com/BerriAI/litellm/issues/5901

* fix(litellm_logging.py): ensure response is redacted for standard message logging

Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360

* fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation

allows user to set custom cost for model

* fix(config.yml): fix docker hub auth

* build(config.yml): add docker auth to all tests

* fix(db/create_views.py): fix linting error

* fix(main.py): fix circular import

* fix(azure_ai/__init__.py): fix circular import

* fix(main.py): fix import

* fix: fix linting errors

* test: fix test

* fix(proxy_server.py): pass premium user value on startup

used for prometheus init

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* test: fix test

* test(test_rerank.py): fix test

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-27 17:54:13 -07:00
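The `'int' object has no attribute 'encode'` failure fixed in #5920/#5922 above happens because HTTP stacks call `.encode()` on header values, which must therefore be strings. The fix pattern, as a minimal sketch:

```python
# Sketch: coerce every header value to str before handing headers to the
# HTTP layer; an int (e.g. a remaining-token count) breaks .encode().
def stringify_headers(headers: dict) -> dict:
    return {key: str(value) for key, value in headers.items()}

# e.g. stringify_headers({"x-ratelimit-remaining-tokens": 9950})
#      -> {"x-ratelimit-remaining-tokens": "9950"}
```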