17974 commits

Author SHA1 Message Date
Krrish Dholakia
1f51159ed2 bump: version 1.48.4 → 1.48.5 2024-09-27 22:58:58 -07:00
Krish Dholakia
0b30e212da LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938)
* fix(langfuse.py): prevent double logging requester metadata

Fixes https://github.com/BerriAI/litellm/issues/5935

* build(model_prices_and_context_window.json): add mistral pixtral cost tracking

Closes https://github.com/BerriAI/litellm/issues/5837

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839

* feat(anthropic/chat.py): return 'retry-after' headers from anthropic

Fixes https://github.com/BerriAI/litellm/issues/4387

* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock

Closes https://github.com/BerriAI/litellm/issues/5747

* [Feature]#5940, add max_workers parameter for the batch_completion (#5947)

* handle streaming for azure ai studio error

* bump: version 1.48.2 → 1.48.3

* docs(data_security.md): add legal/compliance faq's

Make it easier for companies to use litellm

* docs: resolve imports

* [Feature]#5940, add max_workers parameter for the batch_completion method

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>

* fix(converse_transformation.py): fix default message value

* fix(utils.py): fix get_model_info to handle finetuned models

Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models

* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks

* fix: fix linting errors

* fix(anthropic/chat/handler.py): fix cache creation input tokens

* fix(exception_mapping_utils.py): fix missing imports

* fix(anthropic/chat/handler.py): fix usage block translation

* test: fix test

* test: fix tests

* style(types/utils.py): trigger new build

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
Krish Dholakia
754981a78f fix(proxy/utils.py): fix create missing views check (#5953) 2024-09-27 20:32:46 -07:00
Ishaan Jaff
39b5d8f383 fix test_vertexai_multimodal_embedding_base64image_in_input 2024-09-27 20:17:08 -07:00
Ishaan Jaff
9fb1ee2294 bump 1.48.3 -> 1.48.4 2024-09-27 18:17:56 -07:00
Ishaan Jaff
e15b0f2cf7 fix merge conflicts 2024-09-27 18:07:42 -07:00
Ishaan Jaff
a5ffe21f11 bump: version 1.48.3 → 1.48.4 2024-09-27 18:05:53 -07:00
Ishaan Jaff
353faeeccd bump: version 1.49.0 → 1.49.1 2024-09-27 18:04:52 -07:00
Ishaan Jaff
627504d054 bump: version 1.48.3 → 1.49.0 2024-09-27 18:04:47 -07:00
Ishaan Jaff
fd87ae69b8 [Vertex Multimodal embeddings] Fixes to work with Langchain OpenAI Embedding (#5949)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* add InstanceImage type

* fix vertex image transform

* add langchain vertex test request

* add new vertex test

* update multimodal embedding tests

* add test_vertexai_multimodal_embedding_base64image_in_input

* simplify langchain mm embedding usage

* add langchain example for multimodal embeddings on vertex

* fix linting error
2024-09-27 18:04:03 -07:00
Krish Dholakia
bd17424c4b LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) (#5937)
* LiteLLM Minor Fixes & Improvements (09/26/2024)  (#5925)

* fix(litellm_logging.py): don't initialize prometheus_logger if non premium user

Prevents bad error messages in logs

Fixes https://github.com/BerriAI/litellm/issues/5897

* Add Support for Custom Providers in Vision and Function Call Utils (#5688)

* Add Support for Custom Providers in Vision and Function Call Utils Lookup

* Remove parallel function call due to missing model info param

* Add Unit Tests for Vision and Function Call Changes

* fix-#5920: set header value to string to fix "'int' object has no att… (#5922)

* LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)

* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util

* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'"

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926)

This reverts commit a554ae2695.

* build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing

Enables cost tracking for azure ai cohere rerank models

* fix(litellm_logging.py): fix debug log to be clearer

Closes https://github.com/BerriAI/litellm/issues/5909

* test(test_utils.py): fix test name

* fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models

* fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints

* fix(converse_handler.py): support new llama 3-2 models

Fixes https://github.com/BerriAI/litellm/issues/5901

* fix(litellm_logging.py): ensure response is redacted for standard message logging

Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360

* fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation

allows user to set custom cost for model

* fix(config.yml): fix docker hub auth

* build(config.yml): add docker auth to all tests

* fix(db/create_views.py): fix linting error

* fix(main.py): fix circular import

* fix(azure_ai/__init__.py): fix circular import

* fix(main.py): fix import

* fix: fix linting errors

* test: fix test

* fix(proxy_server.py): pass premium user value on startup

used for prometheus init

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* test: fix test

* test(test_rerank.py): fix test

---------

Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-27 17:54:13 -07:00
Ishaan Jaff
789ce6b747 allow setting LANGFUSE_FLUSH_INTERVAL (#5944) 2024-09-27 17:42:15 -07:00
Ishaan Jaff
f4613a100d [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
Khanh Le
71f68ac185 docs(vertex.md): fix codestral fim placement (#5946) 2024-09-27 17:21:34 -07:00
Ishaan Jaff
bbf4db79c1 docs - show correct rpm -> tpm conversion for Azure 2024-09-27 17:18:55 -07:00
Krrish Dholakia
70df474e64 docs: resolve imports 2024-09-27 13:36:29 -07:00
Krrish Dholakia
2e9dca135e docs(data_security.md): add legal/compliance faq's
Make it easier for companies to use litellm
2024-09-27 13:33:27 -07:00
Ishaan Jaff
46c8509941 bump: version 1.48.2 → 1.48.3 2024-09-27 10:39:36 -07:00
Ishaan Jaff
ab153c3e8d handle streaming for azure ai studio error 2024-09-27 08:17:36 -07:00
Ishaan Jaff
93cf9abb88 [Fix Azure AI Studio] drop_params_from_unprocessable_entity_error (#5936)
* fix drop_params_from_unprocessable_entity_error

* fix drop_params_from_unprocessable_entity_error for async azure ai requests

* fix extra body reading azure ai studio
2024-09-27 07:58:52 -07:00
Jacob Hagstedt P Suorra
9ec3365ba6 Upgrade dependencies in dockerfile (#5862)
* Upgrade dependencies in dockerfile

* Change apt-get to apk for alpine image

* Set requirements file to same as dockerfile

---------

Co-authored-by: Jacob Hagstedt <wcgs@novonordisk.com>
2024-09-27 07:51:20 -07:00
Jannik Maierhöfer
52e971155a [docs] updated langfuse integration guide (#5921) 2024-09-27 07:49:47 -07:00
Ishaan Jaff
480a8d73c0 [Fix] Perf use only async functions for get cache (#5930)
* use async functions

* fix parallel request limiter use correct user id

* fix getting user_id
2024-09-26 18:39:40 -07:00
Ishaan Jaff
58171f35ef [Fix proxy perf] Use correct cache key when reading from redis cache (#5928)
* fix parallel request limiter use correct user id

* async def get_user_object(
fix

* use safe get_internal_user_object

* fix store internal users in redis correctly
2024-09-26 18:13:35 -07:00
Ishaan Jaff
8b6eec1951 track api key and alias in remaining tokens metric (#5924) 2024-09-26 18:01:03 -07:00
Krish Dholakia
a1d9e96b31 LiteLLM Minor Fixes & Improvements (09/25/2024) (#5893)
* fix(langfuse.py): support new langfuse prompt_chat class init params

* fix(langfuse.py): handle new init values on prompt chat + prompt text templates

fixes error caused during langfuse logging

* docs(openai_compatible.md): clarify `openai/` handles correct routing for `/v1/completions` route

Fixes https://github.com/BerriAI/litellm/issues/5876

* fix(utils.py): handle unmapped gemini model optional param translation

Fixes https://github.com/BerriAI/litellm/issues/5888

* fix(o1_transformation.py): fix o-1 validation, to not raise error if temperature=1

Fixes https://github.com/BerriAI/litellm/issues/5884

* fix(prisma_client.py): refresh iam token

Fixes https://github.com/BerriAI/litellm/issues/5896

* fix: pass drop params where required

* fix(utils.py): pass drop_params correctly

* fix(types/vertex_ai.py): fix generation config

* test(test_max_completion_tokens.py): fix test

* fix(vertex_and_google_ai_studio_gemini.py): fix map openai params
2024-09-26 16:41:44 -07:00
Krish Dholakia
16c0307eab LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)
* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls

* feat(azure_ai/embed): Add azure ai embeddings support

Closes https://github.com/BerriAI/litellm/issues/5861

* fix(azure_ai/embed): enable async embedding

* feat(azure_ai/embed): support azure ai multimodal embeddings

* fix(azure_ai/embed): support async multi modal embeddings

* feat(together_ai/embed): support together ai embedding calls

* feat(rerank/main.py): log source documents for rerank endpoints to langfuse

improves rerank endpoint logging

* fix(langfuse.py): support logging `/audio/speech` input to langfuse

* test(test_embedding.py): fix test

* test(test_completion_cost.py): fix helper util
2024-09-25 22:11:57 -07:00
Krrish Dholakia
5bc5eaff8a build(model_prices_and_context_window.json): add new gemini - google ai studio models
Closes https://github.com/BerriAI/litellm/pull/5879#issuecomment-2375703347
2024-09-25 21:50:30 -07:00
Peter Laß
e8066f13cb Update litellm helm envconfigmap (#5872)
* Added the possibility to use ConfigMapRef for env values.

* bumped appVersion to 1.46.6 and chartVersion to 0.3.0

---------

Co-authored-by: Peter Laß <peter.lass@maibornwolff.de>
2024-09-25 21:43:08 -07:00
David Manouchehri
057bef6561 Add Llama 3.2 90b model on Vertex AI. (#5908) 2024-09-25 21:21:57 -07:00
Ishaan Jaff
25bb4152a0 ci/cd run again 2024-09-25 20:11:44 -07:00
Ishaan Jaff
76a647b9bb bump: version 1.48.1 → 1.48.2 2024-09-25 19:56:43 -07:00
Ishaan Jaff
f6cdb4ca0d [Perf improvement Proxy] Use Dual Cache for getting key and team objects (#5903)
* use dual cache - perf

* fix auth checks

* fix budget checks for keys

* fix get / set team tests
2024-09-25 19:56:17 -07:00
Krrish Dholakia
39c9150e97 build(model_prices_and_context_window.json): add new gemini models 2024-09-25 19:33:49 -07:00
John HU
8c7e357a23 Add gemini-1.5-pro-002 and gemini-1.5-flash-002 (#5879) 2024-09-25 19:31:37 -07:00
Ishaan Jaff
a8dd495eae [Feat] add fireworks llama 3.2 models + cost tracking (#5905)
* add fireworks llama 3.2 vision models

* add new llama3.2 models

* docs add new llama 3.2 vision models
2024-09-25 17:59:46 -07:00
Ishaan Jaff
4bdeefd7e4 docs service accounts (#5900) 2024-09-25 15:46:13 -07:00
Ishaan Jaff
2444d76fda fix - add stricter type check for OTEL when args[0] is not dict 2024-09-25 13:48:50 -07:00
Ishaan Jaff
7cbcf538c6 [Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis

* redis track event_metadata in service logging

* show otel error on _get_parent_otel_span_from_kwargs

* track parent otel span on internal usage cache

* update_request_status

* fix internal usage cache

* fix linting

* fix test internal usage cache

* fix linting error

* show event metadata in redis set

* fix test_get_team_redis

* fix test_get_team_redis

* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
Ishaan Jaff
4ec4d02474 [Feat-Router] Allow setting which environment to use a model on (#5892)
* add check deployment_is_active_for_environment

* add test for test_init_router_with_supported_environments

* show good example config for environments

* docs clean up config.yaml

* docs cleanup

* docs configs

* docs specific env
2024-09-25 10:12:06 -07:00
Krish Dholakia
4911979c61 feat(vertex): Use correct provider for response_schema support check (#5815) (#5829)
* Fixed gemini-1.5-flash pricing

* (models): Added missing gemini experimental models + fixed pricing for gemini-1.5-pro-exp-0827

* Added gemini/gemini-1.5-flash-001 model

* Updated supports_response_schema to true for gemini flash 1.5 models

* feat(vertex): Use correct provider for response_schema support check

Co-authored-by: F1bos <44951186+F1bos@users.noreply.github.com>
2024-09-25 08:53:21 -07:00
Ishaan Jaff
2516360ceb docs show all configs 2024-09-25 06:37:38 -07:00
Ishaan Jaff
a8bb2f476c docs show relevant litellm_settings 2024-09-25 06:36:10 -07:00
Krrish Dholakia
b2e80ecb8e docs(user_keys.md): add docs on configurable clientside auth credentials
Allow easy switching of finetuned models
2024-09-24 22:44:39 -07:00
Ishaan Jaff
becb4acc24 bump: version 1.48.0 → 1.48.1 2024-09-24 22:26:46 -07:00
Ishaan Jaff
b468c58fe5 chore - handle case when otel metadata field value is None 2024-09-24 22:24:12 -07:00
Ishaan Jaff
2000e8cde9 [Perf Fix] Don't always read from Redis by Default (#5877)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis
2024-09-24 21:34:18 -07:00
Ishaan Jaff
3ccdb42d26 [Fix] OTEL - Don't log messages when callback settings disable message logging (#5875)
* fix otel dont log messages

* otel fix redis failure hook logging
2024-09-24 18:29:52 -07:00
Krish Dholakia
d37c8b5c6b LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) (#5858)
* LiteLLM Minor Fixes & Improvements (09/23/2024)  (#5842)

* feat(auth_utils.py): enable admin to allow client-side credentials to be passed

Makes it easier for devs to experiment with finetuned fireworks ai models

* feat(router.py): allow setting configurable_clientside_auth_params for a model

Closes https://github.com/BerriAI/litellm/issues/5843

* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit

Fixes https://github.com/BerriAI/litellm/issues/5850

* fix(azure_ai/): support content list for azure ai

Fixes https://github.com/BerriAI/litellm/issues/4237

* fix(litellm_logging.py): always set saved_cache_cost

Set to 0 by default

* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing

handles calling 405b+ size models

* fix(slack_alerting.py): fix error alerting for failed spend tracking

Fixes regression with slack alerting error monitoring

* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error

* docs(bedrock.md): add llama3-1 models

* test: fix tests

* fix(azure_ai/chat): fix transformation for azure ai calls
2024-09-24 15:01:31 -07:00
Ishaan Jaff
4df9aca45e add debian fixes to non root image 2024-09-24 07:24:59 -07:00