Krish Dholakia
ba28e52ee8
Litellm lm studio embedding params ( #6746 )
* fix(ollama.py): fix get model info request
Fixes https://github.com/BerriAI/litellm/issues/6703
* feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param
* docs(anthropic.md): document all supported openai params for anthropic
* test: fix tests
* fix: fix tests
* feat(jina_ai/): add rerank support
Closes https://github.com/BerriAI/litellm/issues/6691
* test: handle service unavailable error
* fix(handler.py): refactor together ai rerank call
* test: update test to handle overloaded error
* test: fix test
* Litellm router trace (#6742 )
* feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks
* feat(router.py): log trace id across retry/fallback logic
allows grouping llm logs for the same request
* test: fix tests
* fix: fix test
* fix(transformation.py): only set non-none stop_sequences
* Litellm router disable fallbacks (#6743 )
* bump: version 1.52.6 → 1.52.7
* feat(router.py): enable dynamically disabling fallbacks
Allows for enabling/disabling fallbacks per key
* feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key
* test: fix test
* fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error
* fix(lm_studio/embed): support translating lm studio optional params
* feat(auth_checks.py): fix auth check inside route - `/team/list`
Fixes regression where a non-admin with user_id=None was able to query all teams
* docs proxy_budget_rescheduler_min_time
* helm run DISABLE_SCHEMA_UPDATE
* docs helm pre sync hook
* fix migration job.yaml
* fix DATABASE_URL
* use existing spec for migrations job
* fix yaml on migrations job
* fix migration job
* update doc on pre sync hook
* fix migrations-job.yaml
* fix migration job
* fix prisma migration
* test - handle eol model claude-2, use claude-2.1 instead
* (docs) add instructions on how to contribute to docker image
* Update code blocks huggingface.md (#6737 )
* Update prefix.md (#6734 )
* fix test_supports_response_schema
* mark Helm PreSync as BETA
* (Feat) Add support for storing virtual keys in AWS SecretManager (#6728 )
* add SecretManager to httpxSpecialProvider
* fix importing AWSSecretsManagerV2
* add unit testing for writing keys to AWS secret manager
* use KeyManagementEventHooks for key/generated events
* use event hooks for key management endpoints
* working AWSSecretsManagerV2
* fix write secret to AWS secret manager on /key/generate
* fix KeyManagementSettings
* use tasks for key management hooks
* add async_delete_secret
* add test for async_delete_secret
* use _delete_virtual_keys_from_secret_manager
* fix test secret manager
* test_key_generate_with_secret_manager_call
* fix check for key_management_settings
* sync_read_secret
* test_aws_secret_manager
* fix sync_read_secret
* use helper to check when _should_read_secret_from_secret_manager
* test_get_secret_with_access_mode
* test - handle eol model claude-2, use claude-2.1 instead
* docs AWS secret manager
* fix test_read_nonexistent_secret
* fix test_supports_response_schema
* ci/cd run again
* LiteLLM Minor Fixes & Improvement (11/14/2024) (#6730 )
* fix(ollama.py): fix get model info request
Fixes https://github.com/BerriAI/litellm/issues/6703
* feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param
* docs(anthropic.md): document all supported openai params for anthropic
* test: fix tests
* fix: fix tests
* feat(jina_ai/): add rerank support
Closes https://github.com/BerriAI/litellm/issues/6691
* test: handle service unavailable error
* fix(handler.py): refactor together ai rerank call
* test: update test to handle overloaded error
* test: fix test
* Litellm router trace (#6742 )
* feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks
* feat(router.py): log trace id across retry/fallback logic
allows grouping llm logs for the same request
* test: fix tests
* fix: fix test
* fix(transformation.py): only set non-none stop_sequences
* Litellm router disable fallbacks (#6743 )
* bump: version 1.52.6 → 1.52.7
* feat(router.py): enable dynamically disabling fallbacks
Allows for enabling/disabling fallbacks per key
* feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key
* test: fix test
* fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error
* test: handle gemini error
* test: fix test
* fix: new run
* bump: version 1.52.7 → 1.52.8
* docs: add docs on jina ai rerank support
* docs(reliability.md): add tutorial on disabling fallbacks per key
* docs(logging.md): add 'trace_id' param to standard logging payload
* (feat) add bedrock/stability.stable-image-ultra-v1:0 (#6723 )
* add stability.stable-image-ultra-v1:0
* add pricing for stability.stable-image-ultra-v1:0
* fix test_supports_response_schema
* ci/cd run again
* [Feature]: Stop swallowing up AzureOpenAi exception responses in litellm's implementation for a BadRequestError (#6745 )
* fix azure exceptions
* test_bad_request_error_contains_httpx_response
* test_bad_request_error_contains_httpx_response
* use safe access to get exception response
* fix get attr
* [Feature]: json_schema in response support for Anthropic (#6748 )
* _convert_tool_response_to_message
* fix ModelResponseIterator
* fix test_json_response_format
* test_json_response_format_stream
* fix _convert_tool_response_to_message
* use helper _handle_json_mode_chunk
* fix _process_response
* unit testing for test_convert_tool_response_to_message_no_arguments
* update doc for JSON mode
* fix: import audio check (#6740 )
* fix imagegeneration output_cost_per_image on model cost map (#6752 )
* (feat) Vertex AI - add support for fine tuned embedding models (#6749 )
* fix use fine tuned vertex embedding models
* test_vertex_embedding_url
* add _transform_openai_request_to_fine_tuned_embedding_request
* add _transform_openai_request_to_fine_tuned_embedding_request
* add transform_openai_request_to_vertex_embedding_request
* add _transform_vertex_response_to_openai_for_fine_tuned_models
* test_vertexai_embedding for ft models
* fix test_vertexai_embedding_finetuned
* doc fine tuned / custom embedding models
* fix test test_partner_models_httpx
* bump: version 1.52.8 → 1.52.9
* LiteLLM Minor Fixes & Improvements (11/13/2024) (#6729 )
* fix(utils.py): add logprobs support for together ai
Fixes https://github.com/BerriAI/litellm/issues/6724
* feat(pass_through_endpoints/): add anthropic/ pass-through endpoint
adds new `anthropic/` pass-through endpoint + refactors docs
* feat(spend_management_endpoints.py): allow /global/spend/report to query team + customer id
enables seeing spend for a customer in a team
* Add integration with MLflow Tracing (#6147 )
* Add MLflow logger
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* Streaming handling
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* lint
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* address comments and fix issues
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* address comments and fix issues
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* Move logger construction code
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* Add docs
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* async handlers
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* new picture
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* fix(mlflow.py): fix ruff linting errors
* ci(config.yml): add mlflow to ci testing
* fix: fix test
* test: fix test
* Litellm key update fix (#6710 )
* fix(caching): convert arg to equivalent kwargs in llm caching handler
prevent unexpected errors
* fix(caching_handler.py): don't pass args to caching
* fix(caching): remove all *args from caching.py
* fix(caching): consistent function signatures + abc method
* test(caching_unit_tests.py): add unit tests for llm caching
ensures coverage for common caching scenarios across different implementations
* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one
* fix(router.py): drop redis password requirement
* fix(proxy_server.py): fix faulty slack alerting check
* fix(langfuse.py): avoid copying functions/thread lock objects in metadata
fixes metadata copy error when parent otel span in metadata
* test: update test
* fix(key_management_endpoints.py): fix /key/update with metadata update
* fix(key_management_endpoints.py): fix key_prepare_update helper
* fix(key_management_endpoints.py): reset value to none if set in key update
* fix: update test
* Litellm dev 11 11 2024 (#6693 )
* fix(__init__.py): add 'watsonx_text' as mapped llm api route
Fixes https://github.com/BerriAI/litellm/issues/6663
* fix(opentelemetry.py): fix passing parallel tool calls to otel
Fixes https://github.com/BerriAI/litellm/issues/6677
* refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling
reduces bugs in repo
* fix(__init__.py): update provider-model mapping to include all known provider-model mappings
Fixes https://github.com/BerriAI/litellm/issues/6669
* feat(anthropic): support passing document in llm api call
* docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function
* fix(factory.py): fix linting error
* add clear doc string for GCS bucket logging
* Add docs to export logs to Laminar (#6674 )
* Add docs to export logs to Laminar
* minor fix: newline at end of file
* place laminar after http and grpc
* (Feat) Add langsmith key based logging (#6682 )
* add langsmith_api_key to StandardCallbackDynamicParams
* create a file for langsmith types
* langsmith add key / team based logging
* add key based logging for langsmith
* fix langsmith key based logging
* fix linting langsmith
* remove NOQA violation
* add unit test coverage for all helpers in test langsmith
* test_langsmith_key_based_logging
* docs langsmith key based logging
* run langsmith tests in logging callback tests
* fix logging testing
* test_langsmith_key_based_logging
* test_add_callback_via_key_litellm_pre_call_utils_langsmith
* add debug statement langsmith key based logging
* test_langsmith_key_based_logging
* (fix) OpenAI's optional messages[].name does not work with Mistral API (#6701 )
* use helper for _transform_messages mistral
* add test_message_with_name to base LLMChat test
* fix linting
* add xAI on Admin UI (#6680 )
* (docs) add benchmarks on 1K RPS (#6704 )
* docs litellm proxy benchmarks
* docs GCS bucket
* doc fix - reduce clutter on logging doc title
* (feat) add cost tracking stable diffusion 3 on Bedrock (#6676 )
* add cost tracking for sd3
* test_image_generation_bedrock
* fix get model info for image cost
* add cost_calculator for stability 1 models
* add unit testing for bedrock image cost calc
* test_cost_calculator_with_no_optional_params
* add test_cost_calculator_basic
* correctly allow size Optional
* fix cost_calculator
* sd3 unit tests cost calc
* fix raise correct error 404 when /key/info is called on non-existent key (#6653 )
* fix raise correct error on /key/info
* add not_found_error error
* fix key not found in DB error
* use 1 helper for checking token hash
* fix error code on key info
* fix test key gen prisma
* test_generate_and_call_key_info
* test fix test_call_with_valid_model_using_all_models
* fix key info tests
* bump: version 1.52.4 → 1.52.5
* add defaults used for GCS logging
* LiteLLM Minor Fixes & Improvements (11/12/2024) (#6705 )
* fix(caching): convert arg to equivalent kwargs in llm caching handler
prevent unexpected errors
* fix(caching_handler.py): don't pass args to caching
* fix(caching): remove all *args from caching.py
* fix(caching): consistent function signatures + abc method
* test(caching_unit_tests.py): add unit tests for llm caching
ensures coverage for common caching scenarios across different implementations
* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one
* fix(router.py): drop redis password requirement
* fix(proxy_server.py): fix faulty slack alerting check
* fix(langfuse.py): avoid copying functions/thread lock objects in metadata
fixes metadata copy error when parent otel span in metadata
* test: update test
* bump: version 1.52.5 → 1.52.6
* (feat) helm hook to sync db schema (#6715 )
* v0 migration job
* fix job
* fix migrations job.yml
* handle standalone DB on helm hook
* fix argo cd annotations
* fix db migration helm hook
* fix migration job
* doc fix: Using HTTP/2 with Hypercorn
* (fix proxy redis) Add redis sentinel support (#6154 )
* add sentinel_password support
* add doc for setting redis sentinel password
* fix redis sentinel - use sentinel password
* Fix: Update gpt-4o costs to that of gpt-4o-2024-08-06 (#6714 )
Fixes #6713
* (fix) using Anthropic `response_format={"type": "json_object"}` (#6721 )
* add support for response_format=json anthropic
* add test_json_response_format to baseLLM ChatTest
* fix test_litellm_anthropic_prompt_caching_tools
* fix test_anthropic_function_call_with_no_schema
* test test_create_json_tool_call_for_response_format
* (feat) Add cost tracking for Azure Dall-e-3 Image Generation + use base class to ensure basic image generation tests pass (#6716 )
* add BaseImageGenTest
* use 1 class for unit testing
* add debugging to BaseImageGenTest
* TestAzureOpenAIDalle3
* fix response_cost_calculator
* test_basic_image_generation
* fix img gen basic test
* fix _select_model_name_for_cost_calc
* fix test_aimage_generation_bedrock_with_optional_params
* fix undo changes cost tracking
* fix response_cost_calculator
* fix test_cost_azure_gpt_35
* fix remove dup test (#6718 )
* (build) update db helm hook
* (build) helm db pre sync hook
* (build) helm db sync hook
* test: run test_team_logging first
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>
* test: update test
* test: skip anthropic overloaded error
* test: cleanup test
* test: update tests
* test: fix test
* test: handle gemini overloaded model error
* test: handle internal server error
* test: handle anthropic overloaded error
* test: handle claude instability
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jongseob Jeon <aiden.jongseob@gmail.com>
Co-authored-by: Camden Clark <camdenaws@gmail.com>
Co-authored-by: Rasswanth <61219215+IamRash-7@users.noreply.github.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>
2024-11-19 09:54:50 +05:30
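Two of the changes merged above — forwarding OpenAI's `user` param to Anthropic, and Anthropic JSON-mode support (#6721/#6748) — are visible from the SDK surface. A minimal sketch; the model name and user id are hypothetical, and it assumes `ANTHROPIC_API_KEY` is set:

```python
import litellm

# Hypothetical model/user values; assumes ANTHROPIC_API_KEY is exported.
response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Reply with a JSON object containing a 'joke' key."}],
    user="end-user-123",  # mapped to Anthropic's user id metadata per the merge above
    response_format={"type": "json_object"},  # Anthropic JSON mode (#6721/#6748)
)
print(response.choices[0].message.content)
```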
Ishaan Jaff
41aade2cc0
(feat) Use litellm/ prefix when storing virtual keys in AWS secret manager ( #6765 )
* fix - storing AWS keys in secret manager
* fix test_key_generate_with_secret_manager_call
* allow using prefix_for_stored_virtual_keys
* add prefix_for_stored_virtual_keys
* test_key_generate_with_secret_manager_call
2024-11-15 18:07:43 -08:00
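To illustrate the naming scheme (not litellm's internal code): a hedged boto3 sketch of what storing a virtual key under the `litellm/` prefix looks like. The helper name, region, and payload shape are assumptions.

```python
import json
import boto3

client = boto3.client("secretsmanager", region_name="us-west-2")  # region is an assumption

def store_virtual_key(key_alias: str, virtual_key: str) -> None:
    # A shared "litellm/" prefix keeps proxy-managed secrets grouped together
    # and easy to scope with a single IAM resource pattern.
    client.create_secret(
        Name=f"litellm/{key_alias}",
        SecretString=json.dumps({"api_key": virtual_key}),  # payload shape is an assumption
    )
```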
Ishaan Jaff
f8e700064e
(Feat) Add support for storing virtual keys in AWS SecretManager ( #6728 )
* add SecretManager to httpxSpecialProvider
* fix importing AWSSecretsManagerV2
* add unit testing for writing keys to AWS secret manager
* use KeyManagementEventHooks for key/generated events
* use event hooks for key management endpoints
* working AWSSecretsManagerV2
* fix write secret to AWS secret manager on /key/generate
* fix KeyManagementSettings
* use tasks for key management hooks
* add async_delete_secret
* add test for async_delete_secret
* use _delete_virtual_keys_from_secret_manager
* fix test secret manager
* test_key_generate_with_secret_manager_call
* fix check for key_management_settings
* sync_read_secret
* test_aws_secret_manager
* fix sync_read_secret
* use helper to check when _should_read_secret_from_secret_manager
* test_get_secret_with_access_mode
* test - handle eol model claude-2, use claude-2.1 instead
* docs AWS secret manager
* fix test_read_nonexistent_secret
* fix test_supports_response_schema
* ci/cd run again
2024-11-14 09:25:07 -08:00
Krish Dholakia
d88e8922d4
Litellm dev 11 02 2024 ( #6561 )
* fix(dual_cache.py): update in-memory check for redis batch get cache
Fixes latency delay for async_batch_redis_cache
* fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set
* feat(user_api_key_auth.py): add parent otel component for auth
allows us to isolate how much latency is added by auth checks
* perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task)
reduces latency by 200ms
* feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter)
Reduces latency by 400-800ms
* fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls
reduces latency by 50-100ms
* fix: fix linting error
* fix(_service_logger.py): fix import
* fix(user_api_key_auth.py): fix service logging
* fix(dual_cache.py): don't pass 'self'
* fix: fix python3.8 error
* fix: fix init
2024-11-04 07:48:20 +05:30
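The limiter change above moves the cache write off the request path into a background task. A generic sketch of the pattern — names are illustrative, not litellm's internals:

```python
import asyncio

async def set_cache_pipeline(key: str, value: str) -> None:
    await asyncio.sleep(0.2)  # stand-in for the redis round-trip

async def handle_request(key: str, value: str) -> str:
    # Schedule the cache write in the background instead of awaiting it:
    # the caller returns immediately and the write completes off the hot path.
    # (Production code should keep a reference to the task so it isn't GC'd.)
    asyncio.create_task(set_cache_pipeline(key, value))
    return value

async def main() -> None:
    print(await handle_request("k", "v"))  # returns without paying the 200ms write
    await asyncio.sleep(0.3)               # give the background task time to finish

asyncio.run(main())
```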
Krish Dholakia
4f8a3fd4cf
redis otel tracing + async support for latency routing ( #6452 )
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
2024-10-28 21:52:12 -07:00
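The custom-pricing fix above keys registered prices to the specific model+provider combination. A sketch using `litellm.register_model` — the model name and cost figures are hypothetical:

```python
import litellm

# Register pricing under the provider-qualified key so cost lookups for this
# exact model+provider combination resolve to the custom figures.
litellm.register_model({
    "openai/my-finetuned-gpt": {      # hypothetical model name
        "input_cost_per_token": 0.000002,
        "output_cost_per_token": 0.000006,
        "litellm_provider": "openai",
        "mode": "chat",
    }
})
```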
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements ( #6309 )
* ruff add PLR0915
* add noqa for PLR0915
* fix noqa
* add # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* add # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
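For reference, the suppressions added above look like this — the rule stays enforced for new code, while existing long functions opt out inline:

```python
# Targeted suppression: PLR0915 ("too many statements") keeps flagging new
# code, but grandfathered-in long functions are exempted line-by-line.
def giant_legacy_handler():  # noqa: PLR0915
    ...
```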
Krish Dholakia
39486e2003
Litellm dev 10 14 2024 ( #6221 )
* fix(__init__.py): expose DualCache, RedisCache, InMemoryCache on root
abstract internal file refactors from impacting users
* feat(utils.py): handle invalid openai parallel tool calling response
Fixes https://community.openai.com/t/model-tries-to-call-unknown-function-multi-tool-use-parallel/490653
* docs(bedrock.md): clarify all bedrock models are supported
Closes https://github.com/BerriAI/litellm/issues/6168#issuecomment-2412082236
2024-10-14 22:11:14 -07:00
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache ( #6208 )
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
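The refactored `LLMCachingHandler` sits behind litellm's public caching switch. A minimal sketch of the calling convention it serves — the import path may differ across versions, and an OpenAI key is assumed to be set:

```python
import litellm
from litellm.caching import Cache  # import path may vary by version

litellm.cache = Cache()  # in-memory by default

messages = [{"role": "user", "content": "hi"}]
first = litellm.completion(model="gpt-3.5-turbo", messages=messages)
second = litellm.completion(model="gpt-3.5-turbo", messages=messages)  # served via the caching handler
```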
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement ( #5992 )
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Krrish Dholakia
6c7d1d5c96
fix(parallel_request_limiter.py): only update hidden params, don't set new ones (setting them can error for responses where the attribute can't be set)
2024-09-28 21:08:15 -07:00
Krrish Dholakia
3f8a5b3ef6
fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing
2024-09-28 21:08:15 -07:00
Krrish Dholakia
5222fc8e1b
fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
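The fix above aligns the proxy's remaining tpm/rpm headers with OpenAI's `x-ratelimit-*` convention. A hedged client-side sketch — the proxy URL, key, and exact header names are assumptions:

```python
import httpx

resp = httpx.post(
    "http://localhost:4000/v1/chat/completions",   # assumes a locally running proxy
    headers={"Authorization": "Bearer sk-1234"},   # hypothetical virtual key
    json={"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "hi"}]},
)
print(resp.headers.get("x-ratelimit-remaining-requests"))
print(resp.headers.get("x-ratelimit-remaining-tokens"))
```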
Krrish Dholakia
efc06d4a03
fix(batch_redis_get.py): handle custom namespace
Fix https://github.com/BerriAI/litellm/issues/5917
2024-09-28 21:08:14 -07:00
Ishaan Jaff
088d906276
fix: use one async_batch_set_cache ( #5956 )
2024-09-28 09:59:38 -07:00
Krish Dholakia
0b30e212da
LiteLLM Minor Fixes & Improvements (09/27/2024) ( #5938 )
* fix(langfuse.py): prevent double logging requester metadata
Fixes https://github.com/BerriAI/litellm/issues/5935
* build(model_prices_and_context_window.json): add mistral pixtral cost tracking
Closes https://github.com/BerriAI/litellm/issues/5837
* handle streaming for azure ai studio error
* [Perf Proxy] parallel request limiter - use one cache update call (#5932 )
* fix parallel request limiter - use one cache update call
* ci/cd run again
* run ci/cd again
* use docker username password
* fix config.yml
* fix config
* fix config
* fix config.yml
* ci/cd run again
* use correct typing for batch set cache
* fix async_set_cache_pipeline
* fix only check user id tpm / rpm limits when limits set
* fix test_openai_azure_embedding_with_oidc_and_cf
* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839
* feat(anthropic/chat.py): return 'retry-after' headers from anthropic
Fixes https://github.com/BerriAI/litellm/issues/4387
* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock
Closes https://github.com/BerriAI/litellm/issues/5747
* [Feature]#5940, add max_workers parameter for the batch_completion (#5947 )
* handle streaming for azure ai studio error
* bump: version 1.48.2 → 1.48.3
* docs(data_security.md): add legal/compliance faq's
Make it easier for companies to use litellm
* docs: resolve imports
* [Feature]#5940, add max_workers parameter for the batch_completion method
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
* fix(converse_transformation.py): fix default message value
* fix(utils.py): fix get_model_info to handle finetuned models
Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models
* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks
* fix: fix linting errors
* fix(anthropic/chat/handler.py): fix cache creation input tokens
* fix(exception_mapping_utils.py): fix missing imports
* fix(anthropic/chat/handler.py): fix usage block translation
* test: fix test
* test: fix tests
* style(types/utils.py): trigger new build
* test: fix test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
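Among the merged changes is #5947's `max_workers` knob for `litellm.batch_completion`. A sketch — the model choice is arbitrary and an OpenAI key is assumed:

```python
import litellm

# max_workers caps the thread pool used to fan out the batch (#5947).
responses = litellm.batch_completion(
    model="gpt-3.5-turbo",
    messages=[
        [{"role": "user", "content": "hello"}],
        [{"role": "user", "content": "how are you?"}],
    ],
    max_workers=4,
)
```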
Ishaan Jaff
f4613a100d
[Perf Proxy] parallel request limiter - use one cache update call ( #5932 )
* fix parallel request limiter - use one cache update call
* ci/cd run again
* run ci/cd again
* use docker username password
* fix config.yml
* fix config
* fix config
* fix config.yml
* ci/cd run again
* use correct typing for batch set cache
* fix async_set_cache_pipeline
* fix only check user id tpm / rpm limits when limits set
* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
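The "one cache update call" idea is plain request batching. A generic redis-py sketch of the same technique — not the limiter's actual code, and it assumes a local Redis:

```python
import asyncio
import redis.asyncio as redis

async def batch_set(updates: dict[str, str]) -> None:
    r = redis.Redis()  # assumes redis on localhost:6379
    # One pipelined round-trip instead of one round-trip per key --
    # the same idea as the limiter's single cache update call.
    async with r.pipeline(transaction=False) as pipe:
        for key, value in updates.items():
            pipe.set(key, value, ex=60)
        await pipe.execute()

asyncio.run(batch_set({"rpm:key1": "9", "tpm:key1": "512"}))
```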
Ishaan Jaff
58171f35ef
[Fix proxy perf] Use correct cache key when reading from redis cache ( #5928 )
* fix parallel request limiter use correct user id
* fix `async def get_user_object`
* use safe get_internal_user_object
* fix store internal users in redis correctly
2024-09-26 18:13:35 -07:00
Ishaan Jaff
7cbcf538c6
[Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL ( #5881 )
* fix use previous internal usage caching logic
* fix test_dual_cache_uses_redis
* redis track event_metadata in service logging
* show otel error on _get_parent_otel_span_from_kwargs
* track parent otel span on internal usage cache
* update_request_status
* fix internal usage cache
* fix linting
* fix test internal usage cache
* fix linting error
* show event metadata in redis set
* fix test_get_team_redis
* fix test_get_team_redis
* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
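Requiring redis reads to show up on OTEL boils down to wrapping each read in a span under the current parent. A generic sketch with the opentelemetry API — names are illustrative, not litellm's internals:

```python
from opentelemetry import trace  # assumes opentelemetry-api is installed

tracer = trace.get_tracer(__name__)

def instrumented_redis_get(cache, key: str):
    # Wrap the read in a span so every cache hit/miss appears in the trace,
    # attached to whatever parent span is current (e.g. the request span).
    with tracer.start_as_current_span("redis_get_cache") as span:
        value = cache.get(key)
        span.set_attribute("cache.hit", value is not None)
        return value
```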
Krish Dholakia
234185ec13
LiteLLM Minor Fixes & Improvements (09/16/2024) ( #5723 ) ( #5731 )
* LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723 )
* coverage (#5713 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* Move (#5714 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix(litellm_logging.py): fix logging client re-init (#5710 )
Fixes https://github.com/BerriAI/litellm/issues/5695
* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config
Fixes https://github.com/BerriAI/litellm/issues/5682
* feat(o1_handler.py): fake streaming for openai o1 models
Fixes https://github.com/BerriAI/litellm/issues/5694
* docs: deprecated traceloop integration in favor of native otel (#5249 )
* fix: fix linting errors
* fix: fix linting errors
* fix(main.py): fix o1 import
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730 )
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view
Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it
* fix(custom_logger.py): reset calltype
* fix: fix linting errors
* fix: fix linting error
* fix: fix import
* test(test_databricks.py): fix databricks tests
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
Ishaan Jaff
b6ae2204a8
[Feat-Proxy] Slack Alerting - allow using os.environ/ vars for alert to webhook url ( #5726 )
* allow using os.environ for slack urls
* use env vars for webhook urls
* fix types for get_secret
* fix linting
* fix linting
* fix linting
* linting fixes
* linting fix
* docs alerting slack
* fix get data
2024-09-16 18:03:37 -07:00
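The `os.environ/` convention resolves config values from the environment at load time. A minimal sketch of the idea — not litellm's full `get_secret`, which also handles secret managers:

```python
import os

def resolve_secret(value: str) -> str:
    # litellm's config convention: values prefixed with "os.environ/" are
    # looked up in the environment instead of being used literally.
    if value.startswith("os.environ/"):
        return os.environ[value.removeprefix("os.environ/")]
    return value

webhook_url = resolve_secret("os.environ/SLACK_WEBHOOK_URL")
```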
Krish Dholakia
dad1ad2077
LiteLLM Minor Fixes and Improvements (09/14/2024) ( #5697 )
* fix(health_check.py): hide sensitive keys from health check debug information
* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue
* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
Ishaan Jaff
fb5be57bb8
v0 add rerank on litellm proxy
2024-08-27 17:28:39 -07:00
Ishaan Jaff
0e1d3804ff
refactor vertex endpoints to pass through all routes
2024-08-21 17:08:42 -07:00
Ishaan Jaff
4685b9909a
feat - allow accessing data post success call
2024-08-19 11:35:33 -07:00
Ishaan Jaff
398295116f
only write model tpm/rpm tracking when user sets it
2024-08-18 09:58:09 -07:00
Ishaan Jaff
fa96610bbc
fix async_pre_call_hook in parallel request limiter
2024-08-17 12:42:28 -07:00
Ishaan Jaff
feb8c3c5b4
Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}`: None in response headers
2024-08-17 12:41:16 -07:00
Ishaan Jaff
ee0f772b5c
feat: return remaining tokens per model for api key
2024-08-17 12:35:10 -07:00
Ishaan Jaff
5985c7e933
feat - use common helper for getting model group
2024-08-17 10:46:04 -07:00
Ishaan Jaff
412d30d362
add litellm-key-remaining-tokens on prometheus
2024-08-17 10:02:20 -07:00
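A sketch of what exporting a remaining-tokens metric looks like with `prometheus_client` — the metric and label names here are hypothetical, not litellm's exact ones:

```python
from prometheus_client import Gauge

# Hypothetical metric/label names illustrating the commit above.
remaining_tokens = Gauge(
    "litellm_remaining_tokens",
    "Remaining tokens for a key+model in the current window",
    ["api_key_alias", "model"],
)

remaining_tokens.labels(api_key_alias="team-a", model="gpt-4o").set(12_000)
```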
Ishaan Jaff
785482f023
feat add settings for rpm/tpm limits for a model
2024-08-17 09:16:01 -07:00
Ishaan Jaff
1ee33478c9
track rpm/tpm usage per key+model
2024-08-16 18:28:58 -07:00
Krrish Dholakia
61f4b71ef7
refactor: replace .error() with .exception() logging for better debugging on sentry
2024-08-16 09:22:47 -07:00
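The difference the refactor above targets, in miniature — `.exception()` preserves the traceback that Sentry groups on, while `.error()` drops it:

```python
import logging

logger = logging.getLogger(__name__)

def risky_call():
    raise ValueError("boom")

try:
    risky_call()
except Exception:
    # .exception() logs at ERROR level and attaches the traceback, which
    # Sentry captures with full stack context; plain .error() loses it.
    logger.exception("risky_call failed")
```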
Krrish Dholakia
5d96ff6694
fix(utils.py): handle scenario where model="azure/*" and custom_llm_provider="azure"
Fixes https://github.com/BerriAI/litellm/issues/4912
2024-08-02 17:48:53 -07:00
Ishaan Jaff
c4e4b4675c
fix raise better error when crossing tpm / rpm limits
2024-07-26 17:35:08 -07:00
Krrish Dholakia
07d90f6739
feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests
Closes https://github.com/BerriAI/litellm/issues/2950
2024-07-17 16:38:47 -07:00
Krrish Dholakia
fde434be66
feat(proxy_server.py): return 'retry-after' param for rate limited requests
Closes https://github.com/BerriAI/litellm/issues/4695
2024-07-13 17:15:20 -07:00
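On a FastAPI proxy, returning `retry-after` for rate-limited requests reduces to attaching the header to the 429. A minimal sketch — the helper name and detail string are assumptions:

```python
from fastapi import HTTPException

def reject_rate_limited(seconds_until_reset: int):
    # 429 plus a Retry-After header so openai-style clients know when to retry.
    raise HTTPException(
        status_code=429,
        detail="Rate limit exceeded",
        headers={"Retry-After": str(seconds_until_reset)},
    )
```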
Krrish Dholakia
7e769f3b89
fix: fix linting errors
2024-07-13 14:39:42 -07:00
Krrish Dholakia
0cc273d77b
feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krrish Dholakia
9d918d2ac7
fix(presidio_pii_masking.py): support logging_only pii masking
2024-07-11 18:04:12 -07:00
Krrish Dholakia
1193ee8803
fix(presidio_pii_masking.py): fix presidio unset url check + add same check for langfuse
2024-07-06 17:50:55 -07:00
Krrish Dholakia
d57d3df1d6
fix(presidio_pii_masking.py): add support for setting 'http://' if unset by render env for presidio base url
2024-07-06 17:42:10 -07:00
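The scheme-defaulting fix above is small enough to sketch in spirit — an illustrative helper, not the actual presidio_pii_masking code:

```python
def normalize_base_url(url: str) -> str:
    # Render's env injection can yield host:port without a scheme; default to
    # http:// so the presidio client doesn't fail on a scheme-less base url.
    if not url.startswith(("http://", "https://")):
        return "http://" + url
    return url
```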
Krrish Dholakia
196b94455e
fix(dynamic_rate_limiter.py): add rpm allocation, priority + quota reservation to docs
2024-07-01 23:35:42 -07:00
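One way to picture "priority + quota reservation": reserve fixed shares of a model's TPM for named priority classes, then split the remainder across active projects. A hedged sketch — the allocation policy here is illustrative, not litellm's exact algorithm:

```python
def reserve_tpm(total_tpm: int, reserved: dict[str, float], active_projects: int) -> dict[str, int]:
    """Give each priority class its reserved share of the model's TPM,
    then split the remainder evenly across active projects."""
    allocation = {name: int(total_tpm * share) for name, share in reserved.items()}
    remainder = total_tpm - sum(allocation.values())
    allocation["default_per_project"] = remainder // max(active_projects, 1)
    return allocation

print(reserve_tpm(100_000, {"prod": 0.5, "staging": 0.2}, active_projects=3))
```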
Krrish Dholakia
6b529d4e0e
fix(dynamic_rate_limiter.py): support setting priority + reserving tpm/rpm
2024-07-01 23:08:54 -07:00
Krrish Dholakia
0781014706
test(test_dynamic_rate_limit_handler.py): refactor tests for rpm support
2024-07-01 20:16:10 -07:00
Krrish Dholakia
f23b17091d
fix(dynamic_rate_limiter.py): support dynamic rate limiting on rpm
2024-07-01 17:45:10 -07:00
Krrish Dholakia
bae7377128
docs(team_budgets.md): fix script
2024-06-22 15:42:05 -07:00
Krrish Dholakia
a31a05d45d
feat(dynamic_rate_limiter.py): working e2e
2024-06-22 14:41:22 -07:00
Krrish Dholakia
532f24bfb7
refactor: instrument 'dynamic_rate_limiting' callback on proxy
2024-06-22 00:32:29 -07:00
Krrish Dholakia
068e8dff5b
feat(dynamic_rate_limiter.py): passing base case
2024-06-21 22:46:46 -07:00