* docs(config_settings.md): document all router_settings
* ci(config.yml): add router_settings doc test to ci/cd
* test: debug test on ci/cd
* test: debug ci/cd test
* test: fix test
* fix(team_endpoints.py): skip invalid team object. don't fail `/team/list` call
Failing the whole call causes downstream errors when the UI cannot load the team list
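A minimal sketch of the skip-invalid-rows pattern this fix describes; the model and field names below are illustrative, not LiteLLM's actual team schema:

```python
from pydantic import BaseModel, ValidationError


class TeamRecord(BaseModel):  # hypothetical stand-in for the team table model
    team_id: str
    team_alias: str | None = None


def filter_valid_teams(raw_rows: list[dict]) -> list[TeamRecord]:
    """Validate each DB row; skip rows that fail instead of failing the whole call."""
    teams: list[TeamRecord] = []
    for row in raw_rows:
        try:
            teams.append(TeamRecord(**row))
        except ValidationError:
            # Skip the invalid team object so /team/list still returns the rest.
            continue
    return teams


print(filter_valid_teams([{"team_id": "t1"}, {"team_alias": 42}]))  # -> one valid record
```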
* test(base_llm_unit_tests.py): add 'response_format={"type": "text"}' test to base_llm_unit_tests
adds complete coverage for all 'response_format' values to ci/cd
* feat(router.py): support wildcard routes in `get_router_model_info()`
Addresses https://github.com/BerriAI/litellm/issues/6914
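A sketch of wildcard route matching as described here, assuming configured routes like `gemini/*`; the regex handling is illustrative, not the exact `get_router_model_info()` logic:

```python
import re


def match_wildcard_route(requested_model: str, configured_models: list[str]) -> str | None:
    """Return the configured route covering the requested model.

    An entry like "gemini/*" is treated as a wildcard over the provider's
    models; exact entries are preferred over wildcards.
    """
    if requested_model in configured_models:
        return requested_model
    for configured in configured_models:
        if "*" not in configured:
            continue
        pattern = "^" + re.escape(configured).replace(r"\*", ".*") + "$"
        if re.match(pattern, requested_model):
            return configured
    return None


print(match_wildcard_route("gemini/gemini-1.5-flash", ["gemini/*"]))  # -> "gemini/*"
```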
* build(model_prices_and_context_window.json): add tpm/rpm limits for all gemini models
Allows rate-limit tracking for Gemini models even with wildcard routing enabled
Addresses https://github.com/BerriAI/litellm/issues/6914
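Illustrative shape of such an entry; the numbers below are placeholders, not the values shipped in model_prices_and_context_window.json:

```python
# Sketch of a pricing-map entry carrying provider-default rate limits.
gemini_entry = {
    "gemini/gemini-1.5-flash": {
        "max_tokens": 8192,
        "input_cost_per_token": 0.000000075,
        "output_cost_per_token": 0.0000003,
        "litellm_provider": "gemini",
        "mode": "chat",
        "tpm": 4_000_000,  # tokens per minute allowed by the provider
        "rpm": 2_000,      # requests per minute allowed by the provider
    }
}
```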
* feat(router.py): add tpm/rpm tracking on success/failure to global_router
Addresses https://github.com/BerriAI/litellm/issues/6914
* feat(router.py): support wildcard routes on router.get_model_group_usage()
* fix(router.py): fix linting error
* fix(router.py): implement get_remaining_tokens_and_requests
Addresses https://github.com/BerriAI/litellm/issues/6914
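A rough sketch of the idea behind `get_remaining_tokens_and_requests`: subtract cached current-minute usage from the configured limits. Cache-key and header names here are assumptions for illustration:

```python
def get_remaining_tokens_and_requests(
    model_group: str,
    tpm_limit: int,
    rpm_limit: int,
    usage_cache: dict[str, int],
) -> dict[str, int]:
    """Subtract current-minute usage (tracked in cache) from configured limits."""
    tokens_used = usage_cache.get(f"{model_group}:tpm", 0)
    requests_used = usage_cache.get(f"{model_group}:rpm", 0)
    return {
        f"x-ratelimit-remaining-tokens-{model_group}": max(tpm_limit - tokens_used, 0),
        f"x-ratelimit-remaining-requests-{model_group}": max(rpm_limit - requests_used, 0),
    }


print(get_remaining_tokens_and_requests("gemini/*", 4_000_000, 2_000, {"gemini/*:tpm": 1500}))
```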
* fix(router.py): fix linting errors
* test: fix test
* test: fix tests
* docs(config_settings.md): add missing dd env vars to docs
* fix(router.py): check if hidden params is dict
* feat: allow using the Gemini JS SDK with LiteLLM
* add auth for gemini_proxy_route
* basic local test for js
* test cost tagging gemini js requests
* add js sdk test for gemini with litellm
* add docs on gemini JS SDK
* run node.js tests
* fix google ai studio tests
* fix vertex js spend test
* stash gemini JS test
* add vertex js sdk example
* handle vertex pass through separately
* test vertex JS sdk
* fix vertex_proxy_route
* use PassThroughStreamingHandler
* fix PassThroughStreamingHandler
* use common _create_vertex_response_logging_payload_for_generate_content
* test vertex js
* add working vertex jest tests
* move basic pass through test
* use good name for test
* test vertex
* test_chunk_processor_yields_raw_bytes
* unit tests for streaming
* test_convert_raw_bytes_to_str_lines
* run unit tests 1st
* simplify local
* docs add usage example for js
* use get_litellm_virtual_key
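A sketch of what a `get_litellm_virtual_key` helper can look like for the pass-through routes: prefer an explicit LiteLLM key header, then fall back to the Google-style `x-goog-api-key` header the SDKs send (the header precedence here is an assumption):

```python
def get_litellm_virtual_key(headers: dict[str, str]) -> str:
    """Resolve the LiteLLM virtual key for a Gemini/Vertex pass-through request."""
    litellm_key = headers.get("x-litellm-api-key")
    if litellm_key:
        return f"Bearer {litellm_key}"
    # Gemini-style SDKs put the key in x-goog-api-key.
    return f"Bearer {headers.get('x-goog-api-key', '')}"


print(get_litellm_virtual_key({"x-goog-api-key": "sk-1234"}))  # -> "Bearer sk-1234"
```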
* add unit tests for vertex pass through
* fix - don't block proxy startup if not a premium user
* test_litellm_proxy_server_config_with_prometheus
* add test for proxy startup
* fix remove unused test
* fix startup test
* add comment on bad-license
* feat(customer_endpoints.py): support passing budget duration via `/customer/new` endpoint
Closes https://github.com/BerriAI/litellm/issues/5651
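Example call, assuming a locally running proxy; `budget_duration` follows the `<number><unit>` convention used elsewhere in LiteLLM (e.g. `30d`):

```python
import httpx

response = httpx.post(
    "http://localhost:4000/customer/new",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "user_id": "customer-123",
        "max_budget": 10.0,        # USD cap for this customer
        "budget_duration": "30d",  # budget resets every 30 days
    },
)
print(response.json())
```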
* docs: add missing params to swagger + api documentation test
* docs: add documentation for all key endpoints
documents all params on swagger
* docs(internal_user_endpoints.py): document all /user/new params
Ensures all params are documented
* docs(team_endpoints.py): add missing documentation for team endpoints
Ensures 100% param documentation on swagger
* docs(organization_endpoints.py): document all org params
Adds documentation for all params in org endpoint
* docs(customer_endpoints.py): add coverage for all params on /customer endpoints
ensures all /customer/* params are documented
* ci(config.yml): add endpoint doc testing to ci/cd
* fix: fix internal_user_endpoints.py
* fix(internal_user_endpoints.py): support 'duration' param
* fix(partner_models/main.py): fix anthropic re-raise exception on vertex
* fix: fix pydantic obj
* add langsmith_api_key to StandardCallbackDynamicParams
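A trimmed-down sketch of the params object; only `langsmith_api_key` is confirmed by this commit, the other fields are illustrative:

```python
from typing import Optional, TypedDict


class StandardCallbackDynamicParams(TypedDict, total=False):
    """Per-key / per-request callback params (subset; fields beyond
    langsmith_api_key are illustrative)."""

    langsmith_api_key: Optional[str]
    langsmith_project: Optional[str]
    langsmith_base_url: Optional[str]
```

A virtual key can then carry its own LangSmith credentials, enabling the key/team based logging the following commits wire up.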
* create a file for langsmith types
* langsmith add key / team based logging
* add key based logging for langsmith
* fix langsmith key based logging
* fix linting langsmith
* remove NOQA violation
* add unit test coverage for all helpers in test langsmith
* test_langsmith_key_based_logging
* docs langsmith key based logging
* run langsmith tests in logging callback tests
* fix logging testing
* test_langsmith_key_based_logging
* test_add_callback_via_key_litellm_pre_call_utils_langsmith
* add debug statement langsmith key based logging
* test_langsmith_key_based_logging
* use CustomBatchLogger for GCS
* add GCS bucket logging type
* use batch logging for GCS bucket
* add gcs_bucket
* allow setting flush_interval on CustomBatchLogger
* set GCS_FLUSH_INTERVAL to 1s
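A minimal sketch of the batch-logging pattern these commits describe: buffer events, then flush on a timer (`GCS_FLUSH_INTERVAL`) or when the batch fills. The real logger ships batches to GCS instead of printing:

```python
import asyncio
import os


class CustomBatchLogger:
    """Buffer events; flush every flush_interval seconds or when the batch is full."""

    def __init__(self, flush_interval: float | None = None, batch_size: int = 512):
        self.flush_interval = flush_interval or float(os.getenv("GCS_FLUSH_INTERVAL", "1"))
        self.batch_size = batch_size
        self.log_queue: list[dict] = []

    async def periodic_flush(self) -> None:
        while True:
            await asyncio.sleep(self.flush_interval)
            await self.flush()

    async def add(self, event: dict) -> None:
        self.log_queue.append(event)
        if len(self.log_queue) >= self.batch_size:
            await self.flush()

    async def flush(self) -> None:
        if not self.log_queue:
            return
        batch, self.log_queue = self.log_queue, []
        print(f"flushing {len(batch)} events")  # real logger would POST to GCS here


async def main() -> None:
    logger = CustomBatchLogger(flush_interval=1)
    flusher = asyncio.ensure_future(logger.periodic_flush())
    await logger.add({"event": "llm_call"})
    await asyncio.sleep(1.5)  # let one periodic flush fire
    flusher.cancel()


asyncio.run(main())
```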
* fix test_key_logging
* fix test_key_logging
* add docs on new env vars
* add unit testing for standard logging payload
* unit testing for static methods in litellm_logging
* add code coverage check for litellm_logging
* litellm_logging_code_coverage
* test_get_final_response_obj
* fix validate_redacted_message_span_attributes
* test validate_redacted_message_span_attributes
* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards
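A sketch of the fallback behavior: try to map the request onto a wildcard deployment, and return the user's input untouched when nothing matches (regex handling is illustrative):

```python
import re


def map_to_deployment(requested: str, wildcard_routes: dict[str, str]) -> str:
    """Map a requested model onto a wildcard deployment; if nothing matches,
    fall back to the user's input instead of erroring."""
    for route, deployment in wildcard_routes.items():
        pattern = "^" + re.escape(route).replace(r"\*", "(.*)") + "$"
        match = re.match(pattern, requested)
        if match:
            # Substitute the captured segment into the deployment template.
            return deployment.replace("*", match.group(1))
    return requested  # default to user input if unable to map


print(map_to_deployment("openai/gpt-4o", {"openai/*": "openai/*"}))  # mapped
print(map_to_deployment("claude-3-opus", {"openai/*": "openai/*"}))  # falls back
```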
* test: fix test
* test: reset test name
* test: update conftest to reload proxy server module between tests
* ci(config.yml): move langfuse out of local_testing
reduce ci/cd time
* ci(config.yml): cleanup langfuse ci/cd tests
* fix: update test to not use global proxy_server app module
* ci: move caching to a separate test pipeline
speed up ci pipeline
* test: update conftest to check if proxy_server attr exists before reloading
* build(conftest.py): don't block on inability to reload proxy_server
* ci(config.yml): update caching unit test filter to work on 'cache' keyword as well
* fix(encrypt_decrypt_utils.py): use function to get salt key
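A sketch of centralizing the salt-key lookup in one function; the fallback to the master key is an assumption for illustration:

```python
import os


def get_salt_key() -> str | None:
    """Single source of truth for the encryption salt: prefer LITELLM_SALT_KEY,
    fall back to the master key (assumed fallback, for illustration)."""
    return os.getenv("LITELLM_SALT_KEY") or os.getenv("LITELLM_MASTER_KEY")
```

Routing every encrypt/decrypt call through one getter keeps the salt consistent across call sites.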
* test: mark flaky test
* test: handle anthropic overloaded errors
* refactor: create separate ci/cd pipeline for proxy unit tests
make ci/cd faster
* ci(config.yml): add litellm_proxy_unit_testing to build_and_test jobs
* ci(config.yml): generate prisma binaries for proxy unit tests
* test: readd vertex_key.json
* ci(config.yml): remove `-s` from proxy_unit_test cmd
speed up test
* ci: remove any 'debug' logging flag
speed up ci pipeline
* test: fix test
* test(test_braintrust.py): rerun
* test: add delay for braintrust test
* fix debug statements
* fix assert prisma_client.health_check is called on _setup
* assert that _setup_prisma_client is called on proxy startup
* fix prisma client health_check
* add test_bad_database_url
* add strict checks on db startup
* temporarily remove fix to validate that the check works as expected
* add health_check back
* test_proxy_server_prisma_setup_invalid_db
* perf: move writing key to cache, to background task
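The pattern in miniature: return the new key immediately and let an `asyncio` background task persist it to the cache (cache shape and key names are illustrative):

```python
import asyncio


async def write_key_to_cache(cache: dict, key: str, value: dict) -> None:
    await asyncio.sleep(0)  # stand-in for an async Redis/in-memory cache write
    cache[key] = value


async def generate_key(cache: dict) -> str:
    new_key = "sk-new-key"
    # Fire-and-forget: don't make the HTTP response wait on the cache write.
    asyncio.create_task(write_key_to_cache(cache, new_key, {"spend": 0.0}))
    return new_key


async def main() -> None:
    cache: dict = {}
    print(await generate_key(cache))  # returns immediately
    await asyncio.sleep(0.01)         # give the background write time to land
    print(cache)


asyncio.run(main())
```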
* perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils
pre-call utils add 200ms on calls with pgdb connected
* fix(litellm_pre_call_utils.py): rename call_type to actual call used
* perf(proxy_server.py): remove db logic from _get_config_from_file
was causing db calls to occur on every llm request, if team_id was set on key
* fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db
reduces latency/call by ~100ms
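A sketch of the negative-caching idea behind this check: remember that an ID was *not* found so repeat lookups skip the database:

```python
class AuthCache:
    """Negative caching: cache 'looked up, does not exist' results too."""

    SENTINEL = object()  # marker for "known missing"

    def __init__(self, db: dict):
        self.db = db
        self.cache: dict[str, object] = {}
        self.db_calls = 0

    def get_user(self, user_id: str):
        if user_id in self.cache:
            cached = self.cache[user_id]
            return None if cached is AuthCache.SENTINEL else cached
        self.db_calls += 1  # only the miss path touches the DB
        row = self.db.get(user_id)
        self.cache[user_id] = row if row is not None else AuthCache.SENTINEL
        return row


auth = AuthCache(db={})
auth.get_user("ghost"); auth.get_user("ghost")
print(auth.db_calls)  # -> 1; the second lookup is served from cache
```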
* fix(proxy_server.py): minor fix on existing_settings not including alerting
* fix(exception_mapping_utils.py): map databricks exception string
* fix(auth_checks.py): fix auth check logic
* test: correctly mark flaky test
* fix(utils.py): handle auth token error for tokenizers.from_pretrained
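A sketch of the defensive pattern: attempt `tokenizers.Tokenizer.from_pretrained` and fall back on failure (the tiktoken fallback is an assumption, not necessarily LiteLLM's exact choice):

```python
def get_tokenizer(model_name: str):
    """Try to load a HF tokenizer; on auth/network errors (gated or private
    repos), fall back to a default instead of raising."""
    try:
        from tokenizers import Tokenizer

        return Tokenizer.from_pretrained(model_name)
    except Exception:
        # Gated-repo auth errors land here; use a general-purpose fallback.
        import tiktoken

        return tiktoken.get_encoding("cl100k_base")
```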
* fix router strategy
* use async set / get cache in router_strategy
* add coverage for router strategy
* fix imports
* fix batch_get_cache
* use async methods for least busy
* fix least busy use async methods
* fix test_dual_cache_increment
* test async_get_available_deployment when routing_strategy="least-busy"
* (refactor) use _assemble_complete_response_from_streaming_chunks
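A condensed sketch of what assembling a complete response from streamed chunks involves, using the OpenAI chunk shape; skipping malformed chunks mirrors the "error while assembling" test added below:

```python
def assemble_complete_response_from_streaming_chunks(chunks: list[dict]) -> dict:
    """Concatenate delta contents from streamed chunks into one response dict."""
    content_parts: list[str] = []
    finish_reason = None
    for chunk in chunks:
        try:
            choice = chunk["choices"][0]
            content_parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
        except (KeyError, IndexError):
            continue  # skip malformed chunks instead of failing assembly
    return {
        "choices": [{
            "message": {"role": "assistant", "content": "".join(content_parts)},
            "finish_reason": finish_reason,
        }]
    }


chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}, "finish_reason": "stop"}]},
]
print(assemble_complete_response_from_streaming_chunks(chunks))
```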
* add unit test for test_assemble_complete_response_from_streaming_chunks_1
* fix assemble complete_streaming_response
* config: add logging_testing
* add logging_coverage in codecov
* test test_assemble_complete_response_from_streaming_chunks_3
* add unit tests for _assemble_complete_response_from_streaming_chunks
* fix remove unused / junk function
* add test for streaming_chunks when error assembling
* fix: enable new 'disable_prisma_schema_update' flag
* build(config.yml): remove setup remote docker step
* ci(config.yml): give container time to start up
* ci(config.yml): update test
* build(config.yml): actually start docker
* build(config.yml): simplify grep check
* fix(prisma_client.py): support reading disable_schema_update via env vars
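A sketch of reading the flag from config first, then the environment; the env var name below is an assumption:

```python
import os


def should_update_prisma_schema(disable_flag_from_config: bool | None = None) -> bool:
    """Config value wins; otherwise read DISABLE_SCHEMA_UPDATE from the env."""
    if disable_flag_from_config is not None:
        return not disable_flag_from_config
    env_value = os.getenv("DISABLE_SCHEMA_UPDATE", "false").lower()
    return env_value not in ("true", "1", "yes")


print(should_update_prisma_schema())  # -> True unless DISABLE_SCHEMA_UPDATE is set
```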
* ci(config.yml): add test to check if all general settings are documented
* build(test_General_settings.py): check available dir
* ci: check ../ repo path
* build: check ./
* build: fix test
* LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925)
* fix(litellm_logging.py): don't initialize prometheus_logger if non premium user
Prevents bad error messages in logs
Fixes https://github.com/BerriAI/litellm/issues/5897
* Add Support for Custom Providers in Vision and Function Call Utils (#5688)
* Add Support for Custom Providers in Vision and Function Call Utils Lookup
* Remove parallel function call due to missing model info param
* Add Unit Tests for Vision and Function Call Changes
* fix-#5920: set header value to string to fix "'int' object has no att… (#5922)
* LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880)
* LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842)
* feat(auth_utils.py): enable admin to allow client-side credentials to be passed
Makes it easier for devs to experiment with finetuned fireworks ai models
* feat(router.py): allow setting configurable_clientside_auth_params for a model
Closes https://github.com/BerriAI/litellm/issues/5843
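A sketch of how an allow-list for client-side auth params can be enforced (default-deny, with optional glob entries); the exact semantics of `configurable_clientside_auth_params` may differ:

```python
import fnmatch


def is_clientside_param_allowed(param: str, allowed: list[str] | None) -> bool:
    """Only honor client-supplied auth params (api_key, api_base, ...) that the
    admin explicitly allow-listed on the model."""
    if allowed is None:
        return False  # default-deny: clients cannot override credentials
    return any(fnmatch.fnmatch(param, pattern) for pattern in allowed)


allowed = ["api_base", "api_key"]
print(is_clientside_param_allowed("api_base", allowed))      # True
print(is_clientside_param_allowed("organization", allowed))  # False
```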
* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit
Fixes https://github.com/BerriAI/litellm/issues/5850
* fix(azure_ai/): support content list for azure ai
Fixes https://github.com/BerriAI/litellm/issues/4237
* fix(litellm_logging.py): always set saved_cache_cost
Set to 0 by default
* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing
handles calling 405b+ size models
* fix(slack_alerting.py): fix error alerting for failed spend tracking
Fixes regression with slack alerting error monitoring
* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error
* docs(bedrock.md): add llama3-1 models
* test: fix tests
* fix(azure_ai/chat): fix transformation for azure ai calls
* feat(azure_ai/embed): Add azure ai embeddings support
Closes https://github.com/BerriAI/litellm/issues/5861
* fix(azure_ai/embed): enable async embedding
* feat(azure_ai/embed): support azure ai multimodal embeddings
* fix(azure_ai/embed): support async multi modal embeddings
* feat(together_ai/embed): support together ai embedding calls
* feat(rerank/main.py): log source documents for rerank endpoints to langfuse
improves rerank endpoint logging
* fix(langfuse.py): support logging `/audio/speech` input to langfuse
* test(test_embedding.py): fix test
* test(test_completion_cost.py): fix helper util
* fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'"
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926)
This reverts commit a554ae2695.
* build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing
Enables cost tracking for azure ai cohere rerank models
* fix(litellm_logging.py): fix debug log to be clearer
Closes https://github.com/BerriAI/litellm/issues/5909
* test(test_utils.py): fix test name
* fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models
* fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints
* fix(converse_handler.py): support new llama 3-2 models
Fixes https://github.com/BerriAI/litellm/issues/5901
* fix(litellm_logging.py): ensure response is redacted for standard message logging
Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360
* fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation
allows user to set custom cost for model
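The pattern in miniature: pull per-query pricing from model info so a user-supplied entry overrides the default (model name and price below are placeholders):

```python
# Hypothetical minimal model-info lookup mirroring the get_model_info pattern.
MODEL_INFO = {
    "azure_ai/cohere-rerank-v3-english": {"input_cost_per_query": 0.002},
}


def rerank_cost(model: str, num_queries: int = 1) -> float:
    info = MODEL_INFO.get(model, {})
    return info.get("input_cost_per_query", 0.0) * num_queries


print(rerank_cost("azure_ai/cohere-rerank-v3-english"))  # -> 0.002
```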
* fix(config.yml): fix docker hub auth
* build(config.yml): add docker auth to all tests
* fix(db/create_views.py): fix linting error
* fix(main.py): fix circular import
* fix(azure_ai/__init__.py): fix circular import
* fix(main.py): fix import
* fix: fix linting errors
* test: fix test
* fix(proxy_server.py): pass premium user value on startup
used for prometheus init
---------
Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
* handle streaming errors for azure ai studio
* [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call
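A sketch of the batching idea: collapse the limiter's per-key cache writes into one `async_set_cache_pipeline` call (the in-memory cache stands in for Redis):

```python
import asyncio


class DualCache:
    """In-memory stand-in; real code would batch Redis writes the same way."""

    def __init__(self):
        self.store: dict[str, int] = {}

    async def async_set_cache_pipeline(self, items: list[tuple[str, int]]) -> None:
        # One round trip for all keys instead of one call per key.
        for key, value in items:
            self.store[key] = value


async def update_limiter_counters(cache: DualCache, api_key: str, tokens: int) -> None:
    # Previously separate cache calls; now batched into one pipeline call.
    await cache.async_set_cache_pipeline([
        (f"{api_key}:requests", 1),
        (f"{api_key}:tokens", tokens),
        (f"{api_key}:max_parallel_requests", 1),
    ])


cache = DualCache()
asyncio.run(update_limiter_counters(cache, "sk-1234", 42))
print(cache.store)
```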
* ci/cd run again
* run ci/cd again
* use docker username password
* fix config.yml
* fix config
* fix config
* fix config.yml
* ci/cd run again
* use correct typing for batch set cache
* fix async_set_cache_pipeline
* fix: only check user id tpm/rpm limits when limits are set
* fix test_openai_azure_embedding_with_oidc_and_cf
* test: fix test
* test(test_rerank.py): fix test
---------
Co-authored-by: Cole Murray <colemurray.cs@gmail.com>
Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>