Ishaan Jaff
391b107909
[Feat UI sso] store 'provider' in user metadata ( #5856 )
...
* store sso provider in user metadata
* store user metadata
* store user auth_provider in user metadata
* add "metadata" for LiteLLM_UserTable
* fix sso test
2024-09-23 17:49:36 -07:00
Ishaan Jaff
d9e798ecda
[Testing-Proxy] Add E2E Admin UI testing ( #5845 )
...
* add working ui e2e testing
* ui test
* ui playwright testing
* install python on ui testing
* add playwright testing
* fix ui testing
* fix ui testing
* add redis vars for testing
* fix playwright testing
* fix playwright testing
* rename ui testing
* move e2e ui testing
2024-09-23 11:34:42 -07:00
Ishaan Jaff
6b9b469686
testing - nvidia nim api use mock testing
2024-09-23 08:48:13 -07:00
Krrish Dholakia
2a8eb492a1
test(test_otel.py): fix test
2024-09-23 08:10:06 -07:00
Krish Dholakia
8039b95aaf
LiteLLM Minor Fixes & Improvements (09/21/2024) ( #5819 )
...
* fix(router.py): fix error message
* Litellm disable keys (#5814 )
* build(schema.prisma): allow blocking/unblocking keys
Fixes https://github.com/BerriAI/litellm/issues/5328
* fix(key_management_endpoints.py): fix pop
* feat(auth_checks.py): allow admin to enable/disable virtual keys
Closes https://github.com/BerriAI/litellm/issues/5328
* docs(vertex.md): add auth section for vertex ai
Addresses - https://github.com/BerriAI/litellm/issues/5768#issuecomment-2365284223
* build(model_prices_and_context_window.json): show which models support prompt_caching
Closes https://github.com/BerriAI/litellm/issues/5776
* fix(router.py): allow setting default priority for requests
* fix(router.py): add 'retry-after' header for concurrent request limit errors
Fixes https://github.com/BerriAI/litellm/issues/5783
* fix(router.py): correctly raise and use retry-after header from azure+openai
Fixes https://github.com/BerriAI/litellm/issues/5783
* fix(user_api_key_auth.py): fix valid token being none
* fix(auth_checks.py): fix model dump for cache management object
* fix(user_api_key_auth.py): pass prisma_client to obj
* test(test_otel.py): update test for new key check
* test: fix test
2024-09-21 18:51:53 -07:00
Ishaan Jaff
d100b32573
[SSO-UI] Set new sso users as internal_view role users ( #5824 )
...
* use /user/list endpoint on admin ui
* sso insert user with role when user does not exist
* add sso sign in test
* linting fix
* rename self serve doc
* add doc for self serve flow
* test - sso sign in default values
* add test for /user/list endpoint
2024-09-21 16:43:52 -07:00
Ishaan Jaff
711932294c
[Feat] Add testing for prometheus failure metrics ( #5823 )
...
* prom - show status code and class type on prom
* log exception_class name on prometheus metrics
* prometheus track error code and status
* add bad model
* add prometheus failure metric test
* remove outdated file
* fix litellm_proxy_total_requests_metric
* add prometheus metrics testing
2024-09-21 11:36:29 -07:00
Ishaan Jaff
1973ae8fb8
[Feat] Allow setting supports_vision for Custom OpenAI endpoints + Added testing ( #5821 )
...
* add test for using images with custom openai endpoints
* run all otel tests
* update name of test
* add custom openai model to test config
* add test for setting supports_vision=True for model
* fix test guardrails aporia
* docs supports vision
* fix yaml
* fix yaml
* docs supports vision
* fix bedrock guardrail test
* fix cohere rerank test
* update model_group doc string
* add better prints on test
2024-09-21 11:35:55 -07:00
Krish Dholakia
3933fba41f
LiteLLM Minor Fixes & Improvements (09/19/2024) ( #5793 )
...
* fix(model_prices_and_context_window.json): add cost tracking for more vertex llama3.1 model
8b and 70b models
* fix(proxy/utils.py): handle data being none on pre-call hooks
* fix(proxy/): create views on initial proxy startup
fixes base case, where user starts proxy for first time
Fixes https://github.com/BerriAI/litellm/issues/5756
* build(config.yml): fix vertex version for test
* feat(ui/): support enabling/disabling slack alerting
Allows admin to turn on/off slack alerting through ui
* feat(rerank/main.py): support langfuse logging
* fix(proxy/utils.py): fix linting errors
* fix(langfuse.py): log clean metadata
* test(tests): replace deprecated openai model
2024-09-20 08:19:52 -07:00
Krish Dholakia
d46660ea0f
LiteLLM Minor Fixes & Improvements (09/18/2024) ( #5772 )
...
* fix(proxy_server.py): fix azure key vault logic to not require client id/secret
* feat(cost_calculator.py): support fireworks ai cost tracking
* build(docker-compose.yml): add lines for mounting config.yaml to docker compose
Closes https://github.com/BerriAI/litellm/issues/5739
* fix(input.md): update docs to clarify litellm supports content as a list of dictionaries
Fixes https://github.com/BerriAI/litellm/issues/5755
* fix(input.md): update input.md to include all message values
* fix(image_handling.py): follow image url redirects
Fixes https://github.com/BerriAI/litellm/issues/5763
* fix(router.py): Fix model key/base leak in error message
Fixes https://github.com/BerriAI/litellm/issues/5762
* fix(http_handler.py): fix linting error
* fix(azure.py): fix logging to show azure_ad_token being used
Fixes https://github.com/BerriAI/litellm/issues/5767
* fix(_redis.py): add redis sentinel support
Closes https://github.com/BerriAI/litellm/issues/4381
* feat(_redis.py): add redis sentinel support
Closes https://github.com/BerriAI/litellm/issues/4381
* test(test_completion_cost.py): fix test
* Databricks Integration: Integrate Databricks SDK as optional mechanism for fetching API base and token, if unspecified (#5746 )
* LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723 )
* coverage (#5713 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* Move (#5714 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix(litellm_logging.py): fix logging client re-init (#5710 )
Fixes https://github.com/BerriAI/litellm/issues/5695
* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config
Fixes https://github.com/BerriAI/litellm/issues/5682
* feat(o1_handler.py): fake streaming for openai o1 models
Fixes https://github.com/BerriAI/litellm/issues/5694
* docs: deprecated traceloop integration in favor of native otel (#5249 )
* fix: fix linting errors
* fix: fix linting errors
* fix(main.py): fix o1 import
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730 )
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view
Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it
* fix(custom_logger.py): reset calltype
* fix: fix linting errors
* fix: fix linting error
* fix
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix: fix import
* Fix
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* DB test
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* Coverage
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* progress
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix test name
Signed-off-by: dbczumar <corey.zumar@databricks.com>
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
* test: fix test
* test(test_databricks.py): fix test
* fix(databricks/chat.py): handle custom endpoint (e.g. sagemaker)
* Apply code scanning fix for clear-text logging of sensitive information
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(__init__.py): fix known fireworks ai models
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2024-09-19 13:25:29 -07:00
Ishaan Jaff
7f4dfe434a
[Fix] o1-mini causes pydantic warnings on reasoning_tokens ( #5754 )
...
* add requester_metadata in standard logging payload
* log requester_metadata in metadata
* use StandardLoggingPayload for logging
* docs StandardLoggingPayload
* fix import
* include standard logging object in failure
* add test for requester metadata
* handle completion_tokens_details
* add test for completion_tokens_details
2024-09-17 20:23:14 -07:00
Krish Dholakia
dd602753c0
Litellm fix router testing ( #5748 )
...
* test: fix testing - azure changed content policy error logic
* test: fix tests to use mock responses
* test(test_image_generation.py): handle api instability
* test(test_image_generation.py): handle azure api instability
* fix(utils.py): fix unbounded variable error
* fix(utils.py): fix unbounded variable error
* test: refactor test to use mock response
* test: mark flaky azure tests
2024-09-17 18:02:23 -07:00
Krish Dholakia
234185ec13
LiteLLM Minor Fixes & Improvements (09/16/2024) ( #5723 ) ( #5731 )
...
* LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723 )
* coverage (#5713 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* Move (#5714 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix(litellm_logging.py): fix logging client re-init (#5710 )
Fixes https://github.com/BerriAI/litellm/issues/5695
* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config
Fixes https://github.com/BerriAI/litellm/issues/5682
* feat(o1_handler.py): fake streaming for openai o1 models
Fixes https://github.com/BerriAI/litellm/issues/5694
* docs: deprecated traceloop integration in favor of native otel (#5249 )
* fix: fix linting errors
* fix: fix linting errors
* fix(main.py): fix o1 import
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730 )
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view
Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it
* fix(custom_logger.py): reset calltype
* fix: fix linting errors
* fix: fix linting error
* fix: fix import
* test(test_databricks.py): fix databricks tests
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
Ishaan Jaff
4dcb092d12
fix test_all_model_configs
2024-09-16 17:44:48 -07:00
Ishaan Jaff
b878a67a7c
fix otel load test %
2024-09-14 18:04:28 -07:00
Ishaan Jaff
c8d15544c8
[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments ( #5698 )
...
* move cooldown logic to its own helper
* add new track deployment metrics folder
* increment success, fails for deployment in current minute
* fix cooldown logic
* fix test_aaarouter_dynamic_cooldown_message_retry_time
* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
* clean up get from deployment test
* fix _async_get_healthy_deployments
* add mock InternalServerError
* test deployment failing 25% requests
* add test_high_traffic_cooldowns_one_bad_deployment
* fix vertex load test
* add test for rate limit error models in cool down
* change default cooldown time
* fix cooldown message time
* fix cooldown on 429 error
* fix doc string for _should_cooldown_deployment
* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
Ishaan Jaff
85acdb9193
[Feat] Add max_completion_tokens param ( #5691 )
...
* add max_completion_tokens
* add max_completion_tokens
* add max_completion_tokens support for OpenAI models
* add max_completion_tokens param
* add max_completion_tokens for bedrock converse models
* add test for converse maxTokens
* fix openai o1 param mapping test
* move test optional params
* add max_completion_tokens for anthropic api
* fix conftest
* add max_completion tokens for vertex ai partner models
* add max_completion_tokens for fireworks ai
* add max_completion_tokens for hf rest api
* add test for param mapping
* add param mapping for vertex, gemini + testing
* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd
* add max_completion_tokens to openai supported params
* fix fireworks ai param mapping
2024-09-14 14:57:01 -07:00
Ishaan Jaff
741c8e8a45
[Feat - Perf Improvement] DataDog Logger 91% lower latency ( #5687 )
...
* fix refactor dd to be an instance of custom logger
* migrate dd logger to be async
* clean up dd logging
* add datadog sync and async code
* use batching for datadog logger
* add doc string for dd logging
* add clear doc string
* fix doc string
* allow debugging intake url
* clean up requirements.txt
* allow setting custom batch size on logger
* fix dd logging to use compression
* fix linting
* add dd load test
* fix dd load test
* fix dd url
* add test_datadog_logging_http_request
* fix test_datadog_logging_http_request
2024-09-13 17:39:17 -07:00
Ishaan Jaff
cd8d7ca915
[Fix] Performance - use in memory cache when downloading images from a url ( #5657 )
...
* fix use in memory cache when getting images
* fix linting
* fix load testing
* fix load test size
* fix load test size
* trigger ci/cd again
2024-09-13 07:23:42 -07:00
Ishaan Jaff
88706488f9
fix otel load test
2024-09-11 21:27:31 -07:00
Ishaan Jaff
b80f27dce3
fix otel tests
2024-09-11 21:25:27 -07:00
Ishaan Jaff
97ecf86d3d
fix langsmith load tests
2024-09-11 21:19:03 -07:00
Ishaan Jaff
b01a42ef4f
fix langsmith load test
2024-09-11 21:16:16 -07:00
Ishaan Jaff
a1f8fcfeed
fix load test
2024-09-11 21:06:42 -07:00
Ishaan Jaff
850b5dbadc
add otel load test
2024-09-11 20:47:12 -07:00
Ishaan Jaff
e7b047223e
add langsmith logging test
2024-09-11 20:35:11 -07:00
Ishaan Jaff
39a8bb2bc4
add test test_regenerate_key_ui
2024-09-10 09:12:03 -07:00
Ishaan Jaff
aed59abe35
allow passing expiry time to /key/regenerate
2024-09-06 08:36:34 -07:00
Ishaan Jaff
18f019f87d
move prisma test to correct location
2024-09-05 15:50:39 -07:00
Ishaan Jaff
edc51f45ac
add error message on test
2024-09-05 15:46:13 -07:00
Ishaan Jaff
3d9049df6d
move folder key gen prisma is in
2024-09-05 15:24:00 -07:00
Ishaan Jaff
9aff6a4c9d
add test for ui usage endpoints
2024-09-05 15:06:53 -07:00
Ishaan Jaff
94d6e800ee
add test for internal vs admin user
2024-09-05 13:30:51 -07:00
Ishaan Jaff
f9a3e343bb
add ui testing folder
2024-09-05 13:13:58 -07:00
Ishaan Jaff
cdc312d51d
mark test_team_logging as flaky
2024-09-04 20:29:21 -07:00
Krrish Dholakia
25e49a59b2
test: skip flaky test
2024-09-04 08:20:57 -07:00
Ishaan Jaff
3c898e23ea
refactor secret managers
2024-09-03 10:58:02 -07:00
Krish Dholakia
9f3fa29624
feat(router.py): Support Loadbalancing batch azure api endpoints ( #5469 )
...
* feat(router.py): initial commit for loadbalancing azure batch api endpoints
Closes https://github.com/BerriAI/litellm/issues/5396
* fix(router.py): working `router.acreate_file()`
* feat(router.py): working router.acreate_batch endpoint
* feat(router.py): expose router.aretrieve_batch function
Make it easy for user to retrieve the batch information
* feat(router.py): support 'router.alist_batches' endpoint
Adds support for getting all batches across all endpoints
* feat(router.py): working loadbalancing on `/v1/files`
* feat(proxy_server.py): working loadbalancing on `/v1/batches`
* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
Ishaan Jaff
e9427205ef
add test for pass through streaming usage tracking
2024-09-02 16:17:49 -07:00
Ishaan Jaff
9e557ed072
fix test
2024-08-31 08:39:52 -07:00
Ishaan Jaff
b35bfb0302
fix cost tracking for vertex ai native
2024-08-31 08:22:27 -07:00
Ishaan Jaff
06857d108d
fix /spend logs call
2024-08-30 17:02:24 -07:00
Ishaan Jaff
2c86a62474
fix vertex ai test
2024-08-30 16:50:23 -07:00
Ishaan Jaff
f43060e8df
mark as async
2024-08-30 16:40:41 -07:00
Ishaan Jaff
414d2dcb52
call spend logs endpoint
2024-08-30 16:35:07 -07:00
Ishaan Jaff
f3f85f6141
add test for vertex basic pass through
2024-08-30 16:26:00 -07:00
Krish Dholakia
8d6a0bdc81
- merge - fix TypeError: 'CompletionUsage' object is not subscriptable #5441 ( #5448 )
...
* fix TypeError: 'CompletionUsage' object is not subscriptable (#5441 )
* test(test_team_logging.py): mark flaky test
---------
Co-authored-by: yafei lee <yafei@dao42.com>
2024-08-30 08:54:42 -07:00
Ishaan Jaff
e329c4509a
Merge branch 'main' into litellm_add_tag_control_team
2024-08-29 17:34:40 -07:00
Ishaan Jaff
4b7ceade64
mark test_key_info_spend_values_streaming as flaky
2024-08-29 14:39:53 -07:00
Ishaan Jaff
da2cefc45a
fix team based tag routing
2024-08-29 14:37:44 -07:00