Commit graph

259 commits

Author SHA1 Message Date
Ishaan Jaff
391b107909
[Feat UI sso] store 'provider' in user metadata (#5856)
* store sso provider in user metadata

* store user metadata

* store user auth_provider in user metadata

* add "metadata" for LiteLLM_UserTable

* fix sso test
2024-09-23 17:49:36 -07:00
Ishaan Jaff
d9e798ecda
[Testing-Proxy] Add E2E Admin UI testing (#5845)
* add working ui e2e testing

* ui test

* ui playwright testing

* install python on ui testing

* add playwright testing

* fix ui testing

* fix ui testing

* add redis vars for testing

* fix playwright testing

* fix playwright testing

* rename ui testing

* move e2e ui testing
2024-09-23 11:34:42 -07:00
Ishaan Jaff
6b9b469686 testing - nvidia nim api use mock testing 2024-09-23 08:48:13 -07:00
Krrish Dholakia
2a8eb492a1 test(test_otel.py): fix test 2024-09-23 08:10:06 -07:00
Krish Dholakia
8039b95aaf
LiteLLM Minor Fixes & Improvements (09/21/2024) (#5819)
* fix(router.py): fix error message

* Litellm disable keys (#5814)

* build(schema.prisma): allow blocking/unblocking keys

Fixes https://github.com/BerriAI/litellm/issues/5328

* fix(key_management_endpoints.py): fix pop

* feat(auth_checks.py): allow admin to enable/disable virtual keys

Closes https://github.com/BerriAI/litellm/issues/5328

* docs(vertex.md): add auth section for vertex ai

Addresses - https://github.com/BerriAI/litellm/issues/5768#issuecomment-2365284223

* build(model_prices_and_context_window.json): show which models support prompt_caching

Closes https://github.com/BerriAI/litellm/issues/5776

* fix(router.py): allow setting default priority for requests

* fix(router.py): add 'retry-after' header for concurrent request limit errors

Fixes https://github.com/BerriAI/litellm/issues/5783

* fix(router.py): correctly raise and use retry-after header from azure+openai

Fixes https://github.com/BerriAI/litellm/issues/5783

* fix(user_api_key_auth.py): fix valid token being none

* fix(auth_checks.py): fix model dump for cache management object

* fix(user_api_key_auth.py): pass prisma_client to obj

* test(test_otel.py): update test for new key check

* test: fix test
2024-09-21 18:51:53 -07:00
Ishaan Jaff
d100b32573
[SSO-UI] Set new sso users as internal_view role users (#5824)
* use /user/list endpoint on admin ui

* sso insert user with role when user does not exist

* add sso sign in test

* linting fix

* rename self serve doc

* add doc for self serve flow

* test - sso sign in default values

* add test for /user/list endpoint
2024-09-21 16:43:52 -07:00
Ishaan Jaff
711932294c
[Feat] Add testing for prometheus failure metrics (#5823)
* prom - show status code and class type on prom

* log exception_class name on prometheus metrics

* prometheus track error code and status

* add bad model

* add prometheus failure metric test

* remove outdated file

* fix litellm_proxy_total_requests_metric

* add prometheus metrics testing
2024-09-21 11:36:29 -07:00
Ishaan Jaff
1973ae8fb8
[Feat] Allow setting supports_vision for Custom OpenAI endpoints + Added testing (#5821)
* add test for using images with custom openai endpoints

* run all otel tests

* update name of test

* add custom openai model to test config

* add test for setting supports_vision=True for model

* fix test guardrails aporia

* docs supports vision

* fix yaml

* fix yaml

* docs supports vision

* fix bedrock guardrail test

* fix cohere rerank test

* update model_group doc string

* add better prints on test
2024-09-21 11:35:55 -07:00
Krish Dholakia
3933fba41f
LiteLLM Minor Fixes & Improvements (09/19/2024) (#5793)
* fix(model_prices_and_context_window.json): add cost tracking for more vertex llama3.1 model

8b and 70b models

* fix(proxy/utils.py): handle data being none on pre-call hooks

* fix(proxy/): create views on initial proxy startup

fixes base case, where user starts proxy for first time

Fixes https://github.com/BerriAI/litellm/issues/5756

* build(config.yml): fix vertex version for test

* feat(ui/): support enabling/disabling slack alerting

Allows admin to turn on/off slack alerting through ui

* feat(rerank/main.py): support langfuse logging

* fix(proxy/utils.py): fix linting errors

* fix(langfuse.py): log clean metadata

* test(tests): replace deprecated openai model
2024-09-20 08:19:52 -07:00
Krish Dholakia
d46660ea0f
LiteLLM Minor Fixes & Improvements (09/18/2024) (#5772)
* fix(proxy_server.py): fix azure key vault logic to not require client id/secret

* feat(cost_calculator.py): support fireworks ai cost tracking

* build(docker-compose.yml): add lines for mounting config.yaml to docker compose

Closes https://github.com/BerriAI/litellm/issues/5739

* fix(input.md): update docs to clarify litellm supports content as a list of dictionaries

Fixes https://github.com/BerriAI/litellm/issues/5755

* fix(input.md): update input.md to include all message values

* fix(image_handling.py): follow image url redirects

Fixes https://github.com/BerriAI/litellm/issues/5763

* fix(router.py): Fix model key/base leak in error message

Fixes https://github.com/BerriAI/litellm/issues/5762

* fix(http_handler.py): fix linting error

* fix(azure.py): fix logging to show azure_ad_token being used

Fixes https://github.com/BerriAI/litellm/issues/5767

* fix(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* feat(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* test(test_completion_cost.py): fix test

* Databricks Integration: Integrate Databricks SDK as optional mechanism for fetching API base and token, if unspecified (#5746)

* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view

Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix: fix import

* Fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* DB test

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Coverage

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* progress

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix test name

Signed-off-by: dbczumar <corey.zumar@databricks.com>

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* test: fix test

* test(test_databricks.py): fix test

* fix(databricks/chat.py): handle custom endpoint (e.g. sagemaker)

* Apply code scanning fix for clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix(__init__.py): fix known fireworks ai models

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2024-09-19 13:25:29 -07:00
Ishaan Jaff
7f4dfe434a
[Fix] o1-mini causes pydantic warnings on reasoning_tokens (#5754)
* add requester_metadata in standard logging payload

* log requester_metadata in metadata

* use StandardLoggingPayload for logging

* docs StandardLoggingPayload

* fix import

* include standard logging object in failure

* add test for requester metadata

* handle completion_tokens_details

* add test for completion_tokens_details
2024-09-17 20:23:14 -07:00
Krish Dholakia
dd602753c0
Litellm fix router testing (#5748)
* test: fix testing - azure changed content policy error logic

* test: fix tests to use mock responses

* test(test_image_generation.py): handle api instability

* test(test_image_generation.py): handle azure api instability

* fix(utils.py): fix unbounded variable error

* fix(utils.py): fix unbounded variable error

* test: refactor test to use mock response

* test: mark flaky azure tests
2024-09-17 18:02:23 -07:00
Krish Dholakia
234185ec13
LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723) (#5731)
* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view

Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix: fix import

* test(test_databricks.py): fix databricks tests

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
Ishaan Jaff
4dcb092d12 fix test_all_model_configs 2024-09-16 17:44:48 -07:00
Ishaan Jaff
b878a67a7c fix otel load test % 2024-09-14 18:04:28 -07:00
Ishaan Jaff
c8d15544c8
[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698)
* move cooldown logic to its own helper

* add new track deployment metrics folder

* increment success, fails for deployment in current minute

* fix cooldown logic

* fix test_aaarouter_dynamic_cooldown_message_retry_time

* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls

* clean up get from deployment test

* fix _async_get_healthy_deployments

* add mock InternalServerError

* test deployment failing 25% requests

* add test_high_traffic_cooldowns_one_bad_deployment

* fix vertex load test

* add test for rate limit error models in cool down

* change default cooldown time

* fix cooldown message time

* fix cooldown on 429 error

* fix doc string for _should_cooldown_deployment

* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
Ishaan Jaff
85acdb9193
[Feat] Add max_completion_tokens param (#5691)
* add max_completion_tokens

* add max_completion_tokens

* add max_completion_tokens support for OpenAI models

* add max_completion_tokens param

* add max_completion_tokens for bedrock converse models

* add test for converse maxTokens

* fix openai o1 param mapping test

* move test optional params

* add max_completion_tokens for anthropic api

* fix conftest

* add max_completion_tokens for vertex ai partner models

* add max_completion_tokens for fireworks ai

* add max_completion_tokens for hf rest api

* add test for param mapping

* add param mapping for vertex, gemini + testing

* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd

* add max_completion_tokens to openai supported params

* fix fireworks ai param mapping
2024-09-14 14:57:01 -07:00
Ishaan Jaff
741c8e8a45
[Feat - Perf Improvement] DataDog Logger 91% lower latency (#5687)
* fix refactor dd to be an instance of custom logger

* migrate dd logger to be async

* clean up dd logging

* add datadog sync and async code

* use batching for datadog logger

* add doc string for dd logging

* add clear doc string

* fix doc string

* allow debugging intake url

* clean up requirements.txt

* allow setting custom batch size on logger

* fix dd logging to use compression

* fix linting

* add dd load test

* fix dd load test

* fix dd url

* add test_datadog_logging_http_request

* fix test_datadog_logging_http_request
2024-09-13 17:39:17 -07:00
Ishaan Jaff
cd8d7ca915
[Fix] Performance - use in memory cache when downloading images from a url (#5657)
* fix use in memory cache when getting images

* fix linting

* fix load testing

* fix load test size

* fix load test size

* trigger ci/cd again
2024-09-13 07:23:42 -07:00
Ishaan Jaff
88706488f9 fix otel load test 2024-09-11 21:27:31 -07:00
Ishaan Jaff
b80f27dce3 fix otel tests 2024-09-11 21:25:27 -07:00
Ishaan Jaff
97ecf86d3d fix langsmith load tests 2024-09-11 21:19:03 -07:00
Ishaan Jaff
b01a42ef4f fix langsmith load test 2024-09-11 21:16:16 -07:00
Ishaan Jaff
a1f8fcfeed fix load test 2024-09-11 21:06:42 -07:00
Ishaan Jaff
850b5dbadc add otel load test 2024-09-11 20:47:12 -07:00
Ishaan Jaff
e7b047223e add langsmith logging test 2024-09-11 20:35:11 -07:00
Ishaan Jaff
39a8bb2bc4 add test test_regenerate_key_ui 2024-09-10 09:12:03 -07:00
Ishaan Jaff
aed59abe35 allow passing expiry time to /key/regenerate 2024-09-06 08:36:34 -07:00
Ishaan Jaff
18f019f87d move prisma test to correct location 2024-09-05 15:50:39 -07:00
Ishaan Jaff
edc51f45ac add error message on test 2024-09-05 15:46:13 -07:00
Ishaan Jaff
3d9049df6d move folder key gen prisma is in 2024-09-05 15:24:00 -07:00
Ishaan Jaff
9aff6a4c9d add test for ui usage endpoints 2024-09-05 15:06:53 -07:00
Ishaan Jaff
94d6e800ee add test for internal vs admin user 2024-09-05 13:30:51 -07:00
Ishaan Jaff
f9a3e343bb add ui testing folder 2024-09-05 13:13:58 -07:00
Ishaan Jaff
cdc312d51d mark test_team_logging as flaky 2024-09-04 20:29:21 -07:00
Krrish Dholakia
25e49a59b2 test: skip flaky test 2024-09-04 08:20:57 -07:00
Ishaan Jaff
3c898e23ea refactor secret managers 2024-09-03 10:58:02 -07:00
Krish Dholakia
9f3fa29624
feat(router.py): Support Loadbalancing batch azure api endpoints (#5469)
* feat(router.py): initial commit for loadbalancing azure batch api endpoints

Closes https://github.com/BerriAI/litellm/issues/5396

* fix(router.py): working `router.acreate_file()`

* feat(router.py): working router.acreate_batch endpoint

* feat(router.py): expose router.aretrieve_batch function

Make it easy for user to retrieve the batch information

* feat(router.py): support 'router.alist_batches' endpoint

Adds support for getting all batches across all endpoints

* feat(router.py): working loadbalancing on `/v1/files`

* feat(proxy_server.py): working loadbalancing on `/v1/batches`

* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
Ishaan Jaff
e9427205ef add test for pass through streaming usage tracking 2024-09-02 16:17:49 -07:00
Ishaan Jaff
9e557ed072 fix test 2024-08-31 08:39:52 -07:00
Ishaan Jaff
b35bfb0302 fix cost tracking for vertex ai native 2024-08-31 08:22:27 -07:00
Ishaan Jaff
06857d108d fix /spend logs call 2024-08-30 17:02:24 -07:00
Ishaan Jaff
2c86a62474 fix vertex ai test 2024-08-30 16:50:23 -07:00
Ishaan Jaff
f43060e8df mark as async 2024-08-30 16:40:41 -07:00
Ishaan Jaff
414d2dcb52 call spend logs endpoint 2024-08-30 16:35:07 -07:00
Ishaan Jaff
f3f85f6141 add test for vertex basic pass through 2024-08-30 16:26:00 -07:00
Krish Dholakia
8d6a0bdc81
- merge - fix TypeError: 'CompletionUsage' object is not subscriptable #5441 (#5448)
* fix TypeError: 'CompletionUsage' object is not subscriptable (#5441)

* test(test_team_logging.py): mark flaky test

---------

Co-authored-by: yafei lee <yafei@dao42.com>
2024-08-30 08:54:42 -07:00
Ishaan Jaff
e329c4509a
Merge branch 'main' into litellm_add_tag_control_team 2024-08-29 17:34:40 -07:00
Ishaan Jaff
4b7ceade64 mark test_key_info_spend_values_streaming as flaky 2024-08-29 14:39:53 -07:00
Ishaan Jaff
da2cefc45a fix team based tag routing 2024-08-29 14:37:44 -07:00