Commit graph

17974 commits

Author SHA1 Message Date
Ishaan Jaff
eef9bad9a6
(performance improvement - vertex embeddings) ~111.11% faster (#6000)
* use vertex llm as base class for embeddings

* use correct vertex class in main.py

* set_headers in vertex llm base

* add types for vertex embedding requests

* add embedding handler for vertex

* use async mode for vertex embedding tests

* use vertexAI textEmbeddingConfig

* fix linting

* add sync and async mode testing for vertex ai embeddings
2024-10-01 14:16:21 -07:00
Krrish Dholakia
18a28ef977 docs(data_security.md): cleanup docs 2024-10-01 15:33:10 -04:00
Krrish Dholakia
e8a291b539 docs(data_security.md): update faq doc 2024-10-01 14:38:34 -04:00
Ishaan Jaff
045ecf3ffb
(feat proxy slack alerting) - allow opting in to getting key / internal user alerts (#5990)
* define all slack alert types

* use correct type hints for alert type

* use correct defaults on slack alerting

* add readme for slack alerting

* fix linting error

* update readme

* docs all alert types

* update slack alerting docs

* fix slack alerting docs

* handle new testing dir structure

* fix config for testing

* fix testing folder related imports

* fix /tests import errors

* fix import stream_chunk_testdata

* docs alert types

* fix test test_langfuse_trace_id

* fix type checks for slack alerting

* fix outage alerting test slack
2024-10-01 10:49:22 -07:00
Paz
8225880af0
Fix: skip slack alert if there was no spend (#5998)
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-01 08:02:16 -07:00
Ishaan Jaff
2a7e1e970d
(docs) prometheus metrics document all prometheus metrics (#5989)
* fix doc on prometheus

* (docs) clean up prometheus docs

* docs show what metrics are deprecated

* doc clarify labels used for budget metrics

* add litellm_remaining_api_key_requests_for_model
2024-09-30 16:38:38 -07:00
Ishaan Jaff
ca9c437021
add Azure OpenAI Entra ID docs (#5985) 2024-09-30 12:17:58 -07:00
Ishaan Jaff
30aa04b8c2 add docs on privacy policy 2024-09-30 11:53:52 -07:00
Ishaan Jaff
50d1c864f2
fix grammar on health check docs (#5984) 2024-09-30 09:21:42 -07:00
Krrish Dholakia
7630680690 docs(response_headers.md): add response headers to docs 2024-09-28 23:33:50 -07:00
DAOUDI Soufian
bfa9553819
Fixed minor typo in bash command to prevent overwriting .env file (#5902)
Changed '>' to '>>' in the bash command to append the environment variable to the .env file instead of overwriting it.
2024-09-28 23:12:19 -07:00
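The distinction this commit fixes: `>` truncates the target file, while `>>` appends to it. A quick illustration in a throwaway directory (not a real `.env`):

```shell
cd "$(mktemp -d)"                  # scratch directory, nothing real at risk
echo "EXISTING_KEY=abc" > .env     # '>' creates or truncates the file
echo "NEW_KEY=xyz" >> .env         # '>>' appends, preserving EXISTING_KEY
grep -c '=' .env                   # both variables survive
```

Had the second line also used `>`, the file would contain only `NEW_KEY=xyz`.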
Krrish Dholakia
ec6ec32bf8 bump: version 1.48.6 → 1.48.7 2024-09-28 23:11:10 -07:00
Krish Dholakia
12cb4ee05c
Litellm Minor Fixes & Improvements (09/24/2024) (#5963)
* fix(batch_redis_get.py): handle custom namespace

Fix https://github.com/BerriAI/litellm/issues/5917

* fix(litellm_logging.py): fix linting error

* refactor(test_proxy_utils.py): place at root level test folder

* refactor: move all testing to top-level of repo

Closes https://github.com/BerriAI/litellm/issues/486

* refactor: fix imports

* refactor(test_stream_chunk_builder.py): fix import

* build(config.yml): fix build_and_test part of tests

* fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(return-openai-compatible-headers): v0 is openai, azure, anthropic

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(utils.py): guarantee openai-compatible headers always exist in response

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(azure): return response headers for sync embedding calls

* fix(router.py): handle setting response headers during retries

* fix(utils.py): fix updating hidden params

* fix(router.py): skip setting model_group response headers for now

current implementation increases redis cache calls by 3x

* docs(reliability.md): add tutorial on setting wildcard models as fallbacks

* fix(caching.py): cleanup print_stack()

* fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing

* test: refactor test

* test: run test first

* fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set)

* (perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement)  (#5960)

* use one set op to update spend in db

* fix test_team_cache_update_called

* fix redis async_set_cache_pipeline when empty list passed to it (#5962)

* [Feat Proxy] Allow using hypercorn for http v2  (#5950)

* use run_hypercorn

* add docs on using hypercorn

* docs clean up langfuse.md

* (feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus  (#5968)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus

* (proxy prometheus) track api key and team in latency metrics (#5966)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* (feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* bump: version 1.48.5 → 1.48.6

* fix sso sign in tests

* ci/cd run again

* add sentry sdk to litellm docker (#5965)

* ci/cd run again

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-28 21:09:48 -07:00
Ishaan Jaff
b4f8f170e7 ci/cd run again 2024-09-28 21:08:15 -07:00
Ishaan Jaff
ad4488d691 ci/cd run again 2024-09-28 21:08:15 -07:00
Ishaan Jaff
357bb53e9d (feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total
2024-09-28 21:08:15 -07:00
Krrish Dholakia
6c7d1d5c96 fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set) 2024-09-28 21:08:15 -07:00
Krrish Dholakia
fa64b6ca24 test: run test first 2024-09-28 21:08:15 -07:00
Krrish Dholakia
392e5c538e test: refactor test 2024-09-28 21:08:15 -07:00
Krrish Dholakia
3f8a5b3ef6 fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing 2024-09-28 21:08:15 -07:00
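The defensive check this commit describes can be sketched as follows; `get_hidden_params` and the `SimpleNamespace` response objects are simplified stand-ins for illustration, not LiteLLM's actual classes:

```python
from types import SimpleNamespace


def get_hidden_params(response) -> dict:
    """Return the response's hidden params only if they are a real dict.

    Dereferencing keys on a non-dict (None, a string, ...) raises,
    so check the type before touching it.
    """
    hidden = getattr(response, "_hidden_params", None)
    if isinstance(hidden, dict):
        return hidden
    return {}


ok = SimpleNamespace(_hidden_params={"additional_headers": {"x-key": "1"}})
get_hidden_params(ok)["additional_headers"]  # safe to dereference

bad = SimpleNamespace(_hidden_params=None)
get_hidden_params(bad)  # returns {} instead of raising
```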
Krrish Dholakia
575b7911b2 fix(caching.py): cleanup print_stack() 2024-09-28 21:08:15 -07:00
Krrish Dholakia
c9d6925a42 docs(reliability.md): add tutorial on setting wildcard models as fallbacks 2024-09-28 21:08:15 -07:00
Krrish Dholakia
81d6c5e5a5 fix(router.py): skip setting model_group response headers for now
current implementation increases redis cache calls by 3x
2024-09-28 21:08:15 -07:00
Krrish Dholakia
5fbcdd8b11 fix(utils.py): fix updating hidden params 2024-09-28 21:08:15 -07:00
Krrish Dholakia
b0eff0b84f fix(router.py): handle setting response headers during retries 2024-09-28 21:08:15 -07:00
Krrish Dholakia
d64e971d8c fix(azure): return response headers for sync embedding calls 2024-09-28 21:08:15 -07:00
Krrish Dholakia
55d7bc7f32 fix(utils.py): guarantee openai-compatible headers always exist in response
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
498e14ba59 fix(return-openai-compatible-headers): v0 is openai, azure, anthropic
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
5222fc8e1b fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
c0cdc6e496 build(config.yml): fix build_and_test part of tests 2024-09-28 21:08:14 -07:00
Krrish Dholakia
dd2c0abd33 refactor(test_stream_chunk_builder.py): fix import 2024-09-28 21:08:14 -07:00
Krrish Dholakia
5ad01e59f6 refactor: fix imports 2024-09-28 21:08:14 -07:00
Krrish Dholakia
3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00
Krrish Dholakia
5403c5828c refactor(test_proxy_utils.py): place at root level test folder 2024-09-28 21:08:14 -07:00
Krrish Dholakia
c1036001fa fix(litellm_logging.py): fix linting error 2024-09-28 21:08:14 -07:00
Krrish Dholakia
efc06d4a03 fix(batch_redis_get.py): handle custom namespace
Fix https://github.com/BerriAI/litellm/issues/5917
2024-09-28 21:08:14 -07:00
Ishaan Jaff
e9e086a0b6 ci/cd run again 2024-09-28 19:35:24 -07:00
Ishaan Jaff
e70d1a2808
add sentry sdk to litellm docker (#5965) 2024-09-28 19:33:41 -07:00
Ishaan Jaff
4251375db3 ci/cd run again 2024-09-28 19:23:10 -07:00
Ishaan Jaff
8c38cfff9d fix sso sign in tests 2024-09-28 19:11:28 -07:00
Ishaan Jaff
67d2989d1a bump: version 1.48.5 → 1.48.6 2024-09-28 19:06:28 -07:00
Ishaan Jaff
9d8c215d38
(feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total
2024-09-28 19:05:42 -07:00
Ishaan Jaff
7e69cdd1b4
(proxy prometheus) track api key and team in latency metrics (#5966)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency
2024-09-28 19:04:42 -07:00
Ishaan Jaff
49ec40b1cb
(feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus (#5968)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus
2024-09-28 19:00:21 -07:00
Ishaan Jaff
b817974c8e docs clean up langfuse.md 2024-09-28 18:59:02 -07:00
Ishaan Jaff
0d0f46a826
[Feat Proxy] Allow using hypercorn for http v2 (#5950)
* use run_hypercorn

* add docs on using hypercorn
2024-09-28 15:03:50 -07:00
Ishaan Jaff
7500855654
fix redis async_set_cache_pipeline when empty list passed to it (#5962) 2024-09-28 13:32:00 -07:00
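The general shape of this fix — bail out early when the cache pipeline receives nothing to write — might look like the sketch below; the class and the in-memory store are illustrative, not LiteLLM's exact code or a real Redis client:

```python
import asyncio
from typing import List, Tuple


class InMemoryCache:
    """Minimal async cache used to illustrate the empty-list guard."""

    def __init__(self):
        self.store = {}
        self.pipeline_calls = 0  # counts how many pipelines were opened

    async def async_set_cache_pipeline(self, cache_list: List[Tuple[str, str]]):
        # Guard: an empty list means there is nothing to pipeline;
        # opening a (Redis) pipeline for it is wasted work and, with
        # some clients, an error.
        if not cache_list:
            return
        self.pipeline_calls += 1
        for key, value in cache_list:
            self.store[key] = value


cache = InMemoryCache()
asyncio.run(cache.async_set_cache_pipeline([]))            # no-op
asyncio.run(cache.async_set_cache_pipeline([("k", "v")]))  # one pipeline
```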
Ishaan Jaff
eb325cce7d
(perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement) (#5960)
* use one set op to update spend in db

* fix test_team_cache_update_called
2024-09-28 13:00:31 -07:00
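The idea behind the quoted 30-40% improvement — collapse many per-entity spend writes into a single batched operation — can be sketched in memory; Redis and LiteLLM's actual transaction logic are replaced here by a plain dict, so treat this as a shape, not the implementation:

```python
from collections import defaultdict


class SpendUpdater:
    """Accumulates spend increments locally and flushes them in one write."""

    def __init__(self):
        self.pending = defaultdict(float)  # entity id -> accumulated spend
        self.db_writes = 0                 # each flush is one write, not N

    def add_spend(self, entity_id: str, amount: float):
        # Cheap in-process accumulation; nothing hits the backend here.
        self.pending[entity_id] += amount

    def flush(self, db: dict):
        # One batched write applies every pending increment at once.
        if not self.pending:
            return
        for entity_id, amount in self.pending.items():
            db[entity_id] = db.get(entity_id, 0.0) + amount
        self.db_writes += 1
        self.pending.clear()


db = {}
u = SpendUpdater()
u.add_spend("key-1", 0.002)
u.add_spend("key-1", 0.003)
u.add_spend("team-a", 0.002)
u.flush(db)  # one write covers three increments
```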
Ishaan Jaff
8bf7573fd8
(fix proxy) model_group/info support rerank models (#5955)
* fix /model_group/info on rerank

* add test test_proxy_model_group_info_rerank
2024-09-28 10:54:43 -07:00
Ishaan Jaff
088d906276
fix: use one async_batch_set_cache (#5956) 2024-09-28 09:59:38 -07:00