Commit graph

18155 commits

Author SHA1 Message Date
Ishaan Jaff
45a981a37e bump: version 1.48.13 → 1.48.14 2024-10-04 17:19:33 +05:30
Ishaan Jaff
3c59d188ef ci/cd run again 2024-10-04 17:19:26 +05:30
Ishaan Jaff
69c96d9ba4 bump: version 1.48.12 → 1.48.13 2024-10-04 17:18:49 +05:30
Ishaan Jaff
e394ed1e5b
(fixes) docs + qa - gcs key based logging (#6061)
* fixes for required values for gcs bucket

* docs gcs bucket logging
2024-10-04 16:58:04 +05:30
Ishaan Jaff
2449d258cf
(docs) add 1k rps load test doc (#6059)
* docs 1k rps load test

* docs load testing

* docs load testing litellm

* docs load testing

* clean up load test doc

* docs prom metrics for load testing

* docs using prometheus on load testing

* doc load testing with prometheus
2024-10-04 16:56:34 +05:30
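The load-test docs referenced in this PR center on driving sustained request volume at the proxy and reading the Prometheus metrics it emits. A minimal sketch of such a driver using Locust, assuming a proxy at `localhost:4000`, a placeholder `sk-1234` key, and a `fake-openai-endpoint` model configured on the proxy (all illustrative):

```python
# locustfile.py -- run with: locust -f locustfile.py --host http://localhost:4000
from locust import HttpUser, between, task


class ChatCompletionUser(HttpUser):
    wait_time = between(0.1, 0.5)  # per-user pause between requests

    @task
    def chat_completion(self):
        # Hits the proxy's OpenAI-compatible route; model and key are placeholders.
        self.client.post(
            "/chat/completions",
            json={
                "model": "fake-openai-endpoint",
                "messages": [{"role": "user", "content": "ping"}],
            },
            headers={"Authorization": "Bearer sk-1234"},
        )
```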
Ishaan Jaff
224460d4c9
fix prometheus track cooldown events on custom logger (#6060) 2024-10-04 16:56:22 +05:30
Ishaan Jaff
6e97bc4404 fix handle case when key based logging vars are set as os.environ/ vars 2024-10-04 12:10:25 +05:30
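The `os.environ/` convention in this fix lets key-based logging settings reference environment variables instead of embedding secrets in request metadata. A rough sketch of the resolution idea, with a hypothetical helper name (not LiteLLM's actual internals):

```python
import os


def resolve_callback_var(value: str) -> str:
    """Resolve 'os.environ/VAR_NAME' references to the env var's value.

    Hypothetical helper illustrating the convention; plain values pass through.
    """
    prefix = "os.environ/"
    if value.startswith(prefix):
        return os.environ[value[len(prefix):]]
    return value


# With GCS_BUCKET_NAME=my-bucket exported:
# resolve_callback_var("os.environ/GCS_BUCKET_NAME") -> "my-bucket"
```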
Ishaan Jaff
670ecda4e2
(fixes) gcs bucket key based logging (#6044)
* fixes for gcs bucket logging

* fix StandardCallbackDynamicParams

* fix - gcs logging when payload is not serializable

* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket

* working success callbacks

* linting fixes

* fix linting error

* add type hints to functions

* fixes for dynamic success and failure logging

* fix for test_async_chat_openai_stream
2024-10-04 11:56:10 +05:30
Krrish Dholakia
793593e735 docs(realtime.md): add new /v1/realtime endpoint 2024-10-03 22:44:02 -04:00
Krish Dholakia
09f0c09ba4
fix(utils.py): return openai streaming prompt caching tokens (#6051)
* fix(utils.py): return openai streaming prompt caching tokens

Closes https://github.com/BerriAI/litellm/issues/6038

* fix(main.py): fix error in finish_reason updates
2024-10-03 22:20:13 -04:00
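For the streaming prompt-caching fix above, cached-token counts ride on the usage object of the final chunk when `stream_options={"include_usage": True}` is set. A sketch, assuming OpenAI's usage shape and an illustrative model name:

```python
import litellm

stream = litellm.completion(
    model="gpt-4o-mini",  # illustrative
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    usage = getattr(chunk, "usage", None)
    if usage is not None and getattr(usage, "prompt_tokens_details", None):
        print(usage.prompt_tokens_details.cached_tokens)  # tokens served from cache
```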
Krrish Dholakia
04ae095860 build: version bump 2024-10-03 21:42:49 -04:00
ls-marek-kerka
db55098a33
🔧 (model_prices_and_context_window.json): rename gemini-pro-flash to gemini-flash-experimental to reflect updated naming convention (#5980)
Co-authored-by: Marek Keřka <marek.kerka@gmail.com>
2024-10-03 18:06:39 -04:00
Krish Dholakia
5c33d1c9af
Litellm Minor Fixes & Improvements (10/03/2024) (#6049)
* fix(proxy_server.py): remove spendlog fixes from proxy startup logic

Moves  https://github.com/BerriAI/litellm/pull/4794 to `/db_scripts` and cleans up some caching-related debug info (easier to trace debug logs)

* fix(langfuse_endpoints.py): Fixes https://github.com/BerriAI/litellm/issues/6041

* fix(azure.py): fix health checks for azure audio transcription models

Fixes https://github.com/BerriAI/litellm/issues/5999

* Feat: Add Literal AI Integration (#5653)

* feat: add Literal AI integration

* update readme

* Update README.md

* fix: address comments

* fix: remove literalai sdk

* fix: use HTTPHandler

* chore: add test

* fix: add asyncio lock

* fix(literal_ai.py): fix linting errors

* fix(literal_ai.py): fix linting errors

* refactor: cleanup

---------

Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
2024-10-03 18:02:28 -04:00
Krish Dholakia
f9d0bcc5a1
OpenAI /v1/realtime api support (#6047)
* feat(azure/realtime): initial working commit for proxy azure openai realtime endpoint support

Adds support for passing /v1/realtime calls via litellm proxy

* feat(realtime_api/main.py): abstraction for handling openai realtime api calls

* feat(router.py): add `arealtime()` endpoint in router for realtime api calls

Allows using `model_list` in proxy for realtime as well

* fix: make realtime api a private function

Structure might change based on feedback. Make that clear to users.

* build(requirements.txt): add websockets to the requirements.txt

* feat(openai/realtime): add openai /v1/realtime api support
2024-10-03 17:11:22 -04:00
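Since `/v1/realtime` is WebSocket-based, a client talks to the proxy over a socket rather than plain HTTP. A minimal sketch using the `websockets` package this PR adds to requirements.txt; the URL, model, and key are placeholders, and note `websockets` >= 14 renames `extra_headers` to `additional_headers`:

```python
import asyncio
import json

import websockets


async def main():
    url = "ws://localhost:4000/v1/realtime?model=gpt-4o-realtime-preview"
    async with websockets.connect(
        url, extra_headers={"Authorization": "Bearer sk-1234"}
    ) as ws:
        await ws.send(json.dumps({"type": "response.create"}))
        print(json.loads(await ws.recv()))


asyncio.run(main())
```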
Ishaan Jaff
130842537f bump: version 1.48.10 → 1.48.11 2024-10-03 23:32:08 +05:30
Ishaan Jaff
4e88fd65e1
(feat) openai prompt caching (non streaming) - add prompt_tokens_details in usage response (#6039)
* add prompt_tokens_details in usage response

* use _prompt_tokens_details as a param in Usage

* fix linting errors

* fix type error

* fix ci/cd deps

* bump deps for openai

* bump deps openai

* fix llm translation testing

* fix llm translation embedding
2024-10-03 23:31:10 +05:30
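The non-streaming counterpart: with this PR, `Usage` carries a `prompt_tokens_details` field populated from OpenAI's prompt-caching response. A sketch with an illustrative model:

```python
import litellm

resp = litellm.completion(
    model="gpt-4o-mini",  # illustrative
    messages=[{"role": "user", "content": "hello"}],
)
details = getattr(resp.usage, "prompt_tokens_details", None)
if details is not None:  # providers without prompt caching may omit it
    print(details.cached_tokens)
```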
Krish Dholakia
9fccb4a0da
fix(factory.py): bedrock: merge consecutive tool + user messages (#6028)
* fix(factory.py): bedrock:  merge consecutive tool + user messages

Fixes https://github.com/BerriAI/litellm/issues/6007

* LiteLLM Minor Fixes & Improvements (10/02/2024)  (#6023)

* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettysburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

* fix(factory.py): correctly handle content in tool block

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-03 09:16:25 -04:00
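Context for the merge fix: Bedrock's Converse API requires strictly alternating user/assistant turns, and tool results are delivered as user-role content, so a tool result immediately followed by a user message must collapse into one turn. A simplified sketch of that merge, not the actual factory.py code:

```python
def merge_consecutive_turns(turns: list[dict]) -> list[dict]:
    """Collapse adjacent same-role turns, concatenating their content blocks."""
    merged: list[dict] = []
    for turn in turns:
        if merged and merged[-1]["role"] == turn["role"]:
            merged[-1]["content"].extend(turn["content"])
        else:
            merged.append({"role": turn["role"], "content": list(turn["content"])})
    return merged


# A tool result (a user-role turn in Converse) followed by a user message
# becomes a single turn with two content blocks:
turns = [
    {"role": "user", "content": [{"toolResult": {"toolUseId": "t1", "content": [{"text": "72F"}]}}]},
    {"role": "user", "content": [{"text": "thanks - and for Boston?"}]},
]
assert len(merge_consecutive_turns(turns)) == 1
```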
Ishaan Jaff
1ab886f80d
(contributor PRs) oct 3rd, 2024 (#6034)
* Do not skip important tests for OIDC. (#6017)

* [Bug] Skip monthly slack alert if there was no spend (#6015)

* Fix: skip slack alert if there was no spend

* Skip monthly report when there was no spend

---------

Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paz <paz@tryolabs.com>
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-03 17:12:34 +05:30
Ishaan Jaff
d92696a303
(feat) add nvidia nim embeddings (#6032)
* nvidia nim support embedding config

* add nvidia config in init

* nvidia nim embeddings

* docs nvidia nim embeddings

* docs embeddings on nvidia nim

* fix llm translation test
2024-10-03 17:12:14 +05:30
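Usage-wise, the new provider config means embedding calls route through the `nvidia_nim/` prefix. A sketch; the model name and endpoint are assumptions, not fixed values:

```python
import litellm

resp = litellm.embedding(
    model="nvidia_nim/NV-Embed-QA",  # illustrative NIM embedding model
    input=["What is LiteLLM?"],
    api_base="https://integrate.api.nvidia.com/v1",  # assumption; depends on your NIM host
)
print(len(resp.data[0]["embedding"]))  # embedding dimension
```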
Ishaan Jaff
05df9cc6d0 docs prometheus metrics 2024-10-03 16:31:29 +05:30
Ishaan Jaff
21e05a0f3e
(feat proxy) add key based logging for GCS bucket (#6031)
* init litellm langfuse / gcs credentials in litellm logging obj

* add gcs key based test

* rename vars

* save standard_callback_dynamic_params in model call details

* add working gcs bucket key based logging

* test_basic_gcs_logging_per_request

* linting fix

* add doc on gcs  bucket team based logging
2024-10-03 15:24:31 +05:30
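Per-key GCS logging is configured by attaching callback settings to a virtual key's metadata at generation time. A sketch of the request shape described in the docs this PR adds; treat the exact field names as assumptions:

```python
import requests

resp = requests.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # proxy admin key (placeholder)
    json={
        "metadata": {
            "logging": [
                {
                    "callback_name": "gcs_bucket",
                    "callback_type": "success",
                    "callback_vars": {
                        "gcs_bucket_name": "my-gcs-bucket",
                        "gcs_path_service_account": "os.environ/GCS_SERVICE_ACCOUNT",
                    },
                }
            ]
        }
    },
)
print(resp.json()["key"])  # requests made with this key log to the bucket above
```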
Ishaan Jaff
835db6ae98
(load testing) add vertex_ai embeddings load test (#6004)
* use vertex llm as base class for embeddings

* use correct vertex class in main.py

* set_headers in vertex llm base

* add types for vertex embedding requests

* add embedding handler for vertex

* use async mode for vertex embedding tests

* use vertexAI textEmbeddingConfig

* fix linting

* add sync and async mode testing for vertex ai embeddings

* add basic load test

* add vertex ai load test on ci cd
2024-10-03 14:39:15 +05:30
Krish Dholakia
f8d9be1301
(azure): Enable stream_options for Azure OpenAI. (#6024) (#6029)
* (azure): Enable stream_options for Azure OpenAI. (#6024)

* LiteLLM Minor Fixes & Improvements (10/02/2024)  (#6023)

* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettysburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
2024-10-02 22:59:14 -04:00
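With stream_options enabled for Azure, usage accounting works on streamed Azure calls the same way as upstream OpenAI. A sketch; the deployment name is a placeholder:

```python
import litellm

stream = litellm.completion(
    model="azure/my-gpt-4o-deployment",  # placeholder deployment
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if getattr(chunk, "usage", None):
        print(chunk.usage)  # final chunk carries the usage totals
```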
Krrish Dholakia
74647a5227 bump: version 1.48.9 → 1.48.10 2024-10-02 22:13:57 -04:00
Krish Dholakia
14165d3648
LiteLLM Minor Fixes & Improvements (10/02/2024) (#6023)
* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettysburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-02 22:00:28 -04:00
David Manouchehri
8995ff49ae
(testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018) 2024-10-02 12:02:22 -04:00
Krrish Dholakia
121b493fe8 docs(code_quality.md): add doc on litellm code qa 2024-10-02 11:20:15 -04:00
Krrish Dholakia
e19bb55e3b bump: version 1.48.8 → 1.48.9 2024-10-01 23:52:16 -04:00
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Krrish Dholakia
3fc4ae0d65 build(custom_guardrail.py): include missing file 2024-10-01 17:18:52 -04:00
Ishaan Jaff
b5ab138134 bump: version 1.48.7 → 1.48.8 2024-10-01 14:17:31 -07:00
Ishaan Jaff
eef9bad9a6
(performance improvement - vertex embeddings) ~111.11% faster (#6000)
* use vertex llm as base class for embeddings

* use correct vertex class in main.py

* set_headers in vertex llm base

* add types for vertex embedding requests

* add embedding handler for vertex

* use async mode for vertex embedding tests

* use vertexAI textEmbeddingConfig

* fix linting

* add sync and async mode testing for vertex ai embeddings
2024-10-01 14:16:21 -07:00
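The refactor moves Vertex embeddings onto the shared Vertex LLM base class, with the async path seeing the headline speedup. A sketch of an async call; the model name is illustrative:

```python
import asyncio

import litellm


async def main():
    resp = await litellm.aembedding(
        model="vertex_ai/text-embedding-004",  # illustrative Vertex model
        input=["hello world"],
    )
    print(resp.usage)


asyncio.run(main())
```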
Krrish Dholakia
18a28ef977 docs(data_security.md): cleanup docs 2024-10-01 15:33:10 -04:00
Krrish Dholakia
e8a291b539 docs(data_security.md): update faq doc 2024-10-01 14:38:34 -04:00
Ishaan Jaff
045ecf3ffb
(feat proxy slack alerting) - allow opting in to getting key / internal user alerts (#5990)
* define all slack alert types

* use correct type hints for alert type

* use correct defaults on slack alerting

* add readme for slack alerting

* fix linting error

* update readme

* docs all alert types

* update slack alerting docs

* fix slack alerting docs

* handle new testing dir structure

* fix config for testing

* fix testing folder related imports

* fix /tests import errors

* fix import stream_chunk_testdata

* docs alert types

* fix test test_langfuse_trace_id

* fix type checks for slack alerting

* fix outage alerting test slack
2024-10-01 10:49:22 -07:00
Paz
8225880af0
Fix: skip slack alert if there was no spend (#5998)
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-01 08:02:16 -07:00
Ishaan Jaff
2a7e1e970d
(docs) prometheus metrics document all prometheus metrics (#5989)
* fix doc on prometheus

* (docs) clean up prometheus docs

* docs show what metrics are deprecated

* doc clarify labels used for budget metrics

* add litellm_remaining_api_key_requests_for_model
2024-09-30 16:38:38 -07:00
Ishaan Jaff
ca9c437021
add Azure OpenAI Entra ID docs (#5985) 2024-09-30 12:17:58 -07:00
Ishaan Jaff
30aa04b8c2 add docs on privacy policy 2024-09-30 11:53:52 -07:00
Ishaan Jaff
50d1c864f2
fix grammar on health check docs (#5984) 2024-09-30 09:21:42 -07:00
Krrish Dholakia
7630680690 docs(response_headers.md): add response headers to docs 2024-09-28 23:33:50 -07:00
DAOUDI Soufian
bfa9553819
Fixed minor typo in bash command to prevent overwriting .env file (#5902)
Changed '>' to '>>' in the bash command to append the environment variable to the .env file instead of overwriting it.
2024-09-28 23:12:19 -07:00
Krrish Dholakia
ec6ec32bf8 bump: version 1.48.6 → 1.48.7 2024-09-28 23:11:10 -07:00
Krish Dholakia
12cb4ee05c
Litellm Minor Fixes & Improvements (09/24/2024) (#5963)
* fix(batch_redis_get.py): handle custom namespace

Fix https://github.com/BerriAI/litellm/issues/5917

* fix(litellm_logging.py): fix linting error

* refactor(test_proxy_utils.py): place at root level test folder

* refactor: move all testing to top-level of repo

Closes https://github.com/BerriAI/litellm/issues/486

* refactor: fix imports

* refactor(test_stream_chunk_builder.py): fix import

* build(config.yml): fix build_and_test part of tests

* fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(return-openai-compatible-headers): v0 is openai, azure, anthropic

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(utils.py): guarantee openai-compatible headers always exist in response

Fixes https://github.com/BerriAI/litellm/issues/5957

* fix(azure): return response headers for sync embedding calls

* fix(router.py): handle setting response headers during retries

* fix(utils.py): fix updating hidden params

* fix(router.py): skip setting model_group response headers for now

current implementation increases redis cache calls by 3x

* docs(reliability.md): add tutorial on setting wildcard models as fallbacks

* fix(caching.py): cleanup print_stack()

* fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing

* test: refactor test

* test: run test first

* fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set)

* (perf improvement proxy) use one redis set cache to update spend in db (30-40% perf improvement)  (#5960)

* use one set op to update spend in db

* fix test_team_cache_update_called

* fix redis async_set_cache_pipeline when empty list passed to it (#5962)

* [Feat Proxy] Allow using hypercorn for http v2  (#5950)

* use run_hypercorn

* add docs on using hypercorn

* docs clean up langfuse.md

* (feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus  (#5968)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus

* (proxy prometheus) track api key and team in latency metrics (#5966)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* (feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)

* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* bump: version 1.48.5 → 1.48.6

* fix sso sign in tests

* ci/cd run again

* add sentry sdk to litellm docker (#5965)

* ci/cd run again

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-28 21:09:48 -07:00
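One user-visible piece of this batch: the proxy returns remaining rate-limit budget in OpenAI-compatible response headers. A sketch of reading them with httpx; the URL, key, and model are placeholders, and the header names follow OpenAI's convention:

```python
import httpx

r = httpx.post(
    "http://localhost:4000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]},
)
print(r.headers.get("x-ratelimit-remaining-requests"))  # remaining rpm
print(r.headers.get("x-ratelimit-remaining-tokens"))    # remaining tpm
```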
Ishaan Jaff
b4f8f170e7 ci/cd run again 2024-09-28 21:08:15 -07:00
Ishaan Jaff
ad4488d691 ci/cd run again 2024-09-28 21:08:15 -07:00
Ishaan Jaff
357bb53e9d (feat prometheus proxy) track remaining team and key alias in deployment failure metrics (#5967)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total
2024-09-28 21:08:15 -07:00
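These metrics are labeled counters in the usual Prometheus style. A `prometheus_client` sketch of the kind of counter the commits describe; the label names here are illustrative, not LiteLLM's exact schema:

```python
from prometheus_client import Counter

deployment_failures = Counter(
    "litellm_deployment_failure_responses_total",
    "Failed responses from LLM deployments",
    ["api_key_alias", "team_alias", "exception_status", "exception_class"],
)

# e.g. on a rate-limited request:
deployment_failures.labels("my-key-alias", "my-team", "429", "RateLimitError").inc()
```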
Krrish Dholakia
6c7d1d5c96 fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set) 2024-09-28 21:08:15 -07:00
Krrish Dholakia
fa64b6ca24 test: run test first 2024-09-28 21:08:15 -07:00
Krrish Dholakia
392e5c538e test: refactor test 2024-09-28 21:08:15 -07:00