Commit graph

18126 commits

Author SHA1 Message Date
Krish Dholakia
04e5963b65
Litellm expose disable schema update flag (#6085)
* fix: enable new 'disable_prisma_schema_update' flag

* build(config.yml): remove setup remote docker step

* ci(config.yml): give container time to start up

* ci(config.yml): update test

* build(config.yml): actually start docker

* build(config.yml): simplify grep check

* fix(prisma_client.py): support reading disable_schema_update via env vars

* ci(config.yml): add test to check if all general settings are documented

* build(test_General_settings.py): check available dir

* ci: check ../ repo path

* build: check ./

* build: fix test
2024-10-05 21:26:51 -04:00
Krish Dholakia
f2c0a31e3c
LiteLLM Minor Fixes & Improvements (10/05/2024) (#6083)
* docs(prompt_caching.md): add prompt caching cost calc example to docs

* docs(prompt_caching.md): add proxy examples to docs

* feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching

* docs(prompt_caching.md): add docs on checking model support for prompt caching

* build: fix invalid json
2024-10-05 18:59:11 -04:00
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors

* fix: fix additional type-checking errors

* fix: additional type-checking error fixes

* fix: fix additional type-checking errors

* fix: additional type-check fixes

* fix: fix all type-checking errors + add pyright to ci/cd

* fix: fix incorrect import

* ci(config.yml): use mypy on ci/cd

* fix: fix type-checking errors in utils.py

* fix: fix all type-checking errors on main.py

* fix: fix mypy linting errors

* fix(anthropic/cost_calculator.py): fix linting errors

* fix: fix mypy linting errors

* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff
f7ce1173f3 bump: version 1.48.15 → 1.48.16 2024-10-05 16:59:16 +05:30
Ishaan Jaff
3cb04480fb
(code clean up) use a folder for gcs bucket logging + add readme in folder (#6080)
* refactor gcs bucket

* add readme
2024-10-05 16:58:10 +05:30
Ishaan Jaff
6e6d38841f docs fix 2024-10-05 15:25:25 +05:30
GTonehour
d533acd24a
openrouter/openai's litellm_provider should be openrouter, not openai (#6079)
In model_prices_and_context_window.json, all openrouter/* models have litellm_provider set to "openrouter", except for four openrouter/openai/* models, which were set to "openai".
I suppose they should also be set to "openrouter", so that it is clear the OpenRouter API should be used for these models.
2024-10-05 15:20:44 +05:30
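
The inconsistency described in the commit above could be detected with a check along these lines. This is a minimal sketch, not code from the repository: the function name `find_provider_mismatches` and the sample entries are hypothetical, and only the `litellm_provider` key and the `openrouter/` prefix come from the commit message.

```python
# Hypothetical excerpt shaped like model_prices_and_context_window.json.
# Model names and values are illustrative, not copied from the real file.
model_cost_map = {
    "openrouter/example/model-a": {"litellm_provider": "openrouter"},
    "openrouter/openai/example-b": {"litellm_provider": "openai"},
    "example-c": {"litellm_provider": "openai"},
}


def find_provider_mismatches(cost_map, prefix="openrouter/", expected="openrouter"):
    """Return the sorted keys under `prefix` whose litellm_provider differs from `expected`."""
    return sorted(
        name
        for name, entry in cost_map.items()
        if name.startswith(prefix) and entry.get("litellm_provider") != expected
    )


# Only the prefixed entry with the wrong provider is reported.
mismatches = find_provider_mismatches(model_cost_map)
print(mismatches)
```

Running the same function against the real JSON file (loaded with `json.load`) would list any remaining entries that need the fix.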
Ishaan Jaff
ab0b536143
(feat) add azure openai cost tracking for prompt caching (#6077)
* add azure o1 models to model cost map

* add azure o1 cost tracking

* fix azure cost calc

* add get llm provider test
2024-10-05 15:04:18 +05:30
Ishaan Jaff
7267852511 linting error fix 2024-10-05 15:03:39 +05:30
Ishaan Jaff
5ee1342d37
(docs) reference router settings general settings etc (#6078) 2024-10-05 15:01:28 +05:30
Ishaan Jaff
d2f17cf97c docs routing config table 2024-10-05 14:40:07 +05:30
Ishaan Jaff
530915da51 add o-1 to Azure docs 2024-10-05 14:23:54 +05:30
Ishaan Jaff
3682f661d8
(feat) add cost tracking for OpenAI prompt caching (#6055)
* add cache_read_input_token_cost for prompt caching models

* add prompt caching for latest models

* add openai cost calculator

* add openai prompt caching test

* fix lint check

* add note on how usage._cache_read_input_tokens is used

* fix cost calc whisper openai

* use output_cost_per_second

* add input_cost_per_second
2024-10-05 14:20:15 +05:30
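
To illustrate what a per-token cache-read price means for cost tracking, here is a minimal sketch. It is not the calculator added in the commit above: the function `prompt_caching_cost` and the prices are hypothetical, `input_cost_per_token` and `output_cost_per_token` are assumed key names alongside the `cache_read_input_token_cost` key named in the commit, and the sketch assumes the reported prompt token count already includes the cached tokens.

```python
def prompt_caching_cost(prompt_tokens, cached_tokens, completion_tokens, pricing):
    """Price cached prompt tokens at the cache-read rate and the rest at the normal input rate.

    Assumption: `prompt_tokens` includes `cached_tokens`.
    """
    if cached_tokens > prompt_tokens:
        raise ValueError("cached_tokens cannot exceed prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * pricing["input_cost_per_token"]
        + cached_tokens * pricing["cache_read_input_token_cost"]
        + completion_tokens * pricing["output_cost_per_token"]
    )


# Illustrative prices only; real values live in the model cost map.
example_pricing = {
    "input_cost_per_token": 2e-6,
    "cache_read_input_token_cost": 1e-6,
    "output_cost_per_token": 8e-6,
}

cost = prompt_caching_cost(
    prompt_tokens=1000, cached_tokens=400, completion_tokens=100, pricing=example_pricing
)
print(f"{cost:.6f}")
```

With these example numbers, 600 uncached tokens, 400 cached tokens, and 100 completion tokens come to roughly 0.0024 in total, compared with 0.0028 if every prompt token were billed at the full input rate.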
Ishaan Jaff
930606ad63
add azure o1 models to model cost map (#6075) 2024-10-05 13:22:06 +05:30
Ishaan Jaff
c84cfe977e
(feat) add /key/health endpoint to test key based logging (#6073)
* add /key/health endpoint

* add /key/health endpoint

* fix return from /key/health

* update doc string

* fix doc string for /key/health

* add test for /key/health

* fix linting

* docs /key/health
2024-10-05 11:56:55 +05:30
Krish Dholakia
4e921bee2b
fix(gcs_bucket.py): show error response text in exception (#6072) 2024-10-05 11:56:43 +05:30
Krrish Dholakia
4c9dea9f36 bump: version 1.48.14 → 1.48.15 2024-10-04 21:32:45 -04:00
Krish Dholakia
2e5c46ef6d
LiteLLM Minor Fixes & Improvements (10/04/2024) (#6064)
* fix(litellm_logging.py): ensure cache hits are scrubbed if 'turn_off_message_logging' is enabled

* fix(sagemaker.py): fix streaming to raise error immediately

Fixes https://github.com/BerriAI/litellm/issues/6054

* (fixes)  gcs bucket key based logging  (#6044)

* fixes for gcs bucket logging

* fix StandardCallbackDynamicParams

* fix - gcs logging when payload is not serializable

* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket

* working success callbacks

* linting fixes

* fix linting error

* add type hints to functions

* fixes for dynamic success and failure logging

* fix for test_async_chat_openai_stream

* fix handle case when key based logging vars are set as os.environ/ vars

* fix prometheus track cooldown events on custom logger (#6060)

* (docs) add 1k rps load test doc  (#6059)

* docs 1k rps load test

* docs load testing

* docs load testing litellm

* docs load testing

* clean up load test doc

* docs prom metrics for load testing

* docs using prometheus on load testing

* doc load testing with prometheus

* (fixes) docs + qa - gcs key based logging  (#6061)

* fixes for required values for gcs bucket

* docs gcs bucket logging

* bump: version 1.48.12 → 1.48.13

* ci/cd run again

* bump: version 1.48.13 → 1.48.14

* update load test doc

* (docs) router settings - on litellm config  (#6037)

* add yaml with all router settings

* add docs for router settings

* docs router settings litellm settings

* (feat)  OpenAI prompt caching models to model cost map (#6063)

* add prompt caching for latest models

* add cache_read_input_token_cost for prompt caching models

* fix(litellm_logging.py): check if param is iterable

Fixes https://github.com/BerriAI/litellm/issues/6025#issuecomment-2393929946

* fix(factory.py): support passing an 'assistant_continue_message' to prevent bedrock error

Fixes https://github.com/BerriAI/litellm/issues/6053

* fix(databricks/chat): handle streaming responses

* fix(factory.py): fix linting error

* fix(utils.py): unify anthropic + deepseek prompt caching information to openai format

Fixes https://github.com/BerriAI/litellm/issues/6069

* test: fix test

* fix(types/utils.py): support all openai roles

Fixes https://github.com/BerriAI/litellm/issues/6052

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-04 21:28:53 -04:00
Ishaan Jaff
fc6e0dd6cb
(feat) OpenAI prompt caching models to model cost map (#6063)
* add prompt caching for latest models

* add cache_read_input_token_cost for prompt caching models
2024-10-04 19:12:13 +05:30
Ishaan Jaff
6d1de8e1ee
(docs) router settings - on litellm config (#6037)
* add yaml with all router settings

* add docs for router settings

* docs router settings litellm settings
2024-10-04 18:59:01 +05:30
Ishaan Jaff
0c9c42915f update load test doc 2024-10-04 18:47:26 +05:30
Ishaan Jaff
45a981a37e bump: version 1.48.13 → 1.48.14 2024-10-04 17:19:33 +05:30
Ishaan Jaff
3c59d188ef ci/cd run again 2024-10-04 17:19:26 +05:30
Ishaan Jaff
69c96d9ba4 bump: version 1.48.12 → 1.48.13 2024-10-04 17:18:49 +05:30
Ishaan Jaff
e394ed1e5b
(fixes) docs + qa - gcs key based logging (#6061)
* fixes for required values for gcs bucket

* docs gcs bucket logging
2024-10-04 16:58:04 +05:30
Ishaan Jaff
2449d258cf
(docs) add 1k rps load test doc (#6059)
* docs 1k rps load test

* docs load testing

* docs load testing litellm

* docs load testing

* clean up load test doc

* docs prom metrics for load testing

* docs using prometheus on load testing

* doc load testing with prometheus
2024-10-04 16:56:34 +05:30
Ishaan Jaff
224460d4c9
fix prometheus track cooldown events on custom logger (#6060) 2024-10-04 16:56:22 +05:30
Ishaan Jaff
6e97bc4404 fix handle case when key based logging vars are set as os.environ/ vars 2024-10-04 12:10:25 +05:30
Ishaan Jaff
670ecda4e2
(fixes) gcs bucket key based logging (#6044)
* fixes for gcs bucket logging

* fix StandardCallbackDynamicParams

* fix - gcs logging when payload is not serializable

* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket

* working success callbacks

* linting fixes

* fix linting error

* add type hints to functions

* fixes for dynamic success and failure logging

* fix for test_async_chat_openai_stream
2024-10-04 11:56:10 +05:30
Krrish Dholakia
793593e735 docs(realtime.md): add new /v1/realtime endpoint 2024-10-03 22:44:02 -04:00
Krish Dholakia
09f0c09ba4
fix(utils.py): return openai streaming prompt caching tokens (#6051)
* fix(utils.py): return openai streaming prompt caching tokens

Closes https://github.com/BerriAI/litellm/issues/6038

* fix(main.py): fix error in finish_reason updates
2024-10-03 22:20:13 -04:00
Krrish Dholakia
04ae095860 build: version bump 2024-10-03 21:42:49 -04:00
ls-marek-kerka
db55098a33
🔧 (model_prices_and_context_window.json): rename gemini-pro-flash to gemini-flash-experimental to reflect updated naming convention (#5980)
Co-authored-by: Marek Keřka <marek.kerka@gmail.com>
2024-10-03 18:06:39 -04:00
Krish Dholakia
5c33d1c9af
Litellm Minor Fixes & Improvements (10/03/2024) (#6049)
* fix(proxy_server.py): remove spendlog fixes from proxy startup logic

Moves https://github.com/BerriAI/litellm/pull/4794 to `/db_scripts` and cleans up some caching-related debug info, so debug logs are easier to trace

* fix(langfuse_endpoints.py): Fixes https://github.com/BerriAI/litellm/issues/6041

* fix(azure.py): fix health checks for azure audio transcription models

Fixes https://github.com/BerriAI/litellm/issues/5999

* Feat: Add Literal AI Integration (#5653)

* feat: add Literal AI integration

* update readme

* Update README.md

* fix: address comments

* fix: remove literalai sdk

* fix: use HTTPHandler

* chore: add test

* fix: add asyncio lock

* fix(literal_ai.py): fix linting errors

* fix(literal_ai.py): fix linting errors

* refactor: cleanup

---------

Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
2024-10-03 18:02:28 -04:00
Krish Dholakia
f9d0bcc5a1
OpenAI /v1/realtime api support (#6047)
* feat(azure/realtime): initial working commit for proxy azure openai realtime endpoint support

Adds support for passing /v1/realtime calls via litellm proxy

* feat(realtime_api/main.py): abstraction for handling openai realtime api calls

* feat(router.py): add `arealtime()` endpoint in router for realtime api calls

Allows using `model_list` in proxy for realtime as well

* fix: make realtime api a private function

Structure might change based on feedback. Make that clear to users.

* build(requirements.txt): add websockets to the requirements.txt

* feat(openai/realtime): add openai /v1/realtime api support
2024-10-03 17:11:22 -04:00
Ishaan Jaff
130842537f bump: version 1.48.10 → 1.48.11 2024-10-03 23:32:08 +05:30
Ishaan Jaff
4e88fd65e1
(feat) openai prompt caching (non streaming) - add prompt_tokens_details in usage response (#6039)
* add prompt_tokens_details in usage response

* use _prompt_tokens_details as a param in Usage

* fix linting errors

* fix type error

* fix ci/cd deps

* bump deps for openai

* bump deps openai

* fix llm translation testing

* fix llm translation embedding
2024-10-03 23:31:10 +05:30
Krish Dholakia
9fccb4a0da
fix(factory.py): bedrock: merge consecutive tool + user messages (#6028)
* fix(factory.py): bedrock:  merge consecutive tool + user messages

Fixes https://github.com/BerriAI/litellm/issues/6007

* LiteLLM Minor Fixes & Improvements (10/02/2024)  (#6023)

* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettsburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

* fix(factory.py): correctly handle content in tool block

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-03 09:16:25 -04:00
Ishaan Jaff
1ab886f80d
(contributor PRs) oct 3rd, 2024 (#6034)
* Do not skip important tests for OIDC. (#6017)

* [Bug] Skip monthly slack alert if there was no spend (#6015)

* Fix: skip slack alert if there was no spend

* Skip monthly report when there was no spend

---------

Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paz <paz@tryolabs.com>
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-03 17:12:34 +05:30
Ishaan Jaff
d92696a303
(feat) add nvidia nim embeddings (#6032)
* nvidia nim support embedding config

* add nvidia config in init

* nvidia nim embeddings

* docs nvidia nim embeddings

* docs embeddings on nvidia nim

* fix llm translation test
2024-10-03 17:12:14 +05:30
Ishaan Jaff
05df9cc6d0 docs prometheus metrics 2024-10-03 16:31:29 +05:30
Ishaan Jaff
21e05a0f3e
(feat proxy) add key based logging for GCS bucket (#6031)
* init litellm langfuse / gcs credentials in litellm logging obj

* add gcs key based test

* rename vars

* save standard_callback_dynamic_params in model call details

* add working gcs bucket key based logging

* test_basic_gcs_logging_per_request

* linting fix

* add doc on gcs  bucket team based logging
2024-10-03 15:24:31 +05:30
Ishaan Jaff
835db6ae98
(load testing) add vertex_ai embeddings load test (#6004)
* use vertex llm as base class for embeddings

* use correct vertex class in main.py

* set_headers in vertex llm base

* add types for vertex embedding requests

* add embedding handler for vertex

* use async mode for vertex embedding tests

* use vertexAI textEmbeddingConfig

* fix linting

* add sync and async mode testing for vertex ai embeddings

* add basic load test

* add vertex ai load test on ci cd
2024-10-03 14:39:15 +05:30
Krish Dholakia
f8d9be1301
(azure): Enable stream_options for Azure OpenAI. (#6024) (#6029)
* (azure): Enable stream_options for Azure OpenAI. (#6024)

* LiteLLM Minor Fixes & Improvements (10/02/2024)  (#6023)

* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettsburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
2024-10-02 22:59:14 -04:00
Krrish Dholakia
74647a5227 bump: version 1.48.9 → 1.48.10 2024-10-02 22:13:57 -04:00
Krish Dholakia
14165d3648
LiteLLM Minor Fixes & Improvements (10/02/2024) (#6023)
* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettsburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-02 22:00:28 -04:00
David Manouchehri
8995ff49ae
(testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018) 2024-10-02 12:02:22 -04:00
Krrish Dholakia
121b493fe8 docs(code_quality.md): add doc on litellm code qa 2024-10-02 11:20:15 -04:00
Krrish Dholakia
e19bb55e3b bump: version 1.48.8 → 1.48.9 2024-10-01 23:52:16 -04:00
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00