litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-25 02:34:29 +00:00

Author	SHA1	Message	Date
Krish Dholakia	ac4f32fb1e	Cost tracking for `gemini-2.5-pro` (#9837 ) * build(model_prices_and_context_window.json): add google/gemini-2.0-flash-lite-001 versioned pricing Closes https://github.com/BerriAI/litellm/issues/9829 * build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param * build(model_prices_and_context_window.json): add initial support for 'supported_output_modalities' param * build(model_prices_and_context_window.json): add supported endpoints to gemini-2.5-pro * build(model_prices_and_context_window.json): add gemini 200k+ pricing * feat(utils.py): support cost calculation for gemini-2.5-pro above 200k tokens Fixes https://github.com/BerriAI/litellm/issues/9807 * build: test dockerfile change * build: revert apk change * ci(config.yml): pip install wheel * ci: test problematic package first * ci(config.yml): pip install only binary * ci: try more things * ci: test different ml_dtypes version * ci(config.yml): check ml_dtypes==0.4.0 * ci: test * ci: cleanup config.yml * ci: specify ml dtypes in requirements.txt * ci: remove redisvl depedency (temporary) * fix: fix linting errors * test: update test * test: fix test	2025-04-09 18:48:43 -07:00
Krish Dholakia	6ba3c4a4f8	VertexAI non-jsonl file storage support (#9781 ) * test: add initial e2e test * fix(vertex_ai/files): initial commit adding sync file create support * refactor: initial commit of vertex ai non-jsonl files reaching gcp endpoint * fix(vertex_ai/files/transformation.py): initial working commit of non-jsonl file call reaching backend endpoint * fix(vertex_ai/files/transformation.py): working e2e non-jsonl file upload * test: working e2e jsonl call * test: unit testing for jsonl file creation * fix(vertex_ai/transformation.py): reset file pointer after read allow multiple reads on same file object * fix: fix linting errors * fix: fix ruff linting errors * fix: fix import * fix: fix linting error * fix: fix linting error * fix(vertex_ai/files/transformation.py): fix linting error * test: update test * test: update tests * fix: fix linting errors * fix: fix test * fix: fix linting error	2025-04-09 14:01:48 -07:00
Krish Dholakia	ac9f03beae	Allow passing `thinking` param to litellm proxy via client sdk + Code QA Refactor on get_optional_params (get correct values) (#9386 ) * fix(litellm_proxy/chat/transformation.py): support 'thinking' param Fixes https://github.com/BerriAI/litellm/issues/9380 * feat(azure/gpt_transformation.py): add azure audio model support Closes https://github.com/BerriAI/litellm/issues/6305 * fix(utils.py): use provider_config in common functions * fix(utils.py): add missing provider configs to get_chat_provider_config * test: fix test * fix: fix path * feat(utils.py): make bedrock invoke nova config baseconfig compatible * fix: fix linting errors * fix(azure_ai/transformation.py): remove buggy optional param filtering for azure ai Removes incorrect check for support tool choice when calling azure ai - prevented calling models with response_format unless on litell model cost map * fix(amazon_cohere_transformation.py): fix bedrock invoke cohere transformation to inherit from coherechatconfig * test: fix azure ai tool choice mapping * fix: fix model cost map to add 'supports_tool_choice' to cohere models * fix(get_supported_openai_params.py): check if custom llm provider in llm providers * fix(get_supported_openai_params.py): fix llm provider in list check * fix: fix ruff check errors * fix: support defs when calling bedrock nova * fix(factory.py): fix test	2025-04-07 21:04:11 -07:00
Krish Dholakia	fcf17d114f	Litellm dev 04 05 2025 p2 (#9774 ) * test: move test to just checking async * fix(transformation.py): handle function call with no schema * fix(utils.py): handle pydantic base model in message tool calls Fix https://github.com/BerriAI/litellm/issues/9321 * fix(vertex_and_google_ai_studio.py): handle tools=[] Fixes https://github.com/BerriAI/litellm/issues/9080 * test: remove max token restriction * test: fix basic test * fix(get_supported_openai_params.py): fix check * fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0 * fix: fix test * fix: parse out empty dictionary on dbrx streaming + tool calls * fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param * fix: fix ruff check ' * fix: handle no strict in function * fix: revert bedrock change - handle in separate PR	2025-04-07 21:02:52 -07:00
Krish Dholakia	34bdf36eab	Add inference providers support for Hugging Face (#8258 ) (#9738 ) (#9773 ) * Add inference providers support for Hugging Face (#8258) * add first version of inference providers for huggingface * temporarily skipping tests * Add documentation * Fix titles * remove max_retries from params and clean up * add suggestions * use llm http handler * update doc * add suggestions * run formatters * add tests * revert * revert * rename file * set maxsize for lru cache * fix embeddings * fix inference url * fix tests following breaking change in main * use ChatCompletionRequest * fix tests and lint * [Hugging Face] Remove outdated chat completion tests and fix embedding tests (#9749) * remove or fix tests * fix link in doc * fix(config_settings.md): document hf api key --------- Co-authored-by: célina <hanouticelina@gmail.com>	2025-04-05 10:50:15 -07:00
Krish Dholakia	8ee32291e0	Squashed commit of the following: (#9709 ) commit `b12a9892b7` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Apr 2 08:09:56 2025 -0700 fix(utils.py): don't modify openai_token_counter commit `294de31803` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 21:22:40 2025 -0700 fix: fix linting error commit `cb6e9fbe40` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 19:52:45 2025 -0700 refactor: complete migration commit `bfc159172d` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 19:09:59 2025 -0700 refactor: refactor more constants commit `43ffb6a558` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:45:24 2025 -0700 fix: test commit `04dbe4310c` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:28:58 2025 -0700 refactor: refactor: move more constants into constants.py commit `3c26284aff` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:14:46 2025 -0700 refactor: migrate hardcoded constants out of __init__.py commit `c11e0de69d` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:11:21 2025 -0700 build: migrate all constants into constants.py commit `7882bdc787` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Mar 24 18:07:37 2025 -0700 build: initial test banning hardcoded numbers in repo	2025-04-02 21:24:54 -07:00
Ishaan Jaff	acf920a41a	Merge branch 'main' into litellm_fix_azure_o_series	2025-04-02 20:58:52 -07:00
Krish Dholakia	053b0e741f	Add Google AI Studio `/v1/files` upload API support (#9645 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 16s Details Helm unit test / unit-test (push) Successful in 23s Details * test: fix import for test * fix: fix bad error string * docs: cleanup files docs * fix(files/main.py): cleanup error string * style: initial commit with a provider/config pattern for files api google ai studio files api onboarding * fix: test * feat(gemini/files/transformation.py): support gemini files api response transformation * fix(gemini/files/transformation.py): return file id as gemini uri allows id to be passed in to chat completion request, just like openai * feat(llm_http_handler.py): support async route for files api on llm_http_handler * fix: fix linting errors * fix: fix model info check * fix: fix ruff errors * fix: fix linting errors * Revert "fix: fix linting errors" This reverts commit `926a5a527f`. * fix: fix linting errors * test: fix test * test: fix tests	2025-04-02 08:56:58 -07:00
Krish Dholakia	453003c378	fix(gemini/): add gemini/ route optional param mapping support (#9677 ) Fixes https://github.com/BerriAI/litellm/issues/9654	2025-04-02 08:56:32 -07:00
Krish Dholakia	23051d89dd	fix(streaming_handler.py): fix completion start time tracking (#9688 ) * fix(streaming_handler.py): fix completion start time tracking Fixes https://github.com/BerriAI/litellm/issues/9210 * feat(anthropic/chat/transformation.py): map openai 'reasoning_effort' to anthropic 'thinking' param Fixes https://github.com/BerriAI/litellm/issues/9022 * feat: map 'reasoning_effort' to 'thinking' param across bedrock + vertex Closes https://github.com/BerriAI/litellm/issues/9022#issuecomment-2705260808	2025-04-01 22:00:56 -07:00
Ishaan Jaff	5f286fe147	fix _check_valid_arg	2025-04-01 21:20:31 -07:00
Ishaan Jaff	f7129e5e59	fix _apply_openai_param_overrides	2025-04-01 21:17:59 -07:00
Ishaan Jaff	9acda77b75	add allowed_openai_params	2025-04-01 19:54:35 -07:00
Krish Dholakia	9b7ebb6a7d	build(pyproject.toml): add new dev dependencies - for type checking (#9631 ) * build(pyproject.toml): add new dev dependencies - for type checking * build: reformat files to fit black * ci: reformat to fit black * ci(test-litellm.yml): make tests run clear * build(pyproject.toml): add ruff * fix: fix ruff checks * build(mypy/): fix mypy linting errors * fix(hashicorp_secret_manager.py): fix passing cert for tls auth * build(mypy/): resolve all mypy errors * test: update test * fix: fix black formatting * build(pre-commit-config.yaml): use poetry run black * fix(proxy_server.py): fix linting error * fix: fix ruff safe representation error	2025-03-29 11:02:13 -07:00
Krish Dholakia	ccbac691e5	Support discovering gemini, anthropic, xai models by calling their `/v1/model` endpoint (#9530 ) * fix: initial commit for adding provider model discovery to gemini * feat(gemini/): add model discovery for gemini/ route * docs(set_keys.md): update docs to show you can check available gemini models as well * feat(anthropic/): add model discovery for anthropic api key * feat(xai/): add model discovery for XAI enables checking what models an xai key can call * ci: bump ci config yml * fix(topaz/common_utils.py): fix linting error * fix: fix linting error for python38	2025-03-27 22:50:48 -07:00
Krish Dholakia	c0845fec1f	Add OpenAI gpt-4o-transcribe support (#9517 ) * refactor: introduce new transformation config for gpt-4o-transcribe models * refactor: expose new transformation configs for audio transcription * ci: fix config yml * feat(openai/transcriptions): support provider config transformation on openai audio transcriptions allows gpt-4o and whisper audio transformation to work as expected * refactor: migrate fireworks ai + deepgram to new transform request pattern * feat(openai/): working support for gpt-4o-audio-transcribe * build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map * build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions` * fix(get_supported_openai_params.py): fix return * refactor(deepgram/): migrate unit test to deepgram handler * refactor: cleanup unused imports * fix(get_supported_openai_params.py): fix linting error * test: update test	2025-03-26 23:10:25 -07:00
Ishaan Jaff	cf22d31b2b	search_context_cost_per_query	2025-03-22 14:52:58 -07:00
Ishaan Jaff	7dd37a5b18	fix supports_web_search	2025-03-22 14:02:51 -07:00
Ishaan Jaff	1d7accce9e	test_supports_web_search	2025-03-22 13:49:35 -07:00
Ishaan Jaff	c4cbfd5716	supports_web_search	2025-03-22 13:01:41 -07:00
Ishaan Jaff	4b4a0b2612	supports_native_streaming	2025-03-20 13:52:30 -07:00
Ishaan Jaff	0352559c66	supports_native_streaming	2025-03-20 13:34:57 -07:00
Ishaan Jaff	bc174adcd0	add should_fake_stream	2025-03-20 09:54:26 -07:00
Krish Dholakia	d4caaae1be	Merge pull request #9274 from BerriAI/litellm_contributor_rebase_branch All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 43s Details Helm unit test / unit-test (push) Successful in 50s Details Litellm contributor rebase branch	2025-03-14 21:57:49 -07:00
Sunny Wan	f9a5109203	Merge branch 'BerriAI:main' into main	2025-03-13 19:37:22 -04:00
Krrish Dholakia	1cd57e95aa	fix: fix linting error	2025-03-13 14:33:19 -07:00
Tomer Bin	4a31b32a88	Support post-call guards for stream and non-stream responses	2025-03-13 08:53:54 +02:00
Krish Dholakia	2d957a0ed9	Merge branch 'main' into litellm_dev_03_10_2025_p3	2025-03-12 14:56:01 -07:00
Ishaan Jaff	342741ede1	Merge branch 'main' into litellm_responses_api_support	2025-03-12 12:04:12 -07:00
Ishaan Jaff	b790f0a5c6	log input of response API	2025-03-11 22:34:18 -07:00
Ishaan Jaff	4d55212c62	add BaseResponsesAPIConfig	2025-03-11 15:57:53 -07:00
Krrish Dholakia	f56c5ca380	feat: working e2e credential management - support reusing existing credentials	2025-03-10 19:29:24 -07:00
Krrish Dholakia	fdd5ba3084	feat(credential_accessor.py): support loading in credentials from credential_list Resolves https://github.com/BerriAI/litellm/issues/9114	2025-03-10 17:15:58 -07:00
omrishiv	0674491386	add support for Amazon Nova Canvas model (#7838 ) * add initial support for Amazon Nova Canvas model Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * adjust name to AmazonNovaCanvas and map function variables to config Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * tighten model name check Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * fix quality mapping Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * add premium quality in config Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * support all Amazon Nova Canvas tasks * remove unused import Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * add tests for image generation tasks and fix payload Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * add missing util file Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * update model prices backup file Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> * remove image tasks other than text->image Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> --------- Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>	2025-03-10 08:02:00 -07:00
Krish Dholakia	4330ef8e81	Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 42s Details * feat(batches/): fix batch cost calculation - ensure it's accurate use the correct cost value - prev. defaulting to non-batch cost * feat(batch_utils.py): log batch models to spend logs + standard logging payload makes it easy to understand how cost was calculated * fix: fix stored payload for test * test: fix test	2025-03-08 11:47:25 -08:00
Krish Dholakia	0e3caf92b9	UI - new API Playground for testing LiteLLM translation (#9073 ) * feat: initial commit - enable dev to see translated request * feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm * feat(transform_request.tsx): allow user to see their transformed request * refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body easier to render each individually on UI vs. extracting from combined string * feat: transform_request.tsx working e2e raw request viewing * fix(litellm_logging.py): fix transform viewing for bedrock models * fix(litellm_logging.py): don't return sensitive headers in raw request headers prevent accidental leak * feat(transform_request.tsx): style improvements	2025-03-07 19:39:31 -08:00
Ishaan Jaff	f47987e673	(Refactor) `/v1/messages` to follow simpler logic for Anthropic API spec (#9013 ) * anthropic_messages_handler v0 * fix /messages * working messages with router methods * test_anthropic_messages_handler_litellm_router_non_streaming * test_anthropic_messages_litellm_router_non_streaming_with_logging * AnthropicMessagesConfig * _handle_anthropic_messages_response_logging * working with /v1/messages endpoint * working /v1/messages endpoint * refactor to use router factory function * use aanthropic_messages * use BaseConfig for Anthropic /v1/messages * track api key, team on /v1/messages endpoint * fix get_logging_payload * BaseAnthropicMessagesTest * align test config * test_anthropic_messages_with_thinking * test_anthropic_streaming_with_thinking * fix - display anthropic url for debugging * test_bad_request_error_handling * test_anthropic_messages_router_streaming_with_bad_request * fix ProxyException * test_bad_request_error_handling_streaming * use provider_specific_header * test_anthropic_messages_with_extra_headers * test_anthropic_messages_to_wildcard_model * fix gcs pub sub test * standard_logging_payload * fix unit testing for anthopic /v1/messages support * fix pass through anthropic messages api * delete dead code * fix anthropic pass through response * revert change to spend tracking utils * fix get_litellm_metadata_from_kwargs * fix spend logs payload json * proxy_pass_through_endpoint_tests * TestAnthropicPassthroughBasic * fix pass through tests * test_async_vertex_proxy_route_api_key_auth * _handle_anthropic_messages_response_logging * vertex_credentials * test_set_default_vertex_config * test_anthropic_messages_litellm_router_non_streaming_with_logging * test_ageneric_api_call_with_fallbacks_basic * test__aadapter_completion	2025-03-06 00:43:08 -08:00
Sunny Wan	a2fed4059e	added Snowflake config to ProviderConfigManager	2025-03-05 20:32:18 -05:00
Krish Dholakia	662c59adcf	Support caching on reasoning content + other fixes (#8973 ) * fix(factory.py): pass on anthropic thinking content from assistant call * fix(factory.py): fix anthropic messages to handle thinking blocks Fixes https://github.com/BerriAI/litellm/issues/8961 * fix(factory.py): fix bedrock handling for assistant content in messages Fixes https://github.com/BerriAI/litellm/issues/8961 * feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block ensures caching works for anthropic thinking block * fix(convert_dict_to_response.py): pass all message params to delta block ensures streaming delta also contains the reasoning content / thinking block * test(test_prompt_factory.py): remove redundant test anthropic now supports assistant as the first message * fix(factory.py): fix linting errors * fix: fix code qa * test: remove falsy test * fix(litellm_logging.py): fix str conversion	2025-03-04 21:12:16 -08:00
Krrish Dholakia	8ea3d4c046	build: merge litellm_dev_03_01_2025_p2	2025-03-03 23:05:41 -08:00
Krish Dholakia	a65bfab697	Fix calling claude via invoke route + response_format support for claude on invoke route (#8908 ) * fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route move to using anthropic config as base * fix(utils.py): expose anthropic config via providerconfigmanager * fix(llm_http_handler.py): support json mode on async completion calls * fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke * fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config Prevents error when passing in 'response_format: {"type": "text"} * test: fix test * fix(utils.py): fix base invoke provider check * fix(anthropic_claude3_transformation.py): don't pass 'stream' param * fix: fix linting errors * fix(converse_transformation.py): handle response_format type=text for converse	2025-02-28 17:56:26 -08:00
Krish Dholakia	017c482d7b	fix(o_series_transformation.py): fix optional param check for o-serie… (#8787 ) * fix(o_series_transformation.py): fix optional param check for o-series models o3-mini and o-1 do not support parallel tool calling * fix(utils.py): support 'drop_params' for 'thinking' param across models allows switching to older claude versions (or non-anthropic models) and param to be safely dropped * fix: fix passing thinking param in optional params allows dropping thinking_param where not applicable * test: update old model * fix(utils.py): fix linting errors * fix(main.py): add param to acompletion	2025-02-26 12:26:55 -08:00
Krrish Dholakia	fcf4ea3608	build: merge squashed commit Squashed commit of the following: commit `6678e15381` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 09:29:15 2025 -0800 test_prompt_caching commit `bd86e0ac47` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 08:57:16 2025 -0800 test_prompt_caching commit `2fc21ad51e` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 08:13:45 2025 -0800 test_aprompt_caching commit `d94cff55ff` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 08:13:12 2025 -0800 test_prompt_caching commit `49c5e7811e` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 07:43:53 2025 -0800 ui new build commit `cb8d5e5917` Author: Ishaan Jaff <ishaanjaffer0324@gmail.com> Date: Wed Feb 26 07:38:56 2025 -0800 (UI) - Create Key flow for existing users (#8844) * working create user button * working create user for a key flow * allow searching users * working create user + key * use clear sections on create key * better search for users * fix create key * ui fix create key button - make it neater / cleaner * ui fix all keys table commit `335ba30467` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Wed Feb 26 08:53:17 2025 -0800 fix: fix file name commit `b8c5b31a4e` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Tue Feb 25 22:54:46 2025 -0800 fix: fix utils commit `ac6e503461` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Feb 24 10:43:31 2025 -0800 fix(main.py): fix openai message for assistant msg if role is missing - openai allows this Fixes https://github.com/BerriAI/litellm/issues/8661 commit `de3989dbc5` Author: Krrish Dholakia <krrishdholakia@gmail.com> Date: Mon Feb 24 21:19:25 2025 -0800 fix(get_litellm_params.py): handle no-log being passed in via kwargs Fixes https://github.com/BerriAI/litellm/issues/8380	2025-02-26 09:39:27 -08:00
Krish Dholakia	09462ba80c	Add cohere v2/rerank support (#8421 ) (#8605 ) * Add cohere v2/rerank support (#8421) * Support v2 endpoint cohere rerank * Add tests and docs * Make v1 default if old params used * Update docs * Update docs pt 2 * Update tests * Add e2e test * Clean up code * Use inheritence for new config * Fix linting issues (#8608) * Fix cohere v2 failing test + linting (#8672) * Fix test and unused imports * Fix tests * fix: fix linting errors * test: handle tgai instability * fix: skip service unavailable err * test: print logs for unstable test * test: skip unreliable tests --------- Co-authored-by: vibhavbhat <vibhavb00@gmail.com>	2025-02-22 22:25:29 -08:00
Krish Dholakia	b682dc4ec8	Add cost tracking for rerank via bedrock (#8691 ) * feat(bedrock/rerank): infer model region if model given as arn * test: add unit testing to ensure bedrock region name inferred from arn on rerank * feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137 * test(test_bedrock_completion.py): add testing for bedrock cohere rerank * feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking * build(model_prices_and_context_window.json): add amazon.rerank model to model cost map * fix(cost_calculator.py): bedrock/common_utils.py get base model from model w/ arn -> handles rerank model * build(model_prices_and_context_window.json): add bedrock cohere rerank pricing * feat(bedrock/rerank): migrate bedrock config to basererank config * Revert "feat(bedrock/rerank): migrate bedrock config to basererank config" This reverts commit `84fae1f167`. * test: add testing to ensure large doc / queries are correctly counted * Revert "test: add testing to ensure large doc / queries are correctly counted" This reverts commit `4337f1657e`. * fix(migrate-jina-ai-to-rerank-config): enables cost tracking * refactor(jina_ai/): finish migrating jina ai to base rerank config enables cost tracking * fix(jina_ai/rerank): e2e jina ai rerank cost tracking * fix: cleanup dead code * fix: fix python3.8 compatibility error * test: fix test * test: add e2e testing for azure ai rerank * fix: fix linting error * test: mark cohere as flaky	2025-02-20 21:00:18 -08:00
Krish Dholakia	f9df01fbc6	fix(utils.py): handle token counter error when invalid message passed in (#8670 ) * fix(utils.py): handle token counter error * fix(utils.py): testing fixes * fix(utils.py): fix incr for num tokens from list * fix(utils.py): fix text str token counting	2025-02-19 22:21:34 -08:00
Krrish Dholakia	9470f57e86	build: extract <think>..</think> block for amazon deepseek r1 and put in reasoning_content	2025-02-19 21:10:38 -08:00
Krish Dholakia	2340f1b31f	Pass router tags in request headers - `x-litellm-tags` (#8609 ) * feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header allow tag based routing + spend tracking via request headers * docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking * docs(tag_routing.md): add to docs * fix(utils.py): only pass str values for openai metadata param * fix(utils.py): drop non-str values for metadata param to openai preview-feature, otel span was being sent in	2025-02-18 08:26:22 -08:00
Krish Dholakia	a9276f27f9	fix(main.py): fix key leak error when unknown provider given (#8556 ) * fix(main.py): fix key leak error when unknown provider given don't return passed in args if unknown route on embedding * fix(main.py): remove instances of {args} being passed in exception prevent potential key leaks * test(code_coverage/prevent_key_leaks_in_codebase.py): ban usage of {args} in codebase * fix: fix linting errors * fix: remove unused variable	2025-02-15 14:02:55 -08:00
Ishaan Jaff	753290e5b8	(perf) Fix memory leak on `/completions` route (#8551 ) * initial mem util test * fix _cached_get_model_info_helper * test memory usage * fix tests * fix mem usage	2025-02-14 18:58:16 -08:00

1 2 3 4 5 ...

2106 commits