litellm

Author	SHA1	Message	Date
Krish Dholakia	7e5085dc7b	Litellm dev 11 21 2024 (#6837 ) * Fix Vertex AI function calling invoke: use JSON format instead of protobuf text format. (#6702) * test: test tool_call conversion when arguments is empty dict Fixes https://github.com/BerriAI/litellm/issues/6833 * fix(openai_like/handler.py): return more descriptive error message Fixes https://github.com/BerriAI/litellm/issues/6812 * test: skip overloaded model * docs(anthropic.md): update anthropic docs to show how to route to any new model * feat(groq/): fake stream when 'response_format' param is passed Groq doesn't support streaming when response_format is set * feat(groq/): add response_format support for groq Closes https://github.com/BerriAI/litellm/issues/6845 * fix(o1_handler.py): remove fake streaming for o1 Closes https://github.com/BerriAI/litellm/issues/6801 * build(model_prices_and_context_window.json): add groq llama3.2b model pricing Closes https://github.com/BerriAI/litellm/issues/6807 * fix(utils.py): fix handling ollama response format param Fixes https://github.com/BerriAI/litellm/issues/6848#issuecomment-2491215485 * docs(sidebars.js): refactor chat endpoint placement * fix: fix linting errors * test: fix test * test: fix test * fix(openai_like/handler): handle max retries * fix(streaming_handler.py): fix streaming check for openai-compatible providers * test: update test * test: correctly handle model is overloaded error * test: update test * test: fix test * test: mark flaky test --------- Co-authored-by: Guowang Li <Guowang@users.noreply.github.com>	2024-11-22 01:53:52 +05:30
Krish Dholakia	b11bc0374e	Litellm dev 11 20 2024 (#6838 ) * feat(customer_endpoints.py): support passing budget duration via `/customer/new` endpoint Closes https://github.com/BerriAI/litellm/issues/5651 * docs: add missing params to swagger + api documentation test * docs: add documentation for all key endpoints documents all params on swagger * docs(internal_user_endpoints.py): document all /user/new params Ensures all params are documented * docs(team_endpoints.py): add missing documentation for team endpoints Ensures 100% param documentation on swagger * docs(organization_endpoints.py): document all org params Adds documentation for all params in org endpoint * docs(customer_endpoints.py): add coverage for all params on /customer endpoints ensures all /customer/* params are documented * ci(config.yml): add endpoint doc testing to ci/cd * fix: fix internal_user_endpoints.py * fix(internal_user_endpoints.py): support 'duration' param * fix(partner_models/main.py): fix anthropic re-raise exception on vertex * fix: fix pydantic obj * build(model_prices_and_context_window.json): add new vertex claude model names vertex claude changed model names - causes cost tracking errors	2024-11-21 05:20:37 +05:30
David Manouchehri	a1f06de53d	Add gpt-4o-2024-11-20. (#6832 )	2024-11-21 03:48:29 +05:30
Krish Dholakia	b0be5bf3a1	LiteLLM Minor Fixes & Improvements (11/19/2024) (#6820 ) * fix(anthropic/chat/transformation.py): add json schema as values: json_schema fixes passing pydantic obj to anthropic Fixes https://github.com/BerriAI/litellm/issues/6766 * (feat): Add timestamp_granularities parameter to transcription API (#6457) * Add timestamp_granularities parameter to transcription API * add param to the local test * fix(databricks/chat.py): handle max_retries optional param handling for openai-like calls Fixes issue with calling finetuned vertex ai models via databricks route * build(ui/): add team admins via proxy ui * fix: fix linting error * test: fix test * docs(vertex.md): refactor docs * test: handle overloaded anthropic model error * test: remove duplicate test * test: fix test * test: update test to handle model overloaded error --------- Co-authored-by: Show <35062952+BrunooShow@users.noreply.github.com>	2024-11-21 00:57:58 +05:30
Krish Dholakia	cf579fe644	Litellm stable pr 10 30 2024 (#6821 ) * Update organization_endpoints.py to be able to list organizations (#6473) * Update organization_endpoints.py to be able to list organizations * Update test_organizations.py * Update test_organizations.py add test for list * Update test_organizations.py correct indentation * Add unreleased Claude 3.5 Haiku models. (#6476) --------- Co-authored-by: superpoussin22 <vincent.nadal@orange.fr> Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>	2024-11-20 05:03:42 +05:30
Ishaan Jaff	98c7889013	feat - add qwen2p5-coder-32b-instruct (#6818 )	2024-11-19 14:50:51 -08:00
Krish Dholakia	b854f6c07b	build: add gemini-exp-1114 (#6786 ) Fixes	2024-11-18 12:44:39 +05:30
paul-gauthier	73ccbc0f14	add openrouter/qwen/qwen-2.5-coder-32b-instruct (#6731 )	2024-11-15 18:08:28 -08:00
Ishaan Jaff	0f7ea14992	feat - add us.llama 3.1 models (#6760 )	2024-11-15 08:03:06 -08:00
Ishaan Jaff	c03351328f	fix imagegeneration output_cost_per_image on model cost map (#6752 )	2024-11-14 20:37:21 -08:00
Ishaan Jaff	7959dc9db3	(feat) add bedrock/stability.stable-image-ultra-v1:0 (#6723 ) * add stability.stable-image-ultra-v1:0 * add pricing for stability.stable-image-ultra-v1:0 * fix test_supports_response_schema * ci/cd run again	2024-11-14 14:47:15 -08:00
Kilian Lieret	e7543378b8	Fix: Update gpt-4o costs to that of gpt-4o-2024-08-06 (#6714 ) Fixes #6713	2024-11-12 18:40:52 -08:00
Krish Dholakia	f59cb46e71	Litellm dev 11 11 2024 (#6693 ) * fix(__init__.py): add 'watsonx_text' as mapped llm api route Fixes https://github.com/BerriAI/litellm/issues/6663 * fix(opentelemetry.py): fix passing parallel tool calls to otel Fixes https://github.com/BerriAI/litellm/issues/6677 * refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling reduces bugs in repo * fix(__init__.py): update provider-model mapping to include all known provider-model mappings Fixes https://github.com/BerriAI/litellm/issues/6669 * feat(anthropic): support passing document in llm api call * docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function * fix(factory.py): fix linting error	2024-11-12 00:16:35 +05:30
Ishaan Jaff	70aa85af1f	fix model cost map stability.sd3-large-v1:0	2024-11-08 19:51:35 -08:00
Ishaan Jaff	979dfe8ab2	(feat) Add Bedrock Stability.ai Stable Diffusion 3 Image Generation models (#6673 ) * add bedrock image gen async support * added async support for bedrock image gen * move image gen testing * add AmazonStability3Config * add AmazonStability3Config config * update AmazonStabilityConfig * update get_optional_params_image_gen * use 1 helper for _get_request_body * add transform_response_dict_to_openai_response for stability3 * test sd3-large-v1:0 * unit testing for bedrock image gen * fix load_vertex_ai_credentials * fix test_aimage_generation_vertex_ai * add stability.sd3-large-v1:0 to model cost map * add stability.stability.sd3-large-v1:0 to docs	2024-11-08 19:26:03 -08:00
David Manouchehri	a3baec081b	(pricing): Fix multiple mistakes in Claude pricing, and also increase context length allowed for Claude 3.5 Sonnet v2 on Bedrock. (#6666 )	2024-11-08 22:10:15 +05:30
Emerson Gomes	d0d29d70de	Update several Azure AI models in model cost map (#6655 ) * Adding Azure Phi 3/3.5 models to model cost map * Update gpt-4o-mini models * Adding missing Azure Mistral models to model cost map * Adding Azure Llama3.2 models to model cost map * Fix Gemini-1.5-flash pricing * Fix Gemini-1.5-flash output pricing * Fix Gemini-1.5-pro prices * Fix Gemini-1.5-flash output prices * Correct gemini-1.5-pro prices * Correction on Vertex Llama3.2 entry --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>	2024-11-08 10:41:14 +05:30
Emerson Gomes	6e4a9bb3b7	Update gpt-4o-2024-08-06, and o1-preview, o1-mini models in model cost map (#6654 ) * Adding supports_response_schema to gpt-4o-2024-08-06 models * o1 models do not support vision --------- Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>	2024-11-07 16:26:22 -08:00
Krrish Dholakia	8ce53be498	build: fix json for model map	2024-11-05 04:04:55 +05:30
Krrish Dholakia	4debf4eceb	build: fix map	2024-11-05 04:01:31 +05:30
Krrish Dholakia	cc84b09b95	build: fix map	2024-11-05 04:01:04 +05:30
paul-gauthier	7525b6bbaa	Add 3.5 haiku (#6588 ) * feat: add claude-3-5-haiku-20241022 entries * feat: add claude-3-5-haiku-20241022 and vertex_ai/claude-3-5-haiku@20241022 models * add missing entries, remove vision * remove image token costs	2024-11-05 03:45:29 +05:30
Ishaan Jaff	5652c375b3	(feat) add XAI ChatCompletion Support (#6373 ) * init commit for XAI * add full logic for xai chat completion * test_completion_xai * docs xAI * add xai/grok-beta * test_xai_chat_config_get_openai_compatible_provider_info * test_xai_chat_config_map_openai_params * add xai streaming test	2024-11-01 20:37:09 +05:30
Xingyao Wang	e16f780b7c	Add `azure/gpt-4o-mini-2024-07-18` to model_prices_and_context_window.json (#6477 )	2024-10-29 09:02:42 -07:00
Krish Dholakia	f44ab00de2	LiteLLM Minor Fixes & Improvements (10/24/2024) (#6441 ) * fix(azure.py): handle /openai/deployment in azure api base * fix(factory.py): fix faulty anthropic tool result translation check Fixes https://github.com/BerriAI/litellm/issues/6422 * fix(gpt_transformation.py): add support for parallel_tool_calls to azure Fixes https://github.com/BerriAI/litellm/issues/6440 * fix(factory.py): support anthropic prompt caching for tool results * fix(vertex_ai/common_utils): don't pop non-null required field Fixes https://github.com/BerriAI/litellm/issues/6426 * feat(vertex_ai.py): support code_execution tool call for vertex ai + gemini Closes https://github.com/BerriAI/litellm/issues/6434 * build(model_prices_and_context_window.json): Add 'supports_assistant_prefill' for bedrock claude-3-5-sonnet v2 models Closes https://github.com/BerriAI/litellm/issues/6437 * fix(types/utils.py): fix linting * test: update test to include required fields * test: fix test * test: handle flaky test * test: remove e2e test - hitting gemini rate limits	2024-10-28 15:05:20 -07:00
Ishaan Jaff	828631d6fc	add pricing for amazon.titan-embed-image-v1 (#6444 )	2024-10-28 22:01:48 +05:30
Krish Dholakia	c03e5da41f	LiteLLM Minor Fixes & Improvements (10/24/2024) (#6421 ) * fix(utils.py): support passing dynamic api base to validate_environment Returns True if just api base is required and api base is passed * fix(litellm_pre_call_utils.py): feature flag sending client headers to llm api Fixes https://github.com/BerriAI/litellm/issues/6410 * fix(anthropic/chat/transformation.py): return correct error message * fix(http_handler.py): add error response text in places where we expect it * fix(factory.py): handle base case of no non-system messages to bedrock Fixes https://github.com/BerriAI/litellm/issues/6411 * feat(cohere/embed): Support cohere image embeddings Closes https://github.com/BerriAI/litellm/issues/6413 * fix(__init__.py): fix linting error * docs(supported_embedding.md): add image embedding example to docs * feat(cohere/embed): use cohere embedding returned usage for cost calc * build(model_prices_and_context_window.json): add embed-english-v3.0 details (image cost + 'supports_image_input' flag) * fix(cohere_transformation.py): fix linting error * test(test_proxy_server.py): cleanup test * test: cleanup test * fix: fix linting errors	2024-10-25 15:55:56 -07:00
Krish Dholakia	cb2563e3c0	Litellm dev 10 22 2024 (#6384 ) * fix(utils.py): add 'disallowed_special' for token counting on .encode() Fixes error when '< endoftext >' in string * Revert "(fix) standard logging metadata + add unit testing (#6366)" (#6381) This reverts commit `8359cb6fa9`. * add new 35 mode lcard (#6378) * Add claude 3 5 sonnet 20241022 models for all provides (#6380) * Add Claude 3.5 v2 on Amazon Bedrock and Vertex AI. * added anthropic/claude-3-5-sonnet-20241022 * add new 35 mode lcard --------- Co-authored-by: Paul Gauthier <paul@paulg.com> Co-authored-by: lowjiansheng <15527690+lowjiansheng@users.noreply.github.com> * test(skip-flaky-google-context-caching-test): google is not reliable. their sample code is also not working * Fix metadata being overwritten in speech() (#6295) * fix: adding missing redis cluster kwargs (#6318) Co-authored-by: Ali Arian <ali.arian@breadfinancial.com> * Add support for `max_completion_tokens` in Azure OpenAI (#6376) Now that Azure supports `max_completion_tokens`, no need for special handling for this param and let it pass thru. 
More details: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models?tabs=python-secure#api-support * build(model_prices_and_context_window.json): add voyage-finance-2 pricing Closes https://github.com/BerriAI/litellm/issues/6371 * build(model_prices_and_context_window.json): fix llama3.1 pricing model name on map Closes https://github.com/BerriAI/litellm/issues/6310 * feat(realtime_streaming.py): just log specific events Closes https://github.com/BerriAI/litellm/issues/6267 * fix(utils.py): more robust checking if unmapped vertex anthropic model belongs to that family of models Fixes https://github.com/BerriAI/litellm/issues/6383 * Fix Ollama stream handling for tool calls with None content (#6155) * test(test_max_completions): update test now that azure supports 'max_completion_tokens' * fix(handler.py): fix linting error --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com> Co-authored-by: David Manouchehri <david.manouchehri@ai.moda> Co-authored-by: Paul Gauthier <paul@paulg.com> Co-authored-by: John HU <hszqqq12@gmail.com> Co-authored-by: Ali Arian <113945203+ali-arian@users.noreply.github.com> Co-authored-by: Ali Arian <ali.arian@breadfinancial.com> Co-authored-by: Anand Taralika <46954145+taralika@users.noreply.github.com> Co-authored-by: Nolan Tremelling <34580718+NolanTrem@users.noreply.github.com>	2024-10-22 21:18:54 -07:00
David Manouchehri	7939e930ff	Add claude 3 5 sonnet 20241022 models for all provides (#6380 ) * Add Claude 3.5 v2 on Amazon Bedrock and Vertex AI. * added anthropic/claude-3-5-sonnet-20241022 * add new 35 mode lcard --------- Co-authored-by: Paul Gauthier <paul@paulg.com> Co-authored-by: lowjiansheng <15527690+lowjiansheng@users.noreply.github.com>	2024-10-22 11:51:16 -07:00
Low Jian Sheng	21ace6de45	add new 35 mode lcard (#6378 )	2024-10-22 23:39:20 +05:30
Krish Dholakia	7cc12bd5c6	LiteLLM Minor Fixes & Improvements (10/18/2024) (#6320 ) * fix(converse_transformation.py): handle cross region model name when getting openai param support Fixes https://github.com/BerriAI/litellm/issues/6291 * LiteLLM Minor Fixes & Improvements (10/17/2024) (#6293) * fix(ui_sso.py): fix faulty admin only check Fixes https://github.com/BerriAI/litellm/issues/6286 * refactor(sso_helper_utils.py): refactor /sso/callback to use helper utils, covered by unit testing Prevent future regressions * feat(prompt_factory): support 'ensure_alternating_roles' param Closes https://github.com/BerriAI/litellm/issues/6257 * fix(proxy/utils.py): add dailytagspend to expected views * feat(auth_utils.py): support setting regex for clientside auth credentials Fixes https://github.com/BerriAI/litellm/issues/6203 * build(cookbook): add tutorial for mlflow + langchain + litellm proxy tracing * feat(argilla.py): add argilla logging integration Closes https://github.com/BerriAI/litellm/issues/6201 * fix: fix linting errors * fix: fix ruff error * test: fix test * fix: update vertex ai assumption - parts not always guaranteed (#6296) * docs(configs.md): add argila env var to docs * docs(user_keys.md): add regex doc for clientside auth params * docs(argilla.md): add doc on argilla logging * docs(argilla.md): add sampling rate to argilla calls * bump: version 1.49.6 → 1.49.7 * add gpt-4o-audio models to model cost map (#6306) * (code quality) add ruff check PLR0915 for `too-many-statements` (#6309) * ruff add PLR0915 * add noqa for PLR0915 * fix noqa * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * add # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * # noqa: PLR0915 * doc fix Turn on / off caching per Key. 
(#6297) * (feat) Support `audio`, `modalities` params (#6304) * add audio, modalities param * add test for gpt audio models * add get_supported_openai_params for GPT audio models * add supported params for audio * test_audio_output_from_model * bump openai to openai==1.52.0 * bump openai on pyproject * fix audio test * fix test mock_chat_response * handle audio for Message * fix handling audio for OAI compatible API endpoints * fix linting * fix mock dbrx test * (feat) Support audio param in responses streaming (#6312) * add audio, modalities param * add test for gpt audio models * add get_supported_openai_params for GPT audio models * add supported params for audio * test_audio_output_from_model * bump openai to openai==1.52.0 * bump openai on pyproject * fix audio test * fix test mock_chat_response * handle audio for Message * fix handling audio for OAI compatible API endpoints * fix linting * fix mock dbrx test * add audio to Delta * handle model_response.choices.delta.audio * fix linting * build(model_prices_and_context_window.json): add gpt-4o-audio audio token cost tracking * refactor(model_prices_and_context_window.json): refactor 'supports_audio' to be 'supports_audio_input' and 'supports_audio_output' Allows for flag to be used for openai + gemini models (both support audio input) * feat(cost_calculation.py): support cost calc for audio model Closes https://github.com/BerriAI/litellm/issues/6302 * feat(utils.py): expose new `supports_audio_input` and `supports_audio_output` functions Closes https://github.com/BerriAI/litellm/issues/6303 * feat(handle_jwt.py): support single dict list * fix(cost_calculator.py): fix linting errors * fix: fix linting error * fix(cost_calculator): move to using standard openai usage cached tokens value * test: fix test --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-10-19 22:23:27 -07:00
Ishaan Jaff	7724d5895c	add gpt-4o-audio models to model cost map (#6306 )	2024-10-18 12:45:42 +05:30
Krish Dholakia	2acb0c0675	Litellm Minor Fixes & Improvements (10/12/2024) (#6179 ) * build(model_prices_and_context_window.json): add bedrock llama3.2 pricing * build(model_prices_and_context_window.json): add bedrock cross region inference pricing * Revert "(perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)" This reverts commit `2a5624af47`. * add azure/gpt-4o-2024-05-13 (#6174) * LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158) * refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic * fix(vertex_ai/): support passing custom api base to partner models Fixes https://github.com/BerriAI/litellm/issues/4317 * fix(proxy_server.py): Fix prometheus premium user check logic * docs(prometheus.md): update quick start docs * fix(custom_llm.py): support passing dynamic api key + api base * fix(realtime_api/main.py): Add request/response logging for realtime api endpoints Closes https://github.com/BerriAI/litellm/issues/6081 * feat(openai/realtime): add openai realtime api logging Closes https://github.com/BerriAI/litellm/issues/6081 * fix(realtime_streaming.py): fix linting errors * fix(realtime_streaming.py): fix linting errors * fix: fix linting errors * fix pattern match router * Add literalai in the sidebar observability category (#6163) * fix: add literalai in the sidebar * fix: typo * update (#6160) * Feat: Add Langtrace integration (#5341) * Feat: Add Langtrace integration * add langtrace service name * fix timestamps for traces * add tests * Discard Callback + use existing otel logger * cleanup * remove print statments * remove callback * add docs * docs * add logging docs * format logging * remove emoji and add litellm proxy example * format logging * format `logging.md` * add langtrace docs to logging.md * sync conflict * docs fix * (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165) * fix move s3 to use 
customLogger * add basic s3 logging test * add s3 to custom logger compatible * use batch logger for s3 * s3 set flush interval and batch size * fix s3 logging * add notes on s3 logging * fix s3 logging * add basic s3 logging test * fix s3 type errors * add test for sync logging on s3 * fix: fix to debug log --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Willy Douhard <willy.douhard@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> Co-authored-by: Ali Waleed <ali@scale3labs.com> * docs(custom_llm_server.md): update doc on passing custom params * fix(pass_through_endpoints.py): don't require headers Fixes https://github.com/BerriAI/litellm/issues/6128 * feat(utils.py): add support for caching rerank endpoints Closes https://github.com/BerriAI/litellm/issues/6144 * feat(litellm_logging.py'): add response headers for failed requests Closes https://github.com/BerriAI/litellm/issues/6159 --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Willy Douhard <willy.douhard@gmail.com> Co-authored-by: yujonglee <yujonglee.dev@gmail.com> Co-authored-by: Ali Waleed <ali@scale3labs.com>	2024-10-12 11:48:34 -07:00
Ishaan Jaff	9db4ccca9f	add azure/gpt-4o-2024-05-13 (#6174 )	2024-10-12 10:47:45 +05:30
Krish Dholakia	6005450c8f	LiteLLM Minor Fixes & Improvements (10/09/2024) (#6139 ) * fix(utils.py): don't return 'none' response headers Fixes https://github.com/BerriAI/litellm/issues/6123 * fix(vertex_and_google_ai_studio_gemini.py): support parsing out additional properties and strict value for tool calls Fixes https://github.com/BerriAI/litellm/issues/6136 * fix(cost_calculator.py): set default character value to none Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403290196 * fix(google.py): fix cost per token / cost per char conversion Fixes https://github.com/BerriAI/litellm/issues/6133#issuecomment-2403370287 * build(model_prices_and_context_window.json): update gemini pricing Fixes https://github.com/BerriAI/litellm/issues/6133 * build(model_prices_and_context_window.json): update gemini pricing * fix(litellm_logging.py): fix streaming caching logging when 'turn_off_message_logging' enabled Stores unredacted response in cache * build(model_prices_and_context_window.json): update gemini-1.5-flash pricing * fix(cost_calculator.py): fix default prompt_character count logic Fixes error in gemini cost calculation * fix(cost_calculator.py): fix cost calc for tts models	2024-10-10 00:42:11 -07:00
Kyrylo Yefimenko	b68fee48a6	(fix) Fix Groq pricing for llama3.1 (#6114 ) * Adjust ollama models to chat instead of completions * Fix Groq prices for llama3.1	2024-10-08 20:20:58 +05:30
Krish Dholakia	f2c0a31e3c	LiteLLM Minor Fixes & Improvements (10/05/2024) (#6083 ) * docs(prompt_caching.md): add prompt caching cost calc example to docs * docs(prompt_caching.md): add proxy examples to docs * feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching * docs(prompt_caching.md): add docs on checking model support for prompt caching * build: fix invalid json	2024-10-05 18:59:11 -04:00
GTonehour	d533acd24a	openrouter/openai's litellm_provider should be openrouter, not openai (#6079 ) In model_prices_and_context_window.json, openrouter/* models all have litellm_provider set as "openrouter", except for four openrouter/openai/* models, which were set to "openai". I suppose they must be set to "openrouter", so one can know it should use the openrouter API for this model.	2024-10-05 15:20:44 +05:30
Ishaan Jaff	930606ad63	add azure o1 models to model cost map (#6075 )	2024-10-05 13:22:06 +05:30
Ishaan Jaff	fc6e0dd6cb	(feat) OpenAI prompt caching models to model cost map (#6063 ) * add prompt caching for latest models * add cache_read_input_token_cost for prompt caching models	2024-10-04 19:12:13 +05:30
ls-marek-kerka	db55098a33	🔧 (model_prices_and_context_window.json): rename gemini-pro-flash to gemini-flash-experimental to reflect updated naming convention (#5980 ) Co-authored-by: Marek Keřka <marek.kerka@gmail.com>	2024-10-03 18:06:39 -04:00
Krish Dholakia	0b30e212da	LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938 ) * fix(langfuse.py): prevent double logging requester metadata Fixes https://github.com/BerriAI/litellm/issues/5935 * build(model_prices_and_context_window.json): add mistral pixtral cost tracking Closes https://github.com/BerriAI/litellm/issues/5837 * handle streaming for azure ai studio error * [Perf Proxy] parallel request limiter - use one cache update call (#5932) * fix parallel request limiter - use one cache update call * ci/cd run again * run ci/cd again * use docker username password * fix config.yml * fix config * fix config * fix config.yml * ci/cd run again * use correct typing for batch set cache * fix async_set_cache_pipeline * fix only check user id tpm / rpm limits when limits set * fix test_openai_azure_embedding_with_oidc_and_cf * fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839 * feat(anthropic/chat.py): return 'retry-after' headers from anthropic Fixes https://github.com/BerriAI/litellm/issues/4387 * feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock Closes https://github.com/BerriAI/litellm/issues/5747 * [Feature]#5940, add max_workers parameter for the batch_completion (#5947) * handle streaming for azure ai studio error * bump: version 1.48.2 → 1.48.3 * docs(data_security.md): add legal/compliance faq's Make it easier for companies to use litellm * docs: resolve imports * [Feature]#5940, add max_workers parameter for the batch_completion method --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com> Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local> * fix(converse_transformation.py): fix default message value * fix(utils.py): fix get_model_info to handle finetuned models Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models * 
fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks * fix: fix linting errors * fix(anthropic/chat/handler.py): fix cache creation input tokens * fix(exception_mapping_utils.py): fix missing imports * fix(anthropic/chat/handler.py): fix usage block translation * test: fix test * test: fix tests * style(types/utils.py): trigger new build * test: fix test --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co> Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>	2024-09-27 22:52:57 -07:00
Krish Dholakia	bd17424c4b	LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925 ) (#5937 ) * LiteLLM Minor Fixes & Improvements (09/26/2024) (#5925) * fix(litellm_logging.py): don't initialize prometheus_logger if non premium user Prevents bad error messages in logs Fixes https://github.com/BerriAI/litellm/issues/5897 * Add Support for Custom Providers in Vision and Function Call Utils (#5688) * Add Support for Custom Providers in Vision and Function Call Utils Lookup * Remove parallel function call due to missing model info param * Add Unit Tests for Vision and Function Call Changes * fix-#5920: set header value to string to fix "'int' object has no att… (#5922) * LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880) * LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) * feat(auth_utils.py): enable admin to allow client-side credentials to be passed Makes it easier for devs to experiment with finetuned fireworks ai models * feat(router.py): allow setting configurable_clientside_auth_params for a model Closes https://github.com/BerriAI/litellm/issues/5843 * build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit Fixes https://github.com/BerriAI/litellm/issues/5850 * fix(azure_ai/): support content list for azure ai Fixes https://github.com/BerriAI/litellm/issues/4237 * fix(litellm_logging.py): always set saved_cache_cost Set to 0 by default * fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing handles calling 405b+ size models * fix(slack_alerting.py): fix error alerting for failed spend tracking Fixes regression with slack alerting error monitoring * fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error * docs(bedrock.md): add llama3-1 models * test: fix tests * fix(azure_ai/chat): fix transformation for azure ai calls * feat(azure_ai/embed): Add azure ai embeddings support Closes https://github.com/BerriAI/litellm/issues/5861 * fix(azure_ai/embed): 
enable async embedding * feat(azure_ai/embed): support azure ai multimodal embeddings * fix(azure_ai/embed): support async multi modal embeddings * feat(together_ai/embed): support together ai embedding calls * feat(rerank/main.py): log source documents for rerank endpoints to langfuse improves rerank endpoint logging * fix(langfuse.py): support logging `/audio/speech` input to langfuse * test(test_embedding.py): fix test * test(test_completion_cost.py): fix helper util * fix-#5920: set header value to string to fix "'int' object has no attribute 'encode'" --------- Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> * Revert "fix-#5920: set header value to string to fix "'int' object has no att…" (#5926) This reverts commit `a554ae2695`. * build(model_prices_and_context_window.json): add azure ai cohere rerank model pricing Enables cost tracking for azure ai cohere rerank models * fix(litellm_logging.py): fix debug log to be clearer Closes https://github.com/BerriAI/litellm/issues/5909 * test(test_utils.py): fix test name * fix(azure_ai/cost_calculator.py): support cost tracking for azure ai rerank models * fix(azure_ai): fix azure ai base model cost tracking for rerank endpoints * fix(converse_handler.py): support new llama 3-2 models Fixes https://github.com/BerriAI/litellm/issues/5901 * fix(litellm_logging.py): ensure response is redacted for standard message logging Fixes https://github.com/BerriAI/litellm/issues/5890#issuecomment-2378242360 * fix(cost_calculator.py): use 'get_model_info' for cohere rerank cost calculation allows user to set custom cost for model * fix(config.yml): fix docker hub auht * build(config.yml): add docker auth to all tests * fix(db/create_views.py): fix linting error * fix(main.py): fix circular import * fix(azure_ai/__init__.py): fix circular import * fix(main.py): fix import * fix: fix linting errors * test: fix test * fix(proxy_server.py): pass premium user value on startup used for prometheus init --------- Co-authored-by: 
Cole Murray <colemurray.cs@gmail.com> Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com> * handle streaming for azure ai studio error * [Perf Proxy] parallel request limiter - use one cache update call (#5932) * fix parallel request limiter - use one cache update call * ci/cd run again * run ci/cd again * use docker username password * fix config.yml * fix config * fix config * fix config.yml * ci/cd run again * use correct typing for batch set cache * fix async_set_cache_pipeline * fix only check user id tpm / rpm limits when limits set * fix test_openai_azure_embedding_with_oidc_and_cf * test: fix test * test(test_rerank.py): fix test --------- Co-authored-by: Cole Murray <colemurray.cs@gmail.com> Co-authored-by: bravomark <62681807+bravomark@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-09-27 17:54:13 -07:00
Krish Dholakia	16c0307eab	LiteLLM Minor Fixes & Improvements (09/24/2024) (#5880 ) * LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) * feat(auth_utils.py): enable admin to allow client-side credentials to be passed Makes it easier for devs to experiment with finetuned fireworks ai models * feat(router.py): allow setting configurable_clientside_auth_params for a model Closes https://github.com/BerriAI/litellm/issues/5843 * build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit Fixes https://github.com/BerriAI/litellm/issues/5850 * fix(azure_ai/): support content list for azure ai Fixes https://github.com/BerriAI/litellm/issues/4237 * fix(litellm_logging.py): always set saved_cache_cost Set to 0 by default * fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing handles calling 405b+ size models * fix(slack_alerting.py): fix error alerting for failed spend tracking Fixes regression with slack alerting error monitoring * fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error * docs(bedrock.md): add llama3-1 models * test: fix tests * fix(azure_ai/chat): fix transformation for azure ai calls * feat(azure_ai/embed): Add azure ai embeddings support Closes https://github.com/BerriAI/litellm/issues/5861 * fix(azure_ai/embed): enable async embedding * feat(azure_ai/embed): support azure ai multimodal embeddings * fix(azure_ai/embed): support async multi modal embeddings * feat(together_ai/embed): support together ai embedding calls * feat(rerank/main.py): log source documents for rerank endpoints to langfuse improves rerank endpoint logging * fix(langfuse.py): support logging `/audio/speech` input to langfuse * test(test_embedding.py): fix test * test(test_completion_cost.py): fix helper util	2024-09-25 22:11:57 -07:00
Krrish Dholakia	5bc5eaff8a	build(model_prices_and_context_window.json): add new gemini - google ai studio models Closes https://github.com/BerriAI/litellm/pull/5879#issuecomment-2375703347	2024-09-25 21:50:30 -07:00
David Manouchehri	057bef6561	Add Llama 3.2 90b model on Vertex AI. (#5908 )	2024-09-25 21:21:57 -07:00
Krrish Dholakia	39c9150e97	build(model_prices_and_context_window.json): add new gemini models	2024-09-25 19:33:49 -07:00
John HU	8c7e357a23	Add gemini-1.5-pro-002 and gemini-1.5-flash-002 (#5879 )	2024-09-25 19:31:37 -07:00
Ishaan Jaff	a8dd495eae	[Feat] add fireworks llama 3.2 models + cost tracking (#5905 ) * add fireworks llama 3.2 vision models * add new llama3.2 models * docs add new llama 3.2 vision models	2024-09-25 17:59:46 -07:00
Krish Dholakia	d37c8b5c6b	LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842 ) (#5858 ) * LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842) * feat(auth_utils.py): enable admin to allow client-side credentials to be passed Makes it easier for devs to experiment with finetuned fireworks ai models * feat(router.py): allow setting configurable_clientside_auth_params for a model Closes https://github.com/BerriAI/litellm/issues/5843 * build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit Fixes https://github.com/BerriAI/litellm/issues/5850 * fix(azure_ai/): support content list for azure ai Fixes https://github.com/BerriAI/litellm/issues/4237 * fix(litellm_logging.py): always set saved_cache_cost Set to 0 by default * fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing handles calling 405b+ size models * fix(slack_alerting.py): fix error alerting for failed spend tracking Fixes regression with slack alerting error monitoring * fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error * docs(bedrock.md): add llama3-1 models * test: fix tests * fix(azure_ai/chat): fix transformation for azure ai calls	2024-09-24 15:01:31 -07:00

403 commits