Commit graph

200 commits

Author SHA1 Message Date
Zhaohan Dong
88e7046165
Added compatibility guidance, etc. for xAI Grok model (#8282)
* Various updates

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>

* Update xAI branding

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>

* Revert changes

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>

---------

Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>
2025-02-05 17:21:47 -08:00
Zhaohan Dong
d60d3ee970
Add xAI and fix some old model config (#8218)
Signed-off-by: Zhaohan Dong <65422392+zhaohan-dong@users.noreply.github.com>
2025-02-03 15:29:19 -08:00
Ishaan Jaff
60bdfb437f doc on streaming usage litellm proxy 2024-12-30 21:06:34 -08:00
Krish Dholakia
cd5bdfcb7a
docs(input.md): document 'extra_headers' param support (#7268)
* docs(input.md): document 'extra_headers' param support

* fix: #7239 to move Nova topK parameter to `additionalModelRequestFields` (#7240)

Co-authored-by: Ryan Hoium <rhoium>

---------

Co-authored-by: ryanh-ai <3118399+ryanh-ai@users.noreply.github.com>
2024-12-17 07:19:14 -08:00
Krish Dholakia
61b35c12bb
LiteLLM Minor Fixes & Improvements (12/05/2024) (#7037)
* fix(together_ai/chat): only return response_format + tools for supported models

Fixes https://github.com/BerriAI/litellm/issues/6972

* feat(bedrock/rerank): initial working commit for bedrock rerank api support

Closes https://github.com/BerriAI/litellm/issues/7021

* feat(bedrock/rerank): async bedrock rerank api support

Addresses https://github.com/BerriAI/litellm/issues/7021

* build(model_prices_and_context_window.json): add 'supports_prompt_caching' for bedrock models + cleanup cross-region from model list (duplicate information - lead to inconsistencies )

* docs(json_mode.md): clarify model support for json schema

Closes https://github.com/BerriAI/litellm/issues/6998

* fix(_service_logger.py): handle dd callback in list

ensure failed spend tracking is logged to datadog

* feat(converse_transformation.py): translate from anthropic format to bedrock format

Closes https://github.com/BerriAI/litellm/issues/7030

* fix: fix linting errors

* test: fix test
2024-12-05 00:02:31 -08:00
Krish Dholakia
6bb934c0ac
fix(key_management_endpoints.py): override metadata field value on up… (#7008)
* fix(key_management_endpoints.py): override metadata field value on update

allow user to override tags

* feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric

allow disabling end user cost tracking on prometheus - fixes cardinality issue

* fix(litellm_pre_call_utils.py): add key/team level enforced params

Fixes https://github.com/BerriAI/litellm/issues/6652

* fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update

* docs(enterprise.md): add docs on enforcing required params for llm requests

* Add support of Galadriel API (#7005)

* fix(router.py): robust retry after handling

set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment

* test(test_router.py): fix test

* feat(bedrock/): add support for 'nova' models

also adds explicit 'converse/' route for simpler routing

* fix: fix 'supports_pdf_input'

return if model supports pdf input on get_model_info

* feat(converse_transformation.py): support bedrock pdf input

* docs(document_understanding.md): add document understanding to docs

* fix(litellm_pre_call_utils.py): fix linting error

* fix(init.py): fix passing of bedrock converse models

* feat(bedrock/converse): support 'response_format={"type": "json_object"}'

* fix(converse_handler.py): fix linting error

* fix(base_llm_unit_tests.py): fix test

* fix: fix test

* test: fix test

* test: fix test

* test: remove duplicate test

---------

Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>
2024-12-03 23:03:50 -08:00
Krrish Dholakia
5a430d3c69 docs(json_mode.md): update json docs 2024-12-02 23:08:19 -08:00
Krish Dholakia
7e9d8b58f6
LiteLLM Minor Fixes & Improvements (11/23/2024) (#6870)
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.

* fix(utils.py): allow disabling end user cost tracking with new param

Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small

* docs(configs.md): add disable_end_user_cost_tracking reference to docs

* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role

Enables admin to restrict key creation, and assign team admins to handle distributing keys

* test(test_key_management.py): add unit testing for personal / team key restriction checks

* docs: add docs on restricting key creation

* docs(finetuned_models.md): add new guide on calling finetuned models

* docs(input.md): cleanup anthropic supported params

Closes https://github.com/BerriAI/litellm/issues/6856

* test(test_embedding.py): add test for passing extra headers via embedding

* feat(cohere/embed): pass client to async embedding

* feat(rerank.py): add `/v1/rerank` if missing for cohere base url

Closes https://github.com/BerriAI/litellm/issues/6844

* fix(main.py): pass extra_headers param to openai

Fixes https://github.com/BerriAI/litellm/issues/6836

* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set

Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically

* fix(handler.py): fix linting error

* fix: fix typing

* build: add conftest to proxy_admin_ui_tests/

* test: fix test

* fix: fix linting errors

* test: fix test

* fix: fix pass through testing
2024-11-23 15:17:40 +05:30
Krrish Dholakia
2903fd4164 docs: update json mode docs 2024-11-22 03:00:45 +05:30
Ishaan Jaff
6ae0bc4a11
[Feature]: json_schema in response support for Anthropic (#6748)
* _convert_tool_response_to_message

* fix ModelResponseIterator

* fix test_json_response_format

* test_json_response_format_stream

* fix _convert_tool_response_to_message

* use helper _handle_json_mode_chunk

* fix _process_response

* unit testing for test_convert_tool_response_to_message_no_arguments

* update doc for JSON mode
2024-11-14 16:59:45 -08:00
Camden Clark
b582efa3ce
Update prefix.md (#6734) 2024-11-14 11:18:35 +05:30
Ishaan Jaff
c047d51cc8
(feat) add Predicted Outputs for OpenAI (#6594)
* bump openai to openai==1.54.0

* add 'prediction' param

* testing fix bedrock deprecated cohere.command-text-v14

* test test_openai_prediction_param.py

* test_openai_prediction_param_with_caching

* doc Predicted Outputs

* doc Predicted Output
2024-11-04 21:16:57 -08:00
Ishaan Jaff
4cbdad9fc5
doc - using gpt-4o-audio-preview (#6326)
* doc on audio models

* doc supports vision

* doc audio input / output
2024-10-19 09:34:56 +05:30
Krish Dholakia
f2c0a31e3c
LiteLLM Minor Fixes & Improvements (10/05/2024) (#6083)
* docs(prompt_caching.md): add prompt caching cost calc example to docs

* docs(prompt_caching.md): add proxy examples to docs

* feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching

* docs(prompt_caching.md): add docs on checking model support for prompt caching

* build: fix invalid json
2024-10-05 18:59:11 -04:00
Krish Dholakia
2e5c46ef6d
LiteLLM Minor Fixes & Improvements (10/04/2024) (#6064)
* fix(litellm_logging.py): ensure cache hits are scrubbed if 'turn_off_message_logging' is enabled

* fix(sagemaker.py): fix streaming to raise error immediately

Fixes https://github.com/BerriAI/litellm/issues/6054

* (fixes)  gcs bucket key based logging  (#6044)

* fixes for gcs bucket logging

* fix StandardCallbackDynamicParams

* fix - gcs logging when payload is not serializable

* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket

* working success callbacks

* linting fixes

* fix linting error

* add type hints to functions

* fixes for dynamic success and failure logging

* fix for test_async_chat_openai_stream

* fix handle case when key based logging vars are set as os.environ/ vars

* fix prometheus track cooldown events on custom logger (#6060)

* (docs) add 1k rps load test doc  (#6059)

* docs 1k rps load test

* docs load testing

* docs load testing litellm

* docs load testing

* clean up load test doc

* docs prom metrics for load testing

* docs using prometheus on load testing

* doc load testing with prometheus

* (fixes) docs + qa - gcs key based logging  (#6061)

* fixes for required values for gcs bucket

* docs gcs bucket logging

* bump: version 1.48.12 → 1.48.13

* ci/cd run again

* bump: version 1.48.13 → 1.48.14

* update load test doc

* (docs) router settings - on litellm config  (#6037)

* add yaml with all router settings

* add docs for router settings

* docs router settings litellm settings

* (feat)  OpenAI prompt caching models to model cost map (#6063)

* add prompt caching for latest models

* add cache_read_input_token_cost for prompt caching models

* fix(litellm_logging.py): check if param is iterable

Fixes https://github.com/BerriAI/litellm/issues/6025#issuecomment-2393929946

* fix(factory.py): support passing an 'assistant_continue_message' to prevent bedrock error

Fixes https://github.com/BerriAI/litellm/issues/6053

* fix(databricks/chat): handle streaming responses

* fix(factory.py): fix linting error

* fix(utils.py): unify anthropic + deepseek prompt caching information to openai format

Fixes https://github.com/BerriAI/litellm/issues/6069

* test: fix test

* fix(types/utils.py): support all openai roles

Fixes https://github.com/BerriAI/litellm/issues/6052

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-10-04 21:28:53 -04:00
Ishaan Jaff
1973ae8fb8
[Feat] Allow setting supports_vision for Custom OpenAI endpoints + Added testing (#5821)
* add test for using images with custom openai endpoints

* run all otel tests

* update name of test

* add custom openai model to test config

* add test for setting supports_vision=True for model

* fix test guardrails aporia

* docs supports vison

* fix yaml

* fix yaml

* docs supports vision

* fix bedrock guardrail test

* fix cohere rerank test

* update model_group doc string

* add better prints on test
2024-09-21 11:35:55 -07:00
Krish Dholakia
d46660ea0f
LiteLLM Minor Fixes & Improvements (09/18/2024) (#5772)
* fix(proxy_server.py): fix azure key vault logic to not require client id/secret

* feat(cost_calculator.py): support fireworks ai cost tracking

* build(docker-compose.yml): add lines for mounting config.yaml to docker compose

Closes https://github.com/BerriAI/litellm/issues/5739

* fix(input.md): update docs to clarify litellm supports content as a list of dictionaries

Fixes https://github.com/BerriAI/litellm/issues/5755

* fix(input.md): update input.md to include all message values

* fix(image_handling.py): follow image url redirects

Fixes https://github.com/BerriAI/litellm/issues/5763

* fix(router.py): Fix model key/base leak in error message

Fixes https://github.com/BerriAI/litellm/issues/5762

* fix(http_handler.py): fix linting error

* fix(azure.py): fix logging to show azure_ad_token being used

Fixes https://github.com/BerriAI/litellm/issues/5767

* fix(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* feat(_redis.py): add redis sentinel support

Closes https://github.com/BerriAI/litellm/issues/4381

* test(test_completion_cost.py): fix test

* Databricks Integration: Integrate Databricks SDK as optional mechanism for fetching API base and token, if unspecified (#5746)

* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view

Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix: fix import

* Fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* DB test

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Coverage

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* progress

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix test name

Signed-off-by: dbczumar <corey.zumar@databricks.com>

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* test: fix test

* test(test_databricks.py): fix test

* fix(databricks/chat.py): handle custom endpoint (e.g. sagemaker)

* Apply code scanning fix for clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix(__init__.py): fix known fireworks ai models

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2024-09-19 13:25:29 -07:00
Ishaan Jaff
c220fc0e92 docs max_completion_tokens 2024-09-14 19:12:12 -07:00
Krish Dholakia
be3c7b401e
LiteLLM Minor fixes + improvements (08/03/2024) (#5488)
* fix(internal_user_endpoints.py): set budget_reset_at for /user/update

* fix(vertex_and_google_ai_studio_gemini.py): handle accumulated json

Fixes https://github.com/BerriAI/litellm/issues/5479

* fix(vertex_ai_and_gemini.py): fix assistant message function call when content is not None

Fixes https://github.com/BerriAI/litellm/issues/5490

* fix(proxy_server.py): generic state uuid for okta sso

* fix(lago.py): improve debug logs

Debugging for https://github.com/BerriAI/litellm/issues/5477

* docs(bedrock.md): add bedrock cross-region inferencing to docs

* fix(azure.py): return azure response headers on aembedding call

* feat(azure.py): return azure response headers for `/audio/transcription`

* fix(types/utils.py): standardize deepseek / anthropic prompt caching usage information

Closes https://github.com/BerriAI/litellm/issues/5285

* docs(usage.md): add docs on litellm usage object

* test(test_completion.py): mark flaky test
2024-09-03 21:21:34 -07:00
Krrish Dholakia
9df0588c2c docs(json_mode.md): update docs 2024-09-02 22:41:17 -07:00
Krrish Dholakia
0c26b36d9d docs(input.md): update docs on together ai response_format params support 2024-08-23 21:34:18 -07:00
Beltrán Aceves
5e583e0bf2 Fixed code snippet import typo in Structured Output docs 2024-08-20 23:01:22 +02:00
Krrish Dholakia
b8e4ef0abf docs(json_mode.md): add azure openai models to doc 2024-08-19 07:19:23 -07:00
Zbigniew Łukasiak
963c921c5a
Mismatch in example fixed 2024-08-14 15:07:10 +02:00
Krrish Dholakia
fdd9a07051 fix(utils.py): Break out of infinite streaming loop
Fixes https://github.com/BerriAI/litellm/issues/5158
2024-08-12 14:00:43 -07:00
Krrish Dholakia
0ea056971c docs(prefix.md): add prefix support to docs 2024-08-10 13:55:47 -07:00
Krrish Dholakia
2dd27a4e12 feat(utils.py): support validating json schema client-side if user opts in 2024-08-06 19:35:33 -07:00
Krrish Dholakia
0c88cc4153 docs(json_mode.md): add example of calling openai with pydantic model via litellm 2024-08-06 18:27:06 -07:00
Krrish Dholakia
f3a0eb8eb9 docs(json_mode.md): update json mode docs to show structured output responses
Relevant issue - https://github.com/BerriAI/litellm/issues/5074
2024-08-06 17:01:41 -07:00
Krrish Dholakia
696e75d69c docs: add github provider to docs 2024-08-03 09:20:23 -07:00
Krrish Dholakia
f1b7d2318c docs(input.md): update docs to show ollama tool calling 2024-07-30 09:56:24 -07:00
Krrish Dholakia
6d741a5424 docs(json_mode.md): add json mode to docs 2024-07-18 17:20:19 -07:00
Krrish Dholakia
ba334ff8b9 refactor(provider_specific_params.md): create separate doc for provider-specific param
Make it easier for people to know, how litellm handles provider-specific params.
2024-07-09 12:23:46 -07:00
berkecanrizai
40940cd606
fix: typo in vision docs 2024-07-05 13:31:12 +03:00
Krrish Dholakia
be8a6377f6 docs(input.md): add vertex ai json mode to mapped input params 2024-06-29 11:51:52 -07:00
Krrish Dholakia
a6aee18012 docs(token_usage.md): add response cost to usage docs 2024-06-26 18:05:47 -07:00
Krrish Dholakia
09f4eb7617 docs(reliable_completions.md): improve headers for easier searching 2024-06-26 08:09:31 -07:00
Krrish Dholakia
d6ed8c10b2 docs(function_call.md): cleanup 2024-06-25 18:26:34 -07:00
Krrish Dholakia
e96326a211 docs(input.md): update docs with parallel_tool_calls 2024-06-20 21:01:49 -07:00
Krrish Dholakia
3feaf231ac docs(drop_params.md): drop unsupported params 2024-06-20 17:43:07 -07:00
Krrish Dholakia
f86dcbb109 docs(input.md): clarify meaning of 'drop_params' 2024-06-19 10:04:58 -07:00
Ishaan Jaff
69a20c94fd docs - fix doc build time errors 2024-06-15 14:58:02 -07:00
Krrish Dholakia
162f9400d2 feat(utils.py): support dynamically setting 'drop_params'
Allows user to turn this on/off for individual calls by passing in as a completion arg
2024-06-05 08:44:04 -07:00
Krrish Dholakia
3db30ecb4c docs(batching.md): add batch completion fastest response on proxy to docs 2024-05-28 22:14:22 -07:00
Krrish Dholakia
2ee599b848 docs(batching.md): add batch completion to docs 2024-05-28 22:08:06 -07:00
Krrish Dholakia
9698fc77fd docs(input.md): add clarifai supported input params to docs 2024-05-24 08:57:50 -07:00
Krrish Dholakia
65c4d6be39 docs(databricks.md): add databricks api support to docs 2024-05-23 19:22:09 -07:00
Krrish Dholakia
af1d209f8f docs(input.md): add anthropic tool choice support to docs 2024-05-21 17:56:21 -07:00
Krrish Dholakia
7fa203c810 docs(input.md): add mistral to input param docs 2024-05-13 13:50:49 -07:00
Ishaan Jaff
62276fc221 docs link to litellm batch completions 2024-05-11 13:45:32 -07:00