Commit graph

662 commits

Author SHA1 Message Date
Krish Dholakia
cfb6890b9f
Litellm dev 12 28 2024 p2 (#7458)
* docs(sidebar.js): docs for supporting model access groups for wildcard routes

* feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route

* refactor(docs/): make control model access a root-level doc in proxy sidebar

easier to discover how to control model access on litellm

* docs: more cleanup

* feat(fireworks_ai/): add document inlining support

Enables users to call non-vision models with images, PDFs, etc.

* test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util

* docs(docs/): add document inlining details to fireworks ai docs

* feat(fireworks_ai/): allow user to dynamically disable auto add transform inline

allows client-side disabling of this feature for proxy users

* feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models

now true as fireworks ai supports document inlining

* test: fix tests

* fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route
2024-12-28 19:38:06 -08:00
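
A minimal sketch of the document-inlining transform described in the entry above, assuming the Fireworks AI convention of appending a `#transform=inline` fragment to document URLs; the helper name is hypothetical:

```python
def add_fireworks_document_inlining(document_url: str) -> str:
    """Append Fireworks AI's '#transform=inline' fragment to a document URL.

    Document inlining lets non-vision models accept images/PDFs; the fragment
    is only added when missing, which also leaves room for the client-side
    opt-out mentioned in the commit.
    """
    if "#transform=inline" in document_url:
        return document_url
    return document_url + "#transform=inline"


print(add_fireworks_document_inlining("https://example.com/report.pdf"))
# -> https://example.com/report.pdf#transform=inline
```
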
Krish Dholakia
5af438ed89
Litellm dev 12 28 2024 p3 (#7464)
* feat(deepgram/): initial e2e support for deepgram stt

Uses deepgram's `/listen` endpoint to transcribe speech to text

 Closes https://github.com/BerriAI/litellm/issues/4875

* fix: fix linting errors

* test: fix test
2024-12-28 19:18:58 -08:00
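
A hedged usage sketch for the new Deepgram speech-to-text support above; `litellm.transcription` is the existing transcription entry point, while the model name and environment variable are assumptions:

```python
import litellm

# Assumes DEEPGRAM_API_KEY is set and that "nova-2" is an available Deepgram
# model; the request is served by Deepgram's /listen endpoint.
with open("sample.wav", "rb") as audio_file:
    response = litellm.transcription(
        model="deepgram/nova-2",
        file=audio_file,
    )
print(response.text)
```
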
Ishaan Jaff
7cf347918e ci/cd run again 2024-12-27 11:44:26 -08:00
Ishaan Jaff
17d5ff2fa4
(fix) initializing OTEL Logging on LiteLLM Proxy - ensure OTEL logger is initialized only once (#7435)
* add otel to _custom_logger_compatible_callbacks_literal

* remove extra code

* fix _get_custom_logger_settings_from_proxy_server

* update unit tests
2024-12-26 21:17:19 -08:00
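
A sketch of the init-once guard this fix describes, with a placeholder standing in for the actual OpenTelemetry logger construction:

```python
_OTEL_LOGGER = None

def get_or_create_otel_logger():
    """Return the existing OTEL logger if one was already initialized,
    instead of creating a duplicate on every proxy config reload."""
    global _OTEL_LOGGER
    if _OTEL_LOGGER is None:
        _OTEL_LOGGER = object()  # placeholder for the real OpenTelemetry setup
    return _OTEL_LOGGER

assert get_or_create_otel_logger() is get_or_create_otel_logger()
```
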
Krish Dholakia
760328b6ad
Litellm dev 12 25 2024 p2 (#7420)
* test: add new test image embedding to base llm unit tests

Addresses https://github.com/BerriAI/litellm/issues/6515

* fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings

Fix https://github.com/BerriAI/litellm/issues/6515

* feat: initial commit for fireworks ai audio transcription support

Relevant issue: https://github.com/BerriAI/litellm/issues/7134

* test: initial fireworks ai test

* feat(fireworks_ai/): implemented fireworks ai audio transcription config

* fix(utils.py): register fireworks ai audio transcription config, in config manager

* fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription'

* refactor(fireworks_ai/): define text completion route with model name handling

moves model name handling to specific fireworks routes, as required by their api

* refactor(fireworks_ai/chat): define transform_request - allows fixing the model name if the accounts/ prefix is missing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(handler.py): fix linting errors

* fix(main.py): fix tgai text completion route

* refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request

* refactor: move test_fine_tuning_api out of local_testing

reduces local testing ci/cd time
2024-12-25 18:35:34 -08:00
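
A sketch of the data-prefix stripping fix for Bedrock multimodal embeddings mentioned above; the function name is illustrative:

```python
import re

def strip_data_url_prefix(image_input: str) -> str:
    """Strip a 'data:image/...;base64,' prefix so Bedrock's multimodal
    embedding API receives the raw base64 payload it expects."""
    match = re.match(r"^data:image/[\w.+-]+;base64,", image_input)
    return image_input[match.end():] if match else image_input

assert strip_data_url_prefix("data:image/png;base64,iVBORw0=") == "iVBORw0="
```
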
Krish Dholakia
3ac54483a7
Litellm dev 12 24 2024 p3 (#7403)
* fix(model_prices_and_context_window.json): specify meta llama is a bedrock converse model route

Fixes https://github.com/BerriAI/litellm/issues/7385

* test(test_get_model_info.py): enforce all new bedrock chat models added have the bedrock_converse route

Prevents https://github.com/BerriAI/litellm/issues/7385 and https://github.com/BerriAI/litellm/discussions/7325

* fix(get_supported_openai_params.py): use vertex gemini config by default for vertex ai route

Fixes https://github.com/BerriAI/litellm/issues/7378

* refactor(vertex_ai/gemini/): rename vertexaiconfig to vertexaibaseconfig - make it clear vertexaiconfig = vertexgemini config

* build(model_prices_and_context_window.json): add gpt-4o-audio-preview-2024-12-17

Closes https://github.com/BerriAI/litellm/issues/7367

* test: fix test

* test: fix o1 tests

* fix: handle llm api errors

* fix: fix linting errors
2024-12-24 18:07:53 -08:00
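
A sketch (not the repo's actual test) of the guard described above: newly added Bedrock chat models in the cost map must declare the bedrock_converse route:

```python
import json

# Pre-converse models that legitimately stay on the /invoke route would be
# listed here; the empty set is a placeholder.
LEGACY_INVOKE_CHAT_MODELS: set = set()

def test_new_bedrock_chat_models_use_converse_route():
    with open("model_prices_and_context_window.json") as f:
        model_map = json.load(f)
    for name, info in model_map.items():
        if not isinstance(info, dict):
            continue  # skip non-model entries such as the sample spec
        if info.get("litellm_provider") == "bedrock" and info.get("mode") == "chat":
            assert name in LEGACY_INVOKE_CHAT_MODELS, (
                f"{name} uses the legacy bedrock (/invoke) route; new chat "
                "models should set litellm_provider=bedrock_converse"
            )
```
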
Krish Dholakia
c3edfc2c92
LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394)
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching

* fix(context_caching/transformation.py): just use last identified cache point

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google

Fixes https://github.com/BerriAI/litellm/issues/6738

* fix(vertex_ai/gemini/): track context caching tokens

* refactor(gemini/): place transformation.py inside `chat/` folder

make it easy for user to know we support the equivalent endpoint

* fix: fix import

* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder

make it easier to see cost calculation logic

* fix: fix linting errors

* fix: fix circular import

* feat(gemini/cost_calculator.py): support gemini context caching cost calculation

generifies anthropic's cost calculation function and uses it across anthropic + gemini

* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching

Closes https://github.com/BerriAI/litellm/issues/6891

* docs(gemini.md): add gemini context caching architecture diagram

make it easier for user to understand how context caching works

* docs(gemini.md): link to relevant gemini context caching code

* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code

* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario

* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation

* test: fix test
2024-12-23 22:02:52 -08:00
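
A sketch of the "first contiguous block" selection described above, assuming Anthropic-style `cache_control` markers on messages (which is how litellm flags cacheable content):

```python
from typing import Optional, Tuple

def first_contiguous_cached_block(messages: list) -> Tuple[Optional[int], Optional[int]]:
    """Return (start, end) indices of the first contiguous run of messages
    carrying a cache_control marker; Google rejects non-contiguous blocks."""
    start = None
    for i, msg in enumerate(messages):
        cached = isinstance(msg, dict) and msg.get("cache_control") is not None
        if cached and start is None:
            start = i
        elif not cached and start is not None:
            return start, i
    return (start, len(messages)) if start is not None else (None, None)
```
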
Krish Dholakia
3671829e39
Complete 'requests' library removal (#7350)
* refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure

* refactor(watsonx/completion/handler.py): move to using base llm http handler

removes 'requests' library usage

* fix(watsonx_text/transformation.py): fix result transformation

migrates to transformation.py, for usage with base llm http handler

* fix(streaming_handler.py): migrate watsonx streaming to transformation.py

ensures streaming works with base llm http handler

* fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic

* fix(watsonx/): fix chat route post completion route refactor

* refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well

* refactor(base.py): remove requests library usage from litellm

* build(pyproject.toml): remove requests library usage

* fix: fix linting errors

* fix: fix linting errors

* fix(types/utils.py): fix validation errors for modelresponsestream

* fix(replicate/handler.py): fix linting errors

* fix(litellm_logging.py): handle modelresponsestream object

* fix(streaming_handler.py): fix modelresponsestream args

* fix: remove unused imports

* test: fix test

* fix: fix test

* test: fix test

* test: fix tests

* test: fix test

* test: fix patch target

* test: fix test
2024-12-22 07:21:25 -08:00
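
The direction of this refactor in miniature: HTTP calls route through httpx (already litellm's async client) instead of the removed `requests` dependency. A hedged sketch:

```python
import httpx

def post_json(url: str, payload: dict, timeout: float = 600.0) -> dict:
    """Synchronous POST via httpx - the pattern replacing `requests.post`."""
    with httpx.Client(timeout=timeout) as client:
        response = client.post(url, json=payload)
        response.raise_for_status()
        return response.json()

async def apost_json(url: str, payload: dict, timeout: float = 600.0) -> dict:
    """Async twin of post_json - one client library now covers both paths."""
    async with httpx.AsyncClient(timeout=timeout) as client:
        response = await client.post(url, json=payload)
        response.raise_for_status()
        return response.json()
```
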
Krrish Dholakia
f6db785088 fix(__init__.py): correctly return azure_text models in models_by_provider dictionary 2024-12-21 14:56:07 -08:00
Ishaan Jaff
ce41cd977c
(Admin UI) - Test Key Tab - Allow using UI Session instead of manually creating a virtual key (#7348)
* ui fix - allow searching model list + fix bug on filtering

* qa fix - use correct provider name for azure_text

* ui wrap content onto next line

* ui fix - allow selecting current UI session when logging in

* ui session budgets
2024-12-21 13:14:15 -08:00
Ishaan Jaff
6107f9f3f3
[Bug fix ]: Triton /infer handler incompatible with batch responses (#7337)
* migrate triton to base llm http handler

* clean up triton handler.py

* use transform functions for triton

* add TritonConfig

* get openai params for triton

* use triton embedding config

* test_completion_triton_generate_api

* test_completion_triton_infer_api

* fix TritonConfig doc string

* use TritonResponseIterator

* fix triton embeddings

* docs triton chat usage
2024-12-20 20:59:40 -08:00
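
A sketch of the batch-tolerant /infer parsing this fix implies; the response shape follows Triton's infer JSON convention, and the helper name is illustrative:

```python
def extract_triton_infer_text(json_response: dict) -> str:
    """Handle Triton /infer responses whose 'data' field may hold a single
    string or a batch of strings, instead of assuming one element."""
    outputs = json_response.get("outputs") or []
    data = (outputs[0].get("data") or []) if outputs else []
    return "".join(str(item) for item in data)

# single response and batch response both parse
assert extract_triton_infer_text({"outputs": [{"data": ["hi"]}]}) == "hi"
assert extract_triton_infer_text({"outputs": [{"data": ["hi", " there"]}]}) == "hi there"
```
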
Ishaan Jaff
617ac63d14
(feat) add infinity rerank models (#7321)
* Support Infinity Reranker (custom reranking models) (#7247)

* Support Infinity Reranker

* Clean code

* Included transformation.py

* Clean code

* Added Infinity reranker test

* Clean code

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* transform_rerank_response

* update handler.py

* infinity rerank updates

* ci/cd run again

* add infinity unit tests

* docs add instruction on how to add a new provider for rerank

---------

Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>
2024-12-19 18:30:28 -08:00
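
A hedged usage example for the new Infinity rerank route; the model name and `api_base` (a locally hosted Infinity server) are assumptions:

```python
import litellm

response = litellm.rerank(
    model="infinity/BAAI/bge-reranker-base",   # assumed reranker model
    query="What is the capital of France?",
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
    ],
    api_base="http://localhost:7997",          # assumed Infinity endpoint
)
print(response.results)
```
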
Ishaan Jaff
5f15b0aa20
(code refactor) - Add BaseRerankConfig. Use BaseRerankConfig for cohere/rerank and azure_ai/rerank (#7319)
* add base rerank config

* working sync cohere rerank

* update rerank types

* update base rerank config

* remove old rerank

* add new cohere handler.py

* add cohere rerank transform

* add get_provider_rerank_config

* add rerank to base llm http handler

* add rerank utils

* add arerank to llm http handler.py

* add AzureAIRerankConfig

* updates rerank config

* update test rerank

* fix unused imports

* update get_provider_rerank_config

* test_basic_rerank_caching

* fix unused import

* test rerank
2024-12-19 17:03:34 -08:00
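
A sketch of the shared rerank interface this refactor introduces; method names mirror the commit bullets (`transform_rerank_response`, etc.), but the exact signatures are assumptions:

```python
from abc import ABC, abstractmethod
from typing import List

class BaseRerankConfig(ABC):
    """Provider-agnostic rerank contract: cohere/rerank and azure_ai/rerank
    each subclass this, and one shared HTTP handler drives the calls."""

    @abstractmethod
    def get_complete_url(self, api_base: str) -> str:
        """Build the provider's rerank endpoint URL."""

    @abstractmethod
    def transform_rerank_request(self, model: str, query: str, documents: List[str]) -> dict:
        """Map litellm's rerank args to the provider's request body."""

    @abstractmethod
    def transform_rerank_response(self, model: str, raw_response: dict) -> dict:
        """Map the provider's response back to litellm's rerank response shape."""
```
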
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
6a45ee1ef7
fix(hosted_vllm/transformation.py): return fake api key, if none give… (#7301)
* fix(hosted_vllm/transformation.py): return fake api key, if none given. Prevents httpx error

Fixes https://github.com/BerriAI/litellm/issues/7291

* test: fix test

* fix(main.py): add hosted_vllm/ support for embeddings endpoint

Closes https://github.com/BerriAI/litellm/issues/7290

* docs(vllm.md): add docs on vllm embeddings usage

* fix(__init__.py): fix sambanova model test

* fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond
2024-12-18 18:41:53 -08:00
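
A hedged sketch of both fixes above: local vLLM servers often run without auth, and httpx errors on a missing bearer token, so a placeholder key is substituted. The model name and api_base are assumptions:

```python
import litellm

response = litellm.embedding(
    model="hosted_vllm/BAAI/bge-small-en-v1.5",  # assumed embedding model
    input=["hello world"],
    api_base="http://localhost:8000/v1",
    # api_key omitted: the transformation now falls back to a fake key
    # (roughly `api_key = api_key or "fake-api-key"`) to satisfy httpx.
)
print(len(response.data[0]["embedding"]))
```
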
Krish Dholakia
2f08341a08
Litellm dev readd prompt caching (#7299)
* fix(router.py): re-add saving model id on prompt caching valid successful deployment

* fix(router.py): introduce optional pre_call_checks

isolate prompt caching logic in a separate file

* fix(prompt_caching_deployment_check.py): fix import

* fix(router.py): new 'async_filter_deployments' event hook

allows custom logger to filter deployments returned to routing strategy

* feat(prompt_caching_deployment_check.py): initial working commit of prompt caching based routing

* fix(cooldown_callbacks.py): fix linting error

* fix(budget_limiter.py): move budget logger to async_filter_deployment hook

* test: add unit test

* test(test_router_helper_utils.py): add unit testing

* fix(budget_limiter.py): fix linting errors

* docs(config_settings.md): add 'optional_pre_call_checks' to router_settings param docs
2024-12-18 15:13:49 -08:00
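
A sketch of the routing idea behind `prompt_caching_deployment_check.py`: hash the prompt so a repeat request maps back to the deployment whose provider-side cache already holds it. The key format is illustrative:

```python
import hashlib
import json

def prompt_caching_cache_key(messages: list, tools: list = None) -> str:
    """Deterministic key for a (messages, tools) prompt, used to look up the
    model id that previously served this cache-eligible prompt."""
    payload = json.dumps(
        {"messages": messages, "tools": tools or []},
        sort_keys=True,
        default=str,
    )
    return "deployment:prompt_cache:" + hashlib.sha256(payload.encode()).hexdigest()
```
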
Ishaan Jaff
7a5dd29fe0
(fix) unable to pass input_type parameter to Voyage AI embedding mode (#7276)
* VoyageEmbeddingConfig

* fix voyage logic to get params

* add voyage embedding transformation

* add get_provider_embedding_config

* use BaseEmbeddingConfig

* voyage clean up

* use llm http handler for embedding transformations

* test_voyage_ai_embedding_extra_params

* add voyage async

* test_voyage_ai_embedding_extra_params

* add async for llm http handler

* update BaseLLMEmbeddingTest

* test_voyage_ai_embedding_extra_params

* fix linting

* fix get_provider_embedding_config

* fix anthropic text test

* update location of base/chat/transformation

* fix import path

* fix IBMWatsonXAIConfig
2024-12-17 19:23:49 -08:00
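
A hedged example exercising the fix: the Voyage-specific `input_type` parameter is now forwarded instead of dropped. The model name is an assumption:

```python
import litellm

# Assumes VOYAGE_API_KEY is set; input_type is Voyage AI's "query"/"document" hint.
response = litellm.embedding(
    model="voyage/voyage-3",
    input=["good morning from litellm"],
    input_type="query",
)
print(response.usage)
```
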
Ishaan Jaff
3c984ed60e
(feat) Add Azure Blob Storage Logging Integration (#7265)
* add path to http handler

* AzureBlobStorageLogger

* test_azure_blob_storage

* use constants for Azure storage

* use helper get_azure_ad_token_from_entrata_id

* azure blob storage support

* get_azure_ad_token_from_azure_storage

* fix import

* azure logging

* docs azure storage

* add docs on azure blobs

* add premium user check

* add azure_storage as identified logging callback

* async_upload_payload_to_azure_blob_storage

* docs azure storage

* callback_class_str_to_classType
2024-12-16 22:18:22 -08:00
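
A hedged sketch of enabling the new integration from the SDK side; `azure_storage` is the callback string registered in this PR, while the environment variable names are assumptions:

```python
import os
import litellm

os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "my-storage-account"  # assumed var name
os.environ["AZURE_STORAGE_FILE_SYSTEM"] = "litellm-logs"         # assumed var name

litellm.callbacks = ["azure_storage"]  # request/response payloads upload to Azure Blob Storage

response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
)
```
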
Ishaan Jaff
2a92a60168 ci/cd run again 2024-12-16 08:19:14 -08:00
Ishaan Jaff
7103198805
(feat) Add Tag-based budgets on litellm router / proxy (#7236)
* add BudgetConfig

* add _get_tags_from_request_kwargs

* test_tag_budgets_e2e_test_expect_to_fail

* add a check for request tags

* fix _async_get_cache_keys_for_router_budget_limiting

* fix test

* fix _sync_in_memory_spend_with_redis

* _async_get_cache_keys_for_router_budget_limiting

* fix _init_tag_budgets

* fix type casting

* docs show error for tag budget limit hit

* fix _get_tags_from_request_kwargs

* fix undo change
2024-12-14 17:28:36 -08:00
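
A sketch of the `_get_tags_from_request_kwargs` helper named above: request tags travel in the request's litellm metadata, and the budget limiter keys spend off them. The exact lookup path is an assumption:

```python
from typing import List

def get_tags_from_request_kwargs(request_kwargs: dict) -> List[str]:
    """Pull the request's tags out of its litellm metadata, returning an
    empty list when no tags were sent."""
    metadata = request_kwargs.get("metadata") or {}
    return metadata.get("tags") or []

assert get_tags_from_request_kwargs(
    {"metadata": {"tags": ["teamA", "prod"]}}
) == ["teamA", "prod"]
```
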
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router'

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router import

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handlers

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix import

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
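
The standard pattern behind most of these fixes, shown as a minimal sketch: import heavyweight types only for the type checker, so module import no longer forms a cycle:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # evaluated by mypy/pyright only - never executed, so no runtime cycle
    from litellm import Router

def warm_up(router: "Router") -> None:
    """The string annotation keeps type safety without the circular import."""
    ...
```
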
Krish Dholakia
e68bb4e051
Litellm dev 12 12 2024 (#7203)
* fix(azure/): support passing headers to azure openai endpoints

Fixes https://github.com/BerriAI/litellm/issues/6217

* fix(utils.py): move default tokenizer to just openai

the hf tokenizer makes network calls when fetching the tokenizer - this slows down completion calls

* fix(router.py): fix pattern matching router - add generic "*" to it as well

Fixes issue where generic "*" model access group wouldn't show up

* fix(pattern_match_deployments.py): match to more specific pattern

match to more specific pattern

allows setting generic wildcard model access group and excluding specific models more easily

* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty

don't delete all router models b/c of an empty db_model list

Fixes https://github.com/BerriAI/litellm/issues/7196

* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api

* fix(fireworks_ai/): support passing response_format + tool call in same message

Addresses https://github.com/BerriAI/litellm/issues/7135

* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"

This reverts commit 6a30dc6929.

* test: fix test

* fix(replicate/): fix replicate default retry/polling logic

* test: add unit testing for router pattern matching

* test: update test to use default oai tokenizer

* test: mark flaky test

* test: skip flaky test
2024-12-13 08:54:03 -08:00
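
A sketch of the "match to more specific pattern" fix: when both a generic `*` and a narrower wildcard match, prefer the pattern with the longest literal portion. Implementation details are assumptions:

```python
import re
from typing import List, Optional

def most_specific_pattern(model: str, patterns: List[str]) -> Optional[str]:
    """Return the matching wildcard pattern with the most literal characters,
    so 'openai/*' wins over the generic '*' for 'openai/gpt-4o'."""
    matches = [
        p for p in patterns
        if re.fullmatch(re.escape(p).replace(r"\*", ".*"), model)
    ]
    return max(matches, key=lambda p: len(p.replace("*", ""))) if matches else None

assert most_specific_pattern("openai/gpt-4o", ["*", "openai/*"]) == "openai/*"
```
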
Ishaan Jaff
90f9aded9f ci/cd run release pipeline 2024-12-12 10:48:47 -08:00
Krish Dholakia
481645e49c
fix(acompletion): support fallbacks on acompletion (#7184)
* fix(acompletion): support fallbacks on acompletion

allows health checks for wildcard routes to use fallback models

* test: update cohere generate api testing

* add max tokens to health check (#7000)

* fix: fix health check test

* test: update testing

---------

Co-authored-by: Cameron <561860+wallies@users.noreply.github.com>
2024-12-11 19:20:54 -08:00
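
A hedged usage example of fallbacks on `acompletion` (which is what lets wildcard-route health checks fall back); model names are placeholders:

```python
import asyncio
import litellm

async def main():
    response = await litellm.acompletion(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "ping"}],
        fallbacks=["gpt-3.5-turbo"],  # tried only if the primary call errors
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
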
Krrish Dholakia
02dd0c6e7e build: Squashed commit of https://github.com/BerriAI/litellm/pull/7171
Closes https://github.com/BerriAI/litellm/pull/7171
2024-12-11 01:10:12 -08:00
Krrish Dholakia
06074bb13b build: Squashed commit of https://github.com/BerriAI/litellm/pull/7170
Closes https://github.com/BerriAI/litellm/pull/7170
2024-12-11 01:03:57 -08:00
Krrish Dholakia
b9b34a7b99 build: Squashed commit of https://github.com/BerriAI/litellm/pull/7165
Closes https://github.com/BerriAI/litellm/pull/7165
2024-12-11 01:00:33 -08:00
Ishaan Jaff
78d132c1fb
(Refactor) Code Quality improvement - rename text_completion_codestral.py -> codestral/completion/ (#7172)
* rename files

* fix codestral fim organization

* fix CodestralTextCompletionConfig

* fix import CodestralTextCompletion

* fix BaseLLM

* fix imports

* fix CodestralTextCompletionConfig

* fix imports CodestralTextCompletion
2024-12-11 00:55:47 -08:00
Ishaan Jaff
400eb28a91
Code Quality Improvement - move aleph_alpha to deprecated_providers (#7168)
* move aleph alpha to deprecated providers

* fix import location

* fix aleph_alpha

* pytest skip

* undo change to test file
2024-12-11 00:50:40 -08:00
Ishaan Jaff
21003c4337
Code Quality Improvement - use vertex_ai/ as folder name for vertexAI (#7166)
* fix rename vertex ai

* run ci/cd again
2024-12-11 00:32:41 -08:00
Krish Dholakia
350cfc36f7
Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Krish Dholakia
d5aae81c6d
Litellm vllm refactor (#7158)
* refactor(vllm/): move vllm to use base llm config

* test: mark flaky test
2024-12-10 21:48:35 -08:00
Krish Dholakia
405080396d
Litellm ollama refactor (#7162)
* refactor(ollama/): refactor ollama `/api/generate` to use base llm config

Addresses https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* test: skip unresponsive test

* test(test_secret_manager.py): mark flaky test

* test: fix google sm test

* fix: fix init.py
2024-12-10 21:45:35 -08:00
Krish Dholakia
488913c69f
Revert "LiteLLM Common Base LLM Config (pt.4): Move Ollama to Base LLM Config…" (#7160)
This reverts commit 40a22eb4c6.
2024-12-10 21:44:54 -08:00
Krish Dholakia
40a22eb4c6
LiteLLM Common Base LLM Config (pt.4): Move Ollama to Base LLM Config (#7157)
* refactor(ollama/): refactor ollama `/api/generate` to use base llm config

Addresses https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* test: skip unresponsive test

* test(test_secret_manager.py): mark flaky test

* test: fix google sm test
2024-12-10 21:39:28 -08:00
Ishaan Jaff
bfb6891eb7
rename llms/OpenAI/ -> llms/openai/ (#7154)
* rename OpenAI -> openai

* fix file rename

* fix rename changes

* fix organization of openai/transcription

* fix import OA fine tuning API

* fix openai ft handler

* fix handler import
2024-12-10 20:14:07 -08:00
Krish Dholakia
e903fe6038
refactor(sagemaker/): separate chat + completion routes + make them b… (#7151)
* refactor(sagemaker/): separate chat + completion routes + make them both use base llm config

Addresses https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* fix(main.py): pass hf model name + custom prompt dict to litellm params
2024-12-10 19:40:05 -08:00
Krish Dholakia
1e87782215
LiteLLM Common Base LLM Config (pt.3): Move all OAI compatible providers to base llm config (#7148)
* refactor(fireworks_ai/): inherit from openai like base config

refactors fireworks ai to use a common config

* test: fix import in test

* refactor(watsonx/): refactor watsonx to use llm base config

refactors chat + completion routes to base config path

* fix: fix linting error

* refactor: inherit base llm config for oai compatible routes

* test: fix test

* test: fix test
2024-12-10 17:12:42 -08:00
Krish Dholakia
311432ca17
refactor(fireworks_ai/): inherit from openai like base config (#7146)
* refactor(fireworks_ai/): inherit from openai like base config

refactors fireworks ai to use a common config

* test: fix import in test

* refactor(watsonx/): refactor watsonx to use llm base config

refactors chat + completion routes to base config path

* fix: fix linting error

* test: fix test

* fix: fix test
2024-12-10 16:15:19 -08:00
Ishaan Jaff
bdb20821ea
(Refactor) Code Quality improvement - Use Common base handler for anthropic_text/ (#7143)
* add anthropic text provider

* add ANTHROPIC_TEXT to LlmProviders

* fix anthropic text implementation

* working anthropic text claude-2

* test_acompletion_claude2_stream

* add param mapping for anthropic text

* fix unused imports

* fix anthropic completion handler.py
2024-12-10 12:23:58 -08:00
Ishaan Jaff
bd39e1ab5d
(Refactor) Code Quality improvement - Use Common base handler for cloudflare/ provider (#7127)
* add get_complete_url to base config

* cloudflare - refactor to following existing pattern

* migrate cloudflare chat completions to base llm http handler

* fix unused import

* fix fake stream in cloudflare

* fix cloudflare transformation

* fix naming for BaseModelResponseIterator

* add async cloudflare streaming test

* test cloudflare

* add handler.py

* add handler.py in cohere handler.py
2024-12-10 10:12:22 -08:00
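
One bullet above fixes "fake stream" support; as a sketch, fake streaming slices a complete provider response into chunks for routes with no native streaming:

```python
from typing import Iterator

def fake_stream(full_text: str, chunk_size: int = 20) -> Iterator[str]:
    """Yield a finished completion in fixed-size slices so callers that
    requested stream=True still receive incremental chunks."""
    for i in range(0, len(full_text), chunk_size):
        yield full_text[i : i + chunk_size]

assert "".join(fake_stream("a" * 45)) == "a" * 45
```
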
Krish Dholakia
5bbf906c83
Litellm code qa common config (#7113)
* feat(base_llm): initial commit for common base config class

Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* feat(base_llm/): add transform request/response abstract methods to base config class

* feat(cohere-+-clarifai): refactor integrations to use common base config class

* fix: fix linting errors

* refactor(anthropic/): move anthropic + vertex anthropic to use base config

* test: fix xai test

* test: fix tests

* fix: fix linting errors

* test: comment out WIP test

* fix(transformation.py): fix is pdf used check

* fix: fix linting error
2024-12-09 15:58:25 -08:00
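
A sketch of the common base config at the heart of this series; the abstract transform methods are named in the commit, while the exact signatures are assumptions:

```python
from abc import ABC, abstractmethod
from typing import List

class BaseConfig(ABC):
    """Shared contract each provider (cohere, clarifai, anthropic, ...)
    implements, letting one HTTP handler drive every integration."""

    @abstractmethod
    def transform_request(
        self, model: str, messages: List[dict], optional_params: dict
    ) -> dict:
        """Map OpenAI-format inputs to the provider's request body."""

    @abstractmethod
    def transform_response(self, model: str, raw_response: dict) -> dict:
        """Map the provider's raw response back to litellm's response format."""
```
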
Krish Dholakia
0c0498dd60
Litellm dev 12 07 2024 (#7086)
* fix(main.py): support passing max retries to azure/openai embedding integrations

Fixes https://github.com/BerriAI/litellm/issues/7003

* feat(team_endpoints.py): allow updating team model aliases

Closes https://github.com/BerriAI/litellm/issues/6956

* feat(router.py): allow specifying model id as fallback - skips any cooldown check

Allows a default model to be checked if all models are in cooldown

s/o @micahjsmith

* docs(reliability.md): add fallback to specific model to docs

* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util

Allows user to identify if messages/tools qualify for prompt caching

Related issue: https://github.com/BerriAI/litellm/issues/6784

* feat(router.py): store model id for prompt caching valid prompt

Allows routing to that model id on subsequent requests

* fix(router.py): only cache if prompt is valid prompt caching prompt

prevents storing unnecessary items in cache

* feat(router.py): support routing prompt caching enabled models to previous deployments

Closes https://github.com/BerriAI/litellm/issues/6784

* test: fix linting errors

* feat(databricks/): convert basemodel to dict and exclude none values

allow passing pydantic message to databricks

* fix(utils.py): ensure all chat completion messages are dict

* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)

* add custom_llm_provider to SpendLogsPayload

* add custom_llm_provider to SpendLogs

* add custom llm provider to SpendLogs payload

* test_spend_logs_payload

* Add MLflow to the side bar (#7031)

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* (bug fix) SpendLogs update DB catch all possible DB errors for retrying  (#7082)

* catch DB_CONNECTION_ERROR_TYPES

* fix DB retry mechanism for SpendLog updates

* use DB_CONNECTION_ERROR_TYPES in auth checks

* fix exp back off for writing SpendLogs

* use _raise_failed_update_spend_exception to ensure errors print as NON blocking

* test_update_spend_logs_multiple_batches_with_failure

* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)

* fix model cost map fireworks ai "supports_response_schema": true,

* fix supports_response_schema

* fix map openai params fireworks ai

* test_map_response_format

* test_map_response_format

* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)

* bump: version 1.53.9 → 1.54.0

* fix deepinfra

* litellm db fixes LiteLLM_UserTable (#7089)

* ci/cd queue new release

* fix llama-3.3-70b-versatile

* refactor - use consistent file naming convention `AI21/` -> `ai21`  (#7090)

* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure

* fix use consistent naming (#7092)

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>
2024-12-08 00:30:33 -08:00
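
One plausible reading of the `is_prompt_caching_valid_prompt` helper added here (not the repo's exact logic): a prompt qualifies when its messages or tools carry cache markers:

```python
def is_prompt_caching_valid_prompt(messages: list, tools: list = None) -> bool:
    """Heuristic sketch: treat a prompt as cache-eligible when any message
    or tool carries an Anthropic-style cache_control marker."""
    for item in (messages or []) + (tools or []):
        if isinstance(item, dict) and item.get("cache_control"):
            return True
    return False
```
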
Ishaan Jaff
36e99ebce7
fix use consistent naming (#7092)
2024-12-07 22:01:00 -08:00
Ishaan Jaff
7d4a1cb4e2
refactor - use consistent file naming convention AI21/ -> ai21 (#7090)
* fix refactor - use consistent file naming convention

* ci/cd run again

* fix naming structure
2024-12-07 21:46:34 -08:00
Ishaan Jaff
191a0fefbc ci/cd queue new release 2024-12-07 19:09:57 -08:00
Krish Dholakia
19a4273fda
feat(langfuse/): support langfuse prompt management (#7073)
* feat(langfuse/): support langfuse prompt management

Initial working commit for langfuse prompt management support

Closes https://github.com/BerriAI/litellm/issues/6269

* test: update test

* fix(litellm_logging.py): suppress linting error
2024-12-06 23:10:22 -08:00
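
A heavily hedged usage sketch of the new prompt-management flow; the `langfuse/` model prefix plus the `prompt_id`/`prompt_variables` kwargs follow this feature's docs, but treat the exact parameters as assumptions:

```python
import litellm

# Assumes LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY are set and that a prompt
# named "my-prompt" exists in Langfuse.
response = litellm.completion(
    model="langfuse/gpt-4o-mini",
    prompt_id="my-prompt",
    prompt_variables={"topic": "unit testing"},
    messages=[{"role": "user", "content": "appended after the managed prompt"}],
)
print(response.choices[0].message.content)
```
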
Krish Dholakia
816f0ef8d2
LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051)
* fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations

ensures cost tracking is reliable - handles edge cases of parsing model cost map

* build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models

Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329

* build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map

Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html

* fix(converse_transformation.py): support amazon nova tool use

* fix(opentelemetry): Add missing LLM request type attribute to spans (#7041)

* feat(opentelemetry): add LLM request type attribute to spans

* lint

* fix: curl usage (#7038)

curl -d, --data <data> is lowercase d
curl -D, --dump-header <filename> is uppercase D

references:
https://curl.se/docs/manpage.html#-d
https://curl.se/docs/manpage.html#-D

* fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(streaming_chunk_builder.py): handle initial id being empty string

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint

* docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints

* feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk

* docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk

* fix(litellm_logging.py): use standard logging payload if present in kwargs

prevent datadog logging error for pass through endpoints

* docs(bedrock.md): add rerank api usage example to docs

* bugfix/change dummy tool name format (#7053)

* fix viewing keys (#7042)

* ui new build

* build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044)

* bye (#6982)

* (fix) litellm router.aspeech  (#6962)

* doc Migrating Databases

* fix aspeech on router

* test_audio_speech_router

* test_audio_speech_router

* docs show supported providers on batches api doc

* change dummy tool name format

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* fix: fix linting errors

* test: update test

* fix(litellm_logging.py): fix pass through check

* fix(test_otel_logging.py): fix test

* fix(cost_calculator.py): update handling for cost per second

* fix(cost_calculator.py): fix cost check

* test: fix test

* (fix) adding public routes when using custom header  (#7045)

* get_api_key_from_custom_header

* add test_get_api_key_from_custom_header

* fix testing use 1 file for test user api key auth

* fix test user api key auth

* test_custom_api_key_header_name

* build: update ui build

---------

Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com>
Co-authored-by: lloydchang <lloydchang@gmail.com>
Co-authored-by: hgulersen <haymigulersen@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-12-06 14:29:53 -08:00
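
A sketch of the empty-id handling fixed in `streaming_chunk_builder.py` above: take the first non-empty chunk id, else mint one, so spend logs never key on an empty string. Names are illustrative:

```python
import uuid
from typing import Iterable

def resolve_response_id(chunk_ids: Iterable[str]) -> str:
    """Return the first truthy chunk id; fall back to a generated id when the
    stream opened with an empty string."""
    for chunk_id in chunk_ids:
        if chunk_id:
            return chunk_id
    return f"chatcmpl-{uuid.uuid4()}"

assert resolve_response_id(["", "chatcmpl-123"]) == "chatcmpl-123"
assert resolve_response_id(["", ""]).startswith("chatcmpl-")
```
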
Krish Dholakia
6bb934c0ac
fix(key_management_endpoints.py): override metadata field value on up… (#7008)
* fix(key_management_endpoints.py): override metadata field value on update

allow user to override tags

* feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric

allow disabling end user cost tracking on prometheus - fixes cardinality issue

* fix(litellm_pre_call_utils.py): add key/team level enforced params

Fixes https://github.com/BerriAI/litellm/issues/6652

* fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update

* docs(enterprise.md): add docs on enforcing required params for llm requests

* Add support of Galadriel API (#7005)

* fix(router.py): robust retry after handling

set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment

* test(test_router.py): fix test

* feat(bedrock/): add support for 'nova' models

also adds explicit 'converse/' route for simpler routing

* fix: fix 'supports_pdf_input'

return if model supports pdf input on get_model_info

* feat(converse_transformation.py): support bedrock pdf input

* docs(document_understanding.md): add document understanding to docs

* fix(litellm_pre_call_utils.py): fix linting error

* fix(init.py): fix passing of bedrock converse models

* feat(bedrock/converse): support 'response_format={"type": "json_object"}'

* fix(converse_handler.py): fix linting error

* fix(base_llm_unit_tests.py): fix test

* fix: fix test

* test: fix test

* test: fix test

* test: remove duplicate test

---------

Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>
2024-12-03 23:03:50 -08:00
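
A hedged example of the new top-level `enforced_params` on /key/generate; the proxy URL and master key are placeholders:

```python
import httpx

resp = httpx.post(
    "http://localhost:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # proxy master key (placeholder)
    json={"enforced_params": ["user", "metadata.generation_name"]},
)
print(resp.json())  # requests made with this key must include the enforced params
```
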
Ishaan Jaff
d558b643be queue new release
2024-12-03 20:54:25 -08:00