litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-25 10:44:24 +00:00

Author	SHA1	Message	Date
Krish Dholakia	d9eb8f42ff	Litellm dev 01 27 2025 p3 (#8047 ) * docs(reliability.md): add doc on disabling fallbacks per request * feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param Allows setting dynamic model timeouts from vercel's AI sdk * test(test_proxy_server.py): add simple unit test for reading request timeout * test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read * feat(main.py): support passing metadata to openai in preview Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371 * fix(main.py): fix passing openai metadata * docs(request_headers.md): document new request headers * build: Merge branch 'main' into litellm_dev_01_27_2025_p3 * test: loosen test	2025-01-28 18:01:27 -08:00
Krish Dholakia	6bafdbc546	Litellm dev 01 25 2025 p4 (#8006 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 34s Details * feat(main.py): use asyncio.sleep for mock_Timeout=true on async request adds unit testing to ensure proxy does not fail if specific Openai requests hang (e.g. recent o1 outage) * fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming Fixes https://github.com/BerriAI/litellm/issues/7942 * Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming" This reverts commit `7a052a64e3`. * fix(deepseek-r-1): return reasoning_content as a top-level param ensures compatibility with existing tools that use it * fix: fix linting error	2025-01-26 08:01:05 -08:00
Krish Dholakia	4857a1e9ba	refactor: cleanup dead codeblock (#7936 ) * refactor: cleanup dead codeblock * fix(main.py): add extra headers to headers * fix: remove dead codeblock	2025-01-24 21:48:23 -08:00
Krish Dholakia	8ca3229b26	Ensure base_model cost tracking works across all endpoints (#7989 ) * test(test_completion_cost.py): add sdk test to ensure base model is used for cost tracking * test(test_completion_cost.py): add sdk test to ensure custom pricing works * fix(main.py): add base model cost tracking support for embedding calls Enables base model cost tracking for embedding calls when base model set as a litellm_param * fix(litellm_logging.py): update logging object with litellm params - including base model, if given ensures base model param is always tracked * fix(main.py): fix linting errors	2025-01-24 21:05:26 -08:00
Krish Dholakia	1e011b66d3	Ollama ssl verify = False + Spend Logs reliability fixes (#7931 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param Fixes https://github.com/BerriAI/litellm/issues/6499 * feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args Closes https://github.com/BerriAI/litellm/issues/6499 * fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure prevents malformed logs from causing all spend tracking to break since they're constantly retried * test(test_proxy_utils.py): add test to ensure bad log is dropped * test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error * test(test_user_api_key_auth.py): add unit test to ensure end user id as str works * fix(auth_utils.py): ensure extracted end user id is always a str prevents db cost tracking errors * test(test_auth_utils.py): ensure get end user id from request body always returns a string * test: update tests * test: skip bedrock test- behaviour now supported * test: fix testing * refactor(spend_tracking_utils.py): reduce size of get_logging_payload * test: fix test * bump: version 1.59.4 → 1.59.5 * Revert "bump: version 1.59.4 → 1.59.5" This reverts commit `1182b46b2e`. * fix(utils.py): fix spend logs retry logic * fix(spend_tracking_utils.py): fix get tags * fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints	2025-01-23 23:05:41 -08:00
Krish Dholakia	27560bd5ad	Litellm dev 01 22 2025 p4 (#7932 ) * feat(main.py): add new 'provider_specific_header' param allows passing extra header for specific provider * fix(litellm_pre_call_utils.py): add unit test for pre call utils * test(test_bedrock_completion.py): skip test now that bedrock supports this	2025-01-22 21:52:07 -08:00
Krish Dholakia	866fffb50d	Litellm dev 01 21 2025 p1 (#7898 ) * fix(utils.py): don't pass 'anthropic-beta' header to vertex - will cause request to fail * fix(utils.py): add flag to allow user to disable filtering invalid headers ensure user can control behaviour * style(utils.py): cleanup message * test(test_utils.py): add unit test to cover invalid header filtering * fix(proxy_server.py): fix custom openapi schema generation * fix(utils.py): pass extra headers if set * fix(main.py): fix image variation to use 'client' param	2025-01-21 20:36:11 -08:00
Ishaan Jaff	f1335362cf	(core sdk fix) - fix fallbacks stuck in infinite loop (#7751 ) * test_acompletion_fallbacks_basic * use common run_async_function * fix completion_with_fallbacks * fix completion with fallbacks * fix fallback utils * test_acompletion_fallbacks_basic * test_completion_fallbacks_sync * huggingface/mistralai/Mistral-7B-Instruct-v0.3	2025-01-13 19:34:34 -08:00
Ishaan Jaff	c8cedbed20	fix img gen cost	2025-01-12 16:31:04 -08:00
Krish Dholakia	ad2f66b3e3	[BETA] Add OpenAI `/images/variations` + Topaz API support (#7700 ) * feat(main.py): initial commit for `/image/variations` endpoint support * refactor(base_llm/): introduce new base llm base config for image variation endpoints * refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler * fix: test * feat(openai/): working openai `/image/variation` endpoint calls via sdk * feat(topaz/): topaz sync image variation call support Addresses https://github.com/BerriAI/litellm/issues/7593 ' * fix(topaz/transformation.py): fix linting errors * fix(openai/image_variations/handler.py): fix passing json data * fix(main.py): image_variation/ support async image variation route - `aimage_variation` * fix(test_get_model_info.py): fix test * fix: cleanup unused imports * feat(openai/): add async `/image/variations` endpoint support * feat(topaz/): support async `/image/variations` calls * fix: test * fix(utils.py): fix get_model_info_helper for no model info w/ provider config handles situation where model info is not known but provider config exists * test(test_router_fallbacks.py): mark flaky test * fix: fix unused imports * test: bump otel load test perf threshold - accounts for current load tests hitting same server	2025-01-11 23:27:46 -08:00
Krish Dholakia	becd4bc748	Litellm dev 01 11 2025 p3 (#7702 ) * fix(__init__.py): fix init to exclude pricing-only model cost values from real model names prevents bad health checks on wildcard routes * fix(get_llm_provider.py): fix to handle calling bedrock_converse models	2025-01-11 20:06:54 -08:00
Krish Dholakia	27892acdfc	Litellm dev 01 10 2025 p3 (#7682 ) * feat(langfuse.py): log the used prompt when prompt management used * test: fix test * docs(self_serve.md): add doc on restricting personal key creation on ui * feat(s3.py): support s3 logging with team alias prefixes (if available) New preview feature * fix(main.py): remove old if block - simplify to just await if coroutine returned fixes lm_studio async embedding error * fix(langfuse.py): handle get prompt check	2025-01-10 21:56:42 -08:00
Krish Dholakia	c10ae8879e	fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini p… (#7660 ) * fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url * refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic deduplicates code * fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages * docs(prompt_management.md): update prompt management to be in beta given feedback - this still needs to be revised (e.g. passing in user message, not ignoring) * refactor(prompt_management_base.py): introduce base class for prompt management allows consistent behaviour across prompt management integrations * feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base * fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set allows tracking what prompt was used for what purpose * feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse allows logging prompt id / prompt variables to langfuse * test: fix test * fix(router.py): cleanup unused imports * fix: fix linting error * fix: fix trace param typing * fix: fix linting errors * fix: fix code qa check	2025-01-10 07:31:59 -08:00
Krish Dholakia	865e6d5bda	fix(main.py): fix lm_studio/ embedding routing (#7658 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 36s Details * fix(main.py): fix lm_studio/ embedding routing adds the mapping + updates docs with example * docs(self_serve.md): update doc to show how to auto-add sso users to teams * fix(streaming_handler.py): simplify async iterator check, to just check if streaming response is an async iterable	2025-01-09 23:03:24 -08:00
Krish Dholakia	4e69711411	Litellm dev 01 07 2025 p1 (#7618 ) * fix(main.py): pass custom llm provider on litellm logging provider update * fix(cost_calculator.py): don't append provider name to return model if existing llm provider Fixes https://github.com/BerriAI/litellm/issues/7607 * fix(prometheus_services.py): fix prometheus system health error logging Fixes https://github.com/BerriAI/litellm/issues/7611	2025-01-07 21:22:31 -08:00
Ishaan Jaff	2ca0977921	`aiohttp_openai/` fixes - allow using `aiohttp_openai/gpt-4o` (#7598 ) * fixes for get_complete_url * update aiohttp tests * fix event loop for aiohtto * ci/cd run again * test_aiohttp_openai	2025-01-06 21:39:11 -08:00
Krish Dholakia	fef7839e8a	Litellm dev 01 06 2025 p1 (#7594 ) * fix(custom_logger.py): expose new 'async_get_chat_completion_prompt' event hook * fix(custom_logger.py): langfuse_prompt_management.py remove 'headers' from custom logger 'async_get_chat_completion_prompt' and 'get_chat_completion_prompt' event hooks * feat(router.py): expose new function for prompt management based routing * feat(router.py): partial working router prompt factory logic allows load balanced model to be used for model name w/ langfuse prompt management call * feat(router.py): fix prompt management with load balanced model group * feat(langfuse_prompt_management.py): support reading in openai params from langfuse enables user to define optional params on langfuse vs. client code * test(test_Router.py): add unit test for router based langfuse prompt management * fix: fix linting errors	2025-01-06 21:26:21 -08:00
Krish Dholakia	f6698e871f	Fix langfuse prompt management on proxy (#7535 ) * fix(types/utils.py): support langfuse + humanloop routes on llm router * fix(main.py): remove acompletion elif block just await if coroutine returned	2025-01-03 12:42:37 -08:00
Ishaan Jaff	d861aa8ff3	(perf) use `aiohttp` for `custom_openai` (#7514 ) * use aiohttp handler * BaseLLMAIOHTTPHandler * use CustomOpenAIChatConfig * CustomOpenAIChatConfig * CustomOpenAIChatConfig * fix linting * AiohttpOpenAIChatConfig * fix order * aiohttp_openai	2025-01-02 22:15:17 -08:00
Krish Dholakia	45b93f2721	Litellm dev 01 01 2025 p3 (#7503 ) * fix(utils.py): add new validate tool choice helper function Prevents https://github.com/BerriAI/litellm/issues/7483 * fix(main.py): add tool choice validation on .completion() prevents user error like - https://github.com/BerriAI/litellm/issues/7483 * fix(utils.py): fix return val of tool choice validation logic	2025-01-01 22:12:15 -08:00
Krish Dholakia	0120176541	Litellm dev 12 30 2024 p2 (#7495 ) * test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model * fix(base_llm_unit_tests.py): handle azure o1 preview response format tests skip as o1 on azure doesn't support tool calling yet * fix: initial commit of azure o1 handler using openai caller simplifies calling + allows fake streaming logic alr. implemented for openai to just work * feat(azure/o1_handler.py): fake o1 streaming for azure o1 models azure does not currently support streaming for o1 * feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info enables user to toggle on when azure allows o1 streaming without needing to bump versions * style(router.py): remove 'give feedback/get help' messaging when router is used Prevents noisy messaging Closes https://github.com/BerriAI/litellm/issues/5942 * fix(types/utils.py): handle none logprobs Fixes https://github.com/BerriAI/litellm/issues/328 * fix(exception_mapping_utils.py): fix error str unbound error * refactor(azure_ai/): move to openai_like chat completion handler allows for easy swapping of api base url's (e.g. ai.services.com) Fixes https://github.com/BerriAI/litellm/issues/7275 * refactor(azure_ai/): move to base llm http handler * fix(azure_ai/): handle differing api endpoints * fix(azure_ai/): make sure all unit tests are passing * fix: fix linting errors * fix: fix linting errors * fix: fix linting error * fix: fix linting errors * fix(azure_ai/transformation.py): handle extra body param * fix(azure_ai/transformation.py): fix max retries param handling * fix: fix test * test(test_azure_o1.py): fix test * fix(llm_http_handler.py): support handling azure ai unprocessable entity error * fix(llm_http_handler.py): handle sync invalid param error for azure ai * fix(azure_ai/): streaming support with base_llm_http_handler * fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai * fix: fix linting errors * fix(llm_http_handler.py): fix linting error * fix(azure_ai/): handle cohere tool call invalid index param error	2025-01-01 18:57:29 -08:00
Krish Dholakia	41e5b3aa8d	HumanLoop integration for Prompt Management (#7479 ) * feat(humanloop.py): initial commit for humanloop prompt management integration Closes https://github.com/BerriAI/litellm/issues/213 * feat(humanloop.py): working e2e humanloop prompt management integration Closes https://github.com/BerriAI/litellm/issues/213 * fix(humanloop.py): fix linting errors * fix: fix linting erro * fix: fix test * test: handle filenotfound error	2024-12-30 22:26:03 -08:00
Krish Dholakia	347779b813	Litellm dev 12 30 2024 p1 (#7480 ) * test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model * fix(base_llm_unit_tests.py): handle azure o1 preview response format tests skip as o1 on azure doesn't support tool calling yet * fix: initial commit of azure o1 handler using openai caller simplifies calling + allows fake streaming logic alr. implemented for openai to just work * feat(azure/o1_handler.py): fake o1 streaming for azure o1 models azure does not currently support streaming for o1 * feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info enables user to toggle on when azure allows o1 streaming without needing to bump versions * style(router.py): remove 'give feedback/get help' messaging when router is used Prevents noisy messaging Closes https://github.com/BerriAI/litellm/issues/5942 * test: fix azure o1 test * test: fix tests * fix: fix test	2024-12-30 21:52:52 -08:00
Ishaan Jaff	a003af6c04	(fix) `litellm.amoderation` - support using `model=openai/omni-moderation-latest`, `model=omni-moderation-latest`, `model=None` (#7475 ) * test_moderation_endpoint * fix litellm.amoderation	2024-12-30 09:42:51 -08:00
Krish Dholakia	cfb6890b9f	Litellm dev 12 28 2024 p2 (#7458 ) * docs(sidebar.js): docs for support model access groups for wildcard routes * feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route * refactor(docs/): make control model access a root-level doc in proxy sidebar easier to discover how to control model access on litellm * docs: more cleanup * feat(fireworks_ai/): add document inlining support Enables user to call non-vision models with images/pdfs/etc. * test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util * docs(docs/): add document inlining details to fireworks ai docs * feat(fireworks_ai/): allow user to dynamically disable auto add transform inline allows client-side disabling of this feature for proxy users * feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models now true as fireworks ai supports document inlining * test: fix tests * fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route	2024-12-28 19:38:06 -08:00
Krish Dholakia	5af438ed89	Litellm dev 12 28 2024 p3 (#7464 ) * feat(deepgram/): initial e2e support for deepgram stt Uses deepgram's `/listen` endpoint to transcribe speech to text Closes https://github.com/BerriAI/litellm/issues/4875 * fix: fix linting errors * test: fix test	2024-12-28 19:18:58 -08:00
Ishaan Jaff	4d648ee335	fix ahealth_check	2024-12-28 19:16:28 -08:00
Ishaan Jaff	1e06ee3162	(Refactor) - Re use litellm.completion/litellm.embedding etc for health checks (#7455 ) * add mode: realtime * add _realtime_health_check * test_realtime_health_check * azure _realtime_health_check * _realtime_health_check * Realtime Models * fix code quality * delete OAI / Azure custom health check code * simplest version of ahealth check * update tests * working health check post refactor * working aspeech health check * fix realtime health checks * test_audio_transcription_health_check * use get_audio_file_for_health_check * test_text_completion_health_check * ahealth_check * simplify health check code * update ahealth_check * fix import * fix unused imports * fix ahealth_check * fix local testing * test_async_realtime_health_check	2024-12-28 18:38:54 -08:00
Ishaan Jaff	4e65722a00	(Bug Fix) Add health check support for realtime models (#7453 ) * add mode: realtime * add _realtime_health_check * test_realtime_health_check * azure _realtime_health_check * _realtime_health_check * Realtime Models * fix code quality	2024-12-28 18:15:00 -08:00
Krish Dholakia	67b39bacf7	LiteLLM Minor Fixes & Improvements (12/27/2024) - p1 (#7448 ) * feat(main.py): mock_response() - support 'litellm.ContextWindowExceededError' in mock response enabled quicker router/fallback/proxy debug on context window errors * feat(exception_mapping_utils.py): extract special litellm errors from error str if calling `litellm_proxy/` as provider Closes https://github.com/BerriAI/litellm/issues/7259 * fix(user_api_key_auth.py): specify 'Received Proxy Server Request' is span kind server Closes https://github.com/BerriAI/litellm/issues/7298	2024-12-27 19:04:39 -08:00
Krish Dholakia	760328b6ad	Litellm dev 12 25 2025 p2 (#7420 ) * test: add new test image embedding to base llm unit tests Addresses https://github.com/BerriAI/litellm/issues/6515 * fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings Fix https://github.com/BerriAI/litellm/issues/6515 * feat: initial commit for fireworks ai audio transcription support Relevant issue: https://github.com/BerriAI/litellm/issues/7134 * test: initial fireworks ai test * feat(fireworks_ai/): implemented fireworks ai audio transcription config * fix(utils.py): register fireworks ai audio transcription config, in config manager * fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription' * refactor(fireworks_ai/): define text completion route with model name handling moves model name handling to specific fireworks routes, as required by their api * refactor(fireworks_ai/chat): define transform_Request - allows fixing model if accounts/ is missing * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix(handler.py): fix linting errors * fix(main.py): fix tgai text completion route * refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request * refactor: move test_fine_tuning_api out of local_testing reduces local testing ci/cd time	2024-12-25 18:35:34 -08:00
Ishaan Jaff	08a4c72692	(feat) `/batches` - track `user_api_key_alias`, `user_api_key_team_alias` etc for /batch requests (#7401 ) * run azure testing on ci/cd * update docs on azure batches endpoints * add input azure.jsonl * refactor - use separate file for batches endpoints * fixes for passing custom llm provider to /batch endpoints * pass custom llm provider to files endpoints * update azure batches doc * add info for azure batches api * update batches endpoints * use simple helper for raising proxy exception * update config.yml * fix imports * add type hints to get_litellm_params * update get_litellm_params * update get_litellm_params * update get slp * QOL - stop double logging a create batch operations on custom loggers * re use slp from og event * _create_standard_logging_object_for_completed_batch * fix linting errors * reduce num changes in PR * update BATCH_STATUS_POLL_MAX_ATTEMPTS	2024-12-24 17:44:28 -08:00
Ishaan Jaff	05b0d2026f	(feat) Add cost tracking for /batches requests OpenAI (#7384 ) * add basic logging for create`batch` * add create_batch as a call type * add basic dd logging for batches * basic batch creation logging on DD * batch endpoints add cost calc * fix batches_async_logging * separate folder for batches testing * new job for batches tests * test batches logging * fix validation logic * add vertex_batch_completions.jsonl * test test_async_create_batch * test_async_create_batch * update tests * test_completion_with_no_model * remove dead code * update load_vertex_ai_credentials * test_avertex_batch_prediction * update get async httpx client * fix get_async_httpx_client * update test_avertex_batch_prediction * fix batches testing config.yaml * add google deps * fix vertex files handler	2024-12-23 17:47:26 -08:00
Krish Dholakia	48316520f4	LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386 ) * fix(main.py): support 'mock_timeout=true' param allows mock requests on proxy to have a time delay, for testing * fix(main.py): ensure mock timeouts raise litellm.Timeout error triggers retry/fallbacks * fix: fix fallback + mock timeout testing * fix(router.py): always return remaining tpm/rpm limits, if limits are known allows for rate limit headers to be guaranteed * docs(timeout.md): add docs on mock timeout = true * fix(main.py): fix linting errors * test: fix test	2024-12-23 17:41:27 -08:00
Krish Dholakia	3671829e39	Complete 'requests' library removal (#7350 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 12s Details * refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure * refactor(watsonx/completion/handler.py): move to using base llm http handler removes 'requests' library usage * fix(watsonx_text/transformation.py): fix result transformation migrates to transformation.py, for usage with base llm http handler * fix(streaming_handler.py): migrate watsonx streaming to transformation.py ensures streaming works with base llm http handler * fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic * fix(watsonx/): fix chat route post completion route refactor * refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well * refactor(base.py): remove requests library usage from litellm * build(pyproject.toml): remove requests library usage * fix: fix linting errors * fix: fix linting errors * fix(types/utils.py): fix validation errors for modelresponsestream * fix(replicate/handler.py): fix linting errors * fix(litellm_logging.py): handle modelresponsestream object * fix(streaming_handler.py): fix modelresponsestream args * fix: remove unused imports * test: fix test * fix: fix test * test: fix test * test: fix tests * test: fix test * test: fix patch target * test: fix test	2024-12-22 07:21:25 -08:00
Ishaan Jaff	6107f9f3f3	[Bug fix ]: Triton /infer handler incompatible with batch responses (#7337 ) * migrate triton to base llm http handler * clean up triton handler.py * use transform functions for triton * add TritonConfig * get openai params for triton * use triton embedding config * test_completion_triton_generate_api * test_completion_triton_infer_api * fix TritonConfig doc string * use TritonResponseIterator * fix triton embeddings * docs triton chat usage	2024-12-20 20:59:40 -08:00
Krish Dholakia	27a4d08604	Litellm dev 2024 12 19 p3 (#7322 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params Fixes https://github.com/BerriAI/litellm/issues/7242 * test: new test for langfuse prompt management hook Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(main.py): add 'get_chat_completion_prompt' customlogger hook allows for langfuse prompt management Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(langfuse_prompt_management.py): working e2e langfuse prompt management works with `langfuse/` route * feat(main.py): initial tracing for dynamic langfuse params allows admin to specify langfuse keys by model in model_list * feat(main.py): support passing langfuse credentials dynamically * fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params allows dynamic langfuse params to work * fix: fix linting errors * docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial * docs(prompt_management.md): cleanup doc * docs: cleanup topnav * docs(prompt_management.md): update docs to be easier to use * fix: remove unused imports * docs(prompt_management.md): add architectural overview doc * fix(litellm_logging.py): fix dynamic param passing * fix(langfuse_prompt_management.py): fix linting errors * fix: fix linting errors * fix: use typing_extensions for typealias to ensure python3.8 compatibility * test: use stream_options in test to account for tiktoken diff * fix: improve import error message, and check run test earlier	2024-12-20 13:30:16 -08:00
Krish Dholakia	6a45ee1ef7	fix(hosted_vllm/transformation.py): return fake api key, if none give… (#7301 ) * fix(hosted_vllm/transformation.py): return fake api key, if none give. Prevents httpx error Fixes https://github.com/BerriAI/litellm/issues/7291 * test: fix test * fix(main.py): add hosted_vllm/ support for embeddings endpoint Closes https://github.com/BerriAI/litellm/issues/7290 * docs(vllm.md): add docs on vllm embeddings usage * fix(__init__.py): fix sambanova model test * fix(base_llm_unit_tests.py): skip pydantic obj test if model takes >5s to respond	2024-12-18 18:41:53 -08:00
Ishaan Jaff	7a5dd29fe0	(fix) unable to pass input_type parameter to Voyage AI embedding mode (#7276 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 46s Details * VoyageEmbeddingConfig * fix voyage logic to get params * add voyage embedding transformation * add get_provider_embedding_config * use BaseEmbeddingConfig * voyage clean up * use llm http handler for embedding transformations * test_voyage_ai_embedding_extra_params * add voyage async * test_voyage_ai_embedding_extra_params * add async for llm http handler * update BaseLLMEmbeddingTest * test_voyage_ai_embedding_extra_params * fix linting * fix get_provider_embedding_config * fix anthropic text test * update location of base/chat/transformation * fix import path * fix IBMWatsonXAIConfig	2024-12-17 19:23:49 -08:00
Krish Dholakia	019ddc32d6	Litellm dev 12 17 2024 p2 (#7277 ) * fix(openai/transcription/handler.py): call 'log_pre_api_call' on async calls * fix(openai/transcriptions/handler.py): call 'logging.pre_call' on sync whisper calls as well * fix(proxy_cli.py): remove default proxy_cli timeout param gets passed in as a dynamic request timeout and overrides config values * fix(langfuse.py): pass litellm httpx client - contains ssl certs (#7052) Fixes https://github.com/BerriAI/litellm/issues/7046	2024-12-17 14:05:14 -08:00
Krish Dholakia	b82add11ba	LITELLM: Remove `requests` library usage (#7235 ) * fix(generic_api_callback.py): remove requests lib usage * fix(budget_manager.py): remove requests lib usgae * fix(main.py): cleanup requests lib usage * fix(utils.py): remove requests lib usage * fix(argilla.py): fix argilla test * fix(athina.py): replace 'requests' lib usage with litellm module * fix(greenscale.py): replace 'requests' lib usage with httpx * fix: remove unused 'requests' lib import + replace usage in some places * fix(prompt_layer.py): remove 'requests' lib usage from prompt layer * fix(ollama_chat.py): remove 'requests' lib usage * fix(baseten.py): replace 'requests' lib usage * fix(codestral/): replace 'requests' lib usage * fix(predibase/): replace 'requests' lib usage * refactor: cleanup unused 'requests' lib imports * fix(oobabooga.py): cleanup 'requests' lib usage * fix(invoke_handler.py): remove unused 'requests' lib usage * refactor: cleanup unused 'requests' lib import * fix: fix linting errors * refactor(ollama/): move ollama to using base llm http handler removes 'requests' lib dep for ollama integration * fix(ollama_chat.py): fix linting errors * fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama	2024-12-17 12:50:04 -08:00
Krish Dholakia	ec36353b41	fix(main.py): fix retries being multiplied when using openai sdk (#7221 ) * fix(main.py): fix retries being multiplied when using openai sdk Closes https://github.com/BerriAI/litellm/pull/7130 * docs(prompt_management.md): add langfuse prompt management doc * feat(team_endpoints.py): allow teams to add their own models Enables teams to call their own finetuned models via the proxy * test: add better enforcement check testing for `/model/new` now that teams can add their own models * docs(team_model_add.md): tutorial for allowing teams to add their own models * test: fix test	2024-12-14 11:56:55 -08:00
Krish Dholakia	e68bb4e051	Litellm dev 12 12 2024 (#7203 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 47s Details * fix(azure/): support passing headers to azure openai endpoints Fixes https://github.com/BerriAI/litellm/issues/6217 * fix(utils.py): move default tokenizer to just openai hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls * fix(router.py): fix pattern matching router - add generic "" to it as well Fixes issue where generic "" model access group wouldn't show up * fix(pattern_match_deployments.py): match to more specific pattern match to more specific pattern allows setting generic wildcard model access group and excluding specific models more easily * fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty don't delete all router models b/c of empty list Fixes https://github.com/BerriAI/litellm/issues/7196 * fix(anthropic/): fix handling response_format for anthropic messages with anthropic api * fix(fireworks_ai/): support passing response_format + tool call in same message Addresses https://github.com/BerriAI/litellm/issues/7135 * Revert "fix(fireworks_ai/): support passing response_format + tool call in same message" This reverts commit `6a30dc6929`. * test: fix test * fix(replicate/): fix replicate default retry/polling logic * test: add unit testing for router pattern matching * test: update test to use default oai tokenizer * test: mark flaky test * test: skip flaky test	2024-12-13 08:54:03 -08:00
Krish Dholakia	481645e49c	fix(acompletion): support fallbacks on acompletion (#7184 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 45s Details * fix(acompletion): support fallbacks on acompletion allows health checks for wildcard routes to use fallback models * test: update cohere generate api testing * add max tokens to health check (#7000) * fix: fix health check test * test: update testing --------- Co-authored-by: Cameron <561860+wallies@users.noreply.github.com>	2024-12-11 19:20:54 -08:00
Ishaan Jaff	59daac5a3f	fix merge conflicts	2024-12-11 01:11:53 -08:00
Krrish Dholakia	02dd0c6e7e	build: Squashed commit of https://github.com/BerriAI/litellm/pull/7171 Closes https://github.com/BerriAI/litellm/pull/7171	2024-12-11 01:10:12 -08:00
Krrish Dholakia	06074bb13b	build: Squashed commit of https://github.com/BerriAI/litellm/pull/7170 Closes https://github.com/BerriAI/litellm/pull/7170	2024-12-11 01:03:57 -08:00
Krrish Dholakia	b9b34a7b99	build: Squashed commit of https://github.com/BerriAI/litellm/pull/7165 Closes https://github.com/BerriAI/litellm/pull/7165	2024-12-11 01:00:33 -08:00
Ishaan Jaff	78d132c1fb	(Refactor) Code Quality improvement - rename `text_completion_codestral.py` -> `codestral/completion/` (#7172 ) * rename files * fix codestral fim organization * fix CodestralTextCompletionConfig * fix import CodestralTextCompletion * fix BaseLLM * fix imports * fix CodestralTextCompletionConfig * fix imports CodestralTextCompletion	2024-12-11 00:55:47 -08:00
Ishaan Jaff	400eb28a91	Code Quality Improvement - move `aleph_alpha` to deprecated_providers (#7168 ) * move aleph alpha to deprecated providers * fix import location * fix aleph_alpha * pytest skip * undo change to test file	2024-12-11 00:50:40 -08:00

1 2 3 4 5 ...

1240 commits