litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 11:14:04 +00:00

Author	SHA1	Message	Date
Ishaan Jaff	32e8bdef6f	update clean up jobs	2024-12-28 19:45:19 -08:00
Krrish Dholakia	ab665dc7af	docs(spend_monitoring.md): cleanup doc	2024-12-28 19:42:03 -08:00
Krish Dholakia	cfb6890b9f	Litellm dev 12 28 2024 p2 (#7458 ) * docs(sidebar.js): docs for support model access groups for wildcard routes * feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route * refactor(docs/): make control model access a root-level doc in proxy sidebar easier to discover how to control model access on litellm * docs: more cleanup * feat(fireworks_ai/): add document inlining support Enables user to call non-vision models with images/pdfs/etc. * test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util * docs(docs/): add document inlining details to fireworks ai docs * feat(fireworks_ai/): allow user to dynamically disable auto add transform inline allows client-side disabling of this feature for proxy users * feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models now true as fireworks ai supports document inlining * test: fix tests * fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route	2024-12-28 19:38:06 -08:00
Ishaan Jaff	4e65722a00	(Bug Fix) Add health check support for realtime models (#7453 ) * add mode: realtime * add _realtime_health_check * test_realtime_health_check * azure _realtime_health_check * _realtime_health_check * Realtime Models * fix code quality	2024-12-28 18:15:00 -08:00
Ishaan Jaff	49fa6515c0	docs spend monitoring (#7461 )	2024-12-28 16:39:24 -08:00
Ishaan Jaff	8610c7bf93	docs release notes All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 41s Details	2024-12-27 21:41:21 -08:00
Ishaan Jaff	a962d88822	add keywords	2024-12-27 21:39:46 -08:00
Ishaan Jaff	570ab5498e	v1.56.3 release notes	2024-12-27 21:36:49 -08:00
Ishaan Jaff	79c783e83f	docs guardrails	2024-12-27 16:31:03 -08:00
Ishaan Jaff	0b4ef57172	docs add guardrail spec	2024-12-27 15:47:43 -08:00
Ishaan Jaff	b53861f3fb	docs update gemini/ link	2024-12-27 15:32:50 -08:00
Igor Ribeiro Lima	11932d0576	Add Gemini embedding doc (#7436 )	2024-12-27 15:27:26 -08:00
Ishaan Jaff	62753eea69	✨ (Feat) Log Guardrails run, guardrail response on logging integrations (#7445 ) * add guardrail_information to SLP * use standard_logging_guardrail_information * track StandardLoggingGuardrailInformation * use log_guardrail_information * use log_guardrail_information * docs guardrails * docs guardrails * update quick start * fix presidio logging for sync functions * update Guardrail type * enforce add_standard_logging_guardrail_information_to_request_data * update gd docs	2024-12-27 15:01:56 -08:00
Krrish Dholakia	37f998171b	docs(index.md): new release notes All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 39s Details	2024-12-26 22:01:29 -08:00
Krish Dholakia	9d82ff4793	Litellm dev 12 26 2024 p3 (#7434 ) * build(model_prices_and_context_window.json): update groq models to specify 'supports_vision' parameter Closes https://github.com/BerriAI/litellm/issues/7433 * docs(groq.md): add groq vision example to docs Closes https://github.com/BerriAI/litellm/issues/7433 * fix(prometheus.py): refactor self.litellm_proxy_failed_requests_metric to use label factory * feat(prometheus.py): new 'litellm_proxy_failed_requests_by_tag_metric' allows tracking failed requests by tag on proxy * fix(prometheus.py): fix exception logging * feat(prometheus.py): add new 'litellm_request_total_latency_by_tag_metric' enables tracking latency by use-case * feat(prometheus.py): add new llm api latency by tag metric * feat(prometheus.py): new litellm_deployment_latency_per_output_token_by_tag metric allows tracking deployment latency by tag * fix(prometheus.py): refactor 'litellm_requests_metric' to use enum values + label factory * feat(prometheus.py): new litellm_proxy_total_requests_by_tag metric allows tracking total requests by tag * feat(prometheus.py): new metric litellm_deployment_successful_fallbacks_by_tag allows tracking deployment fallbacks by tag * fix(prometheus.py): new 'litellm_deployment_failed_fallbacks_by_tag' metric allows tracking failed fallbacks on deployment by custom tag * test: fix test * test: rename test to run earlier * test: skip flaky test	2024-12-26 21:21:16 -08:00
Krish Dholakia	539f166166	Support budget/rate limit tiers for keys (#7429 ) * feat(proxy/utils.py): get associated litellm budget from db in combined_view for key allows user to create rate limit tiers and associate those to keys * feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set allows rate limit tiers to be easily applied to keys * docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers make feature discoverable * feat(key_management_endpoints.py): return litellm_budget_table value in key generate make it easy for user to know associated budget on key creation * fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate` * docs(key_management_endpoints.py): document budget_id usage * refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it * docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs * fix(customer_endpoints.py): use new pydantic obj name * docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm * Litellm dev 12 26 2024 p2 (#7432) * (Feat) Add logging for `POST v1/fine_tuning/jobs` (#7426) * init commit ft jobs logging * add ft logging * add logging for FineTuningJob * simple FT Job create test * (docs) - show all supported Azure OpenAI endpoints in overview (#7428) * azure batches * update doc * docs azure endpoints * docs endpoints on azure * docs azure batches api * docs azure batches api * fix(key_management_endpoints.py): fix key update to actually work * test(test_key_management.py): add e2e test asserting ui key update call works * fix: proxy/_types - fix linting erros * test: update test --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * fix: test * fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers * fix: fix linting errors * test: fix test * fix: remove unused import * test: update test * docs(customer_endpoints.py): document new model_max_budget param * test: specify unique key alias * docs(budget_management_endpoints.py): document new model_max_budget param * test: fix test * test: fix tests --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-12-26 19:05:27 -08:00
Ishaan Jaff	12c4e7e695	docs guardrail params (#7430 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 36s Details	2024-12-26 11:08:47 -08:00
Ishaan Jaff	c1d6c35aef	(docs) - show all supported Azure OpenAI endpoints in overview (#7428 ) * azure batches * update doc * docs azure endpoints * docs endpoints on azure * docs azure batches api * docs azure batches api	2024-12-26 09:01:41 -08:00
Krrish Dholakia	2dcde8ce2b	docs: cleanup doc All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 12s Details	2024-12-25 21:23:42 -08:00
Krrish Dholakia	506b6f6517	docs(fireworks_ai.md): add audio transcription to fireworks ai doc	2024-12-25 21:22:51 -08:00
Ishaan Jaff	005f2fa1aa	docs - batches cost tracking (#7422 )	2024-12-25 20:13:26 -08:00
Krish Dholakia	760328b6ad	Litellm dev 12 25 2025 p2 (#7420 ) * test: add new test image embedding to base llm unit tests Addresses https://github.com/BerriAI/litellm/issues/6515 * fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings Fix https://github.com/BerriAI/litellm/issues/6515 * feat: initial commit for fireworks ai audio transcription support Relevant issue: https://github.com/BerriAI/litellm/issues/7134 * test: initial fireworks ai test * feat(fireworks_ai/): implemented fireworks ai audio transcription config * fix(utils.py): register fireworks ai audio transcription config, in config manager * fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription' * refactor(fireworks_ai/): define text completion route with model name handling moves model name handling to specific fireworks routes, as required by their api * refactor(fireworks_ai/chat): define transform_Request - allows fixing model if accounts/ is missing * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix(handler.py): fix linting errors * fix(main.py): fix tgai text completion route * refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request * refactor: move test_fine_tuning_api out of local_testing reduces local testing ci/cd time	2024-12-25 18:35:34 -08:00
Ishaan Jaff	157810fcbf	fix docs warning (#7419 )	2024-12-25 16:42:14 -08:00
Ishaan Jaff	0ce5f9fe58	(feat) Support Dynamic Params for `guardrails` (#7415 ) * update CustomGuardrail * unit test custom guardrails * add dynamic params for aporia * add dynamic params to bedrock guard * add dynamic params for all guardrails * fix linting * fix should_run_guardrail * _validate_premium_user * update guardrail doc * doc update * update code q * should_run_guardrail	2024-12-25 16:07:29 -08:00
Ishaan Jaff	77fa751639	update docs base docker	2024-12-25 15:51:19 -08:00
Ishaan Jaff	c6ca835046	docs files api All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 11s Details	2024-12-24 20:46:43 -08:00
Ishaan Jaff	47e12802df	(feat) `/batches` Add support for using `/batches` endpoints in OAI format (#7402 ) * run azure testing on ci/cd * update docs on azure batches endpoints * add input azure.jsonl * refactor - use separate file for batches endpoints * fixes for passing custom llm provider to /batch endpoints * pass custom llm provider to files endpoints * update azure batches doc * add info for azure batches api * update batches endpoints * use simple helper for raising proxy exception * update config.yml * fix imports * update tests * use existing settings * update env var used * update configs * update config.yml * update ft testing	2024-12-24 16:58:05 -08:00
Krish Dholakia	c3edfc2c92	LiteLLM Minor Fixes & Improvements (12/23/2024) - p3 (#7394 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 35s Details * build(model_prices_and_context_window.json): add gemini-1.5-flash context caching * fix(context_caching/transformation.py): just use last identified cache point Fixes https://github.com/BerriAI/litellm/issues/6738 * fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google Fixes https://github.com/BerriAI/litellm/issues/6738 * fix(vertex_ai/gemini/): track context caching tokens * refactor(gemini/): place transformation.py inside `chat/` folder make it easy for user to know we support the equivalent endpoint * fix: fix import * refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder make it easier to see cost calculation logic * fix: fix linting errors * fix: fix circular import * feat(gemini/cost_calculator.py): support gemini context caching cost calculation generifies anthropic's cost calculation function and uses it across anthropic + gemini * build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching Closes https://github.com/BerriAI/litellm/issues/6891 * docs(gemini.md): add gemini context caching architecture diagram make it easier for user to understand how context caching works * docs(gemini.md): link to relevant gemini context caching code * docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code * fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario * fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation * test: fix test	2024-12-23 22:02:52 -08:00
Ishaan Jaff	442d309bcd	update release notes	2024-12-23 21:48:33 -08:00
Ishaan Jaff	f4a2143b82	update release notes	2024-12-23 21:43:47 -08:00
Ishaan Jaff	1f466ec9cf	release notes	2024-12-23 21:38:56 -08:00
Ishaan Jaff	957fbc6dfb	docs batches	2024-12-23 21:24:06 -08:00
Ishaan Jaff	43077a88d5	docs add files to supported endpoints	2024-12-23 20:51:34 -08:00
Krish Dholakia	48316520f4	LiteLLM Minor Fixes & Improvements (12/23/2024) - P2 (#7386 ) * fix(main.py): support 'mock_timeout=true' param allows mock requests on proxy to have a time delay, for testing * fix(main.py): ensure mock timeouts raise litellm.Timeout error triggers retry/fallbacks * fix: fix fallback + mock timeout testing * fix(router.py): always return remaining tpm/rpm limits, if limits are known allows for rate limit headers to be guaranteed * docs(timeout.md): add docs on mock timeout = true * fix(main.py): fix linting errors * test: fix test	2024-12-23 17:41:27 -08:00
Krish Dholakia	db59e08958	Litellm dev 12 23 2024 p1 (#7383 ) * feat(guardrails_endpoint.py): new `/guardrails/list` endpoint Allow users to view what the available guardrails are * docs: document new `/guardrails/list` endpoint * docs(enterprise.md): update docs * fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats * fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests * fix: fix linting errors * fix: remove unused import	2024-12-23 16:33:31 -08:00
Krish Dholakia	3671829e39	Complete 'requests' library removal (#7350 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 12s Details * refactor: initial commit moving watsonx_text to base_llm_http_handler + clarifying new provider directory structure * refactor(watsonx/completion/handler.py): move to using base llm http handler removes 'requests' library usage * fix(watsonx_text/transformation.py): fix result transformation migrates to transformation.py, for usage with base llm http handler * fix(streaming_handler.py): migrate watsonx streaming to transformation.py ensures streaming works with base llm http handler * fix(streaming_handler.py): fix streaming linting errors and remove watsonx conditional logic * fix(watsonx/): fix chat route post completion route refactor * refactor(watsonx/embed): refactor watsonx to use base llm http handler for embedding calls as well * refactor(base.py): remove requests library usage from litellm * build(pyproject.toml): remove requests library usage * fix: fix linting errors * fix: fix linting errors * fix(types/utils.py): fix validation errors for modelresponsestream * fix(replicate/handler.py): fix linting errors * fix(litellm_logging.py): handle modelresponsestream object * fix(streaming_handler.py): fix modelresponsestream args * fix: remove unused imports * test: fix test * fix: fix test * test: fix test * test: fix tests * test: fix test * test: fix patch target * test: fix test	2024-12-22 07:21:25 -08:00
Ishaan Jaff	8b1ea40e7b	add img to release notes All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 36s Details	2024-12-21 21:24:16 -08:00
Ishaan Jaff	bb801ad3d9	update release notes	2024-12-21 21:20:13 -08:00
Krish Dholakia	1aed988c11	Litellm docs update (#7365 ) * docs: fix release notes url + fix urlk * docs(index.md): add link to github releases * docs(index.md): add linkedin social url to release notes	2024-12-21 21:09:50 -08:00
Ishaan Jaff	31f5dcf8e5	docs	2024-12-21 20:52:39 -08:00
Ishaan Jaff	291229ee46	docs add 1.55.8 changelog	2024-12-21 20:51:39 -08:00
Ishaan Jaff	5a8f67c171	release notes v1.55.8	2024-12-21 20:31:54 -08:00
Ishaan Jaff	a3e732de39	(chore) - enforce model budgets on virtual keys as enterprise feature (#7353 ) * docs - enforce model budget as enterprise feature * docs link to correct place	2024-12-21 14:18:53 -08:00
Ishaan Jaff	6107f9f3f3	[Bug fix ]: Triton /infer handler incompatible with batch responses (#7337 ) * migrate triton to base llm http handler * clean up triton handler.py * use transform functions for triton * add TritonConfig * get openai params for triton * use triton embedding config * test_completion_triton_generate_api * test_completion_triton_infer_api * fix TritonConfig doc string * use TritonResponseIterator * fix triton embeddings * docs triton chat usage	2024-12-20 20:59:40 -08:00
Krish Dholakia	70a9ea99f2	Controll fallback prompts client-side (#7334 ) * feat(router.py): support passing model-specific messages in fallbacks * docs(routing.md): separate router timeouts into separate doc allow for 1 fallbacks doc (across proxy/router) * docs(routing.md): cleanup router docs * docs(reliability.md): cleanup docs * docs(reliability.md): cleaned up fallback doc just have 1 doc across sdk/proxy simplifies docs * docs(reliability.md): add setting model-specific fallback prompts * fix: fix linting errors * test: skip test causing openai rate limit errros * test: fix test * test: run vertex test first to catch error	2024-12-20 19:09:53 -08:00
jravi-fireworks	f8cf11f6d5	Fix LiteLLM documentation (#7333 ) Co-authored-by: Jetashree Ravi <jetashreeravi@Jetashrees-MBP.attlocal.net>	2024-12-20 15:04:23 -08:00
Krish Dholakia	27a4d08604	Litellm dev 2024 12 19 p3 (#7322 ) All checks were successful Read Version from pyproject.toml / read-version (push) Successful in 13s Details * fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params Fixes https://github.com/BerriAI/litellm/issues/7242 * test: new test for langfuse prompt management hook Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(main.py): add 'get_chat_completion_prompt' customlogger hook allows for langfuse prompt management Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296 * feat(langfuse_prompt_management.py): working e2e langfuse prompt management works with `langfuse/` route * feat(main.py): initial tracing for dynamic langfuse params allows admin to specify langfuse keys by model in model_list * feat(main.py): support passing langfuse credentials dynamically * fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params allows dynamic langfuse params to work * fix: fix linting errors * docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial * docs(prompt_management.md): cleanup doc * docs: cleanup topnav * docs(prompt_management.md): update docs to be easier to use * fix: remove unused imports * docs(prompt_management.md): add architectural overview doc * fix(litellm_logging.py): fix dynamic param passing * fix(langfuse_prompt_management.py): fix linting errors * fix: fix linting errors * fix: use typing_extensions for typealias to ensure python3.8 compatibility * test: use stream_options in test to account for tiktoken diff * fix: improve import error message, and check run test earlier	2024-12-20 13:30:16 -08:00
Krrish Dholakia	834067c570	docs: refactor admin ui docs	2024-12-20 10:20:07 -08:00
Ishaan Jaff	35076212ef	docs infinity rerank api docs	2024-12-19 18:51:55 -08:00
Ishaan Jaff	3a6dba8853	docs base rerank config	2024-12-19 18:33:32 -08:00

... 8 9 10 11 12 ...

3393 commits