Litellm dev 2024 12 19 p3 (#7322)

* fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into `map_openai_params`

Fixes https://github.com/BerriAI/litellm/issues/7242
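
A minimal sketch of the intended behavior (not the exact repro from #7242): with `drop_params=True`, an optional param the target provider doesn't support is stripped before provider-specific param mapping runs, instead of raising an error. The model and param choice below are assumptions for illustration.

```python
import litellm

# assumption: the target provider does not support `parallel_tool_calls`;
# with drop_params=True the param is removed before param mapping,
# instead of erroring out.
response = litellm.completion(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "hi"}],
    parallel_tool_calls=True,
    drop_params=True,
)
```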

* test: new test for langfuse prompt management hook

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(main.py): add 'get_chat_completion_prompt' CustomLogger hook

allows for langfuse prompt management

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296
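
A hedged sketch of what a custom logger implementing the hook could look like; the hook signature below is inferred from the commit description, not copied from the codebase.

```python
from typing import List, Optional, Tuple

from litellm.integrations.custom_logger import CustomLogger


class MyPromptManager(CustomLogger):
    def get_chat_completion_prompt(
        self,
        model: str,
        messages: List[dict],
        non_default_params: dict,
        prompt_id: Optional[str],
        prompt_variables: Optional[dict],
        dynamic_callback_params: dict,
    ) -> Tuple[str, List[dict], dict]:
        # rewrite the prompt before the LLM call goes out;
        # "my-template" is a hypothetical prompt id
        if prompt_id == "my-template":
            messages = [{"role": "system", "content": "You are helpful."}] + messages
        return model, messages, non_default_params
```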

* feat(langfuse_prompt_management.py): working e2e langfuse prompt management

works with `langfuse/` route
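
A usage sketch for the `langfuse/` route; the prompt name and variables here are hypothetical.

```python
import os

import litellm

os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."

# "jokes" is a hypothetical prompt name managed in Langfuse;
# prompt_variables are substituted into the stored template
response = litellm.completion(
    model="langfuse/gpt-3.5-turbo",
    prompt_id="jokes",
    prompt_variables={"topic": "tea"},
    messages=[{"role": "user", "content": "tell me a joke"}],
)
```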

* feat(main.py): initial tracing for dynamic langfuse params

allows an admin to specify langfuse keys per model in the model_list

* feat(main.py): support passing langfuse credentials dynamically
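
A sketch of passing credentials per request; the `langfuse_public_key` / `langfuse_secret_key` / `langfuse_host` parameter names are assumed from LiteLLM's dynamic callback params, and the prompt id is hypothetical.

```python
import litellm

# credentials passed per call override the env-var defaults,
# so different models/teams can log to different Langfuse projects
response = litellm.completion(
    model="langfuse/gpt-3.5-turbo",
    prompt_id="jokes",
    messages=[{"role": "user", "content": "tell me a joke"}],
    langfuse_public_key="pk-lf-...",
    langfuse_secret_key="sk-lf-...",
    langfuse_host="https://cloud.langfuse.com",
)
```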

* fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params

allows dynamic langfuse params to work

* fix: fix linting errors

* docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial

* docs(prompt_management.md): cleanup doc

* docs: cleanup topnav

* docs(prompt_management.md): update docs to be easier to use

* fix: remove unused imports

* docs(prompt_management.md): add architectural overview doc

* fix(litellm_logging.py): fix dynamic param passing

* fix(langfuse_prompt_management.py): fix linting errors

* fix: fix linting errors

* fix: use typing_extensions for TypeAlias to ensure Python 3.8 compatibility
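
For context, a minimal example of the compatibility pattern; the alias name below is hypothetical, not the commit's actual one.

```python
from typing import Dict, Union

from typing_extensions import TypeAlias

# TypeAlias was only added to typing in Python 3.10; importing it from
# typing_extensions keeps this module importable on Python 3.8.
PromptVariables: TypeAlias = Dict[str, Union[str, int, float]]
```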

* test: use stream_options in test to account for tiktoken diff

* fix: improve import error message, and check whether to run the test earlier

Krish Dholakia 2024-12-20 13:30:16 -08:00, committed by GitHub
parent 205e2dbe3c
commit b026230b0a
17 changed files with 648 additions and 260 deletions

```diff
@@ -92,7 +92,7 @@ def test_run(model: str):
         print("Non-stream cost : NONE")
     print(f"Non-stream cost : {completion_cost(response) * 100:.4f} (response)")
 
-    response = router.completion(**kwargs, stream=True)  # type: ignore
+    response = router.completion(**kwargs, stream=True, stream_options={"include_usage": True})  # type: ignore
     response = stream_chunk_builder(list(response), messages=kwargs["messages"])  # type: ignore
     output = response.choices[0].message.content.replace("\n", "")  # type: ignore
     streaming_cost_calc = completion_cost(response) * 100
```