Litellm dev 11 07 2024 (#6649)

* fix(streaming_handler.py): save finish_reason values that might show up mid-stream (store the last one received)

Fixes https://github.com/BerriAI/litellm/issues/6104
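
A minimal sketch of the idea, assuming a hypothetical helper class; the actual CustomStreamWrapper internals in streaming_handler.py may differ:

    # Illustrative only: names and chunk shape are assumptions, not litellm's code.
    from typing import Optional

    class FinishReasonTracker:
        def __init__(self) -> None:
            # remember the last finish_reason seen, even if it arrives mid-stream
            self.received_finish_reason: Optional[str] = None

        def record(self, finish_reason: Optional[str]) -> None:
            if finish_reason:  # ignore None / "" placeholder values
                self.received_finish_reason = finish_reason

    tracker = FinishReasonTracker()
    for reason in ["stop", "", "content_filter", ""]:
        tracker.record(reason)
    assert tracker.received_finish_reason == "content_filter"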

* refactor: add readme to litellm_core_utils/

make it easier to navigate

* fix(team_endpoints.py): return team id + object for invalid team in `/team/list`

* fix(streaming_handler.py): remove import

* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards (#6646)

* fix(pattern_match_deployments.py): default to user input if unable to… (#6632)

* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards
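
Roughly the fallback behavior being described, as a sketch with hypothetical names (not the actual pattern_match_deployments.py implementation):

    import re

    def resolve_deployment(user_model: str, wildcard_map: dict) -> str:
        """Map a requested model through wildcard patterns; fall back to the user's input."""
        for pattern, deployment in wildcard_map.items():
            # treat "*" as a wildcard segment, e.g. "openai/*" matches "openai/gpt-4o"
            if re.fullmatch(re.escape(pattern).replace(r"\*", ".*"), user_model):
                return deployment
        return user_model  # no pattern matched -> default to what the user sent

    assert resolve_deployment("openai/gpt-4o", {"openai/*": "azure/gpt-4o-deployment"}) == "azure/gpt-4o-deployment"
    assert resolve_deployment("unknown/model", {"openai/*": "azure/gpt-4o-deployment"}) == "unknown/model"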

* test: fix test

* test: reset test name

* test: update conftest to reload proxy server module between tests

* ci(config.yml): move langfuse out of local_testing

reduce ci/cd time

* ci(config.yml): cleanup langfuse ci/cd tests

* fix: update test to not use global proxy_server app module

* ci: move caching to a separate test pipeline

speed up ci pipeline

* test: update conftest to check if proxy_server attr exists before reloading

* build(conftest.py): don't block on inability to reload proxy_server
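
A hedged sketch of what such a conftest guard can look like (module path and fixture shape are assumptions):

    # conftest.py (sketch)
    import importlib
    import pytest

    @pytest.fixture(autouse=True)
    def reload_proxy_server():
        try:
            import litellm.proxy.proxy_server as proxy_server
            importlib.reload(proxy_server)  # reset module-level state between tests
        except Exception:
            pass  # don't block the test run if the module can't be (re)loaded
        yield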

* ci(config.yml): update caching unit test filter to work on 'cache' keyword as well

* fix(encrypt_decrypt_utils.py): use function to get salt key
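
The gist of the change, as a sketch (environment-variable names are assumptions):

    import os
    from typing import Optional

    def get_salt_key() -> Optional[str]:
        # single place to resolve the salt key instead of reading env vars inline
        return os.getenv("LITELLM_SALT_KEY") or os.getenv("LITELLM_MASTER_KEY")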

* test: mark flaky test
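
Marking a flaky test usually means a retry decorator, e.g. (assuming the pytest-rerunfailures plugin; other retry plugins use different keyword arguments):

    import pytest

    @pytest.mark.flaky(reruns=3, reruns_delay=1)
    def test_sometimes_fails():
        ...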

* test: handle anthropic overloaded errors

* refactor: create separate ci/cd pipeline for proxy unit tests

make ci/cd faster

* ci(config.yml): add litellm_proxy_unit_testing to build_and_test jobs

* ci(config.yml): generate prisma binaries for proxy unit tests

* test: readd vertex_key.json

* ci(config.yml): remove `-s` from proxy_unit_test cmd

speed up test

* ci: remove any 'debug' logging flag

speed up ci pipeline

* test: fix test

* test(test_braintrust.py): rerun

* test: add delay for braintrust test

* chore: comment for maritalk (#6607)

* Update gpt-4o-2024-08-06, o1-preview, and o1-mini models in the model cost map (#6654)

* Adding supports_response_schema to gpt-4o-2024-08-06 models

* o1 models do not support vision

---------

Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>

* (QOL improvement) add unit testing for all static methods in litellm_logging.py (#6640)

* add unit testing for standard logging payload

* unit testing for static methods in litellm_logging

* add code coverage check for litellm_logging

* litellm_logging_code_coverage

* test_get_final_response_obj

* fix validate_redacted_message_span_attributes

* test validate_redacted_message_span_attributes
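
The testing pattern being added, illustrated with a stand-in class (the real static methods and signatures in litellm_logging.py may differ):

    # Stand-in example of unit testing a static method directly on the class.
    class LoggingHelper:
        @staticmethod
        def redact_message(text: str, redact: bool) -> str:
            return "redacted-by-litellm" if redact else text

    def test_redact_message():
        assert LoggingHelper.redact_message("secret", redact=True) == "redacted-by-litellm"
        assert LoggingHelper.redact_message("hello", redact=False) == "hello"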

* (feat) log error class, function_name on prometheus service failure hook + only log DB related failures on DB service hook  (#6650)

* log error on prometheus service failure hook

* use a more accurate function name for the wrapper that handles logging DB metrics

* fix log_db_metrics

* test_log_db_metrics_failure_error_types

* fix linting

* fix auth checks
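
Roughly the extra context being attached to the failure metric, as a sketch with hypothetical names (not the actual hook signature):

    import traceback

    def build_failure_labels(error: Exception, function_name: str) -> dict:
        # label the Prometheus service-failure metric with the error class
        # and the function that raised it
        return {
            "error_class": type(error).__name__,
            "function_name": function_name,
            "traceback": traceback.format_exc(),
        }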

* Update several Azure AI models in model cost map (#6655)

* Adding Azure Phi 3/3.5 models to model cost map

* Update gpt-4o-mini models

* Adding missing Azure Mistral models to model cost map

* Adding Azure Llama3.2 models to model cost map

* Fix Gemini-1.5-flash pricing

* Fix Gemini-1.5-flash output pricing

* Fix Gemini-1.5-pro prices

* Fix Gemini-1.5-flash output prices

* Correct gemini-1.5-pro prices

* Correction on Vertex Llama3.2 entry
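
For context, a cost-map entry has roughly this shape (field names follow litellm's model_prices_and_context_window.json; the values here are illustrative, not the corrected prices):

    example_entry = {
        "gemini/gemini-1.5-flash": {
            "max_input_tokens": 1_000_000,      # illustrative
            "max_output_tokens": 8_192,         # illustrative
            "input_cost_per_token": 7.5e-08,    # illustrative
            "output_cost_per_token": 3.0e-07,   # illustrative
            "litellm_provider": "gemini",
            "mode": "chat",
            "supports_vision": True,
        }
    }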

---------

Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>

* fix(streaming_handler.py): fix linting error

* test: remove duplicate test

it causes a Gemini rate-limit error

---------

Co-authored-by: nobuo kawasaki <nobu007@users.noreply.github.com>
Co-authored-by: Emerson Gomes <emerson.gomes@gmail.com>
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
@@ -3470,6 +3470,86 @@ def test_unit_test_custom_stream_wrapper_repeating_chunk(
        continue


def test_unit_test_gemini_streaming_content_filter():
    chunks = [
        {
            "text": "##",
            "tool_use": None,
            "is_finished": False,
            "finish_reason": "stop",
            "usage": {"prompt_tokens": 37, "completion_tokens": 1, "total_tokens": 38},
            "index": 0,
        },
        {
            "text": "",
            "is_finished": False,
            "finish_reason": "",
            "usage": None,
            "index": 0,
            "tool_use": None,
        },
        {
            "text": " Downsides of Prompt Hacking in a Customer Portal\n\nWhile prompt engineering can be incredibly",
            "tool_use": None,
            "is_finished": False,
            "finish_reason": "stop",
            "usage": {"prompt_tokens": 37, "completion_tokens": 17, "total_tokens": 54},
            "index": 0,
        },
        {
            "text": "",
            "is_finished": False,
            "finish_reason": "",
            "usage": None,
            "index": 0,
            "tool_use": None,
        },
        {
            "text": "",
            "tool_use": None,
            "is_finished": False,
            "finish_reason": "content_filter",
            "usage": {"prompt_tokens": 37, "completion_tokens": 17, "total_tokens": 54},
            "index": 0,
        },
        {
            "text": "",
            "is_finished": False,
            "finish_reason": "",
            "usage": None,
            "index": 0,
            "tool_use": None,
        },
    ]

    completion_stream = ModelResponseListIterator(model_responses=chunks)

    response = litellm.CustomStreamWrapper(
        completion_stream=completion_stream,
        model="gemini/gemini-1.5-pro",
        custom_llm_provider="gemini",
        logging_obj=litellm.Logging(
            model="gemini/gemini-1.5-pro",
            messages=[{"role": "user", "content": "Hey"}],
            stream=True,
            call_type="completion",
            start_time=time.time(),
            litellm_call_id="12345",
            function_id="1245",
        ),
    )

    stream_finish_reason: Optional[str] = None
    idx = 0
    for chunk in response:
        print(f"chunk: {chunk}")
        if chunk.choices[0].finish_reason is not None:
            stream_finish_reason = chunk.choices[0].finish_reason
        idx += 1
    print(f"num chunks: {idx}")
    assert stream_finish_reason == "content_filter"


def test_unit_test_custom_stream_wrapper_openai():
    """
    Test if last streaming chunk ends with '?', if the message repeats itself.