Commit graph

19 commits

Author SHA1 Message Date
Ishaan Jaff
8c845847cd fix vertex embedding perf test 2025-03-26 10:25:50 -07:00
Krish Dholakia
6fd18651d1
Support litellm.api_base for vertex_ai + gemini/ across completion, embedding, image_generation (#9516)
* test(tests): add unit testing for litellm_proxy integration

* fix(cost_calculator.py): fix tracking cost in sdk when calling proxy

* fix(main.py): respect litellm.api_base on `vertex_ai/` and `gemini/` routes

* fix(main.py): consistently support custom api base across gemini + vertexai on embedding + completion

* feat(vertex_ai/): test

* fix: fix linting error

* test: set api base as None before starting loadtest
2025-03-25 23:46:20 -07:00
Ishaan Jaff
753290e5b8
(perf) Fix memory leak on /completions route (#8551)
* initial mem util test

* fix _cached_get_model_info_helper

* test memory usage

* fix tests

* fix mem usage
2025-02-14 18:58:16 -08:00
Krish Dholakia
ad2f66b3e3
[BETA] Add OpenAI /images/variations + Topaz API support (#7700)
* feat(main.py): initial commit for `/image/variations` endpoint support

* refactor(base_llm/): introduce new base llm base config for image variation endpoints

* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler

* fix: test

* feat(openai/): working openai `/image/variation` endpoint calls via sdk

* feat(topaz/): topaz sync image variation call support

Addresses https://github.com/BerriAI/litellm/issues/7593

* fix(topaz/transformation.py): fix linting errors

* fix(openai/image_variations/handler.py): fix passing json data

* fix(main.py): image_variation/

support async image variation route - `aimage_variation`

* fix(test_get_model_info.py): fix test

* fix: cleanup unused imports

* feat(openai/): add async `/image/variations` endpoint support

* feat(topaz/): support async `/image/variations` calls

* fix: test

* fix(utils.py): fix get_model_info_helper for no model info w/ provider config

handles situation where model info is not known but provider config exists

* test(test_router_fallbacks.py): mark flaky test

* fix: fix unused imports

* test: bump otel load test perf threshold - accounts for current load tests hitting same server
2025-01-11 23:27:46 -08:00
Ishaan Jaff
137879ffea vertex testing use pathrise-convert-1606954137718 2025-01-05 14:00:17 -08:00
Ishaan Jaff
ef8812d150 ci/cd update vertex acct 2025-01-05 13:43:32 -08:00
Krish Dholakia
31ace870a2
Litellm dev 12 28 2024 p1 (#7463)
* refactor(utils.py): migrate amazon titan config to base config

* refactor(utils.py): refactor bedrock meta invoke model translation to use base config

* refactor(utils.py): move bedrock ai21 to base config

* refactor(utils.py): move bedrock cohere to base config

* refactor(utils.py): move bedrock mistral to use base config

* refactor(utils.py): move all provider optional param translations to using a config

* docs(clientside_auth.md): clarify how to pass vertex region to litellm proxy

* fix(utils.py): handle scenario where custom llm provider is none / empty

* fix: fix get config

* test(test_otel_load_tests.py): widen perf margin

* fix(utils.py): fix get provider config check to handle custom llm's

* fix(utils.py): fix check
2024-12-28 20:26:00 -08:00
Ishaan Jaff
835db6ae98
(load testing) add vertex_ai embeddings load test (#6004)
* use vertex llm as base class for embeddings

* use correct vertex class in main.py

* set_headers in vertex llm base

* add types for vertex embedding requests

* add embedding handler for vertex

* use async mode for vertex embedding tests

* use vertexAI textEmbeddingConfig

* fix linting

* add sync and async mode testing for vertex ai embeddings

* add basic load test

* add vertex ai load test on ci cd
2024-10-03 14:39:15 +05:30
Ishaan Jaff
b878a67a7c fix otel load test % 2024-09-14 18:04:28 -07:00
Ishaan Jaff
c8d15544c8
[Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698)
* move cooldown logic to its own helper

* add new track deployment metrics folder

* increment success, fails for deployment in current minute

* fix cooldown logic

* fix test_aaarouter_dynamic_cooldown_message_retry_time

* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls

* clean up get from deployment test

* fix _async_get_healthy_deployments

* add mock InternalServerError

* test deployment failing 25% requests

* add test_high_traffic_cooldowns_one_bad_deployment

* fix vertex load test

* add test for rate limit error models in cool down

* change default cooldown time

* fix cooldown message time

* fix cooldown on 429 error

* fix doc string for _should_cooldown_deployment

* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
Ishaan Jaff
741c8e8a45
[Feat - Perf Improvement] DataDog Logger 91% lower latency (#5687)
* fix refactor dd to be an instance of custom logger

* migrate dd logger to be async

* clean up dd logging

* add datadog sync and async code

* use batching for datadog logger

* add doc string for dd logging

* add clear doc string

* fix doc string

* allow debugging intake url

* clean up requirements.txt

* allow setting custom batch size on logger

* fix dd logging to use compression

* fix linting

* add dd load test

* fix dd load test

* fix dd url

* add test_datadog_logging_http_request

* fix test_datadog_logging_http_request
2024-09-13 17:39:17 -07:00
Ishaan Jaff
cd8d7ca915
[Fix] Performance - use in memory cache when downloading images from a url (#5657)
* fix use in memory cache when getting images

* fix linting

* fix load testing

* fix load test size

* fix load test size

* trigger ci/cd again
2024-09-13 07:23:42 -07:00
Ishaan Jaff
88706488f9 fix otel load test 2024-09-11 21:27:31 -07:00
Ishaan Jaff
b80f27dce3 fix otel tests 2024-09-11 21:25:27 -07:00
Ishaan Jaff
97ecf86d3d fix langsmith load tests 2024-09-11 21:19:03 -07:00
Ishaan Jaff
b01a42ef4f fix langsmith load test 2024-09-11 21:16:16 -07:00
Ishaan Jaff
a1f8fcfeed fix load test 2024-09-11 21:06:42 -07:00
Ishaan Jaff
850b5dbadc add otel load test 2024-09-11 20:47:12 -07:00
Ishaan Jaff
e7b047223e add langsmith logging test 2024-09-11 20:35:11 -07:00