Ishaan Jaff
97d7a5e78e
fix deployment name
2025-04-19 09:23:22 -07:00
Ishaan Jaff
8a1023fa2d
test image gen fix in build and test
2025-04-02 21:33:24 -07:00
Ishaan Jaff
6b3bfa2b42
(Feat) - return x-litellm-attempted-fallbacks in responses from litellm proxy (#8558)
...
* add_fallback_headers_to_response
* test x-litellm-attempted-fallbacks
* unit test attempted fallbacks
* fix add_fallback_headers_to_response
* docs document response headers
* fix file name
2025-02-15 14:54:23 -08:00
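A minimal sketch of reading the new header off a proxy response, assuming a locally running proxy; the URL, key, and model below are placeholders, while the header name comes from the commit above.

```python
# Sketch: check whether the proxy fell back to another deployment for this call.
# Proxy URL, API key, and model are placeholders; the header name is from the commit.
import requests

resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]},
)
# "0" when the primary deployment answered, higher when fallbacks were attempted
print(resp.headers.get("x-litellm-attempted-fallbacks"))
```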
Krish Dholakia
6bafdbc546
Litellm dev 01 25 2025 p4 (#8006)
...
* feat(main.py): use asyncio.sleep for mock_Timeout=true on async request
adds unit testing to ensure the proxy does not fail if specific OpenAI requests hang (e.g. the recent o1 outage)
* fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming
Fixes https://github.com/BerriAI/litellm/issues/7942
* Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming"
This reverts commit 7a052a64e3.
* fix(deepseek-r-1): return reasoning_content as a top-level param
ensures compatibility with existing tools that use it
* fix: fix linting error
2025-01-26 08:01:05 -08:00
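A short sketch of what that top-level param looks like to a caller, assuming the litellm SDK; the model id is illustrative, and the attribute name comes from the commit message.

```python
# Sketch: after this fix, deepseek-r1's reasoning text is exposed as a top-level
# field on the message instead of being buried in provider-specific output.
import litellm

resp = litellm.completion(
    model="deepseek/deepseek-reasoner",   # illustrative model id
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
)
print(resp.choices[0].message.reasoning_content)  # reasoning trace (name from the commit)
print(resp.choices[0].message.content)            # final answer
```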
Krish Dholakia
08b124aeb6
Litellm dev 01 25 2025 p2 (#8003)
...
* fix(base_utils.py): support nested json schema passed in for anthropic calls
* refactor(base_utils.py): refactor ref parsing to prevent infinite loop
* test(test_openai_endpoints.py): refactor anthropic test to use bedrock
* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls
Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
2025-01-25 16:50:57 -08:00
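For context, a hedged example of the kind of input the first bullet refers to: a response_format whose schema nests a $ref, which is also what the ref-parsing refactor has to walk without looping. The model id and schema are illustrative, not taken from the commit.

```python
# Sketch: a nested JSON schema (with a $ref into $defs) passed to an anthropic
# call via response_format. Model id and schema are illustrative placeholders.
import litellm

schema = {
    "type": "object",
    "properties": {"person": {"$ref": "#/$defs/person"}},
    "$defs": {
        "person": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        }
    },
}

resp = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Extract: Jane is 30."}],
    response_format={"type": "json_schema", "json_schema": {"name": "extract", "schema": schema}},
)
```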
Krish Dholakia
513b1904ab
Add attempted-retries and timeout values to response headers + more testing (#7926)
...
* feat(router.py): add retry headers to response
makes it easy to add testing to ensure model-specific retries are respected
* fix(add_retry_headers.py): clarify attempted retries vs. max retries
* test(test_fallbacks.py): add test for checking if max retries set for model is respected
* test(test_fallbacks.py): assert values for attempted retries and max retries are as expected
* fix(utils.py): return timeout in litellm proxy response headers
* test(test_fallbacks.py): add test to assert model specific timeout used on timeout error
* test: add bad model with timeout to proxy
* fix: fix linting error
* fix(router.py): fix get model list from model alias
* test: loosen test restriction - account for other events on proxy
2025-01-22 22:19:44 -08:00
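A sketch of inspecting those values on a proxy response; only "attempted retries" is named explicitly above, so the exact header names below are assumptions, and the URL, key, and model are placeholders.

```python
# Sketch: read the retry/timeout values this commit adds to proxy response headers.
# Header names are assumptions inferred from the commit; URL, key, model are placeholders.
import requests

resp = requests.post(
    "http://localhost:4000/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={"model": "bad-model", "messages": [{"role": "user", "content": "hi"}]},
)
for name in ("x-litellm-attempted-retries", "x-litellm-max-retries", "x-litellm-timeout"):
    print(name, "=", resp.headers.get(name))
```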
Krish Dholakia
3a7b13efa2
feat(health_check.py): set upperbound for api when making health check call (#7865)
...
* feat(health_check.py): set upperbound for api when making health check call
prevents a bad model in the health check from hanging and causing pod restarts
* fix(health_check.py): cleanup task once completed
* fix(constants.py): bump default health check timeout to 1min
* docs(health.md): add 'health_check_timeout' to health docs on litellm
* build(proxy_server_config.yaml): add bad model to health check
2025-01-18 19:47:43 -08:00
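A sketch of where the documented 'health_check_timeout' might sit, written as a Python dict mirroring the YAML proxy config; the placement under general_settings is an assumption, and the 60s value echoes the "bump default health check timeout to 1min" step.

```python
# Sketch of a proxy config (as a Python dict) with the health check timeout from
# this commit. Placement under general_settings is an assumption; the bad model
# mirrors the one the commit adds to proxy_server_config.yaml for health checks.
proxy_config = {
    "model_list": [
        {
            "model_name": "bad-model",
            "litellm_params": {"model": "openai/bad-model", "api_key": "bad-key"},
        }
    ],
    "general_settings": {
        "health_check_timeout": 60,  # seconds; assumed key placement
    },
}
```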
Ishaan Jaff
47e12802df
(feat) /batches - Add support for using /batches endpoints in OAI format (#7402)
...
* run azure testing on ci/cd
* update docs on azure batches endpoints
* add input azure.jsonl
* refactor - use separate file for batches endpoints
* fixes for passing custom llm provider to /batch endpoints
* pass custom llm provider to files endpoints
* update azure batches doc
* add info for azure batches api
* update batches endpoints
* use simple helper for raising proxy exception
* update config.yml
* fix imports
* update tests
* use existing settings
* update env var used
* update configs
* update config.yml
* update ft testing
2024-12-24 16:58:05 -08:00
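A sketch of driving those endpoints through the proxy in OAI format with the OpenAI SDK; the base_url, key, and input file are placeholders, with "azure.jsonl" echoing the test input named in the commit body.

```python
# Sketch: upload a batch input file and create a batch against the LiteLLM proxy,
# using plain OpenAI-format calls. base_url, api_key, and the file are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# upload the batch input file through the proxy
batch_file = client.files.create(file=open("azure.jsonl", "rb"), purpose="batch")

# create the batch against the uploaded file
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```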
Krish Dholakia
4ac66bd843
LiteLLM Minor Fixes and Improvements (09/07/2024) (#5580)
...
* fix(litellm_logging.py): set completion_start_time_float to end_time_float if none
Fixes https://github.com/BerriAI/litellm/issues/5500
* feat(__init__.py): add new 'openai_text_completion_compatible_providers' list
Fixes https://github.com/BerriAI/litellm/issues/5558
Handles correctly routing fireworks ai calls when done via text completions
* fix: fix linting errors
* fix: fix linting errors
* fix(openai.py): fix exception raised
* fix(openai.py): fix error handling
* fix(_redis.py): allow all supported arguments for redis cluster (#5554)
* Revert "fix(_redis.py): allow all supported arguments for redis cluster (#5554)" (#5583)
This reverts commit f2191ef4cb.
* fix(router.py): return model alias w/ underlying deployment on router.get_model_list()
Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666
* test: handle flaky tests
---------
Co-authored-by: Jonas Dittrich <58814480+Kakadus@users.noreply.github.com>
2024-09-09 18:54:17 -07:00
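A sketch of the get_model_list() behavior restored in the last fix above, assuming the Router API; the alias name, deployment, and the model_group_alias parameter spelling are illustrative assumptions.

```python
# Sketch (assumed parameter names): after this change, router.get_model_list()
# also returns model aliases mapped onto their underlying deployment.
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",
            "litellm_params": {"model": "openai/gpt-4o", "api_key": "sk-..."},
        }
    ],
    model_group_alias={"my-alias": "gpt-4o"},  # alias -> underlying model group (assumed)
)

for deployment in router.get_model_list():
    print(deployment["model_name"])  # expected to include "my-alias" alongside "gpt-4o"
```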
Krrish Dholakia
0a016d33e6
Revert "fix(router.py): return model alias w/ underlying deployment on router.get_model_list()"
...
This reverts commit 638896309c.
2024-09-07 18:04:56 -07:00
Krrish Dholakia
638896309c
fix(router.py): return model alias w/ underlying deployment on router.get_model_list()
...
Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666
2024-09-07 18:01:31 -07:00
Ishaan Jaff
f1ffa82062
fix use provider specific routing
2024-08-07 14:37:20 -07:00
Ishaan Jaff
404360b28d
test pass through endpoint
2024-08-06 12:16:00 -07:00
Ishaan Jaff
b35c63001d
fix setup for endpoints
2024-07-31 17:09:08 -07:00
Ishaan Jaff
c8dfc95e90
add examples on config
2024-07-31 15:29:06 -07:00
Ishaan Jaff
9863520376
support using */*
2024-07-25 18:48:56 -07:00
Ishaan Jaff
e2397c3b83
fix test_team_2logging langfuse
2024-06-19 21:14:18 -07:00
Ishaan Jaff
d409ffbaa9
fix test_chat_completion_different_deployments
2024-06-17 23:04:48 -07:00
Ishaan Jaff
cb386fda20
test - making mistral embedding request on proxy
2024-06-12 15:10:20 -07:00
Marc Abramowitz
83c242bbb3
Add commented set_verbose line to proxy_config
...
because I've wanted to do this a couple of times and couldn't remember
the exact syntax.
2024-05-16 15:59:37 -07:00
Krrish Dholakia
54587db402
fix(alerting.py): fix datetime comparison logic
2024-05-14 22:10:09 -07:00
Ishaan Jaff
9bde3ccd1d
(ci/cd) fixes
2024-05-13 20:49:02 -07:00
Krrish Dholakia
99e8f0715e
test(test_end_users.py): fix end user region routing test
2024-05-11 22:42:43 -07:00
Ishaan Jaff
9c4f1ec3e5
fix - failing test_end_user_specific_region test
2024-05-11 17:05:37 -07:00
Ishaan Jaff
a4695c3010
test - using langfuse as a failure callback
2024-05-10 17:37:32 -07:00
Krrish Dholakia
3d18897d69
feat(router.py): enable filtering model group by 'allowed_model_region'
2024-05-08 22:10:17 -07:00
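A hedged sketch of the setup this enables: a deployment tagged with a region so the router can filter the model group against a caller's 'allowed_model_region'. The key names (region_name under litellm_params) and values are assumptions, not taken from the commit.

```python
# Sketch (assumed key names): a region-tagged deployment; the router can then
# drop it from the model group for callers whose allowed_model_region differs.
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",
            "litellm_params": {
                "model": "azure/gpt-4o-eu",   # placeholder deployment
                "api_key": "sk-...",          # placeholder
                "region_name": "eu",          # region tag the filter compares against (assumed)
            },
        }
    ]
)
```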
Ishaan Jaff
6a06aba443
(ci/cd) use db connection limit
2024-05-06 11:15:22 -07:00
Ishaan Jaff
e8d3dd475a
fix fake endpoint used on ci/cd
2024-05-06 10:37:39 -07:00
Ishaan Jaff
56a75ee7fe
(ci/cd) fix tests
2024-05-01 13:42:54 -07:00
Krrish Dholakia
d4bca6707b
ci(proxy_server_config.yaml): use redis for usage-based-routing-v2
2024-04-22 13:34:36 -07:00
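A sketch of the router settings this CI change exercises, written as a Python dict mirroring the YAML: the v2 usage-based routing strategy backed by Redis so usage counts are shared across proxy workers. The env-var references are placeholders.

```python
# Sketch of router settings pairing usage-based-routing-v2 with Redis.
# The os.environ/... strings are proxy-style env references; values are placeholders.
router_settings = {
    "routing_strategy": "usage-based-routing-v2",
    "redis_host": "os.environ/REDIS_HOST",
    "redis_port": "os.environ/REDIS_PORT",
    "redis_password": "os.environ/REDIS_PASSWORD",
}
```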
Krrish Dholakia
1507b23e30
test(test_openai_endpoints.py): make test stricter
2024-04-20 12:11:54 -07:00
Krrish Dholakia
01a1a8f731
fix(caching.py): dual cache async_batch_get_cache fix + testing
...
This fixes a bug in usage-based-routing-v2 caused by how the result was being returned from dual cache async_batch_get_cache. It also adds unit testing for that function (and its sync equivalent).
2024-04-19 15:03:25 -07:00
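A sketch of the function that commit fixes, with assumed signatures; the class and method names come from the commit body, while the keys and values are placeholders.

```python
# Sketch (assumed signatures): exercising DualCache.async_batch_get_cache, the
# function this commit fixes and unit-tests. Keys and values are placeholders.
import asyncio
from litellm.caching import DualCache

async def main():
    cache = DualCache()  # in-memory cache, optionally backed by redis
    await cache.async_set_cache("key-a", "value-a")
    values = await cache.async_batch_get_cache(keys=["key-a", "key-b"])
    print(values)  # expected: the hit for key-a and a miss (None) for key-b

asyncio.run(main())
```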
Ishaan Jaff
adae555fb1
Merge branch 'main' into litellm_fix_using_wildcard_openai_models_proxy
2024-04-15 14:35:06 -07:00
Ishaan Jaff
6df5337e65
test - wildcard openai models on proxy
2024-04-15 14:05:26 -07:00
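A sketch of the wildcard entry a test like this exercises: one model_list entry that routes any openai/<model> request straight through. The config keys and the env-var reference are assumptions about the usual wildcard setup, not taken from the commit.

```python
# Sketch (assumed config keys): a wildcard model_list entry for any OpenAI model.
wildcard_model_entry = {
    "model_name": "openai/*",                    # match any requested openai/<model>
    "litellm_params": {
        "model": "openai/*",                     # pass the requested model through
        "api_key": "os.environ/OPENAI_API_KEY",  # proxy-style env reference (assumed)
    },
}
```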
Ishaan Jaff
ecc6aa060f
test - team based logging on proxy
2024-04-15 13:26:55 -07:00
Krrish Dholakia
ea1574c160
test(test_openai_endpoints.py): add concurrency testing for user defined rate limits on proxy
2024-04-12 18:56:13 -07:00
Krrish Dholakia
74aa230eac
fix(main.py): automatically infer mode for text completion models
2024-04-12 14:16:21 -07:00
Krrish Dholakia
3665b890f8
build(proxy_server_config.yaml): cleanup config
2024-04-11 20:20:09 -07:00
Krrish Dholakia
bdfb74f8a5
test(test_openai_endpoints.py): add local test, for proxy concurrency
2024-04-11 17:16:23 -07:00
Krrish Dholakia
695eb129ad
fix(router.py): fix client init for streaming timeouts
2024-04-05 12:30:15 -07:00
Krrish Dholakia
9fe1c26e0d
build(proxy_server_config.yaml): add store_model_in_db flag
2024-04-03 23:20:41 -07:00
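For context, a sketch of the flag that commit adds, shown as a Python dict mirroring the YAML; with it enabled the proxy persists models added via the UI/API in the database rather than relying only on the static config. Placement under general_settings is assumed.

```python
# Sketch: the store_model_in_db flag added to proxy_server_config.yaml,
# shown under general_settings (assumed placement).
general_settings = {
    "store_model_in_db": True,
}
```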
Krrish Dholakia
c0204310ee
fix(main.py): fix translation to text_completions format for async text completion calls
2024-03-30 09:02:51 -07:00
Ishaan Jaff
4d22fe0947
bump: version 1.34.4 → 1.34.5
2024-03-26 18:04:05 -07:00
Ishaan Jaff
da503eab18
Merge branch 'main' into litellm_remove_litellm_telemetry
2024-03-26 11:35:02 -07:00
Ishaan Jaff
b6af44c4dc
(fix) telemetry = false
2024-03-26 11:23:23 -07:00
Krrish Dholakia
49e8cdbff9
fix(router.py): check for context window error when handling 400 status code errors
...
was causing proxy context window fallbacks to not work as expected
2024-03-26 08:08:15 -07:00
Krrish Dholakia
f588bff69b
fix(proxy_server.py): fix spend log update
2024-03-18 20:26:28 -07:00
Krish Dholakia
2c2f322d5a
Update proxy_server_config.yaml
2024-03-16 11:16:13 -07:00
Krish Dholakia
f3cf1ec71f
Update proxy_server_config.yaml
2024-03-15 22:40:39 -07:00
Krish Dholakia
24e673dd6d
Update proxy_server_config.yaml
2024-03-15 22:19:37 -07:00