Commit graph

3222 commits

Author SHA1 Message Date
ishaan-jaff
6a0c853ae4 (feat) add open router transforms, models, route 2023-12-18 09:55:35 +05:30
ishaan-jaff
1e57c0c152 (feat) completion set function, function_call default None 2023-12-18 09:54:43 +05:30
ishaan-jaff
6b272076d7 (feat) openrouter set transforms=[] default 2023-12-18 09:16:33 +05:30
ishaan-jaff
b15682bc1f (feat) set default openrouter configs 2023-12-18 08:55:51 +05:30
Joel Eriksson
e214e6ab47 Fix bug when iterating over lines in ollama response
async for line in resp.content.iter_any() will return
incomplete lines when the lines are long, and that
results in an exception being thrown by json.loads()
when it tries to parse the incomplete JSON

The default behavior of the stream reader for aiohttp
response objects is to iterate over lines, so just
removing .iter_any() fixes the bug
2023-12-17 20:23:26 +02:00
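The failure mode described in the commit above can be sketched without aiohttp: iter_any()-style reads yield arbitrary-sized chunks that may end mid-line, while line-based iteration (what `async for line in resp.content` does for an aiohttp body) buffers until each newline. The NDJSON payload and chunk size below are made up for illustration:

```python
import json

# Hypothetical NDJSON stream, standing in for an ollama response body.
payload = b'{"response": "Hello"}\n{"response": " world", "done": true}\n'

# iter_any()-style reads hand back arbitrary-sized chunks, so a chunk
# can end mid-line and json.loads() raises on the incomplete JSON.
chunks = [payload[i:i + 16] for i in range(0, len(payload), 16)]
incomplete_chunk_fails = False
try:
    for chunk in chunks:
        json.loads(chunk)
except json.JSONDecodeError:
    incomplete_chunk_fails = True

# Line-based iteration buffers until b"\n", so every yielded line is a
# complete JSON document and parses cleanly.
parsed = [json.loads(line) for line in payload.splitlines()]
```

This is why simply dropping .iter_any() fixes the bug: the default iterator only yields complete lines.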
Joel Eriksson
a419d59542 Fix for issue that occurred when proxying to ollama
In the text_completion() function, it previously threw an exception at:
raw_response = response._hidden_params.get("original_response", None)

Because response was a coroutine object returned by an ollama_acompletion
call, I added an asyncio.iscoroutine() check for the response and handled
it by calling response = asyncio.run(response)

I also had to fix atext_completion(), where init_response was an instance
of TextCompletionResponse.

Since this case was not handled by the if-elif that checks if init_response
is a coroutine, a dict, or a ModelResponse instance, response was left
unbound, which threw an exception on the "return response" line.

Note that a regular pyright-based linter detects that response is possibly
unbound, and that the same code pattern is used in multiple other places
in main.py.

I would suggest that you change these cases:

init_response = await loop.run_in_executor(...
if isinstance(init_response, ...
    response = init_response
elif asyncio.iscoroutine(init_response):
    response = await init_response

To either just:

response = await loop.run_in_executor(
if asyncio.iscoroutine(response):
    response = await response

Or at the very least, include an else statement and set response = init_response,
so that response is never unbound when the code proceeds.
2023-12-17 17:27:47 +02:00
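The else-branch fix proposed in the commit message above can be sketched as follows; the function names here are stand-ins for illustration, not litellm's actual code:

```python
import asyncio

async def ollama_acompletion():
    # Stand-in for the async ollama call; returns a dict-like response.
    return {"object": "text_completion"}

def dispatch(init_response):
    # Always bind `response`: await (here via asyncio.run, since this is
    # a sync context) only when we were handed a coroutine, and fall
    # through to an else branch otherwise, so the final `return response`
    # can never hit an unbound variable.
    if asyncio.iscoroutine(init_response):
        response = asyncio.run(init_response)
    else:
        response = init_response
    return response
```

With the else branch present, a pyright-style "possibly unbound" warning on `response` also goes away, since every path through the if/else assigns it.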
Krrish Dholakia
a3c7a340a5 fix(ollama.py): fix sync ollama streaming 2023-12-16 21:23:21 -08:00
Krrish Dholakia
13d088b72e feat(main.py): add support for image generation endpoint 2023-12-16 21:07:29 -08:00
Krrish Dholakia
7847ae1e23 fix(traceloop.py): add additional openllmetry traces 2023-12-16 19:21:39 -08:00
Krrish Dholakia
7c2fad2d57 fix(azure.py): fix azure streaming logging 2023-12-16 18:06:08 -08:00
Krrish Dholakia
3923c389fd build(Dockerfile): fixing build requirements 2023-12-16 17:52:30 -08:00
Krrish Dholakia
50b741f8fa fix(Dockerfile): support mac 2023-12-16 16:01:02 -08:00
Krrish Dholakia
3291de9e11 fix(proxy_server.py): set up dependencies on server startup 2023-12-16 11:56:11 -08:00
Krrish Dholakia
4e828ff541 fix(health.md): add background health check details to docs 2023-12-16 10:31:59 -08:00
ishaan-jaff
abd7e48dee (ci/cd) run again 2023-12-16 22:34:10 +05:30
ishaan-jaff
5ee6b87f2e (fix) vertexai - gemini 2023-12-16 22:15:41 +05:30
ishaan-jaff
a5fce3b2de (test) gemini vision 2023-12-16 19:16:32 +05:30
ishaan-jaff
4af13e44df (test) vertex ai: stop running 4 requests / test 2023-12-16 19:01:12 +05:30
ishaan-jaff
e527137bee (test) gemini-pro-vision 2023-12-16 18:58:31 +05:30
ishaan-jaff
764f31c970 (feat) add async, async+stream for gemini 2023-12-16 18:58:12 +05:30
ishaan-jaff
efe8b75200 (fix) use litellm.vertex_vision_models 2023-12-16 18:39:40 +05:30
ishaan-jaff
f3ebfb0517 (test) gemini vision test 2023-12-16 18:38:36 +05:30
ishaan-jaff
0bf29a14e8 init vertex_vision_models 2023-12-16 18:37:00 +05:30
ishaan-jaff
db188507b9 (test) gemini pro vision 2023-12-16 18:31:55 +05:30
ishaan-jaff
774a725ccb (feat) add vertex ai gemini-pro-vision 2023-12-16 18:31:03 +05:30
ishaan-jaff
ed0b5d29b0 (test) proxy - cache config 2023-12-16 14:45:06 +05:30
ishaan-jaff
6b7d0eada4 (feat) proxy - set cache configs on proxy 2023-12-16 14:44:39 +05:30
ishaan-jaff
20b5505476 (feat) show POST request for HF embeddings 2023-12-16 13:09:49 +05:30
ishaan-jaff
37bf94dcdf (fix) proxy remove errant print 2023-12-16 12:26:02 +05:30
Krrish Dholakia
5b4ca42de6 docs(routing.md): add docs on using caching groups across deployments 2023-12-15 21:51:59 -08:00
Krrish Dholakia
84ad9f441e feat(router.py): support caching groups 2023-12-15 21:45:51 -08:00
ishaan-jaff
a7822b8772 (test) ollama-local, add llava test 2023-12-16 10:35:43 +05:30
ishaan-jaff
287633887e (feat) add ollama/llava 2023-12-16 10:35:27 +05:30
Krrish Dholakia
add153d110 fix(huggingface_restapi.py): add support for additional hf embedding formats 2023-12-15 21:02:41 -08:00
Krrish Dholakia
5fe5149070 refactor(test_router_caching.py): add tests for router caching 2023-12-15 20:38:51 -08:00
Krrish Dholakia
4d8376a8e9 fix(router.py): init clients for azure/openai on set_model_list 2023-12-15 20:23:29 -08:00
Krrish Dholakia
4791dda66f feat(proxy_server.py): enable infinite retries on rate limited requests 2023-12-15 20:03:41 -08:00
Krrish Dholakia
e5268fa6bc fix(router.py): support openai-compatible endpoints 2023-12-15 14:47:54 -08:00
Krrish Dholakia
d2e9798de9 fix(router.py): adding mistral to list of openai-compatible endpoints 2023-12-15 14:24:43 -08:00
Krrish Dholakia
edb88e31e4 fix(together_ai.py): return empty tgai responses 2023-12-15 10:46:35 -08:00
Krrish Dholakia
a09a6f24a4 fix(together_ai.py): additional logging for together ai encoding prompt 2023-12-15 10:39:23 -08:00
Krrish Dholakia
cab870f73a fix(ollama.py): fix ollama async streaming for /completions calls 2023-12-15 09:28:32 -08:00
ishaan-jaff
c9fb4ba88c (feat) proxy - use async langfuse logger 2023-12-15 21:57:12 +05:30
ishaan-jaff
7915a5a120 (feat) langfuse add _async_log_event 2023-12-15 21:46:32 +05:30
ishaan-jaff
003cf92af3 (test) dynamodb set table name 2023-12-15 21:39:03 +05:30
ishaan-jaff
9bdd6e73bb (feat) proxy logs: dynamodb - set table name 2023-12-15 21:38:44 +05:30
ishaan-jaff
99c505dba6 (fix) make dynamo logger async for proxy 2023-12-15 18:52:09 +05:30
ishaan-jaff
fa57a6d1f8 (fix) async + stream +sync logging 2023-12-15 18:31:36 +05:30
ishaan-jaff
6af3e73495 (fix) async+stream logger - building complete resp 2023-12-15 18:13:29 +05:30
ishaan-jaff
6c58fe65ff (feat) dynamo db - log call_type 2023-12-15 17:27:48 +05:30