Krrish Dholakia
ff12e023ae
fix(fixes-for-text-completion-streaming): fixes for text completion streaming
2024-01-08 13:39:54 +05:30
Krrish Dholakia
6333fbfe56
fix(main.py): support cost calculation for text completion streaming object
2024-01-08 12:41:43 +05:30
Krrish Dholakia
b1fd0a164b
fix(huggingface_restapi.py): support timeouts for huggingface + openai text completions
...
https://github.com/BerriAI/litellm/issues/1334
2024-01-08 11:40:56 +05:30
Krrish Dholakia
8cee267a5b
fix(caching.py): support ttl, s-max-age, and no-cache cache controls
...
https://github.com/BerriAI/litellm/issues/1306
2024-01-03 12:42:43 +05:30
ishaan-jaff
f3b8d9c3ef
(fix) counting response tokens+streaming
2024-01-03 12:06:39 +05:30
ishaan-jaff
790dcff5e0
(feat) add xinference as an embedding provider
2024-01-02 15:32:26 +05:30
ishaan-jaff
70cdc16d6f
(feat) cache context manager - update cache
2023-12-30 19:50:53 +05:30
ishaan-jaff
ddddfe6602
(feat) add cache context manager
2023-12-30 19:32:51 +05:30
Krrish Dholakia
77be3e3114
fix(main.py): don't set timeout as an optional api param
2023-12-30 11:47:07 +05:30
ishaan-jaff
aee38d9329
(fix) batch_completions - set default timeout
2023-12-30 11:35:55 +05:30
Krrish Dholakia
38f55249e1
fix(router.py): support retry and fallbacks for atext_completion
2023-12-30 11:19:32 +05:30
ishaan-jaff
2f4cd3b569
(feat) proxy - support dynamic timeout per request
2023-12-30 10:55:42 +05:30
ishaan-jaff
ee682be093
(feat) add cloudflare streaming
2023-12-29 12:01:26 +05:30
ishaan-jaff
8fcfb7df22
(feat) cloudflare ai workers - add completion support
2023-12-29 11:34:58 +05:30
Krrish Dholakia
6f2734100f
fix(main.py): fix async text completion streaming + add new tests
2023-12-29 11:33:42 +05:30
ishaan-jaff
367e9913dc
(feat) v0 adding cloudflare
2023-12-29 09:32:29 +05:30
ishaan-jaff
95e6d2fbba
(feat) add voyage ai embeddings
2023-12-28 17:10:15 +05:30
ishaan-jaff
78f0c0228b
(feat) add mistral api embeddings
2023-12-28 16:41:55 +05:30
Krrish Dholakia
3b1685e7c6
feat(health_check.py): more detailed health check calls
2023-12-28 09:12:57 +05:30
ishaan-jaff
f4fe2575cc
(fix) use client for text_completion()
2023-12-27 15:20:26 +05:30
Krrish Dholakia
c9fdbaf898
fix(azure.py,-openai.py): correctly raise errors if streaming calls fail
2023-12-27 15:08:37 +05:30
Krrish Dholakia
c88a8d71f0
fix: fix linting issues
2023-12-27 12:21:31 +05:30
Krish Dholakia
5c3a61d62f
Merge pull request #1248 from danikhan632/main
...
updated oobabooga to new api and support for embeddings
2023-12-27 11:33:56 +05:30
Ishaan Jaff
22d0c21829
Merge pull request #1249 from evantancy/main
...
fix: helicone logging
2023-12-27 11:24:19 +05:30
evantancy
668c786099
fix: helicone logging
2023-12-27 12:16:29 +08:00
dan
c4dfd9be7c
updated oobabooga to new api and support for embeddings
2023-12-26 19:45:28 -05:00
ishaan-jaff
751d57379d
(fix) support ollama_chat for acompletion
2023-12-26 20:01:51 +05:30
Krrish Dholakia
f0b6b9dce2
fix(main.py): support ttl being set for completion, embedding, image generation calls
2023-12-26 17:22:40 +05:30
ishaan-jaff
a463625452
(chore) completion - move functions lower
2023-12-26 14:35:59 +05:30
ishaan-jaff
7b097305c1
(feat) support logprobs, top_logprobs openai
2023-12-26 14:00:42 +05:30
ishaan-jaff
0b0d22d58c
(feat) add logprobs, top_logprobs to litellm.completion
2023-12-26 13:39:48 +05:30
ishaan-jaff
8c35aebdf8
(feat) ollama chat
2023-12-25 23:04:17 +05:30
ishaan-jaff
da4ec6c8b6
(feat) add ollama_chat v0
2023-12-25 14:27:10 +05:30
Krrish Dholakia
4905929de3
refactor: add black formatting
2023-12-25 14:11:20 +05:30
Krrish Dholakia
343a06fd84
fix(proxy_server.py): raise streaming exceptions
2023-12-25 07:18:09 +05:30
Krrish Dholakia
1262d89ab3
feat(gemini.py): add support for completion calls for gemini-pro (google ai studio)
2023-12-24 09:42:58 +05:30
Krrish Dholakia
eaaad79823
feat(ollama.py): add support for async ollama embeddings
2023-12-23 18:01:25 +05:30
Krrish Dholakia
be68796eba
fix(router.py): add support for async image generation endpoints
2023-12-21 14:38:44 +05:30
Krrish Dholakia
f355e03515
feat(main.py): add async image generation support
2023-12-20 16:58:40 +05:30
Krrish Dholakia
b3962e483f
feat(azure.py): add support for azure image generations endpoint
2023-12-20 16:37:21 +05:30
Krrish Dholakia
f0df28362a
feat(ollama.py): add support for ollama function calling
2023-12-20 14:59:55 +05:30
Krish Dholakia
408f232bd7
Merge branch 'main' into main
2023-12-18 17:54:34 -08:00
Krrish Dholakia
34509d8dda
fix(main.py): return async completion calls
2023-12-18 17:41:54 -08:00
ishaan-jaff
bd15c61a65
(feat) OR default transforms=[]
2023-12-18 10:59:09 +05:30
ishaan-jaff
3a97a2817f
(fix) default args batch completion
2023-12-18 10:05:44 +05:30
ishaan-jaff
1e57c0c152
(feat) completion set function, function_call default None
2023-12-18 09:54:43 +05:30
ishaan-jaff
6b272076d7
(feat) openrouter set transforms=[] default
2023-12-18 09:16:33 +05:30
Joel Eriksson
a419d59542
Fix for issue that occurred when proxying to ollama
...
The text_completion() function previously threw an exception at:
raw_response = response._hidden_params.get("original_response", None)
because response was a coroutine object returned by an ollama_acompletion call,
so I added an asyncio.iscoroutine() check for the response and handle it
by calling response = asyncio.run(response).
I also had to fix atext_completion(), where init_response was an instance
of TextCompletionResponse.
Since this case was not handled by the if-elif that checks whether init_response
is a coroutine, a dict, or a ModelResponse instance, response was left unbound,
which threw an exception on the "return response" line.
Note that a regular pyright-based linter detects that response is possibly
unbound, and that the same code pattern is used in multiple other places
in main.py.
I would suggest that you either change these cases:

    init_response = await loop.run_in_executor(...
    if isinstance(init_response, ...
        response = init_response
    elif asyncio.iscoroutine(init_response):
        response = await init_response

to just:

    response = await loop.run_in_executor(
    if asyncio.iscoroutine(response):
        response = await response

or, at the very least, include an else statement that sets response = init_response,
so that response is never unbound when the code proceeds (a minimal sketch follows this entry).
2023-12-17 17:27:47 +02:00
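A minimal, self-contained sketch of the guard suggested in the commit above. This is not litellm's actual main.py code; fake_provider_call and get_completion are hypothetical names used only to illustrate the asyncio.iscoroutine() check together with the else branch that keeps response bound.

    import asyncio

    async def fake_provider_call():
        # Hypothetical stand-in for a call like ollama_acompletion that
        # hands back a coroutine instead of a finished response object.
        return {"text": "hello"}

    async def get_completion(call):
        init_response = call()
        if asyncio.iscoroutine(init_response):
            # Await the provider coroutine to get the real response.
            response = await init_response
        else:
            # The else branch keeps `response` bound for non-coroutine results.
            response = init_response
        return response

    print(asyncio.run(get_completion(fake_provider_call)))

With the else branch in place, response is assigned on every path, which is exactly the condition the commit author notes a pyright-based linter would otherwise flag as "possibly unbound".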
Krrish Dholakia
a3c7a340a5
fix(ollama.py): fix sync ollama streaming
2023-12-16 21:23:21 -08:00
Krrish Dholakia
13d088b72e
feat(main.py): add support for image generation endpoint
2023-12-16 21:07:29 -08:00