Commit graph

43 commits

Author SHA1 Message Date
frob
3df7231fa5
Disable special tokens in ollama completion when counting tokens
Some models (e.g., codegemma) don't return a prompt_eval_count field, so ollama.py tries to compute the value by encoding the prompt. Unfortunately, the FIM symbols used in the prompt (e.g., "<|fim_prefix|>") cause the encoder to throw an exception, so we disable special-token processing.
2024-04-19 21:38:42 +02:00
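A minimal sketch of the workaround, assuming tiktoken is the fallback encoder; the helper name and encoding choice are assumptions, not the actual ollama.py code:

```python
import tiktoken

def fallback_prompt_token_count(prompt: str) -> int:
    enc = tiktoken.get_encoding("cl100k_base")
    # By default tiktoken raises ValueError when the text contains a
    # special token such as "<|fim_prefix|>"; disallowed_special=()
    # disables that check so FIM markers are encoded as plain text.
    return len(enc.encode(prompt, disallowed_special=()))
```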
Gregory Nwosu
559a4cde23
created defaults for response["eval_count"]
There is no way in litellm to disable the Ollama cache that strips the eval_count keys from the JSON response.
This PR allows the code to fall back to sensible defaults when those keys are missing.
See:
- https://github.com/ollama/ollama/issues/1573
- https://github.com/ollama/ollama/issues/2023
2024-04-08 02:03:54 +01:00
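A hedged sketch of the defaulting described above, assuming the parsed Ollama JSON is available as response_json; the helper name and the whitespace-based fallback estimates are illustrative, not the PR's exact code:

```python
def usage_with_defaults(response_json: dict, prompt: str, completion_text: str) -> dict:
    """Build usage counts, tolerating a cached Ollama response that
    omits prompt_eval_count / eval_count."""
    # Fall back to a rough whitespace-token estimate when the real
    # counts were stripped from the response.
    prompt_tokens = response_json.get("prompt_eval_count", len(prompt.split()))
    completion_tokens = response_json.get("eval_count", len(completion_text.split()))
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
```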
DaxServer
61b6f8be44 docs: Update references to Ollama repository URL
Updated references to the Ollama repository URL from https://github.com/jmorganca/ollama to https://github.com/ollama/ollama.
2024-03-31 19:35:37 +02:00
Krrish Dholakia
48af367885 fix(ollama.py): fix type issue 2024-03-28 15:01:56 -07:00
onukura
f86472518d Add a feature to ollama aembedding to accept batch input 2024-03-27 21:39:19 +00:00
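A hedged usage sketch of the batch input; the model name, api_base, and printed field are assumptions based on litellm's embedding API:

```python
import asyncio
import litellm

async def main():
    # One aembedding call with a list of inputs instead of a single string.
    resp = await litellm.aembedding(
        model="ollama/nomic-embed-text",      # illustrative model name
        input=["first document", "second document"],
        api_base="http://localhost:11434",    # default local Ollama server
    )
    print(len(resp.data))  # one embedding per input

asyncio.run(main())
```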
onukura
2df63cc621 Fix ollama embedding response 2024-03-25 16:26:49 +00:00
Lunik
cee20695eb
🐛 fix: Ollama vision model call arguments (e.g. llava)
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-02-26 17:52:55 +01:00
Krrish Dholakia
d1db67890c fix(ollama.py): support format for ollama 2024-02-06 10:11:52 -08:00
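A usage sketch of the format support, assuming it follows litellm's documented JSON mode for Ollama; the model name and prompt are illustrative:

```python
import litellm

response = litellm.completion(
    model="ollama/llama2",  # illustrative model
    messages=[{"role": "user", "content": "List three colors as JSON"}],
    format="json",  # forwarded to Ollama so the model emits valid JSON
)
print(response.choices[0].message.content)
```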
Ishaan Jaff
14c9e239a1
Merge pull request #1750 from vanpelt/patch-2
Re-raise exception in async ollama streaming
2024-02-05 08:12:17 -08:00
Krrish Dholakia
312c7462c8 refactor(ollama.py): trigger rebuild 2024-02-03 20:23:43 -08:00
Krrish Dholakia
01cef1fe9e fix(ollama.py): fix api connection error
https://github.com/BerriAI/litellm/issues/1735
2024-02-03 20:22:33 -08:00
Chris Van Pelt
1568b162f5
Re-raise exception in async ollama streaming 2024-02-01 16:14:07 -08:00
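An illustrative sketch of the change, assuming the async generator previously caught and swallowed errors; the surrounding names are hypothetical:

```python
async def ollama_async_streaming(resp):
    try:
        async for line in resp.content:
            yield line
    except Exception:
        # Before the fix, an error here could be swallowed and the stream
        # would just end silently; re-raising surfaces it to the caller.
        raise
```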
Krrish Dholakia
d46df34ff5 fix(utils.py): fix streaming chunks to not return role, unless set 2024-02-01 09:55:56 -08:00
ishaan-jaff
5f2cbfc711 (feat) litellm.completion - support ollama timeout 2024-01-09 10:34:41 +05:30
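A usage sketch of the timeout support; the model and value are illustrative:

```python
import litellm

response = litellm.completion(
    model="ollama/llama2",  # illustrative model
    messages=[{"role": "user", "content": "Hello"}],
    timeout=10,  # seconds before the Ollama request is aborted
)
```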
Krrish Dholakia
88d498a54a fix(ollama.py): use tiktoken as backup for prompt token counting 2024-01-09 09:47:18 +05:30
Krrish Dholakia
4905929de3 refactor: add black formatting 2023-12-25 14:11:20 +05:30
Krrish Dholakia
eaaad79823 feat(ollama.py): add support for async ollama embeddings 2023-12-23 18:01:25 +05:30
Krrish Dholakia
eb2d13e2fb test(test_completion.py-+-test_streaming.py): add ollama endpoint to ci/cd pipeline 2023-12-22 12:21:33 +05:30
Krrish Dholakia
57607f111a fix(ollama.py): use litellm.request timeout for async call timeout 2023-12-22 11:22:24 +05:30
Krrish Dholakia
f0df28362a feat(ollama.py): add support for ollama function calling 2023-12-20 14:59:55 +05:30
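A hedged sketch of function calling through litellm with an Ollama model, using the OpenAI-style functions parameter; the schema and model are illustrative:

```python
import litellm

response = litellm.completion(
    model="ollama/llama2",  # illustrative model
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    functions=[{
        "name": "get_current_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }],
)
print(response.choices[0].message)
```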
ishaan-jaff
9995229b97 (fix) proxy + ollama - raise exception correctly 2023-12-19 18:48:34 +05:30
Joel Eriksson
e214e6ab47 Fix bug when iterating over lines in ollama response
async for line in resp.content.iter_any() returns incomplete lines when the lines are long, which results in an exception being thrown by json.loads() when it tries to parse the incomplete JSON.

The default behavior of the stream reader for aiohttp response objects is to iterate over lines, so simply removing .iter_any() fixes the bug.
2023-12-17 20:23:26 +02:00
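A minimal sketch of the fix, assuming an aiohttp streaming response; the handler around it is illustrative:

```python
import json
import aiohttp

async def stream_ollama(url: str, payload: dict):
    async with aiohttp.ClientSession() as session:
        async with session.post(url, json=payload) as resp:
            # Buggy version: resp.content.iter_any() yields whatever bytes
            # are buffered, which can split a long JSON line in half:
            #   async for chunk in resp.content.iter_any(): ...
            # Iterating the StreamReader directly yields complete
            # newline-terminated lines, so each one parses cleanly.
            async for line in resp.content:
                if line.strip():
                    yield json.loads(line)
```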
Krrish Dholakia
a3c7a340a5 fix(ollama.py): fix sync ollama streaming 2023-12-16 21:23:21 -08:00
Krrish Dholakia
4e828ff541 fix(health.md): add background health check details to docs 2023-12-16 10:31:59 -08:00
Krrish Dholakia
4791dda66f feat(proxy_server.py): enable infinite retries on rate limited requests 2023-12-15 20:03:41 -08:00
Krrish Dholakia
cab870f73a fix(ollama.py): fix ollama async streaming for /completions calls 2023-12-15 09:28:32 -08:00
Krish Dholakia
a6e78497b5
Merge pull request #1122 from emsi/main
Fix #1119, no content when streaming.
2023-12-14 10:01:00 -08:00
Krrish Dholakia
7b8851cce5 fix(ollama.py): fix async completion calls for ollama 2023-12-13 13:10:25 -08:00
Mariusz Woloszyn
1feb6317f6 Fix #1119, no content when streaming. 2023-12-13 21:42:35 +01:00
Krrish Dholakia
8e7116635f fix(ollama.py): add support for async streaming 2023-12-12 16:44:20 -08:00
ishaan-jaff
99b48eff17 (fix) tkinter import 2023-12-12 12:18:25 -08:00
Krrish Dholakia
2c1c75fdf0 fix(ollama.py): enable parallel ollama completion calls 2023-12-11 23:18:37 -08:00
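A usage sketch of parallel calls via the async API; the model and prompts are illustrative:

```python
import asyncio
import litellm

async def main():
    # Three Ollama completions dispatched concurrently instead of serially.
    tasks = [
        litellm.acompletion(
            model="ollama/llama2",  # illustrative model
            messages=[{"role": "user", "content": f"Question {i}"}],
        )
        for i in range(3)
    ]
    responses = await asyncio.gather(*tasks)
    print(len(responses))

asyncio.run(main())
```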
ishaan-jaff
e82b8ed7e2 (feat) debug ollama POST request 2023-11-14 17:53:48 -08:00
Krrish Dholakia
ae35c13015 refactor(ai21,-aleph-alpha,-ollama): making ai21, aleph-alpha, ollama compatible with openai v1 sdk 2023-11-11 17:49:13 -08:00
ishaan-jaff
2f07460333 (feat) completion ollama raise exception when ollama resp != 200 2023-11-10 08:54:05 -08:00
Krrish Dholakia
6b40546e59 refactor(all-files): removing all print statements; adding pre-commit + flake8 to prevent future regressions 2023-11-04 12:50:15 -07:00
ishaan-jaff
7b3ee8d129 (feat) ollama raise Exceptions + use LiteLLM stream wrapper 2023-10-11 17:00:39 -07:00
Krrish Dholakia
306a38880d feat(ollama.py): exposing ollama config 2023-10-06 15:52:58 -07:00
Krrish Dholakia
a72880925c push cli tool 2023-09-26 13:30:47 -07:00
ishaan-jaff
2b9e3434ff fix async import error 2023-09-21 11:16:50 -07:00
ishaan-jaff
ac90c5286f conditional import async_generator 2023-09-21 11:09:57 -07:00
ishaan-jaff
35bb6f5a50 support acompletion + stream for ollama 2023-09-21 10:39:48 -07:00
ishaan-jaff
56bd8c1c52 ollama upgrades, fix streaming, add non-streaming resp 2023-09-09 14:07:13 -07:00