Commit graph

53 commits

Author SHA1 Message Date
Krish Dholakia
e00d4fb18c
Litellm dev 03 08 2025 p3 (#9089)
* feat(ollama_chat.py): pass down http client to ollama_chat

enables easier testing

* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint

Fixes https://github.com/BerriAI/litellm/issues/6683

* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
Tomáš Dvořák
b2eb2365b9
fix: ollama chat async stream error propagation (#8870)
Ref: #8868
2025-02-28 08:11:56 -08:00
Krish Dholakia
dfbbf0bde8
fix: dictionary changed size during iteration error (#8327) (#8341)
Co-authored-by: Joey Feldberg <joeyfeldberg@users.noreply.github.com>
Co-authored-by: Joey Feldberg <12495578+joeyfeldberg@users.noreply.github.com>
2025-02-07 16:20:28 -08:00
Krish Dholakia
e67f18b153
LiteLLM Minor Fixes & Improvements (01/18/2025) - p1 (#7857)
* OllamaChatConfig supports JSON schema response format in optional parameters (#7832)

* fix(types/router.py): handle none values for bool types

Fixes https://github.com/BerriAI/litellm/issues/7855#issuecomment-2599781974

* test: handle no hf token in env

---------

Co-authored-by: trislaz <35226192+trislaz@users.noreply.github.com>
2025-01-18 19:03:50 -08:00
Krish Dholakia
ec5a354eac
add azure o1 pricing (#7715)
* build(model_prices_and_context_window.json): add azure o1 pricing

Closes https://github.com/BerriAI/litellm/issues/7712

* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)

* Allows overriding keep_alive time in ollama (#7079)

* Allows overriding keep_alive time in ollama

* Also adds to ollama_chat

* Adds some info on the docs about this parameter

* fix: together ai warning (#7688)

Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>

* fix(proxy_server.py): handle config containing thread locked objects when using get_config_state

* fix(proxy_server.py): add exception to debug

* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1

---------

Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
2025-01-12 18:15:35 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
b82add11ba
LITELLM: Remove requests library usage (#7235)
* fix(generic_api_callback.py): remove requests lib usage

* fix(budget_manager.py): remove requests lib usgae

* fix(main.py): cleanup requests lib usage

* fix(utils.py): remove requests lib usage

* fix(argilla.py): fix argilla test

* fix(athina.py): replace 'requests' lib usage with litellm module

* fix(greenscale.py): replace 'requests' lib usage with httpx

* fix: remove unused 'requests' lib import + replace usage in some places

* fix(prompt_layer.py): remove 'requests' lib usage from prompt layer

* fix(ollama_chat.py): remove 'requests' lib usage

* fix(baseten.py): replace 'requests' lib usage

* fix(codestral/): replace 'requests' lib usage

* fix(predibase/): replace 'requests' lib usage

* refactor: cleanup unused 'requests' lib imports

* fix(oobabooga.py): cleanup 'requests' lib usage

* fix(invoke_handler.py): remove unused 'requests' lib usage

* refactor: cleanup unused 'requests' lib import

* fix: fix linting errors

* refactor(ollama/): move ollama to using base llm http handler

removes 'requests' lib dep for ollama integration

* fix(ollama_chat.py): fix linting errors

* fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama
2024-12-17 12:50:04 -08:00
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

'

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router'

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router impor t

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handler's

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix impor t

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
Krish Dholakia
350cfc36f7
Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Ishaan Jaff
d81ae45827
(Perf / latency improvement) improve pass through endpoint latency to ~50ms (before PR was 400ms) (#6874)
* use correct location for types

* fix types location

* perf improvement for pass through endpoints

* update lint check

* fix import

* fix ensure async clients test

* fix azure.py health check

* fix ollama
2024-11-22 18:47:26 -08:00
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements (#6309)
* ruff add PLR0915

* add noqa for PLR0915

* fix noqa

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Ishaan Jaff
85acdb9193
[Feat] Add max_completion_tokens param (#5691)
* add max_completion_tokens

* add max_completion_tokens

* add max_completion_tokens support for OpenAI models

* add max_completion_tokens param

* add max_completion_tokens for bedrock converse models

* add test for converse maxTokens

* fix openai o1 param mapping test

* move test optional params

* add max_completion_tokens for anthropic api

* fix conftest

* add max_completion tokens for vertex ai partner models

* add max_completion_tokens for fireworks ai

* add max_completion_tokens for hf rest api

* add test for param mapping

* add param mapping for vertex, gemini + testing

* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd

* add max_completion_tokens to openai supported params

* fix fireworks ai param mapping
2024-09-14 14:57:01 -07:00
Krish Dholakia
1e7e538261
LiteLLM Minor fixes + improvements (08/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
Krrish Dholakia
f36e7e0754 fix(ollama_chat.py): fix passing assistant message with tool call param
Fixes https://github.com/BerriAI/litellm/issues/5319
2024-08-22 10:00:03 -07:00
Krrish Dholakia
cc42f96d6a fix(ollama_chat.py): fix sync tool calling
Fixes https://github.com/BerriAI/litellm/issues/5245
2024-08-19 08:31:46 -07:00
Krrish Dholakia
61f4b71ef7 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
Fabrício Ceolin
936b76662f Follow redirects 2024-08-10 12:12:55 -03:00
Krrish Dholakia
3fee0b1dc5 fix(ollama_chat.py): fix passing auth headers to ollama
Fixes https://github.com/BerriAI/litellm/issues/5046
2024-08-05 09:33:09 -07:00
Krrish Dholakia
b25d4a8cb3 feat(ollama_chat.py): support ollama tool calling
Closes https://github.com/BerriAI/litellm/issues/4812
2024-07-26 21:51:54 -07:00
Krrish Dholakia
6e9f048618 fix: move to using pydantic obj for setting values 2024-07-11 13:18:36 -07:00
Krish Dholakia
2116dbcdc1
Merge pull request #4089 from paneru-rajan/ollama-func-calls
Fix: Output Structure of Ollama chat
2024-07-03 08:57:31 -07:00
corrm
b8a8b0847c Added improved function name handling in ollama_async_streaming 2024-06-24 05:56:56 +03:00
Edwin Jose George
d5f6e3ac08 refactor: black 2024-06-09 16:37:58 +09:30
Krrish Dholakia
6cca5612d2 refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
KX
d3921a3d28 fix: add missing seed parameter to ollama input
Current ollama interfacing does not allow for seed, which is supported in https://github.com/ollama/ollama/blob/main/docs/api.md#parameters and https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values

This resolves that by adding in handling of seed parameter.
2024-05-31 01:47:56 +08:00
Rajan Paneru
65b07bcb8c Preserving the Pydantic Message Object
Following statement replaces the Pydantic Message Object and initialize it with the dict
model_response["choices"][0]["message"] = response_json["message"]

We need to make sure message is always litellm.Message object

As a fix, based on the code of ollama.py file, i am updating just the content intead of entire object for both sync and async functions
2024-05-10 22:12:32 +09:30
Jack Collins
dffe616267 Make newline same in async function 2024-05-05 18:51:53 -07:00
Jack Collins
c217a07d5e Fix: Set finish_reason to tool_calls for non-stream responses 2024-05-05 18:47:58 -07:00
Jack Collins
107a77368f Parse streamed function calls as single delta 2024-05-05 18:47:16 -07:00
Krish Dholakia
0714eb3526
Merge branch 'main' into litellm_ollama_tool_call_reponse 2024-05-01 10:24:05 -07:00
merefield
50a917a096 FIX: use value not param name when mapping frequency_penalty 2024-04-20 09:25:35 +01:00
Krrish Dholakia
3c6b6355c7 fix(ollama_chat.py): accept api key as a param for ollama calls
allows user to call hosted ollama endpoint using bearer token for auth
2024-04-19 13:02:13 -07:00
DaxServer
61b6f8be44 docs: Update references to Ollama repository url
Updated references to the Ollama repository URL from https://github.com/jmorganca/ollama to https://github.com/ollama/ollama.
2024-03-31 19:35:37 +02:00
Krrish Dholakia
dfcc0c9ff0 fix(ollama_chat.py): don't pop from dictionary while iterating through it 2024-03-22 08:18:22 -07:00
Krrish Dholakia
524c244dd9 fix(utils.py): support response_format param for ollama
https://github.com/BerriAI/litellm/issues/2580
2024-03-19 21:07:20 -07:00
Krrish Dholakia
0e7b30bec9 fix(utils.py): return function name for ollama_chat function calls 2024-03-08 08:01:10 -08:00
Krrish Dholakia
12bb705f31 fix(ollama_chat.py): map tool call to assistant for ollama calls 2024-02-29 19:11:35 -08:00
Krrish Dholakia
73d8e3e640 fix(ollama_chat.py): fix token counting 2024-02-06 22:18:46 -08:00
Krrish Dholakia
d1db67890c fix(ollama.py): support format for ollama 2024-02-06 10:11:52 -08:00
Krrish Dholakia
9e091a0624 fix(ollama_chat.py): explicitly state if ollama call is streaming or not 2024-02-06 07:43:47 -08:00
Krrish Dholakia
2e3748e6eb fix(ollama_chat.py): fix ollama chat completion token counting 2024-02-06 07:30:26 -08:00
Krrish Dholakia
37de964da4 fix(ollama_chat.py): fix the way optional params are passed in 2024-01-30 15:48:48 -08:00
Krrish Dholakia
43f139fafd fix(ollama_chat.py): fix default token counting for ollama chat 2024-01-24 20:09:17 -08:00
TheDiscoMole
ed07de2729 changing ollama response parsing to expected behaviour 2024-01-19 23:36:24 +01:00
puffo
becff369dc fix(ollama_chat.py): use tiktoken as backup for prompt token counting 2024-01-18 10:47:24 -06:00
ishaan-jaff
3f6e6e7f55 (fix) ollama_chat - support function calling + fix for comp 2023-12-26 20:07:55 +05:30
ishaan-jaff
3839213d28 (feat) ollama_chat acompletion without streaming 2023-12-26 20:01:51 +05:30
ishaan-jaff
837ce269ae (feat) ollama_chat add async stream 2023-12-25 23:45:27 +05:30
ishaan-jaff
916ba9a6b3 (feat) ollama_chat - add streaming support 2023-12-25 23:38:01 +05:30