# Base image for building
ARG LITELLM_BUILD_IMAGE=python:3.11.8-slim

# Runtime image
ARG LITELLM_RUNTIME_IMAGE=python:3.11.8-slim
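# Example (a sketch, not part of the original file): either base image can be
# overridden at build time with docker's --build-arg flag. The tag
# "litellm:local" is a hypothetical name chosen for illustration.
#   docker build --build-arg LITELLM_RUNTIME_IMAGE=python:3.11.8-slim -t litellm:local .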

# Builder stage
FROM $LITELLM_BUILD_IMAGE AS builder

# Set the working directory to /app
WORKDIR /app

# Install build dependencies
RUN apt-get clean && apt-get update && \
    apt-get install -y gcc python3-dev && \
    rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip setuptools && \
    pip install build

# Copy the current directory contents into the container at /app
COPY . .

# Build Admin UI
RUN chmod +x docker/build_admin_ui.sh && ./docker/build_admin_ui.sh

# Build the package
RUN rm -rf dist/* && python -m build

# There should be only one wheel file now; this assumes the build creates only one
RUN ls -1 dist/*.whl | head -1
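# Optional stricter check (a sketch, not in the original file): the ls above only
# prints the first wheel and does not fail when several exist. Uncommenting the
# line below would fail the build unless dist/ holds exactly one wheel.
# RUN test "$(ls -1 dist/*.whl | wc -l)" -eq 1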

# Install the package
RUN pip install dist/*.whl

# Install dependencies as wheels
RUN pip wheel --no-cache-dir --wheel-dir=/wheels/ -r requirements.txt

# Runtime stage
FROM $LITELLM_RUNTIME_IMAGE AS runtime

# Update dependencies and clean up - handles Debian security issue
RUN apt-get update && apt-get upgrade -y && rm -rf /var/lib/apt/lists/*

# Upgrade pip and setuptools to the latest versions
RUN pip install --no-cache-dir --upgrade pip setuptools

WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . .
RUN ls -la /app

# Copy the built wheel from the builder stage to the runtime stage; assumes only one wheel file is present
COPY --from=builder /app/dist/*.whl .
COPY --from=builder /wheels/ /wheels/

# Install the built wheel using pip; the wildcard again assumes it is the only wheel file
RUN pip install *.whl /wheels/* --no-index --find-links=/wheels/ && rm -f *.whl && rm -rf /wheels

# Install semantic-cache [Experimental] - this is needed here and not in requirements.txt because redisvl pins to pydantic 1.0
RUN pip install redisvl==0.0.7 --no-deps

# Ensure PyJWT is used, not jwt
RUN pip uninstall jwt -y
RUN pip uninstall PyJWT -y
RUN pip install PyJWT==2.9.0 --no-cache-dir

# Build Admin UI
RUN chmod +x docker/build_admin_ui.sh && ./docker/build_admin_ui.sh

# Generate prisma client
RUN prisma generate
RUN chmod +x docker/entrypoint.sh

EXPOSE 4000/tcp

# Set your entrypoint and command

ENTRYPOINT ["litellm"]

# Append "--detailed_debug" to the end of CMD to view detailed debug logs
# CMD ["--port", "4000", "--detailed_debug"]
CMD ["--port", "4000"]
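# Example run (a sketch; "litellm:local" is a hypothetical image tag): arguments
# placed after the image name replace CMD and are passed to the "litellm" entrypoint.
#   docker run -p 4000:4000 litellm:local --port 4000 --detailed_debug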