ci: use ollama container image with loaded models (#2410)

# What does this PR do?

Instead of pulling the models on every CI run, we now start a single Ollama
container image that has the models already pulled and baked in.

This removes the CI flakiness caused by model pulls.

Signed-off-by: Sébastien Han <seb@redhat.com>
Sébastien Han 2025-06-06 12:08:20 +02:00 committed by GitHub
parent 692709cd45
commit 0d0b8d2be1
GPG key ID: B5690EEEBB952194
4 changed files with 29 additions and 194 deletions


@@ -1,26 +1,9 @@
 name: Setup Ollama
-description: Start Ollama and cache model
-inputs:
-  models:
-    description: Comma-separated list of models to pull
-    default: "llama3.2:3b-instruct-fp16,all-minilm:latest"
+description: Start Ollama
 runs:
   using: "composite"
   steps:
-    - name: Install and start Ollama
+    - name: Start Ollama
       shell: bash
       run: |
-        # the ollama installer also starts the ollama service
-        curl -fsSL https://ollama.com/install.sh | sh
-
-    # Do NOT cache models - pulling the cache is actually slower than just pulling the model.
-    # It takes ~45 seconds to pull the models from the cache and unpack it, but only 30 seconds to
-    # pull them directly.
-    # Maybe this is because the cache is being pulled at the same time by all the matrix jobs?
-    - name: Pull requested models
-      if: inputs.models != ''
-      shell: bash
-      run: |
-        for model in $(echo "${{ inputs.models }}" | tr ',' ' '); do
-          ollama pull "$model"
-        done
+        docker run -d --name ollama -p 11434:11434 docker.io/leseb/ollama-with-models
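
Because `docker run -d` returns as soon as the container is created, later steps may race the Ollama server's startup. A minimal sketch of a readiness wait follows. `retry_until` is a hypothetical helper that is not part of this commit, and the probe in the usage comment assumes `curl` is available on the runner and that Ollama answers on `/api/tags` at the mapped port.

```shell
# Hypothetical helper (not part of this commit): run a command repeatedly
# until it succeeds or the attempt budget is used up.
#   $1 = maximum number of attempts
#   $2 = delay in seconds between attempts
#   remaining arguments = the command to run
retry_until() {
  _attempts="$1"
  _delay="$2"
  shift 2
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$_delay"
    _i=$((_i + 1))
  done
  return 1
}

# Example usage (assumption: run after the "Start Ollama" step):
#   retry_until 30 1 curl -fsS -o /dev/null http://localhost:11434/api/tags
```

The helper returns non-zero when the budget is exhausted, so a CI step that calls it would fail fast instead of letting the tests hit a server that is not up yet.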


@@ -50,7 +50,7 @@ jobs:
         env:
           INFERENCE_MODEL: "meta-llama/Llama-3.2-3B-Instruct"
         run: |
-          LLAMA_STACK_LOG_FILE=server.log nohup uv run llama stack run ./llama_stack/templates/ollama/run.yaml --image-type venv &
+          LLAMA_STACK_LOG_FILE=server.log nohup uv run llama stack run ./llama_stack/templates/ollama/run.yaml --image-type venv --env OLLAMA_URL="http://0.0.0.0:11434" &
 
       - name: Wait for Llama Stack server to be ready
         if: matrix.client-type == 'http'
@@ -87,6 +87,7 @@ jobs:
       - name: Run Integration Tests
         env:
           INFERENCE_MODEL: "meta-llama/Llama-3.2-3B-Instruct"
+          OLLAMA_URL: "http://0.0.0.0:11434"
         run: |
           if [ "${{ matrix.client-type }}" == "library" ]; then
             stack_config="ollama"
@@ -107,7 +108,7 @@ jobs:
       - name: Write ollama logs to file
         if: ${{ always() }}
         run: |
-          sudo journalctl -u ollama.service > ollama.log
+          sudo docker logs ollama > ollama.log
 
       - name: Upload all logs to artifacts
         if: ${{ always() }}


@@ -1,5 +1,5 @@
 # This file was autogenerated by uv via the following command:
-#    uv export --frozen --no-hashes --no-emit-project --output-file=requirements.txt
+#    uv export --frozen --no-hashes --no-emit-project --no-default-groups --output-file=requirements.txt
 aiohappyeyeballs==2.5.0
     # via aiohttp
 aiohttp==3.11.13
@@ -14,10 +14,6 @@ anyio==4.8.0
     #   llama-stack-client
     #   openai
     #   starlette
-appnope==0.1.4 ; sys_platform == 'darwin'
-    # via ipykernel
-asttokens==3.0.0
-    # via stack-data
 async-timeout==5.0.1 ; python_full_version < '3.11'
     # via aiohttp
 attrs==25.1.0
@@ -25,41 +21,19 @@ attrs==25.1.0
     #   aiohttp
     #   jsonschema
     #   referencing
-black==25.1.0
 certifi==2025.1.31
     # via
     #   httpcore
     #   httpx
     #   requests
-cffi==1.17.1 ; implementation_name == 'pypy'
-    # via pyzmq
-cfgv==3.4.0
-    # via pre-commit
 charset-normalizer==3.4.1
     # via requests
 click==8.1.8
-    # via
-    #   black
-    #   llama-stack-client
-    #   uvicorn
+    # via llama-stack-client
 colorama==0.4.6 ; sys_platform == 'win32'
     # via
     #   click
-    #   ipython
-    #   pytest
     #   tqdm
-comm==0.2.2
-    # via ipykernel
-coverage==7.6.12
-    # via
-    #   nbval
-    #   pytest-cov
-debugpy==1.8.12
-    # via ipykernel
-decorator==5.1.1
-    # via ipython
-distlib==0.3.9
-    # via virtualenv
 distro==1.9.0
     # via
     #   llama-stack-client
@@ -67,20 +41,11 @@ distro==1.9.0
 ecdsa==0.19.1
     # via python-jose
 exceptiongroup==1.2.2 ; python_full_version < '3.11'
-    # via
-    #   anyio
-    #   ipython
-    #   pytest
-executing==2.2.0
-    # via stack-data
+    # via anyio
 fastapi==0.115.8
     # via llama-stack
-fastjsonschema==2.21.1
-    # via nbformat
 filelock==3.17.0
-    # via
-    #   huggingface-hub
-    #   virtualenv
+    # via huggingface-hub
 fire==0.7.0
     # via llama-stack
 frozenlist==1.5.0
@@ -93,7 +58,6 @@ h11==0.16.0
     # via
     #   httpcore
     #   llama-stack
-    #   uvicorn
 httpcore==1.0.9
     # via httpx
 httpx==0.28.1
@@ -103,119 +67,56 @@ httpx==0.28.1
     #   openai
 huggingface-hub==0.29.0
     # via llama-stack
-identify==2.6.7
-    # via pre-commit
 idna==3.10
     # via
     #   anyio
     #   httpx
     #   requests
     #   yarl
-iniconfig==2.0.0
-    # via pytest
-ipykernel==6.29.5
-    # via nbval
-ipython==8.32.0
-    # via ipykernel
-jedi==0.19.2
-    # via ipython
 jinja2==3.1.6
-    # via
-    #   llama-stack
-    #   pytest-html
+    # via llama-stack
 jiter==0.8.2
     # via openai
 jsonschema==4.23.0
-    # via
-    #   llama-stack
-    #   nbformat
+    # via llama-stack
 jsonschema-specifications==2024.10.1
     # via jsonschema
-jupyter-client==8.6.3
-    # via
-    #   ipykernel
-    #   nbval
-jupyter-core==5.7.2
-    # via
-    #   ipykernel
-    #   jupyter-client
-    #   nbformat
 llama-stack-client==0.2.10
     # via llama-stack
 markdown-it-py==3.0.0
     # via rich
 markupsafe==3.0.2
     # via jinja2
-matplotlib-inline==0.1.7
-    # via
-    #   ipykernel
-    #   ipython
 mdurl==0.1.2
     # via markdown-it-py
 multidict==6.1.0
     # via
     #   aiohttp
     #   yarl
-mypy-extensions==1.0.0
-    # via black
-nbformat==5.10.4
-    # via nbval
-nbval==0.11.0
-nest-asyncio==1.6.0
-    # via ipykernel
-nodeenv==1.9.1
-    # via pre-commit
 numpy==2.2.3
     # via pandas
 openai==1.71.0
     # via llama-stack
 packaging==24.2
-    # via
-    #   black
-    #   huggingface-hub
-    #   ipykernel
-    #   pytest
+    # via huggingface-hub
 pandas==2.2.3
     # via llama-stack-client
-parso==0.8.4
-    # via jedi
-pathspec==0.12.1
-    # via black
-pexpect==4.9.0 ; sys_platform != 'emscripten' and sys_platform != 'win32'
-    # via ipython
 pillow==11.1.0
     # via llama-stack
-platformdirs==4.3.6
-    # via
-    #   black
-    #   jupyter-core
-    #   virtualenv
-pluggy==1.5.0
-    # via pytest
-pre-commit==4.1.0
 prompt-toolkit==3.0.50
     # via
-    #   ipython
     #   llama-stack
     #   llama-stack-client
 propcache==0.3.0
     # via
     #   aiohttp
     #   yarl
-psutil==7.0.0
-    # via ipykernel
-ptyprocess==0.7.0 ; sys_platform != 'emscripten' and sys_platform != 'win32'
-    # via pexpect
-pure-eval==0.2.3
-    # via stack-data
 pyaml==25.1.0
     # via llama-stack-client
 pyasn1==0.4.8
     # via
     #   python-jose
     #   rsa
-pycparser==2.22 ; implementation_name == 'pypy'
-    # via cffi
 pydantic==2.10.6
     # via
     #   fastapi
@@ -225,31 +126,9 @@ pydantic==2.10.6
 pydantic-core==2.27.2
     # via pydantic
 pygments==2.19.1
-    # via
-    #   ipython
-    #   rich
-pytest==8.3.4
-    # via
-    #   nbval
-    #   pytest-asyncio
-    #   pytest-cov
-    #   pytest-html
-    #   pytest-json-report
-    #   pytest-metadata
-    #   pytest-timeout
-pytest-asyncio==0.25.3
-pytest-cov==6.0.0
-pytest-html==4.1.1
-pytest-json-report==1.5.0
-pytest-metadata==3.1.1
-    # via
-    #   pytest-html
-    #   pytest-json-report
-pytest-timeout==2.4.0
+    # via rich
 python-dateutil==2.9.0.post0
-    # via
-    #   jupyter-client
-    #   pandas
+    # via pandas
 python-dotenv==1.0.1
     # via llama-stack
 python-jose==3.4.0
@@ -258,17 +137,10 @@ python-multipart==0.0.20
     # via llama-stack
 pytz==2025.1
     # via pandas
-pywin32==308 ; platform_python_implementation != 'PyPy' and sys_platform == 'win32'
-    # via jupyter-core
 pyyaml==6.0.2
     # via
     #   huggingface-hub
-    #   pre-commit
     #   pyaml
-pyzmq==26.2.1
-    # via
-    #   ipykernel
-    #   jupyter-client
 referencing==0.36.2
     # via
     #   jsonschema
@@ -290,10 +162,6 @@ rpds-py==0.22.3
     #   referencing
 rsa==4.9
     # via python-jose
-ruamel-yaml==0.18.10
-ruamel-yaml-clib==0.2.12 ; python_full_version < '3.13' and platform_python_implementation == 'CPython'
-    # via ruamel-yaml
-ruff==0.9.6
 setuptools==80.8.0
     # via llama-stack
 six==1.17.0
@@ -305,8 +173,6 @@ sniffio==1.3.1
     #   anyio
     #   llama-stack-client
     #   openai
-stack-data==0.6.3
-    # via ipython
 starlette==0.45.3
     # via
     #   fastapi
@@ -318,38 +184,16 @@ termcolor==2.5.0
     #   llama-stack-client
 tiktoken==0.9.0
     # via llama-stack
-tomli==2.2.1 ; python_full_version <= '3.11'
-    # via
-    #   black
-    #   coverage
-    #   pytest
-tornado==6.4.2
-    # via
-    #   ipykernel
-    #   jupyter-client
 tqdm==4.67.1
     # via
     #   huggingface-hub
     #   llama-stack-client
     #   openai
-traitlets==5.14.3
-    # via
-    #   comm
-    #   ipykernel
-    #   ipython
-    #   jupyter-client
-    #   jupyter-core
-    #   matplotlib-inline
-    #   nbformat
-types-requests==2.32.0.20241016
-types-setuptools==75.8.0.20250210
 typing-extensions==4.12.2
     # via
     #   anyio
-    #   black
     #   fastapi
     #   huggingface-hub
-    #   ipython
     #   llama-stack-client
     #   multidict
     #   openai
@@ -357,16 +201,10 @@ typing-extensions==4.12.2
     #   pydantic-core
     #   referencing
     #   rich
-    #   uvicorn
 tzdata==2025.1
     # via pandas
 urllib3==2.3.0
-    # via
-    #   requests
-    #   types-requests
-uvicorn==0.34.0
-virtualenv==20.29.2
-    # via pre-commit
+    # via requests
 wcwidth==0.2.13
     # via prompt-toolkit
 yarl==1.18.3

tests/Containerfile Normal file

@@ -0,0 +1,13 @@
+# Containerfile used to build our all in one ollama image to run tests in CI
+# podman build --platform linux/amd64 -f Containerfile -t ollama-with-models .
+#
+FROM --platform=linux/amd64 ollama/ollama:latest
+
+# Start ollama and pull models in a single layer
+RUN ollama serve & \
+    sleep 5 && \
+    ollama pull llama3.2:3b-instruct-fp16 && \
+    ollama pull all-minilm:latest
+
+# Set the entrypoint to start ollama serve
+ENTRYPOINT ["ollama", "serve"]
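
Since the tests depend on both models being present in the image, it can help to verify that before pushing it. The sketch below is one way to do that. `missing_models` is a hypothetical helper that is not part of this commit. It reads a model listing on stdin and only checks whether each expected name appears somewhere in that text, so it is a substring match rather than an exact one.

```shell
# Hypothetical helper (not part of this commit): read a model listing on
# stdin and print every expected model name that does not appear in it.
# Returns 0 when all names are found, 1 otherwise.
missing_models() {
  _listing="$(cat)"
  _status=0
  for _model in "$@"; do
    # -F treats the model name as a fixed string, not a regular expression
    if ! printf '%s\n' "$_listing" | grep -qF -- "$_model"; then
      printf '%s\n' "$_model"
      _status=1
    fi
  done
  return "$_status"
}

# Example usage (assumption: a container named "ollama" is running the image):
#   docker exec ollama ollama list \
#     | missing_models llama3.2:3b-instruct-fp16 all-minilm:latest
```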