ci: add new action to install ollama, cache the model (#2054)
# What does this PR do?

This PR introduces a reusable GitHub Actions composite action for installing Ollama and pulling the required models, with caching to avoid repeated downloads.

Closes: #1949

## Test Plan

1. Trigger a workflow that uses the Ollama setup. Confirm that:
   - The model is pulled successfully.
   - It is placed in the expected directory (currently the official default, not ~ollama/.ollama/models as suggested in a review comment, so this still needs to be confirmed).
2. Re-run the same workflow to validate that:
   - The model is restored from the cache.
   - Execution succeeds with the cached model.
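For reference, a minimal sketch of how a workflow can consume the new composite action; the job name and the checkout step below are illustrative, only the `uses:` path and the default models come from this PR:

```yaml
jobs:
  integration-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      # Installs Ollama and pulls the action's default models
      # (llama3.2:3b-instruct-fp16 and all-minilm:latest).
      - name: Setup ollama
        uses: ./.github/actions/setup-ollama
```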
parent 3022f7b642
commit 2413447467
2 changed files with 28 additions and 13 deletions
.github/actions/setup-ollama/action.yml (new file, 26 additions)
@@ -0,0 +1,26 @@
+name: Setup Ollama
+description: Start Ollama and cache model
+inputs:
+  models:
+    description: Comma-separated list of models to pull
+    default: "llama3.2:3b-instruct-fp16,all-minilm:latest"
+runs:
+  using: "composite"
+  steps:
+    - name: Install and start Ollama
+      shell: bash
+      run: |
+        # the ollama installer also starts the ollama service
+        curl -fsSL https://ollama.com/install.sh | sh
+
+    # Do NOT cache models - pulling the cache is actually slower than just pulling the model.
+    # It takes ~45 seconds to pull the models from the cache and unpack it, but only 30 seconds to
+    # pull them directly.
+    # Maybe this is because the cache is being pulled at the same time by all the matrix jobs?
+    - name: Pull requested models
+      if: inputs.models != ''
+      shell: bash
+      run: |
+        for model in $(echo "${{ inputs.models }}" | tr ',' ' '); do
+          ollama pull "$model"
+        done
.github/workflows/integration-tests.yml (2 additions, 13 deletions)
@@ -38,19 +38,8 @@ jobs:
           python-version: "3.10"
           activate-environment: true
 
-      - name: Install and start Ollama
-        run: |
-          # the ollama installer also starts the ollama service
-          curl -fsSL https://ollama.com/install.sh | sh
-
-      # Do NOT cache models - pulling the cache is actually slower than just pulling the model.
-      # It takes ~45 seconds to pull the models from the cache and unpack it, but only 30 seconds to
-      # pull them directly.
-      # Maybe this is because the cache is being pulled at the same time by all the matrix jobs?
-      - name: Pull Ollama models (instruct and embed)
-        run: |
-          ollama pull llama3.2:3b-instruct-fp16
-          ollama pull all-minilm:latest
+      - name: Setup ollama
+        uses: ./.github/actions/setup-ollama
 
       - name: Set Up Environment and Install Dependencies
         run: |
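The new step above relies on the action's default model list. Since `models` is a comma-separated input, a caller could override it instead; a hypothetical sketch, pulling only the embedding model:

```yaml
      - name: Setup ollama
        uses: ./.github/actions/setup-ollama
        with:
          models: "all-minilm:latest"
```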