mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-08-21 01:15:10 +00:00
test(recording): add a script to schedule recording workflow (#3170)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 1s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 3s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Test Llama Stack Build / generate-matrix (push) Successful in 5s
Python Package Build Test / build (3.13) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 9s
Test Llama Stack Build / build-single-provider (push) Failing after 10s
Update ReadTheDocs / update-readthedocs (push) Failing after 10s
Vector IO Integration Tests / test-matrix (push) Failing after 14s
Unit Tests / unit-tests (3.13) (push) Failing after 10s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 14s
Test External API and Providers / test-external (venv) (push) Failing after 13s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 17s
Test Llama Stack Build / build (push) Failing after 9s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Pre-commit / pre-commit (push) Successful in 1m19s
See comment here:
https://github.com/llamastack/llama-stack/pull/3162#issuecomment-3192859097
TL;DR: invoking the recording workflow correctly is quite complex for an end
developer writing tests. This script simplifies the work.
No more manual GitHub UI navigation!
## Script Functionality
- Auto-detects your current branch and associated PR
- Finds the right repository context (works from forks!)
- Runs the workflow where it can actually commit back
- Validates prerequisites and provides helpful error messages
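
To make the fork handling concrete: when no PR is found, the script derives the repository owner from a remote URL with a `sed` capture. The sketch below wraps that same expression in a small helper. The function name `repo_owner_from_url` and the example URLs are illustrative assumptions, not part of this commit.

```bash
# Hypothetical helper (not in the commit): print the owner segment of a
# git remote URL that points at a llama-stack repository.
repo_owner_from_url() {
  # Capture the path segment immediately before "/llama-stack".
  # Works for both SSH (git@host:owner/repo) and HTTPS (https://host/owner/repo) URLs.
  printf '%s\n' "$1" | sed -n 's/.*[:/]\([^/]*\)\/llama-stack.*/\1/p'
}

repo_owner_from_url "git@github.com:alice/llama-stack.git"          # prints: alice
repo_owner_from_url "https://github.com/llamastack/llama-stack.git" # prints: llamastack
```

A URL that does not contain `/llama-stack` produces no output, which mirrors how the script skips remotes that are not llama-stack forks.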
## How to Use
First, make sure you are on the branch that introduces the new test you want
recorded. **Make sure you have pushed this branch to a remote; the easiest
way is to create a PR.**
```
# Record specific test subdirectories for the current branch (--test-subdirs is required)
./scripts/github/schedule-record-workflow.sh --test-subdirs "agents,inference"
# Record with vision tests enabled
./scripts/github/schedule-record-workflow.sh --test-subdirs "inference" --run-vision-tests
# Record tests matching a pattern
./scripts/github/schedule-record-workflow.sh --test-subdirs "inference" --test-pattern "test_streaming"
```
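
Before running any of the commands above, it can help to confirm that the tools the script lists as prerequisites (`gh` and `jq`) are on your `PATH`. The following is a minimal sketch of such a check; `missing_tools` is a hypothetical helper name, not something the script defines.

```bash
# Hypothetical helper (not in the commit): print each named tool that
# cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

missing="$(missing_tools gh jq)"
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found"
fi
```

This only checks that the tools are installed; the script additionally runs `gh auth status` to confirm that the GitHub CLI is authenticated.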
## Test Plan
Ran `./scripts/github/schedule-record-workflow.sh -s inference -k
tool_choice` which started
4820409329
which successfully committed recorded outputs.
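
For reference, the options in the invocation above end up as `-f key=value` inputs to `gh workflow run`. The sketch below shows that mapping under a few assumptions: `build_dispatch_args` is a hypothetical helper, and it collects the arguments as positional parameters instead of the `eval`-ed string the script builds, so values containing spaces or quotes stay intact.

```bash
# Hypothetical helper (not in the commit): print the arguments that would be
# passed to `gh`, one per line.
# Usage: build_dispatch_args BRANCH SUBDIRS PROVIDER VISION PATTERN
build_dispatch_args() {
  branch=$1; subdirs=$2; provider=$3; vision=$4; pattern=$5
  # Reuse the positional parameters as a portable argument list.
  set -- workflow run record-integration-tests.yml --ref "$branch" \
    -f "test-subdirs=$subdirs" -f "test-provider=$provider"
  if [ "$vision" = "true" ]; then
    set -- "$@" -f "run-vision-tests=true"
  fi
  if [ -n "$pattern" ]; then
    set -- "$@" -f "test-pattern=$pattern"
  fi
  printf '%s\n' "$@"
}

# Equivalent of `-s inference -k tool_choice` on a branch named my-branch
# (the branch name is made up for this example).
build_dispatch_args my-branch inference ollama false tool_choice
```

In a real script the final step would be `gh "$@"` rather than `printf`, which avoids `eval` entirely.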
This commit is contained in:
parent
914c7be288
commit
5e7c2250be
4 changed files with 329 additions and 2 deletions
10
.github/workflows/record-integration-tests.yml
vendored
@@ -35,6 +35,16 @@ jobs:
       contents: write

     steps:
+      - name: Echo workflow inputs
+        run: |
+          echo "::group::Workflow Inputs"
+          echo "test-subdirs: ${{ inputs.test-subdirs }}"
+          echo "test-provider: ${{ inputs.test-provider }}"
+          echo "run-vision-tests: ${{ inputs.run-vision-tests }}"
+          echo "test-pattern: ${{ inputs.test-pattern }}"
+          echo "branch: ${{ github.ref_name }}"
+          echo "::endgroup::"
+
       - name: Checkout repository
         uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
         with:
279
scripts/github/schedule-record-workflow.sh
Executable file
@@ -0,0 +1,279 @@
+#!/bin/bash
+
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+# All rights reserved.
+#
+# This source code is licensed under the terms described in the LICENSE file in
+# the root directory of this source tree.
+
+# Script to easily trigger the integration test recording workflow
+# Usage: ./scripts/github/schedule-record-workflow.sh [options]
+
+set -euo pipefail
+
+# Default values
+BRANCH=""
+TEST_SUBDIRS=""
+TEST_PROVIDER="ollama"
+RUN_VISION_TESTS=false
+TEST_PATTERN=""
+
+# Help function
+show_help() {
+    cat << EOF
+Usage: $0 [OPTIONS]
+
+Trigger the integration test recording workflow remotely. This way you do not need to have Ollama running locally.
+
+OPTIONS:
+    -b, --branch BRANCH           Branch to run the workflow on (defaults to current branch)
+    -s, --test-subdirs DIRS       Comma-separated list of test subdirectories to run (REQUIRED)
+    -p, --test-provider PROVIDER  Test provider to use: vllm or ollama (default: ollama)
+    -v, --run-vision-tests        Include vision tests in the recording
+    -k, --test-pattern PATTERN    Regex pattern to pass to pytest -k
+    -h, --help                    Show this help message
+
+EXAMPLES:
+    # Record tests for current branch with agents subdirectory
+    $0 --test-subdirs "agents"
+
+    # Record tests for specific branch with vision tests
+    $0 -b my-feature-branch --test-subdirs "inference" --run-vision-tests
+
+    # Record multiple test subdirectories with specific provider
+    $0 --test-subdirs "agents,inference" --test-provider vllm
+
+    # Record tests matching a specific pattern
+    $0 --test-subdirs "inference" --test-pattern "test_streaming"
+
+EOF
+}
+
+# PREREQUISITES:
+# - GitHub CLI (gh) must be installed and authenticated
+# - jq must be installed for JSON parsing
+# - You must be in a git repository that is a fork or clone of llamastack/llama-stack
+# - The branch must exist on the remote repository where you want to run the workflow
+# - You must specify test subdirectories to run with -s/--test-subdirs
+
+# Parse command line arguments
+while [[ $# -gt 0 ]]; do
+    case $1 in
+        -b|--branch)
+            BRANCH="$2"
+            shift 2
+            ;;
+        -s|--test-subdirs)
+            TEST_SUBDIRS="$2"
+            shift 2
+            ;;
+        -p|--test-provider)
+            TEST_PROVIDER="$2"
+            shift 2
+            ;;
+        -v|--run-vision-tests)
+            RUN_VISION_TESTS=true
+            shift
+            ;;
+        -k|--test-pattern)
+            TEST_PATTERN="$2"
+            shift 2
+            ;;
+        -h|--help)
+            show_help
+            exit 0
+            ;;
+        *)
+            echo "Unknown option: $1"
+            show_help
+            exit 1
+            ;;
+    esac
+done
+
+# Validate required parameters
+if [[ -z "$TEST_SUBDIRS" ]]; then
+    echo "Error: --test-subdirs is required"
+    echo "Please specify which test subdirectories to run, e.g.:"
+    echo " $0 --test-subdirs \"agents,inference\""
+    echo " $0 --test-subdirs \"inference\" --run-vision-tests"
+    echo ""
+    exit 1
+fi
+
+# Validate test provider
+if [[ "$TEST_PROVIDER" != "vllm" && "$TEST_PROVIDER" != "ollama" ]]; then
+    echo "❌ Error: Invalid test provider '$TEST_PROVIDER'"
+    echo " Supported providers: vllm, ollama"
+    echo " Example: $0 --test-subdirs \"agents\" --test-provider vllm"
+    exit 1
+fi
+
+# Check if required tools are installed
+if ! command -v gh &> /dev/null; then
+    echo "Error: GitHub CLI (gh) is not installed. Please install it from https://cli.github.com/"
+    exit 1
+fi
+
+if ! gh auth status &> /dev/null; then
+    echo "Error: GitHub CLI is not authenticated. Please run 'gh auth login'"
+    exit 1
+fi
+
+# If no branch specified, use current branch
+if [[ -z "$BRANCH" ]]; then
+    BRANCH=$(git branch --show-current)
+    echo "No branch specified, using current branch: $BRANCH"
+
+    # Optionally look for associated PR for context (not required)
+    echo "Looking for associated PR..."
+
+    # Search for PRs in the main repo that might match this branch
+    # This searches llamastack/llama-stack for any PR with this head branch name
+    if PR_INFO=$(gh pr list --repo llamastack/llama-stack --head "$BRANCH" --json number,headRefName,headRepository,headRepositoryOwner,url,state --limit 1 2>/dev/null) && [[ "$PR_INFO" != "[]" ]]; then
+        # Parse PR info using jq
+        PR_NUMBER=$(echo "$PR_INFO" | jq -r '.[0].number')
+        PR_HEAD_REPO=$(echo "$PR_INFO" | jq -r '.[0].headRepositoryOwner.login // "llamastack"')
+        PR_URL=$(echo "$PR_INFO" | jq -r '.[0].url')
+        PR_STATE=$(echo "$PR_INFO" | jq -r '.[0].state')
+
+        if [[ -n "$PR_NUMBER" && -n "$PR_HEAD_REPO" ]]; then
+            echo "✅ Found associated PR #$PR_NUMBER ($PR_STATE)"
+            echo " URL: $PR_URL"
+            echo " Head repository: $PR_HEAD_REPO/llama-stack"
+
+            # Check PR state and block if merged
+            if [[ "$PR_STATE" == "CLOSED" ]]; then
+                echo "ℹ️ Note: This PR is closed, but workflow can still run to update recordings."
+            elif [[ "$PR_STATE" == "MERGED" ]]; then
+                echo "❌ Error: This PR is already merged."
+                echo " Cannot record tests for a merged PR since changes can't be committed back."
+                echo " Create a new branch/PR if you need to record new tests."
+                exit 1
+            fi
+        fi
+    else
+        echo "ℹ️ No associated PR found for branch '$BRANCH'"
+        echo "That's fine - the workflow just needs a pushed branch to run."
+    fi
+    echo ""
+fi
+
+# Determine the target repository for workflow dispatch based on where the branch actually exists
+# We need to find which remote has the branch we want to run the workflow on
+
+echo "Determining target repository for workflow..."
+
+# Check if we have PR info with head repository
+if [[ -n "$PR_HEAD_REPO" ]]; then
+    # Use the repository from the PR head
+    TARGET_REPO="$PR_HEAD_REPO/llama-stack"
+    echo "📍 Using PR head repository: $TARGET_REPO"
+
+    if [[ "$PR_HEAD_REPO" == "llamastack" ]]; then
+        REPO_CONTEXT=""
+    else
+        REPO_CONTEXT="--repo $TARGET_REPO"
+    fi
+else
+    # Fallback: find which remote has the branch
+    BRANCH_REMOTE=""
+    for remote in $(git remote); do
+        if git ls-remote --heads "$remote" "$BRANCH" | grep -q "$BRANCH"; then
+            REMOTE_URL=$(git remote get-url "$remote")
+            if [[ "$REMOTE_URL" == *"/llama-stack"* ]]; then
+                REPO_OWNER=$(echo "$REMOTE_URL" | sed -n 's/.*[:/]\([^/]*\)\/llama-stack.*/\1/p')
+                echo "📍 Found branch '$BRANCH' on remote '$remote' ($REPO_OWNER/llama-stack)"
+                TARGET_REPO="$REPO_OWNER/llama-stack"
+                BRANCH_REMOTE="$remote"
+                break
+            fi
+        fi
+    done
+
+    if [[ -z "$BRANCH_REMOTE" ]]; then
+        echo "Error: Could not find branch '$BRANCH' on any llama-stack remote"
+        echo ""
+        echo "This could mean:"
+        echo " - The branch doesn't exist on any remote yet (push it first)"
+        echo " - The branch name is misspelled"
+        echo " - No llama-stack remotes are configured"
+        echo ""
+        echo "Available remotes:"
+        git remote -v
+        echo ""
+        echo "To push your branch: git push <remote> $BRANCH"
+        echo "Common remotes to try: origin, upstream, your-username"
+        exit 1
+    fi
+
+    if [[ "$TARGET_REPO" == "llamastack/llama-stack" ]]; then
+        REPO_CONTEXT=""
+    else
+        REPO_CONTEXT="--repo $TARGET_REPO"
+    fi
+fi
+
+echo " Workflow will run on: $TARGET_REPO"
+
+# Verify the target repository has the workflow file
+echo "Verifying workflow exists on target repository..."
+if ! gh api "repos/$TARGET_REPO/contents/.github/workflows/record-integration-tests.yml" &>/dev/null; then
+    echo "Error: The recording workflow does not exist on $TARGET_REPO"
+    echo "This could mean:"
+    echo " - The fork doesn't have the latest workflow file"
+    echo " - The workflow file was renamed or moved"
+    echo ""
+    if [[ "$TARGET_REPO" != "llamastack/llama-stack" ]]; then
+        echo "Try syncing your fork with upstream:"
+        echo " git fetch upstream"
+        echo " git checkout main"
+        echo " git merge upstream/main"
+        echo " git push origin main"
+    fi
+    exit 1
+fi
+
+# Build the workflow dispatch command
+echo "Triggering integration test recording workflow..."
+echo "Branch: $BRANCH"
+echo "Test provider: $TEST_PROVIDER"
+echo "Test subdirs: $TEST_SUBDIRS"
+echo "Run vision tests: $RUN_VISION_TESTS"
+echo "Test pattern: ${TEST_PATTERN:-"(none)"}"
+echo ""
+
+# Prepare inputs for gh workflow run
+INPUTS="-f test-subdirs='$TEST_SUBDIRS'"
+if [[ -n "$TEST_PROVIDER" ]]; then
+    INPUTS="$INPUTS -f test-provider='$TEST_PROVIDER'"
+fi
+if [[ "$RUN_VISION_TESTS" == "true" ]]; then
+    INPUTS="$INPUTS -f run-vision-tests=true"
+fi
+if [[ -n "$TEST_PATTERN" ]]; then
+    INPUTS="$INPUTS -f test-pattern='$TEST_PATTERN'"
+fi
+
+# Run the workflow
+WORKFLOW_CMD="gh workflow run record-integration-tests.yml --ref $BRANCH $REPO_CONTEXT $INPUTS"
+echo "Running: $WORKFLOW_CMD"
+echo ""
+
+if eval "$WORKFLOW_CMD"; then
+    echo "✅ Workflow triggered successfully!"
+    echo ""
+    echo "You can monitor the workflow run at:"
+    echo "https://github.com/$TARGET_REPO/actions/workflows/record-integration-tests.yml"
+    echo ""
+    if [[ -n "$REPO_CONTEXT" ]]; then
+        echo "Or use: gh run list --workflow=record-integration-tests.yml $REPO_CONTEXT"
+        echo "And then: gh run watch <RUN_ID> $REPO_CONTEXT"
+    else
+        echo "Or use: gh run list --workflow=record-integration-tests.yml"
+        echo "And then: gh run watch <RUN_ID>"
+    fi
+else
+    echo "❌ Failed to trigger workflow"
+    exit 1
+fi
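
One detail in the script is worth a closer look. It runs under `set -euo pipefail`, and `PR_HEAD_REPO` appears to be assigned only when a matching PR is found, so the later `[[ -n "$PR_HEAD_REPO" ]]` check would abort with an unbound-variable error when `-b` is passed or no PR exists. The sketch below shows one common guard, the `${VAR:-}` default expansion. The `describe_target` function and its output strings are hypothetical and exist only to illustrate the pattern.

```bash
set -u  # treat expansion of unset variables as an error, as the script does

unset PR_HEAD_REPO

# Hypothetical function: reports which code path the script would take.
describe_target() {
  # "${PR_HEAD_REPO:-}" expands to an empty string when the variable is unset,
  # so the test is safe under `set -u`.
  if [ -n "${PR_HEAD_REPO:-}" ]; then
    echo "pr-head:$PR_HEAD_REPO"
  else
    echo "fallback"
  fi
}

describe_target          # prints: fallback
PR_HEAD_REPO="alice"
describe_target          # prints: pr-head:alice
```

An alternative with the same effect is to initialize `PR_HEAD_REPO=""` alongside the other default values at the top of the script.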
@@ -60,7 +60,9 @@ FIREWORKS_API_KEY=your_key pytest -sv tests/integration/inference --stack-config

 ### Re-recording tests

-If you want to re-record tests, you can do so with:
+#### Local Re-recording (Manual Setup Required)
+
+If you want to re-record tests locally, you can do so with:

 ```bash
 LLAMA_STACK_TEST_INFERENCE_MODE=record \
@@ -71,7 +73,6 @@ LLAMA_STACK_TEST_INFERENCE_MODE=record \

 This will record new API responses and overwrite the existing recordings.

-
 ```{warning}

 You must be careful when re-recording. CI workflows assume a specific setup for running the replay-mode tests. You must re-record the tests in the same way as the CI workflows. This means
@@ -79,6 +80,34 @@ You must be careful when re-recording. CI workflows assume a specific setup for
 - you are using the `starter` distribution.
 ```

+#### Remote Re-recording (Recommended)
+
+**For easier re-recording without local setup**, use the automated recording workflow:
+
+```bash
+# Record tests for specific test subdirectories
+./scripts/github/schedule-record-workflow.sh --test-subdirs "agents,inference"
+
+# Record with vision tests enabled
+./scripts/github/schedule-record-workflow.sh --test-subdirs "inference" --run-vision-tests
+
+# Record with specific provider
+./scripts/github/schedule-record-workflow.sh --test-subdirs "agents" --test-provider vllm
+```
+
+This script:
+- 🚀 **Runs in GitHub Actions** - no local Ollama setup required
+- 🔍 **Auto-detects your branch** and associated PR
+- 🍴 **Works from forks** - handles repository context automatically
+- ✅ **Commits recordings back** to your branch
+
+**Prerequisites:**
+- GitHub CLI: `brew install gh && gh auth login`
+- jq: `brew install jq`
+- Your branch pushed to a remote
+
+**Supported providers:** `vllm`, `ollama`
+

 ### Next Steps

@@ -134,6 +134,15 @@ cat recordings/responses/abc123.json | jq '.'
 ```

 ### Re-recording Tests
+
+#### Remote Re-recording (Recommended)
+Use the automated workflow script for easier re-recording:
+```bash
+./scripts/github/schedule-record-workflow.sh --test-subdirs "inference,agents"
+```
+See the [main testing guide](../README.md#remote-re-recording-recommended) for full details.
+
+#### Local Re-recording
 ```bash
 # Re-record specific tests
 LLAMA_STACK_TEST_INFERENCE_MODE=record \