phoenix-oss/llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 04:04:14 +00:00

Matthew Farrellee de692162af

feat: add batches API with OpenAI compatibility (#3088)

Add complete batches API implementation with protocol, providers, and
tests:

Core Infrastructure:
- Add batches API protocol using OpenAI Batch types directly
- Add Api.batches enum value and protocol mapping in resolver
- Add OpenAI "batch" file purpose support
- Include proper error handling (ConflictError, ResourceNotFoundError)

Reference Provider:
- Add ReferenceBatchesImpl with full CRUD operations (create, retrieve,
cancel, list)
- Implement background batch processing with configurable concurrency
- Add SQLite KVStore backend for persistence
- Support /v1/chat/completions endpoint with request validation
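
The request validation mentioned above could look roughly like the following. This is a minimal sketch, not the code in ReferenceBatchesImpl: it assumes each line of the batch input file is a JSON object in the OpenAI batch request shape (`custom_id`, `method`, `url`, `body`), and the names `validate_batch_line` and `SUPPORTED_ENDPOINTS` are hypothetical.

```python
import json

# Assumption: only the endpoint named in the commit message is accepted.
SUPPORTED_ENDPOINTS = ("/v1/chat/completions",)


def validate_batch_line(line: str) -> list[str]:
    """Return the problems found in one JSONL request line (empty if none)."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc.msg}"]
    if not isinstance(request, dict):
        return ["request must be a JSON object"]

    problems = []
    # Every request line needs these four top-level fields.
    for field in ("custom_id", "method", "url", "body"):
        if field not in request:
            problems.append(f"missing field: {field}")
    # A missing method is already reported above, so default to POST here.
    if request.get("method", "POST") != "POST":
        problems.append("method must be POST")
    if "url" in request and request["url"] not in SUPPORTED_ENDPOINTS:
        problems.append(f"unsupported url: {request['url']}")
    body = request.get("body")
    if body is not None and not isinstance(body, dict):
        problems.append("body must be a JSON object")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue for a line at once.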

Comprehensive Test Suite:
- Add unit tests for provider implementation with validation
- Add integration tests for end-to-end batch processing workflows
- Add error handling tests for validation, malformed inputs, and edge
cases

Configuration:
- Add max_concurrent_batches and max_concurrent_requests_per_batch
options
- Add provider documentation with sample configurations
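
One way the two concurrency options could interact is sketched below with asyncio semaphores. This is an illustration under assumptions, not the reference provider's implementation: `run_batches` and `handle_request` are hypothetical names, and only the two option names come from this commit.

```python
import asyncio


async def run_batches(batches, handle_request, *, max_concurrent_batches,
                      max_concurrent_requests_per_batch):
    """Process batches with a cap on active batches and on requests per batch."""
    # One shared semaphore limits how many batches are active at once.
    batch_slots = asyncio.Semaphore(max_concurrent_batches)

    async def run_one_batch(requests):
        # Each batch gets its own semaphore for its in-flight requests.
        request_slots = asyncio.Semaphore(max_concurrent_requests_per_batch)

        async def run_one_request(request):
            async with request_slots:
                return await handle_request(request)

        async with batch_slots:
            # gather() returns results in the order the requests were given.
            return await asyncio.gather(*(run_one_request(r) for r in requests))

    return await asyncio.gather(*(run_one_batch(b) for b in batches))
```

With this layout the worst-case number of in-flight requests is the product of the two limits, which is one reason to expose them as separate options.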

Test with:

```
$ uv run llama stack build --image-type venv --providers inference=YOU_PICK,files=inline::localfs,batches=inline::reference --run &
$ LLAMA_STACK_CONFIG=http://localhost:8321 uv run pytest tests/unit/providers/batches tests/integration/batches --text-model YOU_PICK
```

addresses #3066

2025-08-14 09:42:02 -04:00


Inference

Overview

Llama Stack Inference API for generating completions, chat completions, and embeddings.

This API provides the raw interface to the underlying models. Two kinds of models are supported:
- LLMs: these models generate "raw" and "chat" (conversational) completions.
- Embedding models: these models generate embeddings used for semantic search.
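
To illustrate how embeddings support semantic search, here is a minimal sketch that ranks documents by cosine similarity to a query. The vectors are made-up toy values and no Llama Stack API is called; in practice both the query and document vectors would come from an embedding model served by one of the providers. The helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy two-dimensional "embeddings"; real ones have hundreds of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_by_similarity([1.0, 0.1], docs)
```
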

This section contains documentation for all available providers for the inference API.

Providers

inline_meta-reference
inline_sentence-transformers
remote_anthropic
remote_bedrock
remote_cerebras
remote_databricks
remote_fireworks
remote_gemini
remote_groq
remote_hf_endpoint
remote_hf_serverless
remote_llama-openai-compat
remote_nvidia
remote_ollama
remote_openai
remote_passthrough
remote_runpod
remote_sambanova
remote_tgi
remote_together
remote_vertexai
remote_vllm
remote_watsonx
