feat: split API and provider specs into separate llama-stack-api pkg

Extract API definitions, models, and provider specifications into a standalone llama-stack-api package that can be published to PyPI independently of the main llama-stack server.

Motivation

External providers currently import from llama-stack, which overrides the installed version and causes dependency conflicts. This separation allows external providers to:

- Install only the type definitions they need without server dependencies
- Avoid version conflicts with the installed llama-stack package
- Be versioned and released independently

This enables us to re-enable external provider module tests that were previously blocked by these import conflicts.

Changes

- Created llama-stack-api package with minimal dependencies (pydantic, jsonschema)
- Moved APIs, providers datatypes, strong_typing, and schema_utils
- Updated all imports of the moved modules from llama_stack.* to llama_stack_api.*
- Preserved git history using git mv for moved files
- Configured local editable install for development workflow
- Updated linting and type-checking configuration for both packages
- Rebased on top of upstream src/ layout changes

Testing

Package builds successfully and can be imported independently. All pre-commit hooks pass with expected exclusions maintained.

Next Steps

- Publish llama-stack-api to PyPI
- Update external provider dependencies
- Re-enable external provider module tests

Signed-off-by: Charlie Doern <cdoern@redhat.com>
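
The import update listed under Changes could be reproduced in an external provider with something like the sketch below. It is not the tooling used for this commit. The helper name `rewrite_imports` and the `MOVED_PREFIXES` tuple are hypothetical, and only the `llama_stack.apis` prefix is confirmed by the diff in this commit; any other moved module path would have to be added by hand.

```python
import re

# Only `llama_stack.apis` is confirmed by the diff in this commit.
# Any further entries here are the caller's own assumption.
MOVED_PREFIXES = ("llama_stack.apis",)

# Matches the module path at the start of an `import x` or `from x import ...` line.
_IMPORT_RE = re.compile(
    r"^(?P<kw>[ \t]*(?:from|import)[ \t]+)(?P<mod>[\w.]+)", re.MULTILINE
)


def rewrite_imports(source: str) -> str:
    """Point imports of moved modules at llama_stack_api; leave all others alone."""

    def _swap(match: re.Match) -> str:
        mod = match.group("mod")
        for prefix in MOVED_PREFIXES:
            # Require an exact match or a dotted child, so `llama_stack.apisx` is untouched.
            if mod == prefix or mod.startswith(prefix + "."):
                mod = "llama_stack_api" + mod[len("llama_stack"):]
                break
        return match.group("kw") + mod

    return _IMPORT_RE.sub(_swap, source)
```

Imports that still live in the server package, such as `llama_stack.providers.*` in the tests below, pass through unchanged because their prefix is not listed.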
2025-12-04 02:03:44 +00:00 · 2025-10-30 12:25:23 -04:00 · 2025-10-30 12:25:23 -04:00 · 85d407c2a0
commit 85d407c2a0
parent e5a55f3677
359 changed files with 1259 additions and 980 deletions
--- a/tests/unit/rag/test_rag_query.py
+++ b/tests/unit/rag/test_rag_query.py
@@ -7,13 +7,13 @@
 from unittest.mock import AsyncMock, MagicMock

 import pytest
-
-from llama_stack.apis.tools.rag_tool import RAGQueryConfig
-from llama_stack.apis.vector_io import (
+from llama_stack_api.apis.tools.rag_tool import RAGQueryConfig
+from llama_stack_api.apis.vector_io import (
    Chunk,
    ChunkMetadata,
    QueryChunksResponse,
 )
+
 from llama_stack.providers.inline.tool_runtime.rag.memory import MemoryToolRuntimeImpl


--- a/tests/unit/rag/test_vector_store.py
+++ b/tests/unit/rag/test_vector_store.py
@@ -12,13 +12,13 @@ from unittest.mock import AsyncMock, MagicMock

 import numpy as np
 import pytest
-
-from llama_stack.apis.inference.inference import (
+from llama_stack_api.apis.inference.inference import (
    OpenAIEmbeddingData,
    OpenAIEmbeddingsRequestWithExtraBody,
 )
-from llama_stack.apis.tools import RAGDocument
-from llama_stack.apis.vector_io import Chunk
+from llama_stack_api.apis.tools import RAGDocument
+from llama_stack_api.apis.vector_io import Chunk
+
 from llama_stack.providers.utils.memory.vector_store import (
    URL,
    VectorStoreWithIndex,