llama-stack-mirror/llama_stack/providers/registry/synthetic_data_generation.py
Alina Ryan f86f107f15 (feat) Add synthetic_data_kit provider integration for synthetic_data_generation API
The synthetic_data_kit provider integration enables high-quality synthetic dataset
generation for fine-tuning LLMs. This commit sets up the initial provider
registration and fixes provider resolution so that type casting and imports are
handled correctly, as required for integration with llama-stack's provider system.

Implementation of the actual provider functionality will follow in a subsequent
commit.

Signed-off-by: Alina Ryan <aliryan@redhat.com>
2025-05-30 12:14:40 -04:00


# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.
# SPDX-License-Identifier: MIT

from llama_stack.providers.datatypes import Api, InlineProviderSpec, ProviderSpec


def available_providers() -> list[ProviderSpec]:
    """Return the provider specs registered for the synthetic_data_generation API."""
    return [
        InlineProviderSpec(
            api=Api.synthetic_data_generation,
            provider_type="inline::synthetic_data_kit",
            # Packages listed as dependencies of this provider.
            pip_packages=[
                "synthetic-data-kit",
                "vllm",
                "pydantic",
            ],
            module="llama_stack.providers.inline.synthetic_data_generation.synthetic_data_kit_inline",
            config_class="llama_stack.providers.inline.synthetic_data_generation.config.SyntheticDataKitConfig",
        ),
    ]
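
The registration above stores `module` and `config_class` as dotted-path strings and identifies the provider by its `provider_type`. The sketch below illustrates how such a registry entry could be looked up and how a dotted path could be turned into an object. It is not llama-stack's resolver: `StubInlineProviderSpec`, `stub_available_providers`, `index_by_provider_type`, `split_dotted_path`, and `load_attr` are hypothetical names introduced here so the example runs without llama-stack installed, and the import step is demonstrated on a standard-library class.

```python
import importlib
from dataclasses import dataclass, field


# Hypothetical stand-in for InlineProviderSpec, reduced to the fields used in
# this file so the sketch runs without llama-stack installed.
@dataclass
class StubInlineProviderSpec:
    api: str
    provider_type: str
    module: str
    config_class: str
    pip_packages: list[str] = field(default_factory=list)


def stub_available_providers() -> list[StubInlineProviderSpec]:
    # Mirrors the values registered in available_providers() above.
    return [
        StubInlineProviderSpec(
            api="synthetic_data_generation",
            provider_type="inline::synthetic_data_kit",
            pip_packages=["synthetic-data-kit", "vllm", "pydantic"],
            module="llama_stack.providers.inline.synthetic_data_generation.synthetic_data_kit_inline",
            config_class="llama_stack.providers.inline.synthetic_data_generation.config.SyntheticDataKitConfig",
        ),
    ]


def index_by_provider_type(specs):
    """Build a provider_type -> spec lookup table."""
    return {spec.provider_type: spec for spec in specs}


def split_dotted_path(path: str) -> tuple[str, str]:
    """Split 'pkg.module.ClassName' into ('pkg.module', 'ClassName')."""
    module_name, _, attr = path.rpartition(".")
    return module_name, attr


def load_attr(path: str):
    """Import the module part of a dotted path and return the named attribute."""
    module_name, attr = split_dotted_path(path)
    return getattr(importlib.import_module(module_name), attr)


registry = index_by_provider_type(stub_available_providers())
spec = registry["inline::synthetic_data_kit"]
print(spec.pip_packages)
print(split_dotted_path(spec.config_class))
# Demonstrated on a standard-library class; loading the real config class
# would require llama-stack and the provider module to be importable.
print(load_attr("collections.OrderedDict"))
```

Keeping `config_class` as a string rather than a direct import means the registry module can be loaded without importing the provider's dependencies (here `synthetic-data-kit` and `vllm`); the import cost is paid only when a provider is actually selected.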