llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-07 18:57:21 +00:00


Ashwin Bharambe 7519b73fcc feat(distro): fork off a starter-gpu distribution (#3240) The starter distribution added post-training, which added torch dependencies, which in turn pull in all the NVIDIA CUDA libraries. This made our starter container very big. We have worked hard to keep the starter container small so it serves its purpose as a starter. This PR tries to bring it back to its earlier size by forking off duplicate "-gpu" providers for post-training. These forked providers are then used for a new `starter-gpu` distribution, which can pull in all dependencies.		2025-08-22 15:47:15 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
agents.py	fix: only load mcp when enabled in tool_group (#2621)	2025-07-04 20:27:05 +05:30
batches.py	feat: add batches API with OpenAI compatibility (with inference replay) (#3162)	2025-08-15 15:34:15 -07:00
datasetio.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
eval.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
files.py	feat: Add S3 Files Provider (#3202)	2025-08-22 10:38:59 -04:00
inference.py	feat: Add Google Vertex AI inference provider support (#2841)	2025-08-11 08:22:04 -04:00
post_training.py	feat(distro): fork off a starter-gpu distribution (#3240)	2025-08-22 15:47:15 -07:00
safety.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
scoring.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
telemetry.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
tool_runtime.py	fix: only load mcp when enabled in tool_group (#2621)	2025-07-04 20:27:05 +05:30
vector_io.py	chore(tests): fix responses and vector_io tests (#3119)	2025-08-12 16:15:53 -07:00