Add Runpod Provider + Distribution (#362)

Add Runpod as an inference provider for OpenAI-compatible managed endpoints.

Testing
- Configured llama stack from scratch and set `remote::runpod` as the inference provider.
- Added the Runpod endpoint URL and API key.
- Started the llama-stack server: `llama stack run my-local-stack --port 3000`

```
curl http://localhost:5000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write me a 2 sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
  }'
```

---------

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
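
For reference, the test request above could also be issued from Python. This is a minimal sketch using only the standard library, not code from this commit: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and `base_url` is left as a parameter because it should match whatever port the server was actually started on.

```python
import json
import urllib.request


def build_chat_request(base_url: str) -> urllib.request.Request:
    """Build the chat_completion POST request from the testing notes.

    `base_url` is an assumption supplied by the caller, e.g. the host and
    port passed to `llama stack run ... --port`.
    """
    payload = {
        "model": "Llama3.1-8B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write me a 2 sentence poem about the moon"},
        ],
        "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512},
    }
    return urllib.request.Request(
        url=f"{base_url}/inference/chat_completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(base_url: str, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_chat_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a server running, `print(send_chat_request("http://localhost:5000"))` would print the raw response body; the response format is not shown in this commit, so the sketch does not attempt to parse it.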
2025-10-12 13:57:57 +00:00 · 2025-01-23 12:19:02 -08:00 · 2025-01-23 12:19:02 -08:00 · e2b5456e48
commit e2b5456e48
parent 86466b71a9
5 changed files with 190 additions and 0 deletions
--- a/distributions/runpod/build.yaml
+++ b/distributions/runpod/build.yaml
@@ -0,0 +1,9 @@
+name: runpod
+distribution_spec:
+  description: Use Runpod for running LLM inference
+  providers:
+    inference: remote::runpod
+    memory: meta-reference
+    safety: meta-reference
+    agents: meta-reference
+    telemetry: meta-reference