llama-stack/distributions
Marut Pandya e2b5456e48
Add Runpod Provider + Distribution (#362)
Add Runpod as an inference provider for OpenAI-compatible managed
endpoints.

Testing 
- Configured llama stack from scratch, set `remote::runpod` as a
inference provider.
- Added Runpod Endpoint URL and API key. 
- Started the llama-stack server: `llama stack run my-local-stack --port
3000`
```
curl http://localhost:5000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
	"model": "Llama3.1-8B-Instruct",
	"messages": [
		{"role": "system", "content": "You are a helpful assistant."},
		{"role": "user", "content": "Write me a 2 sentence poem about the moon"}
	],
	"sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}'
```
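
For scripted testing, the same request can be built in Python. This is a minimal sketch using only the standard library, not part of the commit: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the endpoint path and payload fields are copied from the curl call above. The base URL is a parameter because its port has to match the one passed to `llama stack run --port`.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_prompt: str) -> urllib.request.Request:
    """Build (but do not send) the chat_completion POST used in the curl example.

    Hypothetical helper: the path and payload shape mirror the curl call,
    nothing here is taken from the llama-stack client library.
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/inference/chat_completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(req: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With a server running, `send_chat_request(build_chat_request("http://localhost:5000", "Llama3.1-8B-Instruct", "Write me a 2 sentence poem about the moon"))` would issue the same call as the curl command. Building and sending are kept separate so the payload can be inspected without a live endpoint.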

---------

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
2025-01-23 12:19:02 -08:00
| Name | Last commit | Date |
|---|---|---|
| bedrock | Update default port from 5000 -> 8321 | 2025-01-16 15:26:48 -08:00 |
| cerebras | Update default port from 5000 -> 8321 | 2025-01-16 15:26:48 -08:00 |
| dell-tgi | More generic image type for OCI-compliant container technologies (#802) | 2025-01-17 16:37:42 -08:00 |
| fireworks | [CICD] add simple test step for docker build workflow, fix prefix bug (#821) | 2025-01-18 15:16:05 -08:00 |
| meta-reference-gpu | Update default port from 5000 -> 8321 | 2025-01-16 15:26:48 -08:00 |
| meta-reference-quantized-gpu | More generic image type for OCI-compliant container technologies (#802) | 2025-01-17 16:37:42 -08:00 |
| ollama | Auto-generate distro yamls + docs (#468) | 2024-11-18 14:57:06 -08:00 |
| remote-nvidia | Update default port from 5000 -> 8321 | 2025-01-16 15:26:48 -08:00 |
| remote-vllm | rename LLAMASTACK_PORT to LLAMA_STACK_PORT for consistency with other env vars (#744) | 2025-01-10 11:09:49 -08:00 |
| runpod | Add Runpod Provider + Distribution (#362) | 2025-01-23 12:19:02 -08:00 |
| tgi | Auto-generate distro yamls + docs (#468) | 2024-11-18 14:57:06 -08:00 |
| together | [CICD] add simple test step for docker build workflow, fix prefix bug (#821) | 2025-01-18 15:16:05 -08:00 |
| vllm-gpu | More generic image type for OCI-compliant container technologies (#802) | 2025-01-17 16:37:42 -08:00 |
| dependencies.json | add mcp runtime as default to all providers (#816) | 2025-01-17 16:40:58 -08:00 |