Add Runpod Provider + Distribution (#362)

Add Runpod as an inference provider for OpenAI-compatible managed
endpoints.
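
Since the provider spec below pulls in the `openai` package, the adapter presumably drives these endpoints through the standard OpenAI client. A minimal sketch of that interaction, assuming a hypothetical endpoint URL and credentials (the `RUNPOD_ENDPOINT_URL` / `RUNPOD_API_KEY` variable names are illustrative, not part of this PR):

```
# Sketch: calling a Runpod OpenAI-compatible endpoint with the openai client.
# The env var names and base_url are placeholders, not values from this PR.
import os

from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url=os.environ["RUNPOD_ENDPOINT_URL"],  # managed endpoint's base URL
    api_key=os.environ["RUNPOD_API_KEY"],
)

resp = client.chat.completions.create(
    model="Llama3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Write me a 2 sentence poem about the moon"}],
    temperature=0.7,
    max_tokens=512,
)
print(resp.choices[0].message.content)
```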

Testing
- Configured Llama Stack from scratch and set `remote::runpod` as an
inference provider.
- Added the Runpod endpoint URL and API key.
- Started the llama-stack server: `llama stack run my-local-stack --port 3000`
```
curl http://localhost:3000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2 sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}'
```
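
For scripted checks, the same request can be issued from Python; this is a direct translation of the curl call above, nothing beyond it:

```
# Same request as the curl above, sent with the requests library.
import requests

payload = {
    "model": "Llama3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2 sentence poem about the moon"},
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512},
}
resp = requests.post("http://localhost:3000/inference/chat_completion", json=payload)
resp.raise_for_status()
print(resp.json())
```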

---------

Signed-off-by: pandyamarut <pandyamarut@gmail.com>

```
@@ -195,4 +195,13 @@ def available_providers() -> List[ProviderSpec]:
                 config_class="llama_stack.providers.remote.inference.nvidia.NVIDIAConfig",
             ),
         ),
+        remote_provider_spec(
+            api=Api.inference,
+            adapter=AdapterSpec(
+                adapter_type="runpod",
+                pip_packages=["openai"],
+                module="llama_stack.providers.adapters.inference.runpod",
+                config_class="llama_stack.providers.adapters.inference.runpod.RunpodImplConfig",
+            ),
+        ),
     ]
```
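
The hunk references `RunpodImplConfig` without showing it. A plausible sketch of such a config class, assuming it follows the endpoint-URL-plus-API-token pattern of the other remote inference providers; the field names are guesses, not copied from the PR:

```
# Plausible sketch only: the diff above does not show this file, and the
# field names (url, api_token) are assumptions, not confirmed by the PR.
from typing import Optional

from pydantic import BaseModel, Field


class RunpodImplConfig(BaseModel):
    url: Optional[str] = Field(
        default=None,
        description="Runpod OpenAI-compatible endpoint URL",
    )
    api_token: Optional[str] = Field(
        default=None,
        description="API key used to authenticate against the endpoint",
    )
```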