Marut Pandya
|
e2b5456e48
|
Add Runpod Provider + Distribution (#362)
Add Runpod as a inference provider for openAI compatible managed
endpoints.
Testing
- Configured llama stack from scratch, set `remote::runpod` as a
inference provider.
- Added Runpod Endpoint URL and API key.
- Started llama-stack server - llama stack run my-local-stack --port
3000
```
curl http://localhost:5000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
"model": "Llama3.1-8B-Instruct",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Write me a 2 sentence poem about the moon"}
],
"sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}' ```
---------
Signed-off-by: pandyamarut <pandyamarut@gmail.com>
|
2025-01-23 12:19:02 -08:00 |
|