llama-stack-mirror/llama_stack/providers
Marut Pandya e2b5456e48
Add Runpod Provider + Distribution (#362)
Add Runpod as an inference provider for OpenAI-compatible managed
endpoints.

Testing
- Configured llama stack from scratch and set `remote::runpod` as an
inference provider.
- Added the Runpod endpoint URL and API key (see the config sketch below).
- Started the llama-stack server: `llama stack run my-local-stack --port 3000`
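
For reference, a minimal sketch of what the `remote::runpod` entry in the stack's run config might look like. The field names (`url`, `api_token`) and the `${env.*}` placeholders are assumptions modeled on other OpenAI-compatible remote adapters, not taken from this PR:

```yaml
# Hypothetical excerpt from the run config used above (my-local-stack);
# field names are assumptions based on similar OpenAI-compatible adapters.
providers:
  inference:
    - provider_id: runpod
      provider_type: remote::runpod
      config:
        url: ${env.RUNPOD_URL}              # Runpod endpoint URL
        api_token: ${env.RUNPOD_API_TOKEN}  # Runpod API key
```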
```
curl http://localhost:3000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write me a 2 sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
  }'
```
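
The same request can also be made from Python. A minimal sketch using `requests`, assuming the server is listening on the port chosen above:

```python
import requests

# Assumes a llama-stack server started with:
#   llama stack run my-local-stack --port 3000
response = requests.post(
    "http://localhost:3000/inference/chat_completion",
    json={
        "model": "Llama3.1-8B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write me a 2 sentence poem about the moon"},
        ],
        "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512},
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())
```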

---------

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
2025-01-23 12:19:02 -08:00
| Name | Last commit | Date |
|------|-------------|------|
| adapters/inference/runpod | Add Runpod Provider + Distribution (#362) | 2025-01-23 12:19:02 -08:00 |
| inline | Move tool_runtime.memory -> tool_runtime.rag | 2025-01-22 20:25:02 -08:00 |
| registry | Add Runpod Provider + Distribution (#362) | 2025-01-23 12:19:02 -08:00 |
| remote | Add vLLM raw completions API (#823) | 2025-01-22 22:58:27 -08:00 |
| tests | remove test report | 2025-01-23 01:06:39 -08:00 |
| utils | [inference api] modify content types so they follow a more standard structure (#841) | 2025-01-22 12:16:18 -08:00 |
| __init__.py | API Updates (#73) | 2024-09-17 19:51:35 -07:00 |
| datatypes.py | [memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs (#828) | 2025-01-22 09:59:30 -08:00 |