llama-stack-mirror/llama_stack/providers
Marut Pandya e2b5456e48
Add Runpod Provider + Distribution (#362)
Add Runpod as an inference provider for OpenAI-compatible managed
endpoints.

Testing
- Configured llama stack from scratch and set `remote::runpod` as an
inference provider.
- Added the Runpod endpoint URL and API key (see the config sketch below).
- Started the llama-stack server: `llama stack run my-local-stack --port 3000`
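
For reference, a minimal sketch of what the `remote::runpod` entry in the stack's run config might look like. The field names (`url`, `api_token`) and the `${env.*}` placeholders are assumptions modeled on other OpenAI-compatible remote adapters, not taken from this PR:

```yaml
# Hypothetical excerpt from the run config used above (my-local-stack);
# field names are assumptions based on similar OpenAI-compatible adapters.
providers:
  inference:
    - provider_id: runpod
      provider_type: remote::runpod
      config:
        url: ${env.RUNPOD_URL}              # Runpod endpoint URL
        api_token: ${env.RUNPOD_API_TOKEN}  # Runpod API key
```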
```
curl http://localhost:3000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write me a 2 sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
  }'
```
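
The same request can also be made from Python. A minimal sketch using `requests`, assuming the server is listening on the port chosen above:

```python
import requests

# Assumes a llama-stack server started with:
#   llama stack run my-local-stack --port 3000
response = requests.post(
    "http://localhost:3000/inference/chat_completion",
    json={
        "model": "Llama3.1-8B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Write me a 2 sentence poem about the moon"},
        ],
        "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512},
    },
    timeout=60,
)
response.raise_for_status()
print(response.json())
```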

---------

Signed-off-by: pandyamarut <pandyamarut@gmail.com>
2025-01-23 12:19:02 -08:00
| Name | Last commit | Date |
|------|-------------|------|
| adapters/inference/runpod | Add Runpod Provider + Distribution (#362) | 2025-01-23 12:19:02 -08:00 |
| inline | Move tool_runtime.memory -> tool_runtime.rag | 2025-01-22 20:25:02 -08:00 |
| registry | Add Runpod Provider + Distribution (#362) | 2025-01-23 12:19:02 -08:00 |
| remote | Add vLLM raw completions API (#823) | 2025-01-22 22:58:27 -08:00 |
| tests | remove test report | 2025-01-23 01:06:39 -08:00 |
| utils | [inference api] modify content types so they follow a more standard structure (#841) | 2025-01-22 12:16:18 -08:00 |
| __init__.py | API Updates (#73) | 2024-09-17 19:51:35 -07:00 |
| datatypes.py | [memory refactor][1/n] Rename Memory -> VectorIO, MemoryBanks -> VectorDBs (#828) | 2025-01-22 09:59:30 -08:00 |