llama-stack-mirror/llama_stack/providers/remote/inference/ramalama
Charlie Doern · 4de45560bf · feat: remote ramalama provider implementation
Implement the remote RamaLama provider using AsyncOpenAI as the client, since RamaLama does not ship its own async client library but exposes an OpenAI-compatible API.
RamaLama is similar to Ollama in that it is a lightweight local inference server; unlike Ollama, it runs in a container by default.

RAMALAMA_URL is http://localhost:8080 by default

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-04-18 12:54:42 -04:00
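
As a rough illustration of the approach described above (a sketch, not the actual provider code; the class name, method, and the "/v1" path are assumptions), the adapter can point AsyncOpenAI at the RamaLama server's OpenAI-compatible endpoint:

    import os

    from openai import AsyncOpenAI


    class RamalamaClientSketch:
        """Illustrative only: talks to a RamaLama server through AsyncOpenAI,
        since RamaLama has no async client library of its own."""

        def __init__(self, url: str | None = None) -> None:
            # RAMALAMA_URL defaults to http://localhost:8080 (per the commit
            # message); the "/v1" suffix for the OpenAI-compatible API is an
            # assumption.
            base = url or os.getenv("RAMALAMA_URL", "http://localhost:8080")
            self.client = AsyncOpenAI(base_url=base.rstrip("/") + "/v1", api_key="not-needed")

        async def chat(self, model: str, prompt: str) -> str:
            # Forward a single-turn chat request and return the generated text.
            response = await self.client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content or ""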
Files (all last touched by this commit, 2025-04-18):
    __init__.py
    config.py
    models.py
    openai_utils.py
    ramalama.py
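
config.py presumably carries the provider configuration; a minimal sketch of what such a config might look like, assuming the Pydantic-based config style used elsewhere in llama-stack (class and field names here are guesses, only the RAMALAMA_URL default comes from the commit message):

    import os

    from pydantic import BaseModel


    DEFAULT_RAMALAMA_URL = "http://localhost:8080"


    class RamalamaImplConfigSketch(BaseModel):
        # Base URL of the running RamaLama server, overridable via RAMALAMA_URL.
        url: str = os.getenv("RAMALAMA_URL", DEFAULT_RAMALAMA_URL)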