Mirror of https://github.com/meta-llama/llama-stack.git
Implement a remote ramalama provider, using AsyncOpenAI as the client since ramalama does not ship its own async client library. Ramalama is similar to ollama in that it is a lightweight local inference server; however, it runs in a containerized mode by default. RAMALAMA_URL defaults to http://localhost:8080.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
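Below is a minimal sketch of how a provider can talk to ramalama's OpenAI-compatible endpoint through AsyncOpenAI, as the commit describes. The `/v1` path suffix, the placeholder API key, and the model name are assumptions for illustration only; the actual configuration and request handling live in `config.py` and `ramalama.py`.

```python
import asyncio
import os

from openai import AsyncOpenAI

# RAMALAMA_URL defaults to http://localhost:8080 (per the commit message).
RAMALAMA_URL = os.environ.get("RAMALAMA_URL", "http://localhost:8080")


async def main() -> None:
    # ramalama has no async client of its own, so AsyncOpenAI is pointed at
    # its OpenAI-compatible API instead. The "/v1" suffix and the dummy
    # api_key are assumptions, not taken from the provider code.
    client = AsyncOpenAI(base_url=f"{RAMALAMA_URL}/v1", api_key="ramalama")

    response = await client.chat.completions.create(
        model="llama3.2:3b",  # hypothetical model name served by ramalama
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)


if __name__ == "__main__":
    asyncio.run(main())
```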
| File |
|---|
| __init__.py |
| config.py |
| models.py |
| openai_utils.py |
| ramalama.py |