mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-11 13:44:38 +00:00
This PR extends the Llama Stack Batches API to support the /v1/embeddings endpoint, enabling efficient batch processing of embedding requests alongside the existing /v1/chat/completions and /v1/completions support. Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> |
||
---|---|---|
.. | ||
agent | ||
agents | ||
batches | ||
files | ||
inference | ||
nvidia | ||
utils | ||
vector_io | ||
test_bedrock.py | ||
test_configs.py |