llama-stack-mirror/llama_stack/providers/inline/batches/reference
Varsha Prasad Narsing 531b1451dc feat: Add /v1/embeddings endpoint to batches API
This PR extends the Llama Stack Batches API to support the /v1/embeddings endpoint, enabling efficient batch processing of embedding requests alongside the existing /v1/chat/completions and /v1/completions support.

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
2025-09-29 12:00:28 -07:00
__init__.py feat: add batches API with OpenAI compatibility (with inference replay) (#3162) 2025-08-15 15:34:15 -07:00
batches.py feat: Add /v1/embeddings endpoint to batches API 2025-09-29 12:00:28 -07:00
config.py feat: add batches API with OpenAI compatibility (with inference replay) (#3162) 2025-08-15 15:34:15 -07:00
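
The latest commit adds /v1/embeddings to the endpoints a batch may target. As a rough illustration only, the sketch below builds an input file in the OpenAI-style batch JSONL layout (one request per line with `custom_id`, `method`, `url`, and `body`). It assumes this provider accepts that layout unchanged for embeddings, which the listing does not confirm. The model name and the helper `build_embedding_batch_lines` are hypothetical and do not come from batches.py.

```python
import json

# Hypothetical model identifier; replace with a model registered in your stack.
MODEL = "example-embedding-model"


def build_embedding_batch_lines(texts, model=MODEL):
    """Return one JSONL line per text, each an OpenAI-style batch request.

    Assumption: the batches provider accepts the OpenAI batch input layout
    (custom_id / method / url / body) for the /v1/embeddings endpoint.
    """
    lines = []
    for i, text in enumerate(texts):
        request = {
            "custom_id": f"embed-{i}",  # caller-chosen ID used to match results
            "method": "POST",
            "url": "/v1/embeddings",  # endpoint named in the commit message
            "body": {"model": model, "input": text},
        }
        lines.append(json.dumps(request))
    return lines


if __name__ == "__main__":
    # Print the JSONL content; in practice this would be written to a file
    # and uploaded before creating the batch.
    for line in build_embedding_batch_lines(["hello world", "batch me"]):
        print(line)
```

Each line is an independent request, so chat completion, completion, and embedding batches differ only in the `url` and the shape of `body`.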