mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-17 16:32:38 +00:00

qifengle 35a0a6cb7b feat(vector-io): add OpenGauss vector database provider

Implement OpenGauss vector database integration for Llama Stack with the following features:
- Add OpenGaussVectorIOAdapter for vector storage and retrieval
- Support native vector similarity search operations
- Provide configuration template for easy setup
- Add comprehensive unit tests
- Align with the latest Llama Stack provider architecture, including KVStore and OpenAI Vector Store Mixin.

The implementation allows Llama Stack users to leverage OpenGauss as an
enterprise-grade vector database for RAG applications.

2025-08-19 17:13:24 +08:00

907 B


Inference

Overview

The Llama Stack Inference API generates completions, chat completions, and embeddings.

This API provides the raw interface to the underlying models. Two kinds of models are supported:

- LLMs: these models generate "raw" and "chat" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic search.
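
To make the "embeddings for semantic search" use case concrete, here is a minimal sketch of how embedding vectors are typically compared. It does not call the Llama Stack API: the three-dimensional vectors, the document IDs, and the helper names `cosine_similarity` and `rank_documents` are all hypothetical stand-ins for what an embedding model and a vector store would provide, and cosine similarity is assumed as the distance measure.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Sort (doc_id, score) pairs from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for embedding-model output.
doc_vecs = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.7, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
print([doc_id for doc_id, _ in ranking])
```

In a real deployment the vectors would come from an embedding model served by one of the providers listed in this section, and the ranking step would usually be delegated to a vector database rather than computed in application code.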

This section contains documentation for all available providers for the inference API.

Providers

:maxdepth: 1

inline_meta-reference
inline_sentence-transformers
remote_anthropic
remote_bedrock
remote_cerebras
remote_databricks
remote_fireworks
remote_gemini
remote_groq
remote_hf_endpoint
remote_hf_serverless
remote_llama-openai-compat
remote_nvidia
remote_ollama
remote_openai
remote_passthrough
remote_runpod
remote_sambanova
remote_tgi
remote_together
remote_vertexai
remote_vllm
remote_watsonx
