# Adding a New Vector Database This guide will walk you through the process of adding a new vector database to Llama Stack. > **_NOTE:_** Here's an example Pull Request of the [Milvus Vector Database Provider](https://github.com/meta-llama/llama-stack/pull/1467). Vector Database providers are used to store and retrieve vector embeddings. Vector databases are not limited to vector search but can support keyword and hybrid search. Additionally, vector database can also support operations like filtering, sorting, and aggregating vectors. ## Steps to Add a New Vector Database Provider 1. **Choose the Database Type**: Determine if your vector database is a remote service, inline, or both. - Remote databases make requests to external services, while inline databases execute locally. Some providers support both. 2. **Implement the Provider**: Create a new provider class that inherits from `VectorDatabaseProvider` and implements the required methods. - Implement methods for vector storage, retrieval, search, and any additional features your database supports. - You will need to implement the following methods for `YourVectorIndex`: - `YourVectorIndex.create()` - `YourVectorIndex.initialize()` - `YourVectorIndex.add_chunks()` - `YourVectorIndex.delete_chunk()` - `YourVectorIndex.query_vector()` - `YourVectorIndex.query_keyword()` - `YourVectorIndex.query_hybrid()` - You will need to implement the following methods for `YourVectorIOAdapter`: - `YourVectorIOAdapter.initialize()` - `YourVectorIOAdapter.shutdown()` - `YourVectorIOAdapter.list_vector_dbs()` - `YourVectorIOAdapter.register_vector_db()` - `YourVectorIOAdapter.unregister_vector_db()` - `YourVectorIOAdapter.insert_chunks()` - `YourVectorIOAdapter.query_chunks()` - `YourVectorIOAdapter.delete_chunks()` 3. **Add to Registry**: Register your provider in the appropriate registry file. - Update {repopath}`llama_stack/providers/registry/vector_io.py` to include your new provider. ```python from llama_stack.providers.registry.specs import InlineProviderSpec from llama_stack.providers.registry.api import Api InlineProviderSpec( api=Api.vector_io, provider_type="inline::milvus", pip_packages=["pymilvus>=2.4.10"], module="llama_stack.providers.inline.vector_io.milvus", config_class="llama_stack.providers.inline.vector_io.milvus.MilvusVectorIOConfig", api_dependencies=[Api.inference], optional_api_dependencies=[Api.files], description="", ), ``` 4. **Add Tests**: Create unit tests and integration tests for your provider in the `tests/` directory. - Unit Tests - By following the structure of the class methods, you will be able to easily run unit and integration tests for your database. 1. You have to configure the tests for your provide in `/tests/unit/providers/vector_io/conftest.py`. 2. Update the `vector_provider` fixture to include your provider if they are an inline provider. 3. Create a `your_vectorprovider_index` fixture that initializes your vector index. 4. Create a `your_vectorprovider_adapter` fixture that initializes your vector adapter. 5. Add your provider to the `vector_io_providers` fixture dictionary. - Please follow the naming convention of `your_vectorprovider_index` and `your_vectorprovider_adapter` as the tests require this to execute properly. - Integration Tests - Integration tests are located in {repopath}`tests/integration`. These tests use the python client-SDK APIs (from the `llama_stack_client` package) to test functionality. - The two set of integration tests are: - `tests/integration/vector_io/test_vector_io.py`: This file tests registration, insertion, and retrieval. - `tests/integration/vector_io/test_openai_vector_stores.py`: These tests are for OpenAI-compatible vector stores and test the OpenAI API compatibility. - You will need to update `skip_if_provider_doesnt_support_openai_vector_stores` to include your provider as well as `skip_if_provider_doesnt_support_openai_vector_stores_search` to test the appropriate search functionality. - Running the tests in the GitHub CI - You will need to update the `.github/workflows/integration-vector-io-tests.yml` file to include your provider. - If your provider is a remote provider, you will also have to add a container to spin up and run it in the action. - Updating the pyproject.yml - If you are adding tests for the `inline` provider you will have to update the `unit` group. - `uv add new_pip_package --group unit` - If you are adding tests for the `remote` provider you will have to update the `test` group, which is used in the GitHub CI for integration tests. - `uv add new_pip_package --group test` 5. **Update Documentation**: Please update the documentation for end users - Generate the provider documentation by running {repopath}`./scripts/provider_codegen.py`. - Update the autogenerated content in the registry/vector_io.py file with information about your provider. Please see other providers for examples.