# What does this PR do?
This changes all VectorIO provider classes to follow the patterns
`<ProviderName>VectorIOConfig` and `<ProviderName>VectorIOAdapter`. All
VectorIO API endpoints are already consistent with `/vector-io`.
Note that the VectorDB API endpoints stay unchanged at `/vector-dbs`.
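For illustration, the renamed classes for a provider such as Qdrant follow this shape (a minimal sketch; the config field and constructor signature are hypothetical, not the provider's actual definitions):
```
from pydantic import BaseModel


class QdrantVectorIOConfig(BaseModel):  # <ProviderName>VectorIOConfig
    # Hypothetical field for illustration only
    url: str | None = None


class QdrantVectorIOAdapter:  # <ProviderName>VectorIOAdapter
    def __init__(self, config: QdrantVectorIOConfig) -> None:
        self.config = config
```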
## Test Plan
I don't have a way to test all providers. This is a simple renaming, so
things should work as expected.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
- Configured the ruff linter to automatically fix import sorting issues.
- Set `--exit-non-zero-on-fix` to ensure a non-zero exit code when fixes
are applied.
- Enabled the 'I' selection to focus on import-related linting rules.
- Ran the linter and formatted all imports across the codebase accordingly.
- Removed the `black` dependency from the "dev" group since we use ruff.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
I tried running the Qdrant provider and found some bugs. See #1021 for
details. @terrytangyuan wrote there:
> Please feel free to submit your changes in a PR. I fixed similar
issues for pgvector provider. This might be an issue introduced from a
refactoring.
So I am submitting this PR.
Closes #1021
## Test Plan
Here are the highlights for what I did to test this:
References:
- https://llama-stack.readthedocs.io/en/latest/getting_started/index.html
- https://github.com/meta-llama/llama-stack-apps/blob/main/examples/agents/rag_with_vector_db.py
- https://github.com/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/README.md#build-configure-and-run-llama-stack
Install and run the Qdrant server:
```
podman pull qdrant/qdrant
mkdir qdrant-data
podman run -p 6333:6333 -v $(pwd)/qdrant-data:/qdrant/storage qdrant/qdrant
```
Install and run Llama Stack from the venv-support PR (mainly because I
didn't want to install conda):
```
brew install cmake # Should just need this once
git clone https://github.com/meta-llama/llama-models.git
gh repo clone cdoern/llama-stack
cd llama-stack
gh pr checkout 1018 # This is the checkout that introduces venv support for build/run. Otherwise you have to use conda. Eventually this will be part of main, hopefully.
uv sync --extra dev
uv pip install -e .
source .venv/bin/activate
uv pip install qdrant_client
LLAMA_STACK_DIR=$(pwd) LLAMA_MODELS_DIR=../llama-models llama stack build --template ollama --image-type venv
```
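As a quick sanity check that the Qdrant server from the earlier step is reachable, something like this can be run inside the activated venv (a sketch using the `qdrant_client` package installed above; the URL assumes the default port mapping from the `podman run` command):
```
# Expect an empty collections list on a fresh qdrant-data directory.
from qdrant_client import QdrantClient

qc = QdrantClient(url="http://localhost:6333")
print(qc.get_collections())
```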
Edit `llama_stack/templates/ollama/run.yaml`. Under:
```
vector_io:
```
add:
```
  - provider_id: qdrant
    provider_type: remote::qdrant
    config: {}
```
See
https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/vector_io/qdrant/config.py#L14
for config options (but I didn't need any).
```
LLAMA_STACK_DIR=$(pwd) LLAMA_MODELS_DIR=../llama-models llama stack run ollama --image-type venv \
--port $LLAMA_STACK_PORT \
--env INFERENCE_MODEL=$INFERENCE_MODEL \
--env SAFETY_MODEL=$SAFETY_MODEL \
--env OLLAMA_URL=$OLLAMA_URL
```
Then I tested it out in a notebook. Key highlights included:
```
import uuid

# `client` is a LlamaStackClient pointed at the running stack, e.g.
# LlamaStackClient(base_url=f"http://localhost:{os.environ['LLAMA_STACK_PORT']}")
qdrant_provider = None
for provider in client.providers.list():
    if provider.api == "vector_io" and provider.provider_id == "qdrant":
        qdrant_provider = provider

assert qdrant_provider is not None, "Qdrant is not a provider. You need to edit the run yaml file you use in your `llama stack run` call"

vector_db_id = f"test-vector-db-{uuid.uuid4().hex}"
client.vector_dbs.register(
    vector_db_id=vector_db_id,
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id=qdrant_provider.provider_id,
)
```
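To go one step further, a short insert/query round trip can confirm the Qdrant-backed vector DB works end to end (a minimal sketch; the chunk shape and the `client.vector_io` calls are assumptions based on the client API of the time, so adjust to the installed client version):
```
# Hypothetical round trip: insert one chunk, then query it back.
chunks = [
    {
        "content": "Llama Stack exposes a vector_io API for RAG workloads.",
        "metadata": {"document_id": "doc-1"},
    }
]
client.vector_io.insert(vector_db_id=vector_db_id, chunks=chunks)

response = client.vector_io.query(
    vector_db_id=vector_db_id,
    query="Which API does Llama Stack expose for RAG?",
)
for chunk in response.chunks:
    print(chunk.metadata["document_id"], chunk.content)
```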
Other than that, I just followed what was in
https://llama-stack.readthedocs.io/en/latest/getting_started/index.html
It would be good to have automated tests for this in the future, but
that would be a big undertaking.
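For reference, such a test might eventually look something like this hypothetical pytest sketch (it assumes a stack built with the Qdrant provider is already serving on `$LLAMA_STACK_PORT`; the fixture, fallback port, and test name are illustrative, not an existing test):
```
import os
import uuid

import pytest
from llama_stack_client import LlamaStackClient


@pytest.fixture
def client() -> LlamaStackClient:
    # Assumes `llama stack run` is already serving on $LLAMA_STACK_PORT.
    port = os.environ.get("LLAMA_STACK_PORT", "5001")
    return LlamaStackClient(base_url=f"http://localhost:{port}")


def test_register_vector_db_with_qdrant(client: LlamaStackClient) -> None:
    # The qdrant provider must be listed under vector_io in run.yaml.
    providers = [
        p
        for p in client.providers.list()
        if p.api == "vector_io" and p.provider_id == "qdrant"
    ]
    assert providers, "qdrant is not configured under vector_io"

    vector_db_id = f"test-vector-db-{uuid.uuid4().hex}"
    client.vector_dbs.register(
        vector_db_id=vector_db_id,
        embedding_model="all-MiniLM-L6-v2",
        embedding_dimension=384,
        provider_id="qdrant",
    )
```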
Signed-off-by: Bill Murdock <bmurdock@redhat.com>
The lint check on the main branch is failing. This fixes the lint check
after we moved to ruff in
https://github.com/meta-llama/llama-stack/pull/921. We need to move to a
`ruff.toml` file as well as fix and ignore some additional checks.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
See https://github.com/meta-llama/llama-stack/issues/827 for the broader
design.
This is the first part:
- delete other kinds of memory banks (keyvalue, keyword, graph) for now;
we will introduce a keyvalue store API as part of this design but not
use it in the RAG tool yet.
- rename the APIs