Commit graph

4 commits

Author SHA1 Message Date
skamenan7
474b50b422 Add configurable embedding models for vector IO providers
This change lets users configure default embedding models at the provider level instead of always relying on system defaults. Each vector store provider can now specify an embedding_model and optional embedding_dimension in their config.

Key features:
- Auto-dimension lookup for standard models from the registry
- Support for Matryoshka embeddings with custom dimensions
- Three-tier priority: explicit params > provider config > system fallback
- Full backward compatibility - existing setups work unchanged
- Comprehensive test coverage with 20 test cases

Updated all vector IO providers (FAISS, Chroma, Milvus, Qdrant, etc.) with the new config fields and added detailed documentation with examples.

Fixes #2729
2025-07-15 16:46:40 -04:00
Francisco Arceo
ea80ea63ac
chore: Updating chunk id generation to ensure uniqueness (#2618)
# What does this PR do?
This handles an edge case for `generate_chunk_id` if the concatenation
of the `document_id` and `chunk_text` combination are not unique. Adding
the window location ensures uniqueness.

## Test Plan
Added unit test

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-07-04 10:26:35 +05:30
Jorge
4d0d2d685f
fix: Set parameter usedforsecurity=False when calling hashlib.md5 in order to fix rag_tool.insert on FIPS clusters (#2577)
Some checks failed
Integration Tests / test-matrix (library, 3.12, scoring) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.12, post_training) (push) Failing after 10s
Integration Tests / test-matrix (library, 3.12, providers) (push) Failing after 8s
Integration Tests / test-matrix (library, 3.12, tool_runtime) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, agents) (push) Failing after 6s
Integration Tests / test-matrix (library, 3.13, inspect) (push) Failing after 5s
Integration Tests / test-matrix (library, 3.13, tool_runtime) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 5s
Integration Tests / test-matrix (library, 3.12, vector_io) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 9s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 7s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 18s
Integration Tests / test-matrix (library, 3.13, inference) (push) Failing after 26s
Integration Tests / test-matrix (library, 3.13, post_training) (push) Failing after 25s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, scoring) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 7s
Integration Tests / test-matrix (library, 3.13, providers) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 23s
Python Package Build Test / build (3.12) (push) Failing after 1s
Integration Tests / test-matrix (library, 3.13, vector_io) (push) Failing after 24s
Integration Tests / test-matrix (library, 3.13, datasets) (push) Failing after 31s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
Test External Providers / test-external-providers (venv) (push) Failing after 5s
Unit Tests / unit-tests (3.13) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 34s
Python Package Build Test / build (3.13) (push) Failing after 33s
Pre-commit / pre-commit (push) Successful in 1m52s
# What does this PR do?
Set parameter `usedforsecurity=False` when calling hashlib.md5 in order
to fix rag_tool.insert on FIPS clusters

<!-- If resolving an issue, uncomment and update the line below -->
Closes #2571

---------

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
2025-07-02 12:07:05 +02:00
Francisco Arceo
82f13fe83e
feat: Add ChunkMetadata to Chunk (#2497)
# What does this PR do?
Adding `ChunkMetadata` so we can properly delete embeddings later.

More specifically, this PR refactors and extends the chunk metadata
handling in the vector database and introduces a distinction between
metadata used for model context and backend-only metadata required for
chunk management, storage, and retrieval. It also improves chunk ID
generation and propagation throughout the stack, enhances test coverage,
and adds new utility modules.

```python
class ChunkMetadata(BaseModel):
    """
    `ChunkMetadata` is backend metadata for a `Chunk` that is used to store additional information about the chunk that
        will NOT be inserted into the context during inference, but is required for backend functionality.
        Use `metadata` in `Chunk` for metadata that will be used during inference.
    """
    document_id: str | None = None
    chunk_id: str | None = None
    source: str | None = None
    created_timestamp: int | None = None
    updated_timestamp: int | None = None
    chunk_window: str | None = None
    chunk_tokenizer: str | None = None
    chunk_embedding_model: str | None = None
    chunk_embedding_dimension: int | None = None
    content_token_count: int | None = None
    metadata_token_count: int | None = None
```
Eventually we can migrate the document_id out of the `metadata` field.
I've introduced the changes so that `ChunkMetadata` is backwards
compatible with `metadata`.

<!-- If resolving an issue, uncomment and update the line below -->
Closes https://github.com/meta-llama/llama-stack/issues/2501 

## Test Plan
Added unit tests

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-06-25 15:55:23 -04:00