Maintained by deepset

Integration: FastEmbed

Use the FastEmbed embedding models

Authors
deepset
Nicola Procopio

Overview

FastEmbed is a lightweight, fast Python library built for embedding generation and document ranking.

  • Light and fast: quantized model weights; ONNX Runtime for inference via Optimum.
  • Performant embedding models: list of supported models - including multilingual models.
  • Support for sparse embedding models.
  • Good integration with Qdrant document store and retrievers.

Installation

pip install fastembed-haystack

Usage

Components

The fastembed-haystack integration provides the following components:

  • Embedders:
    • FastembedTextEmbedder: creates a dense embedding for text (used in query/RAG pipelines).
    • FastembedDocumentEmbedder: enriches documents with dense embeddings (used in indexing pipelines).
    • FastembedSparseTextEmbedder: creates a sparse embedding for text (used in query/RAG pipelines).
    • FastembedSparseDocumentEmbedder: enriches documents with sparse embeddings (used in indexing pipelines).
  • Ranker:
    • FastembedRanker: ranks documents based on a query (used in query/RAG pipelines after the retrieval).

Example with dense embeddings

from haystack import Document, Pipeline
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder, FastembedTextEmbedder

document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

documents = [
    Document(content="My name is Wolfgang and I live in Berlin"),
    Document(content="I saw a black horse running"),
    Document(content="Germany has many big cities"),
    Document(content="fastembed is supported by and maintained by Qdrant."),
]

document_embedder = FastembedDocumentEmbedder()
document_embedder.warm_up()
documents_with_embeddings = document_embedder.run(documents)["documents"]
document_store.write_documents(documents_with_embeddings)

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", FastembedTextEmbedder())
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")

query = "Who supports fastembed?"

result = query_pipeline.run({"text_embedder": {"text": query}})

For a more detailed example, see this notebook.
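
The document store above is configured with embedding_similarity_function="cosine", so retrieval presumably orders documents by the cosine similarity between the query embedding and each document embedding. The sketch below illustrates that scoring in plain Python. It is not Haystack's implementation: the helper names (cosine_similarity, rank_by_cosine) and the three-dimensional toy vectors are assumptions made for illustration, and real FastEmbed embeddings have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_embedding, doc_embeddings, top_k=10):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_embedding, embedding))
        for doc_id, embedding in doc_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up toy vectors, not real model output.
toy_docs = {
    "berlin": [0.9, 0.1, 0.0],
    "horse": [0.0, 1.0, 0.2],
    "qdrant": [0.1, 0.0, 0.95],
}
toy_query = [0.2, 0.0, 0.9]

ranking = rank_by_cosine(toy_query, toy_docs, top_k=2)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, the "qdrant" entry ranks first because its vector points in nearly the same direction as the query vector.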

Example with sparse embeddings

Currently, sparse embedding retrieval is supported only by QdrantDocumentStore. Install the Qdrant integration as follows:

pip install qdrant-haystack

from haystack import Document, Pipeline
from haystack_integrations.components.retrievers.qdrant import QdrantSparseEmbeddingRetriever
from haystack_integrations.document_stores.qdrant import QdrantDocumentStore
from haystack_integrations.components.embedders.fastembed import FastembedSparseDocumentEmbedder, FastembedSparseTextEmbedder

document_store = QdrantDocumentStore(
    ":memory:",
    recreate_index=True,
    use_sparse_embeddings=True
)

documents = [
    Document(content="My name is Wolfgang and I live in Berlin"),
    Document(content="I saw a black horse running"),
    Document(content="Germany has many big cities"),
    Document(content="fastembed is supported by and maintained by Qdrant."),
]

sparse_document_embedder = FastembedSparseDocumentEmbedder()
sparse_document_embedder.warm_up()
documents_with_sparse_embeddings = sparse_document_embedder.run(documents)["documents"]
document_store.write_documents(documents_with_sparse_embeddings)

query_pipeline = Pipeline()
query_pipeline.add_component("sparse_text_embedder", FastembedSparseTextEmbedder())
query_pipeline.add_component("sparse_retriever", QdrantSparseEmbeddingRetriever(document_store=document_store))
query_pipeline.connect("sparse_text_embedder.sparse_embedding", "sparse_retriever.query_sparse_embedding")

query = "Who supports fastembed?"

result = query_pipeline.run({"sparse_text_embedder": {"text": query}})

For a more detailed example, see this notebook.
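
A sparse embedding stores only its non-zero entries, as a list of indices paired with a list of values. Matching a query against a document then comes down to a dot product over the indices both vectors share. The sketch below shows that idea in plain Python. ToySparseEmbedding and sparse_dot are hypothetical names introduced here for illustration, the numbers are made up, and the actual scoring is performed inside Qdrant rather than by code like this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToySparseEmbedding:
    """Hypothetical stand-in for a sparse vector: parallel index/value lists."""

    indices: List[int]
    values: List[float]


def sparse_dot(a: ToySparseEmbedding, b: ToySparseEmbedding) -> float:
    """Dot product that only visits indices present in both vectors."""
    weights = dict(zip(a.indices, a.values))
    return sum(weights.get(i, 0.0) * v for i, v in zip(b.indices, b.values))


# Made-up vectors: indices 17 and 42 are the only ones shared.
query_vec = ToySparseEmbedding(indices=[3, 17, 42], values=[0.5, 1.0, 2.0])
doc_vec = ToySparseEmbedding(indices=[17, 42, 99], values=[2.0, 0.5, 3.0])

score = sparse_dot(query_vec, doc_vec)  # 1.0 * 2.0 + 2.0 * 0.5
print(score)
```

Because only shared indices contribute, documents with no overlapping entries score zero regardless of how large their individual values are.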

Example with ranker

from haystack import Document, Pipeline
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_integrations.components.embedders.fastembed import FastembedDocumentEmbedder, FastembedTextEmbedder
from haystack_integrations.components.rankers.fastembed import FastembedRanker

document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

query = "Who supports fastembed?"

documents = [
    Document(content="My name is Wolfgang and I live in Berlin"),
    Document(content="I saw a black horse running"),
    Document(content="Germany has many big cities"),
    Document(content="fastembed is supported by and maintained by Qdrant."),
]

document_embedder = FastembedDocumentEmbedder()
document_embedder.warm_up()
documents_with_embeddings = document_embedder.run(documents)["documents"]
document_store.write_documents(documents_with_embeddings)

query_pipeline = Pipeline()
query_pipeline.add_component("text_embedder", FastembedTextEmbedder())
query_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
query_pipeline.add_component("ranker", FastembedRanker(top_k=2))
query_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
query_pipeline.connect("retriever.documents", "ranker.documents")

result = query_pipeline.run({"text_embedder": {"text": query}, "ranker": {"query": query}})
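
The pipeline above follows a retrieve-then-rerank pattern: the retriever returns candidate documents, and the ranker rescores each candidate against the query and keeps the best top_k. The sketch below shows only that control flow in plain Python. The token_overlap_score function is a deliberately crude, hypothetical stand-in and is not what FastembedRanker computes; the rerank helper is likewise an assumption made for illustration.

```python
import re


def token_overlap_score(query, text):
    """Toy relevance score: fraction of query tokens that appear in the text.

    A placeholder only; a real ranker uses a model to score each pair.
    """
    query_tokens = set(re.findall(r"\w+", query.lower()))
    text_tokens = set(re.findall(r"\w+", text.lower()))
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def rerank(query, candidates, top_k, score_fn=token_overlap_score):
    """Rescore every candidate against the query and keep the best top_k."""
    scored = [(score_fn(query, candidate), candidate) for candidate in candidates]
    # Python's sort is stable, so ties keep the retriever's original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:top_k]]


candidates = [
    "My name is Wolfgang and I live in Berlin",
    "I saw a black horse running",
    "Germany has many big cities",
    "fastembed is supported by and maintained by Qdrant.",
]

top = rerank("Who supports fastembed?", candidates, top_k=2)
for text in top:
    print(text)
```

Reranking only the retrieved candidates, rather than the whole document store, keeps the more expensive scoring step limited to a small set.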

License

fastembed-haystack is distributed under the terms of the Apache-2.0 license.