Integration: Arize Phoenix

Trace your Haystack pipelines with Arize Phoenix

Authors
Arize AI

Overview

Arize Phoenix is Arize's open-source platform that offers developers the quickest way to troubleshoot, evaluate, and experiment with LLM applications.

For a detailed integration guide, see the Phoenix + Haystack documentation.

Installation

pip install openinference-instrumentation-haystack haystack-ai opentelemetry-sdk opentelemetry-exporter-otlp arize-phoenix

Usage

To trace any Haystack pipeline with Phoenix, initialize OpenTelemetry and the HaystackInstrumentor. Any Haystack pipeline that then runs in the same Python process sends its traces to Phoenix.

First, start a Phoenix instance to send traces to.

python -m phoenix.server.main serve
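
Before wiring up the exporter, it can help to confirm that something is actually listening on the Phoenix port. The following is a minimal sketch using only the Python standard library. It assumes Phoenix is served at localhost:6006, the address used for the exporter in this guide; `phoenix_is_reachable` is a hypothetical helper written for illustration, not part of Phoenix or OpenTelemetry.

```python
import socket
from urllib.parse import urlparse


def phoenix_is_reachable(endpoint: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds.

    Hypothetical helper: it only checks that a server accepts connections,
    not that the server is Phoenix or that the traces route exists.
    """
    parsed = urlparse(endpoint)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `phoenix_is_reachable("http://localhost:6006/v1/traces")` returns `False` when no server is listening, which is a cheap way to catch a Phoenix instance that has not been started before spans are silently dropped.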

Now let's connect our Haystack pipeline to Phoenix using OpenTelemetry.

from openinference.instrumentation.haystack import HaystackInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
    OTLPSpanExporter,
)
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

endpoint = "http://localhost:6006/v1/traces" # The URL to your Phoenix instance
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))

HaystackInstrumentor().instrument(tracer_provider=tracer_provider)

Now you can run a Haystack pipeline in the same environment, and its trace will show up in Phoenix.

To run the example below, set the OPENAI_API_KEY environment variable to your OpenAI API key.

from haystack import Document, Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="My name is Jean and I live in Paris."),
    Document(content="My name is Mark and I live in Berlin."),
    Document(content="My name is Giorgio and I live in Rome.")
])

prompt_template = """
Given these documents, answer the question.
Documents:
{% for doc in documents %}
    {{ doc.content }}
{% endfor %}
Question: {{question}}
Answer:
"""

retriever = InMemoryBM25Retriever(document_store=document_store)
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator()

rag_pipeline = Pipeline()
rag_pipeline.add_component("retriever", retriever)
rag_pipeline.add_component("prompt_builder", prompt_builder)
rag_pipeline.add_component("llm", llm)
rag_pipeline.connect("retriever", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder", "llm")

question = "Who lives in Paris?"
results = rag_pipeline.run(
    {
        "retriever": {"query": question},
        "prompt_builder": {"question": question},
    }
)
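
The pipeline returns a dictionary keyed by component name, and the generator's output holds its answers in a `replies` list. The sketch below shows one way to read the first answer defensively. `first_reply` is a hypothetical helper, and `example_results` is a made-up stand-in for the value returned by `rag_pipeline.run(...)` so that the snippet runs without an OpenAI key.

```python
from typing import Any, Dict, Optional


def first_reply(results: Dict[str, Any], component: str = "llm") -> Optional[str]:
    """Return the first generated reply for `component`, or None if there is none."""
    replies = results.get(component, {}).get("replies", [])
    return replies[0] if replies else None


# Stand-in for the pipeline output; the answer text is invented for illustration.
example_results = {"llm": {"replies": ["Jean lives in Paris."], "meta": [{}]}}
print(first_reply(example_results))
```

With the real pipeline, pass `results` from the run above instead of `example_results`.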

Resources