# Build with Gemma and Haystack 2.x

<img src="https://huggingface.co/blog/assets/gemma/Gemma-logo-small.png" width="200" style="display:inline;">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<img src="https://haystack.deepset.ai/images/haystack-ogimage.png" width="430" style="display:inline;">



We will see what we can build with the new [Google Gemma open models](https://blog.google/technology/developers/gemma-open-models/) and the [Haystack LLM framework](https://haystack.deepset.ai/).

## Installation

In [None]:
! pip install haystack-ai "huggingface_hub>=0.22.0"

## Authorization

- You need a Hugging Face account.
- You need to accept Google's terms at https://huggingface.co/google/gemma-7b-it and wait for access to be granted.

In [2]:
import getpass, os


os.environ["HF_API_TOKEN"] = getpass.getpass("Your Hugging Face token")

Your Hugging Face token··········


## Chat with Gemma (travel assistant) 🛩

For simplicity, we call the model using the free Hugging Face Inference API with the `HuggingFaceAPIChatGenerator`.

(Alternatively, we could load a quantized version of the model in Colab using the `HuggingFaceLocalChatGenerator`.)

In [4]:
from haystack.components.generators.chat import HuggingFaceAPIChatGenerator
from haystack.dataclasses import ChatMessage

generator = HuggingFaceAPIChatGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "google/gemma-7b-it"},
    generation_kwargs={"max_new_tokens": 350})

In [7]:
messages = []

while True:
  msg = input("Enter your message or Q to exit\n🧑 ")
  if msg=="Q":
    break
  messages.append(ChatMessage.from_user(msg))
  response = generator.run(messages=messages)
  assistant_resp = response['replies'][0]
  print("🤖 "+assistant_resp.content)
  messages.append(assistant_resp)

Enter your message or Q to exit
🧑  can you help me planning a trip?
🤖 **Sure, I'd be happy to help you with that. Please provide me with the following information:**

* **What you want to do:** What are your interests and what do you want to see and experience on your trip?
* **Your budget:** How much you are willing to spend on your trip.
* **Your preferred travel dates:** When you would like to travel.
* **Your preferred travel style:** Whether you prefer solo travel, group travel, or a guided tour.
* **Your preferred location:** Where you would like to go.
* **Any specific interests or activities you want to include:** For example, hiking, sightseeing, or exploring the local culture.

Once I have this information, I can help you plan a trip that is tailored to your specific interests and budget. I can also provide you with recommendations for accommodations, transportation, and activities.
Enter your message or Q to exit
🧑 I'm interested in Italy. what can I visit?
🤖 Sur

## RAG with Gemma (about Rock music) 🎸

In [None]:
! pip install wikipedia

### Load data from Wikipedia

In [9]:
favourite_bands="""Audioslave
Blink-182
Dire Straits
Evanescence
Green Day
Muse (band)
Nirvana (band)
Sum 41
The Cure
The Smiths""".split("\n")

In [17]:
from IPython.display import Image
from pprint import pprint
import rich
import random

In [10]:
import wikipedia
from haystack.dataclasses import Document

raw_docs=[]

for title in favourite_bands:
    page = wikipedia.page(title=title, auto_suggest=False)
    doc = Document(content=page.content, meta={"title": page.title, "url":page.url})
    raw_docs.append(doc)

### Indexing Pipeline

In [11]:
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.writers import DocumentWriter
from haystack.document_stores.types import DuplicatePolicy

In [12]:
document_store = InMemoryDocumentStore()

In [13]:
indexing = Pipeline()
indexing.add_component("cleaner", DocumentCleaner())
indexing.add_component("splitter", DocumentSplitter(split_by='sentence', split_length=2))
indexing.add_component("writer", DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE))

indexing.connect("cleaner", "splitter")
indexing.connect("splitter", "writer")

<haystack.pipeline.Pipeline at 0x7f7a11777490>

In [19]:
indexing.run({"cleaner":{"documents":raw_docs}})

{'writer': {'documents_written': 1584}}
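
To get a feel for what `split_by='sentence'` with `split_length=2` produces, here is a rough plain-Python illustration. It is not Haystack's implementation: the function name `split_into_sentence_chunks` and the naive regex sentence boundary are assumptions made for this sketch, and the sample text is a placeholder.

```python
import re

def split_into_sentence_chunks(text, split_length=2):
    """Group sentences into chunks of `split_length` sentences each.

    Hypothetical helper for illustration only; sentence boundaries are
    detected naively as '.', '!' or '?' followed by whitespace.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + split_length])
        for i in range(0, len(sentences), split_length)
    ]

# Placeholder text, not real Wikipedia content.
sample = "Sentence one. Sentence two! Is this sentence three? Sentence four. Sentence five."
chunks = split_into_sentence_chunks(sample, split_length=2)
for chunk in chunks:
    print(repr(chunk))
```

Short chunks like these keep each retrieved passage focused, at the cost of producing many small documents (1584 in the run above).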

In [20]:
document_store.filter_documents()[0].meta

{'title': 'Audioslave',
 'url': 'https://en.wikipedia.org/wiki/Audioslave',
 'source_id': 'e3deff3d39ef107e8b0d69415ea61644b73175086cfbeee03d5f5d6946619fcf'}

### RAG Pipeline

In [29]:
from haystack.components.builders import PromptBuilder

prompt_template = """
<start_of_turn>user
Using the information contained in the context, give a comprehensive answer to the question.
If the answer is contained in the context, also report the source URL.
If the answer cannot be deduced from the context, do not give an answer.

Context:
  {% for doc in documents %}
  {{ doc.content }} URL:{{ doc.meta['url'] }}
  {% endfor %};
  Question: {{query}}<end_of_turn>

<start_of_turn>model
"""
prompt_builder = PromptBuilder(template=prompt_template)
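
To see roughly what the model receives, the sketch below rebuilds the same prompt structure in plain Python. It is only an approximation: `render_gemma_rag_prompt` is a hypothetical helper, the documents are plain dicts standing in for Haystack `Document` objects, the passages are placeholders, and the whitespace will not match the Jinja output exactly.

```python
def render_gemma_rag_prompt(query, docs):
    """Build a single-turn Gemma-style RAG prompt (illustrative only).

    `docs` is a list of dicts with "content" and "url" keys, a stand-in
    for the retrieved Haystack Document objects.
    """
    context = "\n".join(f"  {d['content']} URL:{d['url']}" for d in docs)
    return (
        "<start_of_turn>user\n"
        "Using the information contained in the context, "
        "give a comprehensive answer to the question.\n"
        "If the answer is contained in the context, also report the source URL.\n"
        "If the answer cannot be deduced from the context, do not give an answer.\n\n"
        f"Context:\n{context}\n"
        f"  Question: {query}<end_of_turn>\n\n"
        "<start_of_turn>model\n"
    )

# Placeholder passages; the URL is the one shown in the metadata above.
example_docs = [
    {"content": "Placeholder passage one.", "url": "https://en.wikipedia.org/wiki/Audioslave"},
    {"content": "Placeholder passage two.", "url": "https://en.wikipedia.org/wiki/Audioslave"},
]
example_prompt = render_gemma_rag_prompt("Who was the lead singer of Audioslave?", example_docs)
print(example_prompt)
```

The prompt ends with an open `<start_of_turn>model` turn, so the model's completion is read as its reply to the user turn.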

Here, we use the `HuggingFaceAPIGenerator` rather than the chat generator: RAG needs only a single prompt and a single reply, not a multi-turn conversation.

In [30]:
from haystack.components.generators import HuggingFaceAPIGenerator

generator = HuggingFaceAPIGenerator(
    api_type="serverless_inference_api",
    api_params={"model": "google/gemma-7b-it"},
    generation_kwargs={"max_new_tokens": 500})

In [31]:
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

rag = Pipeline()
rag.add_component("retriever", InMemoryBM25Retriever(document_store=document_store, top_k=5))
rag.add_component("prompt_builder", prompt_builder)
rag.add_component("llm", generator)

rag.connect("retriever.documents", "prompt_builder.documents")
rag.connect("prompt_builder.prompt", "llm.prompt")

<haystack.pipeline.Pipeline at 0x7f7a10ee5180>
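
The retriever ranks documents with BM25, a keyword-based scoring function. The sketch below shows the classic Okapi BM25 formula in plain Python, to make the idea concrete. It is an assumption-laden toy: `bm25_rank`, the whitespace tokenizer, and the `k1`/`b` defaults are choices made for this example, and Haystack's retriever may use a different BM25 variant and tokenization.

```python
import math
from collections import Counter

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (score, doc) pairs sorted by Okapi BM25 score, best first.

    Illustrative helper; assumes `docs` is a non-empty list of strings.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scored = []
    for doc, tokens in zip(docs, tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scored.append((score, doc))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Toy corpus, not taken from the indexed Wikipedia pages.
toy_docs = [
    "the band released a debut album",
    "the singer left the band",
    "guitar solos and drum fills",
]
ranked = bm25_rank("debut album", toy_docs)
for score, doc in ranked:
    print(f"{score:.3f}  {doc}")
```

Because BM25 only matches exact terms, a query phrased with different words than the source text can miss relevant passages.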

### Let's ask some questions!

In [36]:
def get_generative_answer(query):

  results = rag.run({
      "retriever": {"query": query},
      "prompt_builder": {"query": query}
    }
  )

  answer = results["llm"]["replies"][0]
  rich.print(answer)

In [43]:
get_generative_answer("Audioslave was formed by members of two iconic bands. Can you name the bands and discuss the sound of Audioslave in comparison?")


In [83]:
nice_questions_to_try="""What was the original name of Sum 41?
What was the title of Nirvana's breakthrough album released in 1991?
Green Day's "American Idiot" is a rock opera. What's the story it tells?
Audioslave was formed by members of two iconic bands. Can you name the bands and discuss the sound of Audioslave in comparison?
Evanescence's "Bring Me to Life" features a male vocalist. Who is he, and how does his voice complement Amy Lee's in the song?
What is Sum 41's debut studio album called?
Who was the lead singer of Audioslave?
When was Nirvana's first studio album, "Bleach," released?
Were the Smiths an influential band?
What is the name of Evanescence's debut album?
Which band was Morrissey the lead singer of before he formed The Smiths?
Dire Straits' hit song "Money for Nothing" features a guest vocal by a famous artist. Who is this artist?
Who played the song "Like a stone"?""".split('\n')

In [90]:
q=random.choice(nice_questions_to_try)
print(q)
get_generative_answer(q)

What was the original name of Sum 41?



This is a simple demo.
The RAG pipeline could be improved with better retrieval techniques, such as embedding retrieval or hybrid retrieval.
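
As a rough idea of how hybrid retrieval can combine a keyword ranking with an embedding ranking, the sketch below uses reciprocal rank fusion. This technique is not named in the notebook and is an assumption made here for illustration; the function name, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in
    (rank starts at 1); higher total score ranks first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from a keyword retriever and an embedding retriever.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
embedding_ranking = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
print(fused)
```

Documents that appear near the top of both lists end up first, which is the intuition behind combining the two retrieval styles.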

(*Notebook by [Stefano Fiorucci](https://github.com/anakin87)*)