Maintained by deepset
Integration: Cerebras
Use LLMs served by Cerebras API
Table of Contents
Overview
Cerebras is the go-to platform for fast and effortless AI training and inference.
Usage
Cerebras API is OpenAI compatible, making it easy to use in Haystack via OpenAI Generators.
Using Generator
Here’s an example of using llama3.1-8b
served via Cerebras to perform question answering on a web page.
You need to set the environment variable CEREBRAS_API_KEY
and choose a
compatible model.
from haystack import Pipeline
from haystack.utils import Secret
from haystack.components.fetchers import LinkContentFetcher
from haystack.components.converters import HTMLToDocument
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
fetcher = LinkContentFetcher()
converter = HTMLToDocument()
prompt_template = """
According to the contents of this website:
{% for document in documents %}
{{document.content}}
{% endfor %}
Answer the given question: {{query}}
Answer:
"""
prompt_builder = PromptBuilder(template=prompt_template)
llm = OpenAIGenerator(
api_key=Secret.from_env_var("CEREBRAS_API_KEY"),
api_base_url="https://api.cerebras.ai/v1",
model="llama3.1-8b"
)
pipeline = Pipeline()
pipeline.add_component("fetcher", fetcher)
pipeline.add_component("converter", converter)
pipeline.add_component("prompt", prompt_builder)
pipeline.add_component("llm", llm)
pipeline.connect("fetcher.streams", "converter.sources")
pipeline.connect("converter.documents", "prompt.documents")
pipeline.connect("prompt.prompt", "llm.prompt")
result = pipeline.run({"fetcher": {"urls": ["https://cerebras.ai/inference"]},
"prompt": {"query": "Why should I use Cerebras for serving LLMs?"}})
print(result["llm"]["replies"][0])
Using ChatGenerator
See an example of engaging in a multi-turn conversation with llama3.1-8b
.
You need to set the environment variable CEREBRAS_API_KEY
and choose a
compatible model.
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret
generator = OpenAIChatGenerator(
api_key=Secret.from_env_var("CEREBRAS_API_KEY"),
api_base_url="https://api.cerebras.ai/v1",
model="llama3.1-8b",
generation_kwargs = {"max_tokens": 512}
)
messages = []
while True:
msg = input("Enter your message or Q to exit\n๐ง ")
if msg=="Q":
break
messages.append(ChatMessage.from_user(msg))
response = generator.run(messages=messages)
assistant_resp = response['replies'][0]
print("๐ค "+assistant_resp.content)
messages.append(assistant_resp)