Getting Started with h2oGPTe
The h2ogpte Python client lets users integrate with h2oGPTe.
To access h2oGPTe, use the following URL: https://h2ogpte.genai.h2o.ai
h2oGPTe Configuration
The following steps describe how to prepare the h2oGPTe API key needed to access the server:
1. Log in to https://h2ogpte.genai.h2o.ai.
2. Navigate to Settings > API Keys and create a new API key or copy an existing one.
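To avoid hard-coding the key in scripts, a common pattern is to read it from an environment variable. A minimal sketch; the variable name H2OGPTE_API_KEY is our own convention, not something the client requires:

import os

# Read the key from the environment; the variable name is an arbitrary choice
API_KEY = os.environ["H2OGPTE_API_KEY"]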
Upload Documents, Audio, Video, Images …
h2oGPTe can ingest a wide variety of data, ranging from documents (PDF, text, web pages) to audio, images, and video. Create an h2oGPTe client and use it to upload and ingest documents:
from h2ogpte import H2OGPTE

client = H2OGPTE(
    address="https://h2ogpte.genai.h2o.ai",
    api_key=API_KEY,
)
# Chat with the LLM directly (no document collection attached yet)
chat_session_id = client.create_chat_session()
with client.connect(chat_session_id) as session:
    answer = session.query(
        "Can you hallucinate?",
    ).content
    print(answer)
# Create a collection to hold the ingested documents
collection_id = client.create_collection(
    name="My first h2oGPTe collection",
    description="PDF -> text -> summary",
)

from pathlib import Path

file_path = Path("...path to document(s)...")
with open(file_path.resolve(), "rb") as f:
    upload_id = client.upload(file_path.name, f)

# Convert the upload into chunked text and embeddings
client.ingest_uploads(collection_id, [upload_id])
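Web pages can also be ingested directly into a collection, without a manual upload step. A minimal sketch, assuming the ingest_website method available in recent h2ogpte client versions:

# Ingest a web page straight into the collection
# (assumes ingest_website is available in your h2ogpte client version)
client.ingest_website(collection_id, "https://h2o.ai")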
Talk to Collection
Start a Q&A session:
chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    # Simple question answered from the document collection
    answer = session.query(
        "What was net revenue?",
    ).content
    print(answer)
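Note that session.query returns a full message object; .content is just its text. Keeping the whole object gives access to the message id, which the Self-Reflection section below uses to look up message metadata:

with client.connect(chat_session_id) as session:
    reply = session.query("What was net revenue?")
    print(reply.id)       # message id, used later for metadata lookups
    print(reply.content)  # the answer text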
Advanced Controls for Document Q&A
From the client, you can control the LLM, the RAG type, and the prompting. The client can also return streaming responses.
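The names accepted by the llm argument depend on the server. A quick way to see what is available; get_llms returns one metadata dict per configured model, and the "base_model" field name below is an assumption that may vary by server version:

# List the LLMs configured on the server; each entry is a dict of metadata
for llm in client.get_llms():
    print(llm.get("base_model", llm))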
chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    # Choose a different LLM
    answer = session.query(
        "What was net revenue?",
        llm="gpt-4-0613",
    ).content
    print(answer)

    # No RAG (LLM only) - documents are ignored
    answer = session.query(
        message="Who are you?",
        rag_config={
            "rag_type": "llm_only",
        },
    ).content
    print(answer)

    # RAG (Retrieval-Augmented Generation)
    answer = session.query(
        message="What was net revenue?",
        rag_config={
            "rag_type": "rag",
        },
    ).content
    print(answer)

    # HyDE RAG (Hypothetical Document Embedding)
    hyde1 = session.query(
        message="What was net revenue?",
        rag_config={"rag_type": "hyde1"},
    ).content
    print(hyde1)

    # HyDE RAG+ (combined HyDE and RAG)
    hyde2 = session.query(
        message="What was net revenue?",
        rag_config={"rag_type": "hyde2"},
    ).content
    print(hyde2)

    # Custom system prompt
    answer = session.query(
        message="What was net revenue?",
        system_prompt="YOU ARE THE UPPERCASE MACHINE. YOU ANSWER IN UPPERCASE LETTERS.",
    ).content
    print(answer)

    # Custom RAG prompting (e.g., for foreign languages). The German prompts
    # below ask "What was the revenue?", set the system prompt to "You are a
    # professor of the German language", and restrict the answer to the
    # provided document sources.
    answer = session.query(
        message="Was war der Umsatz?",
        system_prompt="Sie sind ein Professor der Deutschen Sprache.",
        pre_prompt_query="Achten Sie darauf und merken Sie sich die folgenden Informationen, die bei der Beantwortung der Frage oder des Imperativs nach Ende des Kontexts hilfreich sein werden.\n",
        prompt_query="Nur nach den Informationen in den Dokumentenquellen, die im oben genannten Kontext bereitgestellt wurden, ",
    ).content
    print(answer)

    # Streaming: receive the answer incrementally via a callback
    from h2ogpte.types import ChatMessage, PartialChatMessage

    # A mutable holder lets the callback accumulate text without nonlocal/global
    msgs = {"partial": "", "full": ""}

    def callback(message):
        if isinstance(message, ChatMessage):
            msgs["full"] = message.content
        elif isinstance(message, PartialChatMessage):
            msgs["partial"] += message.content
            print(msgs["partial"])

    session.query("What was net revenue?", timeout=60, callback=callback)
Self-Reflection
Ask another LLM to reflect on the answer, given the question and the provided context. This can be used to evaluate the LLM's performance to some extent; it requires a strong reflection LLM.
self_reflection_config = dict(
    llm_reflection="gpt-4-0613",
    prompt_reflection='Here\'s the prompt and the response:\n"""Prompt:\n%s\n"""\n\n"""Response:\n%s\n"""\n\nWhat is the quality of the response for the given prompt? Answer with GOOD or BAD.',
    system_prompt_reflection="",
    llm_args_reflection="{}",
)

chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    # Keep the full reply object: its .id is needed to fetch the reflection
    reply = session.query(
        message="What color are roses?",
        self_reflection_config=self_reflection_config,
    )
    print(reply.content)

    reflection = client.list_chat_message_meta_part(
        reply.id, "self_reflection"
    ).content
    print(reflection)
Summarization of Documents
Create a custom summary of a document:
documents = client.list_documents_in_collection(collection_id, offset=0, limit=99)
document_id = documents[0].id

summary = client.process_document(
    document_id=document_id,
    pre_prompt_summary="Summarize the content below into a list of bullets.\n",
    prompt_summary="Now summarize the above into a couple of paragraphs.",
    max_num_chunks=200,
    sampling_strategy="auto",
)

# Print the summary line by line
for s in summary.content.split("\n"):
    print(s)
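Since list_documents_in_collection returns all matching documents, the same calls can be looped to summarize every document in the collection. A sketch reusing only the calls above; the .name attribute on the returned document objects is an assumption:

# Summarize every document in the collection with the same prompts
for doc in client.list_documents_in_collection(collection_id, offset=0, limit=99):
    summary = client.process_document(
        document_id=doc.id,
        pre_prompt_summary="Summarize the content below into a list of bullets.\n",
        prompt_summary="Now summarize the above into a couple of paragraphs.",
        max_num_chunks=200,
        sampling_strategy="auto",
    )
    print(doc.name, summary.content[:200])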
Summarization of Arbitrary Text Chunks
Create a summary from arbitrary text blocks:
summary = client.summarize_content(
    pre_prompt_summary="Summarize the content below into a list of bullets.\n",
    text_context_list=["long text to summarize", "more text to summarize", "yet more text to summarize"],
    prompt_summary="Now summarize the above into a couple of paragraphs.",
)

# Print the summary line by line
for s in summary.content.split("\n"):
    print(s)
Data Extraction
You can extract data with custom prompts:
extract = client.extract_data(
    text_context_list=["Jack Smith", "Jane Doe and her husband Jim Doe", "This is nothing important"],
    pre_prompt_extract="Pay attention and look at all people. Your job is to collect their names.\n",
    prompt_extract="List all people names as JSON.",
    keep_intermediate_results=True,
)

# One LLM answer per input text chunk
for extract_list_item in extract.content:
    for s in extract_list_item.split("\n"):
        print(s)
The same pattern also yields action plans, translations, hashtags, and more, as sketched below.
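For example, hashtag generation needs only different prompts in the same extract_data call shown above; the input text here is illustrative:

# Hashtag generation with the same extract_data API; only the prompts change
hashtags = client.extract_data(
    text_context_list=["h2oGPTe ingests documents and answers questions about them."],
    pre_prompt_extract="Read the content below carefully.\n",
    prompt_extract="Suggest a short list of hashtags describing this content.",
)
for item in hashtags.content:
    print(item)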