Getting Started with h2oGPTe

The h2ogpte Python client lets you integrate h2oGPTe into your own applications.

To access h2oGPTe, use the following URL: https://h2ogpte.genai.h2o.ai

h2oGPTe Configuration

The following steps describe how to obtain the h2oGPTe API key needed to access the server:

  1. Log in to https://h2ogpte.genai.h2o.ai

  2. Navigate to Settings > API Keys and create a new API key or copy an existing one (one way to store and read the key is sketched below).
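
Rather than hard-coding the key, you can store it in an environment variable and read it at runtime. A minimal sketch (the variable name H2OGPTE_API_KEY is an assumption, not a client convention):

import os

# Assumes the key was exported beforehand, e.g. `export H2OGPTE_API_KEY=...`
API_KEY = os.environ["H2OGPTE_API_KEY"]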

Upload Documents, Audio, Video, Images …

h2oGPTe can ingest a wide variety of data, from documents (PDF, text, web pages) to audio, video, and images. Create an h2oGPTe client and use it to upload and ingest the documents:

from pathlib import Path

from h2ogpte import H2OGPTE

client = H2OGPTE(
    address="https://h2ogpte.genai.h2o.ai",
    api_key=API_KEY)

# Chat with LLM
chat_session_id = client.create_chat_session()
with client.connect(chat_session_id) as session:
    # Ask the LLM a question directly (no collection attached yet)
    answer = session.query(
        "Can you hallucinate?",
    ).content
    print(answer)

# Create Collection
collection_id = client.create_collection(
    name="My first h2oGPTe collection",
    description="PDF -> text -> summary",
)

file_path = Path("...path to document(s)...")
with open(file_path.resolve(), "rb") as f:
    upload_id = client.upload(file_path.name, f)

# Converting the input into chunked text and embeddings...
client.ingest_uploads(collection_id, [upload_id])
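
Ingestion happens server-side. Once it completes, you can sanity-check what landed in the collection by listing its documents, using the same list_documents_in_collection call that appears in the summarization example below (the name attribute on the returned documents is an assumption):

# Verify ingestion by listing the documents now in the collection
documents = client.list_documents_in_collection(collection_id, offset=0, limit=99)
for doc in documents:
    print(doc.id, doc.name)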

Talk to Collection

Start a Q&A session:

chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    # Simple Question for Document Collection
    answer = session.query(
        "What was net revenue?",
    ).content
    print(answer)
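
Chat history is stored with the session, so you can reconnect to the same chat_session_id and ask follow-up questions that build on earlier answers. A minimal sketch using only the query call shown above (the follow-up wording is illustrative):

with client.connect(chat_session_id) as session:
    # The previous exchange is part of the session's context
    follow_up = session.query(
        "How did it change compared to the previous year?",
    ).content
    print(follow_up)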

Advanced Controls for Document Q&A

From the client, you can control the LLM, the RAG type, and the prompting. The client can also stream responses via a callback.

chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    # Choose different LLM
    answer = session.query(
        "What was net revenue?",
        llm="gpt-4-0613",
    ).content
    print(answer)

    # No RAG (LLM only) - documents ignored
    answer = session.query(
        message="Who are you?",
        rag_config={
            "rag_type": "llm_only",
        },
    ).content
    print(answer)

    # RAG (Retrieval Augmented Generation)
    answer = session.query(
        message="What was net revenue?",
        rag_config={
            "rag_type": "rag",
        },
    ).content
    print(answer)

    # HyDE RAG (Hypothetical Document Embedding)
    hyde1 = session.query(
        message="What was net revenue?",
        rag_config={"rag_type": "hyde1"},
    ).content
    print(hyde1)

    # HyDE RAG+ (Combined HyDE+RAG)
    hyde2 = session.query(
        message="What was net revenue?",
        rag_config={"rag_type": "hyde2"},
    ).content
    print(hyde2)

    # Custom System Prompt
    answer = session.query(
        message="What was net revenue?",
        system_prompt="YOU ARE THE UPPERCASE MACHINE. YOU ANSWER IN UPPERCASE LETTERS.",
    ).content
    print(answer)

    # Custom RAG prompting (e.g., for foreign languages)
    answer = session.query(
        message="Was war der Umsatz?",
        system_prompt="Sie sind ein Professor der Deutschen Sprache.",
        pre_prompt_query="Achten Sie darauf und merken Sie sich die folgenden Informationen, die bei der Beantwortung der Frage oder des Imperativs nach Ende des Kontexts hilfreich sein werden.\n",
        prompt_query="Nur nach den Informationen in den Dokumentenquellen, die im oben genannten Kontext bereitgestellt wurden, ",
    ).content
    print(answer)

    # Streaming (the callback receives partial chunks and then the final message)
    from h2ogpte.types import ChatMessage, PartialChatMessage

    def stream_callback(message):
        if isinstance(message, ChatMessage):
            # Final, complete message
            print("\nFull message:", message.content)
        elif isinstance(message, PartialChatMessage):
            # Incremental chunk; print it as it arrives
            print(message.content, end="", flush=True)

    session.query("What was net revenue?", timeout=60, callback=stream_callback)
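
To find out which model names are valid for the llm argument, the client can list the LLMs configured on the server. A quick sketch, assuming your client version exposes get_llms():

# Discover the LLMs available on the server; their names can be passed as `llm=`
for llm in client.get_llms():
    print(llm)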

Self-Reflection

Ask another LLM to reflect on the answer, given the question and the context provided. This can be used to evaluate the LLM's performance to some extent (it requires a strong reflection LLM).

self_reflection_config = dict(
    llm_reflection="gpt-4-0613",
    prompt_reflection='Here\'s the prompt and the response:\n"""Prompt:\n%s\n"""\n\n"""Response:\n%s\n"""\n\nWhat is the quality of the response for the given prompt? Answer with GOOD or BAD.',
    system_prompt_reflection="",
    llm_args_reflection="{}",
)

chat_session_id = client.create_chat_session(collection_id)
with client.connect(chat_session_id) as session:
    reply = session.query(
        message="What color are roses?",
        self_reflection_config=self_reflection_config,
    )
    print(reply.content)

    reflection = client.list_chat_message_meta_part(
        reply.id, "self_reflection"
    ).content
    print(reflection)
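
The prompt_reflection template above contains two %s placeholders, which are filled with the original prompt and the LLM's response before being sent to the reflection LLM. A quick illustration of the substitution using plain Python string formatting (the example strings are made up):

template = self_reflection_config["prompt_reflection"]
print(template % ("What color are roses?", "Roses are typically red."))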

Summarization of Documents

Create a custom summary of a document:

documents = client.list_documents_in_collection(collection_id, offset=0, limit=99)
document_id = documents[0].id
summary = client.process_document(
    document_id=document_id,
    pre_prompt_summary="Summarize the content below into a list of bullets.\n",
    prompt_summary="Now summarize the above into a couple of paragraphs.",
    max_num_chunks=200,
    sampling_strategy="auto",
)

# Summary
for s in summary.content.split("\n"):
    print(s)

Summarization of Arbitrary Text Chunks

Create a summary from arbitrary text blocks:

summary = client.summarize_content(
    pre_prompt_summary="Summarize the content below into a list of bullets.\n",
    text_context_list=["long text to summarize", "more text to summarize", "yet more text to summarize"],
    prompt_summary="Now summarize the above into a couple of paragraphs.",
)

# Summary
for s in summary.content.split("\n"):
    print(s)

Data Extraction

You can extract data with custom prompts:

extract = client.extract_data(
    text_context_list=["Jack Smith", "Jane Doe and her husband Jim Doe", "This is nothing important"],
    pre_prompt_extract="Pay attention and look at all people. Your job is to collect their names.\n",
    prompt_extract="List all people names as JSON.",
    keep_intermediate_results=True,
)

# List of LLM answers per text input
for extract_list_item in extract.content:
    for s in extract_list_item.split("\n"):
        print(s)
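
Since the prompt asks for JSON, the per-chunk answers can often be parsed directly. A defensive sketch (the LLM may wrap the JSON in extra text, so parse failures are handled rather than assumed away):

import json

for extract_list_item in extract.content:
    try:
        names = json.loads(extract_list_item)
        print(names)
    except json.JSONDecodeError:
        print("Not valid JSON, inspect manually:", extract_list_item)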

Similarly, you can use custom prompts to get action plans, translations, hashtags, and more.