h2ogpte package

Submodules

h2ogpte.h2ogpte module

class h2ogpte.h2ogpte.H2OGPTE(address: str, api_key: str | None = None, token_provider: TokenProvider | None = None, verify: bool | str = True, strict_version_check: bool = False)

Bases: H2OGPTESyncBase

Connect to and interact with an h2oGPTe server.

add_agent_key(agent_keys: List[dict]) → List[dict]

add_role_to_group(group_id: str, roles: List[str]) → List[UserRole]

add_role_to_user(user_id: str, roles: List[str]) → List[UserRole]

add_user_document_permission(user_id: str, document_id: str) → [<class 'str'>, <class 'str'>]

Associates a user with a document they have permission on. Args:

user_id:
String The id of the user that has the permission.

document_id:
String The id of the document that the permission is for.

Returns:: [user_id, document_id]: A tuple containing the user_id and document_id.

Send a message and get a response from an LLM.

Note: This method is only recommended if you are passing a chat conversation or for low-volume testing. For general chat with an LLM, we recommend session.query() for higher throughput in multi-user environments. The following code sample shows the recommended method:
# Establish a chat session
chat_session_id = client.create_chat_session()
# Connect to the chat session
with client.connect(chat_session_id) as session:
    # Send a basic query and print the reply
    reply = session.query("Hello", timeout=60)
    print(reply.content)
Format of inputs content:
{text_context_list}
"""\n{chat_conversation}{question}
Args:

question:
Text query to send to the LLM.

text_context_list:
List of raw text strings to be included, will be converted to a string like this: “

“.join(text_context_list)

system_prompt:

Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto for the model default, or None for h2oGPTe default. Defaults to ‘’ for no system prompt.

pre_prompt_query:

Text that is prepended before the contextual document chunks in text_context_list. Only used if text_context_list is provided.

prompt_query:

Text that is appended after the contextual document chunks in text_context_list. Only used if text_context_list is provided.

llm:

Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:: temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

chat_conversation:

List of tuples for (human, bot) conversation that will be pre-appended to an (question, None) case for a query.

guardrails_settings:

Guardrails Settings.

timeout:

Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:: h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: Answer: The response text and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

archive_collection(collection_id: str) → str

Archive a collection along with its associated data.

Args:

collection_id:: ID of the collection to archive.

assign_agent_key_for_tool(tool_dict_list: List[dict]) → List[Tuple] | None

assign_permissions_to_role(role_name: str, permission_names: Iterable[str]) → Result

bulk_delete_global_configurations(key_names: List[str]) → List[ConfigItem]

bulk_delete_user_configurations_for_user(user_id: str, key_names: List[str]) → List[UserConfigItem]

cancel_job(job_id: str) → Result

Stops a specific job from running on the server.

Args:

job_id:: String id of the job to cancel.

Returns:

Result: Status of canceling the job.

cancel_user_job(job_id: str) → Result

As an admin, stops a specific user job from running on the server.

Args:

job_id:: String id of the job to cancel.

Returns:

Result: Status of canceling the job.

count_assets() → ObjectCount

Counts number of objects owned by the user.

Returns:: ObjectCount: The count of chat sessions, collections, and documents.

count_chat_sessions() → int

Counts number of chat sessions owned by the user.

Returns:: int: The count of chat sessions owned by the user.

count_chat_sessions_for_collection(collection_id: str) → int

Counts number of chat sessions in a specific collection.

Args:

collection_id:: String id of the collection to count chat sessions for.

Returns:

int: The count of chat sessions in that collection.

count_collections() → int

Counts number of collections owned by the user.

Returns:: int: The count of collections owned by the user.

count_documents() → int

Counts number of documents accessed by the user.

Returns:: int: The count of documents accessed by the user.

count_documents_in_collection(collection_id: str) → int

Counts the number of documents in a specific collection.

Args:

collection_id:: String id of the collection to count documents for.

Returns:

int: The number of documents in that collection.

count_documents_owned_by_me() → int

Counts number of documents owned by the user.

Returns:: int: The count of documents owned by the user.

count_prompt_templates() → int

Counts number of prompt templates

Returns:: int: The count of prompt templates

count_question_reply_feedback() → int

Fetch user’s questions and answers with feedback count.

Returns:: int: the count of questions and replies that have a user feedback.

create_api_key_for_user(user_id: str, name: str | None = None, collection_id: str | None = None, expires_in: str | None = None) → str

Allows admins to create a new api key for a specific user and optionally make it specific to a collection. Args:

user_id:
String: The id of the user the API key is for.

name:
(Optional) String: The name of the API key.

collection_id:
(Optional) String: The id of the specific collection.

expires_in:
(Optional) String: The expiration for the API key as an interval. Ex. “30 days” or “30 minutes”

Returns:: String: The id of the API key.

create_chat_session(collection_id: str | None = None) → str

Creates a new chat session for asking questions (of documents).

Args:

collection_id:: String id of the collection to chat with. If None, chat with LLM directly.

Returns:

str: The ID of the newly created chat session.

create_chat_session_on_default_collection() → str

Creates a new chat session for asking questions of documents on the default collection.

Returns:: str: The ID of the newly created chat session.

Creates a new collection.

Args:

name:

Name of the collection.

description:

Description of the collection

embedding_model:

embedding model to use. call list_embedding_models() to list of options.

prompt_template_id:

ID of the prompt template to get the prompts from. None to fall back to system defaults.

collection_settings:

(Optional) Dictionary with key/value pairs to configure certain collection specific settings max_tokens_per_chunk: Approximate max. number of tokens per chunk for text-dominated document pages. For images, chunks can be larger. chunk_overlap_tokens: Approximate number of tokens that are overlapping between successive chunks. gen_doc_summaries: Whether to auto-generate document summaries (uses LLM) gen_doc_questions: Whether to auto-generate sample questions for each document (uses LLM) audio_input_language: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices. ocr_model: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models.

Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. auto - Automatic will auto-select the best OCR model for every page. off - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).

tesseract_lang: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices. keep_tables_as_one_chunk: When tables are identified by the table parser the table tokens will be kept in a single chunk. chunk_by_page: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is true. handwriting_check: Check pages for handwriting. Will use specialized models if handwriting is found. follow_links: Whether to import all web pages linked from this URL will be imported. External links will be ignored. Links to other pages on the same domain will be followed as long as they are at the same level or below the URL you specify. Each page will be transformed into a PDF document. max_depth: Max depth of recursion when following links, only when follow_links is True. Max_depth of 0 means don’t follow any links, max_depth of 1 means follow only top-level links, etc. Use -1 for automatic (system settings). max_documents: Max number of documents when following links, only when follow_links is True. Use None for automatic (system defaults). Use -1 for max (system limit). root_dir: Root directory for document storage copy_document: Whether to copy the document when importing an existing document. guardrails_settings itself is a dictionary of the following keys.

column_redaction_config: list of list for redacting columns from CSV/TSV files (regex_pattern, fill_value) disallowed_regex_patterns: list of regular expressions that match custom PII presidio_labels_to_flag: list of entities to be flagged as PII by the built-in Presidio model. pii_labels_to_flag: list of entities to be flagged as PII by the built-in PII model. pii_detection_parse_action: what to do when PII is detected during parsing of documents. One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content in the ingested documents with redaction bars. pii_detection_llm_input_action: what to do when PII is detected in the input to the LLM (document content and user prompts). One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content with placeholders. pii_detection_llm_output_action: what to do when PII is detected in the output of the LLM. One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content with placeholders. prompt_guard_labels_to_flag: list of entities to be flagged as safety violations in user prompts by the built-in prompt guard model. guardrails_labels_to_flag: list of entities to be flagged as safety violations in user prompts. Must be a subset of guardrails_entities, if provided. guardrails_safe_category: (Optional) name of the safe category for guardrails. Must be a key in guardrails_entities, if provided. Otherwise uses system defaults. guardrails_entities: (Optional) dictionary of entities and their descriptions for the guardrails model to classify. The first entry is the “safe” class, the rest are “unsafe” classes. column_redaction_custom_entities_to_flag: list of entities to redact in tabular data files. Must be a subset of column_redaction_custom_entities, if provided. column_redaction_custom_entities: (Optional) dictionary of entities and a short description for the LLM to check for and redact columns containing PII in tabular data files. guardrails_llm: LLM to use for guardrails and PII detection. Use “auto” for automatic. Use H2OGPTE.get_llms() to see all available options.

Example:

Note: Call client.get_guardrails_settings() to see all options for guardrails_settings.

collection_settings=dict(

max_tokens_per_chunk=320, chunk_overlap_tokens=0, guardrails_settings=dict(

disallowed_regex_patterns=[“secret_disallowed_word”, r”(?!0{3})(?!6{3})[0-8]d{2}-(?!0{2})d{2}-(?!0{4})d{4}”], presidio_labels_to_flag=[“IBAN_CODE”, “US_SSN”], pii_labels_to_flag=[“IBAN”, “SSN”], pii_detection_parse_action=”redact”, pii_detection_llm_input_action=”redact”, pii_detection_llm_output_action=”allow”, prompt_guard_labels_to_flag=[“JAILBREAK”], guardrails_labels_to_flag=[

“Violent Crimes”, “Non-Violent Crimes”, “Intellectual Property”, “Code Interpreter Abuse”,

], guardrails_safe_category=”Safe”, guardrails_entities={

“Safe”: “Messages that do not contain any of the following unsafe content”, “Violent Crimes”: “Messages that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse)”, “Non-Violent Crimes”: “Messages that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking)”, “Defamation”: “Messages that are both verifiably false and likely to injure a living person’s reputation”, “Specialized Advice”: “Messages that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe”, “Intellectual Property”: “Messages that may violate the intellectual property rights of any third party”, “Code Interpreter Abuse”: “Messages that seek to abuse code interpreters, including those that enable denial of service attacks, container escapes or privilege escalation exploits”,

}, column_redaction_custom_entities_to_flag=[

“Mother’s Maiden Name”

], column_redaction_custom_entities={

“Mother’s Maiden Name”: “Mother’s maiden name.”

}, guardrails_llm=”meta-llama/Llama-3.3-70B-Instruct”,

),

)

thumbnail:

(Optional) Path to the thumbnail image for the collection. Must include appropriate file extension.

chat_settings:

(Optional) Dictionary with key/value pairs to configure the default values for certain chat specific settings The following keys are supported, see the client.session() documentation for more details. llm: str llm_args: dict self_reflection_config: dict rag_config: dict include_chat_history: bool tags: list[str]

Returns:

str: The ID of the newly created collection.

create_prompt_template(name: str, description: str | None = None, lang: str | None = None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, hyde_no_rag_llm_prompt_extension: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, system_prompt_reflection: str | None = None, pre_prompt_reflection: str | None = None, prompt_reflection: str | None = None, auto_gen_description_prompt: str | None = None, auto_gen_document_summary_pre_prompt_summary: str | None = None, auto_gen_document_summary_prompt_summary: str | None = None, auto_gen_document_sample_questions_prompt: str | None = None, default_sample_questions: List[str] | None = None, image_batch_image_prompt: str | None = None, image_batch_final_prompt: str | None = None) → str

Create a new prompt template

Args:

name:: Name of the prompt template
description:: Description of the prompt template
lang:: Language code
system_prompt:: System Prompt
pre_prompt_query:: Text that is prepended before the contextual document chunks.
prompt_query:: Text that is appended to the beginning of the user’s message.
hyde_no_rag_llm_prompt_extension:: LLM prompt extension.
pre_prompt_summary:: Prompt that goes before each large piece of text to summarize
prompt_summary:: Prompt that goes after each large piece of text to summarize
system_prompt_reflection:: System Prompt for self-reflection
pre_prompt_reflection:: Deprecated - ignored
prompt_reflection:: Template for self-reflection, must contain two occurrences of %s for full previous prompt (including system prompt, document related context and prompts if applicable, and user prompts) and answer
auto_gen_description_prompt:: prompt to create a description of the collection.
auto_gen_document_summary_pre_prompt_summary:: pre_prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_summary_prompt_summary:: prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_sample_questions_prompt:: prompt to create sample questions for a freshly imported document (if enabled).
default_sample_questions:: default sample questions in case there are no auto-generated sample questions.
image_batch_final_prompt:: Prompt for each image batch for vision models
image_batch_image_prompt:: Prompt to reduce all answers each image batch for vision models

Returns:

str: The ID of the newly created prompt template.

create_tag(tag_name: str) → str

Creates a new tag.

Args:

tag_name:: String representing the tag to create.

Returns:

String: The id of the created tag.

create_topic_model(collection_id: str, timeout: float | None = None) → Job

create_user_group(name: str, description: str) → UserGroup

create_user_role(name: str, description: str) → UserRole

deactivate_api_key(api_key_id: str) → Result

Allows admins to deactivate an API key.

Note: You cannot undo this action.

Args:

api_key_id:: String: The id of the API key.

Returns:

Result: Status of the deactivate request.

delete_agent_directories(chat_session_id: str) → bool

delete_agent_keys(key_ids: List[str]) → None

delete_agent_server_files(chat_session_id: str) → bool

delete_agent_tool_association(associate_ids: List[str]) → int

delete_agent_tool_preference() → None

delete_api_keys(api_key_ids: List[str]) → Result

Allows admins to delete API keys.

Args:

api_key_ids:: List[str]: The API keys to delete.

Returns:

Result: Status of the delete request.

delete_chat_messages(chat_message_ids: Iterable[str]) → Result

Deletes specific chat messages.

Args:

chat_message_ids:: List of string ids of chat messages to delete from the system.

Returns:

Result: Status of the delete job.

delete_chat_sessions(chat_session_ids: Iterable[str], timeout: float | None = None) → Job

Deletes chat sessions and related messages.

Args:

chat_session_ids:: List of string ids of chat sessions to delete from the system.
timeout:: Timeout in seconds.

Returns:

Result: The delete job.

delete_collections(collection_ids: Iterable[str], timeout: float | None = None) → Job

Deletes collections from the environment.

Documents in the collection that are owned by other users will not be deleted.

Args:

collection_ids:: List of string ids of collections to delete from the system.
timeout:: Timeout in seconds.

delete_collections_as_admin(collection_ids: Iterable[str], timeout: float | None = None) → Job

Deletes collections and their associated data from the environment (needs appropriate permission).

Args:

collection_ids:: List of string ids of collections to delete from the system.
timeout:: Timeout in seconds.

delete_document_summaries(summaries_ids: Iterable[str]) → Result

Deletes document summaries.

Args:

summaries_ids:: List of string ids of a document summary to delete from the system.

Returns:

Result: Status of the delete job.

delete_documents(document_ids: Iterable[str], timeout: float | None = None) → Job

Deletes documents from the system.

Args:

document_ids:: List of string ids to delete from the system and all collections.
timeout:: Timeout in seconds.

delete_documents_from_collection(collection_id: str, document_ids: Iterable[str], timeout: float | None = None) → Job

Removes documents from a collection.

See Also: H2OGPTE.delete_documents for completely removing the document from the environment.

Args:

collection_id:: String of the collection to remove documents from.
document_ids:: List of string ids to remove from the collection.
timeout:: Timeout in seconds.

delete_multiple_agent_directories(chat_session_ids: List[str], dir_types: List[str]) → bool

delete_prompt_templates(ids: Iterable[str]) → Result

Deletes prompt templates

Args:

ids:: List of string ids of prompte templates to delete from the system.

Returns:

Result: Status of the delete job.

delete_upload(upload_id: str) → str

Delete a file previously uploaded with the “upload” method.

See Also:

upload: Upload the files into the system to then be ingested into a collection. ingest_uploads: Add the uploaded files to a collection.

Args:

upload_id:: ID of a file to remove

Returns:

upload_id: The upload id of the removed.

Raises:

Exception: The delete upload request was unsuccessful.

delete_user_groups_by_ids(groups_ids: Iterable[str]) → Result

delete_user_groups_by_names(groups_names: Iterable[str]) → Result

delete_user_roles_by_ids(roles_ids: Iterable[str]) → Result

delete_user_roles_by_names(roles_names: Iterable[str]) → Result

download_document(destination_directory: str | Path, destination_file_name: str, document_id: str) → Path

Downloads a document to a local system directory.

Args:

destination_directory:: Destination directory to save file into.
destination_file_name:: Destination file name.
document_id:: Document ID.

Returns:

Path: Path of downloaded document

download_reference_highlighting(message_id: str, destination_directory: str, output_type: str = 'combined', limit: int | None = None) → list

Get PDFs with reference highlighting

Args:

message_id:: ID of the message to get references from
destination_directory:: Destination directory to save files into.
output_type: str one of: "combined" Generates a PDF file for each source document, with all relevant chunks highlighted in each respective file. This option consolidates all highlights for each source document into a single PDF, making it easy to view all highlights related to that document at once. "split" Generates a separate PDF file for each chunk, with only the relevant chunk highlighted in each file. This option is useful for focusing on individual sections without interference from other parts of the text. The output files names will be in the format “{document_id}_{chunk_id}.pdf”
limit:: The number of references to consider based on the highest confidence scores.

Returns:

list[Path]: List of paths of downloaded documents with highlighting

encode_for_retrieval(chunks: Iterable[str], embedding_model: str | None = None) → List[List[float]]

Encode texts for semantic searching.

See Also: H2OGPTE.match for getting a list of chunks that semantically match each encoded text.

Args:

chunks:: List of strings of texts to be encoded.
embedding_model:: embedding model to use. call list_embedding_models() to list of options.

Returns:

List of list of floats: Each list in the list is the encoded original text.

Extract information from one or more contexts using an LLM.

pre_prompt_extract and prompt_extract variables must be used together. If these variables are not set, the inputs texts will be summarized into bullet points.

Format of extract content:
"{pre_prompt_extract}"""
{text_context_list}
"""\n{prompt_extract}"
Examples:
extract = h2ogpte.extract_data(
    text_context_list=chunks,
    pre_prompt_extract="Pay attention and look at all people. Your job is to collect their names.\n",
    prompt_extract="List all people's names as JSON.",
)
Args:

text_context_list:
List of raw text strings to extract data from.

system_prompt:
Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto or None for the model default. Defaults to ‘’ for no system prompt.

pre_prompt_extract:
Text that is prepended before the list of texts. If not set, the inputs will be summarized.

prompt_extract:
Text that is appended after the list of texts. If not set, the inputs will be summarized.

llm:
Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:
temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

guardrails_settings:
Guardrails Settings.

timeout:
Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:
h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: ExtractionAnswer: The list of text responses and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

get_agent_key_tool_associations() → List[dict]

get_agent_keys() → List[dict]

get_agent_server_files(chat_session_id: str) → List[dict]

get_agent_tool_preference() → List[str]

get_agent_tools_dict() → dict

get_all_directory_stats(chat_session_id: str, detail_level: int = 0) → dict

get_chat_session_prompt_template(chat_session_id: str) → PromptTemplate | None

Get the prompt template for a chat_session

Args:

chat_session_id:: ID of the chat session

Returns:

str: ID of the prompt template.

get_chat_session_questions(chat_session_id: str, limit: int) → List[SuggestedQuestion]

List suggested questions

Args:

chat_session_id:: A chat session ID of which to return the suggested questions
limit:: How many questions to return.

Returns:

List: A list of questions.

get_chunks(collection_id: str, chunk_ids: Iterable[int]) → List[Chunk]

Get the text of specific chunks in a collection.

Args:

collection_id:: String id of the collection to search in.
chunk_ids:: List of ints for the chunks to return. Chunks are indexed starting at 1.

Returns:

Chunk: The text of the chunk.

Raises:

Exception: One or more chunks could not be found.

get_collection(collection_id: str) → Collection

Get metadata about a collection.

Args:

collection_id:: String id of the collection to search for.

Returns:

Collection: Metadata about the collection.

Raises:

KeyError: The collection was not found.

get_collection_for_chat_session(chat_session_id: str) → Collection

Get metadata about the collection of a chat session.

Args:

chat_session_id:: String id of the chat session to search for.

Returns:

Collection: Metadata about the collection.

get_collection_prompt_template(collection_id: str) → PromptTemplate | None

Get the prompt template for a collection

Args:

collection_id:: ID of the collection

Returns:

str: ID of the prompt template.

get_collection_questions(collection_id: str, limit: int) → List[SuggestedQuestion]

List suggested questions

Args:

collection_id:: A collection ID of which to return the suggested questions
limit:: How many questions to return.

Returns:

List: A list of questions.

get_default_collection() → CollectionInfo

Get the default collection, to be used for collection API-keys.

Returns:: CollectionInfo: Default collection info.

get_directory_stats(directory_name: str, chat_session_id: str, detail_level: int = 0) → dict

get_document(document_id: str, include_layout: bool = False) → Document

Fetches information about a specific document.

Args:

document_id:: String id of the document.
include_layout:: Include the layout prediction results.

Returns:

Document: Metadata about the Document.

Raises:

KeyError: The document was not found.

get_document_content(file_name: str, document_id: str) → bytes

Downloads a document and return its content as a byte array.

Args:

file_name:: File name.
document_id:: Document ID.

Returns:

Path: File content

get_global_configurations() → List[ConfigItem]

get_global_configurations_by_admin() → List[ConfigItem]

get_guardrails_settings(action: str = 'redact', sensitive: bool = True, non_sensitive: bool = True, all_guardrails: bool = True, guardrails_settings: dict | None = None) → Dict[str, str | List[str]]: Helper to get reasonable (easy to use) defaults for Guardrails/PII settings. To be further customized. :param action: what to do when detecting PII, either “redact” or “fail” (“allow” would keep PII intact). Guardrails models always fail upon detecting safety violations. :param sensitive: whether to include the most sensitive PII entities like SSN, bank account info :param non_sensitive: whether to include all non-sensitive PII entities, such as IP addresses, locations, names, e-mail addresses etc. :param all_guardrails: whether to include all possible entities for prompt guard and guardrails models, or just system defaults :param guardrails_settings: existing guardrails settings (e.g., from collection settings) to obtain guardrails entities, guardrails_entities_to_flag, column redaction custom_pii_entities, column_redaction_pii_to_flag from instead of system defaults :return: dictionary to pass to collection creation or process_document method

get_h2ogpt_system_stats() → dict

get_job(job_id: str) → Job

Fetches information about a specific job.

Args:

job_id:: String id of the job.

Returns:

Job: Metadata about the Job.

get_llm_and_auto_reasoning_llm_names() → Dict[str, str]

Get mapping of llm to its reasoning_model when [“auto”] is passed as visible_reasoning_models

Returns:: dictionary {‘llm1’: ‘llm1_reasoning_llm’, etc.}

get_llm_and_auto_vision_llm_names() → Dict[str, str]

Get mapping of llm to its vision_model when [“auto”] is passed as visible_vision_models

Returns:: dictionary {‘llm1’: ‘llm1_vision_llm’, etc.}

get_llm_name_for_rest(llm: str | int | None)

get_llm_names() → List[str]

Lists names of available LLMs in the environment.

Returns:: list of string: Name of each available model.

get_llm_performance_by_llm(interval: str) → List[LLMPerformance]

get_llm_usage_24h() → float

get_llm_usage_24h_by_llm() → List[LLMUsage]

get_llm_usage_24h_with_limits() → LLMUsageLimit

get_llm_usage_6h() → float

get_llm_usage_6h_by_llm() → List[LLMUsage]

get_llm_usage_by_llm(interval: str) → List[LLMUsage]

get_llm_usage_by_llm_and_user(interval: str) → List[LLMWithUserUsage]

get_llm_usage_by_user(interval: str) → List[UserWithLLMUsage]

get_llm_usage_with_limits(interval: str) → LLMUsageLimit

get_llms() → List[Dict[str, Any]]

Lists metadata information about available LLMs in the environment.

Returns:: list of dict (string, ANY): Name and details about each available model.

get_meta() → Meta

Returns information about the environment and the user.

Returns:: Meta: Details about the version and license of the environment and the user’s name and email.

get_prompt_template(id: str | None = None) → PromptTemplate

Get a prompt template

Args:

id:: String id of the prompt template to retrieve or None for default

Returns:

PromptTemplate: prompts

Raises:

KeyError: The prompt template was not found.

get_reasoning_capable_llm_names() → List[str]

Lists names of available reasoning-capable (that can natively reason) in the environment.

Returns:: list of string: Name of each available model.

get_scheduler_stats() → SchedulerStats

Count the number of global, pending jobs on the server.

Returns:: SchedulerStats: The queue length for number of jobs.

get_tag(tag_name: str) → Tag

Returns an existing tag.

Args:

tag_name:: String The name of the tag to retrieve.

Returns:

Tag: The requested tag.

Raises:

KeyError: The tag was not found.

get_user_all_agent_directories(offset: int, limit: int, filter_text: str | None) → List[dict]

get_user_configurations() → List[UserConfigItem]

Gets the user configurations for the current user.

Returns:: List[UserConfigItem]: List of user configurations.

get_user_configurations_for_user(user_id: str) → List[UserConfigItem]

Gets the user configurations for a specific user (to be used by admins only).

Args:

user_id:: The unique identifier of the user.

Returns:

List[UserConfigItem]: List of user configurations.

get_vision_capable_llm_names() → List[str]

Lists names of available vision-capable multi-modal LLMs (that can natively handle images as input) in the environment.

Returns:: list of string: Name of each available model.

Import all documents from a collection into an existing collection

Args:

collection_id:: Collection ID to add documents to.
src_collection_id:: Collection ID to import documents from.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
copy_document:: Whether to save a new copy of the document
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Import an already stored document to an existing collection

Args:

collection_id:: Collection ID to add documents to.
document_id:: Document ID to add.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
copy_document:: Whether to save a new copy of the document
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

ingest_agent_only_to_standard(collection_id: str, document_id: str, gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, audio_input_language: str | None = None, ocr_model: str | None = None, restricted: bool = False, permissions: List[SharePermission] | None = None, tesseract_lang: str | None = None, keep_tables_as_one_chunk: bool | None = None, chunk_by_page: bool | None = None, handwriting_check: bool | None = None, timeout: float | None = None)

For files uploaded in “agent_only” ingest mode, convert to PDF and parse

See Also:

upload: Upload the files into the system to then be ingested into a collection. delete_upload: Delete uploaded file

Args:

collection_id:: String id of the collection to add the ingested documents into.
document_id:: ID of document to be parsed.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
restricted:: Whether the document should be restricted only to certain users.
permissions:: List of permissions. Each permission is a SharePermission object.
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.

Add files from the Azure Blob Storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
container:: Name of the Azure Blob Storage container.
path:: Path or list of paths to files or directories within an Azure Blob Storage container. Examples: file1, dir1/file2, dir3/dir4/
account_name:: Name of a storage account
credentials:: The object with Azure credentials. If the object is not provided, only a public container will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Metadata to be associated with the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Add files from the local system into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
root_dir:: String path of where to look for files.
glob:: String of the glob pattern used to match files in the root directory.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Add files from the Google Cloud Storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: The path or list of paths of GCS files or directories. Examples: gs://bucket/file, gs://bucket/../dir/
credentials:: The object holding a path to a JSON key of Google Cloud service account. If the object is not provided, only public buckets will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

ingest_from_plain_text(collection_id: str, plain_text: str, file_name: str, gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, metadata: Dict[str, Any] | None = None, timeout: float | None = None)

Add plain text to a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
plain_text:: String of the plain text to ingest.
file_name:: String of the file name to use for the document.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds

Add files from the AWS S3 storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: The path or list of paths of S3 files or directories. Examples: s3://bucket/file, s3://bucket/../dir/
region:: The name of the region used for interaction with AWS services.
credentials:: The object with S3 credentials. If the object is not provided, only public buckets will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

ingest_uploads(collection_id: str, upload_ids: Iterable[str], gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, audio_input_language: str | None = None, ocr_model: str | None = None, restricted: bool = False, permissions: List[SharePermission] | None = None, tesseract_lang: str | None = None, keep_tables_as_one_chunk: bool | None = None, chunk_by_page: bool | None = None, handwriting_check: bool | None = None, metadata: Dict[str, Any] | None = None, timeout: float | None = None, ingest_mode: str | None = None) → Job

Add uploaded documents into a specific collection.

See Also:

upload: Upload the files into the system to then be ingested into a collection. delete_upload: Delete uploaded file

Args:

collection_id:: String id of the collection to add the ingested documents into.
upload_ids:: List of string ids of each uploaded document to add to the collection.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
restricted:: Whether the document should be restricted only to certain users.
permissions:: List of permissions. Each permission is a SharePermission object.
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Metadata to be associated with the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Crawl and ingest a URL into a collection.

The web page or document linked from this URL will be imported.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: String of the url to crawl.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
follow_links:: Whether to import all web pages linked from this URL will be imported. External links will be ignored. Links to other pages on the same domain will be followed as long as they are at the same level or below the URL you specify. Each page will be transformed into a PDF document.
max_depth:: Max depth of recursion when following links, only when follow_links is True. Max_depth of 0 means don’t follow any links, max_depth of 1 means follow only top-level links, etc. Use -1 for automatic (system settings).
max_documents:: Max number of documents when following links, only when follow_links is True. Use None for automatic (system defaults). Use -1 for max (system limit).
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

is_collection_permission_granted(collection_id: str, permission: str) → bool

is_permission_granted(permission: str) → bool

list_all_api_keys(offset: int, limit: int, key_filter: str = '') → List[APIKey]

Allows admins to list all the API keys that exist.

Args:

offset:: Int: How many keys to skip before returning.
limit:: Int: How many keys to return.
key_filter:: String: Only returns keys for usernames matching this filter.

Returns:

List[APIKey]: List of APIKeys with metadata about each key.

list_all_collections_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[CollectionInfo]

Fetch all users’ collection metadata sorted by last update time.

This is for admin use only and includes private, public, and shared collections in the result.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of CollectionInfo: Metadata about each collection.

list_all_jobs() → List[UserJobs]

List all jobs (to be used by admins only).

Returns:: list of UserJobs

list_all_tags() → List[Tag]

Lists all existing tags.

Returns:: List of Tags: List of existing tags.

list_chat_message_meta_part(message_id: str, info_type: str) → ChatMessageMeta

Fetch one chat message meta information.

Args:

message_id:: Message id to which the metadata should be pulled.
info_type:: Metadata type to fetch. Valid choices are: “self_reflection”, “usage_stats”, “prompt_raw”, “llm_only”, “hyde1”, “py_client_code”

Returns:

ChatMessageMeta: Metadata information about the chat message.

list_chat_message_references(message_id: str, limit: int | None = None) → List[ChatMessageReference]

Fetch metadata for references of a chat message.

References are only available for messages sent from an LLM, an empty list will be returned for messages sent by the user.

Args:

message_id:: String id of the message to get references for.
limit:: The number of references to consider based on the highest confidence scores.

Returns:

list of ChatMessageReference: Metadata including the document name, polygon information, and score.

list_chat_messages(chat_session_id: str, offset: int, limit: int) → List[ChatMessage]

Fetch chat message and metadata for messages in a chat session.

Messages without a reply_to are from the end user, messages with a reply_to are from an LLM and a response to a specific user message.

Args:

chat_session_id:: String id of the chat session to filter by.
offset:: How many chat messages to skip before returning.
limit:: How many chat messages to return.

Returns:

list of ChatMessage: Text and metadata for chat messages.

list_chat_messages_full(chat_session_id: str, offset: int, limit: int) → List[ChatMessageFull]

Fetch chat message and metadata for messages in a chat session.

Messages without a reply_to are from the end user, messages with a reply_to are from an LLM and a response to a specific user message.

Args:

chat_session_id:: String id of the chat session to filter by.
offset:: How many chat messages to skip before returning.
limit:: How many chat messages to return.

Returns:

list of ChatMessageFull: Text and metadata for chat messages.

list_chat_sessions_for_collection(collection_id: str, offset: int, limit: int) → List[ChatSessionForCollection]

Fetch chat session metadata for chat sessions in a collection.

Args:

collection_id:: String id of the collection to filter by.
offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionForCollection: Metadata about each chat session including the latest message.

list_chat_sessions_for_document(document_id: str, offset: int, limit: int) → List[ChatSessionForDocument]

Fetch chat session metadata for chat session that produced a specific document (typically through agents).

Args:

document_id:: String id of the document to filter by.
offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionForDocument: Metadata about each chat session including the latest message.

list_collection_group_permissions(collection_id: str) → List[GroupSharePermission]

Returns a list of group access permissions for a given collection.

The returned list of group permissions denoting which groups have access to the collection and their access level.

Args:

collection_id:: ID of the collection to inspect.

Returns:

list of GroupSharePermission: Group sharing permissions list for the given collection.

list_collection_permissions(collection_id: str) → List[SharePermission]

Returns a list of access permissions for a given collection.

The returned list of permissions denotes who has access to the collection and their access level.

Args:

collection_id:: ID of the collection to inspect.

Returns:

list of SharePermission: Sharing permissions list for the given collection.

list_collections_for_document(document_id: str, offset: int, limit: int) → List[CollectionInfo]

Fetch metadata about each collection the document is a part of.

At this time, each document will only be available in a single collection.

Args:

document_id:: String id of the document to search for.
offset:: How many collections to skip before returning.
limit:: How many collections to return.

Returns:

list of CollectionInfo: Metadata about each collection.

list_document_chunks(document_id: str, collection_id: str | None = None) → List[SearchResult]

Returns all chunks for a specific document.

Args:

document_id:: ID of the document.
collection_id:: ID of the collection the document belongs to. If not specified, an arbitrary collections containing the document is chosen.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

list_documents_from_tags(collection_id: str, tags: List[str]) → List[Document]

Lists documents that have the specified set of tags within a collection. Args:

collection_id:
String The id of the collection to find documents in.

tags:
List of Strings representing the tags to retrieve documents for.

Returns:: List of Documents: All the documents with the specified tags.

list_documents_in_collection(collection_id: str, offset: int, limit: int, metadata_filter: dict = {}) → List[DocumentInfo]

Fetch document metadata for documents in a collection.

Args:

collection_id:: String id of the collection to filter by.
offset:: How many documents to skip before returning.
limit:: How many documents to return.
metadata_filter:: Metadata filter to apply to the documents.

Returns:

list of DocumentInfo: Metadata about each document.

list_embedding_models() → List[str]

list_group_permissions(group_id: str) → List[UserPermission]

list_group_permissions_by_name(group_names: List[str]) → List[UserPermission]

list_group_roles(group_id: str) → List[UserRole]

list_jobs() → List[Job]

List the user’s jobs.

Returns:: list of Job:

list_list_chat_message_meta(message_id: str) → List[ChatMessageMeta]

Fetch chat message meta information.

Args:

message_id:: Message id to which the metadata should be pulled.

Returns:

list of ChatMessageMeta: Metadata about the chat message.

list_prompt_group_permissions(prompt_id: str) → List[GroupSharePermission]

Returns a list of group access permissions for a given prompt template.

The returned list of group permissions denoting which groups have access to the prompt template.

Args:

prompt_id:: ID of the prompt template to inspect.

Returns:

list of GroupSharePermission: Group sharing permissions list for the given prompt template.

list_prompt_permissions(prompt_id: str) → List[SharePermission]

Returns a list of access permissions for a given prompt template.

The returned list of permissions denotes who has access to the prompt template and their access level.

Args:

prompt_id:: ID of the prompt template to inspect.

Returns:

list of SharePermission: Sharing permissions list for the given prompt template.

list_question_reply_feedback_data(offset: int, limit: int) → List[QuestionReplyData]

Fetch user’s questions and answers that have a feedback.

Questions and answers with metadata and feedback information.

Args:

offset:: How many conversations to skip before returning.
limit:: How many conversations to return.

Returns:

list of QuestionReplyData: Metadata about questions and answers.

list_recent_chat_sessions(offset: int, limit: int) → List[ChatSessionInfo]

Fetch user’s chat session metadata sorted by last update time.

Chats across all collections will be accessed.

Args:

offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionInfo: Metadata about each chat session including the latest message.

list_recent_collections(offset: int, limit: int) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_collections_filter(offset: int, limit: int, current_user_only: bool = False, name_filter: str = '') → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time with filter options.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
current_user_only:: When true, will only return the user owned collections.
name_filter:: Only returns collections with names matching this filter.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_collections_metadata_filter(offset: int, limit: int, current_user_only: bool, metadata_filter: dict) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time with a filter on metadata.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
current_user_only:: When true, will only return the user owned collections.
metadata_filter:: Only returns collections with metadata matching this filter.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_collections_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_document_summaries(document_id: str, offset: int, limit: int) → List[ProcessedDocument]

Fetches recent document summaries/extractions/transformations

Args:

document_id:: document ID for which to return summaries
offset:: How many summaries to skip before returning summaries.
limit:: How many summaries to return.

list_recent_documents(offset: int, limit: int, metadata_filter: dict = {}) → List[DocumentInfo]

Fetch user’s document metadata sorted by last update time.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.
metadata_filter:: Metadata filter to apply to the documents.

Returns:

list of DocumentInfo: Metadata about each document.

list_recent_documents_with_summaries(offset: int, limit: int) → List[DocumentInfoSummary]

Fetch user’s document metadata sorted by last update time, including the latest document summary.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.

Returns:

list of DocumentInfoSummary: Metadata about each document.

list_recent_documents_with_summaries_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[DocumentInfoSummary]

Fetch user’s document metadata sorted by last update time, including the latest document summary.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of DocumentInfoSummary: Metadata about each document.

list_recent_prompt_templates(offset: int, limit: int) → List[PromptTemplate]

Fetch user’s prompt templates sorted by last update time.

Args:

offset:: How many prompt templates to skip before returning.
limit:: How many prompt templates to return.

Returns:

list of PromptTemplate: set of prompts

list_recent_prompt_templates_sort(offset: int, limit: int, sort_column: str, ascending: bool, template_type: str = 'all', filter: str = '') → List[PromptTemplate]

Fetch user’s prompt templates sorted by last update time.

Args:

offset:: How many prompt templates to skip before returning.
limit:: How many prompt templates to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.
template_type:: When set, will be used as a type filter, possible values are: all, user, system.
filter:: When set, will be used as a filter on some prompt template columns.

Returns:

list of PromptTemplate: set of prompts

list_system_groups() → List[UserGroup]

list_system_permissions() → List[UserPermission]

list_system_roles() → List[UserRole]

list_upload() → List[str]

List pending file uploads to the H2OGPTE backend.

Uploaded files are not yet accessible and need to be ingested into a collection.

See Also:: upload: Upload the files into the system to then be ingested into a collection. ingest_uploads: Add the uploaded files to a collection. delete_upload: Delete uploaded file
Returns:: List[str]: The pending upload ids to be used in ingest jobs.
Raises:: Exception: The upload list request was unsuccessful.

list_user_permissions(user_id: str | None = None) → List[UserPermission]

list_user_role_permissions(roles: List[str]) → List[UserPermission]

list_user_roles(user_id: str | None = None) → List[UserRole]

list_users(offset: int, limit: int) → List[User]

List system users.

Returns a list of all registered users fo the system, a registered user, is a users that has logged in at least once.

Args:

offset:: How many users to skip before returning.
limit:: How many users to return.

Returns:

list of User: Metadata about each user.

make_collection_private(collection_id: str)

Make a collection private

Once a collection is private, other users will no longer be able to access chat history or documents related to the collection.

Args:

collection_id:: ID of the collection to make private.

make_collection_public(collection_id: str)

Make a collection public

Once a collection is public, it will be accessible to all authenticated users of the system.

Args:

collection_id:: ID of the collection to make public.

match_chunks(collection_id: str, vectors: List[List[float]], topics: List[str], offset: int, limit: int, cut_off: float = 0, width: int = 0) → List[SearchResult]

Find chunks related to a message using semantic search.

Chunks are sorted by relevance and similarity score to the message.

See Also: H2OGPTE.encode_for_retrieval to create vectors from messages.

Args:

collection_id:: ID of the collection to search within.
vectors:: A list of vectorized message for running semantic search.
topics:: A list of document_ids used to filter which documents in the collection to search.
offset:: How many chunks to skip before returning chunks.
limit:: How many chunks to return.
cut_off:: Exclude matches with distances higher than this cut off.
width:: How many chunks before and after a match to return - not implemented.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

Processes a document to either create a global or piecewise summary/extraction/transformation of a document.

Effective prompt created (excluding the system prompt):

"{pre_prompt_summary}
"""
{text from document}
"""
{prompt_summary}"

Args:

document_id:

String id of the document to create a summary from.

system_prompt:

System Prompt

pre_prompt_summary:

Prompt that goes before each large piece of text to summarize

prompt_summary:

Prompt that goes after each large piece of text to summarize

image_batch_final_prompt:

Prompt for each image batch for vision models

image_batch_image_prompt:

Prompt to reduce all answers each image batch for vision models

llm:

LLM to use

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:: temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. seed (int, default: 0) — The seed for the random number generator when sampling during generation (if temp>0 or top_k>1 or top_p<1), seed=0 picks a random seed. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. enable_vision (str, default: “auto”) - Controls vision mode, send images to the LLM in addition to text chunks. Only if have models that support vision, use get_vision_capable_llm_names() to see list. One of [“on”, “off”, “auto”]. visible_vision_models (List[str], default: [“auto”]) - Controls which vision model to use when processing images. Use get_vision_capable_llm_names() to see list. Must provide exactly one model. [“auto”] for automatic.

max_num_chunks:

Max limit of chunks to send to the summarizer

sampling_strategy:

How to sample if the document has more chunks than max_num_chunks. Options are “auto”, “uniform”, “first”, “first+last”, default is “auto” (a hybrid of them all).

pages:

List of specific pages (of the ingested document in PDF form) to use from the document. 1-based indexing.

schema:

Optional JSON schema to use for guided json generation.

keep_intermediate_results:

Whether to keep intermediate results. Default: disabled. If disabled, further LLM calls are applied to the intermediate results until one global summary is obtained: map+reduce (i.e., summary). If enabled, the results’ content will be a list of strings (the results of applying the LLM to different pieces of document context): map (i.e., extract).

guardrails_settings:

Guardrails Settings.

meta_data_to_include:

A dictionary containing flags that indicate whether each piece of document metadata is to be included as part of the context given to the LLM. Only used if enable_vision is disabled. Default is {

“name”: True, “text”: True, “page”: True, “captions”: True, “uri”: False, “connector”: False, “original_mtime”: False, “age”: False, “score”: False,

}

timeout:

Amount of time in seconds to allow the request to run. The default is 86400 seconds.

Returns:

ProcessedDocument: Processed document. The content is either a string (keep_intermediate_results=False) or a list of strings (keep_intermediate_results=True).

Raises:

TimeoutError: The request did not complete in time. SessionError: No summary or extraction created. Document wasn’t part of a collection, or LLM timed out, etc.

remove_collection_expiry_date(collection_id: str) → str

Remove an expiry date from a collection.

Args:

collection_id:: ID of the collection to update.

remove_collection_inactivity_interval(collection_id: str) → str

Remove an inactivity interval for a collection.

Args:

collection_id:: ID of the collection to update.

remove_collection_size_limit(collection_id: str) → str

Remove a size limit for a collection.

Args:

collection_id:: ID of the collection to update.

remove_collection_thumbnail(collection_id: str, timeout: float | None = None)

Remove a thumbnail from a collection.

Args:

collection_id:: Collection you want to remove the thumbnail from.
timeout:: Amount of time in seconds to allow the request to run. The default is 86400 seconds.

remove_role_from_group(group_id: str, roles: List[str]) → List[UserRole]

remove_role_from_user(user_id: str, roles: List[str]) → List[UserRole]

rename_chat_session(chat_session_id: str, name: str)

Update a chat session name

Args:

chat_session_id:: String id of the document to search for.
name:: The new chat session name.

reset_and_share_prompt_template(prompt_id: str, new_usernames: List[str]) → ShareResponseStatus

Remove all users who have access to a prompt template (except for the owner) and share it with the provided list of new users.

Args:

prompt_id:: ID of the prompt template to un-share.
new_usernames:: The list of usernames belonging to the users this prompt template will be shared with.

ShareResponseStatus: Status of share request.

reset_and_share_prompt_template_with_groups(prompt_id: str, new_groups: List[str]) → ShareResponseStatus

Remove all groups who have access to a prompt template and share it with the provided list of new group ids.

Args:

prompt_id:: ID of the prompt template to un-share.
new_groups:: The list of group ids this prompt template will be shared with.

ShareResponseStatus: Status of share request.

reset_collection_prompt_settings(collection_id: str) → str

Reset the prompt settings for a given collection.

Args:

collection_id:: ID of the collection to update.

Returns:

str: ID of the updated collection.

reset_roles_for_group(group_id: str, roles: List[str]) → List[UserRole]

reset_roles_for_user(user_id: str, roles: List[str]) → List[UserRole]

reset_user_configurations_for_user(key_name: str, user_id: str) → List[UserConfigItem]

Reset a user configuration for a specific user (to be used by admins only).

Returns:: List[UserConfigItem]: List of user configurations.

run_selftest(llm: str, mode: str) → dict

Run a self-test for a given LLM Args:

llm:
Name of LLM

mode:
one of [“quick”, “rag”, “full”, “agent”]

Returns:: Dictionary with performance stats. If “error” is filled, the test failed.

search_chunks(collection_id: str, query: str, topics: List[str], offset: int, limit: int) → List[SearchResult]

Find chunks related to a message using lexical search.

Chunks are sorted by relevance and similarity score to the message.

Args:

collection_id:: ID of the collection to search within.
query:: Question or imperative from the end user to search a collection for.
topics:: A list of document_ids used to filter which documents in the collection to search.
offset:: How many chunks to skip before returning chunks.
limit:: How many chunks to return.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

set_api_key_expiration(api_key_id: str, expires_in: str | None = None) → Result

Allows admins to set an expiration on an API key.

Args:

api_key_id:: String: The id of the API key.
expires_in:: (Optional) String: The expiration for the API key as an interval or None (to remove an expiration that was previously set). Ex. “30 days” or “30 minutes”

Returns:

Result: Status of the expiration request.

set_chat_message_votes(chat_message_id: str, votes: int) → Result

Change the vote value of a chat message.

Set the exact value of a vote for a chat message. Any message type can be updated, but only LLM response votes will be visible in the UI. The expectation is 0: unvoted, -1: dislike, 1 like. Values outside of this will not be viewable in the UI.

Args:

chat_message_id:: ID of a chat message, any message can be used but only LLM responses will be visible in the UI.
votes:: Integer value for the message. Only -1 and 1 will be visible in the UI as dislike and like respectively.

Returns:

Result: The status of the update.

Raises:

Exception: The upload request was unsuccessful.

set_chat_session_collection(chat_session_id: str, collection_id: str | None) → str

Set the collection for a chat_session

Args:

chat_session_id:: ID of the chat session
collection_id:: ID of the collection, or None to chat with the LLM only.

Returns:

str: ID of the updated chat session

set_chat_session_prompt_template(chat_session_id: str, prompt_template_id: str | None) → str

Set the prompt template for a chat_session

Args:

chat_session_id:: ID of the chat session
prompt_template_id:: ID of the prompt template to get the prompts from. None to delete and fall back to system defaults.

Returns:

str: ID of the updated chat session

set_collection_expiry_date(collection_id: str, expiry_date: str, timezone: str | None = None) → str

Set an expiry date for a collection.

Args:

collection_id:: ID of the collection to update.
expiry_date:: The expiry date as a string in ‘YYYY-MM-DD’ format.
timezone:: Optional timezone to associate with expiry date (with IANA timezone support).

set_collection_inactivity_interval(collection_id: str, inactivity_interval: int) → str

Set an inactivity interval for a collection.

Args:

collection_id:: ID of the collection to update.
inactivity_interval:: The inactivity interval as an integer number of days.

set_collection_prompt_template(collection_id: str, prompt_template_id: str | None, strict_check: bool = False) → str

Set the prompt template for a collection

Args:

collection_id:: ID of the collection to update.
prompt_template_id:: ID of the prompt template to get the prompts from. None to delete and fall back to system defaults.
strict_check:: whether to check that the collection’s embedding model and the prompt template are optimally compatible

Returns:

str: ID of the updated collection.

set_collection_size_limit(collection_id: str, limit: int | str) → str

Set a maximum limit on the total size of documents (sum) added to a collection. The limit is measured in bytes.

Args:

collection_id:: ID of the collection to update.
limit:: The bytes limit, possible values follow the format: 12345, “1GB”, or “1GiB”.

set_collection_thumbnail(collection_id: str, file_path: Path, timeout: float | None = None)

Upload an image file to be set as a collection’s thumbnail.

The image file will not be considered as a collection document. Acceptable image file types include: .png, .jpg, .jpeg, .svg

Args:

collection_id:: Collection you want to add the thumbnail to.
file_path:: Path to the image file. Must include appropriate file extension.
timeout:: Amount of time in seconds to allow the request to run.

Raises:

ValueError: The file is invalid. Exception: The upload request was unsuccessful.

set_global_configuration(key_name: str, string_value: str, can_overwrite: bool, is_public: bool, value_type: str | None = None) → List[ConfigItem]

Set a global configuration.

Note: Both default collection size limit and inactivity interval can be disabled. To do so, pass ‘-1’ as the string_value.

Args:

key_name:: The name of the global config key.
string_value:: The value to be set for the global config.
can_overwrite:: Whether user settings can override this global setting.
is_public:: Whether users can see the value for this global setting.
value_type:: The type of the value to be set for the global config.

Returns:

List[ConfigItem]: List of global configurations.

set_user_configuration_for_user(key_name: str, string_value: str, user_id: str, value_type: str | None = None) → List[UserConfigItem]

Set a user configuration for a specific user (overrides the global configuration and to be used by admins only).

Note: Both default collection size limit and inactivity interval can be disabled. To do so, pass ‘-1’ as the string_value.

Args:

key_name:: The name of the global config key.
string_value:: The value to be set for the config.
user_id:: The user id you want to apply the config for.
value_type:: The type of the value to be set for the config.

Returns:

List[UserConfigItem]: List of user configurations.

share_collection(collection_id: str, permission: SharePermission) → ShareResponseStatus

Share a collection to a user.

The permission attribute defined the level of access, and who can access the collection, the collection_id attribute denotes the collection to be shared.

Args:

collection_id:: ID of the collection to share.
permission:: Defines the rule for sharing, i.e. permission level.

Returns:

ShareResponseStatus: Status of share request.

share_collection_with_group(collection_id: str, permission: GroupSharePermission) → ShareResponseStatus

Share a collection to a group.

The permission attribute defines the level of access, and which group can access the collection, the collection_id attribute denotes the collection to be shared.

Args:

collection_id:: ID of the collection to share.
permission:: Defines the rule for sharing, i.e. permission level and group.

Returns:

ShareResponseStatus: Status of share request.

share_prompt(prompt_id: str, permission: SharePermission) → ShareResponseStatus

Share a prompt template to a user.

Args:

prompt_id:: ID of the prompt template to share.
permission:: Defines the rule for sharing, i.e. permission level.

Returns:

ShareResponseStatus: Status of share request.

share_prompt_with_group(prompt_id: str, permission: GroupSharePermission) → ShareResponseStatus

Share a prompt to a group.

Args:

prompt_id:: ID of the prompt to share.
permission:: Specific permissions for a group.

Returns:

ShareResponseStatus: Status of share request.

Summarize one or more contexts using an LLM.

Effective prompt created (excluding the system prompt):
"{pre_prompt_summary}
"""
{text_context_list}
"""
{prompt_summary}"
Args:

text_context_list:
List of raw text strings to be summarized.

system_prompt:
Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto for the model default or None for h2oGPTe defaults. Defaults to ‘’ for no system prompt.

pre_prompt_summary:
Text that is prepended before the list of texts. The default can be customized per environment, but the standard default is "In order to write a concise single-paragraph or bulleted list summary, pay attention to the following text:\n"

prompt_summary:
Text that is appended after the list of texts. The default can be customized per environment, but the standard default is "Using only the text above, write a condensed and concise summary of key results (preferably as bullet points):\n"

llm:
Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:
temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

guardrails_settings:
Guardrails Settings.

timeout:
Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:
h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: Answer: The response text and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

summarize_document(*args, **kwargs) → DocumentSummary

tag_document(document_id: str, tag_name: str) → str

Adds a tag to a document.

Args:

document_id:: String id of the document to attach the tag to.
tag_name:: String representing the tag to attach.

Returns:

String: The id of the newly created tag.

unarchive_collection(collection_id: str) → str

Restore an archived collection to an active status.

Args:

collection_id:: ID of the collection to restore.

unshare_collection(collection_id: str, permission: SharePermission) → ShareResponseStatus

Remove sharing of a collection to a user.

The permission attribute defined the level of access, and who can access the collection, the collection_id attribute denotes the collection to be shared.

In case of un-sharing, the SharePermission’s user is sufficient.

Args:

collection_id:: ID of the collection to un-share.
permission:: Defines the user for which collection access is revoked.

ShareResponseStatus: Status of share request.

unshare_collection_for_all(collection_id: str) → ShareResponseStatus

Remove sharing of a collection to all other users but the original owner.

Args:

collection_id:: ID of the collection to un-share.

ShareResponseStatus: Status of share request.

unshare_collection_from_group(collection_id: str, permission: GroupSharePermission) → ShareResponseStatus

Remove sharing of a collection from a group.

The permission attribute defines which group to remove access from, the collection_id attribute denotes the collection to be unshared. In case of un-sharing, the GroupSharePermission’s group_id is sufficient.

Args:

collection_id:: ID of the collection to un-share.
permission:: Defines the group for which collection access is revoked.

Returns:

ShareResponseStatus: Status of share request.

unshare_prompt(prompt_id: str, permission: SharePermission) → ShareResponseStatus

Remove sharing of a prompt template to a user.

Args:

prompt_id:: ID of the prompt template to un-share.
permission:: Defines the user for which collection access is revoked.

ShareResponseStatus: Status of share request.

unshare_prompt_for_all(prompt_id: str) → ShareResponseStatus

Remove sharing of a prompt template to all other users but the original owner (owner action only).

Args:

prompt_id:: ID of the prompt template to un-share.

ShareResponseStatus: Status of share request.

unshare_prompt_from_group(prompt_id: str, permission: GroupSharePermission) → ShareResponseStatus

Unshare a prompt from a group.

Args:

prompt_id:: ID of the prompt to un-share.
permission:: Specific permissions for a group.

Returns:

ShareResponseStatus: Status of share request.

untag_document(document_id: str, tag_name: str) → str

Removes an existing tag from a document.

Args:

document_id:: String id of the document to remove the tag from.
tag_name:: String representing the tag to remove.

Returns:

String: The id of the removed tag.

update_agent_key(key_id: str, name: str | None = None, value: str | None = None, key_type: str | None = None, description: str | None = None) → dict | None

update_agent_tool_preference(reference_value: List[str]) → None

update_collection(collection_id: str, name: str, description: str) → str

Update the metadata for a given collection.

All variables are required. You can use h2ogpte.get_collection(<id>).name or description to get the existing values if you only want to change one or the other.

Args:

collection_id:: ID of the collection to update.
name:: New name of the collection, this is required.
description:: New description of the collection, this is required.

Returns:

str: ID of the updated collection.

update_collection_metadata(collection_id: str, collection_metadata: dict) → str

Set the new collection metadata overwriting the existing metadata. Be careful not to delete any settings you want to keep.

Args:

collection_id:: ID of the collection to update.
collection_metadata:: Dictionary containing the new collection metadata.

Returns:

str: ID of the updated collection.

update_collection_rag_type(collection_id: str, name: str, description: str, rag_type) → str

Update the metadata for a given collection.

All variables are required. You can use h2ogpte.get_collection(<id>).name or description to get the existing values if you only want to change one or the other.

Args:

collection_id:

ID of the collection to update.

name:

New name of the collection, this is required.

description:

New description of the collection, this is required.

rag_type: str one of

"auto" Automatically select the best rag_type. "llm_only" LLM Only - Answer the query without any supporting document contexts.

Requires 1 LLM or Agent call.

"agent_only" Agent Only - Answer the query with only original files passed to agent.: Requires 1 Agent call.
"rag" RAG (Retrieval Augmented Generation) - Use supporting document contexts: to answer the query. Requires 1 LLM or Agent call.
"hyde1" LLM Only + RAG composite - HyDE RAG (Hypothetical Document Embedding).: Use ‘LLM Only’ response to find relevant contexts from a collection for generating a response. Requires 2 LLM calls.
"hyde2" HyDE + RAG composite - Use the ‘HyDE RAG’ response to find relevant: contexts from a collection for generating a response. Requires 3 LLM calls.
"rag+" Summary RAG - Like RAG, but uses more context and recursive: summarization to overcome LLM context limits. Keeps all retrieved chunks, puts them in order, adds neighboring chunks, then uses the summary API to get the answer. Can require several LLM calls.
"all_data" All Data RAG - Like Summary RAG, but includes all document: chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

Returns:

str: ID of the updated collection.

update_collection_settings(collection_id: str, collection_settings: dict) → str

Set the new collection settings, must be complete. Be careful not to delete any settings you want to keep.

Args:

collection_id:: ID of the collection to update.
collection_settings:: Dictionary containing the new collection settings.

Returns:

str: ID of the updated collection.

update_document_metadata(document_id: str, document_metadata: dict) → str

Set the new document metadata overwriting the existing metadata. Be careful not to delete any settings you want to keep.

Args:

document_id:: ID of the document to update.
document_metadata:: Dictionary containing the new document metadata.

Returns:

str: ID of the updated document.

update_document_name(document_id: str, name: str) → str

Update the name metadata for a given document.

Args:

document_id:: ID of the document to update.
name:: New name of the document, must include file extension.

Returns:

str: ID of the updated document.

update_document_uri(document_id: str, uri: str) → str

Update the URI metadata for a given document.

Args:

document_id:: ID of the document to update.
uri:: New URI of the document, this is required.

Returns:

str: ID of the updated document.

update_prompt_template(id: str, name: str, description: str | None = None, lang: str | None = None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, hyde_no_rag_llm_prompt_extension: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, system_prompt_reflection: str | None = None, pre_prompt_reflection: str | None = None, prompt_reflection: str | None = None, auto_gen_description_prompt: str | None = None, auto_gen_document_summary_pre_prompt_summary: str | None = None, auto_gen_document_summary_prompt_summary: str | None = None, auto_gen_document_sample_questions_prompt: str | None = None, default_sample_questions: List[str] | None = None, image_batch_image_prompt: str | None = None, image_batch_final_prompt: str | None = None) → str

Update a prompt template

Args:

id:: String ID of the prompt template to update
name:: Name of the prompt template
description:: Description of the prompt template
lang:: Language code
system_prompt:: System Prompt
pre_prompt_query:: Text that is prepended before the contextual document chunks.
prompt_query:: Text that is appended to the beginning of the user’s message.
hyde_no_rag_llm_prompt_extension:: LLM prompt extension.
pre_prompt_summary:: Prompt that goes before each large piece of text to summarize
prompt_summary:: Prompt that goes after each large piece of text to summarize
system_prompt_reflection:: System Prompt for self-reflection
pre_prompt_reflection:: Deprecated - ignored
prompt_reflection:: Template for self-reflection, must contain two occurrences of %s for full previous prompt (including system prompt, document related context and prompts if applicable, and user prompts) and answer
auto_gen_description_prompt:: prompt to create a description of the collection.
auto_gen_document_summary_pre_prompt_summary:: pre_prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_summary_prompt_summary:: prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_sample_questions_prompt:: prompt to create sample questions for a freshly imported document (if enabled).
default_sample_questions:: default sample questions in case there are no auto-generated sample questions.
image_batch_final_prompt:: Prompt for each image batch for vision models
image_batch_image_prompt:: Prompt to reduce all answers each image batch for vision models

Returns:

str: The ID of the updated prompt template.

update_question_reply_feedback(reply_id: str, expected_answer: str, user_comment: str)

Update feedback for a specific answer to a question.

Args:

reply_id:: UUID of the reply.
expected_answer:: Expected answer.
user_comment:: User comment.

Returns:

None

update_tag(tag_name: str, description: str, format: str) → str

Updates a tag.

Args:

tag_name:: String representing the tag to update.
description:: String describing the tag.
format:: String representing the format of the tag.

Returns:

String: The id of the updated tag.

upload(file_name: str, file: Any, uri: str | None = None) → str

Upload a file to the H2OGPTE backend.

Uploaded files are not yet accessible and need to be ingested into a collection.

See Also:

ingest_uploads: Add the uploaded files to a collection. delete_upload: Delete uploaded file

Args:

file_name:: What to name the file on the server, must include file extension.
file:: File object to upload, often an opened file from with open(…) as f.
uri:: Optional - URI you would like to associate with the file.

Returns:

str: The upload id to be used in ingest jobs.

Raises:

Exception: The upload request was unsuccessful.

h2ogpte.session module

class h2ogpte.session.Session(address: str, chat_session_id: str, client: H2OGPTE = None, prompt_template_id: str | None = None, open_timeout: int = 10, close_timeout: int = 10, max_connect_retries: int = 10, connect_retry_delay: int = 0.5, connect_retry_max_delay: int = 60)

Bases: object

Create and participate in a chat session.

This is a live connection to the h2oGPTe server contained to a specific chat session on top of a single collection of documents. Users will find all questions and responses in this session in a single chat history in the UI.

See Also:

H2OGPTE.connect: To initialize a session on an existing connection.

Args:

address:: Full URL of the h2oGPTe server to connect to.
chat_session_id:: The ID of the chat session the queries should be sent to.
client:: Set to the value of H2OGPTE client object used to perform other calls to the system.

Examples:

# Example 1: Best practice, create a session using the H2OGPTE module
with h2ogpte.connect(chat_session_id) as session:
    answer1 = session.query('How many paper clips were shipped to Scranton?', timeout=10)
    answer2 = session.query('Did David Brent co-sign the contract with Initech?', timeout=10)

# Example 2: Connect and disconnect manually
session = Session(
    address=address,
    client=client,
    chat_session_id=chat_session_id
)
session.connect()
answer = session.query("Are there any dogs in the documents?")
session.disconnect()

connect()

Connect to an h2oGPTe server.

This is primarily an internal function used when users create a session using with from the H2OGPTE.connection() function.

property connection: ClientConnection

disconnect()

Disconnect from an h2oGPTe server.

This is primarily an internal function used when users create a session using with from the H2OGPTE.connection() function.

Retrieval-augmented generation for a query on a collection.

Finds a collection of chunks relevant to the query using similarity scores. Sends these and any additional instructions to an LLM.

Format of questions or imperatives:

"{pre_prompt_query}
"""
{similar_context_chunks}
"""                {prompt_query}{message}"

Args:

message:

Query or instruction from the end user to the LLM.

system_prompt:

Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto or None for the model default. Defaults to ‘’ for no system prompt.

pre_prompt_query:

Text that is prepended before the contextual document chunks. The default can be customized per environment, but the standard default is "Pay attention and remember the information below, which will help to answer the question or imperative after the context ends.\n"

prompt_query:

Text that is appended to the beginning of the user’s message. The default can be customized per environment, but the standard default is “According to only the information in the document sources provided within the context above, “

image_batch_final_prompt:

Prompt for each image batch for vision models

image_batch_image_prompt:

Prompt to reduce all answers each image batch for vision models

pre_prompt_summary:

Not used, use H2OGPTE.process_document to summarize.

prompt_summary:

Not used, use H2OGPTE.process_document to summarize.

llm:

Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Use None or “auto” for automatic model routing, set cost_controls for detailed control over automatic routing.

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:

temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. seed (int, default: 0) — The seed for the random number generator when sampling during generation (if temp>0 or top_k>1 or top_p<1), seed=0 picks a random seed. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. enable_vision (str, default: “auto”) - Controls vision mode, send images to the LLM in addition to text chunks. Only if have models that support vision, use get_vision_capable_llm_names() to see list. One of [“on”, “off”, “auto”]. visible_vision_models (List[str], default: [“auto”]) - Controls which vision model to use when processing images. Use get_vision_capable_llm_names() to see list. Must provide exactly one model. [“auto”] for automatic. images_num_max (int, default: None) — Maximum number of images to process. json_preserve_system_prompt (bool, default: None) — Whether to preserve system prompt in JSON response. client_metadata (str, default: None) — Additional metadata to send with the request. min_chars_per_yield (int, default: 1) — Minimum characters to yield in streaming response. cost_controls: Optional dictionary

max_cost (float) - Sets the maximum allowed cost in USD per LLM call when doing Automatic model routing. If the estimated cost based on input and output token counts is higher than this limit, the request will fail as early as possible. max_cost_per_million_tokens (float) - Only consider models that cost less than this value in USD per million tokens when doing automatic routing. Using the max of input and output cost. model (List[str] or None) - Optional subset of models to consider when doing automatic routing. None means consider all models. willingness_to_pay (float) - Controls the willingness to pay extra for a more accurate model for every LLM call when doing automatic routing, in units of USD per +10% increase in accuracy. We start with the least accurate model. For each more accurate model, we accept it if the increase in estimated cost divided by the increase in estimated accuracy is no more than this value divided by 10%, up to the upper limit specified above. Lower values will try to keep the cost as low as possible, higher values will approach the cost limit to increase accuracy. 0 means unlimited. willingness_to_wait (float) - Controls the willingness to wait longer for a more accurate model for every LLM call when doing automatic routing, in units of seconds per +10% increase in accuracy. We start with the least accurate model. For each more accurate model, we accept it if the increase in estimated time divided by the increase in estimated accuracy is no more than this value divided by 10%. Lower values will try to keep the time as low as possible, higher values will take longer to increase accuracy. 0 means unlimited.

use_agent (bool, default: False) - If True, use the AI agent (with access to tools) to generate the response. agent_accuracy (str, default: “standard”) - Effort level by the agent. Only if use_agent=True. One of [“quick”, “basic”, “standard”, “maximum”]. agent_max_turns (Optional[Union[str, int]], default: “auto”) - Optional max. number of back-and-forth turns with the agent. Only if use_agent=True. Either “auto” or an integer. agent_tools (Optional[Union[str, List[str]]], default: “auto”) - Either “auto”, “all”, “any” to enable all available tools, or a specific list of tools to use. Only if use_agent=True. Get full list by calling await self._lang(“get_agent_tools_options”). agent_type (str, default: “auto”) — Type of agent to use for task processing. agent_original_files (List[str], default: None) — List of file paths for agent to process. agent_timeout (int, default: None) — Timeout in seconds for each agent turn. agent_total_timeout (int, default: 3600) — Total timeout in seconds for all agent processing. agent_code_writer_system_message (str, default: None) — System message for agent code writer. agent_num_executable_code_blocks_limit (int, default: 1) — Maximum number of executable code blocks. agent_system_site_packages (bool, default: True) — Whether agent has access to system site packages. agent_main_model (Optional[str], default: None) — Main model to use for agent. agent_max_stream_length (Optional[int], default: None) — Maximum stream length for agent response. agent_max_memory_usage (int, default: 16*1024**3) — Maximum memory usage for agent in bytes (16GB default). agent_main_reasoning_effort (Optional[int], default: None) — Effort level for main reasoning. agent_advanced_reasoning_effort (Optional[int], default: None) — Effort level for advanced reasoning. agent_max_confidence_level (Optional[int], default: None) — Maximum confidence level for agent responses. agent_planning_forced_mode (Optional[bool], default: None) — Whether to force planning mode for agent. agent_too_soon_forced_mode (Optional[bool], default: None) — Whether to force “too soon” mode for agent. agent_critique_forced_mode (Optional[int], default: None) — Whether to force critique mode for agent. agent_stream_files (bool, default: True) — Whether to stream files from agent.

self_reflection_config:

Dictionary of arguments for self-reflection, can contain the following string:string mappings:

llm_reflection: str
"gpt-4-0613" or "" to disable reflection

prompt_reflection: str
‘Here’s the prompt and the response: """Prompt:\n%s\n"""\n\n""" Response:\n%s\n"""\n\nWhat is the quality of the response for the given prompt? Respond with a score ranging from Score: 0/10 (worst) to Score: 10/10 (best), and give a brief explanation why.'

system_prompt_reflection: str
""

llm_args_reflection: str
"{}"

rag_config:

Dictionary of arguments to control RAG (retrieval-augmented-generation) types. Can contain the following key/value pairs: rag_type: str one of

"auto" Automatically select the best rag_type. "llm_only" LLM Only - Answer the query without any supporting document contexts.

Requires 1 LLM or Agent call.

"agent_only" Agent Only - Answer the query with only original files passed to agent.
Requires 1 Agent call.

"rag" RAG (Retrieval Augmented Generation) - Use supporting document contexts
to answer the query. Requires 1 LLM or Agent call.

"hyde1" LLM Only + RAG composite - HyDE RAG (Hypothetical Document Embedding).
Use ‘LLM Only’ response to find relevant contexts from a collection for generating a response. Requires 2 LLM calls.

"hyde2" HyDE + RAG composite - Use the ‘HyDE RAG’ response to find relevant
contexts from a collection for generating a response. Requires 3 LLM calls.

"rag+" Summary RAG - Like RAG, but uses more context and recursive
summarization to overcome LLM context limits. Keeps all retrieved chunks, puts them in order, adds neighboring chunks, then uses the summary API to get the answer. Can require several LLM calls.

"all_data" All Data RAG - Like Summary RAG, but includes all document
chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

"all_data" All Data RAG - Like Summary RAG, but includes all document
chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

hyde_no_rag_llm_prompt_extension: str

Add this prompt to every user’s prompt, when generating answers to be used for subsequent retrieval during HyDE. Only used when rag_type is “hyde1” or “hyde2”. example: '\nKeep the answer brief, and list the 5 most relevant key words at the end.'

num_neighbor_chunks_to_include: int

Number of neighboring chunks to include for every retrieved relevant chunk. Helps to keep surrounding context together. Only enabled for rag_type “rag+”. Defaults to 1.

meta_data_to_include:

A dictionary containing flags that indicate whether each piece of document metadata is to be included as part of the context for a chat with a collection. Default is {

“name”: True, “text”: True, “page”: True, “captions”: True, “uri”: False, “connector”: False, “original_mtime”: False, “age”: False, “score”: False,

}

rag_max_chunks:

Maximum number of document chunks to retrieve for RAG. If not specified (default: -1), actual number depends on rag_type and admin configuration. Set to >0 values to enable. Can be combined with rag_min_chunk_score.

rag_min_chunk_score:

Minimum score of document chunks to retrieve for RAG. If not specified (default: 0.0), will not filter chunks by score. Set to >0 values to enable. Can be combined with rag_max_chunks.

include_chat_history:

Whether to include chat history. Includes previous questions and answers for the current chat session for each new chat request. Disable if require deterministic answers for a given question. Choices are: [“on”,”off”,”auto”,True,False]

tags:

A list of tags from which to pull the context for RAG.

metadata_filter:

A dictionary to filter documents by metadata, from which to pull the context for RAG.

timeout:

Amount of time in seconds to allow the request to run. The default is 1000 seconds.

retries:

Amount of retries to allow the request to run when hits a network issue. The default is 3.

callback:

Function for processing partial messages, used for streaming responses to an end user.

Returns:

ChatMessage: The response text and details about the response from the LLM. For example:

ChatMessage(
    id='XXX',
    content='The information provided in the context...',
    reply_to='YYY',
    votes=0,
    created_at=datetime.datetime(2023, 10, 24, 20, 12, 34, 875026)
    type_list=[],
)

Raises:

TimeoutError: The request did not complete in time.

h2ogpte.session.deserialize(response: str) → ChatResponse | ChatAcknowledgement

h2ogpte.session.serialize(request: ChatRequest) → str

h2ogpte.types module

class h2ogpte.types.APIKey(*, id: str, username: str, name: str, hint: str, created_at: datetime, expires_at: datetime | None = None, is_active: bool, collection_name: str | None = None, collection_id: str | None = None, is_global_key: bool)

Bases: BaseModel

collection_id: str | None

collection_name: str | None

created_at: datetime

expires_at: datetime | None

hint: str

id: str

is_active: bool

is_global_key: bool

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

username: str

class h2ogpte.types.Answer(*, content: str, error: str, prompt_raw: str = '', llm: str, input_tokens: int = 0, output_tokens: int = 0, origin: str = 'N/A')

Bases: BaseModel

content: str

error: str

input_tokens: int

llm: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

origin: str

output_tokens: int

prompt_raw: str

class h2ogpte.types.ChatAcknowledgement(t: str, session_id: str, correlation_id: str, message_id: str, username: str, body: str)

Bases: object

body: str

correlation_id: str

message_id: str

session_id: str

t: str

username: str

class h2ogpte.types.ChatMessage(*, id: str, content: str, reply_to: str | None = None, votes: int, created_at: datetime, type_list: List[str] | None = None, error: str | None = None)

Bases: BaseModel

content: str

created_at: datetime

error: str | None

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

reply_to: str | None

type_list: List[str] | None

votes: int

class h2ogpte.types.ChatMessageFull(*, id: str, username: str | None = None, content: str, reply_to: str | None = None, votes: int, created_at: datetime, type_list: List[ChatMessageMeta] | None = [], has_references: bool, total_references: int, error: str | None = None)

Bases: BaseModel

content: str

created_at: datetime

error: str | None

has_references: bool

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

reply_to: str | None

total_references: int

type_list: List[ChatMessageMeta] | None

username: str | None

votes: int

class h2ogpte.types.ChatMessageMeta(*, message_type: str, content: str)

Bases: BaseModel

content: str

message_type: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.ChatMessageReference(*, document_id: str, document_name: str, chunk_id: int, pages: str, score: float, content: str, collection_id: str, were_references_deleted: bool)

Bases: BaseModel

chunk_id: int

collection_id: str

content: str

document_id: str

document_name: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pages: str

score: float

were_references_deleted: bool

Bases: object

body: str

correlation_id: str

image_batch_final_prompt: str | None = None

image_batch_image_prompt: str | None = None

include_chat_history: bool | str | None = False

llm: str | int | None

llm_args: str | None

metadata_filter: str | None = None

mode: str

pre_prompt_query: str | None

pre_prompt_summary: str | None

prompt_query: str | None

prompt_summary: str | None

rag_config: str | None

self_reflection_config: str | None

session_id: str

system_prompt: str | None

t: str

tags: List[str] | None = None

class h2ogpte.types.ChatResponse(t: str, session_id: str, message_id: str, reply_to_id: str, body: str, error: str)

Bases: object

body: str

error: str

message_id: str

reply_to_id: str

session_id: str

t: str

class h2ogpte.types.ChatSessionCount(*, chat_session_count: int)

Bases: BaseModel

chat_session_count: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.ChatSessionForCollection(*, id: str, latest_message_content: str | None = None, updated_at: datetime, name: str | None = None)

Bases: BaseModel

id: str

latest_message_content: str | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None

updated_at: datetime

class h2ogpte.types.ChatSessionForDocument(*, id: str, latest_message_content: str | None = None, updated_at: datetime)

Bases: BaseModel

id: str

latest_message_content: str | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

updated_at: datetime

class h2ogpte.types.ChatSessionInfo(*, id: str, name: str | None = None, latest_message_content: str | None = None, collection_id: str | None = None, collection_name: str | None = None, prompt_template_id: str | None = None, updated_at: datetime)

Bases: BaseModel

collection_id: str | None

collection_name: str | None

id: str

latest_message_content: str | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None

prompt_template_id: str | None

updated_at: datetime

class h2ogpte.types.Chunk(*, text: str, id: int, name: str, size: int, pages: str)

Bases: BaseModel

id: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

pages: str

size: int

text: str

class h2ogpte.types.Collection(*, id: str, name: str, description: str, document_count: int, document_size: int, created_at: datetime, updated_at: datetime, username: str, rag_type: str | None = None, embedding_model: str | None = None, prompt_template_id: str | None = None, collection_settings: dict | None = None, is_public: bool, thumbnail: str | None = None, metadata_dict: dict | None = None, chat_settings: dict | None = None, status: str, expiry_date: datetime | None = None, inactivity_interval: int | None = None, size_limit: int | None = None)

Bases: BaseModel

chat_settings: dict | None

collection_settings: dict | None

created_at: datetime

description: str

document_count: int

document_size: int

embedding_model: str | None

expiry_date: datetime | None

id: str

inactivity_interval: int | None

is_public: bool

metadata_dict: dict | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

prompt_template_id: str | None

rag_type: str | None

size_limit: int | None

status: str

thumbnail: str | None

updated_at: datetime

username: str

class h2ogpte.types.CollectionCount(*, collection_count: int)

Bases: BaseModel

collection_count: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.CollectionInfo(*, id: str, name: str, description: str, document_count: int, document_size: int, updated_at: datetime, user_count: int, is_public: bool, username: str, sessions_count: int, status: str, expiry_date: datetime | None = None, inactivity_interval: int | None = None, archived_at: datetime | None = None, size_limit: int | None = None, metadata_dict: dict | None = None)

Bases: BaseModel

archived_at: datetime | None

description: str

document_count: int

document_size: int

expiry_date: datetime | None

id: str

inactivity_interval: int | None

is_public: bool

metadata_dict: dict | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

sessions_count: int

size_limit: int | None

status: str

updated_at: datetime

user_count: int

username: str

class h2ogpte.types.ConfigItem(*, key_name: str, string_value: str, value_type: str, can_overwrite: bool, upper_bound: int | float | None = None, is_public: bool)

Bases: BaseModel

can_overwrite: bool

is_public: bool

key_name: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

string_value: str

upper_bound: int | float | None

value_type: str

class h2ogpte.types.Document(*, id: str, name: str, type: str, size: int, page_count: int, guardrails_settings: dict | None = None, connector: str | None = None, uri: str | None = None, original_type: str | None = None, original_mtime: datetime | None = None, meta_data_dict: dict | None = None, status: Status, created_at: datetime, updated_at: datetime, user_source_file: dict | None = None, page_ocr_model_dict: dict | None = None, page_layout_dict: dict | None = None, metadata_dict: dict | None = None)

Bases: BaseModel

connector: str | None

created_at: datetime

guardrails_settings: dict | None

id: str

meta_data_dict: dict | None

metadata_dict: dict | None

model_config: ClassVar[ConfigDict] = {'use_enum_values': True}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

original_mtime: datetime | None

original_type: str | None

page_count: int

page_layout_dict: dict | None

page_ocr_model_dict: dict | None

size: int

status: Status

type: str

updated_at: datetime

uri: str | None

user_source_file: dict | None

class h2ogpte.types.DocumentCount(*, document_count: int)

Bases: BaseModel

document_count: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.DocumentInfo(*, id: str, username: str, name: str, type: str, size: int, page_count: int, guardrails_settings: dict | None = None, connector: str | None = None, uri: str | None = None, original_type: str | None = None, meta_data_dict: dict | None = None, status: Status, updated_at: datetime, user_source_file: dict | None = None, page_ocr_model_dict: dict | None = None, page_layout_dict: dict | None = None, metadata_dict: dict | None = None)

Bases: BaseModel

connector: str | None

guardrails_settings: dict | None

id: str

meta_data_dict: dict | None

metadata_dict: dict | None

model_config: ClassVar[ConfigDict] = {'use_enum_values': True}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

original_type: str | None

page_count: int

page_layout_dict: dict | None

page_ocr_model_dict: dict | None

size: int

status: Status

type: str

updated_at: datetime

uri: str | None

user_source_file: dict | None

username: str

class h2ogpte.types.DocumentInfoSummary(*, id: str, username: str, name: str, type: str, size: int, page_count: int, guardrails_settings: dict | None = None, connector: str | None = None, uri: str | None = None, original_type: str | None = None, meta_data_dict: dict | None = None, status: Status, updated_at: datetime, user_source_file: dict | None = None, usage_stats: str | None = None, summary: str | None = None, summary_parameters: str | None = None, page_ocr_model_dict: dict | None = None, page_layout_dict: dict | None = None, metadata_dict: dict | None = None)

Bases: BaseModel

connector: str | None

guardrails_settings: dict | None

id: str

meta_data_dict: dict | None

metadata_dict: dict | None

model_config: ClassVar[ConfigDict] = {'use_enum_values': True}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

original_type: str | None

page_count: int

page_layout_dict: dict | None

page_ocr_model_dict: dict | None

size: int

status: Status

summary: str | None

summary_parameters: str | None

type: str

updated_at: datetime

uri: str | None

usage_stats: str | None

user_source_file: dict | None

username: str

class h2ogpte.types.DocumentSummary(*, id: str, content: str, error: str, document_id: str, kwargs: str, created_at: datetime, usage_stats: str | None = None)

Bases: ProcessedDocument

content: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.ExtractionAnswer(*, content: List[str], error: str, llm: str, input_tokens: int = 0, output_tokens: int = 0)

Bases: BaseModel

content: List[str]

error: str

input_tokens: int

llm: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

output_tokens: int

class h2ogpte.types.GroupSharePermission(*, group_id: str, permissions: List[str] | None = None)

Bases: BaseModel

group_id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

permissions: List[str] | None

class h2ogpte.types.Identifier(*, id: str, error: str | None = None)

Bases: BaseModel

error: str | None

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.Job(*, id: str, name: str, passed: float, failed: float, progress: float, completed: bool, canceled: bool, date: datetime, kind: JobKind, statuses: List[JobStatus], errors: List[str], last_update_date: datetime, duration: str, duration_seconds: float, start_time: float | None = None, canceled_by: str | None = None, timeout: float | None = None)

Bases: BaseModel

canceled: bool

canceled_by: str | None

completed: bool

date: datetime

duration: str

duration_seconds: float

errors: List[str]

failed: float

id: str

kind: JobKind

last_update_date: datetime

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

passed: float

progress: float

start_time: float | None

statuses: List[JobStatus]

timeout: float | None

class h2ogpte.types.JobKind(value)

Bases: str, Enum

An enumeration.

CreateTopicModelJob = 'CreateTopicModelJob'

DeleteChatSessionsJob = 'DeleteChatSessionsJob'

DeleteCollectionThumbnailJob = 'DeleteCollectionThumbnailJob'

DeleteCollectionsAsAdminJob = 'DeleteCollectionsAsAdminJob'

DeleteCollectionsJob = 'DeleteCollectionsJob'

DeleteDocumentsFromCollectionJob = 'DeleteDocumentsFromCollectionJob'

DeleteDocumentsJob = 'DeleteDocumentsJob'

DocumentSummaryJob = 'DocumentProcessJob'

ImportCollectionIntoCollectionJob = 'ImportCollectionIntoCollectionJob'

ImportDocumentIntoCollectionJob = 'ImportDocumentIntoCollectionJob'

IndexFilesJob = 'IndexFilesJob'

IngestAgentOnlyToStandardJob = 'IngestAgentOnlyToStandardJob'

IngestFromCloudStorageJob = 'IngestFromCloudStorageJob'

IngestFromFileSystemJob = 'IngestFromFileSystemJob'

IngestPlainTextJob = 'IngestPlainTextJob'

IngestUploadsJob = 'IngestUploadsJob'

IngestWebsiteJob = 'IngestWebsiteJob'

NoOpJob = 'NoOpJob'

UpdateCollectionStatsJob = 'UpdateCollectionStatsJob'

UpdateCollectionThumbnailJob = 'UpdateCollectionThumbnailJob'

class h2ogpte.types.JobStatus(*, id: str, status: str)

Bases: BaseModel

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: str

class h2ogpte.types.LLMPerformance(*, llm_name: str, call_count: int, input_tokens: int, output_tokens: int, tokens_per_second: float, time_to_first_token: float)

Bases: BaseModel

call_count: int

input_tokens: int

llm_name: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

output_tokens: int

time_to_first_token: float

tokens_per_second: float

class h2ogpte.types.LLMUsage(*, llm_name: str, llm_cost: float, call_count: int, input_tokens: int, output_tokens: int)

Bases: BaseModel

call_count: int

input_tokens: int

llm_cost: float

llm_name: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

output_tokens: int

class h2ogpte.types.LLMUsageLimit(*, current: float, max_allowed_24h: float, cost_unit: str, interval: str)

Bases: BaseModel

cost_unit: str

current: float

interval: str

max_allowed_24h: float

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.LLMWithUserUsage(*, llm_name: str, total_cost: float, total_calls: int, total_input_tokens: int, total_output_tokens: int, user_usage: List[UserUsage])

Bases: BaseModel

llm_name: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

total_calls: int

total_cost: float

total_input_tokens: int

total_output_tokens: int

user_usage: List[UserUsage]

class h2ogpte.types.Meta(*, version: str, build: str, username: str, user_id: str, email: str, is_guest: bool, license_expired: bool, license_expiry_date: str, global_configs: List[ConfigItem], user_configs: List[UserConfigItem], picture: str | None, groups: List[str] | None, permissions: List[str], ui_config: MetaUIConfig)

Bases: BaseModel

build: str

email: str

global_configs: List[ConfigItem]

groups: List[str] | None

is_guest: bool

license_expired: bool

license_expiry_date: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

permissions: List[str]

picture: str | None

ui_config: MetaUIConfig

user_configs: List[UserConfigItem]

user_id: str

username: str

version: str

class h2ogpte.types.MetaUIConfig(*, logo: str, chat_logo: str, primary_color: str, chat_name: str, greeting: str, show_private_button: bool, show_workers_status: bool, show_live_logs: bool, show_eval: bool, show_extractors: bool, public_mode: bool)

Bases: BaseModel

chat_logo: str

chat_name: str

greeting: str

logo: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

primary_color: str

public_mode: bool

show_eval: bool

show_extractors: bool

show_live_logs: bool

show_private_button: bool

show_workers_status: bool

class h2ogpte.types.ObjectCount(*, chat_session_count: int, collection_count: int, document_count: int)

Bases: BaseModel

chat_session_count: int

collection_count: int

document_count: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class h2ogpte.types.PartialChatMessage(*, id: str, content: str, reply_to: str | None = None)

Bases: BaseModel

content: str

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

reply_to: str | None

class h2ogpte.types.ProcessedDocument(*, id: str, content: str | List[str], error: str, document_id: str, kwargs: str, created_at: datetime, usage_stats: str | None = None)

Bases: BaseModel

content: str | List[str]

created_at: datetime

document_id: str

error: str

id: str

kwargs: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

usage_stats: str | None

Bases: BaseModel

auto_gen_description_prompt: str | None

auto_gen_document_sample_questions_prompt: str | None

auto_gen_document_summary_pre_prompt_summary: str | None

auto_gen_document_summary_prompt_summary: str | None

created_at: datetime | None

default_sample_questions: List[str] | None

description: str | None

group_count: int | None

hyde_no_rag_llm_prompt_extension: str | None

id: str | None

image_batch_final_prompt: str | None

image_batch_image_prompt: str | None

is_default: bool

lang: str | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

pre_prompt_query: str | None

pre_prompt_reflection: str | None

pre_prompt_summary: str | None

prompt_query: str | None

prompt_reflection: str | None

prompt_summary: str | None

system_prompt: str | None

system_prompt_reflection: str | None

user_count: int | None

user_id: str | None

username: str | None

class h2ogpte.types.PromptTemplateCount(*, prompt_template_count: int)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

prompt_template_count: int

class h2ogpte.types.QuestionReplyData(*, question_content: str, reply_content: str, question_id: str, reply_id: str, llm: str | None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, rag_config: str | None = None, collection_documents: List[str] | None = None, votes: int, expected_answer: str | None = None, user_comment: str | None = None, collection_id: str | None = None, collection_name: str | None = None, response_created_at_time: str, prompt_template_id: str | None = None, include_chat_history: bool | str | None = None)

Bases: BaseModel

collection_documents: List[str] | None

collection_id: str | None

collection_name: str | None

expected_answer: str | None

include_chat_history: bool | str | None

llm: str | None

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

pre_prompt_query: str | None

pre_prompt_summary: str | None

prompt_query: str | None

prompt_summary: str | None

prompt_template_id: str | None

question_content: str

question_id: str

rag_config: str | None

reply_content: str

reply_id: str

response_created_at_time: str

system_prompt: str | None

user_comment: str | None

votes: int

class h2ogpte.types.QuestionReplyDataCount(*, question_reply_data_count: int)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

question_reply_data_count: int

class h2ogpte.types.QueueInfo(*, name: str, length: int)

Bases: BaseModel

length: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

class h2ogpte.types.Result(*, status: Status)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {'use_enum_values': True}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: Status

class h2ogpte.types.SchedulerStats(*, queue_length: int, queue_infos: List[QueueInfo])

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

queue_infos: List[QueueInfo]

queue_length: int

class h2ogpte.types.SearchResult(*, id: int, topic: str, name: str, text: str, size: int, pages: str, score: float)

Bases: BaseModel

id: int

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

pages: str

score: float

size: int

text: str

topic: str

class h2ogpte.types.SearchResults(*, result: List[SearchResult])

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

result: List[SearchResult]

exception h2ogpte.types.SessionError: Bases: Exception

class h2ogpte.types.SharePermission(*, username: str, permissions: List[str] | None = None)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

permissions: List[str] | None

username: str

class h2ogpte.types.ShareResponseStatus(*, status: str)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

status: str

class h2ogpte.types.Status(value)

Bases: str, Enum

An enumeration.

AgentOnly = 'agent_only'

Canceled = 'canceled'

Completed = 'completed'

Failed = 'failed'

Queued = 'queued'

Running = 'running'

Scheduled = 'scheduled'

Unknown = 'unknown'

class h2ogpte.types.SuggestedQuestion(*, question: str)

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

question: str

class h2ogpte.types.Tag(*, id: str, name: str, description: str | None = None, format: str | None = None)

Bases: BaseModel

description: str | None

format: str | None

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

class h2ogpte.types.User(*, id: str, email: str, username: str)

Bases: BaseModel

email: str

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

username: str

class h2ogpte.types.UserConfigItem(*, key_name: str, string_value: str, value_type: str)

Bases: BaseModel

key_name: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

string_value: str

value_type: str

class h2ogpte.types.UserGroup(*, id: str, name: str, description: str | None = None)

Bases: BaseModel

description: str | None

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

class h2ogpte.types.UserJobs(*, username: str, user_id: str, jobs: List[Job])

Bases: BaseModel

jobs: List[Job]

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

user_id: str

username: str

class h2ogpte.types.UserPermission(*, id: str, name: str, description: str | None = None, category: str | None = None)

Bases: BaseModel

category: str | None

description: str | None

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

class h2ogpte.types.UserRole(*, id: str, name: str, description: str | None = None)

Bases: BaseModel

description: str | None

id: str

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str

class h2ogpte.types.UserUsage(*, user_id: str, username: str, email: str, llm_cost: float, call_count: int, input_tokens: int, output_tokens: int)

Bases: BaseModel

call_count: int

email: str

input_tokens: int

llm_cost: float

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

output_tokens: int

user_id: str

username: str

class h2ogpte.types.UserWithLLMUsage(*, user_id: str, username: str, email: str, llm_usage: List[LLMUsage])

Bases: BaseModel

email: str

llm_usage: List[LLMUsage]

model_config: ClassVar[ConfigDict] = {}: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

user_id: str

username: str

Module contents

h2oGPTe - AI for documents and more

h2ogpte is the Python client library for H2O.ai’s Enterprise h2oGPTe, a RAG (Retrieval-Augmented Generation) based platform built on top of many open-source software components such as h2oGPT, hnswlib, Torch, Transformers, Golang, Python, k8s, Docker, PyMuPDF, DocTR, and many more.

h2oGPTe is designed to help organizations improve their business using generative AI. It focuses on scaling as your organization expands the number of use cases, users, and documents and has the goal of being your one stop for integrating any model or LLM functionality into your business.

Main Features

Contextualize chat with your own data using RAG (Retrieval-Augmented Generation)
Scalable backend and frontend, multi-user, high throughput
Fully containerized with Kubernetes
Multi-modal support for text, images, and audio
Highly customizable prompting for:
- talk to LLM
- talk to document
- talk to collection of documents
- talk to every page of a collection (Map/Reduce), summary, extraction
LLM agnostic, choose the model you need for your use case

class h2ogpte.H2OGPTE(address: str, api_key: str | None = None, token_provider: TokenProvider | None = None, verify: bool | str = True, strict_version_check: bool = False)

Bases: H2OGPTESyncBase

Connect to and interact with an h2oGPTe server.

add_agent_key(agent_keys: List[dict]) → List[dict]

add_role_to_group(group_id: str, roles: List[str]) → List[UserRole]

add_role_to_user(user_id: str, roles: List[str]) → List[UserRole]

add_user_document_permission(user_id: str, document_id: str) → [<class 'str'>, <class 'str'>]

Associates a user with a document they have permission on. Args:

user_id:
String The id of the user that has the permission.

document_id:
String The id of the document that the permission is for.

Returns:: [user_id, document_id]: A tuple containing the user_id and document_id.

Send a message and get a response from an LLM.

Note: This method is only recommended if you are passing a chat conversation or for low-volume testing. For general chat with an LLM, we recommend session.query() for higher throughput in multi-user environments. The following code sample shows the recommended method:
# Establish a chat session
chat_session_id = client.create_chat_session()
# Connect to the chat session
with client.connect(chat_session_id) as session:
    # Send a basic query and print the reply
    reply = session.query("Hello", timeout=60)
    print(reply.content)
Format of inputs content:
{text_context_list}
"""\n{chat_conversation}{question}
Args:

question:
Text query to send to the LLM.

text_context_list:
List of raw text strings to be included, will be converted to a string like this: “

“.join(text_context_list)

system_prompt:

Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto for the model default, or None for h2oGPTe default. Defaults to ‘’ for no system prompt.

pre_prompt_query:

Text that is prepended before the contextual document chunks in text_context_list. Only used if text_context_list is provided.

prompt_query:

Text that is appended after the contextual document chunks in text_context_list. Only used if text_context_list is provided.

llm:

Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:: temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

chat_conversation:

List of tuples for (human, bot) conversation that will be pre-appended to an (question, None) case for a query.

guardrails_settings:

Guardrails Settings.

timeout:

Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:: h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: Answer: The response text and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

archive_collection(collection_id: str) → str

Archive a collection along with its associated data.

Args:

collection_id:: ID of the collection to archive.

assign_agent_key_for_tool(tool_dict_list: List[dict]) → List[Tuple] | None

assign_permissions_to_role(role_name: str, permission_names: Iterable[str]) → Result

bulk_delete_global_configurations(key_names: List[str]) → List[ConfigItem]

bulk_delete_user_configurations_for_user(user_id: str, key_names: List[str]) → List[UserConfigItem]

cancel_job(job_id: str) → Result

Stops a specific job from running on the server.

Args:

job_id:: String id of the job to cancel.

Returns:

Result: Status of canceling the job.

cancel_user_job(job_id: str) → Result

As an admin, stops a specific user job from running on the server.

Args:

job_id:: String id of the job to cancel.

Returns:

Result: Status of canceling the job.

count_assets() → ObjectCount

Counts number of objects owned by the user.

Returns:: ObjectCount: The count of chat sessions, collections, and documents.

count_chat_sessions() → int

Counts number of chat sessions owned by the user.

Returns:: int: The count of chat sessions owned by the user.

count_chat_sessions_for_collection(collection_id: str) → int

Counts number of chat sessions in a specific collection.

Args:

collection_id:: String id of the collection to count chat sessions for.

Returns:

int: The count of chat sessions in that collection.

count_collections() → int

Counts number of collections owned by the user.

Returns:: int: The count of collections owned by the user.

count_documents() → int

Counts number of documents accessed by the user.

Returns:: int: The count of documents accessed by the user.

count_documents_in_collection(collection_id: str) → int

Counts the number of documents in a specific collection.

Args:

collection_id:: String id of the collection to count documents for.

Returns:

int: The number of documents in that collection.

count_documents_owned_by_me() → int

Counts number of documents owned by the user.

Returns:: int: The count of documents owned by the user.

count_prompt_templates() → int

Counts number of prompt templates

Returns:: int: The count of prompt templates

count_question_reply_feedback() → int

Fetch user’s questions and answers with feedback count.

Returns:: int: the count of questions and replies that have a user feedback.

create_api_key_for_user(user_id: str, name: str | None = None, collection_id: str | None = None, expires_in: str | None = None) → str

Allows admins to create a new api key for a specific user and optionally make it specific to a collection. Args:

user_id:
String: The id of the user the API key is for.

name:
(Optional) String: The name of the API key.

collection_id:
(Optional) String: The id of the specific collection.

expires_in:
(Optional) String: The expiration for the API key as an interval. Ex. “30 days” or “30 minutes”

Returns:: String: The id of the API key.

create_chat_session(collection_id: str | None = None) → str

Creates a new chat session for asking questions (of documents).

Args:

collection_id:: String id of the collection to chat with. If None, chat with LLM directly.

Returns:

str: The ID of the newly created chat session.

create_chat_session_on_default_collection() → str

Creates a new chat session for asking questions of documents on the default collection.

Returns:: str: The ID of the newly created chat session.

Creates a new collection.

Args:

name:

Name of the collection.

description:

Description of the collection

embedding_model:

embedding model to use. call list_embedding_models() to list of options.

prompt_template_id:

ID of the prompt template to get the prompts from. None to fall back to system defaults.

collection_settings:

(Optional) Dictionary with key/value pairs to configure certain collection specific settings max_tokens_per_chunk: Approximate max. number of tokens per chunk for text-dominated document pages. For images, chunks can be larger. chunk_overlap_tokens: Approximate number of tokens that are overlapping between successive chunks. gen_doc_summaries: Whether to auto-generate document summaries (uses LLM) gen_doc_questions: Whether to auto-generate sample questions for each document (uses LLM) audio_input_language: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices. ocr_model: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models.

Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. auto - Automatic will auto-select the best OCR model for every page. off - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).

tesseract_lang: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices. keep_tables_as_one_chunk: When tables are identified by the table parser the table tokens will be kept in a single chunk. chunk_by_page: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is true. handwriting_check: Check pages for handwriting. Will use specialized models if handwriting is found. follow_links: Whether to import all web pages linked from this URL will be imported. External links will be ignored. Links to other pages on the same domain will be followed as long as they are at the same level or below the URL you specify. Each page will be transformed into a PDF document. max_depth: Max depth of recursion when following links, only when follow_links is True. Max_depth of 0 means don’t follow any links, max_depth of 1 means follow only top-level links, etc. Use -1 for automatic (system settings). max_documents: Max number of documents when following links, only when follow_links is True. Use None for automatic (system defaults). Use -1 for max (system limit). root_dir: Root directory for document storage copy_document: Whether to copy the document when importing an existing document. guardrails_settings itself is a dictionary of the following keys.

column_redaction_config: list of list for redacting columns from CSV/TSV files (regex_pattern, fill_value) disallowed_regex_patterns: list of regular expressions that match custom PII presidio_labels_to_flag: list of entities to be flagged as PII by the built-in Presidio model. pii_labels_to_flag: list of entities to be flagged as PII by the built-in PII model. pii_detection_parse_action: what to do when PII is detected during parsing of documents. One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content in the ingested documents with redaction bars. pii_detection_llm_input_action: what to do when PII is detected in the input to the LLM (document content and user prompts). One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content with placeholders. pii_detection_llm_output_action: what to do when PII is detected in the output of the LLM. One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content with placeholders. prompt_guard_labels_to_flag: list of entities to be flagged as safety violations in user prompts by the built-in prompt guard model. guardrails_labels_to_flag: list of entities to be flagged as safety violations in user prompts. Must be a subset of guardrails_entities, if provided. guardrails_safe_category: (Optional) name of the safe category for guardrails. Must be a key in guardrails_entities, if provided. Otherwise uses system defaults. guardrails_entities: (Optional) dictionary of entities and their descriptions for the guardrails model to classify. The first entry is the “safe” class, the rest are “unsafe” classes. column_redaction_custom_entities_to_flag: list of entities to redact in tabular data files. Must be a subset of column_redaction_custom_entities, if provided. column_redaction_custom_entities: (Optional) dictionary of entities and a short description for the LLM to check for and redact columns containing PII in tabular data files. guardrails_llm: LLM to use for guardrails and PII detection. Use “auto” for automatic. Use H2OGPTE.get_llms() to see all available options.

Example:

Note: Call client.get_guardrails_settings() to see all options for guardrails_settings.

collection_settings=dict(

max_tokens_per_chunk=320, chunk_overlap_tokens=0, guardrails_settings=dict(

disallowed_regex_patterns=[“secret_disallowed_word”, r”(?!0{3})(?!6{3})[0-8]d{2}-(?!0{2})d{2}-(?!0{4})d{4}”], presidio_labels_to_flag=[“IBAN_CODE”, “US_SSN”], pii_labels_to_flag=[“IBAN”, “SSN”], pii_detection_parse_action=”redact”, pii_detection_llm_input_action=”redact”, pii_detection_llm_output_action=”allow”, prompt_guard_labels_to_flag=[“JAILBREAK”], guardrails_labels_to_flag=[

“Violent Crimes”, “Non-Violent Crimes”, “Intellectual Property”, “Code Interpreter Abuse”,

], guardrails_safe_category=”Safe”, guardrails_entities={

“Safe”: “Messages that do not contain any of the following unsafe content”, “Violent Crimes”: “Messages that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse)”, “Non-Violent Crimes”: “Messages that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking)”, “Defamation”: “Messages that are both verifiably false and likely to injure a living person’s reputation”, “Specialized Advice”: “Messages that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe”, “Intellectual Property”: “Messages that may violate the intellectual property rights of any third party”, “Code Interpreter Abuse”: “Messages that seek to abuse code interpreters, including those that enable denial of service attacks, container escapes or privilege escalation exploits”,

}, column_redaction_custom_entities_to_flag=[

“Mother’s Maiden Name”

], column_redaction_custom_entities={

“Mother’s Maiden Name”: “Mother’s maiden name.”

}, guardrails_llm=”meta-llama/Llama-3.3-70B-Instruct”,

),

)

thumbnail:

(Optional) Path to the thumbnail image for the collection. Must include appropriate file extension.

chat_settings:

(Optional) Dictionary with key/value pairs to configure the default values for certain chat specific settings The following keys are supported, see the client.session() documentation for more details. llm: str llm_args: dict self_reflection_config: dict rag_config: dict include_chat_history: bool tags: list[str]

Returns:

str: The ID of the newly created collection.

create_prompt_template(name: str, description: str | None = None, lang: str | None = None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, hyde_no_rag_llm_prompt_extension: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, system_prompt_reflection: str | None = None, pre_prompt_reflection: str | None = None, prompt_reflection: str | None = None, auto_gen_description_prompt: str | None = None, auto_gen_document_summary_pre_prompt_summary: str | None = None, auto_gen_document_summary_prompt_summary: str | None = None, auto_gen_document_sample_questions_prompt: str | None = None, default_sample_questions: List[str] | None = None, image_batch_image_prompt: str | None = None, image_batch_final_prompt: str | None = None) → str

Create a new prompt template

Args:

name:: Name of the prompt template
description:: Description of the prompt template
lang:: Language code
system_prompt:: System Prompt
pre_prompt_query:: Text that is prepended before the contextual document chunks.
prompt_query:: Text that is appended to the beginning of the user’s message.
hyde_no_rag_llm_prompt_extension:: LLM prompt extension.
pre_prompt_summary:: Prompt that goes before each large piece of text to summarize
prompt_summary:: Prompt that goes after each large piece of text to summarize
system_prompt_reflection:: System Prompt for self-reflection
pre_prompt_reflection:: Deprecated - ignored
prompt_reflection:: Template for self-reflection, must contain two occurrences of %s for full previous prompt (including system prompt, document related context and prompts if applicable, and user prompts) and answer
auto_gen_description_prompt:: prompt to create a description of the collection.
auto_gen_document_summary_pre_prompt_summary:: pre_prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_summary_prompt_summary:: prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_sample_questions_prompt:: prompt to create sample questions for a freshly imported document (if enabled).
default_sample_questions:: default sample questions in case there are no auto-generated sample questions.
image_batch_final_prompt:: Prompt for each image batch for vision models
image_batch_image_prompt:: Prompt to reduce all answers each image batch for vision models

Returns:

str: The ID of the newly created prompt template.

create_tag(tag_name: str) → str

Creates a new tag.

Args:

tag_name:: String representing the tag to create.

Returns:

String: The id of the created tag.

create_topic_model(collection_id: str, timeout: float | None = None) → Job

create_user_group(name: str, description: str) → UserGroup

create_user_role(name: str, description: str) → UserRole

deactivate_api_key(api_key_id: str) → Result

Allows admins to deactivate an API key.

Note: You cannot undo this action.

Args:

api_key_id:: String: The id of the API key.

Returns:

Result: Status of the deactivate request.

delete_agent_directories(chat_session_id: str) → bool

delete_agent_keys(key_ids: List[str]) → None

delete_agent_server_files(chat_session_id: str) → bool

delete_agent_tool_association(associate_ids: List[str]) → int

delete_agent_tool_preference() → None

delete_api_keys(api_key_ids: List[str]) → Result

Allows admins to delete API keys.

Args:

api_key_ids:: List[str]: The API keys to delete.

Returns:

Result: Status of the delete request.

delete_chat_messages(chat_message_ids: Iterable[str]) → Result

Deletes specific chat messages.

Args:

chat_message_ids:: List of string ids of chat messages to delete from the system.

Returns:

Result: Status of the delete job.

delete_chat_sessions(chat_session_ids: Iterable[str], timeout: float | None = None) → Job

Deletes chat sessions and related messages.

Args:

chat_session_ids:: List of string ids of chat sessions to delete from the system.
timeout:: Timeout in seconds.

Returns:

Result: The delete job.

delete_collections(collection_ids: Iterable[str], timeout: float | None = None) → Job

Deletes collections from the environment.

Documents in the collection that are owned by other users will not be deleted.

Args:

collection_ids:: List of string ids of collections to delete from the system.
timeout:: Timeout in seconds.

delete_collections_as_admin(collection_ids: Iterable[str], timeout: float | None = None) → Job

Deletes collections and their associated data from the environment (needs appropriate permission).

Args:

collection_ids:: List of string ids of collections to delete from the system.
timeout:: Timeout in seconds.

delete_document_summaries(summaries_ids: Iterable[str]) → Result

Deletes document summaries.

Args:

summaries_ids:: List of string ids of a document summary to delete from the system.

Returns:

Result: Status of the delete job.

delete_documents(document_ids: Iterable[str], timeout: float | None = None) → Job

Deletes documents from the system.

Args:

document_ids:: List of string ids to delete from the system and all collections.
timeout:: Timeout in seconds.

delete_documents_from_collection(collection_id: str, document_ids: Iterable[str], timeout: float | None = None) → Job

Removes documents from a collection.

See Also: H2OGPTE.delete_documents for completely removing the document from the environment.

Args:

collection_id:: String of the collection to remove documents from.
document_ids:: List of string ids to remove from the collection.
timeout:: Timeout in seconds.

delete_multiple_agent_directories(chat_session_ids: List[str], dir_types: List[str]) → bool

delete_prompt_templates(ids: Iterable[str]) → Result

Deletes prompt templates

Args:

ids:: List of string ids of prompte templates to delete from the system.

Returns:

Result: Status of the delete job.

delete_upload(upload_id: str) → str

Delete a file previously uploaded with the “upload” method.

See Also:

upload: Upload the files into the system to then be ingested into a collection. ingest_uploads: Add the uploaded files to a collection.

Args:

upload_id:: ID of a file to remove

Returns:

upload_id: The upload id of the removed.

Raises:

Exception: The delete upload request was unsuccessful.

delete_user_groups_by_ids(groups_ids: Iterable[str]) → Result

delete_user_groups_by_names(groups_names: Iterable[str]) → Result

delete_user_roles_by_ids(roles_ids: Iterable[str]) → Result

delete_user_roles_by_names(roles_names: Iterable[str]) → Result

download_document(destination_directory: str | Path, destination_file_name: str, document_id: str) → Path

Downloads a document to a local system directory.

Args:

destination_directory:: Destination directory to save file into.
destination_file_name:: Destination file name.
document_id:: Document ID.

Returns:

Path: Path of downloaded document

download_reference_highlighting(message_id: str, destination_directory: str, output_type: str = 'combined', limit: int | None = None) → list

Get PDFs with reference highlighting

Args:

message_id:: ID of the message to get references from
destination_directory:: Destination directory to save files into.
output_type: str one of: "combined" Generates a PDF file for each source document, with all relevant chunks highlighted in each respective file. This option consolidates all highlights for each source document into a single PDF, making it easy to view all highlights related to that document at once. "split" Generates a separate PDF file for each chunk, with only the relevant chunk highlighted in each file. This option is useful for focusing on individual sections without interference from other parts of the text. The output files names will be in the format “{document_id}_{chunk_id}.pdf”
limit:: The number of references to consider based on the highest confidence scores.

Returns:

list[Path]: List of paths of downloaded documents with highlighting

encode_for_retrieval(chunks: Iterable[str], embedding_model: str | None = None) → List[List[float]]

Encode texts for semantic searching.

See Also: H2OGPTE.match for getting a list of chunks that semantically match each encoded text.

Args:

chunks:: List of strings of texts to be encoded.
embedding_model:: embedding model to use. call list_embedding_models() to list of options.

Returns:

List of list of floats: Each list in the list is the encoded original text.

Extract information from one or more contexts using an LLM.

pre_prompt_extract and prompt_extract variables must be used together. If these variables are not set, the inputs texts will be summarized into bullet points.

Format of extract content:
"{pre_prompt_extract}"""
{text_context_list}
"""\n{prompt_extract}"
Examples:
extract = h2ogpte.extract_data(
    text_context_list=chunks,
    pre_prompt_extract="Pay attention and look at all people. Your job is to collect their names.\n",
    prompt_extract="List all people's names as JSON.",
)
Args:

text_context_list:
List of raw text strings to extract data from.

system_prompt:
Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto or None for the model default. Defaults to ‘’ for no system prompt.

pre_prompt_extract:
Text that is prepended before the list of texts. If not set, the inputs will be summarized.

prompt_extract:
Text that is appended after the list of texts. If not set, the inputs will be summarized.

llm:
Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:
temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

guardrails_settings:
Guardrails Settings.

timeout:
Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:
h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: ExtractionAnswer: The list of text responses and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

get_agent_key_tool_associations() → List[dict]

get_agent_keys() → List[dict]

get_agent_server_files(chat_session_id: str) → List[dict]

get_agent_tool_preference() → List[str]

get_agent_tools_dict() → dict

get_all_directory_stats(chat_session_id: str, detail_level: int = 0) → dict

get_chat_session_prompt_template(chat_session_id: str) → PromptTemplate | None

Get the prompt template for a chat_session

Args:

chat_session_id:: ID of the chat session

Returns:

str: ID of the prompt template.

get_chat_session_questions(chat_session_id: str, limit: int) → List[SuggestedQuestion]

List suggested questions

Args:

chat_session_id:: A chat session ID of which to return the suggested questions
limit:: How many questions to return.

Returns:

List: A list of questions.

get_chunks(collection_id: str, chunk_ids: Iterable[int]) → List[Chunk]

Get the text of specific chunks in a collection.

Args:

collection_id:: String id of the collection to search in.
chunk_ids:: List of ints for the chunks to return. Chunks are indexed starting at 1.

Returns:

Chunk: The text of the chunk.

Raises:

Exception: One or more chunks could not be found.

get_collection(collection_id: str) → Collection

Get metadata about a collection.

Args:

collection_id:: String id of the collection to search for.

Returns:

Collection: Metadata about the collection.

Raises:

KeyError: The collection was not found.

get_collection_for_chat_session(chat_session_id: str) → Collection

Get metadata about the collection of a chat session.

Args:

chat_session_id:: String id of the chat session to search for.

Returns:

Collection: Metadata about the collection.

get_collection_prompt_template(collection_id: str) → PromptTemplate | None

Get the prompt template for a collection

Args:

collection_id:: ID of the collection

Returns:

str: ID of the prompt template.

get_collection_questions(collection_id: str, limit: int) → List[SuggestedQuestion]

List suggested questions

Args:

collection_id:: A collection ID of which to return the suggested questions
limit:: How many questions to return.

Returns:

List: A list of questions.

get_default_collection() → CollectionInfo

Get the default collection, to be used for collection API-keys.

Returns:: CollectionInfo: Default collection info.

get_directory_stats(directory_name: str, chat_session_id: str, detail_level: int = 0) → dict

get_document(document_id: str, include_layout: bool = False) → Document

Fetches information about a specific document.

Args:

document_id:: String id of the document.
include_layout:: Include the layout prediction results.

Returns:

Document: Metadata about the Document.

Raises:

KeyError: The document was not found.

get_document_content(file_name: str, document_id: str) → bytes

Downloads a document and return its content as a byte array.

Args:

file_name:: File name.
document_id:: Document ID.

Returns:

Path: File content

get_global_configurations() → List[ConfigItem]

get_global_configurations_by_admin() → List[ConfigItem]

get_guardrails_settings(action: str = 'redact', sensitive: bool = True, non_sensitive: bool = True, all_guardrails: bool = True, guardrails_settings: dict | None = None) → Dict[str, str | List[str]]: Helper to get reasonable (easy to use) defaults for Guardrails/PII settings. To be further customized. :param action: what to do when detecting PII, either “redact” or “fail” (“allow” would keep PII intact). Guardrails models always fail upon detecting safety violations. :param sensitive: whether to include the most sensitive PII entities like SSN, bank account info :param non_sensitive: whether to include all non-sensitive PII entities, such as IP addresses, locations, names, e-mail addresses etc. :param all_guardrails: whether to include all possible entities for prompt guard and guardrails models, or just system defaults :param guardrails_settings: existing guardrails settings (e.g., from collection settings) to obtain guardrails entities, guardrails_entities_to_flag, column redaction custom_pii_entities, column_redaction_pii_to_flag from instead of system defaults :return: dictionary to pass to collection creation or process_document method

get_h2ogpt_system_stats() → dict

get_job(job_id: str) → Job

Fetches information about a specific job.

Args:

job_id:: String id of the job.

Returns:

Job: Metadata about the Job.

get_llm_and_auto_reasoning_llm_names() → Dict[str, str]

Get mapping of llm to its reasoning_model when [“auto”] is passed as visible_reasoning_models

Returns:: dictionary {‘llm1’: ‘llm1_reasoning_llm’, etc.}

get_llm_and_auto_vision_llm_names() → Dict[str, str]

Get mapping of llm to its vision_model when [“auto”] is passed as visible_vision_models

Returns:: dictionary {‘llm1’: ‘llm1_vision_llm’, etc.}

get_llm_name_for_rest(llm: str | int | None)

get_llm_names() → List[str]

Lists names of available LLMs in the environment.

Returns:: list of string: Name of each available model.

get_llm_performance_by_llm(interval: str) → List[LLMPerformance]

get_llm_usage_24h() → float

get_llm_usage_24h_by_llm() → List[LLMUsage]

get_llm_usage_24h_with_limits() → LLMUsageLimit

get_llm_usage_6h() → float

get_llm_usage_6h_by_llm() → List[LLMUsage]

get_llm_usage_by_llm(interval: str) → List[LLMUsage]

get_llm_usage_by_llm_and_user(interval: str) → List[LLMWithUserUsage]

get_llm_usage_by_user(interval: str) → List[UserWithLLMUsage]

get_llm_usage_with_limits(interval: str) → LLMUsageLimit

get_llms() → List[Dict[str, Any]]

Lists metadata information about available LLMs in the environment.

Returns:: list of dict (string, ANY): Name and details about each available model.

get_meta() → Meta

Returns information about the environment and the user.

Returns:: Meta: Details about the version and license of the environment and the user’s name and email.

get_prompt_template(id: str | None = None) → PromptTemplate

Get a prompt template

Args:

id:: String id of the prompt template to retrieve or None for default

Returns:

PromptTemplate: prompts

Raises:

KeyError: The prompt template was not found.

get_reasoning_capable_llm_names() → List[str]

Lists names of available reasoning-capable (that can natively reason) in the environment.

Returns:: list of string: Name of each available model.

get_scheduler_stats() → SchedulerStats

Count the number of global, pending jobs on the server.

Returns:: SchedulerStats: The queue length for number of jobs.

get_tag(tag_name: str) → Tag

Returns an existing tag.

Args:

tag_name:: String The name of the tag to retrieve.

Returns:

Tag: The requested tag.

Raises:

KeyError: The tag was not found.

get_user_all_agent_directories(offset: int, limit: int, filter_text: str | None) → List[dict]

get_user_configurations() → List[UserConfigItem]

Gets the user configurations for the current user.

Returns:: List[UserConfigItem]: List of user configurations.

get_user_configurations_for_user(user_id: str) → List[UserConfigItem]

Gets the user configurations for a specific user (to be used by admins only).

Args:

user_id:: The unique identifier of the user.

Returns:

List[UserConfigItem]: List of user configurations.

get_vision_capable_llm_names() → List[str]

Lists names of available vision-capable multi-modal LLMs (that can natively handle images as input) in the environment.

Returns:: list of string: Name of each available model.

Import all documents from a collection into an existing collection

Args:

collection_id:: Collection ID to add documents to.
src_collection_id:: Collection ID to import documents from.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
copy_document:: Whether to save a new copy of the document
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Import an already stored document to an existing collection

Args:

collection_id:: Collection ID to add documents to.
document_id:: Document ID to add.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
copy_document:: Whether to save a new copy of the document
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

ingest_agent_only_to_standard(collection_id: str, document_id: str, gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, audio_input_language: str | None = None, ocr_model: str | None = None, restricted: bool = False, permissions: List[SharePermission] | None = None, tesseract_lang: str | None = None, keep_tables_as_one_chunk: bool | None = None, chunk_by_page: bool | None = None, handwriting_check: bool | None = None, timeout: float | None = None)

For files uploaded in “agent_only” ingest mode, convert to PDF and parse

See Also:

upload: Upload the files into the system to then be ingested into a collection. delete_upload: Delete uploaded file

Args:

collection_id:: String id of the collection to add the ingested documents into.
document_id:: ID of document to be parsed.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
restricted:: Whether the document should be restricted only to certain users.
permissions:: List of permissions. Each permission is a SharePermission object.
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.

Add files from the Azure Blob Storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
container:: Name of the Azure Blob Storage container.
path:: Path or list of paths to files or directories within an Azure Blob Storage container. Examples: file1, dir1/file2, dir3/dir4/
account_name:: Name of a storage account
credentials:: The object with Azure credentials. If the object is not provided, only a public container will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Metadata to be associated with the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Add files from the local system into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
root_dir:: String path of where to look for files.
glob:: String of the glob pattern used to match files in the root directory.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Add files from the Google Cloud Storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: The path or list of paths of GCS files or directories. Examples: gs://bucket/file, gs://bucket/../dir/
credentials:: The object holding a path to a JSON key of Google Cloud service account. If the object is not provided, only public buckets will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

ingest_from_plain_text(collection_id: str, plain_text: str, file_name: str, gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, metadata: Dict[str, Any] | None = None, timeout: float | None = None)

Add plain text to a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
plain_text:: String of the plain text to ingest.
file_name:: String of the file name to use for the document.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds

Add files from the AWS S3 storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: The path or list of paths of S3 files or directories. Examples: s3://bucket/file, s3://bucket/../dir/
region:: The name of the region used for interaction with AWS services.
credentials:: The object with S3 credentials. If the object is not provided, only public buckets will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

ingest_uploads(collection_id: str, upload_ids: Iterable[str], gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, audio_input_language: str | None = None, ocr_model: str | None = None, restricted: bool = False, permissions: List[SharePermission] | None = None, tesseract_lang: str | None = None, keep_tables_as_one_chunk: bool | None = None, chunk_by_page: bool | None = None, handwriting_check: bool | None = None, metadata: Dict[str, Any] | None = None, timeout: float | None = None, ingest_mode: str | None = None) → Job

Add uploaded documents into a specific collection.

See Also:

upload: Upload the files into the system to then be ingested into a collection. delete_upload: Delete uploaded file

Args:

collection_id:: String id of the collection to add the ingested documents into.
upload_ids:: List of string ids of each uploaded document to add to the collection.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
restricted:: Whether the document should be restricted only to certain users.
permissions:: List of permissions. Each permission is a SharePermission object.
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Metadata to be associated with the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Crawl and ingest a URL into a collection.

The web page or document linked from this URL will be imported.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: String of the url to crawl.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
follow_links:: Whether to import all web pages linked from this URL will be imported. External links will be ignored. Links to other pages on the same domain will be followed as long as they are at the same level or below the URL you specify. Each page will be transformed into a PDF document.
max_depth:: Max depth of recursion when following links, only when follow_links is True. Max_depth of 0 means don’t follow any links, max_depth of 1 means follow only top-level links, etc. Use -1 for automatic (system settings).
max_documents:: Max number of documents when following links, only when follow_links is True. Use None for automatic (system defaults). Use -1 for max (system limit).
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

is_collection_permission_granted(collection_id: str, permission: str) → bool

is_permission_granted(permission: str) → bool

list_all_api_keys(offset: int, limit: int, key_filter: str = '') → List[APIKey]

Allows admins to list all the API keys that exist.

Args:

offset:: Int: How many keys to skip before returning.
limit:: Int: How many keys to return.
key_filter:: String: Only returns keys for usernames matching this filter.

Returns:

List[APIKey]: List of APIKeys with metadata about each key.

list_all_collections_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[CollectionInfo]

Fetch all users’ collection metadata sorted by last update time.

This is for admin use only and includes private, public, and shared collections in the result.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of CollectionInfo: Metadata about each collection.

list_all_jobs() → List[UserJobs]

List all jobs (to be used by admins only).

Returns:: list of UserJobs

list_all_tags() → List[Tag]

Lists all existing tags.

Returns:: List of Tags: List of existing tags.

list_chat_message_meta_part(message_id: str, info_type: str) → ChatMessageMeta

Fetch one chat message meta information.

Args:

message_id:: Message id to which the metadata should be pulled.
info_type:: Metadata type to fetch. Valid choices are: “self_reflection”, “usage_stats”, “prompt_raw”, “llm_only”, “hyde1”, “py_client_code”

Returns:

ChatMessageMeta: Metadata information about the chat message.

list_chat_message_references(message_id: str, limit: int | None = None) → List[ChatMessageReference]

Fetch metadata for references of a chat message.

References are only available for messages sent from an LLM, an empty list will be returned for messages sent by the user.

Args:

message_id:: String id of the message to get references for.
limit:: The number of references to consider based on the highest confidence scores.

Returns:

list of ChatMessageReference: Metadata including the document name, polygon information, and score.

list_chat_messages(chat_session_id: str, offset: int, limit: int) → List[ChatMessage]

Fetch chat message and metadata for messages in a chat session.

Messages without a reply_to are from the end user, messages with a reply_to are from an LLM and a response to a specific user message.

Args:

chat_session_id:: String id of the chat session to filter by.
offset:: How many chat messages to skip before returning.
limit:: How many chat messages to return.

Returns:

list of ChatMessage: Text and metadata for chat messages.

list_chat_messages_full(chat_session_id: str, offset: int, limit: int) → List[ChatMessageFull]

Fetch chat message and metadata for messages in a chat session.

Messages without a reply_to are from the end user, messages with a reply_to are from an LLM and a response to a specific user message.

Args:

chat_session_id:: String id of the chat session to filter by.
offset:: How many chat messages to skip before returning.
limit:: How many chat messages to return.

Returns:

list of ChatMessageFull: Text and metadata for chat messages.

list_chat_sessions_for_collection(collection_id: str, offset: int, limit: int) → List[ChatSessionForCollection]

Fetch chat session metadata for chat sessions in a collection.

Args:

collection_id:: String id of the collection to filter by.
offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionForCollection: Metadata about each chat session including the latest message.

list_chat_sessions_for_document(document_id: str, offset: int, limit: int) → List[ChatSessionForDocument]

Fetch chat session metadata for chat session that produced a specific document (typically through agents).

Args:

document_id:: String id of the document to filter by.
offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionForDocument: Metadata about each chat session including the latest message.

list_collection_group_permissions(collection_id: str) → List[GroupSharePermission]

Returns a list of group access permissions for a given collection.

The returned list of group permissions denoting which groups have access to the collection and their access level.

Args:

collection_id:: ID of the collection to inspect.

Returns:

list of GroupSharePermission: Group sharing permissions list for the given collection.

list_collection_permissions(collection_id: str) → List[SharePermission]

Returns a list of access permissions for a given collection.

The returned list of permissions denotes who has access to the collection and their access level.

Args:

collection_id:: ID of the collection to inspect.

Returns:

list of SharePermission: Sharing permissions list for the given collection.

list_collections_for_document(document_id: str, offset: int, limit: int) → List[CollectionInfo]

Fetch metadata about each collection the document is a part of.

At this time, each document will only be available in a single collection.

Args:

document_id:: String id of the document to search for.
offset:: How many collections to skip before returning.
limit:: How many collections to return.

Returns:

list of CollectionInfo: Metadata about each collection.

list_document_chunks(document_id: str, collection_id: str | None = None) → List[SearchResult]

Returns all chunks for a specific document.

Args:

document_id:: ID of the document.
collection_id:: ID of the collection the document belongs to. If not specified, an arbitrary collections containing the document is chosen.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

list_documents_from_tags(collection_id: str, tags: List[str]) → List[Document]

Lists documents that have the specified set of tags within a collection. Args:

collection_id:
String The id of the collection to find documents in.

tags:
List of Strings representing the tags to retrieve documents for.

Returns:: List of Documents: All the documents with the specified tags.

list_documents_in_collection(collection_id: str, offset: int, limit: int, metadata_filter: dict = {}) → List[DocumentInfo]

Fetch document metadata for documents in a collection.

Args:

collection_id:: String id of the collection to filter by.
offset:: How many documents to skip before returning.
limit:: How many documents to return.
metadata_filter:: Metadata filter to apply to the documents.

Returns:

list of DocumentInfo: Metadata about each document.

list_embedding_models() → List[str]

list_group_permissions(group_id: str) → List[UserPermission]

list_group_permissions_by_name(group_names: List[str]) → List[UserPermission]

list_group_roles(group_id: str) → List[UserRole]

list_jobs() → List[Job]

List the user’s jobs.

Returns:: list of Job:

list_list_chat_message_meta(message_id: str) → List[ChatMessageMeta]

Fetch chat message meta information.

Args:

message_id:: Message id to which the metadata should be pulled.

Returns:

list of ChatMessageMeta: Metadata about the chat message.

list_prompt_group_permissions(prompt_id: str) → List[GroupSharePermission]

Returns a list of group access permissions for a given prompt template.

The returned list of group permissions denoting which groups have access to the prompt template.

Args:

prompt_id:: ID of the prompt template to inspect.

Returns:

list of GroupSharePermission: Group sharing permissions list for the given prompt template.

list_prompt_permissions(prompt_id: str) → List[SharePermission]

Returns a list of access permissions for a given prompt template.

The returned list of permissions denotes who has access to the prompt template and their access level.

Args:

prompt_id:: ID of the prompt template to inspect.

Returns:

list of SharePermission: Sharing permissions list for the given prompt template.

list_question_reply_feedback_data(offset: int, limit: int) → List[QuestionReplyData]

Fetch user’s questions and answers that have a feedback.

Questions and answers with metadata and feedback information.

Args:

offset:: How many conversations to skip before returning.
limit:: How many conversations to return.

Returns:

list of QuestionReplyData: Metadata about questions and answers.

list_recent_chat_sessions(offset: int, limit: int) → List[ChatSessionInfo]

Fetch user’s chat session metadata sorted by last update time.

Chats across all collections will be accessed.

Args:

offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionInfo: Metadata about each chat session including the latest message.

list_recent_collections(offset: int, limit: int) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_collections_filter(offset: int, limit: int, current_user_only: bool = False, name_filter: str = '') → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time with filter options.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
current_user_only:: When true, will only return the user owned collections.
name_filter:: Only returns collections with names matching this filter.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_collections_metadata_filter(offset: int, limit: int, current_user_only: bool, metadata_filter: dict) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time with a filter on metadata.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
current_user_only:: When true, will only return the user owned collections.
metadata_filter:: Only returns collections with metadata matching this filter.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_collections_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of CollectionInfo: Metadata about each collection.

list_recent_document_summaries(document_id: str, offset: int, limit: int) → List[ProcessedDocument]

Fetches recent document summaries/extractions/transformations

Args:

document_id:: document ID for which to return summaries
offset:: How many summaries to skip before returning summaries.
limit:: How many summaries to return.

list_recent_documents(offset: int, limit: int, metadata_filter: dict = {}) → List[DocumentInfo]

Fetch user’s document metadata sorted by last update time.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.
metadata_filter:: Metadata filter to apply to the documents.

Returns:

list of DocumentInfo: Metadata about each document.

list_recent_documents_with_summaries(offset: int, limit: int) → List[DocumentInfoSummary]

Fetch user’s document metadata sorted by last update time, including the latest document summary.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.

Returns:

list of DocumentInfoSummary: Metadata about each document.

list_recent_documents_with_summaries_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[DocumentInfoSummary]

Fetch user’s document metadata sorted by last update time, including the latest document summary.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of DocumentInfoSummary: Metadata about each document.

list_recent_prompt_templates(offset: int, limit: int) → List[PromptTemplate]

Fetch user’s prompt templates sorted by last update time.

Args:

offset:: How many prompt templates to skip before returning.
limit:: How many prompt templates to return.

Returns:

list of PromptTemplate: set of prompts

list_recent_prompt_templates_sort(offset: int, limit: int, sort_column: str, ascending: bool, template_type: str = 'all', filter: str = '') → List[PromptTemplate]

Fetch user’s prompt templates sorted by last update time.

Args:

offset:: How many prompt templates to skip before returning.
limit:: How many prompt templates to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.
template_type:: When set, will be used as a type filter, possible values are: all, user, system.
filter:: When set, will be used as a filter on some prompt template columns.

Returns:

list of PromptTemplate: set of prompts

list_system_groups() → List[UserGroup]

list_system_permissions() → List[UserPermission]

list_system_roles() → List[UserRole]

list_upload() → List[str]

List pending file uploads to the H2OGPTE backend.

Uploaded files are not yet accessible and need to be ingested into a collection.

See Also:: upload: Upload the files into the system to then be ingested into a collection. ingest_uploads: Add the uploaded files to a collection. delete_upload: Delete uploaded file
Returns:: List[str]: The pending upload ids to be used in ingest jobs.
Raises:: Exception: The upload list request was unsuccessful.

list_user_permissions(user_id: str | None = None) → List[UserPermission]

list_user_role_permissions(roles: List[str]) → List[UserPermission]

list_user_roles(user_id: str | None = None) → List[UserRole]

list_users(offset: int, limit: int) → List[User]

List system users.

Returns a list of all registered users fo the system, a registered user, is a users that has logged in at least once.

Args:

offset:: How many users to skip before returning.
limit:: How many users to return.

Returns:

list of User: Metadata about each user.

make_collection_private(collection_id: str)

Make a collection private

Once a collection is private, other users will no longer be able to access chat history or documents related to the collection.

Args:

collection_id:: ID of the collection to make private.

make_collection_public(collection_id: str)

Make a collection public

Once a collection is public, it will be accessible to all authenticated users of the system.

Args:

collection_id:: ID of the collection to make public.

match_chunks(collection_id: str, vectors: List[List[float]], topics: List[str], offset: int, limit: int, cut_off: float = 0, width: int = 0) → List[SearchResult]

Find chunks related to a message using semantic search.

Chunks are sorted by relevance and similarity score to the message.

See Also: H2OGPTE.encode_for_retrieval to create vectors from messages.

Args:

collection_id:: ID of the collection to search within.
vectors:: A list of vectorized message for running semantic search.
topics:: A list of document_ids used to filter which documents in the collection to search.
offset:: How many chunks to skip before returning chunks.
limit:: How many chunks to return.
cut_off:: Exclude matches with distances higher than this cut off.
width:: How many chunks before and after a match to return - not implemented.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

Processes a document to either create a global or piecewise summary/extraction/transformation of a document.

Effective prompt created (excluding the system prompt):

"{pre_prompt_summary}
"""
{text from document}
"""
{prompt_summary}"

Args:

document_id:

String id of the document to create a summary from.

system_prompt:

System Prompt

pre_prompt_summary:

Prompt that goes before each large piece of text to summarize

prompt_summary:

Prompt that goes after each large piece of text to summarize

image_batch_final_prompt:

Prompt for each image batch for vision models

image_batch_image_prompt:

Prompt to reduce all answers each image batch for vision models

llm:

LLM to use

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:: temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. seed (int, default: 0) — The seed for the random number generator when sampling during generation (if temp>0 or top_k>1 or top_p<1), seed=0 picks a random seed. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. enable_vision (str, default: “auto”) - Controls vision mode, send images to the LLM in addition to text chunks. Only if have models that support vision, use get_vision_capable_llm_names() to see list. One of [“on”, “off”, “auto”]. visible_vision_models (List[str], default: [“auto”]) - Controls which vision model to use when processing images. Use get_vision_capable_llm_names() to see list. Must provide exactly one model. [“auto”] for automatic.

max_num_chunks:

Max limit of chunks to send to the summarizer

sampling_strategy:

How to sample if the document has more chunks than max_num_chunks. Options are “auto”, “uniform”, “first”, “first+last”, default is “auto” (a hybrid of them all).

pages:

List of specific pages (of the ingested document in PDF form) to use from the document. 1-based indexing.

schema:

Optional JSON schema to use for guided json generation.

keep_intermediate_results:

Whether to keep intermediate results. Default: disabled. If disabled, further LLM calls are applied to the intermediate results until one global summary is obtained: map+reduce (i.e., summary). If enabled, the results’ content will be a list of strings (the results of applying the LLM to different pieces of document context): map (i.e., extract).

guardrails_settings:

Guardrails Settings.

meta_data_to_include:

A dictionary containing flags that indicate whether each piece of document metadata is to be included as part of the context given to the LLM. Only used if enable_vision is disabled. Default is {

“name”: True, “text”: True, “page”: True, “captions”: True, “uri”: False, “connector”: False, “original_mtime”: False, “age”: False, “score”: False,

}

timeout:

Amount of time in seconds to allow the request to run. The default is 86400 seconds.

Returns:

ProcessedDocument: Processed document. The content is either a string (keep_intermediate_results=False) or a list of strings (keep_intermediate_results=True).

Raises:

TimeoutError: The request did not complete in time. SessionError: No summary or extraction created. Document wasn’t part of a collection, or LLM timed out, etc.

remove_collection_expiry_date(collection_id: str) → str

Remove an expiry date from a collection.

Args:

collection_id:: ID of the collection to update.

remove_collection_inactivity_interval(collection_id: str) → str

Remove an inactivity interval for a collection.

Args:

collection_id:: ID of the collection to update.

remove_collection_size_limit(collection_id: str) → str

Remove a size limit for a collection.

Args:

collection_id:: ID of the collection to update.

remove_collection_thumbnail(collection_id: str, timeout: float | None = None)

Remove a thumbnail from a collection.

Args:

collection_id:: Collection you want to remove the thumbnail from.
timeout:: Amount of time in seconds to allow the request to run. The default is 86400 seconds.

remove_role_from_group(group_id: str, roles: List[str]) → List[UserRole]

remove_role_from_user(user_id: str, roles: List[str]) → List[UserRole]

rename_chat_session(chat_session_id: str, name: str)

Update a chat session name

Args:

chat_session_id:: String id of the document to search for.
name:: The new chat session name.

reset_and_share_prompt_template(prompt_id: str, new_usernames: List[str]) → ShareResponseStatus

Remove all users who have access to a prompt template (except for the owner) and share it with the provided list of new users.

Args:

prompt_id:: ID of the prompt template to un-share.
new_usernames:: The list of usernames belonging to the users this prompt template will be shared with.

ShareResponseStatus: Status of share request.

reset_and_share_prompt_template_with_groups(prompt_id: str, new_groups: List[str]) → ShareResponseStatus

Remove all groups who have access to a prompt template and share it with the provided list of new group ids.

Args:

prompt_id:: ID of the prompt template to un-share.
new_groups:: The list of group ids this prompt template will be shared with.

ShareResponseStatus: Status of share request.

reset_collection_prompt_settings(collection_id: str) → str

Reset the prompt settings for a given collection.

Args:

collection_id:: ID of the collection to update.

Returns:

str: ID of the updated collection.

reset_roles_for_group(group_id: str, roles: List[str]) → List[UserRole]

reset_roles_for_user(user_id: str, roles: List[str]) → List[UserRole]

reset_user_configurations_for_user(key_name: str, user_id: str) → List[UserConfigItem]

Reset a user configuration for a specific user (to be used by admins only).

Returns:: List[UserConfigItem]: List of user configurations.

run_selftest(llm: str, mode: str) → dict

Run a self-test for a given LLM Args:

llm:
Name of LLM

mode:
one of [“quick”, “rag”, “full”, “agent”]

Returns:: Dictionary with performance stats. If “error” is filled, the test failed.

search_chunks(collection_id: str, query: str, topics: List[str], offset: int, limit: int) → List[SearchResult]

Find chunks related to a message using lexical search.

Chunks are sorted by relevance and similarity score to the message.

Args:

collection_id:: ID of the collection to search within.
query:: Question or imperative from the end user to search a collection for.
topics:: A list of document_ids used to filter which documents in the collection to search.
offset:: How many chunks to skip before returning chunks.
limit:: How many chunks to return.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

set_api_key_expiration(api_key_id: str, expires_in: str | None = None) → Result

Allows admins to set an expiration on an API key.

Args:

api_key_id:: String: The id of the API key.
expires_in:: (Optional) String: The expiration for the API key as an interval or None (to remove an expiration that was previously set). Ex. “30 days” or “30 minutes”

Returns:

Result: Status of the expiration request.

set_chat_message_votes(chat_message_id: str, votes: int) → Result

Change the vote value of a chat message.

Set the exact value of a vote for a chat message. Any message type can be updated, but only LLM response votes will be visible in the UI. The expectation is 0: unvoted, -1: dislike, 1 like. Values outside of this will not be viewable in the UI.

Args:

chat_message_id:: ID of a chat message, any message can be used but only LLM responses will be visible in the UI.
votes:: Integer value for the message. Only -1 and 1 will be visible in the UI as dislike and like respectively.

Returns:

Result: The status of the update.

Raises:

Exception: The upload request was unsuccessful.

set_chat_session_collection(chat_session_id: str, collection_id: str | None) → str

Set the collection for a chat_session

Args:

chat_session_id:: ID of the chat session
collection_id:: ID of the collection, or None to chat with the LLM only.

Returns:

str: ID of the updated chat session

set_chat_session_prompt_template(chat_session_id: str, prompt_template_id: str | None) → str

Set the prompt template for a chat_session

Args:

chat_session_id:: ID of the chat session
prompt_template_id:: ID of the prompt template to get the prompts from. None to delete and fall back to system defaults.

Returns:

str: ID of the updated chat session

set_collection_expiry_date(collection_id: str, expiry_date: str, timezone: str | None = None) → str

Set an expiry date for a collection.

Args:

collection_id:: ID of the collection to update.
expiry_date:: The expiry date as a string in ‘YYYY-MM-DD’ format.
timezone:: Optional timezone to associate with expiry date (with IANA timezone support).

set_collection_inactivity_interval(collection_id: str, inactivity_interval: int) → str

Set an inactivity interval for a collection.

Args:

collection_id:: ID of the collection to update.
inactivity_interval:: The inactivity interval as an integer number of days.

set_collection_prompt_template(collection_id: str, prompt_template_id: str | None, strict_check: bool = False) → str

Set the prompt template for a collection

Args:

collection_id:: ID of the collection to update.
prompt_template_id:: ID of the prompt template to get the prompts from. None to delete and fall back to system defaults.
strict_check:: whether to check that the collection’s embedding model and the prompt template are optimally compatible

Returns:

str: ID of the updated collection.

set_collection_size_limit(collection_id: str, limit: int | str) → str

Set a maximum limit on the total size of documents (sum) added to a collection. The limit is measured in bytes.

Args:

collection_id:: ID of the collection to update.
limit:: The bytes limit, possible values follow the format: 12345, “1GB”, or “1GiB”.

set_collection_thumbnail(collection_id: str, file_path: Path, timeout: float | None = None)

Upload an image file to be set as a collection’s thumbnail.

The image file will not be considered as a collection document. Acceptable image file types include: .png, .jpg, .jpeg, .svg

Args:

collection_id:: Collection you want to add the thumbnail to.
file_path:: Path to the image file. Must include appropriate file extension.
timeout:: Amount of time in seconds to allow the request to run.

Raises:

ValueError: The file is invalid. Exception: The upload request was unsuccessful.

set_global_configuration(key_name: str, string_value: str, can_overwrite: bool, is_public: bool, value_type: str | None = None) → List[ConfigItem]

Set a global configuration.

Note: Both default collection size limit and inactivity interval can be disabled. To do so, pass ‘-1’ as the string_value.

Args:

key_name:: The name of the global config key.
string_value:: The value to be set for the global config.
can_overwrite:: Whether user settings can override this global setting.
is_public:: Whether users can see the value for this global setting.
value_type:: The type of the value to be set for the global config.

Returns:

List[ConfigItem]: List of global configurations.

set_user_configuration_for_user(key_name: str, string_value: str, user_id: str, value_type: str | None = None) → List[UserConfigItem]

Set a user configuration for a specific user (overrides the global configuration and to be used by admins only).

Note: Both default collection size limit and inactivity interval can be disabled. To do so, pass ‘-1’ as the string_value.

Args:

key_name:: The name of the global config key.
string_value:: The value to be set for the config.
user_id:: The user id you want to apply the config for.
value_type:: The type of the value to be set for the config.

Returns:

List[UserConfigItem]: List of user configurations.

share_collection(collection_id: str, permission: SharePermission) → ShareResponseStatus

Share a collection to a user.

The permission attribute defined the level of access, and who can access the collection, the collection_id attribute denotes the collection to be shared.

Args:

collection_id:: ID of the collection to share.
permission:: Defines the rule for sharing, i.e. permission level.

Returns:

ShareResponseStatus: Status of share request.

share_collection_with_group(collection_id: str, permission: GroupSharePermission) → ShareResponseStatus

Share a collection to a group.

The permission attribute defines the level of access, and which group can access the collection, the collection_id attribute denotes the collection to be shared.

Args:

collection_id:: ID of the collection to share.
permission:: Defines the rule for sharing, i.e. permission level and group.

Returns:

ShareResponseStatus: Status of share request.

share_prompt(prompt_id: str, permission: SharePermission) → ShareResponseStatus

Share a prompt template to a user.

Args:

prompt_id:: ID of the prompt template to share.
permission:: Defines the rule for sharing, i.e. permission level.

Returns:

ShareResponseStatus: Status of share request.

share_prompt_with_group(prompt_id: str, permission: GroupSharePermission) → ShareResponseStatus

Share a prompt to a group.

Args:

prompt_id:: ID of the prompt to share.
permission:: Specific permissions for a group.

Returns:

ShareResponseStatus: Status of share request.

Summarize one or more contexts using an LLM.

Effective prompt created (excluding the system prompt):
"{pre_prompt_summary}
"""
{text_context_list}
"""
{prompt_summary}"
Args:

text_context_list:
List of raw text strings to be summarized.

system_prompt:
Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto for the model default or None for h2oGPTe defaults. Defaults to ‘’ for no system prompt.

pre_prompt_summary:
Text that is prepended before the list of texts. The default can be customized per environment, but the standard default is "In order to write a concise single-paragraph or bulleted list summary, pay attention to the following text:\n"

prompt_summary:
Text that is appended after the list of texts. The default can be customized per environment, but the standard default is "Using only the text above, write a condensed and concise summary of key results (preferably as bullet points):\n"

llm:
Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:
temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

guardrails_settings:
Guardrails Settings.

timeout:
Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:
h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: Answer: The response text and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

summarize_document(*args, **kwargs) → DocumentSummary

tag_document(document_id: str, tag_name: str) → str

Adds a tag to a document.

Args:

document_id:: String id of the document to attach the tag to.
tag_name:: String representing the tag to attach.

Returns:

String: The id of the newly created tag.

unarchive_collection(collection_id: str) → str

Restore an archived collection to an active status.

Args:

collection_id:: ID of the collection to restore.

unshare_collection(collection_id: str, permission: SharePermission) → ShareResponseStatus

Remove sharing of a collection to a user.

The permission attribute defined the level of access, and who can access the collection, the collection_id attribute denotes the collection to be shared.

In case of un-sharing, the SharePermission’s user is sufficient.

Args:

collection_id:: ID of the collection to un-share.
permission:: Defines the user for which collection access is revoked.

ShareResponseStatus: Status of share request.

unshare_collection_for_all(collection_id: str) → ShareResponseStatus

Remove sharing of a collection to all other users but the original owner.

Args:

collection_id:: ID of the collection to un-share.

ShareResponseStatus: Status of share request.

unshare_collection_from_group(collection_id: str, permission: GroupSharePermission) → ShareResponseStatus

Remove sharing of a collection from a group.

The permission attribute defines which group to remove access from, the collection_id attribute denotes the collection to be unshared. In case of un-sharing, the GroupSharePermission’s group_id is sufficient.

Args:

collection_id:: ID of the collection to un-share.
permission:: Defines the group for which collection access is revoked.

Returns:

ShareResponseStatus: Status of share request.

unshare_prompt(prompt_id: str, permission: SharePermission) → ShareResponseStatus

Remove sharing of a prompt template to a user.

Args:

prompt_id:: ID of the prompt template to un-share.
permission:: Defines the user for which collection access is revoked.

ShareResponseStatus: Status of share request.

unshare_prompt_for_all(prompt_id: str) → ShareResponseStatus

Remove sharing of a prompt template to all other users but the original owner (owner action only).

Args:

prompt_id:: ID of the prompt template to un-share.

ShareResponseStatus: Status of share request.

unshare_prompt_from_group(prompt_id: str, permission: GroupSharePermission) → ShareResponseStatus

Unshare a prompt from a group.

Args:

prompt_id:: ID of the prompt to un-share.
permission:: Specific permissions for a group.

Returns:

ShareResponseStatus: Status of share request.

untag_document(document_id: str, tag_name: str) → str

Removes an existing tag from a document.

Args:

document_id:: String id of the document to remove the tag from.
tag_name:: String representing the tag to remove.

Returns:

String: The id of the removed tag.

update_agent_key(key_id: str, name: str | None = None, value: str | None = None, key_type: str | None = None, description: str | None = None) → dict | None

update_agent_tool_preference(reference_value: List[str]) → None

update_collection(collection_id: str, name: str, description: str) → str

Update the metadata for a given collection.

All variables are required. You can use h2ogpte.get_collection(<id>).name or description to get the existing values if you only want to change one or the other.

Args:

collection_id:: ID of the collection to update.
name:: New name of the collection, this is required.
description:: New description of the collection, this is required.

Returns:

str: ID of the updated collection.

update_collection_metadata(collection_id: str, collection_metadata: dict) → str

Set the new collection metadata overwriting the existing metadata. Be careful not to delete any settings you want to keep.

Args:

collection_id:: ID of the collection to update.
collection_metadata:: Dictionary containing the new collection metadata.

Returns:

str: ID of the updated collection.

update_collection_rag_type(collection_id: str, name: str, description: str, rag_type) → str

Update the metadata for a given collection.

All variables are required. You can use h2ogpte.get_collection(<id>).name or description to get the existing values if you only want to change one or the other.

Args:

collection_id:

ID of the collection to update.

name:

New name of the collection, this is required.

description:

New description of the collection, this is required.

rag_type: str one of

"auto" Automatically select the best rag_type. "llm_only" LLM Only - Answer the query without any supporting document contexts.

Requires 1 LLM or Agent call.

"agent_only" Agent Only - Answer the query with only original files passed to agent.: Requires 1 Agent call.
"rag" RAG (Retrieval Augmented Generation) - Use supporting document contexts: to answer the query. Requires 1 LLM or Agent call.
"hyde1" LLM Only + RAG composite - HyDE RAG (Hypothetical Document Embedding).: Use ‘LLM Only’ response to find relevant contexts from a collection for generating a response. Requires 2 LLM calls.
"hyde2" HyDE + RAG composite - Use the ‘HyDE RAG’ response to find relevant: contexts from a collection for generating a response. Requires 3 LLM calls.
"rag+" Summary RAG - Like RAG, but uses more context and recursive: summarization to overcome LLM context limits. Keeps all retrieved chunks, puts them in order, adds neighboring chunks, then uses the summary API to get the answer. Can require several LLM calls.
"all_data" All Data RAG - Like Summary RAG, but includes all document: chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

Returns:

str: ID of the updated collection.

update_collection_settings(collection_id: str, collection_settings: dict) → str

Set the new collection settings, must be complete. Be careful not to delete any settings you want to keep.

Args:

collection_id:: ID of the collection to update.
collection_settings:: Dictionary containing the new collection settings.

Returns:

str: ID of the updated collection.

update_document_metadata(document_id: str, document_metadata: dict) → str

Set the new document metadata overwriting the existing metadata. Be careful not to delete any settings you want to keep.

Args:

document_id:: ID of the document to update.
document_metadata:: Dictionary containing the new document metadata.

Returns:

str: ID of the updated document.

update_document_name(document_id: str, name: str) → str

Update the name metadata for a given document.

Args:

document_id:: ID of the document to update.
name:: New name of the document, must include file extension.

Returns:

str: ID of the updated document.

update_document_uri(document_id: str, uri: str) → str

Update the URI metadata for a given document.

Args:

document_id:: ID of the document to update.
uri:: New URI of the document, this is required.

Returns:

str: ID of the updated document.

update_prompt_template(id: str, name: str, description: str | None = None, lang: str | None = None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, hyde_no_rag_llm_prompt_extension: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, system_prompt_reflection: str | None = None, pre_prompt_reflection: str | None = None, prompt_reflection: str | None = None, auto_gen_description_prompt: str | None = None, auto_gen_document_summary_pre_prompt_summary: str | None = None, auto_gen_document_summary_prompt_summary: str | None = None, auto_gen_document_sample_questions_prompt: str | None = None, default_sample_questions: List[str] | None = None, image_batch_image_prompt: str | None = None, image_batch_final_prompt: str | None = None) → str

Update a prompt template

Args:

id:: String ID of the prompt template to update
name:: Name of the prompt template
description:: Description of the prompt template
lang:: Language code
system_prompt:: System Prompt
pre_prompt_query:: Text that is prepended before the contextual document chunks.
prompt_query:: Text that is appended to the beginning of the user’s message.
hyde_no_rag_llm_prompt_extension:: LLM prompt extension.
pre_prompt_summary:: Prompt that goes before each large piece of text to summarize
prompt_summary:: Prompt that goes after each large piece of text to summarize
system_prompt_reflection:: System Prompt for self-reflection
pre_prompt_reflection:: Deprecated - ignored
prompt_reflection:: Template for self-reflection, must contain two occurrences of %s for full previous prompt (including system prompt, document related context and prompts if applicable, and user prompts) and answer
auto_gen_description_prompt:: prompt to create a description of the collection.
auto_gen_document_summary_pre_prompt_summary:: pre_prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_summary_prompt_summary:: prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_sample_questions_prompt:: prompt to create sample questions for a freshly imported document (if enabled).
default_sample_questions:: default sample questions in case there are no auto-generated sample questions.
image_batch_final_prompt:: Prompt for each image batch for vision models
image_batch_image_prompt:: Prompt to reduce all answers each image batch for vision models

Returns:

str: The ID of the updated prompt template.

update_question_reply_feedback(reply_id: str, expected_answer: str, user_comment: str)

Update feedback for a specific answer to a question.

Args:

reply_id:: UUID of the reply.
expected_answer:: Expected answer.
user_comment:: User comment.

Returns:

None

update_tag(tag_name: str, description: str, format: str) → str

Updates a tag.

Args:

tag_name:: String representing the tag to update.
description:: String describing the tag.
format:: String representing the format of the tag.

Returns:

String: The id of the updated tag.

upload(file_name: str, file: Any, uri: str | None = None) → str

Upload a file to the H2OGPTE backend.

Uploaded files are not yet accessible and need to be ingested into a collection.

See Also:

ingest_uploads: Add the uploaded files to a collection. delete_upload: Delete uploaded file

Args:

file_name:: What to name the file on the server, must include file extension.
file:: File object to upload, often an opened file from with open(…) as f.
uri:: Optional - URI you would like to associate with the file.

Returns:

str: The upload id to be used in ingest jobs.

Raises:

Exception: The upload request was unsuccessful.

class h2ogpte.H2OGPTEAsync(address: str, api_key: str | None = None, token_provider: AsyncTokenProvider | None = None, verify: bool | str = True, strict_version_check: bool = False)

Bases: object

Connect to and interact with an h2oGPTe server, via an async interface.

INITIAL_WAIT_INTERVAL = 0.1

MAX_WAIT_INTERVAL = 1.0

TIMEOUT = 3600.0

WAIT_BACKOFF_FACTOR = 1.4

async add_agent_key(agent_keys: List[dict]) → List[dict]

async add_role_to_group(group_id: str, roles: List[str]) → List[UserRole]

async add_role_to_user(user_id: str, roles: List[str]) → List[UserRole]

async add_user_document_permission(user_id: str, document_id: str) → [<class 'str'>, <class 'str'>]

Associates a user with a document they have permission on. Args:

user_id:
String The id of the user that has the permission.

document_id:
String The id of the document that the permission is for.

Returns:: [user_id, document_id]: A tuple containing the user_id and document_id.

Send a message and get a response from an LLM.

Note: This method is only recommended if you are passing a chat conversation or for low-volume testing. For general chat with an LLM, we recommend session.query() for higher throughput in multi-user environments. The following code sample shows the recommended method:
# Establish a chat session
chat_session_id = client.create_chat_session()
# Connect to the chat session
with client.connect(chat_session_id) as session:
    # Send a basic query and print the reply
    reply = session.query("Hello", timeout=60)
    print(reply.content)
Format of inputs content:
{text_context_list}
"""\n{chat_conversation}{question}
Args:

question:
Text query to send to the LLM.

text_context_list:
List of raw text strings to be included, will be converted to a string like this: “

“.join(text_context_list)

system_prompt:

Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto for the model default, or None for h2oGPTe default. Defaults to ‘’ for no system prompt.

pre_prompt_query:

Text that is prepended before the contextual document chunks in text_context_list. Only used if text_context_list is provided.

prompt_query:

Text that is appended after the contextual document chunks in text_context_list. Only used if text_context_list is provided.

llm:

Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:: temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

chat_conversation:

List of tuples for (human, bot) conversation that will be pre-appended to an (question, None) case for a query.

guardrails_settings:

Guardrails Settings.

timeout:

Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:: h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: Answer: The response text and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

async archive_collection(collection_id: str) → str

Archive a collection along with its associated data.

Args:

collection_id:: ID of the collection to archive.

async assign_agent_key_for_tool(tool_dict_list: List[dict]) → List[Tuple] | None

async assign_permissions_to_role(role_name: str, permission_names: Iterable[str]) → Result

async bulk_delete_global_configurations(key_names: List[str]) → List[ConfigItem]

async bulk_delete_user_configurations_for_user(user_id: str, key_names: List[str]) → List[UserConfigItem]

async cancel_job(job_id: str) → Result

Stops a specific job from running on the server.

Args:

job_id:: String id of the job to cancel.

Returns:

Result: Status of canceling the job.

async cancel_user_job(job_id: str) → Result

As an admin, stops a specific user job from running on the server.

Args:

job_id:: String id of the job to cancel.

Returns:

Result: Status of canceling the job.

async close()

connect(chat_session_id: str, prompt_template_id: str | None = None, open_timeout: int = 10, close_timeout: int = 10, max_connect_retries: int = 10, connect_retry_delay: int = 0.5, connect_retry_max_delay: int = 60) → SessionAsync

Create and participate in a chat session. This is a live connection to the H2OGPTE server contained to a specific chat session on top of a single collection of documents. Users will find all questions and responses in this session in a single chat history in the UI.

Args:

chat_session_id:: ID of the chat session to connect to.
prompt_template_id:: ID of the prompt template to use.
open_timeout:: Timeout in seconds for opening the connection.
close_timeout:: Timeout in seconds for closing the connection.
max_connect_retries:: Maximum number of connection retry attempts.
connect_retry_delay:: Initial delay in seconds between connection retries.
connect_retry_max_delay:: Maximum delay in seconds between connection retries.

Returns:

Session: Live chat session connection with an LLM.

async count_assets() → ObjectCount

Counts number of objects owned by the user.

Returns:: ObjectCount: The count of chat sessions, collections, and documents.

async count_chat_sessions() → int

Counts number of chat sessions owned by the user.

Returns:: int: The count of chat sessions owned by the user.

async count_chat_sessions_for_collection(collection_id: str) → int

Counts number of chat sessions in a specific collection.

Args:

collection_id:: String id of the collection to count chat sessions for.

Returns:

int: The count of chat sessions in that collection.

async count_collections() → int

Counts number of collections owned by the user.

Returns:: int: The count of collections owned by the user.

async count_documents() → int

Counts number of documents accessed by the user.

Returns:: int: The count of documents accessed by the user.

async count_documents_in_collection(collection_id: str) → int

Counts the number of documents in a specific collection.

Args:

collection_id:: String id of the collection to count documents for.

Returns:

int: The number of documents in that collection.

async count_documents_owned_by_me() → int

Counts number of documents owned by the user.

Returns:: int: The count of documents owned by the user.

async count_prompt_templates() → int

Counts number of prompt templates

Returns:: int: The count of prompt templates

async count_question_reply_feedback() → int

Fetch user’s questions and answers with feedback count.

Returns:: int: the count of questions and replies that have a user feedback.

async create_api_key_for_user(user_id: str, name: str | None = None, collection_id: str | None = None, expires_in: str | None = None) → str

Allows admins to create a new api key for a specific user and optionally make it specific to a collection. Args:

user_id:
String: The id of the user the API key is for.

name:
(Optional) String: The name of the API key.

collection_id:
(Optional) String: The id of the specific collection.

expires_in:
(Optional) String: The expiration for the API key as an interval. Ex. “30 days” or “30 minutes”

Returns:: String: The id of the API key.

async create_chat_session(collection_id: str | None = None) → str

Creates a new chat session for asking questions (of documents).

Args:

collection_id:: String id of the collection to chat with. If None, chat with LLM directly.

Returns:

str: The ID of the newly created chat session.

async create_chat_session_on_default_collection() → str

Creates a new chat session for asking questions of documents on the default collection.

Returns:: str: The ID of the newly created chat session.

async create_collection(name: str, description: str, embedding_model: str | None = None, prompt_template_id: str | None = None, collection_settings: dict | None = None, thumbnail: Path | None = None, chat_settings: dict | None = None) → str

Creates a new collection.

Args:

name:

Name of the collection.

description:

Description of the collection

embedding_model:

embedding model to use. call list_embedding_models() to list of options.

prompt_template_id:

ID of the prompt template to get the prompts from. None to fall back to system defaults.

collection_settings:

(Optional) Dictionary with key/value pairs to configure certain collection specific settings max_tokens_per_chunk: Approximate max. number of tokens per chunk for text-dominated document pages. For images, chunks can be larger. chunk_overlap_tokens: Approximate number of tokens that are overlapping between successive chunks. gen_doc_summaries: Whether to auto-generate document summaries (uses LLM) gen_doc_questions: Whether to auto-generate sample questions for each document (uses LLM) audio_input_language: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices. ocr_model: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models.

Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. auto - Automatic will auto-select the best OCR model for every page. off - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).

tesseract_lang: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices. keep_tables_as_one_chunk: When tables are identified by the table parser the table tokens will be kept in a single chunk. chunk_by_page: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is true. handwriting_check: Check pages for handwriting. Will use specialized models if handwriting is found. follow_links: Whether to import all web pages linked from this URL will be imported. External links will be ignored. Links to other pages on the same domain will be followed as long as they are at the same level or below the URL you specify. Each page will be transformed into a PDF document. max_depth: Max depth of recursion when following links, only when follow_links is True. Max_depth of 0 means don’t follow any links, max_depth of 1 means follow only top-level links, etc. Use -1 for automatic (system settings). max_documents: Max number of documents when following links, only when follow_links is True. Use None for automatic (system defaults). Use -1 for max (system limit). root_dir: Root directory for document storage copy_document: Whether to copy the document when importing an existing document. guardrails_settings itself is a dictionary of the following keys.

column_redaction_config: list of list for redacting columns from CSV/TSV files (regex_pattern, fill_value) disallowed_regex_patterns: list of regular expressions that match custom PII presidio_labels_to_flag: list of entities to be flagged as PII by the built-in Presidio model. pii_labels_to_flag: list of entities to be flagged as PII by the built-in PII model. pii_detection_parse_action: what to do when PII is detected during parsing of documents. One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content in the ingested documents with redaction bars. pii_detection_llm_input_action: what to do when PII is detected in the input to the LLM (document content and user prompts). One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content with placeholders. pii_detection_llm_output_action: what to do when PII is detected in the output of the LLM. One of [“allow”, “redact”, “fail”]. Redact will replace disallowed content with placeholders. prompt_guard_labels_to_flag: list of entities to be flagged as safety violations in user prompts by the built-in prompt guard model. guardrails_labels_to_flag: list of entities to be flagged as safety violations in user prompts. Must be a subset of guardrails_entities, if provided. guardrails_safe_category: (Optional) name of the safe category for guardrails. Must be a key in guardrails_entities, if provided. Otherwise uses system defaults. guardrails_entities: (Optional) dictionary of entities and their descriptions for the guardrails model to classify. The first entry is the “safe” class, the rest are “unsafe” classes. column_redaction_custom_entities_to_flag: list of entities to redact in tabular data files. Must be a subset of column_redaction_custom_entities, if provided. column_redaction_custom_entities: (Optional) dictionary of entities and a short description for the LLM to check for and redact columns containing PII in tabular data files. guardrails_llm: LLM to use for guardrails and PII detection. Use “auto” for automatic. Use H2OGPTE.get_llms() to see all available options.

Example:

Note: Call client.get_guardrails_settings() to see all options for guardrails_settings.

collection_settings=dict(

max_tokens_per_chunk=320, chunk_overlap_tokens=0, guardrails_settings=dict(

disallowed_regex_patterns=[“secret_disallowed_word”, r”(?!0{3})(?!6{3})[0-8]d{2}-(?!0{2})d{2}-(?!0{4})d{4}”], presidio_labels_to_flag=[“IBAN_CODE”, “US_SSN”], pii_labels_to_flag=[“IBAN”, “SSN”], pii_detection_parse_action=”redact”, pii_detection_llm_input_action=”redact”, pii_detection_llm_output_action=”allow”, prompt_guard_labels_to_flag=[“JAILBREAK”], guardrails_labels_to_flag=[

“Violent Crimes”, “Non-Violent Crimes”, “Intellectual Property”, “Code Interpreter Abuse”,

], guardrails_safe_category=”Safe”, guardrails_entities={

“Safe”: “Messages that do not contain any of the following unsafe content”, “Violent Crimes”: “Messages that enable, encourage, or endorse the commission of violent crimes, including: (1) unlawful violence toward people (ex: terrorism, genocide, murder, hate-motivated violent crimes, child abuse, assault, battery, kidnapping) and (2) unlawful violence toward animals (ex: animal abuse)”, “Non-Violent Crimes”: “Messages that enable, encourage, or endorse the commission of non-violent crimes, including: (1) personal crimes (ex: labor trafficking, threats, intimidation, hate-motivated non-violent crimes), (2) financial crimes (ex: fraud, scams, money laundering), (3) property crimes (ex: theft, arson, vandalism), (4) drug crimes (ex: creating or using regulated substances), (5) weapons crimes (ex: producing unlicensed firearms), and (6) cyber crimes (ex: hacking)”, “Defamation”: “Messages that are both verifiably false and likely to injure a living person’s reputation”, “Specialized Advice”: “Messages that contain specialized financial, medical, or legal advice, or that indicate dangerous activities or objects are safe”, “Intellectual Property”: “Messages that may violate the intellectual property rights of any third party”, “Code Interpreter Abuse”: “Messages that seek to abuse code interpreters, including those that enable denial of service attacks, container escapes or privilege escalation exploits”,

}, column_redaction_custom_entities_to_flag=[

“Mother’s Maiden Name”

], column_redaction_custom_entities={

“Mother’s Maiden Name”: “Mother’s maiden name.”

}, guardrails_llm=”meta-llama/Llama-3.3-70B-Instruct”,

),

)

thumbnail:

(Optional) Path to the thumbnail image for the collection. Must include appropriate file extension.

chat_settings:

(Optional) Dictionary with key/value pairs to configure the default values for certain chat specific settings The following keys are supported, see the client.session() documentation for more details. llm: str llm_args: dict self_reflection_config: dict rag_config: dict include_chat_history: bool tags: list[str]

Returns:

str: The ID of the newly created collection.

async create_prompt_template(name: str, description: str | None = None, lang: str | None = None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, hyde_no_rag_llm_prompt_extension: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, system_prompt_reflection: str | None = None, pre_prompt_reflection: str | None = None, prompt_reflection: str | None = None, auto_gen_description_prompt: str | None = None, auto_gen_document_summary_pre_prompt_summary: str | None = None, auto_gen_document_summary_prompt_summary: str | None = None, auto_gen_document_sample_questions_prompt: str | None = None, default_sample_questions: List[str] | None = None, image_batch_image_prompt: str | None = None, image_batch_final_prompt: str | None = None) → str

Create a new prompt template

Args:

name:: Name of the prompt template
description:: Description of the prompt template
lang:: Language code
system_prompt:: System Prompt
pre_prompt_query:: Text that is prepended before the contextual document chunks.
prompt_query:: Text that is appended to the beginning of the user’s message.
hyde_no_rag_llm_prompt_extension:: LLM prompt extension.
pre_prompt_summary:: Prompt that goes before each large piece of text to summarize
prompt_summary:: Prompt that goes after each large piece of text to summarize
system_prompt_reflection:: System Prompt for self-reflection
pre_prompt_reflection:: Deprecated - ignored
prompt_reflection:: Template for self-reflection, must contain two occurrences of %s for full previous prompt (including system prompt, document related context and prompts if applicable, and user prompts) and answer
auto_gen_description_prompt:: prompt to create a description of the collection.
auto_gen_document_summary_pre_prompt_summary:: pre_prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_summary_prompt_summary:: prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_sample_questions_prompt:: prompt to create sample questions for a freshly imported document (if enabled).
default_sample_questions:: default sample questions in case there are no auto-generated sample questions.
image_batch_final_prompt:: Prompt for each image batch for vision models
image_batch_image_prompt:: Prompt to reduce all answers each image batch for vision models

Returns:

str: The ID of the newly created prompt template.

async create_tag(tag_name: str) → str

Creates a new tag.

Args:

tag_name:: String representing the tag to create.

Returns:

String: The id of the created tag.

async create_topic_model(collection_id: str, timeout: float | None = None) → Job

async create_user_group(name: str, description: str) → UserGroup

async create_user_role(name: str, description: str) → UserRole

async deactivate_api_key(api_key_id: str) → Result

Allows admins to deactivate an API key.

Note: You cannot undo this action.

Args:

api_key_id:: String: The id of the API key.

Returns:

Result: Status of the deactivate request.

async delete_agent_directories(chat_session_id: str) → bool

async delete_agent_keys(key_ids: List[str]) → None

async delete_agent_server_files(chat_session_id: str) → bool

async delete_agent_tool_association(associate_ids: List[str]) → int

async delete_agent_tool_preference() → None

async delete_api_keys(api_key_ids: List[str]) → Result

Allows admins to delete API keys.

Args:

api_key_ids:: List[str]: The API keys to delete.

Returns:

Result: Status of the delete request.

async delete_chat_messages(chat_message_ids: Iterable[str]) → Result

Deletes specific chat messages.

Args:

chat_message_ids:: List of string ids of chat messages to delete from the system.

Returns:

Result: Status of the delete job.

async delete_chat_sessions(chat_session_ids: Iterable[str], timeout: float | None = None) → Job

Deletes chat sessions and related messages.

Args:

chat_session_ids:: List of string ids of chat sessions to delete from the system.
timeout:: Timeout in seconds.

Returns:

Result: The delete job.

async delete_collections(collection_ids: Iterable[str], timeout: float | None = None) → Job

Deletes collections from the environment.

Documents in the collection that are owned by other users will not be deleted.

Args:

collection_ids:: List of string ids of collections to delete from the system.
timeout:: Timeout in seconds.

async delete_collections_as_admin(collection_ids: Iterable[str], timeout: float | None = None) → Job

Deletes collections and their associated data from the environment (needs appropriate permission).

Args:

collection_ids:: List of string ids of collections to delete from the system.
timeout:: Timeout in seconds.

async delete_document_summaries(summaries_ids: Iterable[str]) → Result

Deletes document summaries.

Args:

summaries_ids:: List of string ids of a document summary to delete from the system.

Returns:

Result: Status of the delete job.

async delete_documents(document_ids: Iterable[str], timeout: float | None = None) → Job

Deletes documents from the system.

Args:

document_ids:: List of string ids to delete from the system and all collections.
timeout:: Timeout in seconds.

async delete_documents_from_collection(collection_id: str, document_ids: Iterable[str], timeout: float | None = None) → Job

Removes documents from a collection.

See Also: H2OGPTE.delete_documents for completely removing the document from the environment.

Args:

collection_id:: String of the collection to remove documents from.
document_ids:: List of string ids to remove from the collection.
timeout:: Timeout in seconds.

async delete_multiple_agent_directories(chat_session_ids: List[str], dir_types: List[str]) → bool

async delete_prompt_templates(ids: Iterable[str]) → Result

Deletes prompt templates

Args:

ids:: List of string ids of prompte templates to delete from the system.

Returns:

Result: Status of the delete job.

async delete_upload(upload_id: str) → str

Delete a file previously uploaded with the “upload” method.

See Also:

upload: Upload the files into the system to then be ingested into a collection. ingest_uploads: Add the uploaded files to a collection.

Args:

upload_id:: ID of a file to remove

Returns:

upload_id: The upload id of the removed.

Raises:

Exception: The delete upload request was unsuccessful.

async delete_user_groups_by_ids(groups_ids: Iterable[str]) → Result

async delete_user_groups_by_names(groups_names: Iterable[str]) → Result

async delete_user_roles_by_ids(roles_ids: Iterable[str]) → Result

async delete_user_roles_by_names(roles_names: Iterable[str]) → Result

async download_document(destination_directory: str | Path, destination_file_name: str, document_id: str) → Path

Downloads a document to a local system directory.

Args:

destination_directory:: Destination directory to save file into.
destination_file_name:: Destination file name.
document_id:: Document ID.

Returns:

Path: Path of downloaded document

async download_reference_highlighting(message_id: str, destination_directory: str, output_type: str = 'combined', limit: int | None = None) → list

Get PDFs with reference highlighting

Args:

message_id:: ID of the message to get references from
destination_directory:: Destination directory to save files into.
output_type: str one of: "combined" Generates a PDF file for each source document, with all relevant chunks highlighted in each respective file. This option consolidates all highlights for each source document into a single PDF, making it easy to view all highlights related to that document at once. "split" Generates a separate PDF file for each chunk, with only the relevant chunk highlighted in each file. This option is useful for focusing on individual sections without interference from other parts of the text. The output files names will be in the format “{document_id}_{chunk_id}.pdf”
limit:: The number of references to consider based on the highest confidence scores.

Returns:

list[Path]: List of paths of downloaded documents with highlighting

async encode_for_retrieval(chunks: Iterable[str], embedding_model: str | None = None) → List[List[float]]

Encode texts for semantic searching.

See Also: H2OGPTE.match for getting a list of chunks that semantically match each encoded text.

Args:

chunks:: List of strings of texts to be encoded.
embedding_model:: embedding model to use. call list_embedding_models() to list of options.

Returns:

List of list of floats: Each list in the list is the encoded original text.

Extract information from one or more contexts using an LLM.

pre_prompt_extract and prompt_extract variables must be used together. If these variables are not set, the inputs texts will be summarized into bullet points.

Format of extract content:
"{pre_prompt_extract}"""
{text_context_list}
"""\n{prompt_extract}"
Examples:
extract = h2ogpte.extract_data(
    text_context_list=chunks,
    pre_prompt_extract="Pay attention and look at all people. Your job is to collect their names.\n",
    prompt_extract="List all people's names as JSON.",
)
Args:

text_context_list:
List of raw text strings to extract data from.

system_prompt:
Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto or None for the model default. Defaults to ‘’ for no system prompt.

pre_prompt_extract:
Text that is prepended before the list of texts. If not set, the inputs will be summarized.

prompt_extract:
Text that is appended after the list of texts. If not set, the inputs will be summarized.

llm:
Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:
temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

guardrails_settings:
Guardrails Settings.

timeout:
Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:
h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: ExtractionAnswer: The list of text responses and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

async get_agent_key_tool_associations() → List[dict]

async get_agent_keys() → List[dict]

async get_agent_server_files(chat_session_id: str) → List[dict]

async get_agent_tool_preference() → List[str]

async get_agent_tools_dict() → dict

async get_all_directory_stats(chat_session_id: str, detail_level: int = 0) → dict

async get_chat_session_prompt_template(chat_session_id: str) → PromptTemplate | None

Get the prompt template for a chat_session

Args:

chat_session_id:: ID of the chat session

Returns:

str: ID of the prompt template.

async get_chat_session_questions(chat_session_id: str, limit: int) → List[SuggestedQuestion]

List suggested questions

Args:

chat_session_id:: A chat session ID of which to return the suggested questions
limit:: How many questions to return.

Returns:

List: A list of questions.

async get_chunks(collection_id: str, chunk_ids: Iterable[int]) → List[Chunk]

Get the text of specific chunks in a collection.

Args:

collection_id:: String id of the collection to search in.
chunk_ids:: List of ints for the chunks to return. Chunks are indexed starting at 1.

Returns:

Chunk: The text of the chunk.

Raises:

Exception: One or more chunks could not be found.

async get_collection(collection_id: str) → Collection

Get metadata about a collection.

Args:

collection_id:: String id of the collection to search for.

Returns:

Collection: Metadata about the collection.

Raises:

KeyError: The collection was not found.

async get_collection_for_chat_session(chat_session_id: str) → Collection

Get metadata about the collection of a chat session.

Args:

chat_session_id:: String id of the chat session to search for.

Returns:

Collection: Metadata about the collection.

async get_collection_prompt_template(collection_id: str) → PromptTemplate | None

Get the prompt template for a collection

Args:

collection_id:: ID of the collection

Returns:

str: ID of the prompt template.

async get_collection_questions(collection_id: str, limit: int) → List[SuggestedQuestion]

List suggested questions

Args:

collection_id:: A collection ID of which to return the suggested questions
limit:: How many questions to return.

Returns:

List: A list of questions.

async get_default_collection() → CollectionInfo

Get the default collection, to be used for collection API-keys.

Returns:: CollectionInfo: Default collection info.

async get_directory_stats(directory_name: str, chat_session_id: str, detail_level: int = 0) → dict

async get_document(document_id: str, include_layout: bool = False) → Document

Fetches information about a specific document.

Args:

document_id:: String id of the document.
include_layout:: Include the layout prediction results.

Returns:

Document: Metadata about the Document.

Raises:

KeyError: The document was not found.

async get_document_content(file_name: str, document_id: str) → bytes

Downloads a document and return its content as a byte array.

Args:

file_name:: File name.
document_id:: Document ID.

Returns:

Path: File content

async get_global_configurations() → List[ConfigItem]

async get_global_configurations_by_admin() → List[ConfigItem]

async get_guardrails_settings(action: str = 'redact', sensitive: bool = True, non_sensitive: bool = True, all_guardrails: bool = True, guardrails_settings: dict | None = None) → Dict[str, str | List[str]]: Helper to get reasonable (easy to use) defaults for Guardrails/PII settings. To be further customized. :param action: what to do when detecting PII, either “redact” or “fail” (“allow” would keep PII intact). Guardrails models always fail upon detecting safety violations. :param sensitive: whether to include the most sensitive PII entities like SSN, bank account info :param non_sensitive: whether to include all non-sensitive PII entities, such as IP addresses, locations, names, e-mail addresses etc. :param all_guardrails: whether to include all possible entities for prompt guard and guardrails models, or just system defaults :param guardrails_settings: existing guardrails settings (e.g., from collection settings) to obtain guardrails entities, guardrails_entities_to_flag, column redaction custom_pii_entities, column_redaction_pii_to_flag from instead of system defaults :return: dictionary to pass to collection creation or process_document method

async get_h2ogpt_system_stats() → dict

async get_job(job_id: str) → Job

Fetches information about a specific job.

Args:

job_id:: String id of the job.

Returns:

Job: Metadata about the Job.

async get_llm_and_auto_reasoning_llm_names() → Dict[str, str]

Get mapping of llm to its reasoning_model when [“auto”] is passed as visible_reasoning_models

Returns:: dictionary {‘llm1’: ‘llm1_reasoning_llm’, etc.}

async get_llm_and_auto_vision_llm_names() → Dict[str, str]

Get mapping of llm to its vision_model when [“auto”] is passed as visible_vision_models

Returns:: dictionary {‘llm1’: ‘llm1_vision_llm’, etc.}

async get_llm_name_for_rest(llm: str | int | None)

async get_llm_names() → List[str]

Lists names of available LLMs in the environment.

Returns:: list of string: Name of each available model.

async get_llm_performance_by_llm(interval: str) → List[LLMPerformance]

async get_llm_usage_24h() → float

async get_llm_usage_24h_by_llm() → List[LLMUsage]

async get_llm_usage_24h_with_limits() → LLMUsageLimit

async get_llm_usage_6h() → float

async get_llm_usage_6h_by_llm() → List[LLMUsage]

async get_llm_usage_by_llm(interval: str) → List[LLMUsage]

async get_llm_usage_by_llm_and_user(interval: str) → List[LLMWithUserUsage]

async get_llm_usage_by_user(interval: str) → List[UserWithLLMUsage]

async get_llm_usage_with_limits(interval: str) → LLMUsageLimit

async get_llms() → List[Dict[str, Any]]

Lists metadata information about available LLMs in the environment.

Returns:: list of dict (string, ANY): Name and details about each available model.

async get_meta() → Meta

Returns information about the environment and the user.

Returns:: Meta: Details about the version and license of the environment and the user’s name and email.

async get_prompt_template(id: str | None = None) → PromptTemplate

Get a prompt template

Args:

id:: String id of the prompt template to retrieve or None for default

Returns:

PromptTemplate: prompts

Raises:

KeyError: The prompt template was not found.

async get_reasoning_capable_llm_names() → List[str]

Lists names of available reasoning-capable (that can natively reason) in the environment.

Returns:: list of string: Name of each available model.

async get_scheduler_stats() → SchedulerStats

Count the number of global, pending jobs on the server.

Returns:: SchedulerStats: The queue length for number of jobs.

async get_tag(tag_name: str) → Tag

Returns an existing tag.

Args:

tag_name:: String The name of the tag to retrieve.

Returns:

Tag: The requested tag.

Raises:

KeyError: The tag was not found.

async get_user_all_agent_directories(offset: int, limit: int, filter_text: str | None) → List[dict]

async get_user_configurations() → List[UserConfigItem]

Gets the user configurations for the current user.

Returns:: List[UserConfigItem]: List of user configurations.

async get_user_configurations_for_user(user_id: str) → List[UserConfigItem]

Gets the user configurations for a specific user (to be used by admins only).

Args:

user_id:: The unique identifier of the user.

Returns:

List[UserConfigItem]: List of user configurations.

async get_vision_capable_llm_names() → List[str]

Lists names of available vision-capable multi-modal LLMs (that can natively handle images as input) in the environment.

Returns:: list of string: Name of each available model.

Import all documents from a collection into an existing collection

Args:

collection_id:: Collection ID to add documents to.
src_collection_id:: Collection ID to import documents from.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
copy_document:: Whether to save a new copy of the document
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Import an already stored document to an existing collection

Args:

collection_id:: Collection ID to add documents to.
document_id:: Document ID to add.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
copy_document:: Whether to save a new copy of the document
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

async ingest_agent_only_to_standard(collection_id: str, document_id: str, gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, audio_input_language: str | None = None, ocr_model: str | None = None, restricted: bool = False, permissions: List[SharePermission] | None = None, tesseract_lang: str | None = None, keep_tables_as_one_chunk: bool | None = None, chunk_by_page: bool | None = None, handwriting_check: bool | None = None, timeout: float | None = None)

For files uploaded in “agent_only” ingest mode, convert to PDF and parse

See Also:

upload: Upload the files into the system to then be ingested into a collection. delete_upload: Delete uploaded file

Args:

collection_id:: String id of the collection to add the ingested documents into.
document_id:: ID of document to be parsed.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
restricted:: Whether the document should be restricted only to certain users.
permissions:: List of permissions. Each permission is a SharePermission object.
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.

Add files from the Azure Blob Storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
container:: Name of the Azure Blob Storage container.
path:: Path or list of paths to files or directories within an Azure Blob Storage container. Examples: file1, dir1/file2, dir3/dir4/
account_name:: Name of a storage account
credentials:: The object with Azure credentials. If the object is not provided, only a public container will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Metadata to be associated with the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Add files from the local system into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
root_dir:: String path of where to look for files.
glob:: String of the glob pattern used to match files in the root directory.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Add files from the Google Cloud Storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: The path or list of paths of GCS files or directories. Examples: gs://bucket/file, gs://bucket/../dir/
credentials:: The object holding a path to a JSON key of Google Cloud service account. If the object is not provided, only public buckets will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

async ingest_from_plain_text(collection_id: str, plain_text: str, file_name: str, gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, metadata: Dict[str, Any] | None = None, timeout: float | None = None)

Add plain text to a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
plain_text:: String of the plain text to ingest.
file_name:: String of the file name to use for the document.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds

Add files from the AWS S3 storage into a collection.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: The path or list of paths of S3 files or directories. Examples: s3://bucket/file, s3://bucket/../dir/
region:: The name of the region used for interaction with AWS services.
credentials:: The object with S3 credentials. If the object is not provided, only public buckets will be accessible.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Dictionary of metadata to add to the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

async ingest_uploads(collection_id: str, upload_ids: Iterable[str], gen_doc_summaries: bool | None = None, gen_doc_questions: bool | None = None, audio_input_language: str | None = None, ocr_model: str | None = None, restricted: bool = False, permissions: List[SharePermission] | None = None, tesseract_lang: str | None = None, keep_tables_as_one_chunk: bool | None = None, chunk_by_page: bool | None = None, handwriting_check: bool | None = None, metadata: Dict[str, Any] | None = None, timeout: float | None = None, ingest_mode: str | None = None) → Job

Add uploaded documents into a specific collection.

See Also:

upload: Upload the files into the system to then be ingested into a collection. delete_upload: Delete uploaded file

Args:

collection_id:: String id of the collection to add the ingested documents into.
upload_ids:: List of string ids of each uploaded document to add to the collection.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
restricted:: Whether the document should be restricted only to certain users.
permissions:: List of permissions. Each permission is a SharePermission object.
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
metadata:: Metadata to be associated with the document.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

Crawl and ingest a URL into a collection.

The web page or document linked from this URL will be imported.

Args:

collection_id:: String id of the collection to add the ingested documents into.
url:: String of the url to crawl.
gen_doc_summaries:: Whether to auto-generate document summaries (uses LLM)
gen_doc_questions:: Whether to auto-generate sample questions for each document (uses LLM)
follow_links:: Whether to import all web pages linked from this URL will be imported. External links will be ignored. Links to other pages on the same domain will be followed as long as they are at the same level or below the URL you specify. Each page will be transformed into a PDF document.
max_depth:: Max depth of recursion when following links, only when follow_links is True. Max_depth of 0 means don’t follow any links, max_depth of 1 means follow only top-level links, etc. Use -1 for automatic (system settings).
max_documents:: Max number of documents when following links, only when follow_links is True. Use None for automatic (system defaults). Use -1 for max (system limit).
audio_input_language:: Language of audio files. Defaults to “auto” language detection. Pass empty string to see choices.
ocr_model:: Which method to use to extract text from images using AI-enabled optical character recognition (OCR) models. Pass empty string to see choices. docTR is best for Latin text, PaddleOCR is best for certain non-Latin languages, Tesseract covers a wide range of languages. Mississippi works well on handwriting. “auto” - Automatic will auto-select the best OCR model for every page. “off” - Disable OCR for speed, but all images will then be skipped (also no image captions will be made).
tesseract_lang:: Which language to use when using ocr_model=”tesseract”. Pass empty string to see choices.
keep_tables_as_one_chunk:: When tables are identified by the table parser the table tokens will be kept in a single chunk.
chunk_by_page:: Each page will be a chunk. keep_tables_as_one_chunk will be ignored if this is True.
handwriting_check:: Check pages for handwriting. Will use specialized models if handwriting is found.
timeout:: Timeout in seconds.
ingest_mode:: Ingest mode to use. “standard” - Files will be ingested for use with RAG “lite” - Files will be ingested for use with RAG, but minimal processing will be done, favoring ingest speed over accuracy “agent_only” - Bypasses standard ingestion. Files can only be used with agents.

async is_collection_permission_granted(collection_id: str, permission: str) → bool

async is_permission_granted(permission: str) → bool

async list_all_api_keys(offset: int, limit: int, key_filter: str = '') → List[APIKey]

Allows admins to list all the API keys that exist.

Args:

offset:: Int: How many keys to skip before returning.
limit:: Int: How many keys to return.
key_filter:: String: Only returns keys for usernames matching this filter.

Returns:

List[APIKey]: List of APIKeys with metadata about each key.

async list_all_collections_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[CollectionInfo]

Fetch all users’ collection metadata sorted by last update time.

This is for admin use only and includes private, public, and shared collections in the result.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of CollectionInfo: Metadata about each collection.

async list_all_jobs() → List[UserJobs]

List all jobs (to be used by admins only).

Returns:: list of UserJobs

async list_all_tags() → List[Tag]

Lists all existing tags.

Returns:: List of Tags: List of existing tags.

async list_chat_message_meta_part(message_id: str, info_type: str) → ChatMessageMeta

Fetch one chat message meta information.

Args:

message_id:: Message id to which the metadata should be pulled.
info_type:: Metadata type to fetch. Valid choices are: “self_reflection”, “usage_stats”, “prompt_raw”, “llm_only”, “hyde1”, “py_client_code”

Returns:

ChatMessageMeta: Metadata information about the chat message.

async list_chat_message_references(message_id: str, limit: int | None = None) → List[ChatMessageReference]

Fetch metadata for references of a chat message.

References are only available for messages sent from an LLM, an empty list will be returned for messages sent by the user.

Args:

message_id:: String id of the message to get references for.
limit:: The number of references to consider based on the highest confidence scores.

Returns:

list of ChatMessageReference: Metadata including the document name, polygon information, and score.

async list_chat_messages(chat_session_id: str, offset: int, limit: int) → List[ChatMessage]

Fetch chat message and metadata for messages in a chat session.

Messages without a reply_to are from the end user, messages with a reply_to are from an LLM and a response to a specific user message.

Args:

chat_session_id:: String id of the chat session to filter by.
offset:: How many chat messages to skip before returning.
limit:: How many chat messages to return.

Returns:

list of ChatMessage: Text and metadata for chat messages.

async list_chat_messages_full(chat_session_id: str, offset: int, limit: int) → List[ChatMessageFull]

Fetch chat message and metadata for messages in a chat session.

Messages without a reply_to are from the end user, messages with a reply_to are from an LLM and a response to a specific user message.

Args:

chat_session_id:: String id of the chat session to filter by.
offset:: How many chat messages to skip before returning.
limit:: How many chat messages to return.

Returns:

list of ChatMessageFull: Text and metadata for chat messages.

async list_chat_sessions_for_collection(collection_id: str, offset: int, limit: int) → List[ChatSessionForCollection]

Fetch chat session metadata for chat sessions in a collection.

Args:

collection_id:: String id of the collection to filter by.
offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionForCollection: Metadata about each chat session including the latest message.

async list_chat_sessions_for_document(document_id: str, offset: int, limit: int) → List[ChatSessionForDocument]

Fetch chat session metadata for chat session that produced a specific document (typically through agents).

Args:

document_id:: String id of the document to filter by.
offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionForDocument: Metadata about each chat session including the latest message.

async list_collection_group_permissions(collection_id: str) → List[GroupSharePermission]

Returns a list of group access permissions for a given collection.

The returned list of group permissions denoting which groups have access to the collection and their access level.

Args:

collection_id:: ID of the collection to inspect.

Returns:

list of GroupSharePermission: Group sharing permissions list for the given collection.

async list_collection_permissions(collection_id: str) → List[SharePermission]

Returns a list of access permissions for a given collection.

The returned list of permissions denotes who has access to the collection and their access level.

Args:

collection_id:: ID of the collection to inspect.

Returns:

list of SharePermission: Sharing permissions list for the given collection.

async list_collections_for_document(document_id: str, offset: int, limit: int) → List[CollectionInfo]

Fetch metadata about each collection the document is a part of.

At this time, each document will only be available in a single collection.

Args:

document_id:: String id of the document to search for.
offset:: How many collections to skip before returning.
limit:: How many collections to return.

Returns:

list of CollectionInfo: Metadata about each collection.

async list_document_chunks(document_id: str, collection_id: str | None = None) → List[SearchResult]

Returns all chunks for a specific document.

Args:

document_id:: ID of the document.
collection_id:: ID of the collection the document belongs to. If not specified, an arbitrary collections containing the document is chosen.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

async list_documents_from_tags(collection_id: str, tags: List[str]) → List[Document]

Lists documents that have the specified set of tags within a collection. Args:

collection_id:
String The id of the collection to find documents in.

tags:
List of Strings representing the tags to retrieve documents for.

Returns:: List of Documents: All the documents with the specified tags.

async list_documents_in_collection(collection_id: str, offset: int, limit: int, metadata_filter: dict = {}) → List[DocumentInfo]

Fetch document metadata for documents in a collection.

Args:

collection_id:: String id of the collection to filter by.
offset:: How many documents to skip before returning.
limit:: How many documents to return.
metadata_filter:: Metadata filter to apply to the documents.

Returns:

list of DocumentInfo: Metadata about each document.

async list_embedding_models() → List[str]

async list_group_permissions(group_id: str) → List[UserPermission]

async list_group_permissions_by_name(group_names: List[str]) → List[UserPermission]

async list_group_roles(group_id: str) → List[UserRole]

async list_jobs() → List[Job]

List the user’s jobs.

Returns:: list of Job:

async list_list_chat_message_meta(message_id: str) → List[ChatMessageMeta]

Fetch chat message meta information.

Args:

message_id:: Message id to which the metadata should be pulled.

Returns:

list of ChatMessageMeta: Metadata about the chat message.

async list_prompt_group_permissions(prompt_id: str) → List[GroupSharePermission]

Returns a list of group access permissions for a given prompt template.

The returned list of group permissions denoting which groups have access to the prompt template.

Args:

prompt_id:: ID of the prompt template to inspect.

Returns:

list of GroupSharePermission: Group sharing permissions list for the given prompt template.

async list_prompt_permissions(prompt_id: str) → List[SharePermission]

Returns a list of access permissions for a given prompt template.

The returned list of permissions denotes who has access to the prompt template and their access level.

Args:

prompt_id:: ID of the prompt template to inspect.

Returns:

list of SharePermission: Sharing permissions list for the given prompt template.

async list_question_reply_feedback_data(offset: int, limit: int) → List[QuestionReplyData]

Fetch user’s questions and answers that have a feedback.

Questions and answers with metadata and feedback information.

Args:

offset:: How many conversations to skip before returning.
limit:: How many conversations to return.

Returns:

list of QuestionReplyData: Metadata about questions and answers.

async list_recent_chat_sessions(offset: int, limit: int) → List[ChatSessionInfo]

Fetch user’s chat session metadata sorted by last update time.

Chats across all collections will be accessed.

Args:

offset:: How many chat sessions to skip before returning.
limit:: How many chat sessions to return.

Returns:

list of ChatSessionInfo: Metadata about each chat session including the latest message.

async list_recent_collections(offset: int, limit: int) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.

Returns:

list of CollectionInfo: Metadata about each collection.

async list_recent_collections_filter(offset: int, limit: int, current_user_only: bool = False, name_filter: str = '') → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time with filter options.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
current_user_only:: When true, will only return the user owned collections.
name_filter:: Only returns collections with names matching this filter.

Returns:

list of CollectionInfo: Metadata about each collection.

async list_recent_collections_metadata_filter(offset: int, limit: int, current_user_only: bool, metadata_filter: dict) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time with a filter on metadata.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
current_user_only:: When true, will only return the user owned collections.
metadata_filter:: Only returns collections with metadata matching this filter.

Returns:

list of CollectionInfo: Metadata about each collection.

async list_recent_collections_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[CollectionInfo]

Fetch user’s collection metadata sorted by last update time.

Args:

offset:: How many collections to skip before returning.
limit:: How many collections to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of CollectionInfo: Metadata about each collection.

async list_recent_document_summaries(document_id: str, offset: int, limit: int) → List[ProcessedDocument]

Fetches recent document summaries/extractions/transformations

Args:

document_id:: document ID for which to return summaries
offset:: How many summaries to skip before returning summaries.
limit:: How many summaries to return.

async list_recent_documents(offset: int, limit: int, metadata_filter: dict = {}) → List[DocumentInfo]

Fetch user’s document metadata sorted by last update time.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.
metadata_filter:: Metadata filter to apply to the documents.

Returns:

list of DocumentInfo: Metadata about each document.

async list_recent_documents_with_summaries(offset: int, limit: int) → List[DocumentInfoSummary]

Fetch user’s document metadata sorted by last update time, including the latest document summary.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.

Returns:

list of DocumentInfoSummary: Metadata about each document.

async list_recent_documents_with_summaries_sort(offset: int, limit: int, sort_column: str, ascending: bool) → List[DocumentInfoSummary]

Fetch user’s document metadata sorted by last update time, including the latest document summary.

All documents owned by the user, regardless of collection, are accessed.

Args:

offset:: How many documents to skip before returning.
limit:: How many documents to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.

Returns:

list of DocumentInfoSummary: Metadata about each document.

async list_recent_prompt_templates(offset: int, limit: int) → List[PromptTemplate]

Fetch user’s prompt templates sorted by last update time.

Args:

offset:: How many prompt templates to skip before returning.
limit:: How many prompt templates to return.

Returns:

list of PromptTemplate: set of prompts

async list_recent_prompt_templates_sort(offset: int, limit: int, sort_column: str, ascending: bool, template_type: str = 'all', filter: str = '') → List[PromptTemplate]

Fetch user’s prompt templates sorted by last update time.

Args:

offset:: How many prompt templates to skip before returning.
limit:: How many prompt templates to return.
sort_column:: Sort column.
ascending:: When True, return sorted by sort_column in ascending order.
template_type:: When set, will be used as a type filter, possible values are: all, user, system.
filter:: When set, will be used as a filter on some prompt template columns.

Returns:

list of PromptTemplate: set of prompts

async list_system_groups() → List[UserGroup]

async list_system_permissions() → List[UserPermission]

async list_system_roles() → List[UserRole]

async list_upload() → List[str]

List pending file uploads to the H2OGPTE backend.

Uploaded files are not yet accessible and need to be ingested into a collection.

See Also:: upload: Upload the files into the system to then be ingested into a collection. ingest_uploads: Add the uploaded files to a collection. delete_upload: Delete uploaded file
Returns:: List[str]: The pending upload ids to be used in ingest jobs.
Raises:: Exception: The upload list request was unsuccessful.

async list_user_permissions(user_id: str | None = None) → List[UserPermission]

async list_user_role_permissions(roles: List[str]) → List[UserPermission]

async list_user_roles(user_id: str | None = None) → List[UserRole]

async list_users(offset: int, limit: int) → List[User]

List system users.

Returns a list of all registered users fo the system, a registered user, is a users that has logged in at least once.

Args:

offset:: How many users to skip before returning.
limit:: How many users to return.

Returns:

list of User: Metadata about each user.

async make_collection_private(collection_id: str)

Make a collection private

Once a collection is private, other users will no longer be able to access chat history or documents related to the collection.

Args:

collection_id:: ID of the collection to make private.

async make_collection_public(collection_id: str)

Make a collection public

Once a collection is public, it will be accessible to all authenticated users of the system.

Args:

collection_id:: ID of the collection to make public.

async match_chunks(collection_id: str, vectors: List[List[float]], topics: List[str], offset: int, limit: int, cut_off: float = 0, width: int = 0) → List[SearchResult]

Find chunks related to a message using semantic search.

Chunks are sorted by relevance and similarity score to the message.

See Also: H2OGPTE.encode_for_retrieval to create vectors from messages.

Args:

collection_id:: ID of the collection to search within.
vectors:: A list of vectorized message for running semantic search.
topics:: A list of document_ids used to filter which documents in the collection to search.
offset:: How many chunks to skip before returning chunks.
limit:: How many chunks to return.
cut_off:: Exclude matches with distances higher than this cut off.
width:: How many chunks before and after a match to return - not implemented.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

Processes a document to either create a global or piecewise summary/extraction/transformation of a document.

Effective prompt created (excluding the system prompt):

"{pre_prompt_summary}
"""
{text from document}
"""
{prompt_summary}"

Args:

document_id:

String id of the document to create a summary from.

system_prompt:

System Prompt

pre_prompt_summary:

Prompt that goes before each large piece of text to summarize

prompt_summary:

Prompt that goes after each large piece of text to summarize

image_batch_final_prompt:

Prompt for each image batch for vision models

image_batch_image_prompt:

Prompt to reduce all answers each image batch for vision models

llm:

LLM to use

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:: temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. seed (int, default: 0) — The seed for the random number generator when sampling during generation (if temp>0 or top_k>1 or top_p<1), seed=0 picks a random seed. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. enable_vision (str, default: “auto”) - Controls vision mode, send images to the LLM in addition to text chunks. Only if have models that support vision, use get_vision_capable_llm_names() to see list. One of [“on”, “off”, “auto”]. visible_vision_models (List[str], default: [“auto”]) - Controls which vision model to use when processing images. Use get_vision_capable_llm_names() to see list. Must provide exactly one model. [“auto”] for automatic.

max_num_chunks:

Max limit of chunks to send to the summarizer

sampling_strategy:

How to sample if the document has more chunks than max_num_chunks. Options are “auto”, “uniform”, “first”, “first+last”, default is “auto” (a hybrid of them all).

pages:

List of specific pages (of the ingested document in PDF form) to use from the document. 1-based indexing.

schema:

Optional JSON schema to use for guided json generation.

keep_intermediate_results:

Whether to keep intermediate results. Default: disabled. If disabled, further LLM calls are applied to the intermediate results until one global summary is obtained: map+reduce (i.e., summary). If enabled, the results’ content will be a list of strings (the results of applying the LLM to different pieces of document context): map (i.e., extract).

guardrails_settings:

Guardrails Settings.

meta_data_to_include:

A dictionary containing flags that indicate whether each piece of document metadata is to be included as part of the context given to the LLM. Only used if enable_vision is disabled. Default is {

“name”: True, “text”: True, “page”: True, “captions”: True, “uri”: False, “connector”: False, “original_mtime”: False, “age”: False, “score”: False,

}

timeout:

Amount of time in seconds to allow the request to run. The default is 86400 seconds.

Returns:

ProcessedDocument: Processed document. The content is either a string (keep_intermediate_results=False) or a list of strings (keep_intermediate_results=True).

Raises:

TimeoutError: The request did not complete in time. SessionError: No summary or extraction created. Document wasn’t part of a collection, or LLM timed out, etc.

async remove_collection_expiry_date(collection_id: str) → str

Remove an expiry date from a collection.

Args:

collection_id:: ID of the collection to update.

async remove_collection_inactivity_interval(collection_id: str) → str

Remove an inactivity interval for a collection.

Args:

collection_id:: ID of the collection to update.

async remove_collection_size_limit(collection_id: str) → str

Remove a size limit for a collection.

Args:

collection_id:: ID of the collection to update.

async remove_collection_thumbnail(collection_id: str, timeout: float | None = None)

Remove a thumbnail from a collection.

Args:

collection_id:: Collection you want to remove the thumbnail from.
timeout:: Amount of time in seconds to allow the request to run. The default is 86400 seconds.

async remove_role_from_group(group_id: str, roles: List[str]) → List[UserRole]

async remove_role_from_user(user_id: str, roles: List[str]) → List[UserRole]

async rename_chat_session(chat_session_id: str, name: str)

Update a chat session name

Args:

chat_session_id:: String id of the document to search for.
name:: The new chat session name.

async reset_and_share_prompt_template(prompt_id: str, new_usernames: List[str]) → ShareResponseStatus

Remove all users who have access to a prompt template (except for the owner) and share it with the provided list of new users.

Args:

prompt_id:: ID of the prompt template to un-share.
new_usernames:: The list of usernames belonging to the users this prompt template will be shared with.

ShareResponseStatus: Status of share request.

async reset_and_share_prompt_template_with_groups(prompt_id: str, new_groups: List[str]) → ShareResponseStatus

Remove all groups who have access to a prompt template and share it with the provided list of new group ids.

Args:

prompt_id:: ID of the prompt template to un-share.
new_groups:: The list of group ids this prompt template will be shared with.

ShareResponseStatus: Status of share request.

async reset_collection_prompt_settings(collection_id: str) → str

Reset the prompt settings for a given collection.

Args:

collection_id:: ID of the collection to update.

Returns:

str: ID of the updated collection.

async reset_roles_for_group(group_id: str, roles: List[str]) → List[UserRole]

async reset_roles_for_user(user_id: str, roles: List[str]) → List[UserRole]

async reset_user_configurations_for_user(key_name: str, user_id: str) → List[UserConfigItem]

Reset a user configuration for a specific user (to be used by admins only).

Returns:: List[UserConfigItem]: List of user configurations.

async run_selftest(llm: str, mode: str) → dict

Run a self-test for a given LLM Args:

llm:
Name of LLM

mode:
one of [“quick”, “rag”, “full”, “agent”]

Returns:: Dictionary with performance stats. If “error” is filled, the test failed.

async search_chunks(collection_id: str, query: str, topics: List[str], offset: int, limit: int) → List[SearchResult]

Find chunks related to a message using lexical search.

Chunks are sorted by relevance and similarity score to the message.

Args:

collection_id:: ID of the collection to search within.
query:: Question or imperative from the end user to search a collection for.
topics:: A list of document_ids used to filter which documents in the collection to search.
offset:: How many chunks to skip before returning chunks.
limit:: How many chunks to return.

Returns:

list of SearchResult: The document, text, score and related information of the chunk.

async set_api_key_expiration(api_key_id: str, expires_in: str | None = None) → Result

Allows admins to set an expiration on an API key.

Args:

api_key_id:: String: The id of the API key.
expires_in:: (Optional) String: The expiration for the API key as an interval or None (to remove an expiration that was previously set). Ex. “30 days” or “30 minutes”

Returns:

Result: Status of the expiration request.

async set_chat_message_votes(chat_message_id: str, votes: int) → Result

Change the vote value of a chat message.

Set the exact value of a vote for a chat message. Any message type can be updated, but only LLM response votes will be visible in the UI. The expectation is 0: unvoted, -1: dislike, 1 like. Values outside of this will not be viewable in the UI.

Args:

chat_message_id:: ID of a chat message, any message can be used but only LLM responses will be visible in the UI.
votes:: Integer value for the message. Only -1 and 1 will be visible in the UI as dislike and like respectively.

Returns:

Result: The status of the update.

Raises:

Exception: The upload request was unsuccessful.

async set_chat_session_collection(chat_session_id: str, collection_id: str | None) → str

Set the collection for a chat_session

Args:

chat_session_id:: ID of the chat session
collection_id:: ID of the collection, or None to chat with the LLM only.

Returns:

str: ID of the updated chat session

async set_chat_session_prompt_template(chat_session_id: str, prompt_template_id: str | None) → str

Set the prompt template for a chat_session

Args:

chat_session_id:: ID of the chat session
prompt_template_id:: ID of the prompt template to get the prompts from. None to delete and fall back to system defaults.

Returns:

str: ID of the updated chat session

async set_collection_expiry_date(collection_id: str, expiry_date: str, timezone: str | None = None) → str

Set an expiry date for a collection.

Args:

collection_id:: ID of the collection to update.
expiry_date:: The expiry date as a string in ‘YYYY-MM-DD’ format.
timezone:: Optional timezone to associate with expiry date (with IANA timezone support).

async set_collection_inactivity_interval(collection_id: str, inactivity_interval: int) → str

Set an inactivity interval for a collection.

Args:

collection_id:: ID of the collection to update.
inactivity_interval:: The inactivity interval as an integer number of days.

async set_collection_prompt_template(collection_id: str, prompt_template_id: str | None, strict_check: bool = False) → str

Set the prompt template for a collection

Args:

collection_id:: ID of the collection to update.
prompt_template_id:: ID of the prompt template to get the prompts from. None to delete and fall back to system defaults.
strict_check:: whether to check that the collection’s embedding model and the prompt template are optimally compatible

Returns:

str: ID of the updated collection.

async set_collection_size_limit(collection_id: str, limit: int | str) → str

Set a maximum limit on the total size of documents (sum) added to a collection. The limit is measured in bytes.

Args:

collection_id:: ID of the collection to update.
limit:: The bytes limit, possible values follow the format: 12345, “1GB”, or “1GiB”.

async set_collection_thumbnail(collection_id: str, file_path: Path, timeout: float | None = None)

Upload an image file to be set as a collection’s thumbnail.

The image file will not be considered as a collection document. Acceptable image file types include: .png, .jpg, .jpeg, .svg

Args:

collection_id:: Collection you want to add the thumbnail to.
file_path:: Path to the image file. Must include appropriate file extension.
timeout:: Amount of time in seconds to allow the request to run.

Raises:

ValueError: The file is invalid. Exception: The upload request was unsuccessful.

async set_global_configuration(key_name: str, string_value: str, can_overwrite: bool, is_public: bool, value_type: str | None = None) → List[ConfigItem]

Set a global configuration.

Note: Both default collection size limit and inactivity interval can be disabled. To do so, pass ‘-1’ as the string_value.

Args:

key_name:: The name of the global config key.
string_value:: The value to be set for the global config.
can_overwrite:: Whether user settings can override this global setting.
is_public:: Whether users can see the value for this global setting.
value_type:: The type of the value to be set for the global config.

Returns:

List[ConfigItem]: List of global configurations.

async set_user_configuration_for_user(key_name: str, string_value: str, user_id: str, value_type: str | None = None) → List[UserConfigItem]

Set a user configuration for a specific user (overrides the global configuration and to be used by admins only).

Note: Both default collection size limit and inactivity interval can be disabled. To do so, pass ‘-1’ as the string_value.

Args:

key_name:: The name of the global config key.
string_value:: The value to be set for the config.
user_id:: The user id you want to apply the config for.
value_type:: The type of the value to be set for the config.

Returns:

List[UserConfigItem]: List of user configurations.

async share_collection(collection_id: str, permission: SharePermission) → ShareResponseStatus

Share a collection to a user.

The permission attribute defined the level of access, and who can access the collection, the collection_id attribute denotes the collection to be shared.

Args:

collection_id:: ID of the collection to share.
permission:: Defines the rule for sharing, i.e. permission level.

Returns:

ShareResponseStatus: Status of share request.

async share_collection_with_group(collection_id: str, permission: GroupSharePermission) → ShareResponseStatus

Share a collection to a group.

The permission attribute defines the level of access, and which group can access the collection, the collection_id attribute denotes the collection to be shared.

Args:

collection_id:: ID of the collection to share.
permission:: Defines the rule for sharing, i.e. permission level and group.

Returns:

ShareResponseStatus: Status of share request.

async share_prompt(prompt_id: str, permission: SharePermission) → ShareResponseStatus

Share a prompt template to a user.

Args:

prompt_id:: ID of the prompt template to share.
permission:: Defines the rule for sharing, i.e. permission level.

Returns:

ShareResponseStatus: Status of share request.

async share_prompt_with_group(prompt_id: str, permission: GroupSharePermission) → ShareResponseStatus

Share a prompt to a group.

Args:

prompt_id:: ID of the prompt to share.
permission:: Specific permissions for a group.

Returns:

ShareResponseStatus: Status of share request.

Summarize one or more contexts using an LLM.

Effective prompt created (excluding the system prompt):
"{pre_prompt_summary}
"""
{text_context_list}
"""
{prompt_summary}"
Args:

text_context_list:
List of raw text strings to be summarized.

system_prompt:
Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto for the model default or None for h2oGPTe defaults. Defaults to ‘’ for no system prompt.

pre_prompt_summary:
Text that is prepended before the list of texts. The default can be customized per environment, but the standard default is "In order to write a concise single-paragraph or bulleted list summary, pay attention to the following text:\n"

prompt_summary:
Text that is appended after the list of texts. The default can be customized per environment, but the standard default is "Using only the text above, write a condensed and concise summary of key results (preferably as bullet points):\n"

llm:
Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Default value is to use the first model (0th index).

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:
temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 seed (int, default: 0) — The seed for the random number generator, only used if temperature > 0, seed=0 will pick a random number for each call, seed > 0 will be fixed. top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag.

guardrails_settings:
Guardrails Settings.

timeout:
Timeout in seconds.

kwargs:

Dictionary of kwargs to pass to h2oGPT. Not recommended, see https://github.com/h2oai/h2ogpt for source code. Valid keys:
h2ogpt_key: str = “” chat_conversation: list[tuple[str, str]] | None = None docs_ordering_type: str | None = “best_near_prompt” max_input_tokens: int = -1 docs_token_handling: str = “split_or_merge” docs_joiner: str = “

“

image_file: Union[str, list] = None

Returns:: Answer: The response text and any errors.
Raises:: TimeoutError: If response isn’t completed in timeout seconds.

async summarize_document(*args, **kwargs) → DocumentSummary

async tag_document(document_id: str, tag_name: str) → str

Adds a tag to a document.

Args:

document_id:: String id of the document to attach the tag to.
tag_name:: String representing the tag to attach.

Returns:

String: The id of the newly created tag.

async unarchive_collection(collection_id: str) → str

Restore an archived collection to an active status.

Args:

collection_id:: ID of the collection to restore.

async unshare_collection(collection_id: str, permission: SharePermission) → ShareResponseStatus

Remove sharing of a collection to a user.

The permission attribute defined the level of access, and who can access the collection, the collection_id attribute denotes the collection to be shared.

In case of un-sharing, the SharePermission’s user is sufficient.

Args:

collection_id:: ID of the collection to un-share.
permission:: Defines the user for which collection access is revoked.

ShareResponseStatus: Status of share request.

async unshare_collection_for_all(collection_id: str) → ShareResponseStatus

Remove sharing of a collection to all other users but the original owner.

Args:

collection_id:: ID of the collection to un-share.

ShareResponseStatus: Status of share request.

async unshare_collection_from_group(collection_id: str, permission: GroupSharePermission) → ShareResponseStatus

Remove sharing of a collection from a group.

The permission attribute defines which group to remove access from, the collection_id attribute denotes the collection to be unshared. In case of un-sharing, the GroupSharePermission’s group_id is sufficient.

Args:

collection_id:: ID of the collection to un-share.
permission:: Defines the group for which collection access is revoked.

Returns:

ShareResponseStatus: Status of share request.

async unshare_prompt(prompt_id: str, permission: SharePermission) → ShareResponseStatus

Remove sharing of a prompt template to a user.

Args:

prompt_id:: ID of the prompt template to un-share.
permission:: Defines the user for which collection access is revoked.

ShareResponseStatus: Status of share request.

async unshare_prompt_for_all(prompt_id: str) → ShareResponseStatus

Remove sharing of a prompt template to all other users but the original owner (owner action only).

Args:

prompt_id:: ID of the prompt template to un-share.

ShareResponseStatus: Status of share request.

async unshare_prompt_from_group(prompt_id: str, permission: GroupSharePermission) → ShareResponseStatus

Unshare a prompt from a group.

Args:

prompt_id:: ID of the prompt to un-share.
permission:: Specific permissions for a group.

Returns:

ShareResponseStatus: Status of share request.

async untag_document(document_id: str, tag_name: str) → str

Removes an existing tag from a document.

Args:

document_id:: String id of the document to remove the tag from.
tag_name:: String representing the tag to remove.

Returns:

String: The id of the removed tag.

async update_agent_key(key_id: str, name: str | None = None, value: str | None = None, key_type: str | None = None, description: str | None = None) → dict | None

async update_agent_tool_preference(reference_value: List[str]) → None

async update_collection(collection_id: str, name: str, description: str) → str

Update the metadata for a given collection.

All variables are required. You can use h2ogpte.get_collection(<id>).name or description to get the existing values if you only want to change one or the other.

Args:

collection_id:: ID of the collection to update.
name:: New name of the collection, this is required.
description:: New description of the collection, this is required.

Returns:

str: ID of the updated collection.

async update_collection_metadata(collection_id: str, collection_metadata: dict) → str

Set the new collection metadata overwriting the existing metadata. Be careful not to delete any settings you want to keep.

Args:

collection_id:: ID of the collection to update.
collection_metadata:: Dictionary containing the new collection metadata.

Returns:

str: ID of the updated collection.

async update_collection_rag_type(collection_id: str, name: str, description: str, rag_type) → str

Update the metadata for a given collection.

All variables are required. You can use h2ogpte.get_collection(<id>).name or description to get the existing values if you only want to change one or the other.

Args:

collection_id:

ID of the collection to update.

name:

New name of the collection, this is required.

description:

New description of the collection, this is required.

rag_type: str one of

"auto" Automatically select the best rag_type. "llm_only" LLM Only - Answer the query without any supporting document contexts.

Requires 1 LLM or Agent call.

"agent_only" Agent Only - Answer the query with only original files passed to agent.: Requires 1 Agent call.
"rag" RAG (Retrieval Augmented Generation) - Use supporting document contexts: to answer the query. Requires 1 LLM or Agent call.
"hyde1" LLM Only + RAG composite - HyDE RAG (Hypothetical Document Embedding).: Use ‘LLM Only’ response to find relevant contexts from a collection for generating a response. Requires 2 LLM calls.
"hyde2" HyDE + RAG composite - Use the ‘HyDE RAG’ response to find relevant: contexts from a collection for generating a response. Requires 3 LLM calls.
"rag+" Summary RAG - Like RAG, but uses more context and recursive: summarization to overcome LLM context limits. Keeps all retrieved chunks, puts them in order, adds neighboring chunks, then uses the summary API to get the answer. Can require several LLM calls.
"all_data" All Data RAG - Like Summary RAG, but includes all document: chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

Returns:

str: ID of the updated collection.

async update_collection_settings(collection_id: str, collection_settings: dict) → str

Set the new collection settings, must be complete. Be careful not to delete any settings you want to keep.

Args:

collection_id:: ID of the collection to update.
collection_settings:: Dictionary containing the new collection settings.

Returns:

str: ID of the updated collection.

async update_document_metadata(document_id: str, document_metadata: dict) → str

Set the new document metadata overwriting the existing metadata. Be careful not to delete any settings you want to keep.

Args:

document_id:: ID of the document to update.
document_metadata:: Dictionary containing the new document metadata.

Returns:

str: ID of the updated document.

async update_document_name(document_id: str, name: str) → str

Update the name metadata for a given document.

Args:

document_id:: ID of the document to update.
name:: New name of the document, must include file extension.

Returns:

str: ID of the updated document.

async update_document_uri(document_id: str, uri: str) → str

Update the URI metadata for a given document.

Args:

document_id:: ID of the document to update.
uri:: New URI of the document, this is required.

Returns:

str: ID of the updated document.

async update_prompt_template(id: str, name: str, description: str | None = None, lang: str | None = None, system_prompt: str | None = None, pre_prompt_query: str | None = None, prompt_query: str | None = None, hyde_no_rag_llm_prompt_extension: str | None = None, pre_prompt_summary: str | None = None, prompt_summary: str | None = None, system_prompt_reflection: str | None = None, pre_prompt_reflection: str | None = None, prompt_reflection: str | None = None, auto_gen_description_prompt: str | None = None, auto_gen_document_summary_pre_prompt_summary: str | None = None, auto_gen_document_summary_prompt_summary: str | None = None, auto_gen_document_sample_questions_prompt: str | None = None, default_sample_questions: List[str] | None = None, image_batch_image_prompt: str | None = None, image_batch_final_prompt: str | None = None) → str

Update a prompt template

Args:

id:: String ID of the prompt template to update
name:: Name of the prompt template
description:: Description of the prompt template
lang:: Language code
system_prompt:: System Prompt
pre_prompt_query:: Text that is prepended before the contextual document chunks.
prompt_query:: Text that is appended to the beginning of the user’s message.
hyde_no_rag_llm_prompt_extension:: LLM prompt extension.
pre_prompt_summary:: Prompt that goes before each large piece of text to summarize
prompt_summary:: Prompt that goes after each large piece of text to summarize
system_prompt_reflection:: System Prompt for self-reflection
pre_prompt_reflection:: Deprecated - ignored
prompt_reflection:: Template for self-reflection, must contain two occurrences of %s for full previous prompt (including system prompt, document related context and prompts if applicable, and user prompts) and answer
auto_gen_description_prompt:: prompt to create a description of the collection.
auto_gen_document_summary_pre_prompt_summary:: pre_prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_summary_prompt_summary:: prompt_summary for summary of a freshly imported document (if enabled).
auto_gen_document_sample_questions_prompt:: prompt to create sample questions for a freshly imported document (if enabled).
default_sample_questions:: default sample questions in case there are no auto-generated sample questions.
image_batch_final_prompt:: Prompt for each image batch for vision models
image_batch_image_prompt:: Prompt to reduce all answers each image batch for vision models

Returns:

str: The ID of the updated prompt template.

async update_question_reply_feedback(reply_id: str, expected_answer: str, user_comment: str)

Update feedback for a specific answer to a question.

Args:

reply_id:: UUID of the reply.
expected_answer:: Expected answer.
user_comment:: User comment.

Returns:

None

async update_tag(tag_name: str, description: str, format: str) → str

Updates a tag.

Args:

tag_name:: String representing the tag to update.
description:: String describing the tag.
format:: String representing the format of the tag.

Returns:

String: The id of the updated tag.

async upload(file_name: str, file: Any, uri: str | None = None) → str

Upload a file to the H2OGPTE backend.

Uploaded files are not yet accessible and need to be ingested into a collection.

See Also:

ingest_uploads: Add the uploaded files to a collection. delete_upload: Delete uploaded file

Args:

file_name:: What to name the file on the server, must include file extension.
file:: File object to upload, often an opened file from with open(…) as f.
uri:: Optional - URI you would like to associate with the file.

Returns:

str: The upload id to be used in ingest jobs.

Raises:

Exception: The upload request was unsuccessful.

class h2ogpte.Session(address: str, chat_session_id: str, client: H2OGPTE = None, prompt_template_id: str | None = None, open_timeout: int = 10, close_timeout: int = 10, max_connect_retries: int = 10, connect_retry_delay: int = 0.5, connect_retry_max_delay: int = 60)

Bases: object

Create and participate in a chat session.

This is a live connection to the h2oGPTe server contained to a specific chat session on top of a single collection of documents. Users will find all questions and responses in this session in a single chat history in the UI.

See Also:

H2OGPTE.connect: To initialize a session on an existing connection.

Args:

address:: Full URL of the h2oGPTe server to connect to.
chat_session_id:: The ID of the chat session the queries should be sent to.
client:: Set to the value of H2OGPTE client object used to perform other calls to the system.

Examples:

# Example 1: Best practice, create a session using the H2OGPTE module
with h2ogpte.connect(chat_session_id) as session:
    answer1 = session.query('How many paper clips were shipped to Scranton?', timeout=10)
    answer2 = session.query('Did David Brent co-sign the contract with Initech?', timeout=10)

# Example 2: Connect and disconnect manually
session = Session(
    address=address,
    client=client,
    chat_session_id=chat_session_id
)
session.connect()
answer = session.query("Are there any dogs in the documents?")
session.disconnect()

connect()

Connect to an h2oGPTe server.

This is primarily an internal function used when users create a session using with from the H2OGPTE.connection() function.

property connection: ClientConnection

disconnect()

Disconnect from an h2oGPTe server.

This is primarily an internal function used when users create a session using with from the H2OGPTE.connection() function.

Retrieval-augmented generation for a query on a collection.

Finds a collection of chunks relevant to the query using similarity scores. Sends these and any additional instructions to an LLM.

Format of questions or imperatives:

"{pre_prompt_query}
"""
{similar_context_chunks}
"""                {prompt_query}{message}"

Args:

message:

Query or instruction from the end user to the LLM.

system_prompt:

Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto or None for the model default. Defaults to ‘’ for no system prompt.

pre_prompt_query:

Text that is prepended before the contextual document chunks. The default can be customized per environment, but the standard default is "Pay attention and remember the information below, which will help to answer the question or imperative after the context ends.\n"

prompt_query:

Text that is appended to the beginning of the user’s message. The default can be customized per environment, but the standard default is “According to only the information in the document sources provided within the context above, “

image_batch_final_prompt:

Prompt for each image batch for vision models

image_batch_image_prompt:

Prompt to reduce all answers each image batch for vision models

pre_prompt_summary:

Not used, use H2OGPTE.process_document to summarize.

prompt_summary:

Not used, use H2OGPTE.process_document to summarize.

llm:

Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Use None or “auto” for automatic model routing, set cost_controls for detailed control over automatic routing.

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:

temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. seed (int, default: 0) — The seed for the random number generator when sampling during generation (if temp>0 or top_k>1 or top_p<1), seed=0 picks a random seed. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. enable_vision (str, default: “auto”) - Controls vision mode, send images to the LLM in addition to text chunks. Only if have models that support vision, use get_vision_capable_llm_names() to see list. One of [“on”, “off”, “auto”]. visible_vision_models (List[str], default: [“auto”]) - Controls which vision model to use when processing images. Use get_vision_capable_llm_names() to see list. Must provide exactly one model. [“auto”] for automatic. images_num_max (int, default: None) — Maximum number of images to process. json_preserve_system_prompt (bool, default: None) — Whether to preserve system prompt in JSON response. client_metadata (str, default: None) — Additional metadata to send with the request. min_chars_per_yield (int, default: 1) — Minimum characters to yield in streaming response. cost_controls: Optional dictionary

max_cost (float) - Sets the maximum allowed cost in USD per LLM call when doing Automatic model routing. If the estimated cost based on input and output token counts is higher than this limit, the request will fail as early as possible. max_cost_per_million_tokens (float) - Only consider models that cost less than this value in USD per million tokens when doing automatic routing. Using the max of input and output cost. model (List[str] or None) - Optional subset of models to consider when doing automatic routing. None means consider all models. willingness_to_pay (float) - Controls the willingness to pay extra for a more accurate model for every LLM call when doing automatic routing, in units of USD per +10% increase in accuracy. We start with the least accurate model. For each more accurate model, we accept it if the increase in estimated cost divided by the increase in estimated accuracy is no more than this value divided by 10%, up to the upper limit specified above. Lower values will try to keep the cost as low as possible, higher values will approach the cost limit to increase accuracy. 0 means unlimited. willingness_to_wait (float) - Controls the willingness to wait longer for a more accurate model for every LLM call when doing automatic routing, in units of seconds per +10% increase in accuracy. We start with the least accurate model. For each more accurate model, we accept it if the increase in estimated time divided by the increase in estimated accuracy is no more than this value divided by 10%. Lower values will try to keep the time as low as possible, higher values will take longer to increase accuracy. 0 means unlimited.

use_agent (bool, default: False) - If True, use the AI agent (with access to tools) to generate the response. agent_accuracy (str, default: “standard”) - Effort level by the agent. Only if use_agent=True. One of [“quick”, “basic”, “standard”, “maximum”]. agent_max_turns (Optional[Union[str, int]], default: “auto”) - Optional max. number of back-and-forth turns with the agent. Only if use_agent=True. Either “auto” or an integer. agent_tools (Optional[Union[str, List[str]]], default: “auto”) - Either “auto”, “all”, “any” to enable all available tools, or a specific list of tools to use. Only if use_agent=True. Get full list by calling await self._lang(“get_agent_tools_options”). agent_type (str, default: “auto”) — Type of agent to use for task processing. agent_original_files (List[str], default: None) — List of file paths for agent to process. agent_timeout (int, default: None) — Timeout in seconds for each agent turn. agent_total_timeout (int, default: 3600) — Total timeout in seconds for all agent processing. agent_code_writer_system_message (str, default: None) — System message for agent code writer. agent_num_executable_code_blocks_limit (int, default: 1) — Maximum number of executable code blocks. agent_system_site_packages (bool, default: True) — Whether agent has access to system site packages. agent_main_model (Optional[str], default: None) — Main model to use for agent. agent_max_stream_length (Optional[int], default: None) — Maximum stream length for agent response. agent_max_memory_usage (int, default: 16*1024**3) — Maximum memory usage for agent in bytes (16GB default). agent_main_reasoning_effort (Optional[int], default: None) — Effort level for main reasoning. agent_advanced_reasoning_effort (Optional[int], default: None) — Effort level for advanced reasoning. agent_max_confidence_level (Optional[int], default: None) — Maximum confidence level for agent responses. agent_planning_forced_mode (Optional[bool], default: None) — Whether to force planning mode for agent. agent_too_soon_forced_mode (Optional[bool], default: None) — Whether to force “too soon” mode for agent. agent_critique_forced_mode (Optional[int], default: None) — Whether to force critique mode for agent. agent_stream_files (bool, default: True) — Whether to stream files from agent.

self_reflection_config:

Dictionary of arguments for self-reflection, can contain the following string:string mappings:

llm_reflection: str
"gpt-4-0613" or "" to disable reflection

prompt_reflection: str
‘Here’s the prompt and the response: """Prompt:\n%s\n"""\n\n""" Response:\n%s\n"""\n\nWhat is the quality of the response for the given prompt? Respond with a score ranging from Score: 0/10 (worst) to Score: 10/10 (best), and give a brief explanation why.'

system_prompt_reflection: str
""

llm_args_reflection: str
"{}"

rag_config:

Dictionary of arguments to control RAG (retrieval-augmented-generation) types. Can contain the following key/value pairs: rag_type: str one of

"auto" Automatically select the best rag_type. "llm_only" LLM Only - Answer the query without any supporting document contexts.

Requires 1 LLM or Agent call.

"agent_only" Agent Only - Answer the query with only original files passed to agent.
Requires 1 Agent call.

"rag" RAG (Retrieval Augmented Generation) - Use supporting document contexts
to answer the query. Requires 1 LLM or Agent call.

"hyde1" LLM Only + RAG composite - HyDE RAG (Hypothetical Document Embedding).
Use ‘LLM Only’ response to find relevant contexts from a collection for generating a response. Requires 2 LLM calls.

"hyde2" HyDE + RAG composite - Use the ‘HyDE RAG’ response to find relevant
contexts from a collection for generating a response. Requires 3 LLM calls.

"rag+" Summary RAG - Like RAG, but uses more context and recursive
summarization to overcome LLM context limits. Keeps all retrieved chunks, puts them in order, adds neighboring chunks, then uses the summary API to get the answer. Can require several LLM calls.

"all_data" All Data RAG - Like Summary RAG, but includes all document
chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

"all_data" All Data RAG - Like Summary RAG, but includes all document
chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

hyde_no_rag_llm_prompt_extension: str

Add this prompt to every user’s prompt, when generating answers to be used for subsequent retrieval during HyDE. Only used when rag_type is “hyde1” or “hyde2”. example: '\nKeep the answer brief, and list the 5 most relevant key words at the end.'

num_neighbor_chunks_to_include: int

Number of neighboring chunks to include for every retrieved relevant chunk. Helps to keep surrounding context together. Only enabled for rag_type “rag+”. Defaults to 1.

meta_data_to_include:

A dictionary containing flags that indicate whether each piece of document metadata is to be included as part of the context for a chat with a collection. Default is {

“name”: True, “text”: True, “page”: True, “captions”: True, “uri”: False, “connector”: False, “original_mtime”: False, “age”: False, “score”: False,

}

rag_max_chunks:

Maximum number of document chunks to retrieve for RAG. If not specified (default: -1), actual number depends on rag_type and admin configuration. Set to >0 values to enable. Can be combined with rag_min_chunk_score.

rag_min_chunk_score:

Minimum score of document chunks to retrieve for RAG. If not specified (default: 0.0), will not filter chunks by score. Set to >0 values to enable. Can be combined with rag_max_chunks.

include_chat_history:

Whether to include chat history. Includes previous questions and answers for the current chat session for each new chat request. Disable if require deterministic answers for a given question. Choices are: [“on”,”off”,”auto”,True,False]

tags:

A list of tags from which to pull the context for RAG.

metadata_filter:

A dictionary to filter documents by metadata, from which to pull the context for RAG.

timeout:

Amount of time in seconds to allow the request to run. The default is 1000 seconds.

retries:

Amount of retries to allow the request to run when hits a network issue. The default is 3.

callback:

Function for processing partial messages, used for streaming responses to an end user.

Returns:

ChatMessage: The response text and details about the response from the LLM. For example:

ChatMessage(
    id='XXX',
    content='The information provided in the context...',
    reply_to='YYY',
    votes=0,
    created_at=datetime.datetime(2023, 10, 24, 20, 12, 34, 875026)
    type_list=[],
)

Raises:

TimeoutError: The request did not complete in time.

class h2ogpte.SessionAsync(chat_session_id: str, client: H2OGPTEAsync, prompt_template_id: str | None = None, open_timeout: int = 10, close_timeout: int = 10, max_connect_retries: int = 10, connect_retry_delay: int = 0.5, connect_retry_max_delay: int = 60)

Bases: object

Create and participate in a chat session. This is a live connection to the h2oGPTe server contained to a specific chat session on top of a single collection of documents. Users will find all questions and responses in this session in a single chat history in the UI. See Also:

H2OGPTE.connect: To initialize a session on an existing connection.

Args:

address:: Full URL of the h2oGPTe server to connect to.
api_key:: API key for authentication to the h2oGPTe server. Users can generate a key by accessing the UI and navigating to the Settings.
chat_session_id:: The ID of the chat session the queries should be sent to.
verify:: Whether to verify the server’s TLS/SSL certificate. Can be a boolean or a path to a CA bundle. Defaults to True.

Examples::

async with h2ogpte.connect(_chat_session_id) as session:

answer1 = await session.query(: ‘How many paper clips were shipped to Scranton?’

) answer2 = await session.query(

‘Did David Brent co-sign the contract with Initech?’

)

async connect()

Retrieval-augmented generation for a query on a collection. Finds a collection of chunks relevant to the query using similarity scores. Sends these and any additional instructions to an LLM. Format of questions or imperatives:

"{pre_prompt_query}
"""
{similar_context_chunks}
"""            {prompt_query}{message}"

Args:

message:

Query or instruction from the end user to the LLM.

system_prompt:

Text sent to models which support system prompts. Gives the model overall context in how to respond. Use auto or None for the model default. Defaults to ‘’ for no system prompt.

pre_prompt_query:

Text that is prepended before the contextual document chunks. The default can be customized per environment, but the standard default is "Pay attention and remember the information below, which will help to answer the question or imperative after the context ends.\n"

prompt_query:

Text that is appended to the beginning of the user’s message. The default can be customized per environment, but the standard default is “According to only the information in the document sources provided within the context above, “

image_batch_final_prompt:

Prompt for each image batch for vision models

image_batch_image_prompt:

Prompt to reduce all answers each image batch for vision models

pre_prompt_summary:

Not used, use H2OGPTE.process_document to summarize.

prompt_summary:

Not used, use H2OGPTE.process_document to summarize.

llm:

Name or index of LLM to send the query. Use H2OGPTE.get_llms() to see all available options. Use None or “auto” for automatic model routing, set cost_controls for detailed control over automatic routing.

llm_args:

Dictionary of kwargs to pass to the llm. Valid keys:

temperature (float, default: 0) — The value used to modulate the next token probabilities. Most deterministic: 0, Most creative: 1 top_k (int, default: 1) — The number of highest probability vocabulary tokens to keep for top-k-filtering. top_p (float, default: 1.0) — If set to float < 1, only the smallest set of most probable tokens with probabilities that add up to top_p or higher are kept for generation. seed (int, default: 0) — The seed for the random number generator when sampling during generation (if temp>0 or top_k>1 or top_p<1), seed=0 picks a random seed. repetition_penalty (float, default: 1.07) — The parameter for repetition penalty. 1.0 means no penalty. max_new_tokens (int, default: 1024) — Maximum number of new tokens to generate. This limit applies to each (map+reduce) step during summarization and each (map) step during extraction. min_max_new_tokens (int, default: 512) — minimum value for max_new_tokens when auto-adjusting for content of prompt, docs, etc. response_format (str, default: “text”) — Output type, one of [“text”, “json_object”, “json_code”]. guided_json (dict, default: None) — If specified, the output will follow the JSON schema. guided_regex (str, default: “”) — If specified, the output will follow the regex pattern. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_choice (Optional[List[str]], default: None — If specified, the output will be exactly one of the choices. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_grammar (str, default: “”) — If specified, the output will follow the context free grammar. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. guided_whitespace_pattern (str, default: “”) — If specified, will override the default whitespace pattern for guided json decoding. Only for models that support guided generation: check output of get_llms() for guided_vllm flag. enable_vision (str, default: “auto”) - Controls vision mode, send images to the LLM in addition to text chunks. Only if have models that support vision, use get_vision_capable_llm_names() to see list. One of [“on”, “off”, “auto”]. visible_vision_models (List[str], default: [“auto”]) - Controls which vision model to use when processing images. Use get_vision_capable_llm_names() to see list. Must provide exactly one model. [“auto”] for automatic. images_num_max (int, default: None) — Maximum number of images to process. json_preserve_system_prompt (bool, default: None) — Whether to preserve system prompt in JSON response. client_metadata (str, default: None) — Additional metadata to send with the request. min_chars_per_yield (int, default: 1) — Minimum characters to yield in streaming response. cost_controls: Optional dictionary

max_cost (float) - Sets the maximum allowed cost in USD per LLM call when doing Automatic model routing. If the estimated cost based on input and output token counts is higher than this limit, the request will fail as early as possible. max_cost_per_million_tokens (float) - Only consider models that cost less than this value in USD per million tokens when doing automatic routing. Using the max of input and output cost. model (List[str] or None) - Optional subset of models to consider when doing automatic routing. None means consider all models. willingness_to_pay (float) - Controls the willingness to pay extra for a more accurate model for every LLM call when doing automatic routing, in units of USD per +10% increase in accuracy. We start with the least accurate model. For each more accurate model, we accept it if the increase in estimated cost divided by the increase in estimated accuracy is no more than this value divided by 10%, up to the upper limit specified above. Lower values will try to keep the cost as low as possible, higher values will approach the cost limit to increase accuracy. 0 means unlimited. willingness_to_wait (float) - Controls the willingness to wait longer for a more accurate model for every LLM call when doing automatic routing, in units of seconds per +10% increase in accuracy. We start with the least accurate model. For each more accurate model, we accept it if the increase in estimated time divided by the increase in estimated accuracy is no more than this value divided by 10%. Lower values will try to keep the time as low as possible, higher values will take longer to increase accuracy. 0 means unlimited.

use_agent (bool, default: False) - If True, use the AI agent (with access to tools) to generate the response. agent_accuracy (str, default: “standard”) - Effort level by the agent. Only if use_agent=True. One of [“quick”, “basic”, “standard”, “maximum”]. agent_max_turns (Optional[Union[str, int]], default: “auto”) - Optional max. number of back-and-forth turns with the agent. Only if use_agent=True. Either “auto” or an integer. agent_tools (Optional[Union[str, List[str]]], default: “auto”) - Either “auto”, “all”, “any” to enable all available tools, or a specific list of tools to use. Only if use_agent=True. Get full list by calling await self._lang(“get_agent_tools_options”). agent_type (str, default: “auto”) — Type of agent to use for task processing. agent_original_files (List[str], default: None) — List of file paths for agent to process. agent_timeout (int, default: None) — Timeout in seconds for each agent turn. agent_total_timeout (int, default: 3600) — Total timeout in seconds for all agent processing. agent_code_writer_system_message (str, default: None) — System message for agent code writer. agent_num_executable_code_blocks_limit (int, default: 1) — Maximum number of executable code blocks. agent_system_site_packages (bool, default: True) — Whether agent has access to system site packages. agent_main_model (Optional[str], default: None) — Main model to use for agent. agent_max_stream_length (Optional[int], default: None) — Maximum stream length for agent response. agent_max_memory_usage (int, default: 16*1024**3) — Maximum memory usage for agent in bytes (16GB default). agent_main_reasoning_effort (Optional[int], default: None) — Effort level for main reasoning. agent_advanced_reasoning_effort (Optional[int], default: None) — Effort level for advanced reasoning. agent_max_confidence_level (Optional[int], default: None) — Maximum confidence level for agent responses. agent_planning_forced_mode (Optional[bool], default: None) — Whether to force planning mode for agent. agent_too_soon_forced_mode (Optional[bool], default: None) — Whether to force “too soon” mode for agent. agent_critique_forced_mode (Optional[int], default: None) — Whether to force critique mode for agent. agent_stream_files (bool, default: True) — Whether to stream files from agent.

self_reflection_config:

Dictionary of arguments for self-reflection, can contain the following string:string mappings:

llm_reflection: str
"gpt-4-0613" or "" to disable reflection

prompt_reflection: str
‘Here’s the prompt and the response: """Prompt:\n%s\n"""\n\n""" Response:\n%s\n"""\n\nWhat is the quality of the response for the given prompt? Respond with a score ranging from Score: 0/10 (worst) to Score: 10/10 (best), and give a brief explanation why.'

system_prompt_reflection: str
""

llm_args_reflection: str
"{}"

rag_config:

Dictionary of arguments to control RAG (retrieval-augmented-generation) types. Can contain the following key/value pairs: rag_type: str one of

"auto" Automatically select the best rag_type. "llm_only" LLM Only - Answer the query without any supporting document contexts.

Requires 1 LLM or Agent call.

"agent_only" Agent Only - Answer the query with only original files passed to agent.
Requires 1 Agent call.

"rag" RAG (Retrieval Augmented Generation) - Use supporting document contexts
to answer the query. Requires 1 LLM or Agent call.

"hyde1" LLM Only + RAG composite - HyDE RAG (Hypothetical Document Embedding).
Use ‘LLM Only’ response to find relevant contexts from a collection for generating a response. Requires 2 LLM calls.

"hyde2" HyDE + RAG composite - Use the ‘HyDE RAG’ response to find relevant
contexts from a collection for generating a response. Requires 3 LLM calls.

"rag+" Summary RAG - Like RAG, but uses more context and recursive
summarization to overcome LLM context limits. Keeps all retrieved chunks, puts them in order, adds neighboring chunks, then uses the summary API to get the answer. Can require several LLM calls.

"all_data" All Data RAG - Like Summary RAG, but includes all document
chunks. Uses recursive summarization to overcome LLM context limits. Can require several LLM calls.

hyde_no_rag_llm_prompt_extension: str

Add this prompt to every user’s prompt, when generating answers to be used for subsequent retrieval during HyDE. Only used when rag_type is “hyde1” or “hyde2”. example: '\nKeep the answer brief, and list the 5 most relevant key words at the end.'

num_neighbor_chunks_to_include: int

Number of neighboring chunks to include for every retrieved relevant chunk. Helps to keep surrounding context together. Only enabled for rag_type “rag+”. Defaults to 1.

meta_data_to_include:

A dictionary containing flags that indicate whether each piece of document metadata is to be included as part of the context for a chat with a collection. Default is {

“name”: True, “text”: True, “page”: True, “captions”: True, “uri”: False, “connector”: False, “original_mtime”: False, “age”: False, “score”: False,

}

rag_max_chunks:

Maximum number of document chunks to retrieve for RAG. If not specified (default: -1), actual number depends on rag_type and admin configuration. Set to >0 values to enable. Can be combined with rag_min_chunk_score.

rag_min_chunk_score:

Minimum score of document chunks to retrieve for RAG. If not specified (default: 0.0), will not filter chunks by score. Set to >0 values to enable. Can be combined with rag_max_chunks.

include_chat_history:

Whether to include chat history. Includes previous questions and answers for the current chat session for each new chat request. Disable if require deterministic answers for a given question. Choices are: [“on”,”off”,”auto”,True,False]

tags:

A list of tags from which to pull the context for RAG.

metadata_filter:

A dictionary to filter documents by metadata, from which to pull the context for RAG.

timeout:

Amount of time in seconds to allow the request to run. The default is 1000 seconds.

retries:

Amount of retries to allow the request to run when hits a network issue. The default is 3.

callback:

Function for processing partial messages, used for streaming responses to an end user.

Returns:

ChatMessage: The response text and details about the response from the LLM. For example:

ChatMessage(
    id='XXX',
    content='The information provided in the context...',
    reply_to='YYY',
    votes=0,
    created_at=datetime.datetime(2023, 10, 24, 20, 12, 34, 875026)
    type_list=[],
)

Raises:

TimeoutError: The request did not complete in time.

property websocket: WebSocketClientProtocol