H2O Engine Examples

This guide walks you through using the H2O Engine client in a few examples.

Advanced users and app developers

See the module documentation for full client reference.

Listing available H2O versions

Before you start an engine, you should decide which version to use. The following example prints a list of all available versions.

import h2o_engine_manager

h2o_engine_client = h2o_engine_manager.login().h2o_engine_client

print(h2o_engine_client.list_all_versions())
# [{'aliases': [], 'annotations': {}, 'version': '3.36.1.3'}, {'aliases': ['latest'], 'annotations': {}, 'version': '3.36.0.4'}]

The version value is used in the engine creation step. You can also use any of the aliases to reference a given version.
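As a minimal sketch of how an alias maps to a concrete version, the helper below resolves either a version string or an alias against a list shaped like the printed output above. `resolve_version` is a hypothetical helper, not part of the client, and it assumes each entry can be read like a dictionary with `version` and `aliases` keys.

```python
def resolve_version(versions, name):
    """Return the concrete version string for a version or alias name."""
    for entry in versions:
        if name == entry["version"] or name in entry["aliases"]:
            return entry["version"]
    raise ValueError(f"unknown version or alias: {name!r}")


# Sample data copied from the output shown above.
available = [
    {"aliases": [], "annotations": {}, "version": "3.36.1.3"},
    {"aliases": ["latest"], "annotations": {}, "version": "3.36.0.4"},
]

resolved_latest = resolve_version(available, "latest")
print(resolved_latest)
```

Resolving the alias up front can help when you need to know which concrete version to install locally.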

caution

You must install a matching h2o-3 Python client version to connect to the engine.
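One way to catch a mismatch early is to compare the two version strings before connecting. The sketch below is illustrative only: `require_matching_client` is a hypothetical helper, it assumes "matching" means an exact string match, and it assumes the client is installed from PyPI under the name `h2o`.

```python
def require_matching_client(client_version, engine_version):
    """Raise if the local h2o client version differs from the engine version."""
    if client_version != engine_version:
        raise RuntimeError(
            f"h2o client {client_version} does not match engine {engine_version}; "
            f"try: pip install h2o=={engine_version}"
        )
    return engine_version


# In real use, client_version would come from h2o.__version__.
checked = require_matching_client("3.36.1.3", "3.36.1.3")
```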

Starting an H2O engine

import h2o_engine_manager
import h2o

# Initialize the client.
h2o_engine_client = h2o_engine_manager.login().h2o_engine_client

# Set the name of the workspace that will contain this engine.
workspace_id = "my-first-workspace"
# Set the name of the engine.
engine_id = "my-engine"

# Create the H2O engine.
# This call returns immediately, and does not wait for the engine to be ready.
engine = h2o_engine_client.create_engine(
    workspace_id=workspace_id,
    engine_id=engine_id,
    version=h2o.__version__,
    node_count=1,
    cpu=8,
    gpu=0,
    memory_bytes="64Gi",
    max_idle_duration="1h"
)

# This call blocks until the engine is ready.
# It may take a few minutes.
engine.wait()

# Now the H2O engine is ready to be connected to.
h2o.connect(config=engine.get_connection_config())

# Import a dataset
df = h2o.import_file(path="s3://h2o-public-test-data/smalldata/iris/iris.csv")
df.show()

Fetching an H2O engine

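This fetches an existing engine by its workspace and engine ID. To fetch the engine created in the previous example, pass the same workspace_id that was used to create it.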

import h2o_engine_manager

h2o_engine_client = h2o_engine_manager.login().h2o_engine_client

engine = h2o_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

print(engine)

Listing H2O engines

The list_engines method supports pagination, filtering, and sorting. The following example shows basic listing usage.

import h2o_engine_manager

h2o_engine_client = h2o_engine_manager.login().h2o_engine_client

# Specify a workspace in which you want to list engines.
workspace_id = "default"

# By default, engines are ordered by their time of creation in descending order
# and limited to 50 items per page.
list1 = h2o_engine_client.list_engines(workspace_id=workspace_id)

# To list a second page, pass the next_page_token from the previous page.
# This example increases the page size to 100.
list2 = h2o_engine_client.list_engines(
    workspace_id=workspace_id,
    page_size=100,
    page_token=list1.next_page_token
)

# Engine ordering works on any documented field. This example orders engines
# by the most recently started first and the amount of CPU units second.
list3 = h2o_engine_client.list_engines(
    workspace_id=workspace_id,
    order_by="resume_time desc, cpu desc"
)

# Engine filtering works on any documented field. This example keeps only
# running engines with 5 CPU units or more.
list4 = h2o_engine_client.list_engines(
    workspace_id=workspace_id,
    filter="state = STATE_RUNNING AND cpu >= 5"
)

info

A convenience function list_all_engines is provided that pages through all engines and returns a complete list. Although it is strongly recommended to use pagination, this function is useful for quick prototyping.
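To make the paging behaviour concrete, here is a rough sketch of the kind of loop such a convenience function might run. It is not the client's actual implementation. `collect_all_engines` and the fake pages are hypothetical, and the sketch assumes each page exposes an `engines` list and that an empty `next_page_token` marks the last page.

```python
from types import SimpleNamespace


def collect_all_engines(list_page):
    """Call list_page(page_token) repeatedly until no next_page_token remains."""
    engines = []
    page_token = ""
    while True:
        page = list_page(page_token)
        engines.extend(page.engines)
        page_token = page.next_page_token
        if not page_token:
            return engines


# Stand-in for h2o_engine_client.list_engines, serving two fake pages.
fake_pages = {
    "": SimpleNamespace(engines=["engine-a", "engine-b"], next_page_token="t1"),
    "t1": SimpleNamespace(engines=["engine-c"], next_page_token=""),
}

all_engines = collect_all_engines(lambda token: fake_pages[token])
print(all_engines)
```

With the real client, `list_page` could be a small wrapper that forwards `page_token` to `list_engines`. Whether an empty token is accepted for the first page is an assumption here.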

Deleting an H2O engine

Engine deletion is a destructive operation that immediately terminates the engine and deletes all data contained within. It cannot be undone.

import h2o_engine_manager

h2o_engine_client = h2o_engine_manager.login().h2o_engine_client

engine = h2o_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

engine.delete()

# You may wait for the engine to finish deleting.
engine.wait()

Asynchronous waiting

The client can wait for an engine without blocking, using the async/await syntax. Inside an async function, you can await the engine's wait_async method:

engine = h2o_engine_client.create_engine(...)

# This method utilizes asyncio.sleep() to asynchronously poll engine status.
await engine.wait_async()
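The main benefit of asynchronous waiting shows up when several engines start at once. The sketch below waits for two engines concurrently with `asyncio.gather`. `FakeEngine` is a hypothetical stand-in that only mimics `wait_async()`, so the example runs without a real engine.

```python
import asyncio


class FakeEngine:
    """Stand-in for an engine object; only mimics wait_async()."""

    def __init__(self, name, delay):
        self.name = name
        self.delay = delay
        self.ready = False

    async def wait_async(self):
        # Simulate polling until the engine is ready.
        await asyncio.sleep(self.delay)
        self.ready = True


async def wait_for_all(engines):
    # gather() runs all waits concurrently, so total time is roughly
    # the slowest single wait rather than the sum of all waits.
    await asyncio.gather(*(engine.wait_async() for engine in engines))


fake_engines = [FakeEngine("a", 0.02), FakeEngine("b", 0.01)]
asyncio.run(wait_for_all(fake_engines))
print([engine.ready for engine in fake_engines])
```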

Eventual consistency

The API is eventually consistent, meaning that a read issued immediately after a write operation may not yet reflect the written changes. This client is designed to eliminate the hurdles of eventual consistency if used correctly. Always use the object returned by write operations.

Incorrect (the read may fail with a not-found error):
h2o_engine_client.create_engine(...)
engine = h2o_engine_client.get_engine(...)

Correct:
engine = h2o_engine_client.create_engine(...)
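The difference between the two patterns above can be illustrated with a toy model. `FakeEventuallyConsistentApi` is entirely hypothetical and says nothing about how the real service replicates data. It only simulates a read that lags one call behind a write, to show why the object returned by the write is the safe one to use.

```python
class FakeEventuallyConsistentApi:
    """Toy API whose reads lag behind writes by one call (illustration only)."""

    def __init__(self):
        self._visible = {}
        self._pending = {}

    def create_engine(self, engine_id):
        engine = {"engine_id": engine_id}
        self._pending[engine_id] = engine  # not yet visible to reads
        return engine  # the write response is always up to date

    def get_engine(self, engine_id):
        if engine_id not in self._visible:
            # Simulate replication catching up after this failed read.
            self._visible.update(self._pending)
            self._pending.clear()
            raise LookupError(f"engine {engine_id!r} not found")
        return self._visible[engine_id]


api = FakeEventuallyConsistentApi()
engine = api.create_engine("my-engine")  # correct: keep the returned object

try:
    api.get_engine("my-engine")  # incorrect: immediate read after write
    first_read_failed = False
except LookupError:
    first_read_failed = True
```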
