DAI Engine Examples

This guide demonstrates how to use the Driverless AI Engine client with examples covering common operations.

Advanced users and app developers: see the module documentation for the full client reference.

Starting a Driverless AI engine

The simplest way to create a Driverless AI engine is to use the default configuration:

import h2o_engine_manager

# Initialize AIEM client.
dai_engine_client = h2o_engine_manager.login().dai_engine_client

# Create the DAIEngine using the default configuration.
# This call returns immediately, and does not wait for the engine to be ready.
engine = dai_engine_client.create_engine(
    display_name="My Basic DAI Engine"
)

# This call blocks until the engine is ready.
# It may take a few minutes.
engine.wait()

# Now DAI is ready to be connected to.
# This returns an initialized driverlessai.Client
# Note: If you encounter version compatibility issues, use backend_version_override="latest"
dai = engine.connect()

# Import a dataset
ds = dai.datasets.create(
    data='s3://h2o-public-test-data/smalldata/iris/iris.csv',
    data_source='s3'
)

# Print column summary for the first three columns
print(ds.column_summaries()[0:3])

Advanced Driverless AI engine

For more control, you can specify custom configuration parameters including CPU, GPU, memory, storage, timeouts, profiles, and versions:

import h2o_engine_manager

# Initialize AIEM clients.
aiem = h2o_engine_manager.login()
dai_engine_client = aiem.dai_engine_client
dai_engine_profile_client = aiem.dai_engine_profile_client
dai_engine_version_client = aiem.dai_engine_version_client

# Set ID of the workspace that will contain this engine.
# You can use alias 'default' for your personal workspace.
workspace_id = "default"
# Set ID of the engine.
engine_id = "my-engine"

# You can list available profiles.
profiles = dai_engine_profile_client.list_all_assigned_dai_engine_profiles(parent="workspaces/global")
print("Profiles:")
for profile in profiles:
    print("\t", profile.name)

# You can list available versions.
versions = dai_engine_version_client.list_all_dai_engine_versions(parent="workspaces/global")
print("Versions:")
for version in versions:
    print("\t", version.name)

# Create the DAIEngine with custom parameters.
# This call returns immediately, and does not wait for the engine to be ready.
engine = dai_engine_client.create_engine(
    workspace_id=workspace_id,
    engine_id=engine_id,
    display_name="My Advanced DAI Engine",
    cpu=4,
    gpu=1,
    memory_bytes="12Gi",
    storage_bytes="64Gi",
    max_idle_duration="3000s",
    max_running_duration="5000s",
    config={"max_runtime_minutes": "120", "feature_engineering_effort": "5"},
    profile="workspaces/global/daiEngineProfiles/default",  # One of the listed profiles.
    dai_engine_version="workspaces/global/daiEngineVersions/2.0.0",  # One of the listed versions.
)

# This call blocks until the engine is ready.
# It may take a few minutes.
engine.wait()

# Now DAI is ready to be connected to.
# This returns an initialized driverlessai.Client
# Note: If you encounter version compatibility issues, use backend_version_override="latest"
dai = engine.connect()
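
The duration parameters in the example above are strings of whole seconds followed by "s" (for example "3000s"). The following is a minimal sketch of a helper that builds such strings from a timedelta, which can be easier to read than raw second counts. The name to_duration_string is hypothetical and not part of the client; the sketch assumes only the "<seconds>s" format shown in the example.

```python
from datetime import timedelta


def to_duration_string(delta: timedelta) -> str:
    """Format a timedelta as a whole-second duration string such as "3000s".

    Hypothetical helper: it assumes the "<seconds>s" format used in the
    example above and truncates any fractional seconds.
    """
    seconds = int(delta.total_seconds())
    if seconds < 0:
        raise ValueError("duration must not be negative")
    return f"{seconds}s"


# The same values as in the example above, written as timedeltas.
max_idle_duration = to_duration_string(timedelta(minutes=50))
max_running_duration = to_duration_string(timedelta(hours=1, minutes=23, seconds=20))
```

The resulting strings could then be passed as max_idle_duration and max_running_duration to create_engine.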

Fetching a Driverless AI engine

Retrieve an existing engine by its workspace ID and engine ID:

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

print(engine)

Listing Driverless AI engines

The list_engines method supports pagination, filtering, and sorting. The following examples demonstrate basic usage:

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

# Specify a workspace in which you want to list engines.
workspace_id = "default"

# By default, engines are ordered by their time of creation in descending order
# and limited to 50 items per page.
list1 = dai_engine_client.list_engines(workspace_id=workspace_id)

# To list a second page, pass the next_page_token from the previous page.
# This example increases the page size to 100.
list2 = dai_engine_client.list_engines(
    workspace_id=workspace_id,
    page_size=100,
    page_token=list1.next_page_token
)

# Engine ordering works on any documented field. This example orders engines
# by the most recently started first and the amount of CPU units second.
list3 = dai_engine_client.list_engines(
    workspace_id=workspace_id,
    order_by="resume_time desc, cpu desc"
)

# Engine filtering works on any documented field. This example filters engines
# to only running engines with 5 CPU units or more.
list4 = dai_engine_client.list_engines(
    workspace_id=workspace_id,
    filter="state = STATE_RUNNING AND cpu >= 5"
)

Note: The convenience function list_all_engines pages through all engines and returns a complete list. Pagination is strongly recommended, but this function is useful for quick prototyping.
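
When you do want to walk every page yourself, the token-passing pattern from the examples above can be wrapped in a generator. This is only a sketch: the helper name iter_engines is hypothetical, and it assumes that each returned page exposes its items as page.engines and that an empty next_page_token marks the last page. Check both assumptions against the module documentation.

```python
from typing import Any, Iterator, Optional


def iter_engines(client: Any, workspace_id: str, page_size: int = 50) -> Iterator[Any]:
    """Yield engines page by page until no next_page_token is returned.

    Hypothetical helper built on list_engines(). Assumes each page has
    `engines` and `next_page_token` attributes.
    """
    page_token: Optional[str] = None
    while True:
        kwargs = {"workspace_id": workspace_id, "page_size": page_size}
        if page_token:
            # Only pass a token once a previous page has supplied one.
            kwargs["page_token"] = page_token
        page = client.list_engines(**kwargs)
        yield from page.engines
        page_token = page.next_page_token
        if not page_token:
            break
```

Because it is a generator, a caller can stop early (for example after finding the first matching engine) without fetching the remaining pages.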

Pausing a Driverless AI engine

Pause engines that you are not using to save costs. Pausing terminates all running jobs, but all other state is persisted.

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

engine.pause()

# You may wait for the engine to finish pausing.
engine.wait()

Note: An engine can be paused automatically after it exceeds its idle or uptime limit.
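
To avoid leaving an engine running after a script finishes or fails, the pause call can be placed in a context manager. The following is a minimal sketch, not part of the client: paused_on_exit is a hypothetical name, and it relies only on the pause() and wait() methods shown above.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def paused_on_exit(engine: Any) -> Iterator[Any]:
    """Yield the engine, then pause it even if the body raises.

    Hypothetical helper; uses only engine.pause() and engine.wait().
    """
    try:
        yield engine
    finally:
        engine.pause()
        # Block until the engine has finished pausing.
        engine.wait()


# Example usage (assumes `engine` was fetched or created as shown above):
#
# with paused_on_exit(engine) as running:
#     dai = running.connect()
#     ...  # work with Driverless AI
```

Remember that pausing terminates running jobs, so the body of the with block should finish any work it needs before exiting.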

Resuming a Driverless AI engine

A paused engine can be resumed to continue using it:

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

# This returns immediately and does not wait for the engine to be ready.
engine.resume()

# This call blocks until the engine is ready.
# It may take a few minutes.
engine.wait()

# Note: If you encounter version compatibility issues, use backend_version_override="latest"
dai = engine.connect()
# ...

Updating a Driverless AI engine

You can modify an existing engine with the update method. The engine must be paused before updating.

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

# Fetch the engine first.
engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

# Update the engine fields directly.
engine.cpu = 2
engine.memory_bytes = "16Gi"
engine.max_idle_duration = "3600s"

# Persist changes to the engine.
engine.update()
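
Because the engine must be paused before updating, a running engine needs a pause, update, and optional resume sequence. The sketch below strings those calls together. update_paused is a hypothetical helper name, and the sketch assumes the engine is currently running; how pause() behaves on an engine that is already paused is not covered here.

```python
from typing import Any


def update_paused(engine: Any, resume_after: bool = False, **fields: Any) -> Any:
    """Pause the engine, apply field changes, and persist them.

    Hypothetical helper; uses only pause(), wait(), update(), and resume()
    as shown in the examples in this guide.
    """
    engine.pause()
    engine.wait()  # Block until the engine has finished pausing.

    # Set the requested fields directly on the engine object.
    for name, value in fields.items():
        setattr(engine, name, value)
    engine.update()

    if resume_after:
        engine.resume()
        engine.wait()  # Block until the engine is ready again.
    return engine


# Example usage (assumes `engine` was fetched as shown above):
#
# update_paused(engine, resume_after=True, cpu=2, memory_bytes="16Gi")
```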

Upgrading a Driverless AI engine version

You can upgrade an engine to a newer Driverless AI version. The engine must be paused before upgrading.

import h2o_engine_manager

# Initialize AIEM clients.
aiem = h2o_engine_manager.login()
dai_engine_client = aiem.dai_engine_client
dai_engine_version_client = aiem.dai_engine_version_client

# List available versions to find the target version.
versions = dai_engine_version_client.list_all_dai_engine_versions(parent="workspaces/global")
print("Available versions:")
for v in versions:
    print(f" {v.name}")

# Fetch the engine.
engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

# Upgrade to a specific version.
target_version = "workspaces/global/daiEngineVersions/2.1.0"
engine.upgrade_dai_engine_version(new_dai_engine_version=target_version)

You can also upgrade to the latest version using the latest alias:

# Upgrade to the latest version.
engine.upgrade_dai_engine_version(new_dai_engine_version="workspaces/global/daiEngineVersions/latest")
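
If you want to pick a specific target instead of the latest alias, the listed version names can be compared numerically. The following sketch assumes version resource names end in a dot-separated numeric version, as in the examples above ("workspaces/global/daiEngineVersions/2.1.0"). The helper names are hypothetical, and names that do not fit that pattern (aliases, pre-release suffixes) are simply skipped.

```python
from typing import Iterable, List, Tuple


def version_key(resource_name: str) -> Tuple[int, ...]:
    """Turn ".../daiEngineVersions/2.1.0" into (2, 1, 0) for comparison.

    Raises ValueError if the last path segment is not purely numeric.
    """
    version = resource_name.rsplit("/", 1)[-1]
    return tuple(int(part) for part in version.split("."))


def newer_versions(names: Iterable[str], current: str) -> List[str]:
    """Return the names whose version is greater than `current`, oldest first."""
    current_key = version_key(current)
    candidates = []
    for name in names:
        try:
            key = version_key(name)
        except ValueError:
            # Skip names this simple parser cannot compare (e.g. aliases).
            continue
        if key > current_key:
            candidates.append((key, name))
    return [name for _, name in sorted(candidates)]


# Example usage (assumes `versions` was listed as shown above):
#
# targets = newer_versions(
#     (v.name for v in versions),
#     current="workspaces/global/daiEngineVersions/2.0.0",
# )
```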

Resizing engine storage

You can increase the storage size of an engine. The engine must be paused before resizing storage.

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

# Fetch the engine.
engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

# Resize storage to 100Gi.
engine.resize_storage(new_storage="100Gi")

Downloading engine logs

You can download logs from a running engine for troubleshooting:

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

# Fetch the engine.
engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

# Download and print logs.
logs = engine.download_logs()
print(logs)

Migrating engine creator (Admin only)

Administrators can migrate the creator of an engine to a different user. This is useful for transferring ownership.

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

# Fetch the engine.
engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

# Migrate creator to a new user.
engine.migrate_creator(new_creator="users/397b8c16-f4cb-41dd-a5e9-5e838edb81ab")

Deleting a Driverless AI engine

Engine deletion is a destructive operation that immediately terminates the engine and deletes all data contained within. It cannot be undone.

import h2o_engine_manager

dai_engine_client = h2o_engine_manager.login().dai_engine_client

engine = dai_engine_client.get_engine(
    workspace_id="default",
    engine_id="my-engine"
)

engine.delete()

# You may wait for the engine to finish deleting.
engine.wait()

Asynchronous waiting

The client can also wait for an engine without blocking, using async/await syntax. Inside an async function, await wait_async() instead of calling wait():

engine = dai_engine_client.create_engine(...)

# This method utilizes asyncio.sleep() to asynchronously poll engine status.
await engine.wait_async()
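
One use of asynchronous waiting is to wait for several engines at the same time rather than one after another. The following is a minimal sketch using asyncio.gather from the standard library. wait_for_all is a hypothetical helper name; the only client method it relies on is wait_async().

```python
import asyncio
from typing import Any, Iterable, List


async def wait_for_all(engines: Iterable[Any]) -> List[Any]:
    """Wait for every engine concurrently and return them in input order.

    Hypothetical helper; each engine must provide an awaitable wait_async().
    """
    engines = list(engines)
    # gather() runs all waits concurrently and raises if any of them fails.
    await asyncio.gather(*(engine.wait_async() for engine in engines))
    return engines
```

From synchronous code, this could be driven with asyncio.run(wait_for_all([engine_a, engine_b])), where engine_a and engine_b are objects returned by create_engine.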

Eventual consistency

The API is eventually consistent: a read issued immediately after a write may not yet reflect that write. The client is designed to hide this, provided that you always use the object returned by a write operation instead of fetching it again.

Incorrect (the read may fail with a not-found error):

dai_engine_client.create_engine(...)
engine = dai_engine_client.get_engine(...)

Correct:

engine = dai_engine_client.create_engine(...)

Example notebook

You can find a Jupyter notebook with examples here.

