Interpreting Datasets
Datasets are used either to interpret models (for example, by the Decision Tree explainer) or to be interpreted themselves (for example, by the Drift Detection Explainer). Datasets can be provided as:
path to the dataset stored as a CSV or .jay file
datatable.Frame instance
pandas.DataFrame instance
h2o.H2OFrame instance
h2o_sonar.lib.api.datasets.ExplainableDatasetHandle instance
h2o_sonar.lib.api.datasets.ExplainableDataset instance
See also:
Explainable Dataset
h2o_sonar.lib.api.datasets.ExplainableDataset is typically used when dataset metadata must be specified, or when the h2o_sonar.lib.api.datasets.DatasetApi::create_dataset() method is used to create an ExplainableDataset. In the latter case, metadata such as the shape, the columns, and the frequencies of unique column values is constructed automatically.
dataset: datasets.ExplainableDataset = (
    self.container.dataset_api.create_dataset(
        dataset_src=...,  # path to the dataset or a frame instance
    )
)
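To give a sense of what this kind of metadata looks like, the following is a minimal sketch that computes the shape, the columns, and the unique value frequencies for a small CSV sample using only the Python standard library. It is an illustration, not the h2o_sonar implementation: the function name summarize_csv, the dictionary keys, and the sample data are all hypothetical, and the metadata actually stored by ExplainableDataset may be structured differently.

```python
import csv
import io
from collections import Counter


def summarize_csv(text: str) -> dict:
    """Hypothetical helper: compute shape, columns and value frequencies for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    columns = list(reader.fieldnames or [])
    # One counter per column to tally how often each distinct value occurs.
    frequencies = {name: Counter() for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            frequencies[name][row[name]] += 1
    return {
        "shape": (row_count, len(columns)),
        "columns": columns,
        "unique_value_frequencies": {k: dict(v) for k, v in frequencies.items()},
    }


# Made-up sample data, used only for illustration.
SAMPLE = "age,approved\n31,yes\n45,no\n31,yes\n"
metadata = summarize_csv(SAMPLE)
print(metadata)
```

Values are kept as strings here because the csv module does not infer column types; a real dataset library would typically detect types as well.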
Explainable Dataset Handle
h2o_sonar.lib.api.datasets.ExplainableDatasetHandle represents a remote dataset hosted, for example, by a Driverless AI server. It is used, for instance, by H2O Model Validation based explainers, which use Driverless AI servers as workers to explain models. The string serialization of an explainable dataset handle (used, for instance, on the command line) has the following format:
resource:connection:<connection ID>:key:<dataset ID>
where:
connection ID is the unique identifier of the Driverless AI connection specified in the H2O Eval Studio configuration.
dataset ID is the unique identifier of the dataset hosted by the Driverless AI server (typically a UUID).
Example:
resource:connection:local-driverless-ai-server:key:7965e2ea-f898-11ed-b979-106530ed5ceb
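The handle format described above can be split into its two parts with plain string operations. The sketch below is a hypothetical helper, not part of the h2o_sonar API: the names DatasetHandleParts, parse_dataset_handle, and format_dataset_handle are made up for illustration, and the code assumes that the connection ID never contains the ":key:" separator.

```python
from typing import NamedTuple

# Literal pieces of the documented format:
# resource:connection:<connection ID>:key:<dataset ID>
PREFIX = "resource:connection:"
SEPARATOR = ":key:"


class DatasetHandleParts(NamedTuple):
    """Hypothetical container for the two variable parts of a handle string."""

    connection_id: str
    dataset_id: str


def parse_dataset_handle(text: str) -> DatasetHandleParts:
    """Split a serialized handle into its connection ID and dataset ID."""
    if not text.startswith(PREFIX):
        raise ValueError(f"handle must start with {PREFIX!r}: {text!r}")
    rest = text[len(PREFIX):]
    # Assumption: the connection ID does not itself contain ':key:'.
    connection_id, separator, dataset_id = rest.partition(SEPARATOR)
    if not separator or not connection_id or not dataset_id:
        raise ValueError(f"handle is missing a connection ID or dataset ID: {text!r}")
    return DatasetHandleParts(connection_id, dataset_id)


def format_dataset_handle(parts: DatasetHandleParts) -> str:
    """Build the serialized handle string from its parts."""
    return f"{PREFIX}{parts.connection_id}{SEPARATOR}{parts.dataset_id}"


example = "resource:connection:local-driverless-ai-server:key:7965e2ea-f898-11ed-b979-106530ed5ceb"
parts = parse_dataset_handle(example)
print(parts.connection_id)
print(parts.dataset_id)
```

A check like this can be useful for validating a handle string before passing it on the command line, since a malformed string fails early with a readable error.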