Settings: Adversarial Similarity

H2O Model Validation offers an array of settings for an Adversarial Similarity test. Below, each setting is described in turn.

Training dataset

Defines the training dataset, one of the two datasets H2O Model Validation uses during the validation test to identify similar and dissimilar rows between the training dataset and the reference (test) dataset. The training dataset dictates the required structure of the reference dataset (that is, matching columns). H2O Model Validation requires you to define this setting before it can initiate an Adversarial Similarity validation test.

Reference dataset

Defines the reference (test) dataset, one of the two datasets H2O Model Validation uses during the validation test to identify similar and dissimilar rows between the training dataset and the reference dataset. The training dataset dictates the required structure of the reference dataset (that is, matching columns). H2O Model Validation requires you to define this setting before it can initiate an Adversarial Similarity validation test.

Note

H2O Model Validation drops a column from the reference dataset if that column is not present in the defined training dataset.
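
At a high level, an adversarial similarity test trains a binary classifier to tell training rows apart from reference rows: the better that classifier performs, the more the two datasets differ. The sketch below illustrates this general idea with scikit-learn; it is not H2O Model Validation's internal implementation, and it assumes numeric features with no missing values.

```python
# Minimal sketch of the adversarial-similarity idea (illustration only,
# not H2O Model Validation's internal implementation).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

def adversarial_similarity_auc(train_df: pd.DataFrame, reference_df: pd.DataFrame) -> float:
    # Keep only columns present in both datasets (see the note above:
    # reference columns absent from the training data are dropped).
    shared = [c for c in train_df.columns if c in reference_df.columns]
    X = pd.concat([train_df[shared], reference_df[shared]], ignore_index=True)

    # Label rows by origin: 0 = training dataset, 1 = reference dataset.
    y = np.r_[np.zeros(len(train_df)), np.ones(len(reference_df))]

    # Cross-validated probability that a row comes from the reference dataset.
    model = GradientBoostingClassifier(random_state=0)
    proba = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]

    # AUC near 0.5 means the datasets look similar; AUC near 1.0 means the
    # rows are easy to tell apart, i.e., a high degree of dissimilarity.
    return roc_auc_score(y, proba)
```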

ID column

Defines the ID column of the training and reference datasets, which H2O Model Validation excludes during training.

Note

An identity (ID) column is a column in a dataset that uniquely identifies each row.

Columns to drop

Defines the columns H2O Model Validation drops during model training.

Info

This setting is useful when you want to drop columns that cause high dissimilarity (for example, a time column).
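
As a concrete illustration, the snippet below shows how the ID column and any user-specified columns might be excluded before the adversarial model is trained. The function and column names are hypothetical placeholders, not part of H2O Model Validation's API.

```python
# Hedged example: exclude the ID column and user-specified columns before
# training the adversarial model. Names below are placeholders.
import pandas as pd

def prepare_features(df: pd.DataFrame, id_column: str, columns_to_drop: list[str]) -> pd.DataFrame:
    # The ID column only identifies rows, and columns such as a timestamp can
    # dominate the dissimilarity signal, so both are removed before training.
    drop = [id_column, *columns_to_drop]
    return df.drop(columns=[c for c in drop if c in df.columns])

# Example: drop a hypothetical "row_id" identifier and a "timestamp" column
# that would otherwise make the two datasets trivially distinguishable.
# features = prepare_features(train_df, id_column="row_id", columns_to_drop=["timestamp"])
```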

Compute Shapley values

Determines whether H2O Model Validation computes Shapley values for the model used to analyze the similarity between the training and reference datasets. H2O Model Validation uses the generated Shapley values to create an array of visual metrics that provide insight into how individual features contribute to the overall model performance.

Note

  • Generating Shapley values for the model can significantly increase the runtime of the validation test.

  • The generated visual metrics can help you understand what might cause a higher degree of dissimilarity between the training and reference datasets. To learn more about the generated visual metrics, see Metrics.
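
The sketch below shows one way per-feature Shapley attributions for the adversarial model could be computed with the open-source shap package; H2O Model Validation's own computation may differ. Features with large mean absolute values are the ones driving the separation between the training and reference rows.

```python
# Illustrative only: ranks features by mean absolute SHAP value for a model
# trained to separate training rows (0) from reference rows (1). Uses the
# open-source `shap` package; not H2O Model Validation's internal code.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

def shapley_dissimilarity_ranking(X: pd.DataFrame, y: np.ndarray) -> pd.Series:
    model = GradientBoostingClassifier(random_state=0).fit(X, y)

    # For a binary gradient-boosted model, TreeExplainer returns one
    # attribution per row and feature (log-odds contribution).
    shap_values = shap.TreeExplainer(model).shap_values(X)

    # Average magnitude per feature: larger values point to features that
    # contribute most to telling the two datasets apart.
    importance = np.abs(shap_values).mean(axis=0)
    return pd.Series(importance, index=X.columns).sort_values(ascending=False)
```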

Remove validation experiments from DAI after finish

Determines whether H2O Model Validation deletes the Driverless AI (DAI) experiments generated during the Adversarial Similarity test. This setting is enabled by default; accordingly, H2O Model Validation deletes all DAI experiments after the validation test is complete because they are no longer needed.