Drift Detection

Drift Detection is a validation test that identifies changes in the distribution of variables in your model's input data, so that you can respond before model performance degrades.

H2O Model Validation performs drift detection by comparing the training dataset with a reference dataset captured at a different time to assess how the data has changed. The Population Stability Index (PSI) formula is applied to each variable to measure how much its distribution has shifted between the two datasets. PSI is applied to numerical and categorical columns, but not to date columns. The PSI formula is as follows:

PSI formula
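
For reference, PSI is commonly defined as shown below. The symbols are chosen here for illustration: B is the number of bins (or categories), and p_i and q_i are the proportions of observations falling into bin i in the training and reference datasets, respectively. How H2O Model Validation bins numerical columns is not specified in this section, so treat the binning as an assumption.

```latex
\mathrm{PSI} = \sum_{i=1}^{B} \left( p_i - q_i \right) \ln\!\left( \frac{p_i}{q_i} \right)
```

Each term is non-negative, so PSI is zero when the two distributions match in every bin and grows as they diverge. The expression is also symmetric in p and q.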

A higher PSI indicates greater drift in a variable. When variables that are important to the model have a high PSI, model performance is more likely to deteriorate, and the model may need to be retrained.
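
To make the per-variable computation concrete, here is a minimal sketch of PSI for a single numerical column using NumPy. It is not H2O Model Validation's implementation: the function name `psi`, the equal-width binning over the pooled range, the default of 10 bins, and the small `eps` floor for empty bins are all assumptions made for illustration.

```python
import numpy as np


def psi(train, reference, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples (illustrative sketch)."""
    train = np.asarray(train, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Assumption: equal-width bins spanning the pooled range of both samples,
    # so every observation falls inside some bin.
    lo = min(train.min(), reference.min())
    hi = max(train.max(), reference.max())
    edges = np.linspace(lo, hi, bins + 1)

    train_counts, _ = np.histogram(train, bins=edges)
    ref_counts, _ = np.histogram(reference, bins=edges)

    # Convert counts to proportions; the eps floor avoids log(0) and
    # division by zero when a bin is empty in one sample.
    p = np.clip(train_counts / train_counts.sum(), eps, None)
    q = np.clip(ref_counts / ref_counts.sum(), eps, None)

    return float(np.sum((p - q) * np.log(p / q)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_col = rng.normal(loc=0.0, scale=1.0, size=5000)
    drifted_col = rng.normal(loc=1.0, scale=1.0, size=5000)  # mean shifted by 1

    print("PSI, identical data:", psi(train_col, train_col))
    print("PSI, shifted data:  ", psi(train_col, drifted_col))
```

For a categorical column, the same sum would be taken over category frequencies instead of histogram bins. Running the function on each column and sorting the results by PSI gives a simple ranking of which variables have drifted most.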

Note

  • See Settings to learn about all the settings for a Drift Detection validation test.

  • See Metrics to learn about all the metrics for a Drift Detection validation test.