Supported derived transformations
A transformation changes raw data into a form that a model can use.
Spark pipeline
This transformation creates a feature set by using a Spark pipeline. The Spark pipeline generates the data from an existing feature set that you pass in as input to the pipeline. Feature Store then uploads the Spark pipeline to the Feature Store artifacts cache and stores only the location of the pipeline in the database.
User API:
- Python
- Scala
Parameters (Python):
pipeline_local_location: String or Pipeline Object
- you pass the local path to the pipeline or the pipeline object itself. Once the feature set is registered, this parameter contains the path to the uploaded Spark pipeline in the Feature Store artifacts storage.
import featurestore.core.transformations as t
spark_pipeline_transformation = t.SparkPipeline("...")
Parameters (Scala):
pipelineLocalLocation: String or Pipeline Object
- you pass the local path to the pipeline or the pipeline object itself. Once the feature set is registered, this parameter contains the path to the uploaded Spark pipeline in the Feature Store artifacts storage.
import ai.h2o.featurestore.core.transformations.SparkPipeline
val sparkPipelineTransformation = SparkPipeline("...")
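The upload behavior described above (the pipeline is copied to the artifacts cache, and only its location is stored) can be pictured with a small standard-library model. This is a conceptual sketch only, not Feature Store code: `register_pipeline`, the directory layout, and the dictionary standing in for the database are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def register_pipeline(local_location, artifacts_cache, database, feature_set):
    """Hypothetical model: copy a saved pipeline into a cache, record only its path."""
    source = Path(local_location)
    target = Path(artifacts_cache) / feature_set / source.name
    # A saved Spark pipeline is a directory, so copy the whole tree.
    shutil.copytree(source, target)
    # The "database" keeps the location of the artifact, not its contents.
    database[feature_set] = str(target)
    return database[feature_set]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Stand-in for a pipeline saved locally (assumed layout).
    local_pipeline = root / "local" / "my_pipeline"
    local_pipeline.mkdir(parents=True)
    (local_pipeline / "metadata.json").write_text("{}")

    database = {}
    stored = register_pipeline(
        str(local_pipeline), root / "artifacts_cache", database, "derived_fs"
    )

    # After registration the recorded location points into the cache,
    # not at the original local path.
    print(stored != str(local_pipeline))
    print(Path(stored, "metadata.json").exists())
```

The same idea explains why the location parameter holds a different value after registration than the local path you passed in.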
Driverless AI MOJO
This transformation creates a feature set by using a Driverless AI MOJO. The MOJO pipeline generates the data from an existing feature set that you pass in as input to the pipeline. Feature Store then uploads the MOJO pipeline to the Feature Store artifacts cache and stores only the location of the pipeline in the database.
Only features created from Driverless AI with the make_mojo_scoring_pipeline_for_features_only setting are supported in Feature Store.
User API:
- Python
- Scala
Parameters (Python):
mojo_local_location: String
- you pass the local path to the pipeline. Once the feature set is registered, this parameter contains the path to the uploaded MOJO pipeline in the Feature Store artifacts cache.
import featurestore.core.transformations as t
transformation = t.DriverlessAIMOJO(...)
Parameters (Scala):
mojoLocalLocation: String
- you pass the local path to the pipeline. Once the feature set is registered, this parameter contains the path to the uploaded MOJO pipeline in the Feature Store artifacts cache.
import ai.h2o.featurestore.core.transformations.DriverlessAIMOJO
val transformation = DriverlessAIMOJO(...)
JoinFeatureSets
This transformation creates a new feature set by joining two different feature sets.
User API:
- Python
- Scala
Parameters (Python):
left_key: String
- joining key which must be present in the left feature set
right_key: String
- joining key which must be present in the right feature set
join_type: JoinFeatureSetsType
- join type (default: JoinFeatureSetsType.INNER)
JoinFeatureSetsType
- JoinFeatureSetsType.INNER - The inner join is the default join in Spark SQL. It selects rows that have matching values in both relations.
- JoinFeatureSetsType.LEFT - A left join returns all values from the left relation and the matched values from the right relation, or appends NULL if there is no match.
- JoinFeatureSetsType.RIGHT - A right join returns all values from the right relation and the matched values from the left relation, or appends NULL if there is no match.
- JoinFeatureSetsType.FULL - A full join returns all values from both relations, appending NULL values on the side that does not have a match.
- JoinFeatureSetsType.CROSS - A cross join returns the Cartesian product of two relations.
import featurestore.core.transformations as t
transformation = t.JoinFeatureSets(left_key=..., right_key=..., join_type=...)
Parameters (Scala):
leftKey: String
- joining key which must be present in the left feature set
rightKey: String
- joining key which must be present in the right feature set
joinType: JoinFeatureSetsType
- join type (default: JoinFeatureSetsType.INNER)
JoinFeatureSetsType
- JoinFeatureSetsType.INNER - The inner join is the default join in Spark SQL. It selects rows that have matching values in both relations.
- JoinFeatureSetsType.LEFT - A left join returns all values from the left relation and the matched values from the right relation, or appends NULL if there is no match.
- JoinFeatureSetsType.RIGHT - A right join returns all values from the right relation and the matched values from the left relation, or appends NULL if there is no match.
- JoinFeatureSetsType.FULL - A full join returns all values from both relations, appending NULL values on the side that does not have a match.
- JoinFeatureSetsType.CROSS - A cross join returns the Cartesian product of two relations.
import ai.h2o.featurestore.core.transformations.JoinFeatureSets
val transformation = JoinFeatureSets(leftKey = ..., rightKey = ..., joinType = ...)
During join transformations, Feature Store performs inner joins by default.
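To make the join types listed above concrete, here is a minimal plain-Python sketch of their row-level semantics. It does not use the Feature Store API or Spark: `join_rows`, the sample rows, and the lowercase join-type strings are hypothetical stand-ins, and `None` plays the role of NULL.

```python
from itertools import product

# Hypothetical rows standing in for the left and right feature sets.
left = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 51},
]
right = [
    {"customer_id": 2, "balance": 120.0},
    {"customer_id": 3, "balance": 75.5},
]


def join_rows(left, right, left_key, right_key, join_type="inner"):
    """Join two lists of dicts with SQL-like semantics (illustrative only)."""
    if join_type not in ("inner", "left", "right", "full", "cross"):
        raise ValueError(f"unsupported join type: {join_type}")

    if join_type == "cross":
        # Cartesian product: every left row paired with every right row.
        return [{**l_row, **r_row} for l_row, r_row in product(left, right)]

    left_nulls = {col: None for col in left[0]}
    right_nulls = {col: None for col in right[0]}

    rows = []
    matched_right = set()
    for l_row in left:
        hits = [i for i, r_row in enumerate(right)
                if r_row[right_key] == l_row[left_key]]
        matched_right.update(hits)
        for i in hits:
            rows.append({**l_row, **right[i]})
        # Left and full joins keep unmatched left rows, padded with None.
        if not hits and join_type in ("left", "full"):
            rows.append({**l_row, **right_nulls})

    # Right and full joins keep unmatched right rows, padded with None.
    if join_type in ("right", "full"):
        for i, r_row in enumerate(right):
            if i not in matched_right:
                rows.append({**left_nulls, **r_row})
    return rows


for kind in ("inner", "left", "right", "full", "cross"):
    print(kind, len(join_rows(left, right, "id", "customer_id", kind)))
# prints: inner 1, left 2, right 2, full 3, cross 4 (one pair per line)
```

With these sample rows only `id` 2 has a match, so the inner join yields a single row, while the other join types add the unmatched rows from one or both sides.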