Retrieve API
To retrieve the data, first run:
- Python
ref = fs.retrieve(start_date_time=None, end_date_time=None)
- Scala
val ref = fs.retrieve(startDateTime="", endDateTime="")
Parameters explanation:
- Python
If start_date_time and end_date_time are empty, all ingested data are fetched. Otherwise, these parameters restrict retrieval to a specific range of ingested data. For example, when the ingested data span a time range from T1 to T2 (T1 <= T2), start_date_time can have any value T3 and end_date_time can have any value T4, where T1 <= T3 <= T4 <= T2.
- Scala
If startDateTime and endDateTime are empty, all ingested data are fetched. Otherwise, these parameters restrict retrieval to a specific range of ingested data. For example, when the ingested data span a time range from T1 to T2 (T1 <= T2), startDateTime can have any value T3 and endDateTime can have any value T4, where T1 <= T3 <= T4 <= T2.
This call returns immediately with a retrieve holder, which offers several ways to obtain the data. Each specific retrieval call then uses the input parameters to search the cache for the matching ingested data.
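The range rule for the time parameters (T1 <= T3 <= T4 <= T2) can be sketched in plain Python. This is only an illustration: is_valid_retrieve_range is a hypothetical helper, not part of the Feature Store client, and the date-time format that fs.retrieve expects is not stated here. Treating a single empty bound as "open on that side" is also an assumption; the text above only covers the case where both parameters are empty.

```python
from datetime import datetime


def is_valid_retrieve_range(ingested_start, ingested_end, start=None, end=None):
    """Hypothetical helper: check that T1 <= T3 <= T4 <= T2.

    ingested_start / ingested_end play the role of T1 / T2.
    start / end play the role of T3 / T4; None stands for an empty parameter
    and is treated as the matching ingested bound (an assumption).
    """
    t3 = ingested_start if start is None else start
    t4 = ingested_end if end is None else end
    return ingested_start <= t3 <= t4 <= ingested_end


# Example ingested range (made-up dates, for illustration only).
t1 = datetime(2021, 1, 1)
t2 = datetime(2021, 12, 31)

# Both parameters empty: the whole ingested range qualifies.
print(is_valid_retrieve_range(t1, t2))
# A sub-range inside the ingested data qualifies.
print(is_valid_retrieve_range(t1, t2, datetime(2021, 3, 1), datetime(2021, 6, 1)))
# A range starting before T1 does not satisfy the rule.
print(is_valid_retrieve_range(t1, t2, datetime(2020, 6, 1), datetime(2021, 6, 1)))
```

A check like this could run before calling fs.retrieve, so that a mistaken range is caught locally rather than producing an unexpected result.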
Downloading the files from Feature Store
You can download the data to your local machine in one of two ways:
Blocking approach:
- Python
dir = ref.download()
- Scala
val dir = ref.download()
Non-Blocking approach:
- Python
future = ref.download_async()
- Scala
val future = ref.downloadAsync()
Both approaches download all produced data files (Parquet) into a newly created directory.
More information about asynchronous methods is available in the Asynchronous methods section.
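The difference between the blocking and non-blocking approaches can be illustrated with a small stand-in. StandInRetrieveHolder below is hypothetical and is not the Feature Store retrieve holder; it only mimics the two method names from this page. It assumes the asynchronous call hands back a standard concurrent.futures.Future, which may differ from what download_async actually returns (see Asynchronous methods for the real interface).

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


class StandInRetrieveHolder:
    """Hypothetical stand-in, NOT the Feature Store class."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)

    def download(self):
        # Blocking: returns only after the "files" sit in a new directory.
        time.sleep(0.1)  # simulate transfer time
        return tempfile.mkdtemp(prefix="retrieve_")

    def download_async(self):
        # Non-blocking: returns a Future at once; the work runs in the background.
        return self._pool.submit(self.download)


ref = StandInRetrieveHolder()

directory = ref.download()         # the caller waits here until the download ends
future = ref.download_async()      # the caller continues right away
other_work = sum(range(1000))      # unrelated work done while the download runs
async_directory = future.result()  # wait only at the point the result is needed

print(os.path.isdir(directory), os.path.isdir(async_directory))
```

The blocking form is simpler when nothing else can proceed without the files; the non-blocking form helps when other work can overlap with the transfer.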
Obtaining data as a Spark Frame
You can also read the data from the retrieve call directly as a Spark frame:
- Python
ref = my_feature_set.retrieve()
data_frame = ref.as_spark_frame(spark_session)
- Scala
val ref = myFeatureSet.retrieve()
val dataFrame = ref.asSparkFrame(sparkSession)
Read more about Spark Dependencies in the Spark dependencies section.
Retrieving from online
To retrieve data from the online Feature Store, run:
- Python
json = feature_set.retrieve_online(key)
- Scala
val json = featureSet.retrieveOnline(key)
The key represents a specific primary key value for which the entry is obtained.