Create an experiment for causal classification modeling
Overview
This tutorial guides you through setting up and running an experiment for the causal classification modeling problem type in H2O LLM Studio. It covers how to import datasets, configure key experiment settings, and create a new experiment. By following these steps, you will learn how to set up experiments that use a causal language model for classification tasks.
Objectives
- Learn how to import datasets from Hugging Face into H2O LLM Studio.
- Set up an experiment for causal classification modeling with appropriate parameters.
Prerequisites
- Access to the latest version of H2O LLM Studio.
- Basic understanding of classification and causal language models.
Step 1: Import dataset
For this tutorial, we'll use the IMDb movie review dataset on Hugging Face. The dataset contains 25,000 movie reviews for training, each labeled as either positive or negative. Let's import the dataset.
- Click on Import dataset.
- Select Hugging Face as the data source from the Source dropdown.
- In the Hugging Face dataset field, enter stanfordnlp/imdb.
- In the Split field, enter train.
- Click Continue.
Step 2: Configure dataset
In this step, we'll review and adjust the dataset settings for our experiment.
- In the Dataset name field, enter classification.
- In the Problem type dropdown, select Causal classification modeling.
  Note: If the dataset is configured correctly, the Causal classification modeling problem type will be pre-selected automatically.
- In the Train dataframe dropdown, leave the default train dataframe as imdb_train.pq.
- In the Validation dataframe dropdown, leave the default value as None.
. - In the Prompt column dropdown, select Text.
- In the Answer column dropdown, select Label.
- Click Continue.
- On the Sample data visualization page, click Continue if the input data and labels appear correctly.
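To make the column mapping concrete, the sketch below models a few rows the way the Prompt column and Answer column settings treat them: one column holds the review text, the other holds a 0/1 label. It is a minimal illustration in plain Python, not H2O LLM Studio code. The review strings are invented, the helper name check_binary_labels is hypothetical, the lowercase keys text and label are assumed to correspond to the Text and Label entries in the dropdowns, and the encoding 0 = negative, 1 = positive follows the stanfordnlp/imdb dataset.

```python
from collections import Counter

# Invented example rows; real rows come from the imported train dataframe.
rows = [
    {"text": "A moving story with great performances.", "label": 1},
    {"text": "Two hours I will never get back.", "label": 0},
    {"text": "Clever, funny, and beautifully shot.", "label": 1},
]


def check_binary_labels(rows, prompt_column="text", answer_column="label"):
    """Check each row has a non-empty prompt and a 0/1 label; return label counts."""
    counts = Counter()
    for i, row in enumerate(rows):
        if not str(row[prompt_column]).strip():
            raise ValueError(f"row {i} has an empty prompt")
        if row[answer_column] not in (0, 1):
            raise ValueError(f"row {i} has a non-binary label: {row[answer_column]!r}")
        counts[row[answer_column]] += 1
    return dict(counts)


label_counts = check_binary_labels(rows)
print(label_counts)
```

A check like this mirrors what you look for on the Sample data visualization page: every prompt is readable text and every answer is one of the two expected labels.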
Step 3: Create a new experiment
Now that the dataset is imported, it's time to start a new experiment for causal classification modeling.
- From the View datasets page, click the Kebab menu next to the imdb_train dataset, then select New experiment.
- In General settings, enter tutorial-1a in the Experiment name text box.
- In Dataset settings, set the Data sample to 0.1.
- In Dataset settings, set the Num classes to 1.
- In Training settings, select BinaryCrossEntropyLoss from the Loss function dropdown.
- In Prediction settings, select LogLoss from the Metric dropdown.
- Leave the other configurations at their default values.
- Click Run experiment.
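The Num classes, Loss function, and Metric choices above fit together. With one output class, the model produces a single logit per review, a sigmoid turns that logit into the probability of the positive label, and binary cross-entropy (which is what log loss measures in the two-class case) scores that probability against the 0/1 answer. The following is a minimal plain-Python sketch of that arithmetic, not H2O LLM Studio's implementation; the function names and the example logits are made up for illustration.

```python
import math


def sigmoid(logit):
    """Map a raw model output (logit) to a probability in (0, 1)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)  # avoids overflow for large negative logits
    return z / (1.0 + z)


def binary_log_loss(logits, labels, eps=1e-12):
    """Mean binary cross-entropy over single-logit predictions."""
    total = 0.0
    for logit, label in zip(logits, labels):
        # Clip the probability so log() never receives exactly 0.
        p = min(max(sigmoid(logit), eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical logits for three reviews and their true labels.
logits = [2.0, -1.5, 0.0]
labels = [1, 0, 1]

loss = binary_log_loss(logits, labels)
print(f"log loss: {loss:.4f}")
```

A lower value means the predicted probabilities sit closer to the true labels. As a reference point, a model that always predicts a probability of 0.5 scores ln 2, about 0.693.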
Step 4: Evaluate experiment
After successfully creating the new experiment, click on the experiment name to access the experiment tabs. These tabs provide detailed information and insights into various aspects of your experiment. For more information about the experiment tabs, see Experiment tabs.
Summary
In this tutorial, we walked through the process of setting up a causal classification experiment using H2O LLM Studio. You learned how to import the IMDb dataset from Hugging Face, configure the dataset and experiment settings, and create a new experiment. With these steps, you're now ready to explore different datasets and experiment with various configurations for the causal classification modeling problem type in H2O LLM Studio.