Training the decision tree model

This section explains how to train your decision tree model.

Procedure

Add a tDecisionTreeModel component from the Palette to the design workspace.
Connect tModelEncoder to tDecisionTreeModel using a Main row.
Double-click tDecisionTreeModel and choose the Component view.
Select the check box below Storage to choose HDFS storage.
Choose the schema you created earlier.
In Features Column, choose MyFeatures.
In Label Column, choose MyLabels.
Select the check box below Model location and save the model to the HDFS file system at /user/puccini/machinelearning/decisiontrees/marketing/decisiontree.model.
Leave the default values for the remaining settings.
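
For intuition about what the training step does with the Features Column and Label Column, the following is a minimal sketch of how a decision tree picks a single split using Gini impurity. It is plain Python written for illustration, not the code that tDecisionTreeModel or Spark generates. The helper names (gini, best_split) and the sample rows are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(
    features: List[List[float]], labels: List[str]
) -> Optional[Tuple[int, float, float]]:
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    n = len(labels)
    best = None
    for j in range(len(features[0])):
        # Candidate thresholds: every distinct value of feature j.
        for t in sorted({row[j] for row in features}):
            left = [y for row, y in zip(features, labels) if row[j] <= t]
            right = [y for row, y in zip(features, labels) if row[j] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best


# Made-up rows standing in for the MyFeatures vectors and MyLabels values.
my_features = [[25.0, 1.0], [32.0, 0.0], [47.0, 1.0], [51.0, 0.0], [62.0, 1.0]]
my_labels = ["no", "no", "yes", "yes", "yes"]

print(best_split(my_features, my_labels))  # (0, 32.0, 0.0)
```

Here the first feature split at 32.0 separates the two labels perfectly, so the weighted impurity is 0.0. A full tree repeats this search on each resulting subset until a stopping condition is reached.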

Your final Job should look as follows.
Click the Run tab and go to Spark Configuration.
Select the Use local mode check box.

You can also run this Job directly on the Hadoop cluster, which is the most likely scenario in a production setting. For that, you need to make a few small adjustments to how the Job runs, including clearing the Use local mode check box.
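
In Spark terms, the difference between the two modes comes down mainly to the master URL the Job is given. The sketch below illustrates this under the assumption that the cluster is managed by YARN. The function name spark_settings is hypothetical and is not part of Talend. spark.master, local[*] and yarn are standard Spark configuration values.

```python
from typing import Dict


def spark_settings(use_local_mode: bool) -> Dict[str, str]:
    """Map the 'Use local mode' choice to a Spark master setting.

    Assumption: a non-local run targets a YARN-managed Hadoop cluster.
    """
    if use_local_mode:
        # Run Spark inside a single JVM, using all available cores.
        return {"spark.master": "local[*]"}
    # Hand the Job to the cluster's resource manager instead.
    return {"spark.master": "yarn"}


print(spark_settings(True))
print(spark_settings(False))
```

A cluster run typically needs further connection details, such as the distribution and resource manager addresses, which are omitted here.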
