This section explains how to train your decision tree model.
- Add a tDecisionTreeModel component from the Palette to the design workspace.
- Connect tModelEncoder to tDecisionTreeModel using a Main connection.
- Double-click tDecisionTreeModel and choose the Component view.
- Select the check box below Storage to choose HDFS storage.
- Choose the schema you created earlier.
- In Features Column, choose MyFeatures.
- In Label Column, choose MyLabels.
- Select the check box below Model location and save the model to the HDFS file system at /user/puccini/machinelearning/decisiontrees/marketing/decisiontree.model.
Leave the remaining settings at their default values.
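To relate these settings to code, the following is a minimal PySpark sketch of roughly what the configured component amounts to. It assumes the component behaves like Spark ML's decision tree classifier, which this section does not state, and the code the Studio actually generates will differ. The column names and model path come from the steps above; the helper names `model_settings` and `train_and_save` are hypothetical.

```python
# Sketch only: assumes the component behaves like Spark ML's DecisionTreeClassifier.
MODEL_PATH = "/user/puccini/machinelearning/decisiontrees/marketing/decisiontree.model"


def model_settings():
    """Return the settings chosen in the Component view (hypothetical helper)."""
    return {
        "featuresCol": "MyFeatures",  # Features Column
        "labelCol": "MyLabels",       # Label Column
        "path": MODEL_PATH,           # Model location on HDFS
    }


def train_and_save(training_df, settings=None):
    """Fit a decision tree on training_df and write the model to the configured path."""
    # Imported lazily so model_settings() can be used without PySpark installed.
    from pyspark.ml.classification import DecisionTreeClassifier

    settings = settings or model_settings()
    classifier = DecisionTreeClassifier(
        featuresCol=settings["featuresCol"],
        labelCol=settings["labelCol"],
    )  # every other parameter keeps its default value
    model = classifier.fit(training_df)
    # overwrite() replaces a model left by an earlier run (a choice made for this sketch).
    model.write().overwrite().save(settings["path"])
    return model
```

Here `training_df` stands for a Spark DataFrame whose MyFeatures column already holds feature vectors and whose MyLabels column holds numeric labels, which is the kind of output the upstream encoding step is expected to produce.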
Your final job should look as follows.
- Click the Run tab and go to Spark Configuration.
Select the Use local mode check box.
You can also run this job directly on the Hadoop cluster, which is the most likely scenario in a production setting. For that, you need to make a few small adjustments to how the job runs, including clearing the Use local mode check box.
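In plain Spark terms, the Use local mode check box roughly corresponds to the choice of master URL. The sketch below illustrates that idea under the assumption that the Hadoop cluster is managed by YARN, which this section does not specify; `spark_master` and `build_session` are hypothetical names, not part of the Studio.

```python
def spark_master(use_local_mode):
    """Map the Use local mode check box to a Spark master URL (illustrative)."""
    # "local[*]" runs Spark in-process on all local cores;
    # "yarn" is an assumption about how the Hadoop cluster is managed.
    return "local[*]" if use_local_mode else "yarn"


def build_session(use_local_mode, app_name="decision-tree-training"):
    """Create a SparkSession for the selected mode (requires PySpark)."""
    # Imported lazily so spark_master() can be used without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master(spark_master(use_local_mode))
        .appName(app_name)
        .getOrCreate()
    )
```

Calling `build_session(True)` mirrors the local run described in this section, while `build_session(False)` would additionally need the cluster's Hadoop configuration to be available to Spark.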