Creating a training data schema reference

This section explains how to create a training data schema reference for developing a machine learning routine.

Procedure

  1. Right-click the HDFS connection you previously created and choose Retrieve Schema.
  2. Navigate to the pre-loaded training data file located at /user/puccini/machinelearning/decisiontrees/marketing/marketing_campaign_train.csv.
  3. Click Next, name the schema, and adjust the data types as needed.
    In this case, the defaults are accurate.
  4. Click Finish.
  5. Add a tHDFSConfiguration component from the Palette to the design workspace.
  6. Set Property Type to Repository.
  7. Select the HDFS connection you created, MarketingCampaignData.
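To make the schema retrieval in steps 1 to 4 more concrete, the following minimal Python sketch shows one way a column schema could be guessed from a CSV sample. It is not how the Retrieve Schema wizard is implemented. The function names `infer_type` and `infer_schema`, the type labels, and the sample columns are all hypothetical; the real columns of marketing_campaign_train.csv are not listed on this page.

```python
import csv
import io


def infer_type(values):
    """Guess the narrowest type label that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "String"
    # Try the stricter type first, then fall back to the looser one.
    for type_name, cast in (("Integer", int), ("Double", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "String"


def infer_schema(csv_text, delimiter=","):
    """Return a list of (column name, guessed type) pairs from CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    # Transpose the data rows into columns; a header-only file yields empty columns.
    columns = list(zip(*data)) if data else [() for _ in header]
    return [(name, infer_type(col)) for name, col in zip(header, columns)]


# Hypothetical sample rows, NOT the actual contents of the training file.
SAMPLE = "age,income,region,converted\n34,52000.5,north,1\n45,61000,south,0\n"

schema = infer_schema(SAMPLE)
for name, type_name in schema:
    print(f"{name}: {type_name}")
```

A guess like this can be wrong when the sample is small, which is why step 3 gives you the chance to review and adjust the data types before you click Finish.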
