Formatting names with Magic Fill - Cloud

Talend Cloud Data Preparation User Guide

author
Talend Documentation Team

You can use the Magic Fill function to automatically format names, based on a pattern that you define with examples.

Let's take the example of a dataset with a column containing the full names of your customers.

You want to reformat those names so that each one keeps only the first letter of the first name, followed by a dot, and then the last name. For example, George Abitbol would become G. Abitbol. The easiest way to do this is to give the Magic Fill function a few examples of the transformation you want, and then apply it to the rest of the column.

Procedure

  1. Click the header of the fullname column to select its content.
  2. In the functions panel, type Magic fill and click the result to display the options of the associated function.
  3. Clear the Create new column check box.
    This way, the values will be fixed directly in the existing column.
  4. In the Input 1 field, enter one of the values from the fullname column that you would like to transform, for example Dimitri Tudor.
  5. In the Output 1 field, enter the same value in the format you want to obtain: D. Tudor.
    For the function to work, you need to enter at least two complete examples of the transformation you want to apply. You can then add up to three other examples. Examples can either be taken from your dataset, or made up. The more examples you input, the more accurately the pattern will be identified by the function.
  6. Enter more before-and-after examples in the remaining fields:
    • Mina Luze as Input 2 and M. Luze as Output 2
    • Henry Bank as Input 3 and H. Bank as Output 3
    • Ben Schneider as Input 4 and B. Schneider as Output 4
    • Jonathan Oliver as Input 5 and J. Oliver as Output 5
  7. Click Submit.

Results

From the few examples you provided, the function has inferred the pattern and automatically created the corresponding transformation. The names in your dataset have now been replaced with their equivalents in the expected format.
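
To make the target pattern concrete, here is a minimal Python sketch of the rule that the five example pairs describe. It is only an illustration of the expected output format, not a description of how Magic Fill infers or applies the pattern internally. The function name abbreviate_first_name is hypothetical, and the handling of single-word names (returned unchanged) is an assumption, since the procedure does not cover that case.

```python
def abbreviate_first_name(full_name: str) -> str:
    """Return 'F. Last' for an input shaped like 'First Last'.

    Hypothetical helper: it mirrors the pattern shown in the examples,
    not the internal logic of the Magic Fill function.
    """
    # Split on the first run of whitespace: first name, then everything else.
    parts = full_name.strip().split(maxsplit=1)
    if len(parts) < 2:
        # Assumption: a single-word (or empty) value is left unchanged.
        return full_name.strip()
    first, last = parts
    return f"{first[0]}. {last}"


# Input/output pairs taken from the procedure above.
examples = {
    "Dimitri Tudor": "D. Tudor",
    "Mina Luze": "M. Luze",
    "Henry Bank": "H. Bank",
    "Ben Schneider": "B. Schneider",
    "Jonathan Oliver": "J. Oliver",
}

# Check that every example pair follows the same rule.
for source, expected in examples.items():
    assert abbreviate_first_name(source) == expected

print(abbreviate_first_name("George Abitbol"))
```

Running a check like this on your own examples before entering them can help confirm that they all describe a single, consistent pattern.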