Converting organization names to their abbreviated forms using Magic Fill - Cloud

Talend Cloud Data Preparation User Guide

The Magic Fill function can be used to transform names, units or expressions into their abbreviated forms.

In this example, the dataset to improve contains data on people working for well-known national or international organizations. However, the full names of these organizations are often long and less familiar than the corresponding acronyms. To make the dataset easier to read, you will use the Magic Fill function to convert the full names into their acronyms.

Procedure

  1. Click the header of the organization column to select it.
  2. In the functions panel, type Magic fill and click the result to display the options of the associated function.
  3. In the Input 1 field, enter one of the values from the organization column that you would like to transform, for example World Wildlife Fund.
  4. In the Output 1 field, enter the corresponding acronym: WWF.
    For the function to work, you need to enter at least two complete examples of the transformation you want to apply. You can then add up to three other examples. Examples can either be taken from your dataset, or made up. The more examples you input, the more accurately the pattern will be identified by the function.
  5. Enter more before-and-after examples in the remaining fields:
    • Federal Bureau of Investigation as Input 2 and FBI as Output 2
    • International Court of Justice as Input 3 and ICJ as Output 3
    • World Trade Organization as Input 4 and WTO as Output 4
    • European Union as Input 5 and EU as Output 5

    Based on these examples, the function infers that it only has to keep the uppercase first letter of each capitalized word to transform full names into the corresponding acronyms. Lowercase words such as "of" are dropped.

  6. Click Submit.

Results

A new column is created, where the transformation defined by your examples has been applied to the rest of the organization names. You can now more easily recognize which organizations appear in the dataset.
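
To check the expected output outside the application, the pattern suggested by the five examples can be approximated in a few lines of Python. This is only an illustrative sketch, not how Magic Fill is implemented: the helper name to_acronym is hypothetical, and it assumes the rule "keep the first letter of every word that starts with an uppercase letter".

```python
def to_acronym(name: str) -> str:
    """Keep the first letter of each word that starts with an uppercase letter.

    Hypothetical helper that mimics the pattern suggested by the examples;
    it is not part of Talend Data Preparation.
    """
    return "".join(word[0] for word in name.split() if word[0].isupper())


# The five input/output pairs entered in the procedure above.
examples = {
    "World Wildlife Fund": "WWF",
    "Federal Bureau of Investigation": "FBI",
    "International Court of Justice": "ICJ",
    "World Trade Organization": "WTO",
    "European Union": "EU",
}

for full_name, expected in examples.items():
    # Each pair should be reproduced by the assumed rule.
    assert to_acronym(full_name) == expected
```

Lowercase connecting words such as "of" are skipped because they do not start with an uppercase letter, which is why Federal Bureau of Investigation yields FBI rather than FBoI.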