Teradata error: "Invalid Input: only Latin letters allowed"
From the Profiling perspective of Talend Studio, profile a column in a Teradata database, first_name for example, using the Soundex Frequency Table indicator. Run the column analysis with the SQL engine. The analysis runs successfully.
Try to drill down data on the result page: in the Frequency Statistics table in the Analysis Results view, right-click a row and select View Rows. You will get an error in the SQL Editor about the generated SQL query.
This limitation is due to the Teradata SOUNDEX implementation. Teradata requires that the character string or expression passed to SOUNDEX, typically a surname, contain only simple Latin characters.
A simple Latin character is one that does not have a diacritical mark such as a tilde (~) or an acute accent (´). There are 26 uppercase and 26 lowercase simple Latin characters. Even a simple call such as SOUNDEX('Sébastien') cannot be executed on Teradata. Therefore, it is not possible to drill down into the rows that sound like 'Sébastien'.
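To make the "simple Latin character" rule concrete, the following Python sketch checks whether a value contains only the 52 unaccented Latin letters and shows one possible way to remove diacritical marks before a value is compared. It is an illustration only and not part of Talend Studio or Teradata: the function names is_simple_latin and strip_diacritics are hypothetical, and the sketch assumes that only letters and combining marks matter for the check.

```python
import string
import unicodedata


def is_simple_latin(value: str) -> bool:
    """Return True if every letter in value is one of the 52 unaccented
    Latin letters (A-Z, a-z). Hypothetical helper for illustration."""
    for ch in value:
        category = unicodedata.category(ch)
        # Letters (L*) and combining marks (M*) must be plain ASCII letters.
        if category[0] in ("L", "M") and ch not in string.ascii_letters:
            return False
    return True


def strip_diacritics(value: str) -> str:
    """Decompose accented letters (NFD) and drop the combining marks.
    Letters with no decomposition, such as 'ø', are left unchanged."""
    decomposed = unicodedata.normalize("NFD", value)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for name in ("Sebastien", "Sébastien"):
        print(name, is_simple_latin(name), strip_diacritics(name))
```

A check of this kind could be used to identify the values in a profiled column that would trigger the error. Whether removing diacritical marks is acceptable depends on the data, because the transformed value no longer matches the stored one exactly.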