For more technologies supported by Talend, see Talend components.
Using the tJapaneseTokenize component, you can split Japanese text into tokens.
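To illustrate what splitting Japanese text into tokens means, the following is a minimal Python sketch that cuts a sentence wherever the script changes (kanji, hiragana, katakana). It is not how tJapaneseTokenize works internally: the function names `script_of` and `naive_tokenize` are hypothetical, and a real tokenizer relies on dictionary-based analysis and produces finer, linguistically correct tokens. The sketch only shows why tokenization is needed for a language written without spaces between words.

```python
def script_of(ch: str) -> str:
    """Classify a character by Unicode block (rough approximation)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    if ch.isspace():
        return "space"
    if ch.isalnum():
        return "alnum"
    return "other"  # punctuation and symbols


def naive_tokenize(text: str) -> list[str]:
    """Split text at every script change; whitespace is dropped.

    Assumption: script boundaries stand in for word boundaries, which is
    only a crude heuristic for Japanese.
    """
    tokens = []
    current = ""
    current_script = None
    for ch in text:
        script = script_of(ch)
        if script == "space":
            # Whitespace ends the current token and is not kept.
            if current:
                tokens.append(current)
                current = ""
            current_script = None
            continue
        if script != current_script or script == "other":
            # A new script (or any punctuation mark) starts a new token.
            if current:
                tokens.append(current)
            current = ch
        else:
            current += ch
        current_script = script
    if current:
        tokens.append(current)
    return tokens


print(naive_tokenize("私は東京に住んでいます。"))
# ['私', 'は', '東京', 'に', '住', 'んでいます', '。']
```

Note that the heuristic keeps 「んでいます」 as a single token, whereas a morphological analyzer would split it further. This gap is the reason to use a dedicated component rather than simple character rules.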
To replicate the example described below, retrieve the tJapaneseTokenize_standard_scenario.zip file from the Downloads tab in the left panel of this help page. This archive contains:
- the plain text file inputJapaneseText.txt containing Japanese text, the transcription and the English translation; and
- the tJapaneseTokenizeJob.zip file containing the Job.