tAlfrescoOutput Properties - 6.1

Talend Components Reference Guide


Component family

Business

 

Function

Creates dematerialized documents in an Alfresco server where they are indexed under meaningful models.

Purpose

Allows you to create and manage documents on an Alfresco server.

Basic settings

URL

Type in the URL to connect to the Alfresco Web application.

 

Login and Password

Type in the user authentication data to the Alfresco server.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

Target Location

Base

Type in the base path where the document is to be stored, or

select the Map... check box and then select the target location column in the Column list.

Note: When you type in the base name, make sure to use the double backslash (\\) escape character.
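
The escaping rule in the note above can be illustrated with a plain Java string literal, assuming the Base field is evaluated as a Java string expression. The folder names below are hypothetical and serve only to show the doubled backslashes.

```java
public class BasePathEscaping {
    // Hypothetical target location; the folder names are placeholders.
    static String basePath() {
        // In Java source, "\\" stands for ONE backslash character at run time.
        return "\\Sites\\demo";
    }

    public static void main(String[] args) {
        // Prints the path with single backslashes.
        System.out.println(basePath());
    }
}
```

The literal is typed with four backslash characters in total, but the resulting string holds only two.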

Create Or Update Mode

Document Mode

Select in the list the mode you want to use for the created document.

Create only: creates a document if it does not exist.

Note that an error message is displayed if you try to create a document that already exists.

Create or update: creates a document if it does not exist or updates the document if it exists.
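
The difference between the two document modes can be pictured with the following conceptual sketch. It is not the component's implementation: the in-memory map standing in for the Alfresco repository, the enum, and the method name are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class DocumentModeSketch {
    // Mirrors the two entries of the Document Mode list.
    enum DocumentMode { CREATE_ONLY, CREATE_OR_UPDATE }

    // The map is a stand-in for the repository: path -> content.
    static void write(Map<String, String> store, DocumentMode mode,
                      String path, String content) {
        if (mode == DocumentMode.CREATE_ONLY && store.containsKey(path)) {
            // Create only: an existing document is reported as an error.
            throw new IllegalStateException("Document already exists: " + path);
        }
        // Otherwise the document is created, or updated if it already exists.
        store.put(path, content);
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        write(store, DocumentMode.CREATE_ONLY, "/demo/a.txt", "v1");
        write(store, DocumentMode.CREATE_OR_UPDATE, "/demo/a.txt", "v2");
        System.out.println(store.get("/demo/a.txt")); // prints "v2"
    }
}
```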

 

Container Mode

Select in the list the mode you want to use for the destination folder in Alfresco.

Update only: updates a destination folder if the folder exists.

Note that an error message is displayed if you try to update a destination folder that does not exist.

Create or update: creates a destination folder if it does not exist or updates the destination folder if it exists.

 

Define Document Type

Click the three-dot button to display the tAlfrescoOutput editor. This editor enables you to:

- select the file where you defined the metadata according to which you want to save the document in Alfresco

- define the type of the document

- select any of the aspects in the available aspects list of the model file and click the plus button to add it to the list on the left.

 

Property Mapping

Displays the parameters you set in the tAlfrescoOutput editor and according to which the document will be created in the Alfresco server.

Note that in the Property Mapping area, you can modify any of the input schemas.

 

Schema and Edit schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

Result Log File Name

Browse to the file where you want to save any logs related to the Job execution.

 

Die on error

This check box is cleared by default, meaning to skip the row on error and to complete the process for error-free rows. If needed, you can retrieve the rows on error via a Row > Rejects link.
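
As a rough picture of this behavior, the sketch below processes a list of rows and either stops at the first failure or sets the failing rows aside, much as a Row > Rejects link would carry them. The `send` method and its failure rule are hypothetical stand-ins for the actual write to Alfresco.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RejectFlowSketch {
    // Hypothetical stand-in for writing one row; empty rows fail.
    static void send(String row) {
        if (row == null || row.isEmpty()) {
            throw new IllegalArgumentException("empty row");
        }
    }

    // Returns the rejected rows; rethrows the first error when dieOnError is set.
    static List<String> process(List<String> rows, boolean dieOnError) {
        List<String> rejects = new ArrayList<>();
        for (String row : rows) {
            try {
                send(row);
            } catch (RuntimeException e) {
                if (dieOnError) {
                    throw e;          // check box selected: stop the process
                }
                rejects.add(row);     // check box cleared: skip the row and go on
            }
        }
        return rejects;
    }

    public static void main(String[] args) {
        List<String> rejects = process(Arrays.asList("a", "", "b"), false);
        System.out.println(rejects.size()); // prints 1
    }
}
```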

Advanced settings

Configure Target Location Container

Allows you to configure the default type of containers (folders).

Select this check box to display new fields where you can modify the container type to use your own created types based on the father/child model.

Permissions

Configure Permissions

When selected, allows you to manually configure access rights to containers and documents.

Select the Inherit Permissions check box to synchronize access rights between containers and documents.

Click the Plus button to add new lines to the Permissions list; you can then assign roles to user or group columns.

 

Encoding

Select the encoding type from the list or select Custom and define it manually. This field is compulsory.

 

Association Target Mapping

Allows you to create new documents in Alfresco with links to other documents that already exist in Alfresco, for example to make navigation easier.

To create associations:

  1. Open the tAlfrescoOutput editor.

  2. Click the Add button and select a model where you have already defined aspects that contain associations.

  3. Click the drop-down arrow at the top of the editor and select the corresponding document type.

  4. Click OK to close the editor and display the created association in the Association Target Mapping list.

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at a Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.

NB_LINE_REJECTED: the number of rows rejected. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
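
As an illustration, the After variables listed above are usually read from the Job's globalMap with a key made of the component's unique name followed by the variable name. The sketch below assumes the component is labeled tAlfrescoOutput_1 and fills a plain HashMap by hand so that it can run outside a Job; in a real Job the map is populated by the Studio-generated code.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalVariablesSketch {
    // Builds a one-line summary from the three After variables.
    // "tAlfrescoOutput_1" is an assumed component label.
    static String summary(Map<String, Object> globalMap) {
        Integer written = (Integer) globalMap.get("tAlfrescoOutput_1_NB_LINE");
        Integer rejected = (Integer) globalMap.get("tAlfrescoOutput_1_NB_LINE_REJECTED");
        String error = (String) globalMap.get("tAlfrescoOutput_1_ERROR_MESSAGE");
        return written + " rows, " + rejected + " rejected, last error: " + error;
    }

    public static void main(String[] args) {
        // Hand-filled stand-in for the globalMap of a finished Job.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tAlfrescoOutput_1_NB_LINE", 10);
        globalMap.put("tAlfrescoOutput_1_NB_LINE_REJECTED", 2);
        System.out.println(summary(globalMap));
    }
}
```

Because these are After variables, such an expression is meaningful only in a component that runs once tAlfrescoOutput has finished.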

Usage

Usually used as an output component. An input component is required.

Limitation/Prerequisites

To be able to use the tAlfrescoOutput component, a few resources need to be installed. See the Installation procedure subsection below for more information.

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find out and add all missing JARs easily on the Modules tab in the Integration perspective of your studio. For details, see https://help.talend.com/display/KB/How+to+install+external+modules+in+the+Talend+products or the section describing how to configure the Studio in the Talend Installation Guide.

Installation procedure

To be able to use tAlfrescoOutput in the Integration perspective of Talend Studio, you first need to install the Alfresco server along with a few related resources.

The subsections below detail the prerequisites and the installation procedure.

Prerequisites

Start with the following operations:

  1. Download the file alfresco-community-tomcat-2.1.0.zip

  2. Unzip the file in an installation folder, for example:

    C:\alfresco
  3. Install JDK 1.6.0+

  4. Update the environment variable

    JAVA_HOME (for example, JAVA_HOME=C:\Program Files\Java\jdk1.6.0_27)
  5. From the installation folder (C:\alfresco), launch the Alfresco server using the script alf_start.bat

Warning

Make sure that the Alfresco server is launched correctly before you start using the tAlfrescoOutput component.
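
One simple way to check that the server responds is to request the Alfresco Web application over HTTP. The sketch below is only an illustration: the address http://localhost:8080/alfresco is an assumption (a local installation on the default Tomcat port) and must be replaced with the URL of your own server.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class AlfrescoPing {
    // Returns true if the URL answers with a 2xx or 3xx HTTP status.
    static boolean isUp(String url, int timeoutMs) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            conn.setRequestMethod("GET");
            int code = conn.getResponseCode();
            return code >= 200 && code < 400;
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            // Unreachable server, malformed URL, or non-HTTP URL.
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Assumed address of a local installation; adjust to your setup.
        System.out.println(isUp("http://localhost:8080/alfresco", 3000));
    }
}
```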

Installing the Talend Alfresco module

Note that the talendalfresco_20081014.zip is provided with the tAlfrescoOutput component in the Integration perspective of Talend Studio.

To install the talendalfresco module:

  1. From talendalfresco_20081014.zip and in the talendalfresco_20081014\alfresco folder, look for the following jars: stax-api-1.0.1.jar, wstx-lgpl-3.2.7.jar, talendalfresco-client_1.0.jar, and talendalfresco-alfresco_1.0.jar and move them to C:\alfresco\tomcat\webapps\alfresco\WEB-INF\lib

  2. Add the authentication filter of the commands to the web.xml file located in

    C:\alfresco\tomcat\webapps\alfresco\WEB-INF

    following the model of the example provided in the talendalfresco_20081014/alfresco folder of the zipped file talendalfresco_20081014.zip

    The following figures show the lines (in blue) to add to the Alfresco web.xml file.
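
Step 1 of the procedure above can be double-checked with a small helper that lists which of the four jars are still missing from the WEB-INF\lib folder. This is a sketch only: the class and method names are hypothetical, and the folder is the example location used in this section.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TalendAlfrescoJarCheck {
    // The four jars named in step 1.
    static final String[] REQUIRED = {
        "stax-api-1.0.1.jar",
        "wstx-lgpl-3.2.7.jar",
        "talendalfresco-client_1.0.jar",
        "talendalfresco-alfresco_1.0.jar"
    };

    // Returns the names of the required jars that are not present in libDir.
    static List<String> missingJars(Path libDir) {
        List<String> missing = new ArrayList<>();
        for (String jar : REQUIRED) {
            if (!Files.isRegularFile(libDir.resolve(jar))) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example installation folder from this section; adjust if yours differs.
        Path lib = Paths.get("C:\\alfresco\\tomcat\\webapps\\alfresco\\WEB-INF\\lib");
        System.out.println("Missing jars: " + missingJars(lib));
    }
}
```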

Useful information for advanced use

Installing new types for Alfresco:

From package_jeu_test.zip, in the package_jeu_test/fichiers_conf_alfresco2.1 folder, look for the following files: H76ModelCustom.xml (description of the model), web-client-config-custom.xml (web interface of the model), and custom-model-context.xml (registration of the new model), and paste them in the following folder: C:/alfresco/tomcat/shared/classes/alfresco/extension

Dates:

  • The dates must be of the Talend date type java.util.Date.

  • Columns without either mapping or default values, for example of the type Date, are written as empty strings.

  • Solution: delete all columns without mapping or default values. Note that any modification of the Alfresco type will restore them.
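
To illustrate the first point, a date received as text has to be converted to java.util.Date before it reaches the component. The sketch below uses the standard SimpleDateFormat class; the pattern and the sample value are assumptions, and in a Job this conversion would typically be done upstream, for example in a tMap expression.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateColumnSketch {
    // Converts a text value to java.util.Date using the given pattern.
    static Date toDate(String text, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // refuse impossible dates such as month 13
        try {
            return format.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a valid date: " + text, e);
        }
    }

    public static void main(String[] args) {
        // Sample value and pattern chosen for illustration only.
        Date created = toDate("2008-10-14", "yyyy-MM-dd");
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(created)); // prints 2008-10-14
    }
}
```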

Content:

  • Do not confuse the path of the file whose content you want to create in Alfresco with its target location in Alfresco.

  • Provide a URL. It can use various protocols, such as file or HTTP.

  • For URLs referring to files on the file system, prefix them with "file:" for Windows used locally, and with "file://" for Windows on a network (which also accepts "file:\\") or for Linux.

  • Do not double the backslash in the target base path (it is escaped automatically), unless you type in the path in the basic settings of the tAlfrescoOutput component or build it by concatenation, in the tMap editor for example.
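
The prefix rule for file URLs given in the list above can be written as a one-line helper. This is a sketch of that rule only; the method name and the sample paths are hypothetical.

```java
public class ContentUrlSketch {
    // Applies the prefix rule from the list above:
    // "file:" for a local Windows path, "file://" for a network or Linux path.
    static String toContentUrl(String path, boolean localWindowsPath) {
        return (localWindowsPath ? "file:" : "file://") + path;
    }

    public static void main(String[] args) {
        // Sample paths are placeholders.
        System.out.println(toContentUrl("C:/docs/report.pdf", true));
        // prints file:C:/docs/report.pdf
        System.out.println(toContentUrl("/home/user/docs/report.pdf", false));
        // prints file:///home/user/docs/report.pdf
    }
}
```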

Multiple properties or associations:

  • You can create only one association per document if it is mapped to a string value, or one or more associations per document if it is mapped to a list value (object).

  • You can empty an association by mapping it to an empty list, which you can create, for example, by using new java.util.ArrayList() in the tMap component.

However, it is impossible to delete an association.
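
The three kinds of value described above (a single string, a list, and an empty list) could be produced as follows. The helper names and the target references are hypothetical; only the value types matter here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AssociationValuesSketch {
    // A string value: exactly one association for the document.
    static Object single(String target) {
        return target;
    }

    // A list value (object): one or more associations for the document.
    static List<Object> many(String... targets) {
        return new ArrayList<Object>(Arrays.asList(targets));
    }

    // An empty list: empties the association.
    static List<Object> none() {
        return new ArrayList<Object>();
    }

    public static void main(String[] args) {
        // Placeholder references to existing nodes.
        System.out.println(many("/demo/doc1", "/demo/doc2").size()); // prints 2
        System.out.println(none().isEmpty());                        // prints true
    }
}
```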

Building List(object) with tAggregate:

  • define the table of the n-n relation in a file, containing a name line for example (included in the input rows), and a category line (that can be defined with its mapping in a third file).

  • group by: input name, output name.

  • operation: output categoryList, function list(object), input category. Caution: use the list(object) function, not the simple list function.
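
In plain Java terms, the aggregation described above amounts to grouping the rows by name and collecting the category values of each group into a list. The sketch below reproduces that result outside Talend; the row layout (name, category) and the sample values are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CategoryListSketch {
    // Groups rows of the form {name, category} by name and collects
    // the categories of each name into a list of objects.
    static Map<String, List<Object>> groupCategories(String[][] rows) {
        Map<String, List<Object>> byName = new LinkedHashMap<>();
        for (String[] row : rows) {
            byName.computeIfAbsent(row[0], key -> new ArrayList<>()).add(row[1]);
        }
        return byName;
    }

    public static void main(String[] args) {
        // Sample n-n relation between documents and categories.
        String[][] rows = {
            {"doc1", "A"},
            {"doc1", "B"},
            {"doc2", "A"}
        };
        System.out.println(groupCategories(rows)); // prints {doc1=[A, B], doc2=[A]}
    }
}
```

Each resulting list corresponds to the categoryList output column of one aggregated row.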

References (documents and folders):

  • References are created by mapping one or more existing reference nodes (xpath or namepath) using String type or List(object).

  • An error in the association or the property of the reference type does not prevent the creation of the node that holds the reference.

  • Properties of the reference type are created in the Basic Settings view.

  • Associations are created in the Advanced Settings view.

Dematerialization, tAlfrescoOutput, and Enterprise Content Management

Dematerialization is the process that converts documents held in physical form into electronic form, and thus helps to move away from the use of physical documentation to the use of electronic Enterprise Content Management (ECM) systems. The range of documents that can be managed with an Enterprise Content Management system includes just about everything from basic documents to stock certificates, for example.

Enterprises dematerialize their content through either manual document handling, done by people, or automatic, machine-based document handling.

Considering the varied nature of the content to be dematerialized, enterprises have to use varied technologies to do it. Scanning paper documents, creating interfaces to capture electronic documents from other applications, converting document images into machine-readable/editable text documents, and so on are examples of the technologies available.

Furthermore, scanned documents and digital faxes are not readable texts. To convert them into machine-readable characters, different character recognition technologies are used. Handwritten Character Recognition (HCR) and Optical Mark Recognition (OMR) are two examples of such technologies.

Equally important as the content that is captured in various formats from numerous sources in the dematerialization process is the supporting metadata that allows efficient identification of the content via specific queries.

Now how can this document content along with the related metadata be aggregated and indexed in an Enterprise Content Management system so that it can be retrieved and managed in meaningful ways? Talend provides the answer through the tAlfrescoOutput component.

The tAlfrescoOutput component allows you to store and manage your electronic documents and the related metadata on the Alfresco server, the leading open source enterprise content management system.

The following figure illustrates Talend's role between the dematerialization process and the Enterprise Content Management system (Alfresco).