Skip to main content Skip to complementary content
Close announcements banner

tAzureSynapseBulkExec Standard properties

These properties are used to configure tAzureSynapseBulkExec running in the Standard Job framework.

The Standard tAzureSynapseBulkExec component belongs to two families: Cloud and Databases.

The component in this framework is available in all Talend products.

Basic settings

Property Type

Select the way the connection details will be set.

  • Built-In: The connection details will be set locally for this component. You need to specify the values for all related connection properties manually.

  • Repository: The connection details stored centrally in Repository > Metadata will be reused by this component.

    You need to click the [...] button next to it and in the pop-up Repository Content dialog box, select the connection details to be reused, and all related connection properties will be automatically filled in.

Use an existing connection

Select this check box and in the Component List drop-down list, select the desired connection component to reuse the connection details you already defined.

When a Job contains the parent Job and the child Job, if you need to share an existing connection between the two levels, for example, to share the connection created by the parent Job with the child Job, you have to:

  1. In the parent level, register the database connection to be shared in the Basic settings view of the connection component which creates that very database connection.

  2. In the child level, use a dedicated connection component to read that registered database connection.

For an example about how to share a database connection across Job levels, see Sharing a database connection.

JDBC Provider

Select the provider of the JDBC driver to be used.

Host

Enter the IP address or the hostname of the database server or the Azure Synapse Analytics to be used.

If the SQL Server Browser service is running on the machine where the server resides, you can connect to a named instance through a TCP dynamic port by providing the host name and the instance name in this field in the format of {host_name}\{instance_name}. In this case, you can leave the Port field empty. See SQL Server Browser service for related information.

Port

Enter the listening port number of the database server or the Azure Synapse Analytics to be used.

If the SQL Server Browser service is running on the machine where the server resides, you can connect to a named instance through a TCP dynamic port by providing the host name and the instance name in the Host field and leave this field empty. See SQL Server Browser service for related information.

Schema Enter the name of the Azure Synapse Analytics schema.
Database Specify the name of the Azure Synapse Analytics to be used.
Username and Password Enter the authentication data.

To enter the password, click the [...] button next to the Password field, enter the password in double quotes in the pop-up dialog box, and then click OK.

You can use Type 2 integrated authentication on Windows by adding integratedSecurity=true in the Additional JDBC Parameters field and leave these two fields empty. See section Connecting with integrated authentication On Windows at Building the connection URL for related information.

Additional JDBC Parameters

Specify additional connection properties for the database connection you are creating. The properties are separated by semicolon and each property is a key-value pair. For example, encrypt=true;trustServerCertificate=false; hostNameInCertificate=*.database.windows.net;loginTimeout=30; for Azure SQL database connection.

Table Specify the name of the SQL Data Warehouse table into which data will be loaded.
Action on table

Select an operation to be performed on the table defined.

  • None: No operation is carried out.

  • Drop and create table: The table is removed and created again.

  • Create table: The table does not exist and gets created.

  • Create table if not exists: The table is created if it does not exist.

  • Drop table if exists and create: The table is removed if it already exists and created again.

  • Clear table: The table content is deleted. You have the possibility to rollback the operation.

  • Truncate table: The table content is deleted. You do not have the possibility to rollback the operation.

Schema and Edit schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. When you create a Spark Job, avoid the reserved word line when naming the fields.

  • Built-In: You create and store the schema locally for this component only.

  • Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs.

Click Edit schema to make changes to the schema. If you make changes, the schema automatically becomes built-in.

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion.

    If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the Repository Content window.

Load method

Set the way for loading data. Two options are provided: Copy and PloyBase. See Data loading strategies for Synapse SQL pool for related information.

Azure Storage Select the type of Azure Storage from which data will be loaded.

The Data Lake Storage Gen1 option is not available if Copy is selected from the Load method drop-down list.

If you select both PloyBase as Load method and Data Lake Storage Gen2 as Azure Storage in the Basic settings view, and Secure transfer required in the Advanced settings view, the option Authentication with managed identity will be displayed in the Basic settings view. See Managed identities for Azure resources for related information.

Authentication method
Select one of the following authentication method from the drop-down list.
  • Shared Access Signatures, which requires a shared access signature. See Constructing the Account SAS URI for related information.
  • Storage account key, which requires an account access key. See Manage a storage account for related information.
  • Azure Active Directory, available when Authenticate using Azure Active Directory is selected in the Advanced settings view.
  • Managed identity, provided you have previously created a managed identity (MSI) in your Azure Storage account. See Managed identities for Azure resources for related information.

This option is only available when Copy is selected from the Load method drop-down list.

Account Name Enter the account name for your Azure Blob Storage or Azure Data Lake Storage to be accessed.
Access key

Enter the key associated with the storage account you need to access. Two keys are available for each account and by default, either of them can be used for this access. Select the component whose connection details will be used to set up the connection to Azure storage from the drop-down list.

This option is only available when Blob Storage is selected from the Azure Storage drop-down list.

SAS token

Enter the SAS token value. For related information, see Constructing the Account SAS URI. This option is only available when Shared access signatures is selected from the Authentication method drop-down list.

Endpoint suffix

Enter the Azure Storage service endpoint.

The combination of the account name and the Azure Storage service endpoint forms the endpoint of the storage account.

The default value varies with the Azure Storage setting, that is, "dfs.core.windows.net" for Data Lake Storage Gen2 and "blob.core.windows.net" for Blob Storage.

This option is only available when Copy is selected from the Load method list.

Container Enter the name of the blob container.

This option is only available when Blob Storage is selected from the Azure Storage drop-down list.

External paths

Enter external paths to load data from. You can add multiple external paths by clicking the plus button on the bottom. Note that the external paths are relative paths with the container as the root.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view.

Client Id Enter your application ID (also called client ID).

This option is only available when Data Lake Storage Gen1 is selected from the Azure Storage drop-down list.

OAuth 2.0 token endpoint

In the Token endpoint field, copy-paste the OAuth 2.0 token endpoint that you can obtain from the Endpoints list accessible on the App registrations page on your Azure portal.

This option is only available when Data Lake Storage Gen1 is selected from the Azure Storage drop-down list.

Azure Storage Location Specify the location where your Azure Blob Storage or Azure Data Lake Storage account is created.

Advanced settings

Secure transfer required Select this option to use secure transfer for Azure storage connection.

This option is only available when Data Lake Storage Gen2 is selected from the Azure Storage drop-down list and PloyBase is selected from the Load method drop-down list in the Basic settings view.

Authenticate using Azure Active Directory

Select this option to use Azure Active Directory authentication when establishing the connection. See Azure AD Authentication for related information.

File format Select the file format that defines external data stored in your Azure Blob Storage or Azure Data Lake Storage.

This option is only available when PloyBase is selected from the Load method drop-down list in the Basic settings view.

For more information about the file format, see CREATE EXTERNAL FILE FORMAT.

File type Select the file type that defines external data stored in your Azure Blob Storage or Azure Data Lake Storage. If you select Blob Storage as the Azure Storage type in the Basic settings view, only the CSV file type is supported.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view.

For more information about the file format, see CREATE EXTERNAL FILE FORMAT.

Specify map to source table fields Select this option to map the fields of the source file to specific schema columns by entering schema columns and field indexes in the table that appears. Note that the field indexes of the source file need to be given in an ascendant order in the table.

If this option is not selected, the fields of the source file will be mapped to the schema columns in the default order.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view.

First row

Set the number of rows in the source file to be treated as the header. The default is 1.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV is selected from the File type drop-down list.

Field quote Specify the character that will be used as the quote character (string delimiter) in the source file.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV is selected from the File type drop-down list.

Field terminator Specify the field terminator, which indicates the end of each field in the source file.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV is selected from the File type drop-down list.

Row terminator Specify the row terminator, which indicates the end of each row in the source file.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV is selected from the File type drop-down list.

Date format Specify the date format. The default is Session date format.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV is selected from the File type drop-down list.

Encoding Select the encoding from the list, either UTF8 or UTF16.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view and CSV is selected from the File type drop-down list.

Identity insert

Select this option to insert the identity values in the source file to the identity column. For related information, see COPY (Transact-SQL).

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view.

Max errors Enter the maximum number of errors allowed that will not stop the process.

This option is only available when Copy is selected from the Load method drop-down list in the Basic settings view.

Field separator Specify the character or characters that indicate the end of each field in the delimited text file.

This option is only available when PloyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list.

Enclosed by Select this check box and in the field next to it, specify the character that encloses the string in the delimited file.

This option is only available when PloyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list.

Date format Select this check box and in the field next to it, specify the custom format for all date and time data in the delimited file. For more information about the date format, see CREATE EXTERNAL FILE FORMAT.

This option is only available when PloyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list.

Use type default Select this check box to store each missing value using the default value of the data type of the corresponding column.

Clear this check box to store each missing value in the delimited file as NULL.

This option is only available when PloyBase is selected from the Load method drop-down list in the Basic settings view and Delimited Text is selected from the File format drop-down list.

Serde Method Select a Hive serializer and deserializer method.

This option is only available when Hive RCFile is selected from the File format drop-down list.

Compressed by Select this check box if external data is compressed, and from the drop-down list displayed next to it, select the compression method.
Data import reject options Select this check box to specify the following reject options.
  • Reject type: Specify how you want to deal with reject rows.

    • Value: If the number of rejected rows exceeds the value specified in the Reject value field, the load fails.
    • Percentage: If the percentage of rejected rows exceeds the value specified in the Reject value field, the load fails.
  • Reject value: The reject value according to the reject type. For percentage, it is the percent value without the symbol %.

  • Reject sample value: The reject percentage sample value.

This option is only available to the PolyBase option of the Load method drop-down list.

For more information about the reject options, see CREATE EXTERNAL TABLE.

Distribution Option Select the sharding pattern used to distribute data in the table. For more information about the sharding pattern supported by Azure Synapse Analytics, see Azure Synapse Analytics - Massively parallel processing (MPP) architecture.

This option is only available when any option related to table creation is selected from the Action on table drop-down list.

Distribution Column Name The name of the distribution column for a hash-distribution table.

This option is only available when Hash is selected from the Distribution Option drop-down list.

Table Option Select the index type of the table. For more information, see Indexing tables in SQL Data Warehouse.

This option is only available when any option related to table creation is selected from the Action on table drop-down list.

Index column(s) Specify the name of one or more key columns in the index. If multiple columns are specified, separate them with comma.

This option is only available when Clustered Index is selected from the Table Option drop-down list.

Partition Select this check box to specify the following partition options:
  • Partition column name: Specify the name of the column used to partition the table.

  • Range: Specify how the limit value is included in the range of the limit.

    • Left: The limit value is included in the left range of the limit.

    • Right: The limit value is included in the right range of the limit.

  • Partition For Values: Specify the values (separated by comma) used for partition.

For more information about the table partition, see Partitioning tables in SQL Data Warehouse.

This option is only available when any option related to table creation is selected from the Action on table drop-down list.

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE

The error message generated by the component when an error occurs. This is an After variable and it returns a string.

NB_LINE_INSERTED

The number of rows inserted. This is an After variable and it returns an integer.

Usage

Usage rule

This component can be used as a standalone component of a Job or subJob.

Limitation

Note that some features that are supported by other databases are not supported by Azure Synapse Analytics. For more information, see Unsupported table features.

Did this page help you?

If you find any issues with this page or its content – a typo, a missing step, or a technical error – let us know how we can improve!