Working with databases

Etlworks Integrator works equally well with all relational and NoSQL databases, as long as there is a JDBC driver for the database, and the database can be accessed from the server that Integrator is running on.

Connecting to the database

The following connectors are available out-of-the-box:

To connect to a relational database, create a database connection.

Connecting to Oracle

To create a connection to an Oracle database, open the Connections window, click the + button, and type in oracle. Choose a connection type from the list of available Oracle connections.

Enter the connection parameters.

The following link details the properties that are available for an Oracle connection's URL.
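For reference, Oracle thin-driver JDBC URLs generally take one of the following forms; the host, port, and service name or SID shown here are placeholders:

```text
jdbc:oracle:thin:@//dbhost:1521/MYSERVICE
jdbc:oracle:thin:@dbhost:1521:MYSID
```

The first form connects by service name, the second by SID.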

Connecting to MS SQL Server

To create a connection to a Microsoft SQL Server database, open the Connections window, click the + button, and type in sql server. Select Microsoft SQL Server.

Enter the connection parameters.

The following link details the properties that are available for a Microsoft SQL Server connection's URL.
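For reference, a typical Microsoft SQL Server JDBC URL looks like the following; the host, port, and database name are placeholders:

```text
jdbc:sqlserver://dbhost:1433;databaseName=mydb
```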

Connecting to IBM DB2

To create a connection to an IBM DB2 database, open the Connections window, click the + button, and type in db2. Select IBM DB2.

Enter the connection parameters.

The following link details the properties that are available for an IBM DB2 connection's URL.
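For reference, a typical IBM DB2 JDBC URL looks like the following; the host, port, and database name are placeholders:

```text
jdbc:db2://dbhost:50000/mydb
```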

Connecting to PostgreSQL

To create a connection to a PostgreSQL database, open the Connections window, click the + button, and type in postgres. Select PostgreSQL.

Enter the connection parameters.

The following link details the properties that are available for a PostgreSQL connection's URL.
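For reference, a typical PostgreSQL JDBC URL looks like the following; the host, port, and database name are placeholders:

```text
jdbc:postgresql://dbhost:5432/mydb
```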

Connecting to MySQL

To create a connection to a MySQL database, open the Connections window, click the + button, and type in mysql. Select MySQL.

Enter the connection parameters.

The following link details the properties that are available for a MySQL connection's URL.
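For reference, a typical MySQL JDBC URL looks like the following; the host, port, and database name are placeholders:

```text
jdbc:mysql://dbhost:3306/mydb
```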

Connecting to Informix

To create a connection to an Informix database, open the Connections window, click the + button, and type in informix. Select one of the available Informix connection types.

Enter the connection parameters.

If Informix uses the DB2 driver, select the corresponding connection type; the following link details the properties that are available for the connection's URL.

Otherwise, the following link details the URL properties that are available.
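For reference, a typical URL for the native Informix JDBC driver looks like the following; the host, port, database name, and server name are placeholders:

```text
jdbc:informix-sqli://dbhost:9088/mydb:INFORMIXSERVER=myserver
```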

Connecting to Sybase ASE

To create a connection to a Sybase ASE database, open the Connections window, click the + button, and type in sybase. Select Sybase ASE.

Enter the connection parameters.

The following link details the properties that are available for a Sybase ASE connection's URL.
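For reference, a typical Sybase ASE JDBC URL for the jConnect driver looks like the following; the host, port, and database name are placeholders:

```text
jdbc:sybase:Tds:dbhost:5000/mydb
```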

Connecting to Snowflake

To create a connection to a Snowflake database, open the Connections window, click the + button, and type in snowflake. Select Snowflake.

Enter the connection parameters.

The following link details the properties that are available for a Snowflake connection's URL.
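For reference, a typical Snowflake JDBC URL looks like the following; the account identifier, database, schema, and warehouse are placeholders:

```text
jdbc:snowflake://myaccount.snowflakecomputing.com/?db=MYDB&schema=PUBLIC&warehouse=MYWH
```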

Read how to work with Snowflake in Etlworks Integrator.

Connecting to Amazon RDS

Amazon RDS is a fully-managed database service. The following RDS connectors are available in Etlworks Integrator:

Connecting to Amazon Aurora PostgreSQL

Amazon Aurora PostgreSQL is essentially a fully-managed PostgreSQL cluster that is available on AWS. Use the following guidelines to configure an Aurora PostgreSQL cluster.

To create a connection to an Amazon Aurora PostgreSQL database, open the Connections window, click the + button, and type in aurora. Select Amazon Aurora PostgreSQL.

Enter the connection parameters.

The following link details the properties that are available for an Amazon Aurora PostgreSQL connection's URL.
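Aurora PostgreSQL is wire-compatible with PostgreSQL, so the standard PostgreSQL JDBC URL format applies; point it at the cluster endpoint (the endpoint and database name below are placeholders):

```text
jdbc:postgresql://mycluster.cluster-abc123.us-east-1.rds.amazonaws.com:5432/mydb
```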

Connecting to Amazon Aurora MySQL

Amazon Aurora MySQL is essentially a fully-managed MySQL cluster that is available on AWS. Use the following guidelines to configure an Aurora MySQL cluster.

To create a connection to an Amazon Aurora MySQL database, open the Connections window, click the + button, and type in aurora. Select Amazon Aurora MySQL.

Enter the connection parameters.

The following link details the properties that are available for an Amazon Aurora MySQL connection's URL.
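Aurora MySQL is wire-compatible with MySQL, so the standard MySQL JDBC URL format applies; point it at the cluster endpoint (the endpoint and database name below are placeholders):

```text
jdbc:mysql://mycluster.cluster-abc123.us-east-1.rds.amazonaws.com:3306/mydb
```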

Connecting to Amazon Redshift

Amazon Redshift is a fully-managed data warehouse that is available on AWS.

To create a connection to an Amazon Redshift database, open the Connections window, click the + button, and type in redshift. Select Amazon Redshift.

Enter the connection parameters.

The following link details the properties that are available for an Amazon Redshift connection's URL.
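For reference, a typical Amazon Redshift JDBC URL looks like the following; the cluster endpoint, port, and database name are placeholders:

```text
jdbc:redshift://mycluster.abc123.us-east-1.redshift.amazonaws.com:5439/dev
```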

Read how to work with Amazon Redshift in Etlworks Integrator.

Connecting to Google BigQuery

Google BigQuery is not a typical relational database (it is actually a service), but Google has partnered with third-party companies to provide a JDBC driver that leverages the power of BigQuery's Standard SQL.

To create a connection to Google BigQuery, open the Connections window, click the + button, and type in bigquery. Select Google BigQuery.

Enter the connection parameters.

The following link details the properties that are available for a Google BigQuery connection's URL.

At this time, a Google service account is required to connect to BigQuery.

This guide explains how to create a new Google service account.

When creating a new service account, two parameters are important:

Use the following URL to connect to BigQuery using a Google service account.

jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<Project>;OAuthType=0;OAuthServiceAcctEmail={user};OAuthPvtKeyPath={password}

Enter a Google service account email into the User field.

Upload the key, which is in the JSON or P12 format, into server storage.

Enter the path to the key file into the Password field. Use the token {app.data} to define the path to the key. For example: {app.data}/key.json.

Connecting to Derby

Derby (Java DB) is a compact and embeddable database implemented entirely in Java and available under the Apache License, Version 2.0.

To create a connection to a Derby database, open the Connections window, click the + button, and type in derby. Select Client Java DB (Derby).

Enter the connection parameters.

The following link details the properties that are available for a Derby connection's URL.
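For reference, a typical URL for the Derby network client driver looks like the following; the host, port, and database name are placeholders, and create=true creates the database if it does not exist:

```text
jdbc:derby://dbhost:1527/mydb;create=true
```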

Connecting to SQLite

SQLite is available in Etlworks Integrator as a temporary staging database.

In a dedicated instance, it is also possible to create a connection to an actual, persistent SQLite database, not just the temporary in-memory one.

Step 1. Open the Connections window, click the + button, and type in jdbc. Select Generic JDBC.

Step 2. Enter org.sqlite.JDBC into the Driver field.

Step 3. Enter the connection URL as described here.

Step 4. If needed, enter a user name and password.
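For reference, the SQLite JDBC driver accepts URLs of the following form, where the file path is a placeholder. Combining it with the {app.data} token, as described above for key files, is an assumption that should work the same way for database paths:

```text
jdbc:sqlite:/path/to/database.db
jdbc:sqlite:{app.data}/mydb.db
```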

Connecting to a database that is not on the list

If a database is not on the list of supported databases, you can still connect to it using this guide.

Testing a database connection

To test a database connection, click the Test Connection button on the connection screen.

Test connection

Browsing database objects and executing SQL

Use Integrator's Explorer add-on to browse database objects, such as tables, views, and columns, and to execute SQL queries.

Browsing database objects

To browse database objects, select and expand the database connection on the left navigational bar.

Writing and executing SQL queries

To write and execute SQL queries, use the built-in SQL editor.

Database as a source

Integrator can extract data from database objects, such as tables, views, synonyms, user-defined SQL, functions, etc., transform it and load it into any destination, for example, another database, files, API endpoints, etc.

Step 1. If a database is the source of your transformation, start creating the data integration flow by opening the Flow Builder page, clicking the + button, selecting Databases in the left navigational bar, and selecting a flow type where the source is a database:

database as a source

Step 2. Continue by creating one or more transformations where the source is a database.

db is the source of a transformation

Step 3. Modify each transformation by selecting the database object to read data from (table, view, etc.) or defining a source query.

Step 4. Continue by defining the mapping and parameters.

Database as a destination

Integrator can extract data from files, API endpoints, social websites, databases, etc., transform it, and load it into any database. When loading data into a database, Integrator either automatically generates and executes an SQL DML statement (INSERT, UPDATE, DELETE, or MERGE) or executes user-created SQL.

Step 1. If a database is the destination of a transformation, start creating the data integration flow by opening the Flow Builder page, clicking the + button, selecting Databases in the left navigational bar, and selecting a flow type where the destination is a database:

database as a destination

Step 2. Continue by creating one or more transformations where the destination is a database.

db is the destination of a transformation

Step 3. Modify each transformation by selecting a database object to load data into (table, view, etc.) or by defining a destination query.

Step 4. Continue by defining the mapping and parameters.

Automatically create a destination table

Integrator can automatically create a destination table if it does not already exist. It uses the fields in the source to generate and execute a CREATE TABLE SQL statement.

This behavior is enabled by default.

create table
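The idea behind automatic table creation can be sketched in a few lines: derive a CREATE TABLE statement from the source's field names and types, then execute it against the destination. This is an illustrative sketch using SQLite, not Integrator's actual implementation; the table and field names are hypothetical:

```python
import sqlite3

def create_table_sql(table, fields):
    """Generate a CREATE TABLE statement from a field-name -> SQL-type mapping."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in fields.items())
    return f"CREATE TABLE IF NOT EXISTS {table} ({columns})"

# Fields discovered in the source (hypothetical example)
source_fields = {"id": "INTEGER", "name": "TEXT", "amount": "NUMERIC"}

conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("orders", source_fields))

# Confirm the destination table now exists
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
```

IF NOT EXISTS makes the statement safe to run on every flow execution, which matches the "create only when missing" behavior described above.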

Conditionally update database table using MERGE

By default, when the destination is a database, Integrator generates and executes an INSERT SQL statement. In many cases, it makes more sense to check whether the record already exists and UPDATE it instead. A relational database management system uses an SQL MERGE (also called UPSERT) statement to INSERT new records or UPDATE existing records, depending on whether a matching record already exists. Integrator supports MERGE (UPSERT) for the following databases:

Step 1. Create a flow where the destination is a database.

Step 2. Click the MAPPING button between FROM and TO.

Step 3. Go to the Parameters tab.

Step 4. Select MERGE as the Action.

Step 5. Enter a comma-separated list of fields into the Lookup Fields text box. These fields will be used to match records in the destination table.

merge action

Important: PostgreSQL and MySQL require a unique index, which includes all the fields defined in the Lookup Fields text box.
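To illustrate MERGE semantics (this is generic UPSERT SQL, not the statement Integrator generates), the sketch below uses SQLite's INSERT ... ON CONFLICT syntax, which PostgreSQL shares; note the unique index on the lookup field, mirroring the requirement above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT, name TEXT)")
# PostgreSQL and MySQL require a unique index covering the lookup fields
conn.execute("CREATE UNIQUE INDEX idx_email ON customers (email)")

upsert = """
    INSERT INTO customers (email, name) VALUES (?, ?)
    ON CONFLICT (email) DO UPDATE SET name = excluded.name
"""
conn.execute(upsert, ("jane@example.com", "Jane"))      # inserts a new record
conn.execute(upsert, ("jane@example.com", "Jane Doe"))  # updates the existing one

rows = conn.execute("SELECT email, name FROM customers").fetchall()
```

After both statements run, the table still contains a single row for jane@example.com, now with the updated name.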

Conditionally update database table without MERGE

The alternative to MERGE is the IfExist action. Instead of relying on the target database's native MERGE, Integrator executes an SQL statement to check whether the record already exists; if it does, Integrator executes an UPDATE, otherwise an INSERT.

Use IfExist if:

Step 1. Create a flow with at least one transformation where a database is the destination.

Step 2. When configuring each transformation, click MAPPING and then select the Parameters tab.

Step 3. Select IfExist for the action and enter a comma-separated list of field names, which will be used to identify unique records.

IfExist action
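The check-then-write pattern behind IfExist can be sketched as follows. This is an illustrative implementation using SQLite, not Integrator's internal code; the table and field names are hypothetical:

```python
import sqlite3

def if_exist_write(conn, table, lookup_fields, record):
    """Check whether a matching record exists; UPDATE it if so, INSERT otherwise."""
    where = " AND ".join(f"{f} = ?" for f in lookup_fields)
    keys = [record[f] for f in lookup_fields]
    exists = conn.execute(f"SELECT 1 FROM {table} WHERE {where}", keys).fetchone()
    if exists:
        updates = ", ".join(f"{f} = ?" for f in record)
        conn.execute(f"UPDATE {table} SET {updates} WHERE {where}",
                     list(record.values()) + keys)
    else:
        cols = ", ".join(record)
        marks = ", ".join("?" * len(record))
        conn.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})",
                     list(record.values()))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, price REAL)")
if_exist_write(conn, "products", ["sku"], {"sku": "A1", "price": 9.99})  # INSERT
if_exist_write(conn, "products", ["sku"], {"sku": "A1", "price": 7.49})  # UPDATE
rows = conn.execute("SELECT sku, price FROM products").fetchall()
```

Unlike native MERGE, this approach issues two statements per record, so it works on any database but is slower; that trade-off is why IfExist is the fallback rather than the default.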

Conditionally update a database table (JavaScript)

You can use JavaScript to define what kind of DML statement (INSERT, UPDATE, DELETE or MERGE) will be executed when updating the target table.

Other common use cases

Change replication

In Integrator, it is possible to create flows that track changes in a database and load them into another database or an API. This is called change replication.

Examples:

Using a temporary database

In Integrator, it is possible to create a connection to a temporary, in-memory database and use it as a staging area for the data.

Work with a temporary database as you would with any other relational database; just keep in mind that it will be automatically deleted once the connection that created it closes.

Usage examples:
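For instance, here is a minimal sketch of the staging pattern using SQLite's in-memory mode, whose lifetime is tied to the connection in the same way as Integrator's temporary database:

```python
import sqlite3

# An in-memory database lives only as long as its connection
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO staging VALUES (?, ?)",
                 [(1, "extracted"), (2, "records")])

# Transform in the staging area before loading into the final destination
staged = conn.execute(
    "SELECT id, upper(payload) FROM staging ORDER BY id").fetchall()

conn.close()  # the temporary database is gone once the connection closes
```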

Automatically create staging tables

Integrator can automatically create (and drop) staging tables for you, from practically any data source: files, API endpoints, etc.

Step 1. Begin creating a data integration flow by opening the Flow Builder page. Click the + button and type create staging tables in the "Select Flow Type" box.

create staging tables

Step 2. Continue by defining the FROM and TO. Select a database connection for the staging table(s) in the TO connection field. Note: the value of the TO field will be used as the name of the staging table.

create staging tables flow

Step 3. Configure the flow to drop an existing staging table if it already exists.

drop staging tables

Execute any SQL

In Integrator, it is possible to execute any SQL statement (including DDL), or a list of semicolon-separated SQL statements, outside of a regular source-to-destination transformation.

Read more about SQL flows.

Read records from a database and execute the flow for each record

In Integrator, it is possible to execute the same flow in a loop, once for each record returned by an SQL statement.

Usage Examples:

Expose a database object as an API endpoint

In Integrator, it is possible to build a REST API that can be used to access some controlled portion of the data from the other side of the firewall. For example, you can create a flow which executes an SQL query that reads data from your corporate database and exposes the result set as an Integrator API endpoint.

Read more about building a REST API for accessing database objects.

Inserting images from files into a database table

Read how to insert image files into a database table.

Tips and tricks

Read about tips and tricks to be used when working with databases.

Performance tips

Read about performance tips when working with databases.

Also read about common performance tips.