Scripting Overview

In Integrator, you can create very complex data integration flows without ever writing a single line of code. However, there are certain tasks which are better handled by using the built-in scripting language. For example: looping through inbound files, calculating a field's value on the fly, validating data, etc. Integrator uses JavaScript as the primary scripting language.

Tutorials

Integrator includes a JavaScript engine called Nashorn, an embedded scripting engine that runs inside Java applications. There are plenty of Nashorn tutorials available online.

Scripting basics

It is important to understand that the scripting engine is integrated directly into the ETL engine, so that it has access to all the ETL classes within the Toolsverse API as well as core Java packages and classes, including collection-based APIs, utilities, etc.

Combining JavaScript and Java

Scripts must be written in JavaScript, but they can freely call Java classes and methods. It almost feels like writing Java, minus the Java-specific language constructs. The best part is that you can combine and chain JavaScript and Java functions in one line of code:

// note: Java's replaceAll() takes a regex, so the '|' metacharacter must be escaped
var val = value.toLowerCase().trim().replaceAll('\\|', '');

Packages and Classes

To access the classes provided by the Toolsverse API you need to import a particular package. It is possible to import multiple packages into the same JavaScript program.

For example:

importPackage(com.toolsverse.etl.common);
importPackage(com.toolsverse.util); 

// the rest of the code

Some of the most commonly used packages

| Used for | Import | JavaDoc |
| --- | --- | --- |
| Utility classes | importPackage(com.toolsverse.util); | util |
| Logging | importPackage(com.toolsverse.util.log); | util.log |
| System Configuration | importPackage(com.toolsverse.config); | config |
| Common ETL engine classes, such as DataSet, FieldDef, etc. | importPackage(com.toolsverse.etl.common); | etl.common |
| ETL engine configuration | importPackage(com.toolsverse.etl.core.config); | etl.core.config |
| ETL engine core | importPackage(com.toolsverse.etl.core.engine); | etl.core.engine |
| Common ETL engine tasks | importPackage(com.toolsverse.etl.core.task.common); | etl.core.task.common |
| Common Java data structures and collection classes | importPackage(java.util); | java.util |

Some of the most commonly used classes

| Used for | Class name / JavaDoc | Usage Example | Import |
| --- | --- | --- | --- |
| Utility methods | com.toolsverse.util.Utils | if (Utils.isNothing(value)) {do something} | importPackage(com.toolsverse.util); |
| Working with file names | com.toolsverse.util.FilenameUtils | var name = FilenameUtils.getName(value); | importPackage(com.toolsverse.util); |
| Logging | com.toolsverse.util.log.Logger | Logger.log(Logger.SEVERE, null, "Error doing something"); | importPackage(com.toolsverse.util.log); |
| System Configuration | com.toolsverse.config.SystemConfig | var props = SystemConfig.instance().getProperties(); | importPackage(com.toolsverse.config); |
| Accessing data | com.toolsverse.etl.common.DataSet | var fldValue = dataSet.getFieldValue(currentRow, "InvoiceNo"); | importPackage(com.toolsverse.etl.common); |
| Accessing dataset field | com.toolsverse.etl.common.FieldDef | var fldName = dataSet.getFieldDef("InvoiceNo").getNameToUse(); | importPackage(com.toolsverse.etl.common); |
| Accessing dataset record | com.toolsverse.etl.common.DataSetRecord | var record = dataSet.getRecord(0); | importPackage(com.toolsverse.etl.common); |
| High level transformations | com.toolsverse.etl.common.CommonEtlUtils | var newDs = CommonEtlUtils.intersect(dataSet, with, "id"); | importPackage(com.toolsverse.etl.common); |
| Connection information | com.toolsverse.etl.common.Alias | var alias = etlConfig.getAliasesMap().get('Connection name'); | importPackage(com.toolsverse.etl.common); |
| ETL Engine configuration | com.toolsverse.etl.core.config.EtlConfig | var alias = etlConfig.getAliasesMap().get('Connection name'); | importPackage(com.toolsverse.etl.core.config); |
| ETL scenario | com.toolsverse.etl.core.engine.Scenario | var source = scenario.getSources().get('Source name'); | importPackage(com.toolsverse.etl.core.engine); |
| Validation | com.toolsverse.etl.core.engine.TaskResult | if (dataSet.getFieldValue(currentRow, 'SUCCESS').toString() == 'error') {value = TaskResult.HALT;} | importPackage(com.toolsverse.etl.core.engine); |
| Checking if files exist | com.toolsverse.etl.core.task.common.FileManagerTask | var ret = !FileManagerTask.filesExist(etlConfig, 'name of the connection', '*.json') ? null : "true"; | importPackage(com.toolsverse.etl.core.task.common); |
| Writing files or making HTTP calls | com.toolsverse.etl.core.task.common.FileManagerTask | FileManagerTask.write(etlConfig, "name of the connection", filename, payload); | importPackage(com.toolsverse.etl.core.task.common); |
| Reading files | com.toolsverse.etl.core.task.common.FileManagerTask | var data = FileManagerTask.read(etlConfig, "name of the connection", filename); | importPackage(com.toolsverse.etl.core.task.common); |

Returning values

If the JavaScript code needs to return a value, for example when calculating a field's value on the fly, simply assign something to the value variable. Note: you don't need to declare a value variable.

For example:

// a bunch of code

value = 1 + 2 + 3;

Alternatively, if the script is a single expression (for example, an inline function), the result of evaluating that expression is returned as the value.

For example:

{firstName} + ' ' + {lastName};
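The same "last expression is the result" behavior can be reproduced in plain JavaScript with eval; a minimal standalone sketch (outside Integrator, for illustration only):

```javascript
// Evaluating a script that consists of a single expression yields the
// expression's result; Integrator returns that result as the value.
var result = eval("1 + 2 + 3");
// result is 6
```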

Debugging JavaScript

At this time, three techniques are available when debugging JavaScript code: logging, generating debug information and throwing exceptions.

Logging

When debugging JavaScript code, it is possible to add information to the system log.

To log something (placed anywhere in the JavaScript code):

importPackage(com.toolsverse.util.log);

// bunch of code

Logger.log(Logger.INFO, null, "Something happened");

To see the log in real time (as a tail log) or after the flow has been executed, go to the flow dashboard.

Generating debug information

When debugging JavaScript code, it is possible to generate a debugging informational log which can be viewed from the flow dashboard or from the audit trail grid.

To generate debug information (placed anywhere in the JavaScript code):

// bunch of code

etlConfig.log('start');
// ...
etlConfig.log('something happened here');

For example:

debug info

Note: debug information also appears in the flow log, so it can be used to monitor the flow in real time.

To view debug information from the Flow Builder

Step 1: Click on the Success or Error link in the Flow Builder

debug info link

Step 2: In the flow statistics dashboard click on the info link

info link

Step 3: Debugging information will be displayed in a pop-up window

info pop-up

To view debug information from the audit-trail

Important: only admin users have access to the audit-trail.

Step 1: Go to Statistics->Audit-trail window and click the Success or Error link.

audit-trail window

Step 2: Debugging information will be displayed in a pop-up window

debug info pop-up

Throwing an exception

When an exception is thrown, execution of the flow will stop and the entire exception stack trace will be displayed.

Exception

To throw an exception (placed anywhere in the JavaScript code):

// bunch of code

if (Utils.isNothing(flowId)) {
   throw 'Flow ID not set';
}

Loading External Libraries

It is possible to extend the functionality of the embedded JavaScript by loading external JavaScript and Java libraries.

Loading JavaScript Library

To load any external JavaScript library simply put load(url); anywhere in the JavaScript.

// loading from Internet
load("https://example.com/js/test.js");

// loading from file
load("file://path/js/test.js");

Loading Java Library

Note: if you are using hosted Etlworks Integrator in a cloud please contact your system administrator or Etlworks support at support@etlworks.com.

Step 1. Copy the Java library (a jar file) into the TOMCAT_HOME/lib folder.

Step 2. Restart Tomcat.

Step 3. Import package(s) in the JavaScript code.

importPackage(java.package.name);

Use Cases

Setting a field's value

Setting a field's value is available under the field-to-field mapping.

To set a field's value use the techniques described here.

The following objects can be referenced by name from JavaScript code:

| Object name | Class name / JavaDoc | Import |
| --- | --- | --- |
| dataSet | com.toolsverse.etl.common.DataSet | importPackage(com.toolsverse.etl.common); |
| currentRow | com.toolsverse.etl.common.DataSetRecord | importPackage(com.toolsverse.etl.common); |
| etlConfig | com.toolsverse.etl.core.config.EtlConfig | importPackage(com.toolsverse.etl.core.config); |
| scenario | com.toolsverse.etl.core.engine.Scenario | importPackage(com.toolsverse.etl.core.engine); |
| row | current 0-based row number | |
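As a simple illustration, the row object can be combined with the value assignment to generate a row number field. A minimal sketch, wrapped in a function here so it can run standalone (inside Integrator you would assign directly to value):

```javascript
// row is the 0-based row number provided by Integrator;
// produce a 1-based label such as "R1", "R2", ...
function rowLabel(row) {
    return 'R' + (row + 1);
}

// inside a field-to-field mapping this would simply be:
// value = 'R' + (row + 1);
```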

Referencing existing fields

Existing fields in the currentRow can be referenced as a {field name}. For example:

{firstName} + ' ' + {lastName};

It is also possible to use any of the available dataSet.getFieldValue(...) methods.

Global Variables

Whenever you need to "remember" a string, in order to reference it later from JavaScript code or as part of a URL, file name, folder name, etc., use the following technique.

To store string:

importPackage(com.toolsverse.config);

// props is a java.util.HashMap<String, String>
var props = SystemConfig.instance().getProperties();

// you can store strings in props
props.put("unique key", "some string value");

Accessing a previously stored string:

importPackage(com.toolsverse.config);

// props is a java.util.HashMap<String, String>
var props = SystemConfig.instance().getProperties();

// use the same key you stored in props.put(...)
var obj = props.get("unique key");

Important: string values stored in SystemConfig.instance().getProperties() can be referenced as a {unique key} in the connection's URL, directory, file name, or in the FROM and TO fields for the transformation.
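To illustrate how a {unique key} reference gets resolved, here is a standalone sketch of the substitution. This is an illustration only, not Integrator's actual implementation, and all names in it are hypothetical:

```javascript
// Replace {key} tokens in a template with values from a properties map;
// unknown keys are left untouched.
function substitute(template, props) {
    return template.replace(/\{([^}]+)\}/g, function (match, key) {
        return Object.prototype.hasOwnProperty.call(props, key) ? props[key] : match;
    });
}
```

For example, substitute('ftp://host/{dir}/{file}', { dir: 'inbound', file: 'a.csv' }) yields 'ftp://host/inbound/a.csv'.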

Flow Variables

Flow variables are key-value pairs that are passed as URL parameters in the user-created API endpoints or added by the user as parameters in the nested flow.

Accessing a value stored in variable:

var value = scenario.getVariable("unique key").getValue();

Adding new variable:

importPackage(com.toolsverse.etl.common);

var variable = new Variable();
variable.setName('name');
variable.setValue('value');

scenario.addVariable(variable);

Key/value storage

Key/value storage is a common facility for objects, programmatically stored as key/value pairs and accessible from anywhere within the JavaScript. Unlike global variables and flow variables, you can add any object to the key/value storage.

To store object:

// you can store anything in etlConfig
etlConfig.setValue("unique key", someObject);

Accessing a previously stored object:

// use the same key that was used in etlConfig.setValue(...)
var obj = etlConfig.getValue("unique key");

Saving the source file name in a variable

A fairly common usage is when the flow needs to access the source file name, for example, in order to store it in the database.

The simplest way is to use dataSet.getFileNameToRead(), if the dataSet object is available to that particular JavaScript code. For example:

importPackage(com.toolsverse.config);
importPackage(com.toolsverse.util);

SystemConfig.instance().getProperties().put('my unique key for the source file name', 
   FilenameUtils.getName(dataSet.getFileNameToRead()));

and then to access it somewhere else:

importPackage(com.toolsverse.config);

var sourceFileName = SystemConfig.instance().getProperties().get('my unique key for the source file name');

Executing any JavaScript

There is a special type of flow in Integrator, which allows the user to execute any JavaScript. It can be combined together with any other flow by using a nested flow.

The following objects can be referenced by name from JavaScript code:

| Object name | Class name / JavaDoc | Import |
| --- | --- | --- |
| etlConfig | com.toolsverse.etl.core.config.EtlConfig | importPackage(com.toolsverse.etl.core.config); |
| scenario | com.toolsverse.etl.core.engine.Scenario | importPackage(com.toolsverse.etl.core.engine); |

Executing a flow in a loop

Typical examples are flows that process multiple inbound files or multiple batches of records.

All these cases have one thing in common: the flow must be executed while a certain condition is true, for example, while there are more files or more records to process.

To create a flow which will be executed in a loop:

Step 1. Create the flow which you would like to execute in a loop.

Step 2. Create a nested flow, which includes the flow created in step 1.

Step 3. Specify the loop conditions.

Loop

Important: the flow continues to run in a loop while the JavaScript Loop condition returns a non-null value.

The following objects can be referenced by name from JavaScript code:

| Object name | Class name / JavaDoc | Import |
| --- | --- | --- |
| etlConfig | com.toolsverse.etl.core.config.EtlConfig | importPackage(com.toolsverse.etl.core.config); |
| scenario | com.toolsverse.etl.core.engine.Scenario | importPackage(com.toolsverse.etl.core.engine); |
| evals | 1-based number of loop iterations | |

Execute the flow a fixed number of times

// this flow will be executed 40 times
value = evals <= 40 ? evals : null;
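A more realistic Loop condition checks some external state, for example whether files still exist (the FileManagerTask.filesExist call from the tables above could supply that check). A standalone sketch of the logic, with the file check factored out as a parameter so it can run outside Integrator; the connection name in the comment is hypothetical:

```javascript
// Return a non-null value to run another iteration, null to stop.
// filesRemaining stands in for a real check such as
// FileManagerTask.filesExist(etlConfig, 'inbound files', '*.json').
function loopCondition(evals, filesRemaining) {
    // safety cap: never loop more than 1000 times
    return (filesRemaining && evals <= 1000) ? evals : null;
}

// inside Integrator the Loop script could be:
// value = loopCondition(evals, FileManagerTask.filesExist(etlConfig, 'inbound files', '*.json'));
```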

Executing a flow conditionally

Similar to how flows can be run in a loop, they also can be executed conditionally. When creating conditionally-executed flows, follow the same steps as when creating flows executed in a loop, but instead of Loop specify a Condition.

Important: a Condition in JavaScript must set the value variable to true or false:

value = boolean condition; // true or false

The following objects can be referenced by name from the JavaScript code:

| Object name | Class name / JavaDoc | Import |
| --- | --- | --- |
| etlConfig | com.toolsverse.etl.core.config.EtlConfig | importPackage(com.toolsverse.etl.core.config); |
| scenario | com.toolsverse.etl.core.engine.Scenario | importPackage(com.toolsverse.etl.core.engine); |
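For example, a Condition might check a flag that an earlier step stored in the global properties. A standalone sketch of the logic (the key name 'run.inner.flow' is hypothetical), wrapped in a function so it can run outside Integrator:

```javascript
// A Condition must produce true or false. Here the inner flow runs only
// when an earlier step stored the flag 'run.inner.flow' = 'true'.
function shouldRun(props) {
    return props['run.inner.flow'] === 'true';
}

// inside Integrator, using the global properties described earlier:
// value = SystemConfig.instance().getProperties().get('run.inner.flow') == 'true';
```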

Inserting images from files into database table

Read how to insert image files into a database table.

Lookup

One of the common ETL operations, typically performed to calculate a field's value, is a lookup. Given the list of input parameters, the system should be able to return a field's value or the entire dataset, by querying a database or other data source. For example, if the input is a user_id, the system should be able to return all the phones which belong to a user with the given id.

Important: lookup can be expensive if executed for each row of the dataset, especially when the data source is not a database, such as data in a file or on an API endpoint. In such a case, if possible, first dump the data into a temporary staging database prior to using lookup.

An example of populating all images from two linked database tables for a row with the given id:

importPackage(com.toolsverse.etl.core.engine);
importPackage(com.toolsverse.util);

var images = Extractor.lookup(etlConfig, scenario, "source_connection", "image", 
    "select filename as url, " +
    "row_number() OVER () as id " +
    "from property_pictures " +
    "inner join property_picture on " +
    "(property_pictures.nestedrowid = property_picture.nestedparentrowid) " +
    "where property_pictures.nestedparentrowid={ROWID}", 
    new TypedKeyValue("ROWID", dataSet.getFieldValue(currentRow, 'nestedrowid')));

The signature of the Extractor.lookup:

Extractor.lookup(etlConfig, scenario, source-connection-name, dataset-name, sql, key-value-parameters)

In the example above, "source_connection" is the name of the source connection, "image" is the dataset name, the SQL contains the bind variable {ROWID}, and the TypedKeyValue parameter supplies its value.

Required imports:

importPackage(com.toolsverse.etl.core.engine);
importPackage(com.toolsverse.util);

Executing SQL from JavaScript

The typical use-case for executing SQL from JavaScript is updating the execution status at run-time in a complex nested flow. The example below demonstrates how to execute SQL from JavaScript. The prerequisites for this example are: the JavaScript flow is part of a nested flow, and the database connection is either used in one of the inner flows or added as a named connection to the nested flow.

importPackage(java.util);
importPackage(com.toolsverse.config);
importPackage(com.toolsverse.util);
importPackage(com.toolsverse.etl.sql.util);

var ex = etlConfig.getLastException();

etlConfig.setLastException(null);

var props = SystemConfig.instance().getContextProperties();

var stagingConnection = etlConfig.getConnectionFactory().getConnection("Staging Postgres");

var statusSql = "insert into feeds_status (filename, url, error, finished) values (?,?,?,?)";

var message = null;

if (ex != null) {
  message = ex.getMessage();

  if (Utils.isNothing(message)) {
     message = Utils.getStackTraceAsString(ex);
  }
}  

var when = new java.util.Date();


try {
   SqlUtils.executeSql(stagingConnection, statusSql, props.get('filename'), props.get('url'), message, when);
} catch (e) {
   // use a different variable name so the outer ex (the last ETL exception) is not shadowed
   etlConfig.log(e.toString());
}  

In the example above:

Step 1. Get the reference to the database connection:

var stagingConnection = etlConfig.getConnectionFactory().getConnection("Staging Postgres");

Step 2. If needed, set the bind variables and execute the SQL:

SqlUtils.executeSql(stagingConnection, statusSql, props.get('filename'), props.get('url'), message, when);

Required imports:

importPackage(com.toolsverse.config);
importPackage(com.toolsverse.etl.sql.util);

Validating data

Validation using JavaScript is available under Mapping/Additional Transformations/Validate.

Validate

You can conditionally fail an entire flow (the flow execution will stop and an exception will be generated)

if (dataSet.getFieldValue(currentRow, 'SUCCESS').toString() == 'error') {
   value = TaskResult.HALT;
} else {
   value = TaskResult.CONTINUE; 
}

or reject the whole dataset (the source-to-destination transformation will be skipped)

if (dataSet.getFieldValue(currentRow, 'SUCCESS').toString() == 'error') {
   value = TaskResult.STOP;
} else {
   value = TaskResult.CONTINUE; 
}

or just reject the current row (the current row will not be pushed to the destination)

if (dataSet.getFieldValue(currentRow, 'SUCCESS').toString() == 'error') {
   value = TaskResult.REJECT;
} else {
   value = TaskResult.CONTINUE; 
}

The following objects can be referenced by name from the JavaScript code:

| Object name | Class name / JavaDoc | Import |
| --- | --- | --- |
| dataSet | com.toolsverse.etl.common.DataSet | importPackage(com.toolsverse.etl.common); |
| currentRow | com.toolsverse.etl.common.DataSetRecord | importPackage(com.toolsverse.etl.common); |
| etlConfig | com.toolsverse.etl.core.config.EtlConfig | importPackage(com.toolsverse.etl.core.config); |
| scenario | com.toolsverse.etl.core.engine.Scenario | importPackage(com.toolsverse.etl.core.engine); |
| row | current 0-based row number | |

Transforming data

This is the most powerful type of transformation. It is used to change a dataset on the fly, including fields, values and structure. Most likely, you will be using it for creating complex nested datasets.

Transformations using JavaScript are available under Mapping/Additional Transformations/JavaScript transformation.

Transform

There are actually two transformations: Transformation rules and After extract.

Important: you can use both transformations together.

Note: when Execute Transformation Rules after Extract is enabled, the Transformation rules transformation is executed after the extract is finished. We have preserved it for backward compatibility with flows that were created prior to the introduction of the After extract transformation. Moving forward, please use the After extract transformation instead.

Objects

The following objects can be referenced by name from JavaScript code:

| Object name | Class name / JavaDoc | Import |
| --- | --- | --- |
| dataSet | com.toolsverse.etl.common.DataSet | importPackage(com.toolsverse.etl.common); |
| currentRow | com.toolsverse.etl.common.DataSetRecord | importPackage(com.toolsverse.etl.common); |
| etlConfig | com.toolsverse.etl.core.config.EtlConfig | importPackage(com.toolsverse.etl.core.config); |
| scenario | com.toolsverse.etl.core.engine.Scenario | importPackage(com.toolsverse.etl.core.engine); |
| row | current 0-based row number | |

Methods

The following methods of the DataSet object are typically used when modifying a dataset.

| Method | Purpose |
| --- | --- |
| getRecordCount() | returns the number of records in the dataset |
| getFieldCount() | returns the number of fields in the dataset |
| DataSetRecord getRecord(index) | returns a record by index |
| DataSetData getData() | returns an array of records |
| DataSetFields getFields() | returns an array of fields (columns) |
| getFieldDef(name) | returns the field's definition by name |
| getFieldDef(index) | returns the field's definition by index |
| TypedKeyValue findField(name) | finds a field and, optionally, the nested dataset it belongs to |
| setData(DataSetData) | sets records |
| setFields(DataSetFields) | sets fields (columns) |
| getFieldValue(row, column) | returns the value of a field for the given row and column |
| getFieldValue(DataSetRecord, column) | returns the value of a field for the given record and column |
| getFieldValue(DataSetRecord, name) | returns the value of a field for the given record and field name |
| addField(FieldDef) | adds a field (column) |
| addField(name, value) | adds a field (column) and value |
| setFieldValue(DataSetRecord, column, value) | sets the field value for the given record and given column number |
| setFieldValue(row, column, value) | sets the field value for the given row and given column number |
| setValue(DataSetRecord, name, value) | sets the field value for the given record, adds a new field if needed |
| getActualData() | gets the data in its internal format. Currently used for HL7 messages only |
| setActualData(Object) | sets the data in its internal format. Currently used for HL7 messages only |

The example below demonstrates how to traverse the original dataset and create a brand new one, which will then be used to create a destination file. The After extract transformation is used for this.

importPackage(com.toolsverse.etl.common);

var stageDataSet = new DataSet();

var rows = dataSet.getRecordCount();

for (row = 0; row < rows; row++) {
    var record = dataSet.getRecord(row);

    stageDataSet.addField("Name.FirstName", dataSet.getFieldValue(record, "firstName"));
    stageDataSet.addField("Name.LastName", dataSet.getFieldValue(record, "lastName"));

    stageDataSet.addField("Age", dataSet.getFieldValue(record, "age"));
    stageDataSet.addField("City", dataSet.getFieldValue(record, "city"));
    stageDataSet.addField("State", dataSet.getFieldValue(record, "state"));
    stageDataSet.addField("ZipCode", dataSet.getFieldValue(record, "postalCode"));
    stageDataSet.addField("Addr1", dataSet.getFieldValue(record, "streetAddress"));

    var phones = dataSet.getFieldValue(record, "phoneNumbers");

    if (phones != null && phones.getRecordCount() > 0) {
        var phonesCount = phones.getRecordCount();

        for (phoneRow = 0; phoneRow < phonesCount; phoneRow++) {
           stageDataSet.addField("Phone.Type", 
              dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers_type", '.', phoneRow));
           stageDataSet.addField("Phone.Num", 
              dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers", '.', phoneRow));             
        }  
    }  
}  

dataSet.setFields(stageDataSet.getFields());

dataSet.setData(stageDataSet.getData());

Creating complex XML and JSON documents

Note: this technique can be used to create documents of any supported type. XML and JSON are the most commonly used data exchange formats.

Mapping fields in flat files (like CSV) or relational databases is easy and can be accomplished without writing a single line of code by using the Mapping Editor. When the destination is a complex document (especially XML), with multiple layers of nesting and complicated formatting (for example, a mix of attributes and tags), it is better to use a JavaScript transformation.

Note: the Mapping Editor can be used to create complex XML and JSON documents with a well-defined structure. Below, we will look at some extreme cases where the destination document is so complicated that using the drag-and-drop editor is not an option.

When Etlworks Integrator parses the source file, it creates an internal data structure, called a DataSet. Its methods can be used to create a new dataset, which then will be used by the system to automatically create a destination file.

Step 1. Create a flow, where the destination is a file (JSON or XML).

Step 2. Select Mapping->Parameters and disable Streaming.

Step 3. Enter the transformational JavaScript code into the After Extract field.

Overview

// create a staging dataset
var root = new DataSet();

// create fields and data in the staging dataset
...

// update the dataset, which will be used to create the final document, by copying data and fields from the staging dataset
dataSet.setFields(root.getFields());
dataSet.setData(root.getData());

Techniques

Understanding the DataSet

A DataSet is just like a database table: it has fields (DataSetFields) and rows of data (DataSetData). Each value in a row can be either a generic value (integer, string, date, etc.) or a DataSet itself. This allows datasets to have an infinitely complex nested structure, supporting all types of real-life use cases.

Fields and Data

When creating a staging dataset, you will be traversing the original dataset, which was populated during the parsing, then updating fields and data in the staging dataset.

Read about the DataSet methods, available when traversing the original dataset and updating the staging dataset.

An example of traversing the original dataset and updating the staging dataset:

for (row = 0; row < allGenres.getRecordCount(); row++) {
    genre.addField("genre", allGenres.getFieldValue(row, 2));
}  

Renaming Field

It is possible to programmatically rename the field in the document.

Original XML

   <SourceID>123456</SourceID>

New XML

   <ID>123456</ID>

To rename the field:

// get the reference to the field. Use actual data set
var fld = dataSet.getFieldDef("SourceID");

// set the name of the field and the real name of the field as it should appear in the created document. 
fld.setName("ID");
fld.setRealName("ID");

Hiding Field

It is possible to programmatically hide the field in the document.

Original XML

   <ID name="abc" value="xyz">123456</ID>

New XML

   <ID value="xyz">123456</ID>

To hide the field in the document:

// get the reference to the field. Use actual data set and field name
var fld = dataSet.getFieldDef("abc");

// hide the field
fld.setVisible(false);

Using SQL

SQL Lookup is a powerful technique, which can be used to extract data from the original dataset, even multiple levels deep inside a nested structure.

Read how to write SQL queries, which can be executed on any dataset.

Use CommonEtlUtils.executeSql(dataSet, sql) to execute SQL on any dataset and return another dataset.

The example below demonstrates how to use SQL lookup to extract deeply-nested data from the original dataset.

var allGenres = CommonEtlUtils.executeSql(dataSet, "select localeCode, * from versions.metaData.genre.values.SMAT.value where localeCode = 'US'");

Notice in the FROM clause, versions.metaData.genre.values.SMAT.value, that each group separated by a "." represents another dimension (nesting level) in the nested dataset.

Using Lookup

When the data needed to update a staging dataset is located somewhere else, not in the original dataset, you can use lookup. In addition to SQL, lookup requires a connection name, and a data object name (for example, a database table name, or a file name).

Using UNION

UNION is available when using SQL, with one restriction: while the keyword UNION is supported, Integrator in fact always executes UNION ALL.

To perform UNION from the JavaScript:

var people = CommonEtlUtils.executeSql(dataSet, 'select languageCode,displayName as "Name", "actor" as "Job" from versions.metaData.castAndCrew.actors.value where languageCode = "ENG" union select languageCode,displayName as "Name", "producer" as "Job" from versions.metaData.castAndCrew.producers.value where languageCode = "ENG"');

var crew = null;

// merging crew with actors and producers, the crew can be empty.
try {
  // get crew
  crew = CommonEtlUtils.executeSql(dataSet, 'select languageCode,displayName as "Name", role as "Job" from versions.metaData.castAndCrew.crew.value where languageCode = "ENG"');

  if (crew != null && crew.getRecordCount() > 0) {
       people = CommonEtlUtils.union(people, crew, null, true, null, null);
  }  
}
catch (e) {
  etlConfig.log(e);
}  

Using INTERSECT and MINUS

Similarly to UNION, the INTERSECT and MINUS set operations are available when using SQL. The same restriction applies: while the keywords INTERSECT and MINUS are available, Integrator always executes INTERSECT ALL and MINUS ALL.

To perform INTERSECT from the JavaScript:

var crew = CommonEtlUtils.executeSql(dataSet, 'select languageCode,displayName as "Name", role as "Job" from versions.metaData.castAndCrew.crew.value where languageCode = "ENG"');

people = CommonEtlUtils.intersect(people, crew, null);

To perform MINUS from the JavaScript:

var crew = CommonEtlUtils.executeSql(dataSet, 'select languageCode,displayName as "Name", role as "Job" from versions.metaData.castAndCrew.crew.value where languageCode = "ENG"');

people = CommonEtlUtils.minus(people, crew, null);

Using JOIN and other transformations

Other high level transformations, such as JOIN, can be executed by calling static methods of the CommonEtlUtils.

Creating complex XML documents

In addition to the rules and techniques used when creating complex XML and JSON documents, the following XML-specific techniques are available.

Root XML Tag

You can define a Root XML tag when configuring a destination for the transformation in the XML format.

XML Namespaces

You can define XML namespaces when configuring a destination for the transformation in the XML format.

Field is an XML attribute

The example below demonstrates how to add a field, and define it as an XML attribute:

staging.addField("ContentID", "11-22-33-44");
staging.getFieldDef("ContentID").setAttribute(true);

Which will produce the following XML:

<tag ContentID="11-22-33-44"/>

Creating fields with both attributes AND a value

If you need to create an XML tag which has both: attribute(s) and a value, as shown below:

<tag ContentID="11-22-33-44" anotherattr="attr value">some value</tag>

use the following technique, as in the example below:

var tag = new DataSet();
tag.setName("tag");

tag.addField("ContentID", "11-22-33-44");
tag.getFieldDef("ContentID").setAttribute(true);

tag.addField("anotherattr", "attr value");
tag.getFieldDef("anotherattr").setAttribute(true);

tag.addField("value", "some value");
tag.getFieldDef("value").setTagValueField(true);

staging.addField("tag", tag);

XML tag with CDATA section

If you need to create an XML tag with a CDATA section, as shown below:

<tag><![CDATA[a lot of text]]></tag>

use the following JavaScript:

owner.getFieldDef("field name").setCdata(true);

Rules for repeating elements

By default, any dataset will be serialized to XML as a root element, followed by the row tags.

<owner>
   <row/>
   ....
   <row/>
</owner>   

If you want to hide the owner tag in XML, use the following technique, as shown in the example below:

owner.getFieldDef("row field name").setHiddenInNested(true);

Given the example above, the resulting XML is shown below. Notice that there are no longer any <owner> tags:

   <row/>
   ....
   <row/>

Creating complex JSON documents

Once again, in addition to the rules and techniques used when creating complex XML and JSON documents, the following JSON-specific techniques are available.

Defining a field as a JSON array

When creating JSON documents, the system decides whether to create a JSON array based on the number of elements in the nested dataset: arrays are used if the number of rows is more than 1.

You can manually enforce or disable the creation of arrays by setting the property FieldDef#setJsonArray(bool).

staging.getFieldDef("phones").setJsonArray(true); 

Filtering data

It is possible to filter out certain rows of the source dataset, using JavaScript.

Filter is available under Mapping/Additional Transformations/Filter

Filter data

Important: a JavaScript Filter must return true (include record) or false (exclude record).

Field values can be referenced from the JavaScript code using the quoted "field name" syntax.

Note: if Use SQL-like syntax for filter is selected, you can use the operators available in SQL, for example <>, and, or, etc. Otherwise, only JavaScript operators, for example !=, &&, ||, are allowed.

An example of an SQL-like filter expression:

"Importer Name" <> null and "Importer Name".contains("test");

An example of a JavaScript-like filter expression:

"Importer Name" != null && "Importer Name".contains("test");
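The JavaScript variant behaves like an ordinary boolean expression evaluated once per record. The following plain-JavaScript simulation illustrates the semantics; the records and the keepRecord function are made up for illustration (in Integrator the engine supplies the field values, and Nashorn strings also expose Java's contains(), where plain JavaScript uses indexOf or includes):

```javascript
// Simulates evaluating the JavaScript-style filter expression
//   "Importer Name" != null && "Importer Name".contains("test")
// for each record; true keeps the record, false drops it.
function keepRecord(importerName) {
  return importerName != null && importerName.indexOf('test') >= 0;
}

var records = [
  { 'Importer Name': 'test import' },
  { 'Importer Name': null },
  { 'Importer Name': 'production' }
];

var kept = records.filter(function (r) {
  return keepRecord(r['Importer Name']);
});
// Only the first record survives the filter.
```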

Executing SQL INSERT, UPDATE, DELETE or MERGE conditionally

When working with databases, where the database is a destination, it is possible to execute automatically prepared SQL statements conditionally. For example: UPDATE records if a certain flag is set; otherwise, MERGE them.

For databases, this feature is available under Mapping/Parameters/Action Conditions.


Field values can be referenced from the JavaScript code using the {field name} syntax.

Possible actions are: none, insert, update, delete, merge.

The following objects can be referenced by name from the JavaScript code:

Object name Class name / JavaDoc Import
dataSet com.toolsverse.etl.common.DataSet importPackage(com.toolsverse.etl.common);
currentRow com.toolsverse.etl.common.DataSetRecord importPackage(com.toolsverse.etl.common);
etlConfig com.toolsverse.etl.core.config.EtlConfig importPackage(com.toolsverse.etl.core.config);
scenario com.toolsverse.etl.core.engine.Scenario importPackage(com.toolsverse.etl.core.engine);
destination com.toolsverse.etl.core.engine.Destination importPackage(com.toolsverse.etl.core.engine);

Example of JavaScript which sets the action, based on conditions:

{Types}.equals('U') ? 'update' : 'insert';
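The expression is evaluated per row, and its string result selects the action. A plain-JavaScript sketch of that evaluation (the Types values below are assumptions for illustration; in Integrator the engine substitutes {Types} with the field value):

```javascript
// Simulates evaluating the action-condition expression
//   {Types}.equals('U') ? 'update' : 'insert'
// for each row; the returned string selects the SQL statement to execute.
function chooseAction(types) {
  return types === 'U' ? 'update' : 'insert';
}

var actions = ['U', 'N', 'U'].map(chooseAction);
// actions → ['update', 'insert', 'update']
```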

Preprocessing source messages

When Integrator reads the source data, for example from a file or web service, it can access the original source message (as long as it is in text format) and modify it at will.

The technique is explained here.

The source message can be referenced via the message variable.

You can assign the new, modified message to the value variable:

value = message.substring(12);

Or simply make the last line of the JavaScript code an expression which returns the modified message:

message.substring(12);
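For example, a preprocessor that strips a fixed-length header from the message could look like the sketch below. It is simulated here as a plain function with a made-up 12-character header; in Integrator the engine provides the message variable and reads value (or the last expression):

```javascript
// Simulates a preprocessor: strip a 12-character header
// before the payload is handed to the parser.
function preprocess(message) {
  // equivalent of: value = message.substring(12);
  return message.substring(12);
}

var raw = 'HEADER-0001:payload data';
var value = preprocess(raw);
// value → 'payload data'
```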

Filtering and modifying rows in a source CSV file

It is possible to filter out or modify some of the rows in the CSV file before passing the data to the transformation.

The technique is explained here.

The actual filtering is performed by an object, which can be referenced via the filter variable.

The following getters and setters are available for the filter object:

// Sets the code which controls the parsing.
// Possible code values:
// TextFilter.TextFilterCode.STOP - stop the parsing
// TextFilter.TextFilterCode.REJECT - skip the line
// TextFilter.TextFilterCode.CONTINUE - continue parsing (the default)
filter.setCode(code);

// Returns the current line number.
filter.getIndex();

// Sets the current line number.
filter.setIndex(index);

// Returns the current actual line number. It differs from the line
// number if Use First Row for Data is enabled.
filter.getActualIndex();

// Returns the current line.
filter.getLine();

// Sets the current line.
filter.setLine(line);

In the example below, we use the filter to skip (REJECT) the first line of the file and reset the line number.

if (filter.getActualIndex() == 1) {
   filter.setCode(TextFilter.TextFilterCode.REJECT);
   filter.setIndex(1);
}
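The way a parser honors the three codes can be sketched in plain JavaScript. The loop, the string codes, and the sample lines below are illustrative assumptions, not the Toolsverse implementation:

```javascript
// Minimal simulation of a line parser honoring the filter codes:
// CONTINUE keeps the line, REJECT skips it, STOP ends parsing entirely.
var CONTINUE = 'CONTINUE', REJECT = 'REJECT', STOP = 'STOP';

function parse(lines, filterFn) {
  var kept = [];
  for (var i = 0; i < lines.length; i++) {
    var code = filterFn(i, lines[i]);
    if (code === STOP) break;
    if (code === REJECT) continue;
    kept.push(lines[i]);
  }
  return kept;
}

// Skip the first line (like the REJECT example above)
// and stop parsing when an 'END' marker is reached.
var kept = parse(['header', 'a', 'b', 'END', 'c'], function (index, line) {
  if (index === 0) return REJECT;
  if (line === 'END') return STOP;
  return CONTINUE;
});
// kept → ['a', 'b']
```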

Storing and accessing an original raw message in its native format

When Integrator reads source data, for example from a file or web service, it is possible for it to store the original message and then use it anywhere in the JavaScript code.

To store the original raw message, enter the following code into the Preprocessor field on the Format page. Use and remember the unique key.

importPackage(com.toolsverse.config);

SystemConfig.instance().getProperties().put("unique key", message);

value = message;

To access the original raw message from anywhere in the JavaScript code:

importPackage(com.toolsverse.config);

var message = SystemConfig.instance().getProperties().get("unique key");
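The properties object behaves like a key-value store shared across the flow. In plain JavaScript, the pattern is equivalent to the simulation below (a plain object standing in for the SystemConfig properties; the key and message are made up):

```javascript
// Simulates SystemConfig.instance().getProperties() with a plain object:
// one step stores the raw message under a unique key,
// and a later step retrieves it by the same key.
var properties = {};

// Preprocessor step: store the raw message.
var message = '<order id="1">raw payload</order>';
properties['my-unique-key'] = message;

// Later step, anywhere in the flow: retrieve it.
var stored = properties['my-unique-key'];
```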

Working with data in special formats, such as HL7

When messages in some of the more complicated formats, such as HL7, are parsed or created, they are stored as "objects", using one of the available object models.

There are getter and setter methods in the com.toolsverse.etl.common.DataSet class, which allow the developer to access these special types of objects.

Method Purpose
getActualData() gets the data in an internal format. Currently used for HL7 messages only
setActualData(...) sets the data in an internal format. Currently used for HL7 messages only

This example demonstrates how to use actualData when processing HL7 files.