Scripting

Recipes

25+ patterns lifted from production flows. Examples in JavaScript except where Python adds something specific. Every recipe runs in the inline editor — no external setup.


Mapping fields

Get and set a field's value in Mapping

Setting a field's value is available under field-to-field Mapping. Three ways to read; two ways to write.

Read

JavaScript
// Most explicit — works in any field function
value = dataSet.getFieldValue(currentRow, "field name");

// Shorthand for the current field in context (only inside this field's function)
value = fieldValue;

// Token form — slightly slower (preprocessed)
value = {field name};

Write

JavaScript
dataSet.setFieldValue(currentRow, "field name", fieldValue);

Set a field's value using native SQL (inline SQL)

If the destination is a relational database, you can compute a column value with the database's own SQL dialect — pushed down by the engine. Faster than JS/Python because there's no context switch.

Setup

  1. Make sure Use Bind Variables is enabled (default).
  2. In Mapping, click the pen button to open the field function editor.
  3. Change the language of the field function editor to SQL.
  4. Reference any column as {column_name}; flow / global vars as {VARNAME} (uppercase).

Example

SQL (field function)
LOWER({first_name}) || '-' || LOWER({last_name})

The engine generates the destination INSERT/UPDATE/MERGE with the SQL inlined; bind variables substitute the actual column values at execution time.

Anonymize / randomize a column

etlConfig.randomizer(locale) exposes a built-in anonymizer with locale-aware data providers (names, emails, addresses, …).

Pure random

JavaScript (Mapping → calculated field)
value = etlConfig.randomizer("en").internet().emailAddress();

Stable (same source value → same anonymous value within the flow)

JavaScript
var randomizer = etlConfig.randomizer("en");
value = randomizer.reuse(fieldValue, randomizer.name().firstName());

Locale examples

JavaScript
etlConfig.randomizer("de").name().firstName();
etlConfig.randomizer("en").name().lastName();
etlConfig.randomizer("en").internet().emailAddress();

Flow control

Run JavaScript or Python as a flow

The dedicated Execute JavaScript or Python flow type runs only your code — no source, no destination. Compose it inside a nested flow alongside other flow types.

Inside the script, etlConfig and scenario are available by name. Useful for orchestration, custom HTTP calls, sending notifications, computing complex inputs for downstream flows.

Conditional step in a workflow

Any workflow step can run conditionally based on a script that returns true (run) or false (skip). Useful when the next step depends on:

Configured at the workflow step level — see the platform docs on conditional execution.

Loop a flow

A workflow step can loop — running the same flow repeatedly with different parameters or input data. Common cases: paginated APIs, files matching a wildcard, records returned by a query, fixed N iterations. Configured at the workflow step level.


Dataset transforms

Five transformation phases

Scripting transformations run at specific points in the source-to-destination pipeline. Pick the one that matches when you need to act:

PhaseWhen it runsTypical use
Before Prepare Source QueryBefore the source query is merged with flow variablesModify the source SQL programmatically; set flow variable values
Before ExtractBefore extraction startsSet globals / flow vars, log, prep state
For Each RowFor each extracted rowModify field values on the fly (alternative to per-field Mapping)
After ExtractAfter extraction is complete (entire dataset in memory)Build a brand-new dataset from the extracted one. Requires streaming disabled.
After LoadLast step, after data has loaded into the destinationCleanup, set vars, log, send notifications

All five can be used together in the same transformation. Configure under Transformation → MAPPING → Additional Transformations.

Modify the source query before it runs

JavaScript (Before Prepare Source Query)
source.setSql('select * from table where id > {VAR_NAME}');
scenario.getVariable('VAR_NAME').setValue(value);

DataSet API — methods you'll actually use

MethodPurpose
getRecordCount()Number of records in the dataset
getFieldCount()Number of fields (columns)
getRecord(index)Returns a DataSetRecord by index
getData()Array of records
getFields()Array of fields (column metadata)
getFieldDef(name) · getFieldDef(index)Field definition by name or position
findField(name)TypedKeyValue<FieldDef, DataSet> — finds a field and the nested dataset it belongs to (if any)
setData(DataSetData)Replace records
setFields(DataSetFields)Replace fields (columns)
getFieldValue(row, column)Value at row + column position
getFieldValue(record, column)Value from a specific record + column
getFieldValue(record, name)Value from a specific record + field name
addField(FieldDef) · addField(name, value)Add a column
setFieldValue(record, column, value) · setFieldValue(row, column, value)Set a value
setValue(record, name, value)Set by field name; adds the field if it doesn't exist
getActualData() · setActualData(Object)Internal-format access (used for HL7)

Build a brand-new dataset from the source

Common pattern: traverse the extracted dataset, build a stage dataset with restructured fields (including flattening nested arrays), then replace the live one. Run as After Extract with streaming disabled.

JavaScript (After Extract)
var stageDataSet = new com.toolsverse.etl.common.DataSet();
var rows = dataSet.getRecordCount();

for (var row = 0; row < rows; row++) {
  var record = dataSet.getRecord(row);
  stageDataSet.addField("Name.FirstName", dataSet.getFieldValue(record, "firstName"));
  stageDataSet.addField("Name.LastName",  dataSet.getFieldValue(record, "lastName"));
  stageDataSet.addField("Age",            dataSet.getFieldValue(record, "age"));
  stageDataSet.addField("City",           dataSet.getFieldValue(record, "city"));
  stageDataSet.addField("State",          dataSet.getFieldValue(record, "state"));
  stageDataSet.addField("ZipCode",        dataSet.getFieldValue(record, "postalCode"));
  stageDataSet.addField("Addr1",          dataSet.getFieldValue(record, "streetAddress"));

  // Flatten a nested array
  var phones = dataSet.getFieldValue(record, "phoneNumbers");
  if (phones != null && phones.getRecordCount() > 0) {
    var n = phones.getRecordCount();
    for (var phoneRow = 0; phoneRow < n; phoneRow++) {
      stageDataSet.addField("Phone.Type",
        dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers_type", '.', phoneRow));
      stageDataSet.addField("Phone.Num",
        dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers", '.', phoneRow));
    }
  }
}

dataSet.setFields(stageDataSet.getFields());
dataSet.setData(stageDataSet.getData());

Lookup

Extract & lookup — query an external source per row

Pull a single value or an entire dataset from any connection while processing source records. Use the source connection (SELECT …), the destination connection (destination.SELECT …), or a third connection by name (connection_name.SELECT …).

Single field value — Extractor.lookup(...)

JavaScript
value = com.toolsverse.etl.core.engine.Extractor.lookup(
  etlConfig, scenario,
  "connection_for_directors_db",  // connection name
  "directors",                    // dataset name (any string)
  "first_name",                   // column to return
  "SELECT first_name FROM directors WHERE id = {ID}",  // sql, can be null
  new com.toolsverse.util.TypedKeyValue("ID",
    dataSet.getFieldValue(currentRow, 'director_id'))
);

Entire dataset — same call, different signature

JavaScript
var images = com.toolsverse.etl.core.engine.Extractor.lookup(
  etlConfig, scenario,
  "connection for lookup",
  "image",
  "SELECT filename AS url FROM property_pictures WHERE id = {ID}",
  new com.toolsverse.util.TypedKeyValue("ID",
    dataSet.getFieldValue(currentRow, 'picture_id'))
);

Cached vs non-cached

UseWhen
Extractor.lookup(...)Performance — cache results across rows
Extractor.lookupNoCache(...)Data changes between calls

Callable SQL lookup

For lookups that need stored procedures or PL/SQL-style blocks, use Extractor.lookupWithCallable(...):

JavaScript
var sql = 'begin\n' +
          '  select country from country where country_id = {country_id}\n' +
          'end';

value = com.toolsverse.etl.core.engine.Extractor.lookupWithCallable(
  etlConfig, scenario,
  'SQL Server',
  'country',
  'country',
  sql,
  new com.toolsverse.util.TypedKeyValue('country_id',
    dataSet.getFieldValue(currentRow, 'country_id'))
);

In-memory lookups (already-loaded datasets)

If the lookup data is already loaded in memory (a JSON / XML / CSV source, an API response), there's no need to round-trip through a database.

SQL on an in-memory dataset

JavaScript
// Single field
value = com.toolsverse.etl.common.CommonEtlUtils.getFieldValue(
  scenario.getSources().get("DIRECTORS").getDataSet(),
  "select last_name where id = {ID}",
  "last_name",
  new com.toolsverse.util.TypedKeyValue("ID",
    dataSet.getFieldValue(currentRow, 'director_id'))
);

// Entire dataset
var phones = com.toolsverse.etl.common.CommonEtlUtils.executeSql(
  scenario.getSources().get("PHONES").getDataSet(),
  "select * where area_code=412"
);

Filter expression on an in-memory dataset

Faster than SQL for flat datasets:

JavaScript
// Single field
value = com.toolsverse.etl.common.CommonEtlUtils.lookup(
  scenario.getSources().get("DIRECTORS").getDataSet(),
  "id=" + dataSet.getFieldValue(currentRow, 'director_id'),
  "last_name"
);

// Entire dataset
var phones = com.toolsverse.etl.common.CommonEtlUtils.filter(
  scenario.getSources().get("PHONES").getDataSet(),
  "area_code=412"
);

Filter & validate

Simple filter expression

Available under MAPPING → Additional Transformations → Filter. Reject rows by expression. Field values are accessible by name ("field name").

ModeOperators
JavaScript syntax!=, &&, ||, ===
Use SQL like syntax enabled<>, and, or, like
JavaScript Filter
"Importer Name" != null && "Importer Name".contains("test");
SQL-like Filter
"Importer Name" <> null and "Importer Name".contains("test")

Returns true → keep the row, false → drop it.

Filter / modify rows in a source CSV file

Configurable in CSV Format. The runtime injects a filter object you control via getters/setters:

JavaScript
var javaImports = new JavaImporter(com.toolsverse.etl.connector.text);

with (javaImports) {
  // Skip the second line (1-based actual index)
  if (filter.getActualIndex() == 1) {
    filter.setCode(TextFilter.TextFilterCode.REJECT);
    filter.setIndex(1);
  }
}

Codes: STOP (halt parsing), REJECT (skip line), CONTINUE (default).

Validate — TaskResult codes

Available under MAPPING → Additional Transformations → Validate. Different from Filter: you can halt the entire flow, reject the entire dataset, or just drop a single row.

CodeEffect
TaskResult.HALTStop the flow with an exception
TaskResult.STOPSkip the source-to-destination transformation entirely (no rows loaded)
TaskResult.REJECTSkip the current row only
TaskResult.CONTINUEAccept the row (default)

Reject a row when required fields are missing or malformed

JavaScript (Validate)
var postalCode = dataSet.getFieldValue(currentRow, 'postal_code');
var phone      = dataSet.getFieldValue(currentRow, 'phone');

if (Utils.isNothing(postalCode) || Utils.isNothing(phone) || !Utils.isNumber(postalCode)) {
  value = TaskResult.REJECT;
} else {
  value = TaskResult.CONTINUE;
}

Swap REJECT for HALT to stop the entire flow, or STOP to skip the whole transformation.


Source messages

Preprocess the source message

Modify the raw source document before Etlworks parses it — useful for fixing encoding, supplying default content for empty payloads, or stringifying nested nodes.

Setup

  1. Create an XML, JSON, or CSV Format.
  2. Set Transformation type to preprocessor.
  3. Write JavaScript that modifies the message variable.
JavaScript (Format → Preprocessor)
// Strip the first 12 chars (e.g. a header byte-order mark or framing)
value = message.substring(12);

// Or rely on last-expression
message.substring(12);

Store the original source message

Sometimes you want both the parsed dataset and the raw message. Stash the raw message in a global variable from the preprocessor, then read it anywhere downstream:

JavaScript (preprocessor)
com.toolsverse.config.SystemConfig.instance().getProperties()
  .put("raw_source", message);
value = message;
JavaScript (anywhere downstream)
var raw = com.toolsverse.config.SystemConfig.instance().getProperties()
  .get("raw_source");

Serialize

DataSet → JSON / XML / CSV string

One call, any of the three formats. Useful when you want to keep a nested node as a single string field, or generate a payload to send via HTTP.

JavaScript
var str = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(
  dataSet,
  'fully.qualified.connector.class.name',
  parameters  // string "k=v;k=v" or null for defaults
);

Connectors

FormatClass
JSONcom.toolsverse.etl.connector.json.JsonConnector
XMLcom.toolsverse.etl.connector.xml.XmlObjectConnector
CSVcom.toolsverse.etl.connector.text.TextConnector

JSON parameters

ParameterValuesDefault
startarray · object · namedobject · variablearray
rootnameany stringnull
nestedarray · object · variablearray
forsqltrue / falsefalse

XML parameters

ParameterValuesDefault
rootnameany stringautomatic
namespacevalid namespacenull
xmlversion1.0 / 1.11.0
rowtagvalid XML tag nameautomatic
parseattrs · parserootattrs · parsecdata · forsqltrue / falsesee article

CSV parameters

ParameterValuesDefault
delimiterany character,
lineseparatoru Unix · w Windows · s automatics
charseparatorany character"
alwaysquote · escapequote · forsqltrue / falsefalse

Examples

JavaScript
var jsonStr = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(dataSet,
  'com.toolsverse.etl.connector.json.JsonConnector',
  'start=variable;nested=variable');

var xmlStr = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(dataSet,
  'com.toolsverse.etl.connector.xml.XmlObjectConnector',
  'rootname=products;rowtag=product');

var csvStr = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(dataSet,
  'com.toolsverse.etl.connector.text.TextConnector',
  'alwaysquote=true;escapequote=true');

Inside a SELECT

Etlworks can run SQL against any data object. You can serialize a nested dataset directly inside a SELECT:

SQL
SELECT com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(
         dataSet, 'fully.qualified.connector.class.name', parameters
       ) AS field_name,
       other_fields
FROM …

In SQL context, replace = with = and ; with ; inside the parameters string.

JSON / XML / CSV string → DataSet

The inverse of above — ingest a string in any of the three formats and get a DataSet:

JavaScript
var dataSet = com.toolsverse.etl.connector.ConnectorUtils.str2DataSet(
  null,                                      // pass null for the EtlConfig param
  'fully.qualified.connector.class.name',
  parameters,                                // same as for dataSet2Str
  encodedStr                                 // the input string
);

Connectors and parameters are identical to dataSet2Str.


SQL

Conditional SQL action per row

For database destinations, decide INSERT / UPDATE / DELETE / MERGE / nothing per row. Configure under MAPPING → Parameters → Action Conditions.

JavaScript (Action Conditions)
// If column "Types" equals 'U', update; otherwise insert
{Types}.equals('U') ? 'update' : 'insert';

Set the action to 'nothing' to skip the record entirely. Available actions: 'insert', 'update', 'delete', 'merge', 'nothing'.

Execute SQL from JavaScript or Python

Run arbitrary SQL against any connection wired into the flow — useful for status updates inside complex nested flows, or for writing rejected records to an errors table.

JavaScript
var ex = etlConfig.getLastException();
etlConfig.setLastException(null);

var props = com.toolsverse.config.SystemConfig.instance().getContextProperties();
var stagingConnection = etlConfig.getConnection("Staging Postgres");
var statusSql = "insert into feeds_status (filename, url, error, finished) " +
                "values (?,?,?,?)";

var message = null;
if (ex != null) {
  message = ex.getMessage();
  if (Utils.isNothing(message)) {
    message = com.toolsverse.util.Utils.getStackTraceAsString(ex);
  }
}

var when = new java.util.Date();
try {
  com.toolsverse.etl.sql.util.SqlUtils.executeSql(
    stagingConnection, statusSql,
    props.get('filename'),
    props.get('url'),
    message,
    when
  );
} catch (ex) {
  etlConfig.log(ex.toString());
}

Required imports: com.toolsverse.config, com.toolsverse.etl.sql.util. The connection must be part of the flow (either used in a transformation, or added as a named connection on the nested flow).


Connections in code

Load text data from any connection

Named connection

JavaScript
var images = com.toolsverse.etl.core.task.common.FileManagerTask.read(
  etlConfig, "connection_name", null
);

On-the-fly connection

JavaScript
var alias = new com.toolsverse.etl.common.Alias();
alias.setUrl("https://example.com/api/images");
alias.setTransport('http');

var images = com.toolsverse.etl.core.task.common.FileManagerTask.read(alias, null);

The last parameter is a filename — pass null for HTTP, use the actual filename for file connections.

Load binary data — readAsStream

Returns a java.io.InputStream. Common pattern: load an image from one connection, transform it, save to another.

JavaScript
var imageStream = null;
var imageStreamToSave = null;
var os = null;

try {
  imageStream = com.toolsverse.etl.core.task.common
    .FileManagerTask.readAsStream(etlConfig, 'image', null);

  var image    = com.toolsverse.imaging.ImageProcessor.instance().loadImage(imageStream);
  var newImage = com.toolsverse.imaging.ImageProcessor.instance()
                   .scale(image, Java.to([0.5, 0.5], "java.lang.Float[]"));

  os = new java.io.ByteArrayOutputStream();
  javax.imageio.ImageIO.write(newImage, "png", os);
  imageStreamToSave = new java.io.ByteArrayInputStream(os.toByteArray());

  var alias = new com.toolsverse.etl.common.Alias();
  alias.setUrl("logo.png");
  alias.setTransport('file');

  com.toolsverse.etl.core.task.common
    .FileManagerTask.write(alias, null, imageStreamToSave);

} finally {
  if (imageStream       != null) imageStream.close();
  if (os                != null) os.close();
  if (imageStreamToSave != null) imageStreamToSave.close();
}

Use a named connection in JavaScript

For a connection to be referenceable from JavaScript, it must be part of the flow:

HTML scraping with jsoup

Etlworks bundles jsoup — a full HTML5 parser. Load a page, parse, traverse the DOM, extract.

JavaScript

JavaScript
var html = com.toolsverse.etl.core.task.common.FileManagerTask.read(
  etlConfig, 'html', null
);
var doc   = org.jsoup.Jsoup.parse(html);
var files = new java.util.ArrayList();
etlConfig.setValue('files', files);

var root  = doc.select("h3 > a[name=annualproj]").get(0).parent().nextElementSibling();
var spans = root.select("ul.numbered li > span.formats");

for each (var span in spans) {
  var link  = span.nextElementSibling();
  var href  = link.attr("href").toString();
  var title = link.attr("title").toString();
  title = title.substring(title.indexOf('.') + 1).trim();
  files.add(new com.toolsverse.util.TypedKeyValue(href, title));
  etlConfig.log("href: " + href + ", title: " + title);
}

Python (same flow)

Python
from org.jsoup import Jsoup
from com.toolsverse.etl.core.task.common import FileManagerTask
from com.toolsverse.util import TypedKeyValue, Utils
from java.util import ArrayList

html = FileManagerTask.read(etlConfig, 'html', None)
doc  = Jsoup.parse(html)
files = ArrayList()
etlConfig.setValue('files', files)

root  = doc.select("h3 > a[name=annualproj]").get(0).parent().nextElementSibling()
spans = root.select("ul.numbered li > span.formats")

for span in spans:
    link  = span.nextElementSibling()
    href  = link.attr("href")
    title = link.attr("title")
    title = title[title.index(".") + 1:].strip()
    files.add(TypedKeyValue(href, title))
    etlConfig.log("href: " + href + ", title: " + title)

Utilities

Sequence generators

UniqueNumber — in-process, timestamp-based

Cache the same instance in etlConfig's key/value storage so the sequence stays consistent across invocations within a flow.

JavaScript
var uniqueNumber = etlConfig.getValue("uniqueNumber");
if (uniqueNumber == null) {
  uniqueNumber = new com.toolsverse.util.UniqueNumber();
  etlConfig.setValue("uniqueNumber", uniqueNumber);
}
value = uniqueNumber.nextNumber();

Algorithm: (System.currentTimeMillis() << 20) + n, where n is incremented each call.

RedisSequenceGenerator — cluster-wide

Backed by Redis — safe across parallel flows. Up to 10K numbers/sec.

JavaScript
// key = anything (e.g. table name), startValue = starting sequence number
var seq = com.toolsverse.io.RedisSequenceGenerator.instance()
            .nextNumber('my_table_name', 1);

Share parameters across flows

Variables on a flow are local to it (and its nested children). To share state across independent flows — e.g. parallel runners, sequential dependent flows, real-time stats collected via a webhook + listener — use the Redis-backed parameter store.

Set / get many at once

JavaScript
var params = new java.util.HashMap();
params.put("key1", "value1");
params.put("key2", "value2");

// anyid: any string (auditId / flowId / both); expireInSec: 1..86400 (24h max)
com.toolsverse.io.RedisSequenceGenerator.instance()
  .setParameters("anyid", params, expireInSec);

// key can be null — uses wildcard *
var got = com.toolsverse.io.RedisSequenceGenerator.instance()
  .getParameters("anyid", null);
var v1 = got.get("key1");

Set / get a single value

JavaScript
com.toolsverse.io.RedisSequenceGenerator.instance()
  .setParameter("anyid", "key1", "value1", expireInSec);

var v = com.toolsverse.io.RedisSequenceGenerator.instance()
  .getParameter("anyid", "key1");

Recommended keys

JavaScript
// Audit ID for the currently running flow
var auditId = com.toolsverse.config.SystemConfig.instance()
  .getEtlThreadContext().getRequestId();

// Flow ID
var flowId  = com.toolsverse.config.SystemConfig.instance()
  .getEtlThreadContext().getFlowId();

When collecting parameters via a webhook payload, the auditId is in JSON.parse(message).entity.auditId.

From an Integration Agent

An on-prem agent runs on a different network and can't reach Redis directly. Workaround: have a flow on the agent POST to /etl/rest/v1/parameters on the host instance via an Etlworks API connection. The platform writes the parameters to Redis on the agent's behalf.

Working with dates

Dates are java.util.Date under the hood. The most useful helpers:

NeedCode
Current timestamp as Datenew java.util.Date()
Current timestamp in millisSystem.currentTimeMillis()
Format a Date as a stringcom.toolsverse.util.Utils.date2Str(date, 'yyyy-MM-dd HH:mm:ss')
Parse a string into a Datecom.toolsverse.util.Utils.str2Date(string, defaultDate, format)
Reformat between two string formatsUtils.date2Str(Utils.str2Date(s, null, srcFmt), dstFmt)

Add / subtract

All on com.toolsverse.util.DateUtil. Negative second arg subtracts.

JavaScript
var d1 = com.toolsverse.util.DateUtil.addDays(new java.util.Date(), 1);
var d2 = com.toolsverse.util.DateUtil.addHours(new java.util.Date(), -3);
var d3 = com.toolsverse.util.DateUtil.addMinutes(new java.util.Date(), 30);
var d4 = com.toolsverse.util.DateUtil.addSeconds(new java.util.Date(), 100);
var d5 = com.toolsverse.util.DateUtil.addMillis(new java.util.Date(), 10000);
var d6 = com.toolsverse.util.DateUtil.addMonths(new java.util.Date(), 1);
var d7 = com.toolsverse.util.DateUtil.addYears(new java.util.Date(), 1);

Date in a specific timezone

JavaScript
var calendar = java.util.Calendar.getInstance(
  java.util.TimeZone.getTimeZone("EST")
);
var formatted = calendar.get(java.util.Calendar.YEAR) + '/' +
                calendar.get(java.util.Calendar.MONTH) + '/' +
                calendar.get(java.util.Calendar.DATE);
Full format-letter and timezone-ID tables

Date format letters (y, M, d, H, m, s, S, z, Z, X, …) follow the standard Java SimpleDateFormat conventions. Timezone IDs are the tz database values. Both full tables are mirrored at Work with dates — ~150 KB of reference content we don't duplicate here.

Home URL of the app

Configured under Settings → General → Home URL. Useful in emails, environment-detection logic, building absolute links.

JavaScript
var homeUrl = etlConfig.getHomeUrl();