Recipes
25+ patterns lifted from production flows. Examples in JavaScript except where Python adds something specific. Every recipe runs in the inline editor — no external setup.
Mapping fields
Get and set a field's value in Mapping
Getting and setting a field's value is done in field-to-field Mapping. Three ways to read; one way to write.
Read
// Most explicit — works in any field function
value = dataSet.getFieldValue(currentRow, "field name");
// Shorthand for the current field in context (only inside this field's function)
value = fieldValue;
// Token form — slightly slower (preprocessed)
value = {field name};
Write
dataSet.setFieldValue(currentRow, "field name", fieldValue);
Set a field's value using native SQL (inline SQL)
If the destination is a relational database, you can compute a column value with the database's own SQL dialect — pushed down by the engine. Faster than JS/Python because there's no context switch.
Setup
- Make sure Use Bind Variables is enabled (default).
- In Mapping, click the pen button to open the field function editor.
- Change the language of the field function editor to SQL.
- Reference any column as {column_name}; flow / global variables as {VARNAME} (uppercase).
Example
LOWER({first_name}) || '-' || LOWER({last_name})
The engine generates the destination INSERT/UPDATE/MERGE with the SQL inlined; bind variables substitute the actual column values at execution time.
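The token substitution the engine performs can be pictured in plain JavaScript. This is a hypothetical illustration only — the real engine uses bind variables rather than inlining literal values:

```javascript
// Hypothetical sketch of {token} substitution into a SQL template.
// The real engine binds values; this inlines them for illustration.
function inlineTokens(sqlTemplate, row) {
  return sqlTemplate.replace(/\{(\w+)\}/g, function (match, name) {
    return name in row ? String(row[name]) : match; // unknown tokens pass through
  });
}

var sql = inlineTokens(
  "LOWER({first_name}) || '-' || LOWER({last_name})",
  { first_name: "Ada", last_name: "Lovelace" }
);
// sql === "LOWER(Ada) || '-' || LOWER(Lovelace)"
```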
Anonymize / randomize a column
etlConfig.randomizer(locale) exposes a built-in anonymizer with locale-aware data providers (names, emails, addresses, …).
Pure random
value = etlConfig.randomizer("en").internet().emailAddress();
Stable (same source value → same anonymous value within the flow)
var randomizer = etlConfig.randomizer("en");
value = randomizer.reuse(fieldValue, randomizer.name().firstName());
Locale examples
etlConfig.randomizer("de").name().firstName();
etlConfig.randomizer("en").name().lastName();
etlConfig.randomizer("en").internet().emailAddress();
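The reuse(...) semantics — the same source value always maps to the same anonymous value — amount to caching the first generated fake per distinct input. A pure-JS sketch (makeStableAnonymizer is a hypothetical helper, not the platform API):

```javascript
// Hypothetical sketch of "stable" anonymization: the first fake value
// generated for a source value is cached and returned on every repeat.
function makeStableAnonymizer(generate) {
  var cache = {};
  return function (sourceValue) {
    var key = String(sourceValue);
    if (!(key in cache)) cache[key] = generate();
    return cache[key];
  };
}

var i = 0;
var anon = makeStableAnonymizer(function () { return "name-" + (i++); });
anon("alice"); // "name-0"
anon("bob");   // "name-1"
anon("alice"); // "name-0" again — same source value, same fake value
```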
Flow control
Run JavaScript or Python as a flow
The dedicated Execute JavaScript or Python flow type runs only your code — no source, no destination. Compose it inside a nested flow alongside other flow types.
Inside the script, etlConfig and scenario are available by name. Useful for orchestration, custom HTTP calls, sending notifications, computing complex inputs for downstream flows.
Conditional step in a workflow
Any workflow step can run conditionally based on a script that returns true (run) or false (skip). Useful when the next step depends on:
- The result of a previous step
- The presence of specific data
- Workflow parameters or variables
- External system state
Configured at the workflow step level — see the platform docs on conditional execution.
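A condition script is just an expression that evaluates to true or false. For instance, running a step only when the previous step produced rows might look like this (the variable names are illustrative, not platform-defined):

```javascript
// Hypothetical condition: run this step only if the previous step
// reported at least one processed row and no error flag is set.
function shouldRun(vars) {
  var rows = parseInt(vars.ROWS_PROCESSED || "0", 10);
  var failed = vars.LAST_STEP_FAILED === "true";
  return rows > 0 && !failed;
}

shouldRun({ ROWS_PROCESSED: "42" });                          // true  → run
shouldRun({ ROWS_PROCESSED: "0" });                           // false → skip
shouldRun({ ROWS_PROCESSED: "5", LAST_STEP_FAILED: "true" }); // false → skip
```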
Loop a flow
A workflow step can loop — running the same flow repeatedly with different parameters or input data. Common cases: paginated APIs, files matching a wildcard, records returned by a query, fixed N iterations. Configured at the workflow step level.
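The paginated-API case reduces to "run the same flow once per page until the cursor runs out". A minimal sketch of that driver logic, with fetchPage standing in for one flow iteration (all names here are hypothetical):

```javascript
// Hypothetical pagination driver: invoke the same "flow" once per page
// until the API stops returning a next-page cursor.
function drainPages(fetchPage) {
  var all = [];
  var cursor = null;
  do {
    var page = fetchPage(cursor); // { items: [...], next: cursorOrNull }
    all = all.concat(page.items);
    cursor = page.next;
  } while (cursor != null);
  return all;
}

// Fake three-page API for illustration
var pages = { start: { items: [1, 2], next: "p2" },
              p2:    { items: [3],    next: "p3" },
              p3:    { items: [4, 5], next: null } };
var items = drainPages(function (c) { return pages[c || "start"]; });
// items → [1, 2, 3, 4, 5]
```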
Dataset transforms
Five transformation phases
Scripting transformations run at specific points in the source-to-destination pipeline. Pick the one that matches when you need to act:
| Phase | When it runs | Typical use |
|---|---|---|
| Before Prepare Source Query | Before the source query is merged with flow variables | Modify the source SQL programmatically; set flow variable values |
| Before Extract | Before extraction starts | Set globals / flow vars, log, prep state |
| For Each Row | For each extracted row | Modify field values on the fly (alternative to per-field Mapping) |
| After Extract | After extraction is complete (entire dataset in memory) | Build a brand-new dataset from the extracted one. Requires streaming disabled. |
| After Load | Last step, after data has loaded into the destination | Cleanup, set vars, log, send notifications |
All five can be used together in the same transformation. Configure under Transformation → MAPPING → Additional Transformations.
Modify the source query before it runs
// Replace the source SQL before it is merged with flow variables
source.setSql('select * from table where id > {VAR_NAME}');
// Set the flow variable referenced by the query
scenario.getVariable('VAR_NAME').setValue(value);
DataSet API — methods you'll actually use
| Method | Purpose |
|---|---|
| getRecordCount() | Number of records in the dataset |
| getFieldCount() | Number of fields (columns) |
| getRecord(index) | Returns a DataSetRecord by index |
| getData() | Array of records |
| getFields() | Array of fields (column metadata) |
| getFieldDef(name) · getFieldDef(index) | Field definition by name or position |
| findField(name) | TypedKeyValue<FieldDef, DataSet> — finds a field and the nested dataset it belongs to (if any) |
| setData(DataSetData) | Replace records |
| setFields(DataSetFields) | Replace fields (columns) |
| getFieldValue(row, column) | Value at row + column position |
| getFieldValue(record, column) | Value from a specific record + column |
| getFieldValue(record, name) | Value from a specific record + field name |
| addField(FieldDef) · addField(name, value) | Add a column |
| setFieldValue(record, column, value) · setFieldValue(row, column, value) | Set a value |
| setValue(record, name, value) | Set by field name; adds the field if it doesn't exist |
| getActualData() · setActualData(Object) | Internal-format access (used for HL7) |
Build a brand-new dataset from the source
Common pattern: traverse the extracted dataset, build a stage dataset with restructured fields (including flattening nested arrays), then replace the live one. Run as After Extract with streaming disabled.
var stageDataSet = new com.toolsverse.etl.common.DataSet();
var rows = dataSet.getRecordCount();
for (var row = 0; row < rows; row++) {
var record = dataSet.getRecord(row);
stageDataSet.addField("Name.FirstName", dataSet.getFieldValue(record, "firstName"));
stageDataSet.addField("Name.LastName", dataSet.getFieldValue(record, "lastName"));
stageDataSet.addField("Age", dataSet.getFieldValue(record, "age"));
stageDataSet.addField("City", dataSet.getFieldValue(record, "city"));
stageDataSet.addField("State", dataSet.getFieldValue(record, "state"));
stageDataSet.addField("ZipCode", dataSet.getFieldValue(record, "postalCode"));
stageDataSet.addField("Addr1", dataSet.getFieldValue(record, "streetAddress"));
// Flatten a nested array
var phones = dataSet.getFieldValue(record, "phoneNumbers");
if (phones != null && phones.getRecordCount() > 0) {
var n = phones.getRecordCount();
for (var phoneRow = 0; phoneRow < n; phoneRow++) {
stageDataSet.addField("Phone.Type",
dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers_type", '.', phoneRow));
stageDataSet.addField("Phone.Num",
dataSet.getFieldValue(record, "phoneNumbers.phoneNumbers", '.', phoneRow));
}
}
}
dataSet.setFields(stageDataSet.getFields());
dataSet.setData(stageDataSet.getData());
Lookup
Extract & lookup — query an external source per row
Pull a single value or an entire dataset from any connection while processing source records. Use the source connection (SELECT …), the destination connection (destination.SELECT …), or a third connection by name (connection_name.SELECT …).
Single field value — Extractor.lookup(...)
value = com.toolsverse.etl.core.engine.Extractor.lookup(
etlConfig, scenario,
"connection_for_directors_db", // connection name
"directors", // dataset name (any string)
"first_name", // column to return
"SELECT first_name FROM directors WHERE id = {ID}", // sql, can be null
new com.toolsverse.util.TypedKeyValue("ID",
dataSet.getFieldValue(currentRow, 'director_id'))
);
Entire dataset — same call, different signature
var images = com.toolsverse.etl.core.engine.Extractor.lookup(
etlConfig, scenario,
"connection for lookup",
"image",
"SELECT filename AS url FROM property_pictures WHERE id = {ID}",
new com.toolsverse.util.TypedKeyValue("ID",
dataSet.getFieldValue(currentRow, 'picture_id'))
);
Cached vs non-cached
| Use | When |
|---|---|
| Extractor.lookup(...) | Performance — cache results across rows |
| Extractor.lookupNoCache(...) | Data changes between calls |
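The difference between the two can be pictured as a memoization layer around the uncached call. This is a hypothetical sketch of the semantics, not the platform implementation:

```javascript
// Hypothetical sketch: lookup(...) behaves like lookupNoCache(...)
// wrapped in a per-key cache that survives across rows.
function withCache(lookupNoCache) {
  var cache = {};
  return function (key) {
    if (!(key in cache)) cache[key] = lookupNoCache(key);
    return cache[key];
  };
}

var calls = 0;
var lookup = withCache(function (id) { calls++; return "director-" + id; });
lookup(7); lookup(7); lookup(9);
// calls === 2 — the repeated key hit the cache
```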
Callable SQL lookup
For lookups that need stored procedures or PL/SQL-style blocks, use Extractor.lookupWithCallable(...):
var sql = 'begin\n' +
' select country from country where country_id = {country_id}\n' +
'end';
value = com.toolsverse.etl.core.engine.Extractor.lookupWithCallable(
etlConfig, scenario,
'SQL Server',
'country',
'country',
sql,
new com.toolsverse.util.TypedKeyValue('country_id',
dataSet.getFieldValue(currentRow, 'country_id'))
);
In-memory lookups (already-loaded datasets)
If the lookup data is already loaded in memory (a JSON / XML / CSV source, an API response), there's no need to round-trip through a database.
SQL on an in-memory dataset
// Single field
value = com.toolsverse.etl.common.CommonEtlUtils.getFieldValue(
scenario.getSources().get("DIRECTORS").getDataSet(),
"select last_name where id = {ID}",
"last_name",
new com.toolsverse.util.TypedKeyValue("ID",
dataSet.getFieldValue(currentRow, 'director_id'))
);
// Entire dataset
var phones = com.toolsverse.etl.common.CommonEtlUtils.executeSql(
scenario.getSources().get("PHONES").getDataSet(),
"select * where area_code=412"
);
Filter expression on an in-memory dataset
Faster than SQL for flat datasets:
// Single field
value = com.toolsverse.etl.common.CommonEtlUtils.lookup(
scenario.getSources().get("DIRECTORS").getDataSet(),
"id=" + dataSet.getFieldValue(currentRow, 'director_id'),
"last_name"
);
// Entire dataset
var phones = com.toolsverse.etl.common.CommonEtlUtils.filter(
scenario.getSources().get("PHONES").getDataSet(),
"area_code=412"
);
Filter & validate
Simple filter expression
Available under MAPPING → Additional Transformations → Filter. Reject rows by expression. Field values are accessible by name ("field name").
| Mode | Operators |
|---|---|
| JavaScript syntax | !=, &&, ||, === |
| Use SQL like syntax enabled | <>, and, or, like |
"Importer Name" != null && "Importer Name".contains("test");
"Importer Name" <> null and "Importer Name".contains("test")
Returns true → keep the row, false → drop it.
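In plain JavaScript terms the expression is a per-row predicate. A mock evaluation of the expressions above, with row standing in for field-by-name access (indexOf replaces the platform's Java-string contains):

```javascript
// Hypothetical stand-in for the filter: keep rows whose "Importer Name"
// is present and contains "test" (mirrors the expressions above).
function keepRow(row) {
  var name = row["Importer Name"];
  return name != null && name.indexOf("test") >= 0;
}

keepRow({ "Importer Name": "test import" }); // true  → row kept
keepRow({ "Importer Name": "acme" });        // false → row dropped
keepRow({});                                 // false → row dropped
```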
Filter / modify rows in a source CSV file
Configurable in CSV Format. The runtime injects a filter object you control via getters/setters:
var javaImports = new JavaImporter(com.toolsverse.etl.connector.text);
with (javaImports) {
// Skip the second line (actual index is zero-based)
if (filter.getActualIndex() == 1) {
filter.setCode(TextFilter.TextFilterCode.REJECT);
filter.setIndex(1);
}
}
Codes: STOP (halt parsing), REJECT (skip line), CONTINUE (default).
Validate — TaskResult codes
Available under MAPPING → Additional Transformations → Validate. Different from Filter: you can halt the entire flow, reject the entire dataset, or just drop a single row.
| Code | Effect |
|---|---|
| TaskResult.HALT | Stop the flow with an exception |
| TaskResult.STOP | Skip the source-to-destination transformation entirely (no rows loaded) |
| TaskResult.REJECT | Skip the current row only |
| TaskResult.CONTINUE | Accept the row (default) |
Reject a row when required fields are missing or malformed
var postalCode = dataSet.getFieldValue(currentRow, 'postal_code');
var phone = dataSet.getFieldValue(currentRow, 'phone');
if (Utils.isNothing(postalCode) || Utils.isNothing(phone) || !Utils.isNumber(postalCode)) {
value = TaskResult.REJECT;
} else {
value = TaskResult.CONTINUE;
}
Swap REJECT for HALT to stop the entire flow, or STOP to skip the whole transformation.
Source messages
Preprocess the source message
Modify the raw source document before Etlworks parses it — useful for fixing encoding, supplying default content for empty payloads, or stringifying nested nodes.
Setup
- Create an XML, JSON, or CSV Format.
- Set Transformation type to preprocessor.
- Write JavaScript that modifies the message variable.
// Strip the first 12 chars (e.g. a header byte-order mark or framing)
value = message.substring(12);
// Or rely on last-expression
message.substring(12);
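Another preprocessor pattern mentioned above — supplying default content for an empty payload — is a one-liner. The JSON default shown here is illustrative; use whatever empty document your Format expects:

```javascript
// Hypothetical default-content preprocessor: if the incoming payload is
// empty or whitespace, substitute an empty JSON document so parsing succeeds.
function withDefault(message) {
  return (message == null || message.trim() === "") ? '{"rows":[]}' : message;
}

withDefault("");                   // '{"rows":[]}'
withDefault('{"rows":[{"a":1}]}'); // passed through unchanged
```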
Store the original source message
Sometimes you want both the parsed dataset and the raw message. Stash the raw message in a global variable from the preprocessor, then read it anywhere downstream:
com.toolsverse.config.SystemConfig.instance().getProperties()
.put("raw_source", message);
value = message;
var raw = com.toolsverse.config.SystemConfig.instance().getProperties()
.get("raw_source");
Serialize
DataSet → JSON / XML / CSV string
One call, any of the three formats. Useful when you want to keep a nested node as a single string field, or generate a payload to send via HTTP.
var str = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(
dataSet,
'fully.qualified.connector.class.name',
parameters // string "k=v;k=v" or null for defaults
);
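The parameters argument is a flat "k=v;k=v" string. Parsing one into an object can be sketched as follows (parseParams is a hypothetical helper for illustration):

```javascript
// Hypothetical helper for the "k=v;k=v" parameters format.
function parseParams(str) {
  var out = {};
  if (!str) return out;
  str.split(";").forEach(function (pair) {
    var i = pair.indexOf("=");
    if (i > 0) out[pair.slice(0, i)] = pair.slice(i + 1);
  });
  return out;
}

var p = parseParams("start=variable;nested=variable;forsql=false");
// p → { start: "variable", nested: "variable", forsql: "false" }
```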
Connectors
| Format | Class |
|---|---|
| JSON | com.toolsverse.etl.connector.json.JsonConnector |
| XML | com.toolsverse.etl.connector.xml.XmlObjectConnector |
| CSV | com.toolsverse.etl.connector.text.TextConnector |
JSON parameters
| Parameter | Values | Default |
|---|---|---|
| start | array · object · namedobject · variable | array |
| rootname | any string | null |
| nested | array · object · variable | array |
| forsql | true / false | false |
XML parameters
| Parameter | Values | Default |
|---|---|---|
| rootname | any string | automatic |
| namespace | valid namespace | null |
| xmlversion | 1.0 / 1.1 | 1.0 |
| rowtag | valid XML tag name | automatic |
| parseattrs · parserootattrs · parsecdata · forsql | true / false | see article |
CSV parameters
| Parameter | Values | Default |
|---|---|---|
| delimiter | any character | , |
| lineseparator | u Unix · w Windows · s automatic | s |
| charseparator | any character | " |
| alwaysquote · escapequote · forsql | true / false | false |
Examples
var jsonStr = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(dataSet,
'com.toolsverse.etl.connector.json.JsonConnector',
'start=variable;nested=variable');
var xmlStr = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(dataSet,
'com.toolsverse.etl.connector.xml.XmlObjectConnector',
'rootname=products;rowtag=product');
var csvStr = com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(dataSet,
'com.toolsverse.etl.connector.text.TextConnector',
'alwaysquote=true;escapequote=true');
Inside a SELECT
Etlworks can run SQL against any data object. You can serialize a nested dataset directly inside a SELECT:
SELECT com.toolsverse.etl.connector.ConnectorUtils.dataSet2Str(
dataSet, 'fully.qualified.connector.class.name', parameters
) AS field_name,
other_fields
FROM …
In SQL context, the = and ; characters inside the parameters string must be escaped so the SQL parser doesn't consume them; see the platform docs on serialization inside a SELECT for the exact escape sequences.
JSON / XML / CSV string → DataSet
The inverse of above — ingest a string in any of the three formats and get a DataSet:
var dataSet = com.toolsverse.etl.connector.ConnectorUtils.str2DataSet(
null, // pass null for the EtlConfig param
'fully.qualified.connector.class.name',
parameters, // same as for dataSet2Str
encodedStr // the input string
);
Connectors and parameters are identical to dataSet2Str.
SQL
Conditional SQL action per row
For database destinations, decide INSERT / UPDATE / DELETE / MERGE / nothing per row. Configure under MAPPING → Parameters → Action Conditions.
// If column "Types" equals 'U', update; otherwise insert
{Types}.equals('U') ? 'update' : 'insert';
Set the action to 'nothing' to skip the record entirely. Available actions: 'insert', 'update', 'delete', 'merge', 'nothing'.
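The condition is an ordinary expression returning one of the action strings. A slightly richer chooser, with illustrative column names (the platform evaluates an expression like this per row):

```javascript
// Hypothetical action condition: update existing rows, delete tombstones,
// skip rows without a key, insert everything else.
function chooseAction(row) {
  if (row.id == null) return 'nothing';
  if (row.op === 'D')  return 'delete';
  if (row.op === 'U')  return 'update';
  return 'insert';
}

chooseAction({ id: 1, op: 'U' }); // 'update'
chooseAction({ id: 2, op: 'D' }); // 'delete'
chooseAction({ op: 'U' });        // 'nothing' — no key, skip the record
chooseAction({ id: 3 });          // 'insert'
```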
Execute SQL from JavaScript or Python
Run arbitrary SQL against any connection wired into the flow — useful for status updates inside complex nested flows, or for writing rejected records to an errors table.
var ex = etlConfig.getLastException();
etlConfig.setLastException(null);
var props = com.toolsverse.config.SystemConfig.instance().getContextProperties();
var stagingConnection = etlConfig.getConnection("Staging Postgres");
var statusSql = "insert into feeds_status (filename, url, error, finished) " +
"values (?,?,?,?)";
var message = null;
if (ex != null) {
message = ex.getMessage();
if (Utils.isNothing(message)) {
message = com.toolsverse.util.Utils.getStackTraceAsString(ex);
}
}
var when = new java.util.Date();
try {
com.toolsverse.etl.sql.util.SqlUtils.executeSql(
stagingConnection, statusSql,
props.get('filename'),
props.get('url'),
message,
when
);
} catch (e) { // different name so the outer ex isn't shadowed
etlConfig.log(e.toString());
}
Required imports: com.toolsverse.config, com.toolsverse.etl.sql.util. The connection must be part of the flow (either used in a transformation, or added as a named connection on the nested flow).
Connections in code
Load text data from any connection
Named connection
var images = com.toolsverse.etl.core.task.common.FileManagerTask.read(
etlConfig, "connection_name", null
);
On-the-fly connection
var alias = new com.toolsverse.etl.common.Alias();
alias.setUrl("https://example.com/api/images");
alias.setTransport('http');
var images = com.toolsverse.etl.core.task.common.FileManagerTask.read(alias, null);
The last parameter is a filename — pass null for HTTP, use the actual filename for file connections.
Load binary data — readAsStream
Returns a java.io.InputStream. Common pattern: load an image from one connection, transform it, save to another.
var imageStream = null;
var imageStreamToSave = null;
var os = null;
try {
imageStream = com.toolsverse.etl.core.task.common
.FileManagerTask.readAsStream(etlConfig, 'image', null);
var image = com.toolsverse.imaging.ImageProcessor.instance().loadImage(imageStream);
var newImage = com.toolsverse.imaging.ImageProcessor.instance()
.scale(image, Java.to([0.5, 0.5], "java.lang.Float[]"));
os = new java.io.ByteArrayOutputStream();
javax.imageio.ImageIO.write(newImage, "png", os);
imageStreamToSave = new java.io.ByteArrayInputStream(os.toByteArray());
var alias = new com.toolsverse.etl.common.Alias();
alias.setUrl("logo.png");
alias.setTransport('file');
com.toolsverse.etl.core.task.common
.FileManagerTask.write(alias, null, imageStreamToSave);
} finally {
if (imageStream != null) imageStream.close();
if (os != null) os.close();
if (imageStreamToSave != null) imageStreamToSave.close();
}
Use a named connection in JavaScript
For a connection to be referenceable from JavaScript, it must be part of the flow:
- If the connection is part of a source-to-destination transformation, it's automatically available in the flow itself, in any nested flow that includes it, and in all inner flows.
- Otherwise, add it manually as a named connection on the nested flow.
HTML scraping with jsoup
Etlworks bundles jsoup — a full HTML5 parser. Load a page, parse, traverse the DOM, extract.
JavaScript
var html = com.toolsverse.etl.core.task.common.FileManagerTask.read(
etlConfig, 'html', null
);
var doc = org.jsoup.Jsoup.parse(html);
var files = new java.util.ArrayList();
etlConfig.setValue('files', files);
var root = doc.select("h3 > a[name=annualproj]").get(0).parent().nextElementSibling();
var spans = root.select("ul.numbered li > span.formats");
for each (var span in spans) {
var link = span.nextElementSibling();
var href = link.attr("href").toString();
var title = link.attr("title").toString();
title = title.substring(title.indexOf('.') + 1).trim();
files.add(new com.toolsverse.util.TypedKeyValue(href, title));
etlConfig.log("href: " + href + ", title: " + title);
}
Python (same flow)
from org.jsoup import Jsoup
from com.toolsverse.etl.core.task.common import FileManagerTask
from com.toolsverse.util import TypedKeyValue, Utils
from java.util import ArrayList
html = FileManagerTask.read(etlConfig, 'html', None)
doc = Jsoup.parse(html)
files = ArrayList()
etlConfig.setValue('files', files)
root = doc.select("h3 > a[name=annualproj]").get(0).parent().nextElementSibling()
spans = root.select("ul.numbered li > span.formats")
for span in spans:
link = span.nextElementSibling()
href = link.attr("href")
title = link.attr("title")
title = title[title.index(".") + 1:].strip()
files.add(TypedKeyValue(href, title))
etlConfig.log("href: " + href + ", title: " + title)
Utilities
Sequence generators
UniqueNumber — in-process, timestamp-based
Cache the same instance in etlConfig's key/value storage so the sequence stays consistent across invocations within a flow.
var uniqueNumber = etlConfig.getValue("uniqueNumber");
if (uniqueNumber == null) {
uniqueNumber = new com.toolsverse.util.UniqueNumber();
etlConfig.setValue("uniqueNumber", uniqueNumber);
}
value = uniqueNumber.nextNumber();
Algorithm: (System.currentTimeMillis() << 20) + n, where n is incremented each call.
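That algorithm can be illustrated with BigInt — JavaScript's << truncates to 32 bits, so plain numbers would overflow (makeUniqueNumber is a hypothetical re-implementation for illustration, not the platform class):

```javascript
// Hypothetical sketch of the UniqueNumber scheme:
// (currentTimeMillis << 20) + counter. BigInt avoids 32-bit overflow.
function makeUniqueNumber() {
  var n = 0n;
  return function nextNumber() {
    return (BigInt(Date.now()) << 20n) + (n++);
  };
}

var next = makeUniqueNumber();
var a = next(), b = next();
// a < b — values are strictly increasing within a process
```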
RedisSequenceGenerator — cluster-wide
Backed by Redis — safe across parallel flows. Up to 10K numbers/sec.
// key = anything (e.g. table name), startValue = starting sequence number
var seq = com.toolsverse.io.RedisSequenceGenerator.instance()
.nextNumber('my_table_name', 1);
Share parameters across flows
Variables on a flow are local to it (and its nested children). To share state across independent flows — e.g. parallel runners, sequential dependent flows, real-time stats collected via a webhook + listener — use the Redis-backed parameter store.
Set / get many at once
var params = new java.util.HashMap();
params.put("key1", "value1");
params.put("key2", "value2");
// anyid: any string (auditId / flowId / both); expireInSec: 1..86400 (24h max)
com.toolsverse.io.RedisSequenceGenerator.instance()
.setParameters("anyid", params, expireInSec);
// key can be null — uses wildcard *
var got = com.toolsverse.io.RedisSequenceGenerator.instance()
.getParameters("anyid", null);
var v1 = got.get("key1");
Set / get a single value
com.toolsverse.io.RedisSequenceGenerator.instance()
.setParameter("anyid", "key1", "value1", expireInSec);
var v = com.toolsverse.io.RedisSequenceGenerator.instance()
.getParameter("anyid", "key1");
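The store's semantics can be pictured with a plain map plus expiry. This is an in-memory model for illustration only — the real store is Redis-backed and shared cluster-wide:

```javascript
// Hypothetical in-memory model of the parameter store: values are
// namespaced by id and dropped after expireInSec seconds.
function makeParamStore(now) {
  var data = {};
  return {
    setParameter: function (id, key, value, expireInSec) {
      data[id + "/" + key] = { value: value, expiresAt: now() + expireInSec * 1000 };
    },
    getParameter: function (id, key) {
      var e = data[id + "/" + key];
      return e && e.expiresAt > now() ? e.value : null;
    }
  };
}

var t = 0; // fake clock, in millis
var store = makeParamStore(function () { return t; });
store.setParameter("flow-1", "key1", "value1", 60);
store.getParameter("flow-1", "key1"); // "value1"
t = 61 * 1000;
store.getParameter("flow-1", "key1"); // null — expired
```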
Recommended keys
// Audit ID for the currently running flow
var auditId = com.toolsverse.config.SystemConfig.instance()
.getEtlThreadContext().getRequestId();
// Flow ID
var flowId = com.toolsverse.config.SystemConfig.instance()
.getEtlThreadContext().getFlowId();
When collecting parameters via a webhook payload, the auditId is in JSON.parse(message).entity.auditId.
An on-prem agent runs on a different network and can't reach Redis directly. Workaround: have a flow on the agent POST to /etl/rest/v1/parameters on the host instance via an Etlworks API connection. The platform writes the parameters to Redis on the agent's behalf.
Working with dates
Dates are java.util.Date under the hood. The most useful helpers:
| Need | Code |
|---|---|
| Current timestamp as Date | new java.util.Date() |
| Current timestamp in millis | System.currentTimeMillis() |
| Format a Date as a string | com.toolsverse.util.Utils.date2Str(date, 'yyyy-MM-dd HH:mm:ss') |
| Parse a string into a Date | com.toolsverse.util.Utils.str2Date(string, defaultDate, format) |
| Reformat between two string formats | Utils.date2Str(Utils.str2Date(s, null, srcFmt), dstFmt) |
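The reformat pattern in the last row can be illustrated in plain JavaScript for one fixed pair of formats. The platform helpers accept any SimpleDateFormat pattern; this hypothetical sketch hard-codes 'yyyy-MM-dd' → 'MM/dd/yyyy':

```javascript
// Hypothetical illustration of reformatting between two date string
// formats without constructing a Date: 'yyyy-MM-dd' → 'MM/dd/yyyy'.
function reformatDate(s) {
  var m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(s);
  if (!m) return null; // not in the expected source format
  return m[2] + "/" + m[3] + "/" + m[1];
}

reformatDate("2024-03-09"); // "03/09/2024"
reformatDate("bad input");  // null
```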
Add / subtract
All on com.toolsverse.util.DateUtil. Negative second arg subtracts.
var d1 = com.toolsverse.util.DateUtil.addDays(new java.util.Date(), 1);
var d2 = com.toolsverse.util.DateUtil.addHours(new java.util.Date(), -3);
var d3 = com.toolsverse.util.DateUtil.addMinutes(new java.util.Date(), 30);
var d4 = com.toolsverse.util.DateUtil.addSeconds(new java.util.Date(), 100);
var d5 = com.toolsverse.util.DateUtil.addMillis(new java.util.Date(), 10000);
var d6 = com.toolsverse.util.DateUtil.addMonths(new java.util.Date(), 1);
var d7 = com.toolsverse.util.DateUtil.addYears(new java.util.Date(), 1);
Date in a specific timezone
var calendar = java.util.Calendar.getInstance(
java.util.TimeZone.getTimeZone("EST")
);
var formatted = calendar.get(java.util.Calendar.YEAR) + '/' +
(calendar.get(java.util.Calendar.MONTH) + 1) + '/' + // Calendar.MONTH is zero-based
calendar.get(java.util.Calendar.DATE);
Date format letters (y, M, d, H, m, s, S, z, Z, X, …) follow the standard Java SimpleDateFormat conventions. Timezone IDs are the tz database values. Both full tables are mirrored at Work with dates — ~150 KB of reference content we don't duplicate here.
Home URL of the app
Configured under Settings → General → Home URL. Useful in emails, environment-detection logic, building absolute links.
var homeUrl = etlConfig.getHomeUrl();