Scripting basics
The languages, the Java bridge, the variable systems, and where each fits.
JavaScript (Nashorn)
Etlworks's default and recommended scripting language. The engine is Nashorn — the JavaScript implementation embedded in the JVM. Nashorn supports:
- Standard JavaScript syntax
- Direct access to every Java class on the classpath
- Mixing JavaScript and Java in the same script
- Direct interaction with Etlworks internals
Returning a value
Two equivalent forms — pick whichever reads better:
// Assign to the special `value` variable
value = firstName + " " + lastName;
// Or rely on the last evaluated expression
firstName + " " + lastName;
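When the value takes more than one step to build, the same two return forms still apply. The sketch below wraps the concatenation in a helper that tolerates missing parts. `fullName` and the sample inputs are hypothetical names used for illustration, not part of Etlworks.

```javascript
// Hypothetical helper: join name parts, skipping null, undefined, or blank values.
function fullName(first, last) {
  var parts = [];
  if (first != null && String(first).trim() !== "") parts.push(String(first).trim());
  if (last != null && String(last).trim() !== "") parts.push(String(last).trim());
  return parts.join(" ");
}

// Sample inputs standing in for the fields a mapping would supply.
var sampleFirst = "Ada";
var sampleLast = null;
var result = fullName(sampleFirst, sampleLast);
```

In a scripting field, the script would end with `value = fullName(firstName, lastName);` or with the bare expression `fullName(firstName, lastName);`.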
Loading external JavaScript libraries
// From a URL
load("https://example.com/scripts/utility.js");
// From the local file system
load("file:///opt/scripts/functions.js");
For deep dives on the engine itself: Nashorn tutorial · Oracle Nashorn overview.
Python (Jython)
Built-in Python is powered by Jython, which runs inside the Etlworks runtime and provides Python 2.7 compatibility. It's available wherever JavaScript is available — same scripting fields, same runtime objects.
| Built-in Python (Jython) | Real Python (Execute Script flow) |
|---|---|
| Embedded in the Etlworks JVM | External Python 3 interpreter |
| Python 2.7 only | Python 3.x |
| No pip, no pandas / numpy / boto3 / etc. | Full pip, virtual environments, every modern package |
| Can import any Java class on the classpath | Can't access JVM (it's a separate process) |
| For lightweight inline expressions | For full programs and orchestration |
Returning a value
Python scripts always return via scriptResult.setResult(...) — not the last-expression rule that JS uses:
value = row.get("amount") * 1.2
scriptResult.setResult(value)
Importing Java classes from Python
from org.jsoup import Jsoup
from com.toolsverse.etl.core.task.common import FileManagerTask
from com.toolsverse.util import TypedKeyValue, Utils
from java.util import ArrayList
If you need Python 3 or pip packages, use the Execute Script Local or Remote via SSH flow instead of an inline scripting field. That flow runs a real Python 3 interpreter outside the JVM and supports pip, virtual environments, and any package.
Importing Java classes
Four ways to reference a Java class from JavaScript. They all produce the same result — pick by readability.
1. Fully qualified name (recommended for one-off use)
var list = new java.util.ArrayList();
2. importPackage
importPackage(java.util);
var list = new ArrayList();
3. Java.type("…") (recommended when you'll use the class repeatedly)
var ArrayList = Java.type("java.util.ArrayList");
var list = new ArrayList();
4. JavaImporter (flexible, slower — avoid in hot loops)
var javaImports = new JavaImporter(
java.util,
com.toolsverse.util
);
with (javaImports) {
var list = new ArrayList();
var ds = new DataSet();
}
Etlworks API reference
The most useful packages and classes you'll reach for from a script. Full Javadoc is published with the platform; this is a curated index.
Common packages
| Purpose | Package |
|---|---|
| Utility classes | com.toolsverse.util |
| Logging | com.toolsverse.util.log |
| System configuration | com.toolsverse.config |
| Common ETL engine classes (DataSet, FieldDef, …) | com.toolsverse.etl.common |
| ETL engine configuration | com.toolsverse.etl.core.config |
| ETL engine core | com.toolsverse.etl.core.engine |
| Common ETL tasks | com.toolsverse.etl.core.task.common |
| Java collections and utilities | java.util |
Utility methods — com.toolsverse.util.Utils
if (com.toolsverse.util.Utils.isNothing(value)) {
// value is null, undefined, empty string, empty collection, etc.
}
var fileName = com.toolsverse.util.FilenameUtils.getName(path);
File / connection IO — com.toolsverse.etl.core.task.common.FileManagerTask
// Check for files matching a pattern on a named connection
var found = com.toolsverse.etl.core.task.common.FileManagerTask.filesExist(
etlConfig, "connectionName", "*.json"
);
// List files
var list = com.toolsverse.etl.core.task.common.FileManagerTask.list(
etlConfig, "connectionName", "*.json"
);
if (list != null) {
for each (var file in list) {
etlConfig.log("Name:" + file.getName() +
" Size:" + file.getSize() +
" Path:" + file.getPath() +
" Last modified:" + file.getLastModified());
}
}
// Write
com.toolsverse.etl.core.task.common.FileManagerTask.write(
etlConfig, "connectionName", filename, payload
);
// Read
var data = com.toolsverse.etl.core.task.common.FileManagerTask.read(
etlConfig, "connectionName", filename
);
// Execute an HTTP call against a manually-built Alias
var alias = new com.toolsverse.etl.common.Alias();
alias.setUrl("http://localhost:8080/health");
alias.setTransport("com.toolsverse.io.HttpProcessor");
alias.setParams("method=GET");
var response = com.toolsverse.etl.core.task.common.FileManagerTask.execute(
alias, null, true
);
// Or via a named connection
var response = com.toolsverse.etl.core.task.common.FileManagerTask.execute(
etlConfig, "connection_name", null, true
);
Data — DataSet, FieldDef, DataSetRecord
// Read a value
var fldValue = dataSet.getFieldValue(currentRow, "InvoiceNo");
// Get the field definition by name
var fldName = dataSet.getFieldDef("InvoiceNo").getNameToUse();
// Get a record by index
var record = dataSet.getRecord(0);
// High-level transformations across two datasets
var newDs = com.toolsverse.etl.common.CommonEtlUtils.intersect(dataSet, with, "id");
Connection & engine config — Alias, EtlConfig
// Connection (Alias) by name from the current flow
var alias = etlConfig.getAliasesMap().get("Connection name");
// Direct connection access
var connection = etlConfig.getConnection("Postgres");
Scenario — com.toolsverse.etl.core.engine.Scenario
var name = scenario.getName();
var v = scenario.getVariable("MY_VAR");
v.setValue("new value");
For the full reference (including SqlUtils, Extractor, ConnectorUtils, anonymizer providers, and many more), see the Recipes and Reference pages — they cover the methods you'll actually use, with examples.
External libraries
Custom Java libraries
Drop your own JAR onto the classpath and call it directly from JavaScript.
| Deployment | How to install |
|---|---|
| Self-hosted Etlworks | Copy the JAR into TOMCAT_HOME/lib, restart Tomcat. Then var X = Java.type("com.mycompany.X"); |
| Etlworks Cloud | Contact support@etlworks.com to upload custom libraries. |
External JS libraries
Use Nashorn's built-in load() — loads from URL or file system. See the JavaScript section above.
Global variables
Tenant-scoped key/value pairs accessible across flows. Strings only.
Set
// props is a java.util.HashMap<String, String>
var props = com.toolsverse.config.SystemConfig.instance().getProperties();
props.put("unique key", someValue);
Get
var props = com.toolsverse.config.SystemConfig.instance().getProperties();
var someValue = props.get("unique key");
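Because global variables hold strings only, numbers and structured values have to be converted on the way in and parsed on the way out. This is a minimal sketch. It assumes a plain object with `put` and `get` as a stand-in for the map returned by getProperties(), so it can run outside Etlworks. The key and the `lastRun` object are made up for illustration.

```javascript
// Stand-in for the map returned by SystemConfig.instance().getProperties().
// Hypothetical: inside Etlworks, use the real map instead.
var props = {
  data: {},
  put: function (key, val) { this.data[key] = val; },
  get: function (key) {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  }
};

// Serialize a structured value to a string before storing it.
var lastRun = { batchId: 42, status: "ok" };
props.put("last run", JSON.stringify(lastRun));

// Parse it back after reading; a missing key yields null here.
var stored = props.get("last run");
var restored = stored == null ? null : JSON.parse(stored);
```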
Inside parallel loops or transformations
Parallel execution gets its own thread-safe context. Use getContextProperties() instead of getProperties(): values written there are visible only to the current ETL thread, and reads fall back to the main thread when the key is not set in the current one:
var props = com.toolsverse.config.SystemConfig.instance().getContextProperties();
props.put("unique key", value);
// ...
var value = props.get("unique key");
// If not found in current ETL thread, returns whatever the main thread set.
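The read fallback described above can be modeled in a few lines. This only illustrates the lookup order and is not Etlworks code. `makeContext` and the plain objects standing in for the property maps are hypothetical.

```javascript
// Model of a per-thread context: writes stay local, reads fall back to the main map.
function makeContext(mainProps) {
  var local = {};
  function has(obj, key) { return Object.prototype.hasOwnProperty.call(obj, key); }
  return {
    put: function (key, val) { local[key] = val; },
    get: function (key) {
      if (has(local, key)) return local[key];
      return has(mainProps, key) ? mainProps[key] : null;
    }
  };
}

var mainProps = { region: "us" };        // set by the main thread
var threadA = makeContext(mainProps);
var threadB = makeContext(mainProps);

threadA.put("region", "eu");             // visible to thread A only
var seenByA = threadA.get("region");     // "eu"
var seenByB = threadB.get("region");     // "us", read from the main thread
```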
Reference from SQL or connection params
Inside a Source / Destination query, or inside connection parameters: {global variable name}. Works for both main-thread and parallel-context values.
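To make the token syntax concrete, here is a rough model of the substitution. Etlworks performs the replacement itself. `resolveTokens` is a hypothetical function, and leaving unknown tokens untouched is an assumption made for this sketch, not documented behavior.

```javascript
// Replace each {name} token with the matching variable value.
// Unknown names are left as-is (an assumption for this sketch).
function resolveTokens(text, vars) {
  return text.replace(/\{([^{}]+)\}/g, function (token, name) {
    return Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : token;
  });
}

var sql = "select * from orders where region = '{region}' and id > {last id}";
var resolved = resolveTokens(sql, { region: "eu", "last id": "100" });
```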
Flow variables
Per-flow key/value pairs. Set as URL parameters in user-defined APIs (see User APIs → Path parameters) or added by the user as parameters in a nested flow. Strings only.
Read
var value = scenario.getVariable("MY_VAR").getValue();
Write to an existing variable
scenario.getVariable("MY_VAR").setValue("new value");
Add a new variable at runtime
var v = new com.toolsverse.etl.common.Variable();
v.setName("MY_VAR");
v.setValue("initial value");
scenario.addVariable(v);
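A script often has to cope with a variable that was never passed in. The sketch below reads a flow variable with a fallback. `readVariable` is a hypothetical helper, and the `fakeScenario` object is a stand-in so the snippet runs on its own. It assumes getVariable returns null for an unknown name, which is worth verifying against the Javadoc.

```javascript
// Hypothetical helper: return the variable's value, or a fallback when it is missing.
// Assumes scenario.getVariable(name) yields null for unknown names.
function readVariable(scenario, name, fallback) {
  var variable = scenario.getVariable(name);
  if (variable == null) return fallback;
  var val = variable.getValue();
  return val == null ? fallback : String(val);
}

// Stand-in scenario for running outside Etlworks.
var fakeScenario = {
  vars: { MY_VAR: { getValue: function () { return "42"; } } },
  getVariable: function (name) {
    return Object.prototype.hasOwnProperty.call(this.vars, name) ? this.vars[name] : null;
  }
};

var present = readVariable(fakeScenario, "MY_VAR", "0");
var missing = readVariable(fakeScenario, "OTHER_VAR", "0");
```

Inside a flow, the call would be `readVariable(scenario, "MY_VAR", "0")` with the real `scenario` object.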
Reference from SQL or connection params
Inside a Source / Destination query, or inside connection parameters: {flow variable name}.
Flow key/value storage
Unlike global and flow variables (string-only), etlConfig's key/value storage holds any object — useful for caching UniqueNumber generators, parsed documents, lookup datasets, anything serializable. Scope is the current flow execution.
// Store
etlConfig.setValue("unique key", someObject);
// Retrieve later in the same flow
var obj = etlConfig.getValue("unique key");
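A common use of this storage is a build-once cache. The sketch below shows that pattern. `getOrCreate`, `buildLookup`, and the `fakeConfig` stand-in are hypothetical, and it assumes getValue returns null when nothing is stored under the key.

```javascript
// Hypothetical helper: return the cached object, building and storing it on first use.
// Assumes etlConfig.getValue(key) yields null when nothing is stored.
function getOrCreate(etlConfig, key, build) {
  var cached = etlConfig.getValue(key);
  if (cached == null) {
    cached = build();
    etlConfig.setValue(key, cached);
  }
  return cached;
}

// Stand-in etlConfig for running outside Etlworks.
var fakeConfig = {
  store: {},
  setValue: function (key, val) { this.store[key] = val; },
  getValue: function (key) {
    return Object.prototype.hasOwnProperty.call(this.store, key) ? this.store[key] : null;
  }
};

// Count how many times the expensive build step actually runs.
var builds = 0;
function buildLookup() { builds++; return { US: "United States", DE: "Germany" }; }

var first = getOrCreate(fakeConfig, "country lookup", buildLookup);
var second = getOrCreate(fakeConfig, "country lookup", buildLookup);
```

Both calls return the same object, and the build function runs only on the first one.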
Logging
Two ways. etlConfig.log(...) is the everyday choice — entries land in the flow's run log:
etlConfig.log("processed " + count + " rows");
For severity-tagged logging, use the JVM logger directly:
com.toolsverse.util.log.Logger.log(
com.toolsverse.util.log.Logger.SEVERE,
null,
"Error doing something"
);
Best practices
- Prefer JavaScript over Python for inline logic.
- Import Java classes only when you actually need them — not preemptively.
- Keep functions modular and reusable; if a script grows past ~50 lines, move it to a dedicated Execute Script flow.
- Always initialize variables explicitly (var x = …).
- Use flow variables to avoid hard-coding values that should differ between environments.
- In Mapping field functions, keep expressions small and readable. For multi-step logic, reach for a scripting transformation instead.
When NOT to use scripting
Inline JavaScript / Python isn't the right tool for:
- Full ETL pipelines — build a flow with source & destination connections instead.
- Heavy data processing (more than a few thousand rows in a tight loop) — push down to SQL or use a streaming transformation.
- Anything requiring Python packages like pandas, numpy, or modern Python 3 syntax: use the Execute Script Local or Remote via SSH flow.
- Long-running jobs, daemons, or anything that needs to outlive a single flow execution.
For Python-driven orchestration (replacing Airflow's PythonOperator / BashOperator), the Execute Script flow type runs Python 3, Bash, PowerShell, Node, Docker, cloud CLIs, SQL CLIs — locally or via SSH.