Functions

The following table lists all available functions. For details about each function, see the related function type:

Function Type Functions
Record functions
  • record:attribute(<attribute name>)
  • record:attributeOrDefault(<attribute name>, <default value>)
  • record:creator()
  • record:eventCreation()
  • record:eventType()
  • record:eventVersion()
  • record:exists(<field path>)
  • record:fieldAttribute(<field path>, <field attribute name>)
  • record:fieldAttributeOrDefault(<field path>, <field attribute name>, <default value>)
  • record:id()
  • record:path()
  • record:type(<field path>)
  • record:value(<field path>)
  • record:valueOrDefault(<field path>, <default value>)
Delimited data record functions
  • record:dExists(<header>)
  • record:dHasDupHeader()
  • record:dIndex(<header>)
  • record:dIsDupHeader(<header>)
  • record:dToMap()
  • record:dValue(<header>)
  • record:dValueAt(<header index>)
Error record functions
  • record:errorCode()
  • record:errorCollectorId()
  • record:errorMessage()
  • record:errorPipeline()
  • record:errorStackTrace()
  • record:errorStage()
  • record:errorStageLabel()
  • record:errorTime()
Base64 functions
  • base64:decodeBytes(<string>)
  • base64:decodeString(<string>, <charset>)
  • base64:encodeBytes(<byte array>, <urlSafe: true | false>)
  • base64:encodeString(<string>, <urlSafe: true | false>, <charset>)
Credential functions
  • credential:get(<storeID>, <group>, <credential name>)
  • credential:getWithOptions(<storeID>, <group>, <credential name>, <options>)
Data drift functions
  • drift:names(<field path>, <ignoreWhenMissing: true | false>)
  • drift:order(<field path>, <ignoreWhenMissing: true | false>)
  • drift:size(<field path>, <ignoreWhenMissing: true | false>)
  • drift:type(<field path>, <ignoreWhenMissing: true | false>)
Field functions
  • f:attribute(<attribute name>)
  • f:path()
  • f:type()
  • f:value()
File functions
  • file:fileExtension(<filepath>)
  • file:fileName(<filepath>)
  • file:parentPath(<filepath>)
  • file:pathElement(<filepath>, <integer>)

  • file:removeExtension(<filepath>)
Math functions
  • math:abs(<number>)
  • math:ceil(<number>)
  • math:floor (<number>)
  • math:max(<number1>, <number2>)
  • math:min(<number1>, <number2>)
  • math:round(<number>)
Pipeline functions
  • pipeline:id()
  • pipeline:name()
  • pipeline:title()
  • pipeline:user()
  • pipeline:version()
String functions
  • str:concat(<string1>, <string2>)
  • str:contains(<string>, <subset>)
  • str:endsWith(<string>, <subset>)
  • str:escapeXML10(<string>)
  • str:escapeXML11(<string>)
  • str:indexOf(<string>, <subset>)
  • str:isNullOrEmpty(<string>)
  • str:length(<string>)
  • str:matches(<string>, <regEx>)
  • str:regexCapture(<string>, <regEx>, <group>)
  • str:replace(<string>, <oldChar>, <newChar>)
  • str:replaceAll(<string>, <regEx>, <newString>)
  • str:splitKV(<string>, <pairSeparator>, <keyValueSeparator>)
  • str:startsWith(<string>, <subset>)
  • str:substring(<string>, <beginIndex>, <endIndex>)
  • str:toLower(<string>)
  • str:toUpper(<string>)
  • str:trim(<string>)
  • str:truncate(<string>, <length>)
  • str:unescapeJava(<string>)
  • str:unescapeXML(<string>)
  • str:urlDecode(<URL>, <charset>)
  • str:urlEncode(<infoforURL>, <charset>)
Time functions
  • time:createDateFromStringTZ(<string>, <time zone>, <date format>)
  • time:dateTimeZoneOffset(<Date object>, <time zone>)
  • time:extractDateFromString(<string>, <format string>)
  • time:extractLongFromDate(<Date object>, <format string>)
  • time:extractStringFromDate(<Date object>, <format string>)
  • time:extractStringFromDateTZ(<Date object>, <time zone>, <format string>)
  • time:millisecondsToDateTime(<long>)
  • time:now()
  • time:timeZoneOffset(<time zone>)
  • time:trimDate(<datetime>)
  • time:trimTime(<datetime>)
Miscellaneous functions
  • alert:info()
  • emptyList()
  • emptyMap()
  • every(<interval>, <mm() | ss()>)
  • field:field()
  • jvm:maxMemoryMB()
  • list:join(<list field>, <separator>)
  • list:joinSkipNulls(<list field>, <separator>)
  • offset:column(<position>)
  • runtime:availableProcessors()
  • runtime:conf(<runtime property>)
  • runtime:loadResource(<file name>, <restricted: true | false>)
  • runtime:loadResourceRaw(<file name>, <restricted: true | false>)
  • sdc:hostname()
  • sdc:id()
  • uuid:uuid()
  • vault:read(<path>, <key>) - Deprecated
  • vault:readWithDelay(<path>, <key>, <delay>) - Deprecated

Record Functions

Use record functions to determine information about a record, such as the stage that created it or whether a field exists in the record.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following general record functions:

record:attribute(<attribute name>)
Returns the value of the specified record attribute. Use to return data, for example, for the "tag" attribute that you set using the File Tail origin.
Uses the following argument:
  • attribute name - Name of the record header attribute.
Return type: String.
For example, the following expression returns the value of a tag attribute added by a File Tail origin:
${record:attribute('tag')}
record:attributeOrDefault(<attribute name> , <default value>)
Returns the value of the specified record header attribute. When the attribute does not exist or has no value, returns the specified default value.
Uses the following arguments:
  • attribute name - Name of the record header attribute.
  • default value - Value to use when the record header attribute does not exist or has no value.
Return type: Returns a string value when returning the field attribute value. Otherwise, returns the type of the specified default value.
For example, the following expression returns the value of the tag attribute, and returns NA when no value exists:
${record:attributeOrDefault('tag', 'NA')}
record:creator()
Returns the instance name of the stage that created the record, typically an origin stage. Data is stored in the stageCreator field of a record.
Return type: String.
For example, the following expression returns the instance name of the stage that creates each record:
${record:creator()}
record:eventCreation()
Returns the epoch timestamp when an event occurred. Use to return the creation timestamp from an event record. Returns null values when applied to other records.
Return type: String.
For example, the following expression returns the time that the event-generating stage generated an event:
${record:eventCreation()}
record:eventType()
Returns the event type from an event record. Returns null values when applied to other records.
Event generating stages create different event types. For information about the possible event types, see "Event Records" in the stage documentation.
Return type: String.
For example, the following expression returns the event type from an event record:
${record:eventType()}
record:eventVersion()
Returns the event version from an event record. Returns null values when applied to other records.
Return type: String.
For example, the following expression returns the event version from an event record:
${record:eventVersion()}
record:exists(<field path>)
Determines if the specified field is in a record. When the field is not in the record, returns 0. When the field is in the record, returns 1.
Return type: Boolean.
For example, the following expression returns 1 when the Name field exists in a record:
${record:exists('/Name')}
It returns 0 when the Name field is not part of the record.
record:fieldAttribute(<field path>, <field attribute name>)
Returns the value for the specified field attribute.
Uses the following arguments:
  • field path - Path to the field with the attribute.
  • field attribute name - Name of the field attribute.
Return type: String
For example, the following expression returns the value for the pk attribute on the ID field:
record:fieldAttribute('/ID', 'pk')
record:fieldAttributeOrDefault(<field path>, <field attribute name>, <default value>)
Returns the value for the field attribute when it exists. If the attribute does not exist, returns the specified default value.
Uses the following arguments:
  • field path - Path to the field with the attribute.
  • field attribute name - Name of the field attribute.
  • default value - Value to use when the field attribute does not exist or has no value.
Return type: Returns a string value when returning the field attribute value. Otherwise, returns the type of the specified default value.
For example, the following expression returns the value for the pk attribute on the ID field when available, otherwise, it returns "n".
record:fieldAttributeOrDefault('/ID', 'pk', 'n')
record:id()
Returns the record ID. The record ID is also stored in the sourceId record header attribute.
Return type: String.
record:path()
Returns the stages in the pipeline that processed a record. Data is stored in the stagesPath field of a record. The field path uses the following format:
<origin instance name>:<next stage instance name>:<next stage instance name>:... 
Return type: String.
record:type(<field path>)
Checks the data type of a field. Possible return values are:
  • Boolean
  • Char
  • Byte
  • Short
  • Integer
  • Long
  • Float
  • Double
  • Date
  • Datetime
  • Decimal
  • String
  • Time
  • Byte-Array
  • Map
  • List
Return type: String.
record:value(<field path>)
Returns the value of the field in the appropriate data type.
Return type: When the field includes a value, returns the type based on the field value. When the field is null, the function returns no value.

Nulls are coerced based on the expected return type of the function. For example, if an EL is expected to return a string, null is coerced to an empty string. However, a null integer is coerced to 0.

record:valueOrDefault(<field path> , <default value>)
Returns the value of the field in the appropriate data type. If the field does not exist or if the field is null, returns the default value.
Use to provide a default value when a field does not exist or when a field is null.
Return type: Variable, based on whether the function returns the field value or default value.

Delimited Data Record Functions

Use delimited data record functions to process delimited data with the list root field type. If you configured an origin to process the delimited data with the list-map root field type, you can use standard record functions.

For more information about the delimited data root field types, see Delimited Data Root Field Type.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following delimited data record functions:
record:dExists(<header name>)
Determines if the specified header name exists in the record.
Return type: Boolean.
record:dHasDupHeader()
Determines if the record has duplicate header names.
Return type: Boolean.
record:dIndex(<header name>)
Returns the index associated with the specified header name. Use to find the position of a header in the record.
Return type: Integer, starting with 0 for the first position.
record:dIsDupHeader(<header name>)
Determines if the specified header name is used more than once in a record.
Return type: Boolean.
record:dToMap()
Converts a record with a List root field to a List-Map root field. Use to convert delimited List records to List-Map to allow easier use with standard record functions. For more information about root field types, see Delimited Data Root Field Type.
Return type: Records with a List-Map root field.
To use record:dToMap(), use a slash as the root output field as follows:
record:dValue(<header name>)
Returns the value associated with the specified header name.
Return type: String.
record:dValueAt(<header index>)
Returns the value at the specified header index position.
Return type: String.

Error Record Functions

Error record functions provide information about error records. Use error functions to process error records.

For example, you might use error functions in a Stream Selector to pass data to different error handling pipelines based on the type of error that occurred.

The expression language provides the following error functions:

record:errorCode()
Returns the error code for the error record.
Return type: String.
record:errorCollectorId()
Returns the ID of the Data Collector that sent the record to error.
By default, the Data Collector ID uses the format <hostname>:<port>. You can verify the Data Collector ID in the Data Collector Configuration page.
Return type: String.
record:errorMessage()
Returns the error message for the error record.
Return type: String.
record:errorPipeline()
The name of pipeline that sent the record to error.
Return type: String.
record:errorStackTrace()
Returns the error stack trace for the error record.
Return type: String.
record:errorStage()
Returns the instance name of the stage that sent the record to error.
You can verify the instance name on the General tab of the stage.
Return type: String.
record:errorStageLabel()
Returns the user-defined label for the stage that sent the record to error.
Return type: String.
record:errorTime()
The time the record was sent to error.
Return type: Long.

Base64 Functions

Use Base64 functions to encode or decode data using Base64.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following Base64 functions:

base64:decodeBytes(<string>)
Returns a decoded byte array from a Base64 encoded string.
Return type: Byte Array.
Uses the following argument:
  • string - The Base64 encoded string to decode.
For example, ${base64:decodeBytes(record:value("/ID"))} decodes the Base64 encoded string in the ID field as a byte array value.
base64:decodeString(<string>, <charset>)
Returns a decoded string from a Base64 encoded string using the specified character set.
Return type: String.
Uses the following arguments:
  • string - The Base64 encoded string to decode.
  • charset - The character set to use to decode the data.
For example, ${base64:decodeString(record:value("/ID"), "UTF-8")} decodes the Base64 encoded string in the ID field using the UTF-8 character set as a string value.
base64:encodeBytes(<byte array>, <urlSafe: true | false>)
Returns a Base64 encoded string value of the specified byte array.
Return type: String.
Uses the following arguments:
  • byte array - The byte array to encode using Base64.
  • urlSafe - When set to true, encodes the data so that it can be safely sent in a URL.
For example, ${base64:encodeBytes(record:value("/ID"), true)} uses Base64 to encode the byte array in the ID field such that the encoded data is URL safe.
base64:encodeString(<string>, <urlSafe: true | false>, <charset>)

Returns a Base64 encoded string value of the specified string.

Return type: String.
Uses the following arguments:
  • string - The string to encode using Base64.
  • urlSafe - When set to true, encodes the data so that it can be safely sent in a URL.
  • charset - The character set to use to encode the data.
For example, ${base64:encodeString(record:value("/ID"), false, "UTF-8")} uses Base64 to encode the string in the ID field using the UTF-8 character set such that the encoded data is not URL safe.

Credential Functions

Use a credential function to retrieve a credential value from the following credential store systems:
  • CyberArk
  • Java keystore
  • Vault

Before you use a credential function, you must configure the CyberArk, Java keystore, or Vault credential stores.

String literals must be enclosed in single or double quotation marks.

Important: When you use a credential function in a stage property, the function must be the only value defined in the property. For example, you cannot include another function or a literal value along with the credential function.

The expression language provides the following credential functions:

credential:get(<storeID>, <group>, <credential name>)
Returns the credential value from the credential store. Uses the following arguments:
  • storeId - Unique ID of the credential store to use. Enter "cyberark" to access the CyberArk credential store, "jks" to access the Java keystore credential store, or "vault" to access the Vault credential store.
  • group - Group to which a user must belong before that user can access the credential. Only users that have execute permission on the pipeline and that belong to this group can validate, preview, or run the pipeline that retrieves the credential values.

    If working with Control Hub, specify the group using the required naming convention: <group ID>@<organization ID>. To grant access to all users, specify the default "all" group when working only with Data Collector or the default "all@<organization ID>" group when working with Control Hub.

  • credential name - Name of the credential value to retrieve from the credential store. Use the required format for the credential store:
    • CyberArk - Enter the credential name using the following format: "<safe><separator><folder><separator><object name><separator><element name>".

      Where <safe> is the CyberArk safe to read, <separator> is the separator defined in the $SDC_CONF/credential-stores.properties file, <folder> is the CyberArk folder to read, <object name> is the CyberArk object to read, and <element name> is the name for the value that you want returned.

    • Java keystore - Enter the credential name added to the Java keystore file using the jks-cs add command.
    • Vault - Enter the credential name using the following format: "<path><separator><key>".

      Where <path> is the path in Vault to read, <separator> is the separator defined in the $SDC_CONF/credential-stores.properties file, and <key> is the key for the value that you want returned.

Return type: String.
For example, the following expression returns the value of the OracleDBPassword credential defined in the Java keystore credential store and allows any user belonging to the devops group access to the credential when validating, previewing, or running the pipeline:
${credential:get("jks", "devops", "OracleDBPassword")}
credential:getWithOptions(<storeID>, <group>, <credential name>, <options>)
Returns the credential value from the credential store using additional options to communicate with the credential store. Not applicable for the Java keystore credential store.
For example, use this function with the CyberArk credential store to define the separator for the credential name within the function. Or use this function with the Vault credential store when you might need a delay of 10 seconds or more before the credentials can be used successfully.
Uses the following arguments:
  • storeId - Unique ID of the credential store to use. Enter "cyberark" to access the CyberArk credential store or "vault" to access the Vault credential store.
  • group - Group to which a user must belong before that user can access the credential. Only users that have execute permission on the pipeline and that belong to this group can validate, preview, or run the pipeline that retrieves the credential values.

    If working with Control Hub, specify the group using the required naming convention: <group ID>@<organization ID>. To grant access to all users, specify the default "all" group when working only with Data Collector or the default "all@<organization ID>" group when working with Control Hub.

  • credential name - Name of the credential value to retrieve from the credential store. Use the required format for the credential store:
    • CyberArk - Enter the credential name using the following format: "<safe><separator><folder><separator><object name><separator><element name>".

      Where <safe> is the CyberArk safe to read, <separator> is the separator defined in the $SDC_CONF/credential-stores.properties file, <folder> is the CyberArk folder to read, <object name> is the CyberArk object to read, and <element name> is the name for the value that you want returned.

    • Vault - Enter the credential name using the following format: "<path><separator><key>".

      Where <path> is the path in Vault to read, <separator> is the separator defined in the $SDC_CONF/credential-stores.properties file, and <key> is the key for the value that you want returned.

  • options - Additional options to communicate with the credential store. For Vault, you can use the delay option to enter a delay in milliseconds to allow time for external processing. Use the delay option when using the Vault AWS secret backend to generate AWS access credentials based on IAM policies. According to Vault documentation, you might need a delay of 10 seconds or more before the credentials can be used successfully.

    For CyberArk, you can use the following options:

    • separator - Separator to use for the credential name.
    • ConnectionTimeout - Connection timeout value.
    • FailRequestOnPasswordChange - Whether to fail the request on a password change, set to true or false.
    Use the following format to specify options:
    "<option1>=<value>,<option2>=<value>"
    For example, to use the pipe symbol (|) as the CyberArk separator, enter the following for the options argument:
    "separator=|"
    Or to set the Vault delay to 1,000 milliseconds, enter the following for the options argument:
    "delay=1000"
Return type: String.
For example, the following expression returns the value of the key password stored in the Vault path /secret/databases/oracle after waiting for a delay of 1,000 milliseconds. The credential name argument uses the default ampersand (&) as the separator. The expression allows any user belonging to the devops group access to the credential when validating, previewing, or running the pipeline:
${credential:getwithOptions("vault", "devops", "/secret/databases/oracle&password", "delay=1000")}

Data Drift Functions

Use data drift functions to create alerts when data drift occurs. You can use these functions in data drift rules.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following data drift functions:
drift:names(<field path>, <ignoreWhenMissing: true | false>)
Triggers an alert when key values change in a map or list-map field. When key values are field names, triggers an alert when field names change. Use with list-map or map fields.

Use the ignoreWhenMissing flag to determine the behavior when the field is missing. When set to "true", a missing field causes no errors. When set to "false", a missing field generates an alert for the record missing the field, and for the next record that includes the field.

For example, to trigger an alert when any field name in the record changes and ignore empty records, you can use the root field in the expression as follows:
${drift:names('/', true)}
drift:order(<field path>, <ignoreWhenMissing: true | false>)
Triggers an alert when the field order changes for fields within a list-map field. Use only with list-map fields.
Note that this function triggers alerts when field names change as well.

Use the ignoreWhenMissing flag to determine the behavior when the field is missing. When set to "true", a missing field causes no errors. When set to "false", a missing field generates an alert for the record missing the field, and for the next record that includes the field.

To alert on field order or name changes within a delimited record and ignore empty records, use the root field as follows:
${drift:order('/', true)}
drift:size(<field path>, <ignoreWhenMissing: true | false>)
Triggers an alert when the number of fields changes within a list, list-map, or map field. Use with list, list-map, or map fields.

Use the ignoreWhenMissing flag to determine the behavior when the field is missing. When set to "true", a missing field causes no errors. When set to "false", a missing field generates an alert for the record missing the field, and for the next record that includes the field.

To alert when the number of columns change within the Depts map field and when a record is missing the Depts field, use the following expression:
${drift:size('/Depts', false)}
drift:type(<field path>, <ignoreWhenMissing: true | false>)
Triggers an alert when the data type of the specified field changes. Evaluates only the specified field.

Use the ignoreWhenMissing flag to determine the behavior when the field is missing. When set to "true", a missing field causes no errors. When set to "false", a missing field generates an alert for the record missing the field, and for the next record that includes the field.

For example, to receive alerts when the data type of the ID field changes and when the ID field does not exist in the record, use the following expression:
${drift:type('/ID', false)}

Field Functions

You can use field functions in field path expressions that determine the set of fields that a processor uses. Each function is evaluated against a set of matching fields individually.

For more information about field path expressions, see Field Path Expressions. For a list of stages where you can use field functions, see Supported Stages.

f:attribute(<attribute name>)
Returns the value of the specified field attribute.
Uses the following argument:
  • attribute name - A string value that specifies a field attribute name.
Return type: String.
The following field path expression returns only the fields where the region attribute is set to "west":
/*[${f:attribute('region') == 'west'}]
f:path()
Returns the path of a field. Use to return the fields with a specified path or where the path is as defined in the expression.
Return type: String.
For example, the following field path expression returns all datetime fields except for the audit_date field:
/*[${f:type() == 'DATETIME' && f:path() != '/audit_date'}]
f:type()
Returns the data type of a field. Use to return the fields with a specified field type or where the field type is as defined in the expression.
Important: Use all caps when specifying a data type in an expression and enclose the string in quotation marks. For a list of valid datatypes, see Data Collector Data Types.
Return type: String.
For example, the following field path expression returns all boolean fields:
/*[${f:type() == 'BOOLEAN'}]
f:value()
Returns the value of the field. Use to return the fields with a specified value or where the value is as defined in the expression.
Return type: Dependent on the field type.
For example, the following field path expression returns only the fields where the value is greater than 0.
/*[${f:value() > 0}]

File Functions

Use file functions to return information about a filename or path. For example, you might use a file function to remove a file extension from a filepath or to return part of the path.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following file functions:
file:fileExtension(<filepath>)
Returns the file extension from a file path. Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression returns .txt:
${file:fileExtension('/logs/weblog.txt'}
file:fileName(<filepath>)
Returns the file name from a file path. Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression returns the filename, weblog.txt:
${file:fileName('/logs/weblog.txt')}
file:parentPath(<filepath>)
When used with a path to a file, returns the path to the file without the final separator, such as /files for /files/file.log.
When used with a path to a directory, returns the path to the directory without the final separator, such as /serverA/logs for /serverA/logs/2016.
Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression that includes a path to a file returns /serverB/logs:
${file:parentPath('/serverB/logs/weblog.txt')}
Similarly, the following expression that includes a path to a directory returns the parent directory, /serverB/logs:
${file:parentPath('/serverB/logs/weblogs')}
file:pathElement(<filepath>, <integer>)
Returns the part of a path based on the specified integer. Uses the following arguments:
  • filepath - An absolute path to a file.
  • integer - The section of a path to return. Can return parts starting from the left or right side of the path:
    • To return a section of a path, counting from the left side of the path, use 0 and positive integers and start with 0.
    • To return a section of a path, counting from the right side of the path, use negative integers and start with -1.
Return type: String.
For example, to return the logs portion of the path, you can use either of the following expressions:
${file:pathElement('/logs/weblog.txt',0}
${file:pathElement('/logs/weblog.txt',-2}
file:removeExtension(<filepath>)
Returns the file path without the file extension. Uses the following argument:
  • filepath - An absolute path to a file.
Return type: String.
For example, the following expression returns /logs/weblog:
${file:removeExtension('/logs/weblog.txt'}

Math Functions

Use math functions to perform math on numeric values.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

You can use the following data types with math functions:
  • Double
  • Float
  • Integer
  • Long
  • String
    Note: Data Collector parses a string as a Double value if the numeric value in a String field contains a dot ( . ). Otherwise, Data Collector parses the string as a Long value.
The expression language provides the following math functions:
math:abs(<number>)
Returns the absolute value, or positive version, of the argument. If the argument is already positive, returns the original number.
Return type: Double, Float, Int, or Long, based on the data type of the argument.
math:ceil(<number>)
Returns the smallest integer greater than or equal to the argument.
For example, ${math:ceil(8.0+3.6)} returns 12.0.
Return type: Double.
math:floor (<number>)
Returns the largest integer greater than or equal to the argument.
Return type: Double.
math:max(<number1>, <number2>)
Returns the greater of two arguments.
For example, if the value of the tip_amount field is 2.5 and the value of the fare_amount field is 12.5, then ${math:max(record:value("/tip_amount"), record:value("/fare_amount"))} returns 12.5.
Return type: Double, Float, Int, or Long, based on the data type of the argument.
math:min(<number1>, <number2>)
Returns the lesser of two arguments.
Return type: Double, Float, Int, or Long, based on the data type of the argument.
math:round(<number>)
Returns the closest number to the argument, rounding up for ties.
Return type: Double or Long.

Pipeline Functions

Use pipeline functions to determine information about a pipeline, such as the pipeline title or ID. The expression language provides the following pipeline functions:

pipeline:id()
Returns the ID of the pipeline. The ID is a UUID automatically generated when the pipeline is created and is used by Data Collector to identify the pipeline. The pipeline ID cannot be changed.
Return type: String.
For example, you can include the function in the directory template for the Hadoop FS destination to include the pipeline ID in the output directory name. The following expression creates output directories based on the pipeline ID and timestamp of a record:
/outputfiles/${pipeline:id()}/${YYYY()}-${MM()}-${DD()}-${hh()}
pipeline:name()
Like pipeline:id, this function returns the ID of the pipeline. The ID is a UUID automatically generated when the pipeline is created and is used by Data Collector to identify the pipeline. The pipeline ID cannot be changed.
In Data Collector versions earlier than 2.3.0.0, this function returned the pipeline name. This function is retained for backward compatibility. Use pipeline:id for new development.
Return type: String.
For example, you can include the function in the directory template for the Hadoop FS destination to include the pipeline ID in the output directory name. The following expression creates output directories based on the pipeline ID and timestamp of a record:
/outputfiles/${pipeline:name()}/${YYYY()}-${MM()}-${DD()}-${hh()}
pipeline:title()
Returns the title or name of the pipeline.
Return type: String.
pipeline:user()
Returns the user who started the pipeline.
Return type: String.
pipeline:version()
Returns the pipeline version when the pipeline has been published to StreamSets Control Hub. Returns "UNDEFINED" if the pipeline has not been published to Control Hub. Use this function when you have registered Data Collector to work with Control Hub.
Return type: String.

String Functions

Use string functions to transform string data.

You can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following string functions:

str:concat(<string1>, <string2>)
Concatenates two strings together.
Uses the following arguments:
  • string1 - The first string to concatenate.
  • string2 - The second string to concatenate.
Use the str:concat function to concatenate strings within an expression. For example, to concatenate the FirstName field with the LastName field and then check if that result matches a pattern, use the following expression:
${str:matches(str:concat(record:value("/FirstName") , record:value("/LastName")) , "[A-Z][a-z]+[A-Z][a-z]+")}
str:contains(<string>, <subset>)
Returns true or false based on whether the string contains the configured subset of characters.
Uses the following arguments:
  • string - The string to evaluate.
  • subset - The subset of characters to look for.
For example, ${str:contains("Jane", "boo")} returns: false.
str:endsWith(<string>, <subset>)
Returns true or false based on whether the string ends with the configured subset of characters.
Uses the following arguments:
  • string - The string to evaluate.
  • subset - The subset of characters to look for.
For example, ${str:endsWith("32403-1001", "1001")} returns: true.
str:escapeXML10(<string>)
Returns a string that you can embed in an XML 1.0 or 1.1 document.
Uses the following argument:
  • string - The string to escape.
Return type: String.
For example, you can use the following expression to generate XML-safe string data based on the XML in the XMLdata field: ${str:escapeXML10(record:value('/XMLdata'))}
str:escapeXML11(<string>)
Returns a string that you can embed in an XML 1.1 document.
Uses the following argument:
  • string - The string to escape.
Return type: String.
For example, you can use the following expression to generate XML-safe string data based on the XML in the XMLdata field: ${str:escapeXML11(record:value('/XMLdata'))}
str:indexOf(<string>, <subset>)
Returns the index within a string of the first occurrence of the specified subset of characters.
Return type: Integer.
Uses the following arguments:
  • string - The string to return the index of.
  • subset - The subset of characters to look for.
For example, ${str:indexOf("tomorrow", "row")} returns: 5.
str:isNullOrEmpty(<string>)
Returns true or false based on whether a string is null or is the empty string.
Uses the following argument:
  • string - The string to evaluate.
For example, ${str:isNullOrEmpty(record:value("/work_phone"))} returns true if the work_phone field is the empty string.
str:length(<string>)
Returns the length of a string.
Return type: Integer.
Uses the following argument:
  • string - The string to return the length for.
For example, ${str:length("tomorrow")} returns: 8.
str:matches(<string>, <regEx>)
Returns true or false based on whether a string matches a Java regex pattern.
Uses the following arguments:
  • string - The string to evaluate.
  • regEx - Regular expression that describes the pattern of the string.
For example, ${str:matches(record:value("/HostName"), ".*\.com$")} checks if the string in the HostName field ends with the characters ".com".
str:regExCapture(<string>, <regEx>, <group>)
Parses a complex string into groups based on a Java regex pattern and returns the specified group.
Uses the following arguments:
  • string - The string that contains a pattern of characters.
  • regEx - Regular expression that describes the pattern of the string, separating it into groups. Use the backslash as an escape character for special characters in the expression. For example, to represent a digit in the expression with the characters \d, use \\d.
  • group - The number of the group to return, where 1 represents the first group, 2 represents the second group, etc. 0 returns the entire string.

For example, to extract the month and day from the Date field that uses the format 'mm-dd-yyyy', use the following function: ${str:regExCapture(record:value('/Date'), '(\\w{2}\\-\\w{2})-\\w*', 1)}

(\\w{2}\\-\\w{2})-\\w*', 1)

str:replace(<string>, <oldChar>, <newChar>)
Replaces all instances of a specified character in a string with a new character.
Uses the following arguments:
  • string - The string for character replacement.
  • oldChar - Character to replace.
  • newChar - Character to use for replacement.
For example, ${str:replace("lecucereche", "e", "a")} returns: lacucaracha.
str:replaceAll(<string>, <regEx>, <newString>)
Replaces a set of characters in a string with a new set of characters.
Uses the following arguments:
  • string - The string for the replacement of a group of characters.
  • regEx - A regular expression that describes the string to replace.
  • newString - The set of characters to use for replacement.

For example, ${str:replaceAll("shoes(*)","footwear")} replaces all strings that start with "shoes" with "footwear".

str:splitKV(<string>, <pairSeparator>, <keyValueSeparator>)
Splits key-value pairs in a string into a map of string values.
Uses the following arguments:
  • string - The string containing the key-value pairs.
  • pairSeparator - The set of characters that separate the key-value pairs.
  • keyValueSeparator - The set of characters that separate each key and value.
For example, let's say that a record contains a "dimensions" string field that records the height, width, and length as follows: "height=32 width=24 length=36". The function ${str:splitKV(record:value("/dimensions"), " ", "=")} returns the following map of string values:
"dimensions": {
      "height": "32",
      "width": "24",
      "length": "36"
    }
str:startsWith(<string>, <subset>)
Returns true or false based on whether the string starts with the configured subset of characters.
Uses the following arguments:
  • string - The string to evaluate.
  • subset - The subset of characters to look for.
For example, ${str:startsWith("Data Collector", "Data")} returns: true.
str:substring(<string>, <beginIndex>, <endIndex>)
Returns a subset of the string value that starts with the beginIndex character and ends one character before the endIndex.
Uses the following arguments:
  • string - The string that contains the return substring that you want.
  • beginIndex - An integer that represents the beginning position of the returned substring. Start the count from the left with 0.
  • endIndex - An integer that represents one position past the end of the substring.
For example, ${str:substring("Chewing Gum", 0, 5)} returns: Chew.
str:toLower(<string>)
Converts string data to all lowercase letters.
Uses the following argument:
  • string - The string to lower case.
For example, ${str:toLower("FALSE")} returns: false.
str:toUpper(<string>)
Converts string data to all capital letters.
Uses the following argument:
  • string - The string to capitalize.
For example, ${str:toUpper("true")} returns: TRUE.
str:trim(<string>)
Trims leading and trailing white space characters from a string, including spaces and return characters.
Uses the following argument:
  • string - The string to return without additional white space characters.
For example, ${str:trim(record:value("/Address"))} trims leading and trailing white space characters from the Address field.
str:truncate(<string>, <length>)
Returns a string truncated to the specified length. Use an integer to specify the length.
Uses the following arguments:
  • string - The string to truncate.
  • length - An integer that represents the number of characters to keep.
For example, when a phone number field includes string numbers such as 415-555-5555, ${str:truncate(record:value('/phone'),3)} returns the area code of the phone number.
str:unescapeJava(<string>)
Returns an unescaped string from a string with special Java characters. Use to include binary or non-printable characters in any location where you can enter an expression.
Uses the following argument:
  • string - The string to process.
str:unescapeXML(<string>)
Returns an unescaped string from a string that had XML data escaped.
Uses the following argument:
  • string - The string that includes escaped XML data.
For example, say you have escaped XML data embedded within an XML document in an additionalXML field. The following expression unescapes the XML: ${str:unescapeXML(record:value('/additionalXML'))}
str:urlDecode(<URL>, <charset>)
Converts characters from a URL to the specified character set, such as UTF-8.
Uses the following arguments:
Return value: String.
For example, to convert a URL to UTF-8, use the following expression:
${str:urlDecode(record:value('/URL'), 'UTF-8')}
str:urlEncode(<infoforURL>, <charset>)
Converts invalid characters to help create a valid URL based on the specified character set, such as UTF-8. You might use this function when using record data to add additional information, like a fragment, to a URL.
Uses the following arguments:
Return value: String.
For example, to encode a search result for use in a URL, converting spaces to valid characters, you might use the following expression:
${str:urlEncode(record:value('/result'), 'UTF-8')}

Time Functions

Use time functions to return the current time or to transform datetime data.

You can replace any datetime argument with an expression that evaluates to a datetime value. You cannot replace a datetime argument with a datetime literal.

You can replace any long or string argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following time functions:

time:createDateFromStringTZ(<string>, <time zone>, <date format>)
Creates a Date object based on a datetime in a String field and using the specified time zone. The datetime string should not include the time zone.
Uses the following arguments:
  • string - String with datetime values, not including the time zone.
  • time zone - Time zone associated with the datetime values. The time zone is used when creating the Date object.
    You can use the following time zone formats:
    • <area>/<location> - For a list of valid time zones in this format, see https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/timezone.html.
    • Numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00. Note that numeric-only time zones such as -500 are not supported.
    • Short time zone IDs such as EST and CST - These time zones should generally be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.
  • date format - The date format used by the string data. For information about date formats, see https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
For example, the following expression converts string values in the Datetime field to a Date object using the time zone specified in the TZ field. It assumes the date is in the format, "03/30/2016 12:15:33 AM":
${time:createDateFromStringTZ(record:value('/Datetime'), 'record:value('/TZ')', 'MM-dd-yyyy hh:mm:ss a'}
time:dateTimeZoneOffset(<Date object>, <time zone>)

Returns the time zone offset in milliseconds for the specified date and time zone. The time zone offset is the difference in hours and minutes from Coordinated Universal Time (UTC).

Uses the following arguments:
  • Date object - Date object to use.
  • time zone - Time zone associated with the Date object.
    You can use the following time zone formats:
    • <area>/<location> - For a list of valid time zones in this format, see https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/timezone.html.
    • Numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00. Note that numeric-only time zones such as -500 are not supported.
    • Short time zone IDs such as EST and CST - These time zones should generally be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.
For example, the following expression returns the time zone offset of the Date object stored in the Date field using the time zone specified in the TZ field:
${time:dateTimeZoneOffset(record:value('/Date'), 'record:value('/TZ')'}

Return type: Long.

time:extractDateFromString(<string>, <format string>)

Extracts a Date object from a String, based on the specified date format.

Uses the following arguments:
For example, the following expression converts the string '2017-05-01 20:15:30.915' to a Date object:
${time:extractDateFromString('2017-05-01 20:15:30.915','yyyy-MM-dd HH:mm:ss.SSS')}
Return type: Date object.
time:extractLongFromDate(<Date object>, <format string>)
Extracts a long value from a Date object, based on the specified date format.
Uses the following arguments:
For example, the following expression returns the month as a long value from the Date object stored in the Timestamp field:
${time:extractLongFromDate(record:value('/Timestamp')), "MM")}
Return type: Long.
Note: Because the function returns a long value, you cannot specify non-numeric data in the data format string. For example, the date format "MMM" returns the three character abbreviation for the month (such as "Jul"), which causes the function to return incorrect results.
time:extractStringFromDate(<Date object>, <format string>)
Extracts a string value from a Date object based on the specified date format.
Uses the following arguments:
For example, the following expression returns the month as a string value from the Date object stored in the Timestamp field:
${time:extractStringFromDate(record:value('/Timestamp'), "MMM")}
Return type: String.
time:extractStringFromDateTZ(<Date object>, <time zone>, <format string>)
Extracts a string value from a Date object, converting the GMT time in the Date object to the specified date format and time zone. The function adjusts for daylight savings when given the time zone in the appropriate format.
Uses the following arguments:
  • Date object - Date object to use.
  • time zone - Time zone to use in the conversion.

    To convert the time zone and adjust for daylight savings, use the <Area>/<Location> format, such as America/Mexico_City. See the following link for a list of valid time zones in this format: https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/timezone.html.

    Short time zone IDs such as EST and CST return data, but note that these time zones do not adjust for daylight savings, and should be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.

    You can use numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00, but these time zones also do not account for daylight savings. Note that numeric-only time zones such as -500 are not supported.

  • format string - String that specifies the format to use to express the date. For information about creating a date format, see https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDateFormat.html.
Return type: String.
Returns an empty string when the time zone is not specified or invalid.
For example, the following expression converts a Date object to a string date while adjusting to the America/New_York time zone and daylight savings. It expresses the date in the format, "03/30/2016 12:15:33 AM":
${time:extractStringFromDateTZ(record:value('/datetime'), 'America/New_York', 'MM-dd-yyyy hh:mm:ss a')}
time:millisecondsToDateTime(<long>)
Converts an epoch or UNIX time in milliseconds to a Date object.
If the epoch or UNIX time is in seconds, multiply the value by 1000 to produce a value in the milliseconds range. For example, the following expression multiplies the epoch time in seconds stored in the epochTime field and then converts the value to a Date object:
${millisecondsToDateTime(record:value("/epochTime")* 1000)}
Return type: Date object.
time:now()
Returns the current time of the Data Collector machine as a java.util.Date object.
For example, you might use the function in the Expression Evaluator to create a field with the time of processing.
Return type: Datetime.
time:timeZoneOffset(<time zone>)

Returns the time zone offset in milliseconds for the specified time zone. The time zone offset is the difference in hours and minutes from Coordinated Universal Time (UTC).

Uses the following argument:
  • time zone - Time zone to use.
    You can use the following time zone formats:
    • <area>/<location> - For a list of valid time zones in this format, see https://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/timezone.html.
    • Numeric time zones with the GMT prefix, such as GMT-0500 or GMT-8:00. Note that numeric-only time zones such as -500 are not supported.
    • Short time zone IDs such as EST and CST - These time zones should generally be avoided because they can stand for multiple time zones, e.g. CST stands for both Central Standard Time and China Standard Time.
For example, the following expression returns the time zone offset using the time zone specified in the TZ field:
${time:timeZoneOffset('record:value('/TZ')'}

Return type: Long.

time:trimDate(<datetime>)
Trims the date portion of a datetime value by setting the date portion to January 1, 1970.
For example, if the current time of the Data Collector machine is Jul 25, 2016 5:18:05 PM, then ${time.trimDate(time:now())} returns: Jan 1, 1970 5:18:05 PM.
Return type: Datetime.
time:trimTime(<datetime>)
Trims the time portion of a datetime value by setting the time portion to 00:00:00.
For example, if the value of the pickup_datetime field is Jul 22, 2016 11:16:10 AM, then ${time:trimTime(record:value("/pickup_datetime"))} returns: Jul 22, 2016 12:00:00 AM.
Return type: Datetime.

Miscellaneous Functions

In miscellaneous functions, you can replace any argument with a literal or an expression that evaluates to the argument. String literals must be enclosed in single or double quotation marks.

The expression language provides the following miscellaneous functions:

alert:info()
Returns information about the trigger for a data drift alert. Use only in alert text for data drift alerts.
emptyList()
Creates an empty list.
emptyMap()
Creates an empty map.
every(<interval>, < hh() | mm() | ss() >)
Represents the interval of hours, minutes, or seconds for generating output directories for the Hadoop FS, Local FS, or MapR FS destination.
When used, a destination generates output directories for the specified interval beginning on the hour. For example, when generating directories every 30 minutes, it generates a directory on the hour and on the half-hour.
You can use the function once in the Directory Template property to replace the hour, minute, or second datetime variables.
Use the function to replace the smallest time interval in the directory template.
Note: Destinations generate a directory for the smallest unit of measure by default, so do not use the every function to generate a directory every hour, minute, or second. For more information, see Directory Templates.
Uses the following arguments:
  • interval - An integer factor or submultiple of 60 that represents the interval of minutes or seconds to wait between directory generation. Use one of the following values: 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, or 30.
  • < hh() | mm() | ss() > - Use hh() for hours, mm() for minutes, and ss() for seconds.
For example, the following Directory Template generates a new directory every twelve hours, beginning on the hour:
/outputfiles/${YY()}-${MM()}-${DD()}-${every(12,hh())}
The following Directory Template generates a new directory every fifteen minutes, beginning on the hour:
/outputfiles/${YY()}-${MM()}-${DD()}-${hh()}-${every(15,mm())}
The following Directory Template generates a new directory every 30 seconds, beginning on the hour:
/outputfiles/${YY()}-${MM()}-${DD()}-${hh()}-${mm()}-${every(30,ss())}
field:field()
Returns the field name. Available only in the Decimal field expression properties of the Hive Metadata processor.
Return type: String.
jvm:maxMemoryMB()
Returns the Java heap size allocated to the Data Collector in MB. You can use this function in an expression to specify the maximum amount of memory a pipeline can use.
For example, since 65% of the Java heap size is the recommended maximum, the following expression is the default memory limit for a pipeline:
${jvm:maxMemoryMB() * 0.65}
list:join(<list field>, <separator>)
Merges elements in a List field into a String field, using the specified separator between elements.
Uses the following arguments:
  • list field - The List field that you want to merge.
  • separator - The string to use to separate the elements in the merged field.
For example, to merge the list in a colors field using a semicolon as a separator character, you can use the following expression:
${list:join(record:value('/colors'), ";")}
And if the list field includes "red", "blue", and "yellow" as elements, the expression produces the following string data:
red;blue;yellow
Return type: String.
list:joinSkipNulls(<list field>, <separator>)
Merges elements in a List field into a String field, using the specified separator between elements and skipping null values.
Uses the following arguments:
  • list field - The List field that you want to merge.
  • separator - The string to use to separate the elements in the merged field.
For example, say you use the following expression to merge the list in the colors field:
${list:join(record:value('/colors'), ";")}
And if the list field includes "red", "blue, null, "yellow", the expression ignores the null value and produces the following string data:
red;blue;yellow
Return type: String.
offset:column(<position>)
Returns the value of the positioned offset column for the current table. Available only in the additional offset column conditions of the JDBC Multitable Consumer origin.
Uses the following argument:
  • position - Position of the offset column. For example, enter 0 for the first offset column defined in the table configuration. Enter 1 for the second defined offset column.
Return type: String.
runtime:availableProcessors()

Returns the number of processors available to the Java virtual machine. You can use this function when you want to configure multithreaded processing based on the number of processors available to Data Collector.

Return type: Integer.

runtime:conf(<runtime property>)
Returns the value for the specified runtime configuration. Use to call a runtime property.
Uses the following argument:
  • runtime configuration name - Name of the configuration property to use. The property must defined in the sdc.properties file or in a runtime configuration file specified in the sdc.properties file.
For more information, see Using Runtime Properties.
runtime:loadResource(<file name>, <restricted: true | false>)
Returns the value in the specified file, trimming any leading or trailing whitespace characters from the file. Use to call a runtime resource.
Uses the following arguments:
  • file name - Name of the file that contains the information to be loaded. The file must reside in the $SDC_RESOURCES directory:
  • restricted - Whether the file has restricted permissions. If set to true, the file must be owned by the system user who runs the Data Collector and read and writable only by the owner.
For example, the following expression returns the contents of the restricted JDBCpassword.txt file, trimming any leading or trailing whitespace characters:
${runtime:loadResource("JDBCpassword.txt", true)}
For more information about runtime resources, see Using Runtime Resources.
runtime:loadResourceRaw(<file name>, <restricted: true | false>)
Returns the entire contents in the specified file, including any leading or trailing whitespace characters in the file. Use to call a runtime resource.
Uses the following arguments:
  • file name - Name of the file that contains the information to be loaded. The file must reside in the $SDC_RESOURCES directory.
  • restricted - Whether the file has restricted permissions. If set to true, the file must be owned by the system user who runs the Data Collector and read and writable only by the owner.
For example, the following expression returns the entire contents of the restricted JDBCpassword.txt file, including any leading or trailing whitespace characters:
${runtime:loadResourceRaw("JDBCpassword.txt", true)}
For more information about runtime resources, see Using Runtime Resources.
sdc:hostname()
Returns the host name of the Data Collector machine.
For example, you might use the function in the directory template for the Hadoop FS destination to write to a directory that includes the Data Collector host name.
Return type: String.
sdc:id()
Returns the Data Collector ID.

For a pipeline that runs in standalone execution mode, the ID is a unique identifier associated with the Data Collector, such as 58efbb7c-faf4-4d8e-a056-f38667e325d0. The ID is stored in the following file: $SDC_DATA/sdc.id.

For a pipeline that runs in cluster mode, the ID is the Data Collector worker partition ID generated by a cluster application, such as Spark or MapReduce.

uuid:uuid()
Returns a randomly generated UUID. For example, you might use the function in an Expression Evaluator processor to generate a UUID for an ID field added to each record.
The uuid:uuid() function uses a lot of entropy on Linux systems and can cause your entropy pools to run dry. When this happens, your pipelines slow to a halt but continue to run. Throughput effectively goes to zero while the system waits for entropy to again become available. As a best practice, we recommend running the haveged daemon on any Data Collector node where the uuid:uuid() function is used. The haveged daemon regenerates your entropy pools.
Return type: String.
vault:read(<path>, <key>)
Returns the value for the key on the specified path. You can use the function in username, password, and similar properties such as AWS access key IDs and secret access keys. You can also use the function in HTTP headers and bodies when using HTTPS.
Important: This function is now deprecated and will be removed in a future release. We recommend using the credential functions available with the Vault credential store integration in pipelines that include JDBC stages.
Return type: String.
Uses the following arguments:
  • path - The path in Vault to read.
  • key - The key for the value that you want returned.
vault:readWithDelay(<path>, <key>, <delay>)
Returns the value for the key on the specified path after waiting the specified amount of time. Use when you want a delayed response to allow time for external processing.
Important: This function is now deprecated and will be removed in a future release. We recommend using the credential functions available with the Vault credential store integration in pipelines that include JDBC stages.

You can use the function in username, password, and similar properties such as AWS access key IDs and secret access keys. You can also use the function in HTTP headers and bodies when using HTTPS.

Return type: String.
Uses the following arguments:
  • path - The path in Vault to read.
  • key - The key for the value that you want returned.
  • delay - Milliseconds to wait before returning the value.