Customization with Environment Variables

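Transformer includes several environment variables that you can modify to customize the following areas: Transformer directories, the user and group for service start, Java configuration options, the security manager, and the root classloader.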

Modifying Environment Variables

The method that you use to modify environment variables depends on the Transformer installation type:
Tarball installation started manually from the command line
When you start Transformer manually from the command line on any operating system, edit the $TRANSFORMER_DIST/libexec/transformer-env.sh file to modify environment variables.

Use a text editor to edit the transformer-env.sh file. Some of the environment variables in the file are commented out and do not reflect the default values. Be sure to uncomment the line when you change a variable value.

After you edit the file, restart Transformer from the command prompt to enable the changes.

Note: Do not restart Transformer from the user interface after modifying environment variables.
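
As a rough sketch of the edit described above, the following commands uncomment a variable and change its value in a scratch copy of the file. The file excerpt and the /data/transformer/log path are assumptions made for illustration; the real transformer-env.sh may be laid out differently.

```shell
#!/bin/sh
# Work on a scratch file so the sketch never touches a real installation.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
# Hypothetical excerpt of transformer-env.sh
#export TRANSFORMER_LOG=/var/log/transformer
EOF

# Uncomment the line and set a new value in one pass.
# The replacement path is illustrative only.
sed 's|^#export TRANSFORMER_LOG=.*|export TRANSFORMER_LOG=/data/transformer/log|' \
  "$env_file" > "$env_file.new" && mv "$env_file.new" "$env_file"

# List the active (uncommented) export lines.
grep '^export ' "$env_file"
```

A line that still starts with # after the edit remains inactive, which is the usual reason a changed value appears to be ignored.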

Transformer Directories

Transformer includes environment variables that define the directories used to store configuration, data, log, and resource files.

The TRANSFORMER_DIST environment variable defines the Transformer runtime directory. The runtime directory is the base Transformer directory that stores the executables and related files. This environment variable is set during installation.

When you start Transformer manually, the default values of the remaining directory variables are relative to the $TRANSFORMER_DIST runtime directory. When you start Transformer as a service, the default values of the remaining directory variables are absolute paths that are outside of the $TRANSFORMER_DIST runtime directory.

Modify environment variables using the method required by your installation type.

You can configure the following environment variables that define directories:

Environment Variable Description
TRANSFORMER_CONF

Defines the configuration directory for the Transformer configuration file, transformer.properties, and related realm properties files and keystore files. Also includes the log4j properties file.

Default values:

  • Manual start: $TRANSFORMER_DIST/etc
  • Service start: /etc/transformer
TRANSFORMER_DATA

Defines the data directory for pipeline configuration and run details.

Default values:

  • Manual start: $TRANSFORMER_DIST/data
  • Service start: /var/lib/transformer
TRANSFORMER_LOG

Defines the log directory.

Default values:

  • Manual start: $TRANSFORMER_DIST/log
  • Service start: /var/log/transformer
TRANSFORMER_RESOURCES

Defines the directory for runtime resource files.

Default values:

  • Manual start: $TRANSFORMER_DIST/resources
  • Service start: /var/lib/transformer-resources
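
To illustrate how the manual-start defaults resolve, the following sketch falls back to the documented paths whenever a directory variable is unset. The /opt/transformer location and the lowercase variable names are assumptions made for this example and are not part of Transformer.

```shell
#!/bin/sh
# Hypothetical install location, used only if TRANSFORMER_DIST is not already set.
TRANSFORMER_DIST="${TRANSFORMER_DIST:-/opt/transformer}"

# Fall back to the documented manual-start default when a variable is unset or empty.
conf_dir="${TRANSFORMER_CONF:-$TRANSFORMER_DIST/etc}"
data_dir="${TRANSFORMER_DATA:-$TRANSFORMER_DIST/data}"
log_dir="${TRANSFORMER_LOG:-$TRANSFORMER_DIST/log}"
resources_dir="${TRANSFORMER_RESOURCES:-$TRANSFORMER_DIST/resources}"

printf 'conf=%s\ndata=%s\nlog=%s\nresources=%s\n' \
  "$conf_dir" "$data_dir" "$log_dir" "$resources_dir"
```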

User and Group for Service Start

When you run Transformer as a service, Transformer runs as the system user account and group defined in environment variables. The default system user and group are named transformer.

You can modify the values of the environment variables to point to another system user or group. Modify environment variables using the method required by your installation type.

If you change the system user, you must make the new system user the owner of all Transformer directories:
  • $TRANSFORMER_DIST
  • $TRANSFORMER_CONF
  • $TRANSFORMER_DATA
  • $TRANSFORMER_LOG
  • $TRANSFORMER_RESOURCES
For example, if you change the system user and group to myuser, use the following command to change the owner of the configuration directory, $TRANSFORMER_CONF, and all files in the directory to myuser:myuser:
chown -R myuser:myuser /etc/transformer
Note: When you run Transformer manually, Transformer runs as the system user account logged into the command prompt when the launch command is run.
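
The ownership change shown for the configuration directory applies to all five directories. The following sketch prints one chown command per directory without running anything, so the list can be reviewed first. The print_chown_commands helper and the /opt/transformer value for $TRANSFORMER_DIST are hypothetical; the other paths are the service-start defaults listed earlier.

```shell
#!/bin/sh
# Print, without running, one chown command per directory.
# $1 is the new owner in user:group form; the remaining arguments are directories.
print_chown_commands() {
  owner="$1"
  shift
  for dir in "$@"; do
    printf 'chown -R %s %s\n' "$owner" "$dir"
  done
}

# /opt/transformer is an assumed runtime directory; the rest are the
# documented service-start defaults.
print_chown_commands myuser:myuser \
  /opt/transformer \
  /etc/transformer \
  /var/lib/transformer \
  /var/log/transformer \
  /var/lib/transformer-resources
```

After checking the output, the printed commands can be run with the privileges needed to change ownership.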

Java Configuration Options

You define the Java configuration options used by Transformer based on your Transformer installation:
Tarball installation

Define Java configuration options in the TRANSFORMER_JAVA_OPTS environment variable.

When defining Java configuration options, avoid defining duplicate options. If you do define duplicates, the last option passed to the JVM usually takes precedence.
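
One way to spot duplicates before starting Transformer is to count how often an option prefix occurs in the options string. The count_option helper below is a hypothetical convenience and is not shipped with Transformer.

```shell
#!/bin/sh
# Count the options in a space-separated string that start with a given prefix.
# $1 is the prefix, for example -Xmx; $2 is the options string.
count_option() {
  count=0
  # Word splitting on $2 is intentional: each option becomes one loop item.
  for opt in $2; do
    case "$opt" in
      "$1"*) count=$((count + 1)) ;;
    esac
  done
  echo "$count"
}

# Sample options string containing a duplicate -Xmx setting.
opts="-Xmx1024m -server -Xmx2048m"
if [ "$(count_option -Xmx "$opts")" -gt 1 ]; then
  echo "duplicate -Xmx options found in: $opts"
fi
```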

Java Heap Size

Increase or decrease the Transformer Java heap size as necessary, based on the resources available on the host machine. By default, the Java heap size is 1024 MB.

Use the following Java options to define the Java heap size:
  • -Xmx - Defines the maximum heap size.
  • -Xms - Defines the minimum heap size.
Tip: To avoid constant recalculation of the allocated heap size, set both options to the same value. To define the unit of measure, use m for MB and g for GB.

Define the heap size based on your installation:

Tarball installation

Define the heap size in the TRANSFORMER_JAVA_OPTS environment variable.

For example, to double the heap size, increase the Xmx and Xms settings as follows:

export TRANSFORMER_JAVA_OPTS="${TRANSFORMER_JAVA_OPTS} -Xmx2048m -Xms2048m -server"
Modify environment variables using the method required by your installation type.
With a heap size of 2048 MB, you can configure a pipeline to use up to 65% of the heap, which is approximately 1331 MB of memory.
Note: In the pipeline properties, you can use the jvm:maxMemoryMB() function to help define the percentage of the heap size the pipeline uses.
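
The arithmetic behind the 65% figure can be checked with shell integer math. This is a minimal sketch; the heap_mb, heap_opts, and pipeline_limit_mb names are chosen for the example only.

```shell
#!/bin/sh
# Heap size in MB, matching the example above.
heap_mb=2048

# Use the same value for the maximum and minimum heap size, as the tip suggests.
heap_opts="-Xmx${heap_mb}m -Xms${heap_mb}m"

# 65% of the heap, rounded down by integer division.
pipeline_limit_mb=$(( heap_mb * 65 / 100 ))

echo "$heap_opts"
echo "$pipeline_limit_mb"
```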

Remote Debugging

You can enable remote debugging to debug a Transformer instance running on a remote machine.

Enable remote debugging based on your installation:
Tarball installation

Define debugging options in the TRANSFORMER_JAVA_OPTS environment variable.

Add the following debugging options to the environment variable, where port_number is an open port number on the remote machine running Transformer:
-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=<port_number>,suspend=n
For example, to debug Transformer on a remote machine using port number 2005, define TRANSFORMER_JAVA_OPTS as follows:
export TRANSFORMER_JAVA_OPTS="${TRANSFORMER_JAVA_OPTS} -Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=2005,suspend=n"
Modify environment variables using the method required by your installation type.
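
Because a mistyped port only shows up when the debugger fails to attach, it can help to validate the port before building the option string. The jdwp_opts helper below is a hypothetical sketch that wraps the options shown above.

```shell
#!/bin/sh
# Build the debugging options for a given port, rejecting invalid values.
jdwp_opts() {
  port="$1"
  case "$port" in
    ''|*[!0-9]*)
      echo "port must be a number: $port" >&2
      return 1
      ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  printf '%s\n' "-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=${port},suspend=n"
}

# Append the debugging options to any options that are already defined.
export TRANSFORMER_JAVA_OPTS="${TRANSFORMER_JAVA_OPTS} $(jdwp_opts 2005)"
echo "$TRANSFORMER_JAVA_OPTS"
```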

Garbage Collection

You can define the Java garbage collector that Transformer uses. By default, Transformer uses the Concurrent Mark Sweep (CMS) garbage collector.

For example, if you configure Transformer to use a large heap size, you might want to use the G1 garbage collector. If you define another garbage collector, test and evaluate Transformer performance before making the same change in a production environment. Garbage collector performance depends on each particular use case.

Define the garbage collector based on your installation:

Tarball installation
Define the garbage collector in the TRANSFORMER_JAVA_OPTS environment variable.

For example, the default garbage collector is defined as follows:

export TRANSFORMER_JAVA_OPTS=${TRANSFORMER_JAVA_OPTS:-"-XX:+UseConcMarkSweepGC -XX:+UseParNewGC"}

To use the G1 garbage collector, set the option as follows:

export TRANSFORMER_JAVA_OPTS=${TRANSFORMER_JAVA_OPTS:-"-XX:+UseG1GC"}
Modify environment variables using the method required by your installation type.
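
The examples above use the shell's ${VAR:-default} expansion, which supplies the default only when the variable is unset or empty. The short sketch below illustrates that behavior with a scratch variable, JAVA_OPTS_DEMO, a hypothetical name chosen so that the real TRANSFORMER_JAVA_OPTS is left alone.

```shell
#!/bin/sh
# Case 1: the variable is unset, so the default value is applied.
unset JAVA_OPTS_DEMO
JAVA_OPTS_DEMO=${JAVA_OPTS_DEMO:-"-XX:+UseG1GC"}
unset_result="$JAVA_OPTS_DEMO"

# Case 2: the variable already has a value, so the default is skipped.
JAVA_OPTS_DEMO="-Xmx2048m -Xms2048m"
JAVA_OPTS_DEMO=${JAVA_OPTS_DEMO:-"-XX:+UseG1GC"}
preset_result="$JAVA_OPTS_DEMO"

echo "unset:  $unset_result"
echo "preset: $preset_result"
```

In practice, this means a garbage collector defined this way might not take effect if TRANSFORMER_JAVA_OPTS was already set earlier in the same environment.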

Logging

Transformer enables garbage collector logging by default to facilitate troubleshooting. Log files are written to $TRANSFORMER_LOG/gc.log. You can disable logging.

Disable garbage collector logging based on your installation:

Tarball installation
Set the TRANSFORMER_GC_LOGGING environment variable to false. For example:
export TRANSFORMER_GC_LOGGING=false
Modify environment variables using the method required by your installation type.

Heap Dump Creation

By default, when Transformer encounters an out of memory error (OOME), it creates a heap dump.

By default, heap dump files are written to the directory defined in the TRANSFORMER_LOG environment variable and use a naming convention that allows generating multiple heap dump files, as follows: $TRANSFORMER_LOG/transformer_heapdump_${timestamp}.hprof.

You can change the name of the heap dump files, but we recommend using the ${timestamp} or similar variable to ensure that the heap dump name is unique.

Note that the Java Virtual Machine, and therefore Transformer, does not overwrite existing heap dump files. For example, if you use $TRANSFORMER_LOG/transformer_heapdump.hprof as the file name, after Transformer creates the first heap dump file, it will not create another until you remove the existing file.

Note: Depending on the number and size of the generated heap dump files, you might want to increase the Transformer Java heap size.
You can configure the following heap dump environment variables:
Heap Dump Environment Variable Description
TRANSFORMER_HEAPDUMP_ON_OOM

Specifies whether Transformer generates a heap dump upon encountering an out of memory error.

Default is true.

TRANSFORMER_HEAPDUMP_PATH

Specifies the file name and location to use for heap dump files.

By default, heap dumps are written to $TRANSFORMER_LOG/transformer_heapdump_${timestamp}.hprof.

To specify a different file name or location, uncomment the variable and enter the location and file name to use.

Tip: To write multiple heap dump files to a directory, use a function or variable to ensure that the file name is unique. If a file of the same name exists in the directory, Transformer does not create a new heap dump file.

Modify environment variables using the method required by your installation type.
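
One possible way to keep a custom heap dump name unique is to generate the timestamp with the date command. This is a sketch under assumptions: it does not rely on the ${timestamp} variable, whose definition is not shown here, and the log_dir, stamp, and heapdump_path names are hypothetical.

```shell
#!/bin/sh
# Use the configured log directory, or the service-start default if it is unset.
log_dir="${TRANSFORMER_LOG:-/var/log/transformer}"

# Timestamp with one-second resolution, for example 20240131235959.
stamp=$(date +%Y%m%d%H%M%S)
heapdump_path="$log_dir/transformer_heapdump_${stamp}.hprof"

# The JVM does not overwrite an existing heap dump, so warn on a name collision.
if [ -e "$heapdump_path" ]; then
  echo "warning: $heapdump_path already exists" >&2
fi

export TRANSFORMER_HEAPDUMP_PATH="$heapdump_path"
echo "$TRANSFORMER_HEAPDUMP_PATH"
```

Note that the timestamp here is fixed when the variable is defined, so the name is unique per start rather than per heap dump.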

Security Manager

Transformer includes a Java Security Manager that is enabled by default. For enhanced security, you can enable the Transformer Security Manager, which prevents stages from accessing files in protected Transformer directories.

Transformer can use one of the following security managers:
Java Security Manager

By default, Transformer uses the Java Security Manager. The Java Security Manager restricts the runtime permissions of user libraries. This allows administrators to control the actions of user libraries on production systems. For example, by default, user libraries cannot call out to network resources and potentially cause denial-of-service (DoS) attacks.

The security policy is defined in the $TRANSFORMER_CONF/transformer-security.policy file. The file uses the standard Java policy file syntax.

Transformer Security Manager
For enhanced security, enable the Transformer Security Manager. The Transformer Security Manager prevents stages from accessing files in protected Transformer directories, regardless of how the transformer-security.policy file is defined.
To enable the Transformer Security Manager, uncomment the security_manager.transformer_manager.enable property in the Transformer configuration file, $TRANSFORMER_CONF/transformer.properties.
Note: If you use an older JVM version, the Transformer Security Manager might encounter some known JVM issues.

If needed, you can configure Transformer to use neither security manager by setting the TRANSFORMER_SECURITY_MANAGER_ENABLED environment variable to false.

Modify environment variables using the method required by your installation type.

Protected Directories

When the Transformer Security Manager is enabled, the following Transformer directories are protected directories:
  • $TRANSFORMER_CONF - Stages cannot access files in the configuration directory.
  • $TRANSFORMER_DATA - Stages cannot access files in the data directory.
  • $TRANSFORMER_RESOURCES - Stages can read files in the resources directory, but cannot write to files in the directory.

If needed, you can allow stages to access specific files in these protected directories by modifying Transformer Security Manager exception properties in the Transformer configuration file, $TRANSFORMER_CONF/transformer.properties. However, use caution when configuring exceptions to these protected directories.

You can configure exceptions to protected directories as follows:
Exceptions for all stage libraries
To allow all stage libraries access to files in protected directories, modify the security_manager.transformer_dirs.exceptions property to define files that can be accessed.
Exceptions for specific stage libraries
To allow a specific stage library access to files in protected directories, add the following property and then define the files that the stage library can access:
security_manager.transformer_dirs.exceptions.<stage_library_name>=<file_path>
For example, the default Transformer configuration file includes an exception for the Java keystore credential store stage library defined as follows:
security_manager.transformer_dirs.exceptions.lib.streamsets-transformer-jks-credentialstore-lib=$TRANSFORMER_CONF/jks-credentialStore.pkcs12

When you configure a Security Manager exception property, use the appropriate directory environment variable in the file path: $TRANSFORMER_CONF, $TRANSFORMER_DATA, or $TRANSFORMER_RESOURCES. You can enter multiple file paths separated by commas.
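
As an illustration of the comma-separated format, the following sketch assembles a property line from several file paths. The join_with_commas helper and both file names are hypothetical. The single quotes keep $TRANSFORMER_CONF and $TRANSFORMER_RESOURCES literal, so the variable names appear in the property value as the text above describes.

```shell
#!/bin/sh
# Join all arguments into one comma-separated string.
join_with_commas() {
  result=""
  for path in "$@"; do
    if [ -z "$result" ]; then
      result="$path"
    else
      result="$result,$path"
    fi
  done
  printf '%s\n' "$result"
}

# Hypothetical files; single quotes stop the shell from expanding the variables.
exceptions=$(join_with_commas \
  '$TRANSFORMER_CONF/example-lookup.txt' \
  '$TRANSFORMER_RESOURCES/example-map.csv')

# Print the line as it would appear in transformer.properties.
printf 'security_manager.transformer_dirs.exceptions=%s\n' "$exceptions"
```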

Root Classloader

You can edit the TRANSFORMER_ROOT_CLASSPATH environment variable to define the path to JAR files to be added to the Transformer root classloader.

Use the variable for components that must be in the root classloader, such as Snappy. Default is $TRANSFORMER_DIST/root-lib/'*'.

Modify environment variables using the method required by your installation type.
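
The quotes around the asterisk in the default value keep the shell from expanding it to file names, so the wildcard reaches the JVM unchanged. The sketch below shows the same quoting when adding a second entry. It assumes that the variable accepts a standard colon-separated Java classpath, and both /opt/transformer and /opt/extra-root-jars are hypothetical paths.

```shell
#!/bin/sh
# Hypothetical install location, used only if TRANSFORMER_DIST is not already set.
TRANSFORMER_DIST="${TRANSFORMER_DIST:-/opt/transformer}"

# Quoting the asterisk keeps it literal instead of matching file names.
default_classpath="$TRANSFORMER_DIST/root-lib/"'*'

# Assumption: additional entries are separated by colons, as in a Java classpath.
root_classpath="$default_classpath:/opt/extra-root-jars/"'*'

export TRANSFORMER_ROOT_CLASSPATH="$root_classpath"
printf '%s\n' "$TRANSFORMER_ROOT_CLASSPATH"
```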