  • A
    • activation
    • ADLS Gen1 destination
      • configuring[1]
      • data formats[1]
      • overview[1]
      • overwrite partition prerequisite[1]
      • partitions[1][2]
      • prerequisites[1]
      • retrieve authentication information[1]
      • write mode[1]
    • ADLS Gen1 origin
      • configuring[1]
      • data formats[1]
      • overview[1]
      • partitions[1]
      • prerequisites[1]
      • retrieve authentication information[1]
      • schema requirement[1]
    • ADLS Gen2 destination
      • configuring[1]
      • data formats[1]
      • overview[1]
      • overwrite partition prerequisite[1]
      • prerequisites[1]
      • retrieve configuration details[1]
      • write mode[1]
    • ADLS Gen2 origin
      • configuring[1]
      • data formats[1]
      • overview[1]
      • partitions[1]
      • prerequisites[1]
      • retrieve configuration details[1]
      • schema requirement[1]
    • ADLS stages
      • local pipeline prerequisites[1]
    • Aggregate processor
      • aggregate functions[1]
      • configuring[1]
      • default output fields[1]
      • example[1]
      • overview[1]
      • shuffling of data[1]
    • Amazon EMR[1]
    • Amazon Redshift
    • Amazon Redshift destination
      • AWS credentials and write requirements[1]
      • configuring[1]
      • installing the JDBC driver[1]
      • partitions[1]
    • Amazon S3 destination
      • AWS credentials[1]
      • configuring[1]
      • data formats[1]
      • overview[1]
      • overwrite partition prerequisite[1]
      • partitions[1]
      • write mode[1]
    • Amazon S3 origin
      • AWS credentials[1]
      • data formats[1]
      • overview[1]
      • partitions[1]
      • schema requirement[1]
    • Amazon S3 stages
      • local pipeline prerequisites[1]
    • Amazon Web Services
      • StreamSets for Databricks[1]
    • Append Data write mode
      • Delta Lake destination[1]
    • authentication
    • authentication properties
    • authentication tokens
    • AWS Secrets Manager
      • credential store[1]
      • properties file[1]
    • AWS Secrets Manager access
    • Azure
      • StreamSets for Databricks[1]
    • Azure Event Hubs destination
    • Azure Event Hubs origin
      • configuring[1]
      • default and specific offsets[1]
      • overview[1]
      • prerequisites[1]
    • Azure Key Vault
      • credential store[1]
      • credential store, prerequisites[1]
      • properties file[1]
    • Azure Key Vault access
    • Azure SQL destination
      • configuring[1]
      • driver installation[1]
      • partitions[1]
  • B
    • Base64 functions
    • basic syntax
      • for expressions[1]
    • batch pipelines
    • browser
      • requirements[1]
    • bulk edit mode
  • C
    • caching
      • for origins and processors[1]
      • ludicrous mode[1]
    • case study
      • batch pipelines[1]
      • streaming pipelines[1]
    • CDC writes
      • Delta Lake destination[1]
    • classloader
    • client deployment mode
      • Hadoop YARN cluster[1]
    • cloud service provider
      • Amazon Web Services[1]
      • Azure[1]
    • cluster
    • cluster configuration
      • Databricks instance pool[1]
      • Databricks pipelines[1]
    • cluster deployment mode
      • Hadoop YARN cluster[1]
    • command line interface
      • jks-credentialstore command[1]
      • stagelib-cli command[1]
    • conditions
      • Delta Lake destination[1]
      • Filter processor[1]
      • Join processor[1]
      • Stream Selector processor[1]
      • Window processor[1]
    • configuring
      • Snowflake origin[1]
    • constants
      • in the StreamSets expression language[1]
    • Control Hub
      • configuration properties for Transformer[1]
      • HTTP or HTTPS proxy[1]
    • credential stores
      • AWS Secrets Manager[1]
      • Azure Key Vault[1]
      • enabling[1]
      • functions to access[1]
      • group access[1]
      • Java keystore[1]
      • overview[1]
    • cross join
      • Join processor[1]
    • custom drivers[1]
    • custom schemas
      • application to JSON and delimited data[1]
      • DDL schema format[1]
      • error handling[1]
      • JSON schema format[1]
      • origins[1]
  • D
    • Databricks
      • cluster[1]
      • provisioned cluster configuration[1]
      • provisioned cluster with instance pool[1]
      • uninstalling old Transformer libraries[1]
    • Databricks pipelines
      • existing cluster[1]
      • provisioned cluster[1][2]
      • staging directory[1]
    • Data Collectors
    • data formats
      • ADLS Gen1 destination[1]
      • ADLS Gen1 origin[1]
      • ADLS Gen2 destination[1]
      • ADLS Gen2 origin[1]
      • Amazon S3 destination[1]
      • Amazon S3 origin[1]
      • Azure Event Hubs destination[1]
      • File destination[1]
      • File origin[1]
      • Kafka destination[1]
      • Kafka origin[1]
      • Whole Directory origin[1]
    • data preview
      • data type display[1]
      • overview[1]
    • Dataproc
      • cluster[1]
      • credentials[1]
      • credentials in a file[1]
      • credentials in a property[1]
      • default credentials[1]
    • Dataproc pipelines
      • existing cluster[1]
    • data types
    • datetime variables
      • in the StreamSets expression language[1]
    • Deduplicate processor
    • default output fields
      • Aggregate processor[1]
    • default stream
      • Stream Selector[1]
    • Delete from Table write mode
      • Delta Lake destination[1]
    • delivery guarantee
    • Delta Lake destination
      • ADLS Gen1 prerequisites[1]
      • ADLS Gen2 prerequisites[1]
      • Amazon S3 credential mode[1]
      • Append Data write mode[1]
      • CDC example[1]
      • configuring[1]
      • creating a managed table[1]
      • creating a table[1]
      • creating a table or managed table[1]
      • Delete from Table write mode[1]
      • overview[1]
      • overwrite condition[1]
      • Overwrite Data write mode[1]
      • partitions[1]
      • retrieve ADLS Gen1 authentication information[1]
      • retrieve ADLS Gen2 authentication information[1]
      • Update Table write mode[1]
      • Upsert Using Merge write mode[1]
      • write mode[1]
      • writing to a local file system[1]
    • Delta Lake Lookup processor
      • ADLS Gen2 prerequisites[1]
      • Amazon S3 credential mode[1]
      • configuring[1]
      • overview[1]
      • retrieve ADLS Gen1 authentication information[1]
      • retrieve ADLS Gen2 authentication information[1]
      • storage systems[1][2]
      • using from a local file system[1]
    • Delta Lake origin
      • ADLS Gen1 prerequisites[1][2]
      • ADLS Gen2 prerequisites[1]
      • Amazon S3 credential mode[1]
      • overview[1][2]
      • reading from a local file system[1]
      • retrieve ADLS Gen1 authentication information[1]
      • retrieve ADLS Gen2 authentication information[1]
      • storage systems[1]
    • deployment mode
      • Hadoop YARN cluster[1]
    • destinations
    • directories
    • directory path
      • File destination[1]
      • File origin[1]
    • disconnected mode
    • Docker
    • dpm.properties
      • modifying for Transformer[1]
    • drivers[1]
      • Azure SQL destination[1]
      • JDBC destination[1]
      • JDBC Lookup processor[1]
      • JDBC origin[1]
      • MySQL JDBC Table origin[1]
      • Oracle JDBC Table origin[1]
  • E
    • EMR
      • base URI and staging directory[1]
      • cluster[1]
      • connection security[1]
      • Kerberos stage limitation[1]
      • provisioned cluster[1]
      • Transformer installation location[1]
    • EMR pipelines
      • existing cluster[1]
    • encryption zones
      • using KMS to access HDFS encryption zones[1]
    • environment variable
      • STREAMSETS_LIBRARIES_EXTRA_DIR[1]
    • environment variables
      • customizing Transformer[1]
      • directories[1]
      • modifying[1]
      • system group[1]
      • system user[1]
    • execution engines
    • execution mode
    • executors
    • expression completion
      • overview and tips[1]
    • expression language
    • expressions
      • in pipeline and stage properties[1]
      • Spark SQL Expression processor[1]
    • external libraries
      • installing for stages[1]
      • manual installation[1][2]
      • Package Manager installation[1]
      • set up external directory[1]
      • stage properties installation[1]
  • F
    • Field Remover processor
    • fields
    • File destination
      • configuring[1]
      • data formats[1]
      • directory path[1]
      • overview[1]
      • overwrite partition prerequisite[1]
      • partitions[1]
      • write mode[1]
    • file formats
      • Hive destination[1]
    • file functions
    • File origin
      • configuring[1]
      • custom schema[1]
      • data formats[1]
      • directory path[1]
      • overview[1]
      • partitions[1]
      • schema requirement[1]
    • Filter processor
    • full outer join
      • Join processor[1]
    • full read
      • Snowflake origin[1]
    • functions
      • Base64 functions[1]
      • credential[1]
      • file functions[1]
      • in the StreamSets expression language[1]
      • job functions[1]
      • math functions[1]
      • miscellaneous functions[1]
      • pipeline functions[1]
      • string functions[1]
      • time functions[1]
  • G
    • garbage collection
  • H
    • Hadoop impersonation mode
      • configuring KMS for encryption zones[1]
      • lowercasing user names[1]
      • overview[1]
    • Hadoop YARN
      • cluster[1]
      • deployment mode[1]
      • directory requirements[1]
      • impersonation[1]
      • Kerberos authentication[1]
    • heap dump creation
    • heap size
    • history
      • pipeline run[1]
    • Hive destination
      • additional Hive configuration properties[1]
      • configuring[1]
      • file formats[1]
      • overview[1]
      • partitions[1]
    • Hive origin
      • additional Hive configuration properties[1]
      • configuring[1]
      • full mode query guidelines[1]
      • incremental and full query mode[1]
      • incremental mode query guidelines[1]
      • overview[1]
      • partitions[1]
      • reading Delta Lake managed tables[1]
      • SQL query[1]
    • HTTP or HTTPS proxy
      • for Control Hub[1]
    • HTTPS protocol
  • I
    • impersonation mode
    • incremental read
      • Snowflake origin[1]
    • inner join
      • Join processor[1]
    • inputs variable
    • installation
    • install from RPM
    • install from tarball
  • J
    • Java
      • garbage collection[1]
    • Java configuration options
      • heap size[1]
      • Transformer environment configuration file[1]
    • Java keystore
      • credential store[1]
      • properties file[1]
    • Java Security Manager
    • JDBC destination
      • configuring[1]
      • driver installation[1]
      • overview[1]
      • partitions[1]
      • tested versions and drivers[1]
      • write mode[1]
    • JDBC Lookup processor
      • configuring[1]
      • driver installation[1]
      • overview[1]
      • tested versions and drivers[1]
    • JDBC origin
      • configuring[1]
      • driver installation[1]
      • offset column[1]
      • overview[1]
      • partitions[1]
      • supported offset data types[1]
      • tested versions and drivers[1]
    • job functions
    • Join processor
      • condition[1]
      • configuring[1]
      • criteria[1]
      • cross join[1]
      • full outer join[1]
      • inner join[1]
      • join types[1]
      • left anti join[1]
      • left outer join[1]
      • left semi join[1]
      • matching fields[1]
      • overview[1]
      • right anti join[1]
      • right outer join[1]
      • shuffling of data[1]
    • join types
      • Join processor[1]
  • K
    • Kafka destination
      • configuring[1]
      • data formats[1]
      • Kerberos authentication[1]
      • message[1]
      • overview[1]
      • security[1]
      • SSL/TLS encryption[1]
    • Kafka origin
      • configuring[1]
      • custom schemas[1]
      • data formats[1]
      • Kerberos authentication[1]
      • offsets[1]
      • overview[1]
      • partitions[1]
      • security[1]
      • SSL/TLS encryption[1]
    • Kerberos
    • Kerberos authentication
      • Hadoop YARN cluster[1]
      • Kafka destination[1]
      • Kafka origin[1]
    • Kerberos keytab
      • configuring in pipelines[1]
    • Kudu origin
  • L
    • LDAP authentication
    • left anti join
      • Join processor[1]
    • left outer join
      • Join processor[1]
    • left semi join
      • Join processor[1]
    • literals
      • in the StreamSets expression language[1]
    • local pipelines
    • log files
      • viewing and downloading[1]
    • log level
    • logs
      • modifying log level[1]
      • pipelines[1]
      • Spark driver[1]
      • Transformer[1]
    • lookups
      • overview[1]
      • streaming example[1]
    • ludicrous mode
      • caching[1]
      • optimizing pipeline performance[1]
      • pipeline statistics[1]
  • M
    • MapR clusters
      • dynamic allocation requirement[1]
      • Hadoop impersonation prerequisite[1]
      • pipeline start prerequisite[1]
      • prerequisite tasks[1]
    • master instance
      • retrieving details[1]
    • math functions
    • message
      • Kafka destination[1]
    • miscellaneous functions
    • monitoring
    • MySQL JDBC Table origin
      • configuring[1]
      • custom offset queries[1]
      • default offset queries[1]
      • driver installation[1]
      • MySQL data types[1]
      • null offset value handling[1]
      • offset column[1]
      • overview[1]
      • partitions[1]
      • supported offset data types[1]
  • N
    • NullType
      • upgrade task[1]
  • O
    • offset column
      • JDBC[1]
      • MySQL JDBC Table[1]
      • Oracle JDBC Table[1]
      • PostgreSQL JDBC Table[1]
      • SQL Server JDBC Table[1]
    • offsets
      • Kafka origin[1]
      • overview[1]
      • resetting for the pipeline[1]
      • skipping tracking[1]
    • operators
      • in the StreamSets expression language[1]
      • precedence[1]
    • Oracle JDBC Table origin
      • configuring[1]
      • custom offset queries[1]
      • default offset queries[1]
      • driver installation[1]
      • null offset value handling[1]
      • offset column[1]
      • Oracle data types[1]
      • overview[1]
      • partitions[1]
      • supported offset data types[1]
    • origins
    • output order
    • output variable
    • Overwrite Data write mode
      • Delta Lake destination[1]
  • P
    • parameters
      • pipeline[1]
      • starting pipelines with[1]
    • partitioning
    • partitions
      • ADLS Gen1 destination[1][2]
      • ADLS Gen1 origin[1]
      • ADLS Gen2 origin[1]
      • Amazon Redshift destination[1]
      • Amazon S3 destination[1]
      • Amazon S3 origin[1]
      • Azure SQL destination[1]
      • based on origins[1]
      • changing[1]
      • Delta Lake destination[1]
      • File destination[1]
      • File origin[1]
      • Hive destination[1]
      • Hive origin[1]
      • initial[1]
      • initial number[1]
      • JDBC destination[1]
      • JDBC origin[1]
      • Kafka origin[1]
      • MySQL JDBC Table origin[1]
      • Oracle JDBC Table origin[1]
      • PostgreSQL JDBC Table origin[1]
      • Rank processor[1]
      • SQL Server JDBC Table origin[1]
    • passwords
    • performing lookups
    • pipeline functions
    • pipeline offsets[1]
    • pipeline properties
      • runtime parameters[1]
      • using expressions[1]
    • pipeline run
    • pipelines
      • comparison with Data Collector[1]
      • configuring[1]
      • delivery guarantee[1]
      • logs[1]
      • monitoring[1]
      • pause monitoring[1]
      • previewing[1]
      • run history[1]
      • Spark configuration[1]
      • Spark executors[1]
      • stage library match requirement[1]
      • starting with parameters[1]
    • ports
    • PostgreSQL JDBC origin
      • supported data types[1]
    • PostgreSQL JDBC Table origin
      • configuring[1]
      • custom offset queries[1]
      • default offset queries[1]
      • null offset value handling[1]
      • offset column[1]
      • overview[1]
      • partitions[1]
      • PostgreSQL JDBC driver[1]
      • supported offset data types[1]
    • preprocessing script
      • pipeline[1]
      • prerequisites[1]
      • Spark-Scala prerequisites[1]
    • prerequisites
      • ADLS and Amazon S3 stages[1]
      • Azure Event Hubs destination[1]
      • Azure Event Hubs origin[1]
      • for the Scala processor and preprocessing script[1]
      • PySpark processor[1]
      • stage-related[1]
    • preview
      • availability[1]
      • color codes[1]
      • configured cluster[1]
      • editing properties[1]
      • embedded Spark[1]
      • output order[1]
      • overview[1]
      • pipeline[1]
      • writing to destinations[1]
    • processing mode
      • ludicrous mode versus standard[1]
    • processor
      • output order[1]
    • processors
    • Profile processor
    • properties
      • expression completion[1]
    • proxy users
    • PySpark processor[1]
      • configuring[1]
      • custom code[1]
      • Databricks prerequisites[1]
      • EMR prerequisites[1]
      • examples[1]
      • input and output variables[1]
      • other cluster and local pipeline prerequisites[1]
      • overview[1]
      • prerequisites[1][2]
      • referencing fields[1]
  • Q
    • query mode
  • R
    • Rank processor
    • read mode
      • Snowflake origin[1]
    • register
    • registration
    • remote debugging
    • repartitioning
    • Repartition processor
      • coalesce by number repartition method[1]
      • configuring[1]
      • methods[1]
      • overview[1]
      • repartition by field range repartition method[1]
      • repartition by number repartition method[1]
      • shuffling of data[1]
      • use cases[1]
    • reserved words
      • in the StreamSets expression language[1]
    • reverse proxy
      • configuring for Transformer[1]
    • right anti join
      • Join processor[1]
    • right outer join
      • Join processor[1]
    • roles
      • for users with file-based authentication[1]
    • RPM package
      • uninstallation[1]
    • runtime parameters
      • calling from a pipeline[1]
      • calling from checkboxes and drop-down menus[1]
      • calling from scripting processors[1]
      • calling from text boxes[1]
      • defining[1]
      • monitoring[1]
      • overview[1]
      • viewing[1]
    • runtime properties
      • calling from a pipeline[1]
      • defining[1]
      • overview[1]
    • runtime resources
      • calling from a pipeline[1]
      • defining[1]
      • overview[1]
    • runtime values
  • S
    • Scala
    • Scala processor
      • configuring[1]
      • custom code[1]
      • input and output variables[1]
      • inputs variable[1]
      • output variable[1]
      • overview[1]
      • prerequisites[1]
      • Spark-Scala prerequisite[1]
    • scripting processors
      • calling runtime values[1]
    • scripts
      • preprocessing[1]
    • security
      • Kafka destination[1]
      • Kafka origin[1]
    • Security Manager
    • shuffling
    • simple edit mode
    • Slowly Changing Dimension processor
      • change processing[1]
      • configuring[1]
      • configuring a file dimension pipeline[1]
      • configuring a table dimension pipeline[1]
      • dimension types[1]
      • overview[1]
      • partitioned file dimension prerequisite[1]
      • pipeline processing[1]
      • tracking fields[1]
    • Snowflake destination
      • configuring[1]
      • overview[1]
      • required privileges[1]
    • Snowflake Lookup processor
      • configuring[1]
      • overview[1]
      • pushdown optimization[1]
      • required privileges[1]
    • Snowflake origin
      • configuring[1]
      • full query guidelines[1]
      • incremental or full read[1]
      • incremental query guidelines[1]
      • overview[1]
      • pushdown optimization[1]
      • read mode[1]
      • required privileges[1]
      • SQL query guidelines[1]
    • sorting
      • multiple fields[1]
    • Sort processor
    • Spark
      • Scala requirement[1]
    • Spark cluster
      • callback URL[1]
      • Transformer URL[1]
    • Spark configuration
    • Spark executors
    • Spark history server
    • Spark processing
    • Spark SQL Expression processor
    • Spark SQL processor
    • Spark SQL query
    • Spark SQL Query processor
    • Spark web UI
    • SQL query
      • guidelines for the Snowflake origin[1]
      • Hive origin[1]
    • SQL Server 2019 BDC
      • cluster[1]
      • JDBC connection information[1]
      • master instance details for JDBC[1]
      • quick start deployment script[1]
      • retrieving information[1]
      • Transformer installation location[1]
    • SQL Server JDBC origin
      • supported data types[1]
    • SQL Server JDBC Table origin
      • configuring[1]
      • custom offset queries[1]
      • default offset queries[1]
      • null offset value handling[1]
      • offset column[1]
      • overview[1]
      • partitions[1]
      • SQL Server JDBC driver[1]
      • supported offset data types[1]
    • SSL/TLS encryption
      • Kafka destination[1]
      • Kafka origin[1]
    • stage library match requirement
      • in a pipeline[1]
    • stage properties
      • using expressions[1]
    • staging directory
      • Databricks pipelines[1]
      • EMR pipelines[1]
    • statistics
    • streaming pipelines
    • Stream Selector processor
    • StreamSets
      • expression language[1]
    • STREAMSETS_LIBRARIES_EXTRA_DIR
      • environment variable[1]
    • StreamSets Control Hub
      • disconnected mode[1]
    • StreamSets for Databricks
      • installation on AWS[1]
      • installation on Azure[1]
    • string functions
  • T
    • tarball
      • uninstallation[1]
    • Technology Preview functionality
    • third party libraries
    • time functions
    • tokens
      • for registering Transformer[1]
    • tracking fields
      • Slowly Changing Dimension processor[1]
    • Transformer
      • activation[1]
      • architecture[1]
      • authentication token[1]
      • customizing with environment variables[1]
      • description[1]
      • directories[1]
      • disconnected mode[1]
      • Docker[1]
      • environment variables[1]
      • execution engine[1][2]
      • for Data Collector users[1]
      • heap dump creation[1]
      • installation[1]
      • Java configuration options[1]
      • Java Security Manager[1]
      • launching[1]
      • proxy users[1]
      • registering[1][2]
      • registration[1]
      • remote debugging[1]
      • restarting[1]
      • Security Manager[1]
      • spark-submit[1]
      • starting[1]
      • starting as service[1]
      • starting manually[1]
      • uninstallation[1]
      • unregistering[1]
      • viewing and downloading log data[1]
      • viewing configuration properties[1]
    • TRANSFORMER_CONF
      • environment variable[1]
    • TRANSFORMER_DATA
      • environment variable[1]
    • TRANSFORMER_DIST
      • environment variable[1]
    • TRANSFORMER_GROUP
      • environment variable[1]
    • TRANSFORMER_JAVA_OPTS
      • Java environment variable[1]
    • TRANSFORMER_LOG
      • environment variable[1]
    • TRANSFORMER_RESOURCES
      • environment variable[1]
    • TRANSFORMER_ROOT_CLASSPATH
      • Java environment variable[1]
    • TRANSFORMER_USER
      • environment variable[1]
    • Transformer configuration files
      • protecting passwords and other sensitive values[1]
    • Transformer libraries
      • removing from Databricks[1]
    • Transformer metrics
    • Type Converter processor
      • configuring[1]
      • field type conversion[1]
      • overview[1]
  • U
    • uninstallation
    • Update Table write mode
      • Delta Lake destination[1]
    • upgrade
      • installation from RPM[1]
      • installation from tarball[1]
      • troubleshooting[1]
    • Upsert Using Merge write mode
      • Delta Lake destination[1]
    • URL
      • cluster callback[1]
    • usage statistics
    • users
      • creating for file-based authentication[1]
      • default for file-based authentication[1]
      • roles for file-based authentication[1]
  • V
    • validation
      • implicit and explicit[1]
  • W
    • what's new
      • version 3.11.x[1]
      • version 3.12.x[1]
      • version 3.13.x[1]
      • version 3.14.x[1]
      • version 3.15.x[1]
    • Whole Directory origin
    • Window processor
    • window types
      • Window processor[1]
    • write mode
      • Azure SQL destination[1]
      • Delta Lake destination[1]
      • JDBC destination[1]
© 2020 StreamSets, Inc.