We are excited to announce the release of the next version of StreamSets Data Collector. This release brings a number of new features and enhancements, along with more than 60 bug fixes.
New Features:
- SFTP/FTP origin to read files from an SFTP or FTP server.
- MapR DB destination to write data to MapR DB using the HBase API.
- XML Flattener processor to flatten XML data in a string field.
- HBase Lookup processor to perform key-value lookups from HBase.
- Redis Lookup processor to perform key-value lookups from Redis.
- Static Lookup processor to perform key-value lookups from local memory.
- Updated pipeline library view on the application Home page to organize and filter by pipeline labels, and start/stop multiple pipelines at once.
- Support for free-form labels to help organize and filter pipelines.
- Hive Streaming support for MapR.
- Support for Elasticsearch 2.0 and HDP 2.4.0.
- Support for rate limiting on a pipeline to control the maximum number of records read per second.
- Updates to the Amazon S3 origin and destinations: the S3 origin now supports glob patterns in folder paths.
- Directory origin now supports reading files in order of last-modified timestamp, in addition to the existing lexicographically ascending file name order.
- Directory and File Tail origins now add source file metadata to record headers to further refine provenance tracking.
- HBase destination now supports a configurable time basis for the timestamp value added to each column. You can choose processing time, record time, or system time.
- Kudu destination now supports table names based on expressions.
- Thread pool size is now configurable through the sdc.properties file, which allows you to run more pipelines within the same Data Collector.
- A number of new custom metrics have been added to the monitoring console.
This release also includes a number of other features and improvements. See the Release Notes for detailed information about this release.
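To make the two Directory origin read orders concrete, here is a minimal Python sketch of the difference between lexicographic file name order and last-modified order. It is an illustration only and not Data Collector code: the function name `order_files` and the `read_order` values are hypothetical, chosen for this example.

```python
import os


def order_files(directory, read_order="lexicographic"):
    """Return the file names in `directory` in the order they would be read.

    `order_files` and the `read_order` values are hypothetical names used
    for illustration; they are not Data Collector configuration properties.
    """
    names = [
        name for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    ]
    if read_order == "lexicographic":
        # Ascending file name order.
        return sorted(names)
    if read_order == "last_modified":
        # Oldest file first; ties on timestamp fall back to file name order.
        return sorted(
            names,
            key=lambda name: (os.path.getmtime(os.path.join(directory, name)), name),
        )
    raise ValueError("unknown read order: " + read_order)
```

With files whose names do not sort chronologically, such as logs that are renamed on rotation, the two orders can differ, which is the case where reading by last-modified timestamp helps.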
Download Data Collector now.