StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.9.0 and StreamSets Data Collector Edge 3.9.0.
StreamSets Data Collector is a powerful design and execution engine, open source under the Apache License 2.0. It enables moving data between any source and destination, performing transformations, and pushing down analytics along the way. To download, click here.
StreamSets Data Collector Edge is a lightweight execution agent that runs on edge devices with limited memory, CPU, and/or connectivity resources. It enables reading data from an edge device or receiving data from another dataflow pipeline. It supports messaging protocols including HTTP, MQTT, CoAP, and WebSockets. To download, click here.
Let’s review some of the highlights. For a complete list of enhancements, new features, bug fixes, and upgrade instructions, please refer to the Release Notes.
StreamSets Data Collector 3.9.0
- JDBC Multitable Consumer now supports multithreaded partition processing when the primary key or user-defined offset column is an Oracle Timestamp with time zone data type and every row has the same time zone.
- JMS Consumer can now read messages from durable topic subscriptions, which can have only one active subscriber at a time.
- SFTP/FTP Client origin now supports FTP over SSL (FTPS).
- New Couchbase Lookup processor enables performing key/value or N1QL lookups against a Couchbase bucket.
- Hive Metadata processor can now process datetime fields in their native format or can convert the fields to string before processing the data. Previously, the processor always converted datetime fields to string.
- Log Parser processor now exposes a new set of data format properties to configure the maximum line length, the character set, and the retention of the original line from the log. In addition, the processor now supports entry of multiple grok patterns.
- New SFTP/FTP/FTPS destination that can send data to a URL using SFTP, FTP, or FTPS.
- Both the Aerospike and Couchbase destinations now support CRUD operations defined in the sdc.operation.type record header attribute. You can define a default operation for records without the header attribute or value, and configure how to handle records with unsupported operations.
- The Couchbase destination can now also write to sub-documents, and supports the Avro, Binary, Delimited, JSON, Protobuf, SDC Record, and Text data formats.
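The CRUD support above can be pictured as a small dispatch step: read the operation code from the record header, fall back to a configured default when the header is missing, and apply a configured policy to unsupported codes. The sketch below is illustrative only; the operation codes, option names, and function are assumptions for this example, not the actual StreamSets API.

```python
# Illustrative sketch of dispatching on the sdc.operation.type record
# header attribute. Codes and policy names are invented for this example.
INSERT, DELETE, UPDATE = "1", "2", "3"   # illustrative operation codes

def resolve_operation(record, default_op=INSERT, on_unsupported="to_error"):
    """Return the operation for a record, falling back to a configured
    default when the header is absent, and applying the configured policy
    ('to_error', 'discard', or 'use_default') to unsupported codes."""
    op = record.get("headers", {}).get("sdc.operation.type", default_op)
    if op not in (INSERT, DELETE, UPDATE):
        if on_unsupported == "use_default":
            return default_op
        if on_unsupported == "discard":
            return None       # record is dropped
        raise ValueError(f"unsupported operation code: {op}")
    return op
```

A destination built this way sends each record to its insert, update, or delete path based on the resolved code.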
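For readers unfamiliar with durable subscriptions (as read by the JMS Consumer above): unlike a non-durable subscription, the broker retains messages published while the durable subscriber is offline, and only one active subscriber may hold the subscription at a time. The toy broker below sketches those two semantics; it is not the JMS API or the Data Collector implementation, and all names are invented.

```python
class Broker:
    """Toy in-memory pub/sub broker illustrating durable-subscription
    semantics. Nothing here is part of any JMS or StreamSets API."""
    def __init__(self):
        self.durable = {}    # subscription name -> retained messages
        self.active = set()  # subscription names currently connected

    def subscribe(self, name):
        # A durable subscription allows only one active subscriber at a time.
        if name in self.active:
            raise RuntimeError(f"subscription {name!r} already has an active subscriber")
        self.active.add(name)
        self.durable.setdefault(name, [])

    def disconnect(self, name):
        # Going offline does not delete the subscription; messages accrue.
        self.active.discard(name)

    def publish(self, message):
        for queue in self.durable.values():
            queue.append(message)

    def receive(self, name):
        # Drain everything retained for this subscription, including
        # messages published while the subscriber was offline.
        messages, self.durable[name] = self.durable[name], []
        return messages
```

Reconnecting and calling `receive` returns messages published during the downtime, which is exactly the property that makes durable subscriptions useful for a pipeline that may restart.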
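Grok patterns, mentioned for the Log Parser above, are named shorthands that expand to regular expressions. As a rough sketch of what grok expansion and a maximum-line-length property do, the snippet below expands a tiny, illustrative subset of grok placeholders and truncates long lines before matching; real grok ships a much larger pattern library, and none of these names come from the StreamSets implementation.

```python
import re

# A tiny, illustrative subset of grok: each %{NAME:field} expands to a
# named regex group. Real grok defines hundreds of patterns.
GROK_PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "NUMBER": r"\d+",
    "WORD": r"\w+",
}

def grok_to_regex(pattern):
    """Expand %{NAME:field} placeholders into named capture groups."""
    def repl(m):
        name, field = m.group(1), m.group(2)
        return f"(?P<{field}>{GROK_PATTERNS[name]})"
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", repl, pattern))

def parse_line(line, pattern, max_line_length=1024):
    # Like a "max line length" property, truncate overly long lines
    # before matching (the default cutoff here is arbitrary).
    m = grok_to_regex(pattern).match(line[:max_line_length])
    return m.groupdict() if m else None
```

For example, `parse_line("10.0.0.1 GET 200", "%{IP:client} %{WORD:method} %{NUMBER:status}")` yields a dictionary of the three named fields.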
Data Collector Configuration
Data Collector now includes a rule-based engine that can suggest fixes and workarounds for common issues. The feature is controlled by a new property in the Data Collector configuration, where it can be disabled.
For more information about StreamSets Data Collector, please visit our documentation.
StreamSets Data Collector Edge 3.9.0
- The HTTP Server origin now supports using a keystore file in the PKCS #12 format when SSL/TLS is enabled.
- Data Collector Edge pipelines now support the new HTTP Client processor.
- This release adds support for writing data to Microsoft Azure with the new Azure Event Hub Producer and Azure IoT Hub Producer destinations.
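The HTTP Client processor now available in Edge pipelines follows a common enrichment pattern: call an endpoint per record and merge the response into the record. Below is a self-contained sketch of that pattern against a throwaway local server; the record shape, endpoint, and function names are invented for illustration and are not the processor's configuration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class EchoHandler(BaseHTTPRequestHandler):
    """Stand-in for a real lookup service: echoes the request path back."""
    def do_GET(self):
        body = json.dumps({"looked_up": self.path.strip("/")}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the sketch quiet
        pass

def enrich(records, base_url):
    """Call the endpoint once per record and merge the JSON response."""
    out = []
    for record in records:
        with urllib.request.urlopen(f"{base_url}/{record['id']}") as resp:
            record = {**record, **json.load(resp)}
        out.append(record)
    return out

# Spin up the stand-in service on an ephemeral port and enrich one record.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]
enriched = enrich([{"id": "42"}], f"http://127.0.0.1:{port}")
server.shutdown()
```

In a real Edge pipeline the processor handles batching, retries, and credentials; the sketch shows only the per-record call-and-merge shape.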
For more information about StreamSets Data Collector Edge, please visit our documentation.
Technology Preview Functionality
Data Collector includes these new stages with the Technology Preview designation. Technology Preview functionality is available for use in development and testing, but is not intended for production. When Technology Preview functionality is approved for production use, the release notes and documentation will reflect the change.
Origins and Destinations
- Read data from and write data to Microsoft Azure Data Lake Storage Gen1 and Gen2.
- Change file metadata, create an empty file, or remove a file or directory in Microsoft Azure Data Lake Storage Gen1 and Gen2 upon receipt of an event.
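The event-driven file actions above follow a simple shape: an event record names a file, and an executor applies a configured action to it. The sketch below mimics that shape against the local filesystem; the event fields and action names are illustrative, not the ADLS stages' configuration.

```python
import os
import tempfile

# Local-filesystem analogy of an event-driven file-management executor:
# on receiving an event record naming a file, apply the configured action.
def handle_event(event, action):
    path = event["filepath"]
    if action == "create_empty":
        open(path, "w").close()
    elif action == "remove":
        os.remove(path)
    elif action == "rename":
        os.rename(path, event["new_path"])
    else:
        raise ValueError(f"unknown action: {action}")

# Walk one file through create -> rename -> remove, as a pipeline might
# on receiving file-closed events.
with tempfile.TemporaryDirectory() as d:
    tmp = os.path.join(d, "part-0000.tmp")
    handle_event({"filepath": tmp}, "create_empty")
    created = os.path.exists(tmp)
    final = os.path.join(d, "part-0000.json")
    handle_event({"filepath": tmp, "new_path": final}, "rename")
    renamed = os.path.exists(final) and not os.path.exists(tmp)
    handle_event({"filepath": final}, "remove")
    removed = not os.path.exists(final)
```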
Feedback and Contributions
If you’d like to suggest a feature or enhancement, or if you see something that needs to be fixed or improved, feel free to open a ticket at https://issues.streamsets.com.
StreamSets also welcomes contributions from the community. For guidelines on contributing code, visit https://github.com/streamsets/datacollector/blob/master/CONTRIBUTING.md.
For any other questions and inquiries, please contact us.