November 2017

Announcing StreamSets Data Collector version 3.0

By Kirit Basu, Head of Strategy November 28, 2017

Version 3.0 marks an important milestone for StreamSets. With close to a million downloads and a strong community and customer base, we are very excited to offer a host of powerful new capabilities within the product. This release brings greater connectivity with cloud services, deeper integration with Hadoop distributions, new data aggregations, and an exciting new technology for running pipelines on resource-constrained devices.

For those keeping count, SDC 3.0 has 27 new features, close to 100 improvements and almost 200 bug fixes.

This release also contains important new functionality that extends the reach of SDC to devices on the edge. SDC Edge is a lightweight agent that can execute pipelines designed in SDC. These agents can run on Windows, Linux, Mac, Android and iOS. To learn more about SDC Edge, follow this link.

Announcing StreamSets Data Collector Edge

By Kirit Basu, Head of Strategy November 28, 2017

Today an increasing amount of data is being generated outside the data center or cloud, and it isn’t always easy to get this data out of source systems or to perform analytics right where it’s generated. Furthermore, getting this data into the central big data systems managed by the enterprise is an arduous task involving a large number of disjointed, poorly instrumented and often hand-coded technologies.

Fun with FileRefs – Manipulating Whole File Data

Data Integration

Data Transformation

By Pat Patterson November 2, 2017

As well as parsing incoming data into records, many StreamSets Data Collector (SDC) origins can be configured to ingest Whole Files. The blog entry Whole File Transfer with StreamSets Data Collector provides a basic introduction to the concept.

Although the initial release of the Whole File feature did not allow file content to be accessed in the pipeline, we soon added the ability for Script Evaluator processors to read the file, a feature exploited in the custom processor tutorial to read metadata from incoming image files. In this blog post, I’ll show you how a custom processor can both create new records with Whole File content and replace the content in existing records.
