StreamSets Data Integration Blog

Where change is welcome.

Calling External Java Code from Script Evaluators

StreamSets News

When you're building a pipeline with StreamSets Data Collector (SDC), you can often implement the data transformations you require using a combination of 'off-the-shelf' processors. Sometimes, though, you need to write some code. The script evaluators included with SDC allow you to manipulate records in Groovy, JavaScript and Jython (an implementation of Python integrated with the Java platform). You can…

By December 21, 2016

Creating a Custom Origin for StreamSets Data Collector

Engineering, StreamSets News

Since writing tutorials for creating custom destinations and processors for StreamSets Data Collector (SDC), I've been looking for a good use case for a custom origin tutorial. It's been trickier than I expected, partly because the list of out-of-the-box origins is so extensive, and partly because the HTTP Client origin can access most web service APIs, rendering a custom…

By December 12, 2016

Running Apache Spark Code in StreamSets Data Collector

Engineering, StreamSets News

New in StreamSets Data Collector (SDC) 2.2.0.0 is the Spark Evaluator, a processor stage that allows you to run an Apache Spark application, termed a Spark Transformer, as part of an SDC pipeline. With the Spark Evaluator, you can build a pipeline to ingest data from any supported origin, apply transformations, such as filtering and lookups, using existing SDC processor stages, and have the…

By December 8, 2016