The DataOps Blog

Ingesting JSON Data Into Apache Kudu with StreamSets Data Collector

Posted in StreamSets News, April 15, 2016


At the Hadoop Summit in Dublin this week, Ted Malaska, Principal Solutions Architect at Cloudera, and I presented “Ingest and Stream Processing – What Will You Choose?”, a look at the big data streaming landscape with a focus on ingest. The session closed with a demo of StreamSets Data Collector, the open source graphical IDE for building ingest pipelines.

In the demo, I built a pipeline to read JSON data from Apache Kafka, augmented the data in JavaScript, and wrote the resulting records to both Apache Kudu (incubating) for analysis and Apache Kafka for visualization.
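
To give a feel for the augmentation step, here is a minimal sketch in plain JavaScript. It is not the script from the demo: the field names (`temp_f`, `temp_c`) and the helper functions `augment` and `processBatch` are hypothetical, and the code deliberately avoids Data Collector's own scripting API, standing in for it with ordinary arrays. It parses each JSON message, adds a derived field, and sets aside anything that fails to parse.

```javascript
// Hypothetical augmentation: the field names are illustrative, not taken from the demo.
function augment(value) {
  // Shallow-copy the parsed record so the input object is left untouched.
  var out = {};
  for (var key in value) {
    if (Object.prototype.hasOwnProperty.call(value, key)) {
      out[key] = value[key];
    }
  }
  // Assumed enrichment: derive a Celsius reading from a Fahrenheit one.
  if (typeof out.temp_f === 'number') {
    out.temp_c = (out.temp_f - 32) * 5 / 9;
  }
  return out;
}

// Stand-in for one batch of raw JSON messages read from a Kafka topic.
// Returns the augmented records plus any messages that could not be parsed.
function processBatch(jsonMessages) {
  var output = [];
  var errors = [];
  for (var i = 0; i < jsonMessages.length; i++) {
    try {
      output.push(augment(JSON.parse(jsonMessages[i])));
    } catch (e) {
      errors.push({ message: jsonMessages[i], reason: String(e) });
    }
  }
  return { output: output, errors: errors };
}
```

In a real pipeline, the `output` records would be the ones handed on to the Kudu and Kafka destinations, while the `errors` list corresponds to records routed to error handling.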

Here’s a recording of the session:

The Apache Kudu destination is new in StreamSets Data Collector, released this week and available for download.

This post was originally published on the Kudu blog.
