
StreamSets Data Integration Blog

Where change is welcome.

New Tutorial: Creating a Custom StreamSets Destination

By March 23, 2016

One of the first things I hear after I explain the basics of StreamSets Data Collector is, “Cool, so can I ingest data from/send data to X?”, for varying values of X. The short answer is, “Yes, you can!”, while the longer answer involves checking the lists of origins (for ingesting data from X) and destinations (for writing data) included with the product, and writing custom code if X is not on the list.

“My X isn’t on the list! How do I get started writing that custom code?”, I hear you shout. Well, I just wrote a detailed tutorial for creating your first custom StreamSets destination that explains all. Fire up your IDE, follow the steps, and you’ll build a sample destination that sends records to RequestBin, but that could be adapted to send them pretty much anywhere.

How Trend Micro Uses StreamSets – An Interview with the Threat Research Team

By March 21, 2016
The Forward-Looking Threat Research team at Trend Micro were early adopters of StreamSets Data Collector. They use StreamSets to ingest data from a wide variety of sources to create a Threat Assessment Dashboard in Elasticsearch. In this interview, we talk with members of their team about how they evaluated StreamSets and implemented it in their production environment in a short period of time.

Visualizing Apache Log Data… with Minecraft!

By March 18, 2016

A key differentiator of StreamSets Data Collector (SDC) is that it operates in continuous mode – set a pipeline running and it will continue to read files from a directory or take messages from a queue. A Twitter conversation with Richard Tuttle, a solution architect at CRM Science, prompted me to wonder: would it be possible to ingest Apache Web Server log data, look up the geolocation from the client IP address, and plot the results on a map… in Minecraft?
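
To make the first step of that idea concrete, here is a minimal Python sketch of pulling the client IP address out of an Apache access log line. It is not SDC code and not how the pipeline in the post is built; it assumes the log uses the standard Common Log Format, and the names `CLF_PATTERN` and `parse_log_line` are made up for illustration. The geolocation lookup itself would need a GeoIP database and is not shown.

```python
import re
from typing import Optional

# Common Log Format: host ident authuser [timestamp] "request" status bytes
# (assumption: the server writes this format; combined-format lines also
# match, because the pattern only anchors at the start of the line)
CLF_PATTERN = re.compile(
    r'^(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line: str) -> Optional[dict]:
    """Return the fields of one access log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Apache writes "-" when no response body was sent
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


if __name__ == "__main__":
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326')
    print(parse_log_line(sample))
```

The `client_ip` field is the value a later stage would hand to a geolocation lookup.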

Getting Started with StreamSets Data Collector

By March 14, 2016

Hi, I’m Pat Patterson, newly minted ‘community champion’ here at StreamSets. As I get up to speed with big data in general and StreamSets Data Collector (SDC) in particular, I’ll write up my exploits here on the StreamSets blog to help other novices as they get started with open source big data ingest.

I’m going to assume you know the basics of what StreamSets Data Collector can do, and you want to get started actually using it. If you do need some background, the product page and FAQs are great places to start.

Now, let’s get hands on!

Binlog Processing Using Maxwell, Kafka & StreamSets

By March 2, 2016

This is a nice example of Kafka enablement using Maxwell (a MySQL-to-Kafka binlog processor) and StreamSets Data Collector from the folks at B23. It includes a schema change listener for handling data drift. Enjoy! Innovate on Your Data - Maxwell…
