
The DataOps Blog

Where Change Is Welcome

Ingest Salesforce Data for Analysis Using StreamSets

By April 29, 2016

UPDATE: Salesforce origin and destination stages, as well as a destination for Salesforce Wave Analytics, were released in StreamSets Data Collector. Use the supported, shipping Salesforce stages rather than the unsupported code mentioned below!

As I’ve mentioned a couple of times, my previous gig was as a developer evangelist at Salesforce, with particular focus on integration. A few weeks ago, I wrote a custom destination allowing StreamSets Data Collector (SDC) to write data to Salesforce Wave Analytics; today, I’ll show you how to ingest data from Salesforce and write it to any destination supported by SDC.

New Tutorial: Creating a Custom StreamSets Destination

By March 23, 2016

One of the first things I hear after I explain the basics of StreamSets Data Collector is, “Cool, so can I ingest data from/send data to X?”, for varying values of X. The short answer is, “Yes, you can!”, while the longer answer involves checking the lists of origins (for ingesting data from X) and destinations (for writing data) included with the product, and writing custom code if X is not on the list.

“My X isn’t on the list! How do I get started writing that custom code?”, I hear you shout; well, I just wrote a detailed tutorial for creating your first custom StreamSets destination that explains all. Fire up your IDE, follow the steps, and you’ll build a sample destination that sends records to RequestBin, but could be adapted to send them pretty much anywhere.

Visualizing Apache Log Data… with Minecraft!

By March 18, 2016

A key differentiator of StreamSets Data Collector (SDC) is that it operates in continuous mode: set a pipeline running and it will continue to read files from a directory or take messages from a queue. A Twitter conversation with Richard Tuttle, a solution architect at CRM Science, prompted me to wonder: would it be possible to ingest Apache Web Server log data, look up the geolocation from the client IP address, and plot the results on a map… in Minecraft?
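
To give a feel for the first step of that idea, here is a minimal Python sketch that pulls the client IP address out of an Apache access log line. It assumes the log uses the Common Log Format, and the names `CLF_PATTERN` and `parse_clf_line` are made up for illustration. This is not how SDC parses logs (SDC has its own log parsing); it only shows the kind of record a geolocation lookup would start from.

```python
import re
from typing import Optional

# Assumed layout (Apache Common Log Format):
#   host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line: str) -> Optional[dict]:
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Apache writes "-" when no response body was sent; treat that as 0 bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


if __name__ == "__main__":
    sample = (
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326'
    )
    record = parse_clf_line(sample)
    # The "host" field is what a geolocation lookup would take as input.
    print(record["host"], record["status"], record["size"])
```

The `host` field of each parsed record is the value you would hand to a geolocation lookup before plotting anything.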

Getting Started with StreamSets Data Collector

By March 14, 2016

Hi, I’m Pat Patterson, newly minted ‘community champion’ here at StreamSets. As I get up to speed with big data in general and StreamSets Data Collector (SDC) in particular, I’ll write up my exploits here on the StreamSets blog to help other novices as they get started with open source big data ingest.

I’m going to assume you know the basics of what StreamSets Data Collector can do, and you want to get started actually using it. If you do need some background, the product page and FAQs are great places to start.

Now, let’s get hands on!

StreamSets Monitoring with Grafana, InfluxDB, and jmxtrans

By January 14, 2016

The ability to monitor your critical infrastructure is a must, and we designed the StreamSets Data Collector (SDC) with this in mind: metrics are exposed through both the REST API and JMX. While there are many approaches to monitoring these metrics, let’s walk through a specific end-to-end example using jmxtrans to collect metrics, InfluxDB to store them, and Grafana to visualize them.
