StreamSets Data Integration Blog

Where change is welcome.

MySQL CDC with MapR Streams, Apache Drill, and StreamSets

By September 26, 2016

Today’s post is from Raphaël Velfre, a senior data engineer at MapR. Raphaël has spent some time working with StreamSets Data Collector (SDC) and MapR’s Converged Data Platform. In this blog entry, Raphaël explains how to use SDC for MySQL change data capture (CDC): extracting data from MySQL, writing it to MapR Streams, and then moving it from MapR Streams to MapR-FS via SDC, where it can be queried with Apache Drill.

Retrieving Metrics via the StreamSets Data Collector REST API

By July 8, 2016

Last week, I explained how I was able to run StreamSets Data Collector on a Raspberry Pi 3, ingesting sensor data and writing it to Cassandra. With that working, I wanted to show pipeline metrics on Adafruit’s awesome PiTFT Plus 2.8″ screen. In this blog post, I’ll explain how I wrote a Python app that retrieves pipeline metrics through the StreamSets Data Collector REST API and displays them on the PiTFT Plus via pygame.

Ingest Salesforce Data for Analysis Using StreamSets

By April 29, 2016

UPDATE – Salesforce origin and destination stages, as well as a destination for Salesforce Wave Analytics, were released in StreamSets Data Collector 2.2.0.0. Use the supported, shipping Salesforce stages rather than the unsupported code mentioned below!

As I’ve mentioned a couple of times, my previous gig was as a developer evangelist at Salesforce, with particular focus on integration. A few weeks ago, I wrote a custom destination allowing StreamSets Data Collector (SDC) to write data to Salesforce Wave Analytics; today, I’ll show you how to ingest data from Salesforce and write it to any destination supported by SDC.

Integrating StreamSets with Salesforce Wave Analytics

By April 4, 2016

UPDATE – Salesforce origin and destination stages, as well as a destination for Salesforce Wave Analytics, were released in StreamSets Data Collector 2.2.0.0. Use the supported, shipping Salesforce stages rather than the unsupported code mentioned below!

In my last blog entry, I explained how you can write custom destinations to send data to systems not currently supported by StreamSets Data Collector. As you might know, my last gig was as a developer evangelist at Salesforce, so I put my experience to work writing a destination for Salesforce Wave Analytics. Now you can ingest data from a variety of origins, operate on it in a StreamSets pipeline, and write the results into a Wave dataset. Once the dataset is uploaded, you can combine it with CRM and other data in Salesforce for analysis.
