StreamSets Data Integration Blog

Where change is welcome.

Standard Deviations on Cassandra – Rolling Your Own Aggregate Function

By July 28, 2016

If you’ve been following the StreamSets blog over the past few weeks, you’ll know that I’ve been building an Internet of Things testbed on the Raspberry Pi. First, I got StreamSets Data Collector (SDC) running on the Pi, ingesting sensor data and sending it to Apache Cassandra, and then I wrote a Python app to display SDC metrics on the PiTFT screen. In this blog entry I’ll take the next step, querying Cassandra for statistics on my sensor data.

Chat with the StreamSets Team via Slack!

By July 15, 2016

Since its inception last October, the sdc-user Google group has been the primary medium for you, our community, to communicate with us, the StreamSets Team. We’ve seen over a thousand messages, and participated in discussions around installation, configuration, bugs, feature requests, and every other aspect of StreamSets Data Collector. While sdc-user works well for many interactions, it lacks the immediacy of chatting in real time. So, this week, we added a new communication option: the ‘StreamSetters’ Slack team.

Retrieving Metrics via the StreamSets Data Collector REST API

By July 8, 2016

Last week, I explained how I was able to run StreamSets Data Collector Engine on a Raspberry Pi 3, ingesting sensor data and writing it to Cassandra. With that working, I wanted to show pipeline metrics on Adafruit’s awesome PiTFT Plus 2.8″ screen. In this blog post, I’ll explain how I wrote a Python app that retrieves pipeline metrics through the StreamSets Data Collector REST API and shows them on the PiTFT Plus via pygame to better manage data pipelines.
