Retrieving Metrics via the StreamSets Data Collector REST API

PiTFT Displaying SDC Metrics

Last week, I explained how I was able to run StreamSets Data Collector (SDC) on a Raspberry Pi 3, ingesting sensor data and writing it to Cassandra. With that working, I wanted to show pipeline metrics on Adafruit's awesome PiTFT Plus 2.8″ screen. In this blog post, I'll explain how I wrote a Python app to retrieve pipeline metrics with SDC's REST API and show them on the PiTFT Plus via pygame.

The StreamSets Data Collector REST API

SDC's REST API gives access to every facet of the application. Using the API, client apps can manipulate pipelines, run previews, capture snapshots; in fact, since the SDC web UI is itself a client of the REST API, client apps can use the API to do anything the web UI can do.

You can explore the SDC REST API in the web UI by clicking the 'Help' icon (top right) then 'RESTful API'.

SDC Help Menu

This will show a Swagger-generated interface, allowing you to see the available resources:


Drilling down into manager, you can see how to start and stop pipelines, and also how to get pipeline metrics:

SDC REST API - manager

Let's drill down into the metrics API:

SDC REST API - metrics

We can see the expected response when we GET the metrics, and the required parameters in a metrics request. We can even plug in a pipeline name and try it out:

SDC REST API - metrics response

I wanted to replicate the 'record count' histogram familiar from the SDC web UI; some inspection of the metrics response shows that the input/output/error record counts for the pipeline are right there.
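As a rough illustration of pulling those three counts out of a parsed metrics response, here is a minimal Python sketch. The section name `counters`, the three key names, and the sample payload are assumptions made for this example rather than copied from SDC output, so check them against the response shown in your own Swagger UI.

```python
import json

# Assumed key names; verify against the metrics response from your own SDC.
RECORD_COUNTERS = {
    "input": "pipeline.batchInputRecords.counter",
    "output": "pipeline.batchOutputRecords.counter",
    "error": "pipeline.batchErrorRecords.counter",
}


def record_counts(metrics, counters=RECORD_COUNTERS):
    """Return {'input': n, 'output': n, 'error': n} from a parsed metrics dict.

    Any counter missing from the response is reported as 0.
    """
    section = metrics.get("counters", {})
    return {
        label: section.get(key, {}).get("count", 0)
        for label, key in counters.items()
    }


# Hand-written sample in the assumed shape, not real SDC output.
sample = json.loads("""
{
  "counters": {
    "pipeline.batchInputRecords.counter": {"count": 1200},
    "pipeline.batchOutputRecords.counter": {"count": 1188},
    "pipeline.batchErrorRecords.counter": {"count": 12}
  }
}
""")

print(record_counts(sample))
```

Passing the key names in as a parameter keeps the function usable if your SDC version names the counters differently.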

You can make the same API call from the command line with curl, but you will need to supply the SDC admin username and password, and also set the custom X-Requested-By HTTP header.
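The same request can be sketched in Python with only the standard library. Everything specific here is an assumption for illustration: the base URL `http://localhost:18630`, the path `/rest/v1/pipeline/<name>/metrics`, the `admin`/`admin` credentials, the use of HTTP Basic authentication, the header value `sdc`, and the pipeline name `my pipeline`. Substitute whatever the Swagger UI on your own SDC shows.

```python
import base64
import json
import urllib.request
from urllib.parse import quote


def build_metrics_request(pipeline, base_url="http://localhost:18630",
                          user="admin", password="admin"):
    """Build (but do not send) a GET request for a pipeline's metrics."""
    # Assumed endpoint layout; the pipeline name is URL-encoded.
    url = "{}/rest/v1/pipeline/{}/metrics".format(
        base_url.rstrip("/"), quote(pipeline, safe=""))
    # Assumes SDC accepts HTTP Basic credentials on REST calls.
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={
        "Authorization": "Basic " + token,
        "X-Requested-By": "sdc",  # the custom header mentioned above
    })


def fetch_metrics(pipeline, **kwargs):
    """Send the request and return the decoded JSON body."""
    request = build_metrics_request(pipeline, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical pipeline name; requires a running SDC instance.
    print(fetch_metrics("my pipeline"))
```

Splitting request construction from sending makes it easy to inspect the URL and headers without a running SDC instance.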

Armed with this knowledge, I was able to write a simple Python app to retrieve and display metrics.

Displaying Metrics on the Raspberry Pi

I followed the instructions given by Adafruit on configuring the Raspberry Pi for the PiTFT Plus display, including the steps for setting up pygame, then worked through Jeremy Blythe's excellent tutorial on Raspberry Pi pygame UI basics. Pygame is pretty straightforward, so it only took an hour or two to replicate the record count histogram, most of the time spent tweaking the position of the bars and legends. Here's a screenshot of the result:

SDC histogram
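Most of that tweaking is plain arithmetic, which can be sketched separately from pygame. The function below is not the code from the app; it is a minimal example that assumes a 320x240 drawing surface and made-up margin and gap values, and scales the bars so the largest count fills the available height.

```python
def bar_rects(counts, width=320, height=240, margin=20, gap=20):
    """Map {label: count} to {label: (x, y, w, h)} bar rectangles.

    Bars sit on a common baseline and the largest count fills the
    drawable height. The margin and gap values are illustrative.
    """
    if not counts:
        return {}
    labels = list(counts)
    drawable_w = width - 2 * margin
    drawable_h = height - 2 * margin
    bar_w = (drawable_w - gap * (len(labels) - 1)) // len(labels)
    peak = max(max(counts.values()), 1)  # avoid dividing by zero
    rects = {}
    for i, label in enumerate(labels):
        bar_h = drawable_h * counts[label] // peak
        x = margin + i * (bar_w + gap)
        y = height - margin - bar_h  # pygame's y axis grows downward
        rects[label] = (x, y, bar_w, bar_h)
    return rects


if __name__ == "__main__":
    # Made-up counts for illustration.
    for label, rect in bar_rects({"input": 100, "output": 90, "error": 10}).items():
        print(label, rect)
```

Each tuple can be passed as the rectangle argument to `pygame.draw.rect(surface, color, rect)`.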

Building on Jeremy's GPIO sample and some experimentation with the SDC REST API Swagger UI, I was also able to use the PiTFT's buttons to start and stop the pipeline. You can grab the code from Gist.
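For a sense of what the start and stop calls might look like, here is a hedged sketch rather than the code from the Gist. It assumes the manager resource exposes `POST /rest/v1/pipeline/<name>/start` and `.../stop`, and reuses the assumed defaults of `http://localhost:18630`, `admin`/`admin` over HTTP Basic authentication, and an `X-Requested-By` value of `sdc`. Confirm the paths in the Swagger UI before relying on them.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_control_request(pipeline, action, base_url="http://localhost:18630",
                          user="admin", password="admin"):
    """Build (but do not send) a POST request to start or stop a pipeline."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    # Assumed endpoint layout; check it against your SDC's Swagger UI.
    url = "{}/rest/v1/pipeline/{}/{}".format(
        base_url.rstrip("/"), quote(pipeline, safe=""), action)
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {
        "Authorization": "Basic " + token,
        "X-Requested-By": "sdc",
    }
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")


def control_pipeline(pipeline, action, **kwargs):
    """Send the request and return the HTTP status code."""
    request = build_control_request(pipeline, action, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical pipeline name; requires a running SDC instance.
    print(control_pipeline("my pipeline", "start"))
```

A button handler would then only need to call `control_pipeline` with the appropriate action.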

Here's a short video of the system in action:


The StreamSets Data Collector REST API allows client apps to control every aspect of SDC, including starting, stopping and retrieving metrics from pipelines. A simple Python app allows use of the Raspberry Pi PiTFT Plus screen to control and monitor a pipeline. Are you using the SDC REST API? Let me know in the comments!

