Engineering

Create Microservice Pipelines with StreamSets Data Collector (Tutorial)

A microservice is a lightweight service that implements a relatively small part of a larger system – for example, providing access to user data. A microservice architecture comprises a set of independent microservices, often implemented as RESTful web services communicating via JSON over HTTP, that together implement a system’s functionality, rather than a single monolithic […]

Using Docker Wrong: My Journey to a Better Container

Following on from last week’s guest post from MapR’s Ian Downard on integrating StreamSets Data Collector with MapR Persistent Application Client Container (PACC), MapR Distinguished Technologist John Omernik offers a cautionary tale on examining your assumptions before jumping into the world of Docker. We repost John’s original article here with his kind permission. Since starting at MapR […]

Using StreamSets and MapR Together in Docker

Today’s guest blogger is Ian Downard, a Senior Developer Evangelist at MapR Technologies. Ian focuses on machine learning and data engineering, and recently documented how he brought together the MapR Persistent Application Client Container (PACC) with StreamSets Data Collector and Docker to build pipelines for ingesting data into the MapR Converged Data Platform. We’re reposting Ian’s article here, with his […]

Streaming Extreme Data Made Simple with Kinetica and StreamSets

Kinetica, just one of dozens of origins and destinations supported by StreamSets Data Collector, is a distributed, in-memory, GPU database designed for geospatial analysis, machine learning, predictive analytics, and other workloads requiring high-performance parallel processing. Mathew Hawkins, a Principal Solutions Architect at Kinetica, recently wrote an excellent tutorial on integrating Data Collector with Kinetica. We repost it here with […]

Extract Data from Google Analytics using StreamSets Data Collector

Angel Alvarado is a senior software engineer at One Degree, a San Francisco-based non-profit, and also helps run the Molanco data engineering community. Angel previously contributed a Fun Example of Streaming Data into Minecraft; this time he gets serious with the Google Analytics API. Many thanks to Angel for his kind permission to adapt this article from his original. Back […]

Change Data Capture from Oracle with StreamSets Data Collector

Today’s guest post is by Franck Pachot, an Oracle Consultant at dbi services in Switzerland. Franck has over 20 years of experience in Oracle, covering every aspect of the database from architecture and data modeling to tuning and operation. Franck recently documented his experiences testing StreamSets Data Collector’s Oracle CDC origin, and kindly allowed us to repost his blog […]
