skip to Main Content

StreamSets Data Integration Blog

Where change is welcome.

Replicating Relational Databases with StreamSets Data Collector

By February 3, 2017

HiveDrift2StreamSets Data Collector Engine has long supported both reading and writing data from and to relational databases via Java Database Connectivity (JDBC). While it was straightforward to configure pipelines to read data from individual tables, ingesting records from an entire database was cumbersome, requiring a pipeline per table. StreamSets Data Collector Engine Now introduces the JDBC Multitable Consumer, a new pipeline origin that can read data from multiple tables through a single database connection. In this blog entry, I’ll explain how the JDBC Multitable Consumer can implement a typical use case – replicating relational databases (an entire one) into Hadoop.

Back To Top