StreamSets Data Integration Blog
Where change is welcome.
AWS Reference Architecture Guide for StreamSets
Using StreamSets DataOps Platform To Integrate Data from PostgreSQL to AWS S3 and Redshift: A Reference Architecture. This document describes…
Send Kafka Messages To Amazon S3
In this post, we will take a look at best practices for integrating StreamSets Data Collector Engine (SDC), a fast data ingestion engine, with Kafka. Then, we’ll dive deep into the details of connecting Kafka to Amazon S3. But first, it’ll help to have an overview of Kafka, S3, and StreamSets Data Collector. Kafka, Amazon S3, and SDC: Apache Kafka is…
Take Control of the Data “Wild West” — And Empower Your LOB To Boot
Raise your hand if you’ve ever had a line of business user create a dataset without telling you. Don’t be shy… You’re in good company. In a recent survey of over 650 data decision-makers and practitioners, 68% said it had happened to them too. And as frustrating as it may be, it shouldn’t come as a surprise. I don’t know…
Cloud Data Migration – Knowing When, Why, and How To Move Your Data
Cloud data migration is on the rise, with cloud adoption expected to nearly double in the next five years. That growth is no surprise, given the business benefits the cloud offers. In this piece, we’ll explore why organizations are moving to the cloud, different migration strategies, the steps involved, how long cloud migration takes, and the everyday challenges…
Reverse ETL to Marketo: A Real-Life Example
Standing for Extract, Transform, and Load, the acronym ETL describes the process of extracting data from a source, transforming it, and loading it into a destination. The barest definition of ETL doesn’t include details about the nature of the source or destination. For this reason, I think there is an excellent case to be made that reverse…
How To Quickly Support Diverse LOBs With Scarce Data Engineering Resources
In a highly competitive environment, making smarter decisions faster dramatically impacts both the top and bottom lines. According to Forrester, advanced insights-driven businesses (IDBs) — firms that use data, analytics, and software in closed, continuously optimized loops to differentiate and compete — are eight times more likely than beginners to say they grew by 20% or more. That kind of…