StreamSets Data Integration Blog
Where change is welcome.
In this post, we will take a look at best practices for integrating StreamSets Data Collector Engine (SDC), a fast data ingestion engine, with Kafka. Then, we’ll dive deep into the details of connecting Kafka to Amazon S3. But first, it’ll…
One crucial part of Big Data is streaming data. As the name suggests, streaming data is data generated continuously by multiple sources, such as social media, CRM, and ERM platforms. Handling and analyzing streaming data can be complex,…
This blog post was updated March 14, 2023. Data pipeline architecture refers to the design of systems and schema that help collect, transform, and make data available for business needs. This data pipeline architecture involves tools and technologies for data…
A while ago, we talked to a prospect who wanted to stream their IoT data into Delta Lake without a Spark (Databricks) cluster running continuously. While streaming data into Delta Lake is supported by StreamSets, the second part of this…
With their ability to harness and make sense of all types of data from disparate sources, data pipelines are the foundation of modern analytics. A data pipeline refers to a series of steps (typically called jobs) that aggregate data from…
Instant notifications, product recommendations and updates, and fraud detection are practical use cases of stream processing. With stream processing, data streaming and analytics occur in real time, which helps drive fast decision-making. However, building an effective streaming architecture to handle data needs…
No one can dispute the value streaming data provides for an organization. Because it lets teams analyze data while it is in motion, it’s no surprise that data-driven organizations are incorporating it into their overall data processes to help identify business strategies that work and…
Data pipelines are meant to transfer data from a source or legacy system to a target system. Easy, right? Well, not so much. As data engineers, it’s our job to be responsible for multiple data pipeline architecture decisions…
Looking to build a Redis pipeline? Before we dig in and build our pipeline, let’s discuss what Redis is, its benefits, and common Redis pipeline use cases. What is Redis? Benefits of Redis: Performance; Simplicity and Ease-of-Use; High Availability and…