
StreamSets Data Integration Blog

Where change is welcome.

AWS Reference Architecture Guide for StreamSets

By Brenna Buuck, July 7, 2022

Using StreamSets Data Integration Platform To Integrate Data from PostgreSQL to AWS S3 and Redshift: A Reference Architecture

This document describes the reference architecture for integrating data from a database to the Amazon Web Services (AWS) data analytics stack utilizing the…

What You Need to Know About Kafka Stream Processing in Python

By Brenna Buuck, June 23, 2022

Instant notifications, product recommendations and updates, and fraud detection are practical use cases of stream processing. With stream processing, data streaming and analytics occur in real time, which helps drive fast decision-making. However, building an effective streaming architecture to handle data needs…

How to Think About Data Warehouse Design

By Brenna Buuck, June 1, 2022

This blog post was updated March 15, 2023.

Data warehouse design is the process of defining how the interdependent systems and processes involved in data warehousing will be implemented to align business needs and functional requirements. The data warehouse design…

A Birds-Eye View of a Modern Data Stack

By Brenna Buuck, May 26, 2022

The modern data stack is less like a stack and more like an ecosystem with many participants. This constellation of technologies coalesces around a few guiding principles.

Three Guiding Principles

The first principle of the modern data stack is complete…

Data Mart vs. Data Warehouse

By Brenna Buuck, April 4, 2022

What is a Data Warehouse? Data warehouses are centralized repositories used to store data for an entire organization. Data warehouses contain data from many disparate data sources and can often be quite large. Data warehouses are different from other data…

PostgreSQL vs MySQL: A Head to Head Comparison

By Brenna Buuck, March 29, 2022

What is PostgreSQL? PostgreSQL is a relational database that stores data in tables, rows, and columns with pre-defined relationships. This is as opposed to NoSQL or document storage solutions that lack these features and give up advanced analytical capabilities in…
