
StreamSets Data Integration Blog

Where change is welcome.

The Nuts and Bolts of the Databricks Lakehouse Platform


Exploding data growth has led to a search for a robust, scalable, high-performance data solution that can accommodate growing data demand. There are many solutions available, but the data warehouse and the data lake are two of the most popular. While a data warehouse collects and stores processed data for business intelligence and data analytics, the data lake offers a…

By Brenna Buuck, December 19, 2022

Four Machine Learning Deployment Methods + How To Choose the Best One


The primary goal of machine learning (ML) is to perform a task more efficiently using models, which only becomes possible if the ML models are available to end users. Many view ML deployment as an art, requiring careful collaboration between the data science, software engineering, and DevOps teams to deploy a model successfully. Also, because teams focus on different aspects…

By Brenna Buuck, December 15, 2022

The Building Blocks of AWS Lakehouse Architecture


The data lakehouse is a relatively recent evolution of data lakes and data warehouses. Amazon was one of the first to offer a lakehouse as a service. In 2019, it developed Amazon Redshift Spectrum, which lets users of the Amazon Redshift data warehouse service query data stored in Amazon S3. In this piece, we’ll dive into all things…

By Brenna Buuck, December 9, 2022

Python vs. SQL: A Deep Dive Comparison


Python and SQL are two of the most common languages in the day-to-day work of data engineers and scientists. Anyone looking to delve into data typically chooses one of them to learn and master. Understanding the nature of both languages, what they offer, and their advantages can help budding data professionals decide which language to…

By Brenna Buuck, November 29, 2022

7 Examples of Data Pipelines


The best way to understand something is through concrete examples. I’ve put together seven examples of data pipelines that represent typical patterns we see our customers use. Data engineers also encounter these patterns frequently in production environments, whatever the tool. Use these patterns as a starting point for your own data integration…

By Brenna Buuck, November 18, 2022