The DataOps Blog
Where Change Is Welcome
AWS Reference Architecture Guide for StreamSets
Using StreamSets DataOps Platform To Integrate Data from PostgreSQL to AWS S3 and Redshift: A Reference Architecture This document describes…
PostgreSQL vs MySQL: A Head to Head Comparison
What is PostgreSQL? PostgreSQL is a relational database that stores data in tables, rows, and columns with pre-defined relationships. This is as opposed to NoSQL or document storage solutions that lack these features and give up advanced analytical capabilities in favor of ease of use. It is also open-source. What does this mean? There is no fee, even for commercial…
Kafka vs. Kinesis: A Deep Dive Comparison
Data comes at businesses today at a relentless pace – and it never stops. It’s a good thing too. The data-driven enterprise is more likely to succeed. According to McKinsey, “companies with the greatest overall growth in revenue and earnings receive a significant proportion of that boost from data and analytics.” But there’s…
Python Pipeline: Here’s How to Build Your Python Package and Install It in StreamSets
Coding in Python gives developers ultimate control over every aspect of their design, but with a plethora of choices comes the danger of becoming distracted. Low-code, graphical environments provide easy operation and reuse of components, but with shallower levels of control than hand coding. Custom processors allow data engineers to operationalize their code and provide powerful extensibility for…
Get Access to Transformer for Snowflake Today!
Transformer for Snowflake is the first enterprise data transformation engine built on Snowpark. Want to learn how the engine makes advanced, native data transformations for your Data Cloud possible? Join our technical experts on Office Hours. Today at StreamSets, we're thrilled to announce the launch of our Public Preview of Transformer for Snowflake. By entering Public Preview, all users of…
How Your Data Ingestion Framework Turns Strategy into Action
With data infrastructure predicted to grow to over 175 zettabytes (ZB) by 2025, the debate among data engineers is no longer about how big the data they encounter will be. Instead, they talk about how best to design a data ingestion framework that ensures the right data is processed and cleansed for the applications that need it. Data ingestion is the…