StreamSets Data Integration Blog
Where change is welcome.
AWS Reference Architecture Guide for StreamSets
Using StreamSets DataOps Platform To Integrate Data from PostgreSQL to AWS S3 and Redshift: A Reference Architecture. This document describes…
Python Pipeline: Here’s How to Build Your Python Package and Install It in StreamSets
Coding in Python gives developers ultimate control over every aspect of their design, but with a plethora of choices comes the danger of becoming distracted. Low-code, graphical environments provide for easy operation and reuse of components, but with shallower levels of control than hand coding. Custom processors allow data engineers to operationalize their code and provide powerful extensibility for…
Get Access to Transformer for Snowflake Today!
Transformer for Snowflake is the first enterprise data transformation engine built on Snowpark. Want to learn how the engine makes advanced, native data transformations for your Data Cloud possible? Join our technical experts on Office Hours. Today at StreamSets, we're thrilled to announce the launch of our Public Preview of Transformer for Snowflake. By entering Public Preview, all users of…
How Your Data Ingestion Framework Turns Strategy into Action
With data infrastructure predicted to grow to over 175 zettabytes (ZB) by 2025, the debate among data engineers is no longer about how big the data they encounter will be. Instead, they talk about how best to design a data ingestion framework that ensures the right data is processed and cleansed for the applications that need it. Data ingestion is the…
What is Streaming Data Analytics? Use Cases, Examples, and Architecture
Netflix’s ability to stream data killed Blockbuster video. All of a sudden, customers could access movies—late-fee free—from their couch. And there was no need to drive to the store, rewind the tape, and drive back. Also, for Netflix, a catalog of movie files was far cheaper to maintain and distribute than an inventory of DVDs and VHS tapes. Streaming analytics…
Testing and Automation with the StreamSets DataOps Platform SDK for Python
We are excited to announce the immediate availability of StreamSets DataOps Platform SDK for Python version 4.0.0. It enables users to interact with StreamSets DataOps Platform programmatically using Python 3.4+. Highlights of the StreamSets SDK: SDK Activation key is no longer required; DataCollector and Transformer classes are no longer public because these are headless engines in StreamSets DataOps Platform; Authentication…