StreamSets Data Integration Blog
Where change is welcome.
AWS Reference Architecture Guide for StreamSets
Using StreamSets DataOps Platform To Integrate Data from PostgreSQL to AWS S3 and Redshift: A Reference Architecture. This document describes…
Hadoop (MapReduce) vs Apache Spark: A Deep Dive Comparison
There is no denying the impact of distributed data and computing over the last 15 years. These innovations have shattered constructs and limitations in database design that had held largely constant over the previous decades, enabling analytics at a scale and speed never imagined possible. To understand how we got to machine learning, AI, and real-time streaming, we need…
Python vs. Scala: A Deep Dive Comparison
Even if you’re relatively new to programming, you’ve most likely come across both the Python and Scala programming languages. Python and Scala are two of the most widely used languages in today’s programming ecosystem. In this piece, we’re not looking to stir the proverbial pot about which programming language is better. There are…
Cloud Data Integration: Benefits, Examples, and Why it Matters
Try to imagine removing the cloud from business. It would bring the world to a screeching halt. Yet only a decade ago, most enterprise leaders were resisting the move. The stats show us clearly who won that debate. There will be over 100 zettabytes of data stored in the cloud around the world by 2025. In 2020, the total worth…
Python vs. Java: A Deep Dive Comparison
Did you know that Python and Java are two of the most commonly used programming languages today? Of the hundreds of languages to choose from, Python and Java continue to gain user adoption, which leads to more programs and platforms being written in them and creates a snowball effect in today’s programming ecosystem. Of…
Why & How to Use Data Enrichment to Activate Your Data Lake for Analytics
We all know the shift to the cloud is massive and has been accelerated by COVID; however, I think many of us (myself included) don’t take enough time to really look at the new cloud-native services for data enrichment and analytics. AWS re:Invent last December was a great opportunity to talk to customers, prospects, and other technology companies. Meeting…