Azure Synapse Analytics, the next evolution of Azure SQL Data Warehouse, combines enterprise data warehousing and big data analytics into a single analytics service. StreamSets Cloud‘s new Azure SQL Data Warehouse destination, released today, loads data into Azure Synapse.
Today we are opening the StreamSets Cloud Beta program, inviting you to experience and give feedback on the latest addition to the StreamSets product family. StreamSets Cloud is a cloud service for designing, deploying and operating smart data pipelines, combining the ease and scalability of the cloud with the flexibility to execute pipelines anywhere – […]
StreamSets Data Collector reads from and writes to a wide variety of data stores and messaging platforms. Any interaction with an external system brings with it the risk of an error, and error messages are often less than helpful at pinpointing the root cause of the problem. Version 3.9.0 of Data Collector, released a few […]
Jeff Schmitz has been working with big data for over a decade: at Shell, Sanchez Energy, MapR and, currently, as a senior solutions architect at MongoDB. Here, in a guest post reposted with permission from the original, Jeff shares his early experience with StreamSets Data Collector. Now that I work for MongoDB I work with StreamSets quite a […]
The Glue Conference (better known as GlueCon) is always a treat for me. I’ve been speaking there since 2012, and this year I presented a session explaining how I use StreamSets Data Collector to ingest content delivery network (CDN) data from compressed CSV files in S3 to MySQL for analysis, using the Kickfire API to […]
Yannick Jaquier is a Database Technical Architect at STMicroelectronics in Geneva, Switzerland. Recently, Yannick started experimenting with StreamSets Data Collector’s Oracle CDC Client origin, building pipelines to replicate data to a second Oracle database, a JSON file, and a MySQL database. Yannick documented his journey very thoroughly, and kindly gave us permission to repost his findings […]
Randy Zwitch is a Senior Director of Developer Advocacy at OmniSci, enabling customers and community users alike to utilize OmniSci to its fullest potential. With broad industry experience in energy, digital analytics, banking, telecommunications and media, Randy brings a wealth of knowledge across verticals as well as an in-depth knowledge of open-source tools for analytics. […]
One of the drivers behind Slack‘s rise as an enterprise collaboration tool is its rich set of integration mechanisms. Bots can monitor channels for messages and reply as if they were users, apps can send and receive messages and more via a wide range of APIs, and slash commands allow users to interact with external […]
Vinu Kumar is Chief Technologist at HorizonX, based in Sydney, Australia. Vinu helps businesses in unifying data, focusing on a centralized data architecture. In this guest post, reposted from the original here, he explains how to automate data quality using open source tools such as StreamSets Data Collector, Apache Griffin and Apache Kafka. “Data is the new oil. […]
Although the recent public preview of Amazon Managed Streaming for Kafka (MSK) certainly made headlines, Kinesis remains Amazon’s supported, production, real-time streaming service. In this blog post, I’ll show you how to get started using StreamSets Data Collector to build dataflow pipelines to send data to and receive data from Amazon Kinesis Data Streams.