
The DataOps Blog

Where Change Is Welcome

Announcing StreamSets Data Collector 3.11.0 and StreamSets Data Collector Edge 3.11.0

StreamSets News

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.11.0 and StreamSets Data Collector Edge 3.11.0. StreamSets Data Collector is a powerful design and execution engine, open source under the Apache License 2.0. It moves data between any source and destination, performing transformations and push-down analytics along the way. To download, click here. StreamSets Data…

By October 8, 2019

StreamSets Cloud
Unlocking Insights: Amazon S3 to Snowflake

Engineering, StreamSets News

StreamSets Cloud is a cloud service for designing, deploying and operating smart data pipelines, combining ease and scalability with the flexibility to execute pipelines anywhere - on-premises, or in a private or public cloud. It provides an integrated user interface to design, deploy, operate and monitor smart data pipelines managed by the StreamSets Cloud service. In this step-by-step tutorial blog, you’ll…

By September 26, 2019

The StreamSets Cloud Beta is Open for Participation!

Videos, StreamSets News

Today we are opening the StreamSets Cloud Beta program and inviting you to try the latest addition to the StreamSets product family and give feedback on it. StreamSets Cloud is a cloud service for designing, deploying and operating smart data pipelines, combining the ease and scalability of the cloud with the flexibility to execute pipelines anywhere - on-premises, private cloud or public…

By September 26, 2019

StreamSets Transformer
Extensibility — Part 2:
Spark MLeap Bundles to S3

Engineering

In part 1, you learned how to extend StreamSets Transformer in order to train a Spark ML RandomForestRegressor model. In part 2, you will learn how to create a Spark MLeap bundle to serialize the trained model and save the bundle to Amazon S3. MLeap is a common serialization format and execution engine for machine learning pipelines. It supports Spark, Scikit-learn…

By September 24, 2019

StreamSets Transformer
Extensibility:
Spark and Machine Learning

Engineering

Apache Spark has been on the rise for the past few years and it continues to dominate the landscape when it comes to in-memory and distributed computing, real-time analysis and machine learning use cases. And with the recent release of StreamSets Transformer, a powerful tool for creating highly instrumented Apache Spark applications for modern ETL, you can quickly start leveraging…

By September 12, 2019