
The DataOps Blog

Where Change Is Welcome

StreamSets Transformer: Your Questions Answered

Engineering, Videos

StreamSets Transformer, a powerful tool for creating highly instrumented Apache Spark applications for modern ETL, is the newest addition to the StreamSets DataOps Platform. It gives enterprises the flexibility to create ETL pipelines for both batch and streaming data, as well as clear visibility into their data processing operation…

By October 22, 2019

Announcing StreamSets Data Collector 3.11.0 and StreamSets Data Collector Edge 3.11.0

StreamSets News

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.11.0 and StreamSets Data Collector Edge 3.11.0. StreamSets Data Collector is a powerful design and execution engine, open source under the Apache License 2.0. It moves data between any source and destination, performing transformations and push-down analytics along the way. To download, click here. StreamSets Data…

By October 8, 2019

StreamSets Cloud
Unlocking Insights: Amazon S3 to Snowflake

Engineering, StreamSets News

StreamSets Cloud is a cloud service for designing, deploying, and operating smart data pipelines. It combines ease of use and scalability with the flexibility to execute pipelines anywhere: on-premises, or in a private or public cloud. It provides an integrated user interface to design, deploy, operate, and monitor smart data pipelines managed by the StreamSets Cloud service. In this step-by-step tutorial blog, you’ll…

By September 26, 2019

StreamSets Transformer Extensibility — Part 2: Spark MLeap Bundles to S3

Engineering

In part 1, you learned how to extend StreamSets Transformer to train a Spark ML RandomForestRegressor model. In part 2, you will learn how to create a Spark MLeap bundle to serialize the trained model and save the bundle to Amazon S3. MLeap is a common serialization format and execution engine for machine learning pipelines. It supports Spark, Scikit-learn…

By September 24, 2019

StreamSets Transformer Extensibility: Spark and Machine Learning

Engineering

Apache Spark has been on the rise for the past few years and it continues to dominate the landscape when it comes to in-memory and distributed computing, real-time analysis and machine learning use cases. And with the recent release of StreamSets Transformer, a powerful tool for creating highly instrumented Apache Spark applications for modern ETL, you can quickly start leveraging…

By September 12, 2019