StreamSets Enables Data Lake in the Cloud with Microsoft Azure Integration

StreamSets Data Collector Now Available for Azure Customers, Helping Enterprises Accelerate Execution of Their Cloud Strategies

SAN FRANCISCO — March 22, 2017 — StreamSets Inc., a provider of innovative data-in-motion middleware, today announced integration of its award-winning StreamSets Data Collector™ open source software for building any-to-any dataflows with Microsoft Azure. The integration enables StreamSets customers and open source adopters to use Azure Data Lake Store as a built-in pipeline destination, accelerating time to value by being able to move and transform data without writing custom code. The software is available on the Azure Marketplace.

Cloud data storage and analytics are critical to enabling enterprise digital transformation. To be effective, enterprises must be able to move data easily between on-premises and cloud data stores in order to leverage the advantages of each, whether the data comes from traditional databases or from newer big data sources such as logs, IoT sensors and social media feeds. An integrated solution using StreamSets Data Collector and Microsoft Azure Data Lake Store simplifies development of new cloud-based applications and analytics by continually feeding data stores with timely and trustworthy data.

A major bank and a leading real estate company, both in the Forbes 1000, already use StreamSets to ingest data into Azure Data Lake and take advantage of the exceptional analytic services Microsoft Azure offers. They use StreamSets to stock their data lake in the cloud, at scale and with little to no custom code. They can easily ingest data from traditional and modern sources for use with HDInsight, U-SQL and other analytics tools.

“It is no longer a question of whether enterprises will embrace the cloud, but how,” said Jobi George, vice president of business development, StreamSets. “As a result, vendors must answer customers’ demand for flexible, portable, hybrid solutions. By integrating with Microsoft Azure Data Lake, StreamSets provides drag-and-drop access to a top-tier platform for enabling the multi-cloud reality that is shaping enterprises’ digital transformation efforts.”

Joanne Marone, director of marketing for Cloud App Dev & Data at Microsoft Corp., added, “Microsoft is dedicated to making big data and advanced analytics accessible to more people to transform their organizations. The integration with StreamSets Data Collector allows organizations to quickly ingest data into Azure Data Lake from data stores on-premises, thus providing them a hybrid data solution. With the ability to do continuous ingestion, enterprises using StreamSets can quickly unlock insights from their data using the big data and analytic solutions in Azure whether their data lives on-premises or in the cloud.”

You can download StreamSets Data Collector from the Azure Marketplace. For more information about the joint solution, watch this recent webinar featuring product experts from Microsoft and StreamSets.

About StreamSets
StreamSets provides innovative data-in-motion middleware that reinvents how enterprises deliver timely and trustworthy data to their critical applications. StreamSets Data Collector™ is award-winning, open source software for the development of any-to-any dataflows. StreamSets Dataflow Performance Manager (DPM™) provides a comprehensive control panel for managing the day-to-day operation of complex dataflow topologies. Founded by Girish Pancha, former chief product officer of Informatica, and Arvind Prabhakar, a former engineering leader at Cloudera, StreamSets is backed by top-tier Silicon Valley venture capital firms, including Accel Partners, Battery Ventures and New Enterprise Associates (NEA). For more information, visit

Media Contact:
Brittney Timmins
BOCA Communications

