Bringing DevOps Agility to Data Integration

Starting with a Purpose

In 2014, Arvind Prabhakar, an ex-Cloudera engineering leader, and Girish Pancha, an ex-Informatica product leader, came together over a difficult problem. They had both witnessed how big and fast data was breaking the traditional data integration paradigm. Designed for structured data, static environments and stable data movement patterns, traditional approaches lacked the resilience and agility needed for the new world of continuous data and frequent change. With help from a small team of gifted engineers, they architected a new way to collect data and handle data drift. And StreamSets was born.

It started with the StreamSets Data Collector, an open source data movement engine with millions of downloads, designed to handle both batch and newer streaming data sources. Work with customers made it clear that a whole new practice was needed to modernize data integration: the practice of DataOps. Over time, the StreamSets DataOps Platform was built out, enabling the agile design and operation of continuous data movement and integration architectures that connect traditional and new data sources to analytics platforms and applications.

StreamSets is all about modernizing data integration so our customers can make the most out of today’s high-volume, high-velocity, always-changing data.



The Exploding Data Supply Chain

To win with data, companies must enable modern analytics, empowering every corner of their business to analyze and act on all available data, on-demand. But the “data supply chain” has exploded, with a growing and shifting variety of data sources and storage/compute platforms being put to use for general-purpose data science as well as specific initiatives such as cybersecurity, IoT and customer 360.

This dynamic sprawl of technology makes it resource-intensive to design reliable data integration architectures. Moreover, the exploding supply chain creates data drift, the unexpected flux in data formats and semantics that breaks pipelines and compromises data integrity.


Embracing DataOps is Key

To master the modern data supply chain, enterprises need to adopt DataOps, an agile and iterative practice for data integration and management. In the same way that DevOps enables continuous delivery of applications, DataOps allows enterprises to operate data integration architectures that frequently iterate in response to changing infrastructure, data sources and analytics requirements.

A DataOps Platform Enables Pervasive Intelligence

To make it easier to be agile, StreamSets built the industry’s first DataOps platform. With it, enterprises can develop and operate data movement architectures that span edge devices, data center platforms and multiple clouds to bring about the vision of pervasive intelligence.

The platform enables companies to:

  • Build dataflows 10 times faster with one-tenth the resources
  • Move data continuously with end-to-end visibility and control
  • Enforce dataflow performance through Data SLAs
  • Iterate pipelines and topologies through a central hub

Intelligent Pipelines

The platform is architected with Intelligent Pipelines that inspect and learn from the data as it passes. This provides numerous benefits, such as the automatic handling of data drift, and the detection and securing of sensitive data in-stream for improved regulatory compliance.
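
To make the idea concrete, here is a minimal sketch of what in-stream drift handling and sensitive-data masking could look like. It is an illustration under stated assumptions, not the StreamSets implementation: the function names, the record format (plain dictionaries) and the SSN-style pattern are all hypothetical.

```python
import re

# Hypothetical pattern for US Social Security numbers; an illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def detect_drift(known_fields, record):
    """Return fields that appeared or disappeared relative to the known schema."""
    current = set(record)
    added = sorted(current - known_fields)
    missing = sorted(known_fields - current)
    return added, missing


def mask_sensitive(record):
    """Return a copy of the record with SSN-like substrings masked in string values."""
    return {
        key: SSN_PATTERN.sub("***-**-****", value) if isinstance(value, str) else value
        for key, value in record.items()
    }


def process(records, known_fields):
    """Yield masked records, widening the known schema when new fields appear."""
    known = set(known_fields)
    for record in records:
        added, missing = detect_drift(known, record)
        if added or missing:
            print(f"drift detected: added={added} missing={missing}")
        known.update(added)  # accept new fields instead of failing the pipeline
        yield mask_sensitive(record)
```

In this sketch a record that arrives with an unexpected field is reported and passed through rather than rejected, which is one simple way a pipeline might tolerate drift while still masking sensitive values as they flow by.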

Learn More

Watch the short video to see the StreamSets DataOps approach (DevOps for data management and integration).


Commercial customers include dozens of Fortune 500 companies across banking, energy, financial services, healthcare, insurance, manufacturing, media, retail, technology and telecommunications.
StreamSets had 3-fold customer growth and 4-fold ARR growth in 2017.



Fortune Great Places to Work 2017, Small and Medium Companies
CRN Big Data 100
2017 Best Places to Work, Bay Area
TiE50


Data Collective
Accel Partners
Battery Ventures
New Enterprise Associates