StreamSets Data Integration Blog
Where change is welcome.
Arvind Prabhakar and I co-founded StreamSets in 2014 with an audacious vision: data should be the lifeblood of the enterprise. Not just gathered in warehouses and lakes, but used to drive the next advances in digital transformation with operationalized data analytics.…
As I was preparing my session for the recent DataOps Summit, I realized once again that data engineering is the future of data. More than that, those data engineers who rely on DataOps will lead the way. In this blog,…
“These are the times that try men’s souls.” --Thomas Paine, The American Crisis
Many of you, reading this blog, weathered the systemic disruptions of the dot-com bubble burst and the great financial crisis. The current environment is fundamentally different. While…
Today, we announced that StreamSets raised $35 million in a Series C funding round, led by new investor Harmony Partners. I met Mark Lotke, Harmony’s Managing Partner, over 18 months ago, and we immediately hit it off because it was clear that he really got both Data and Operations, as exemplified by his investments in AppDynamics, Alation and InfluxDB. Our other new investor is Paul Drews of Tenaya Capital. Paul and StreamSets go back a long way; he was a Board Observer in his past life at Battery Ventures and must have liked what he saw. I’m also delighted that our existing investors, Dharmesh Thakker from Battery Ventures and Pete Sonsini at NEA, participated to their fullest, validating our “say what we’ll do, do what we said” doctrine.
Today we hear a lot about streaming data, fast data, and data in motion. But the truth is that we have always needed ways to move our data. Historically, the industry has been pretty inventive about getting this done. From the early days of data warehousing and extract, transform, and load (ETL) to now, we have continued to adapt and create new data movement methods, even as the characteristics of the data and data processing architectures have dramatically changed.
Exerting firm control over data in motion is a critical competency which has become core to modern data integration and operations. Based on more than 20 years in enterprise data, here is my take on the past, present and future of data in motion.
Friends of StreamSets,
Today I am delighted to announce our new product, StreamSets Dataflow Performance Manager, or DPM, the industry’s first solution for managing operations of a company’s end-to-end dataflows within a single pane of glass. The result of a year’s worth of innovative engineering and collaboration with key customers, DPM will be generally available on or before September 27, in time for Strata. We invite you to come by our booth (#451) for a live demonstration.
DPM is a natural follow-on to our first product, StreamSets Data Collector, which is open source software for building and deploying any-to-any dataflow pipelines. That product has enjoyed a great deal of success in its first year in market, with an accelerating number of weekly downloads, which now total in the tens of thousands across hundreds of enterprises, and numerous production use cases in Fortune 500 companies across a variety of industries.
Forward-looking, data-driven enterprises increasingly leverage Big Data platforms, such as Hadoop, Elasticsearch and Amazon Web Services, to derive insights from non-transactional, machine-generated data. Many tools have emerged to power next generation data pipelines and provide specialized analytic capabilities. To get value…
Today, after a year of working in stealth mode with a number of enterprise charter customers, we are excited to launch StreamSets. Arvind and I started StreamSets in June 2014 because, as they say in French, “plus ça change, plus…