StreamSets Data Collector Technology

Open Source

The StreamSets team includes top-level open source contributors. The software is completely open source, released under the Apache License 2.0. We invite developers like you to participate in an open development process and shape the conversation around the future of data ingest.


With unprecedented transparency into data flows and strong data governance capabilities, StreamSets helps organizations that need mature data ingest to simplify operations and greatly accelerate time-to-analysis.


Design self-service data pipelines with StreamSets' highly intuitive user interface. Work with structured or unstructured data. Explore, trace, and debug data flows in real time. Alter or refine data with in-stream JavaScript or Python scripting support. Or use an SDK or command-line interface to interact with data pipelines.
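
To give a feel for what altering or refining data in-stream can look like, here is a minimal Python sketch. It is an illustration only: the function name refine_batch, the record fields, and the plain-dictionary record shape are all hypothetical stand-ins, not StreamSets' scripting API. It cleans each record in a batch and sets aside the ones that cannot be cleaned.

```python
# Hypothetical stand-in for an in-stream script; NOT StreamSets' actual scripting API.
# Records are modeled as plain dictionaries purely for illustration.

def refine_batch(records):
    """Return (output, errors) after cleaning each record in a batch."""
    output, errors = [], []
    for record in records:
        try:
            cleaned = dict(record)  # copy so the input batch is left untouched
            cleaned["email"] = cleaned["email"].strip().lower()
            cleaned["amount"] = float(cleaned["amount"])
            output.append(cleaned)
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            # Keep records that cannot be refined, with the reason, instead of dropping them
            errors.append((record, repr(exc)))
    return output, errors


batch = [
    {"email": "  Ada@Example.COM ", "amount": "12.50"},
    {"email": "bob@example.com", "amount": "not-a-number"},
]
good, bad = refine_batch(batch)
```

In this sketch the first record comes out normalized and the second lands in the error list, so bad input stays visible rather than silently disappearing from the flow.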

Enterprise Grade

Designed from the ground up for the enterprise: Deploys in standalone or cluster mode. Reads from and writes to a large number of endpoints. Includes a rich complement of components that sanitize and secure incoming data before it reaches sensitive internal databases. Guarantees data delivery and security.