
Download StreamSets Data Collector

Data ingestion tool to easily move data between any source and destination

StreamSets Data Collector

Apache License 2.0

Current Release: 3.18.1 | Release Date: August 26, 2020

Design and run data pipelines in minutes with StreamSets Data Collector, an easy-to-use modern execution engine for fast data ingestion and resilient pipelines. With StreamSets Data Collector you can:

  • Design and run continuous data pipelines for structured or unstructured datasets
  • Monitor data flow metrics with built-in data flow sensors and observers
  • Automate data drift handling for CDC, ETL, real-time ML, and streaming ingestion into data lakes

Select a Download Format

To use with Windows, select Docker Image

Quick Start Guide

  1. Download and install Java 8 JDK or OpenJDK 8. (You must have Java 8 JDK, not Java 8 JRE.)
  2. Open a terminal window and set your file descriptor limit to at least 32768.
  3. Extract the tarball by entering this command in the terminal window: tar xvzf streamsets-datacollector-all-<VERSION>.tgz
  4. After the tarball is extracted, change to the root directory of the installation. For example: cd streamsets-datacollector-<VERSION>
  5. Start StreamSets Data Collector by entering this command in the terminal window: bin/streamsets dc
  6. In your browser, enter the URL shown in the terminal window. For example,
  7. To start using StreamSets Data Collector, enter your username and password (default is admin/admin).

Note: Replace <VERSION> with the current version number and remove the angle brackets.
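
Steps 2 through 5 of the Quick Start Guide can be combined into one script. The following is a minimal sketch, not an official installer. It assumes a bash shell, a tarball already downloaded into the current directory, and a VERSION value that matches that download (3.18.1 is used here only as a default). The names raise_fd_limit and start_sdc are hypothetical helpers introduced for illustration.

```shell
#!/usr/bin/env bash
# Sketch of Quick Start steps 2-5. Assumption: VERSION matches the tarball
# you downloaded; 3.18.1 is only a default for illustration.
VERSION="${VERSION:-3.18.1}"
TARBALL="streamsets-datacollector-all-${VERSION}.tgz"
INSTALL_DIR="streamsets-datacollector-${VERSION}"

# Step 2: raise the open-file limit for this shell session only.
# This fails if the system hard limit is below 32768.
raise_fd_limit() {
    ulimit -n 32768
}

# Steps 1, 3, 4, 5: check for Java, extract, change directory, start.
start_sdc() {
    command -v java >/dev/null 2>&1 || {
        echo "java not found; install a Java 8 JDK first" >&2
        return 1
    }
    raise_fd_limit || {
        echo "could not raise the file descriptor limit to 32768" >&2
        return 1
    }
    tar xvzf "$TARBALL" || return 1
    cd "$INSTALL_DIR" || return 1
    bin/streamsets dc
}

# Run the steps only when the tarball is present in the current directory.
if [ -f "$TARBALL" ]; then
    start_sdc
else
    echo "Tarball $TARBALL not found; download it first." >&2
fi
```

If saved under a hypothetical name such as quickstart.sh, the script could be run as VERSION=3.18.1 bash quickstart.sh. The raised file descriptor limit applies only to that shell session, so a permanent change would need to be made in the system configuration instead.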

