
Download StreamSets Data Collector

Data ingestion tool to easily move data between any source and destination

StreamSets Data Collector

Apache License 2.0

Current Release: 3.18.1 | Release Date: August 26, 2020

Design and run data pipelines in minutes with StreamSets Data Collector, an easy-to-use modern execution engine for fast data ingestion and resilient pipelines. With StreamSets Data Collector you can:

  • Design and run continuous data pipelines for structured or unstructured datasets
  • Monitor data flow metrics with built-in data flow sensors and observers
  • Automate data drift handling for CDC, ETL, real-time ML, and streaming ingestion into data lakes

Select a Download Format

To use Data Collector on Windows, select the Docker image.

Quick Start Guide

  1. Download and install Java 8 JDK or OpenJDK 8. (You must have Java 8 JDK, not Java 8 JRE.)
  2. Open a terminal window and set your file descriptor limit to at least 32768.
  3. Extract the tarball by entering this command in the terminal window: tar xvzf streamsets-datacollector-all-<VERSION>.tgz
  4. After the tarball is extracted, change to the root directory of the installation. For example: cd streamsets-datacollector-<VERSION>
  5. Start StreamSets Data Collector by entering this command in the terminal window: bin/streamsets dc
  6. In your browser, enter the URL shown in the terminal window. For example, http://10.0.0.100:18630
  7. To start using StreamSets Data Collector, log in with your username and password (the defaults are admin/admin).

Note: Replace <VERSION> with the current version number (for example, 3.18.1) and remove the angle brackets.
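
The prerequisites in steps 1 and 2 can be checked before you start. The following is a minimal sketch of such a preflight check, not an official StreamSets script. It assumes a POSIX-style shell, uses the hypothetical helper name fd_limit_ok, defaults VERSION to 3.18.1 (the release listed on this page), and only prints the commands from steps 3 to 5 instead of running them.

```shell
#!/bin/sh
# Preflight sketch for the Quick Start Guide (illustrative, not an official script).
# VERSION is an assumption: set it to the release you actually downloaded.
VERSION="${VERSION:-3.18.1}"
REQUIRED_FDS=32768

# File and directory names as given in steps 3 and 4.
TARBALL="streamsets-datacollector-all-${VERSION}.tgz"
INSTALL_DIR="streamsets-datacollector-${VERSION}"

# fd_limit_ok LIMIT: succeed if LIMIT is "unlimited" or a number >= REQUIRED_FDS.
# (Hypothetical helper name, introduced only for this example.)
fd_limit_ok() {
  case "$1" in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric input
  esac
  [ "$1" -ge "$REQUIRED_FDS" ]
}

# Step 1: a JDK ships javac; a JRE does not.
if command -v javac >/dev/null 2>&1; then
  echo "JDK found: $(javac -version 2>&1)"
else
  echo "javac not found: install a Java 8 JDK (a JRE alone is not enough)"
fi

# Step 2: compare the current limit on open files with the required minimum.
current_limit="$(ulimit -n)"
if fd_limit_ok "$current_limit"; then
  echo "File descriptor limit OK: $current_limit"
else
  echo "File descriptor limit is $current_limit; raise it with: ulimit -n $REQUIRED_FDS"
fi

# Steps 3 to 5: print the commands rather than running them.
echo "tar xvzf $TARBALL"
echo "cd $INSTALL_DIR"
echo "bin/streamsets dc"
```

To check a different release, set VERSION when invoking the script, for example VERSION=<VERSION> sh preflight.sh, where preflight.sh is whatever file name you saved the sketch under. Note that ulimit -n changes only the current shell session, so run Data Collector from the same terminal window in which you raised the limit.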

