StreamSets Data Collector
Apache License 2.0
Current Release: 3.18.1 | Release Date: August 26, 2020 | Release Notes
Design and run data pipelines in minutes with StreamSets Data Collector, an easy-to-use modern execution engine for fast data ingestion and resilient pipelines. With StreamSets Data Collector you can:
- Design and run continuous data pipelines for structured or unstructured datasets
- Monitor data flow metrics with built-in data flow sensors and observers
- Automate data drift handling for CDC, ETL, real-time ML, and streaming ingestion into data lakes
Select a Download Format
To use with Windows, select Docker Image
Quick Start Guide
- Download and install the Java 8 JDK or OpenJDK 8. (A JDK is required; the Java 8 JRE alone is not sufficient.)
- Open a terminal window and raise the open file descriptor limit to at least 32768 (for example, with ulimit -n 32768).
- Extract the tarball by running this command in the terminal window: tar xvzf streamsets-datacollector-all-<VERSION>.tgz
- After the tarball is extracted, change to the root directory of the installation. For example: cd streamsets-datacollector-<VERSION>
- Start StreamSets Data Collector by running this command in the terminal window: bin/streamsets dc
- In your browser, enter the URL shown in the terminal window. For example, http://10.0.0.100:18630 (18630 is the default Data Collector port).
- To start using StreamSets Data Collector, enter your username and password (default is admin/admin).
Note: Replace <VERSION> with the current version number, without the angle brackets.
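The Quick Start steps can be collected into a small shell script. The following is a minimal sketch, not an official installer: the variable SDC_VERSION and the function names fd_limit_ok, have_jdk, and install_and_run are hypothetical names chosen for illustration, and the script assumes the tarball has already been downloaded into the current directory. It checks for javac as a rough sign that a JDK (rather than only a JRE) is installed, and it does not verify that the JDK is version 8.

```shell
# Sketch of the Quick Start steps. SDC_VERSION and all function names below
# are illustrative assumptions, not part of the StreamSets distribution.

SDC_VERSION="${SDC_VERSION:-3.18.1}"   # release named at the top of this page
MIN_FDS=32768                          # minimum open-file limit from the guide

# Succeeds if the given open-file limit ($1) meets the minimum.
fd_limit_ok() {
  [ "$1" = "unlimited" ] || [ "$1" -ge "$MIN_FDS" ]
}

# Succeeds if javac is on the PATH. javac ships with a JDK but not a JRE,
# so this is only a rough check; it does not confirm the Java version.
have_jdk() {
  command -v javac >/dev/null 2>&1
}

# Runs the steps in order and stops at the first failure.
install_and_run() {
  have_jdk || { echo "JDK not found (javac is missing)" >&2; return 1; }

  # Raise the soft limit if needed; this fails if the hard limit is lower.
  fd_limit_ok "$(ulimit -n)" || ulimit -n "$MIN_FDS" || return 1

  # Assumes the tarball is in the current directory.
  tar xvzf "streamsets-datacollector-all-${SDC_VERSION}.tgz" || return 1
  cd "streamsets-datacollector-${SDC_VERSION}" || return 1

  # Starts Data Collector in the foreground; the URL is printed on startup.
  bin/streamsets dc
}
```

To use the sketch, source the file in a terminal and call install_and_run, optionally setting SDC_VERSION first to match the tarball you downloaded. Keeping the checks in separate functions makes it possible to run fd_limit_ok "$(ulimit -n)" on its own before attempting an installation.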