Before You Begin

Before you start this tutorial, complete the following steps:

  1. Download sample data.

    You can download sample data from the following location:

    https://www.streamsets.com/documentation/datacollector/sample_data/tutorial/nyc_taxi_data.csv

  2. Create directories local to the Data Collector.

    Let's use the same root directory for the origin, destination, and error files as follows:

    /<base directory>/tutorial/origin
    /<base directory>/tutorial/destination
    /<base directory>/tutorial/error

    Make sure the user who started the Data Collector has read and write permission for the directories.

  3. Save the sample data in the origin directory.

  4. Make sure the Data Collector is installed and running.

  5. To create and run the pipeline, you should have a Data Collector login with the admin role or both the creator and manager roles.

    If you haven't set up custom user accounts, you can use the default admin account that ships with the Data Collector. The default login is user name admin, password admin.
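
If you'd rather script the directory setup and download than do them by hand, the following is a minimal Python sketch of steps 1 through 3. It is not part of the Data Collector. The function names (create_tutorial_dirs, has_read_write, download_sample) are hypothetical helpers made up for this example, and the base directory is whatever you pass in. Note that the permission check tests the user running the script, which is an assumption: it only matches the tutorial requirement if that is the same user who started the Data Collector.

```python
import os
import urllib.request
from pathlib import Path

# Sample data location given in step 1 of this tutorial.
SAMPLE_URL = (
    "https://www.streamsets.com/documentation/datacollector/"
    "sample_data/tutorial/nyc_taxi_data.csv"
)


def create_tutorial_dirs(base_dir):
    """Create tutorial/origin, tutorial/destination, and tutorial/error
    under base_dir. Returns a dict mapping each name to its Path."""
    root = Path(base_dir) / "tutorial"
    dirs = {name: root / name for name in ("origin", "destination", "error")}
    for path in dirs.values():
        # parents=True creates the tutorial directory if needed;
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
    return dirs


def has_read_write(path):
    """Return True if the CURRENT user can read and write path.
    Assumption: this is the same user who started the Data Collector."""
    return os.access(path, os.R_OK | os.W_OK)


def download_sample(origin_dir, url=SAMPLE_URL):
    """Download the sample CSV into origin_dir and return the file path."""
    target = Path(origin_dir) / "nyc_taxi_data.csv"
    urllib.request.urlretrieve(url, str(target))
    return target
```

To use it, call create_tutorial_dirs with your base directory, confirm that has_read_write returns True for each of the three directories, and then pass the origin directory to download_sample. The download step needs network access to the URL from step 1.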