Upgrade an Installation from the RPM Package

When you upgrade an installation from the RPM package, the new version uses the default configuration, data, log, and resource directories. If the previous version used the default directories, the new version has access to the files created in the previous version.

If the previous version used customized values for the directory environment variables, you must make the same customizations in the new version so that the new version can access the same files.
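
To find out whether the previous version used customized directories, you can compare the values in its environment configuration file with the documented defaults. The following is a minimal sketch, not part of Data Collector: custom_dirs is a hypothetical helper, and it assumes that sdcd-env.sh sets SDC_CONF, SDC_DATA, SDC_LOG, and SDC_RESOURCES as ordinary shell assignments that are safe to source.

```shell
# Hypothetical helper: print NAME=value for each directory variable whose
# value differs from the documented default.
# Usage: custom_dirs /path/to/sdcd-env.sh
custom_dirs() {
  (
    # Work in a subshell so the sourced values do not leak into the caller.
    unset SDC_CONF SDC_DATA SDC_LOG SDC_RESOURCES
    . "$1" >/dev/null 2>&1 || exit 1
    [ "${SDC_CONF:-/etc/sdc}" = "/etc/sdc" ] || echo "SDC_CONF=$SDC_CONF"
    [ "${SDC_DATA:-/var/lib/sdc}" = "/var/lib/sdc" ] || echo "SDC_DATA=$SDC_DATA"
    [ "${SDC_LOG:-/var/log/sdc}" = "/var/log/sdc" ] || echo "SDC_LOG=$SDC_LOG"
    [ "${SDC_RESOURCES:-/var/lib/sdc-resources}" = "/var/lib/sdc-resources" ] \
      || echo "SDC_RESOURCES=$SDC_RESOURCES"
  )
}
```

If the function prints nothing, the file uses the default directories. Any line that it prints names a customization to carry over to the new version.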

Note: If you installed external libraries or developed custom stages, verify that those libraries are stored in a local directory external to the Data Collector installation directory before you upgrade. That way, Data Collector can still use the libraries after the upgrade.

To upgrade an installation from the RPM package, perform the following steps:

Step 1. Shut Down the Previous Version

Step 2. Back Up the Previous Version

Step 3. Install the New Version

Step 4. Update the Environment Configuration File

Step 5. Update the Configuration Files

Step 6. Install Additional Libraries for the Core Installation

Step 7. Uninstall Previous Libraries

Step 8. Start the New Version of Data Collector

Step 1. Shut Down the Previous Version

Stop all pipelines and then shut down the previous version of Data Collector.

  1. From the Home page, select all running pipelines in the list and then click the Stop icon.
    When the confirmation dialog appears, click Yes.
  2. Use one of the following methods to shut down Data Collector:
    • To use the command line for shutdown, use the following command:
      service sdc stop
    • To use the Data Collector console for shutdown, click Administration > Shut Down. When the confirmation dialog box appears, click Yes.

Step 2. Back Up the Previous Version

Before you install the new version, back up the files from the previous version by copying and renaming the data, log, and resource directories. Also back up the environment configuration file, $SDC_DIST/libexec/sdcd-env.sh, so that the file is not overwritten when you install the new version. With these backups, you can continue to run the previous version if needed.

Copy and rename the following directories and files:
  • Data directory defined in the SDC_DATA environment variable. Default is /var/lib/sdc.
  • Log directory defined in the SDC_LOG environment variable. Default is /var/log/sdc.
  • Resource directory defined in the SDC_RESOURCES environment variable. Default is /var/lib/sdc-resources.
  • Environment configuration file, $SDC_DIST/libexec/sdcd-env.sh.

For example, if you are upgrading from version 2.6.0.0, copy the Data Collector data directory and name the copy as follows: /var/lib/sdc2600. Back up the environment configuration file by copying it and naming the copy as follows: sdcd-env-2600.sh.
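
The naming pattern in the example can be scripted. The following is a minimal sketch under stated assumptions: version_suffix, backup_dir, and backup_env_file are hypothetical helper names, the suffix is the version number with the periods removed, and cp -a is available, as it is with GNU coreutils.

```shell
# Hypothetical helpers for the copy-and-rename backup.

# Turn a version such as 2.6.0.0 into the suffix used in the example (2600).
version_suffix() {
  printf '%s' "$1" | tr -d '.'
}

# Copy a directory to a sibling whose name ends with the version suffix.
# Refuses to run if the backup already exists.
# Usage: backup_dir /var/lib/sdc 2.6.0.0
backup_dir() {
  src=${1%/}
  dest="${src}$(version_suffix "$2")"
  [ -e "$dest" ] && return 1
  cp -a "$src" "$dest"
}

# Copy sdcd-env.sh to sdcd-env-<suffix>.sh in the same directory.
# Usage: backup_env_file "$SDC_DIST/libexec/sdcd-env.sh" 2.6.0.0
backup_env_file() {
  cp -p "$1" "${1%.sh}-$(version_suffix "$2").sh"
}
```

Run backup_dir once for each of the data, log, and resource directories, and then run backup_env_file for the environment configuration file.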

Step 3. Install the New Version

Install the new version of the RPM package. Installing the full Data Collector as a service requires root or sudo privileges.

  1. Use the following URL to download the Data Collector RPM package from the StreamSets website: https://streamsets.com/opensource.
  2. Use the following command to extract the file to a different directory than the previous version:
    tar -xzf streamsets-datacollector-<version>-all-rpms.tgz
    For example, to extract version 2.6.0.0, use the following command:
    tar -xzf streamsets-datacollector-2.6.0.0-all-rpms.tgz
  3. Use one of the following commands to install the package:
    • To install the full RPM package and all available stage libraries, use the following command:
      yum localinstall streamsets*
    • To install the core RPM package and then install individual stage libraries as needed, use the following command:
      yum localinstall streamsets-datacollector-<version>-1.noarch.rpm
      For example, to install version 2.6.0.0, use the following command:
      yum localinstall streamsets-datacollector-2.6.0.0-1.noarch.rpm

Step 4. Update the Environment Configuration File

Each RPM installation uses the same default values as the previous version for all of the directory environment variables. If the previous version used the default values, the new version is configured to use the same working directories.

If the previous version used customized values for the directory environment variables, you must make the same customizations in the new version. The new version must use the same data, log, and resource directories as the previous version.

  1. Open the environment configuration file that you backed up from the previous version.
    For example, open the $SDC_DIST/libexec/sdcd-env-2600.sh file.
  2. In the new version of Data Collector, open the $SDC_DIST/libexec/sdcd-env.sh file.
  3. Compare the previous and new versions of the environment configuration file, and update the new file as needed with the same customized property values.
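
One way to make the comparison is with diff. The following is a minimal sketch: env_changes is a hypothetical helper name, and it prints only the lines that differ between the two files.

```shell
# Hypothetical helper: list the lines that differ between the previous and
# new environment configuration files.
# Lines that start with "<" come from the previous file.
# Lines that start with ">" come from the new file.
# Usage: env_changes <previous file> <new file>
env_changes() {
  diff "$1" "$2" | grep '^[<>]'
}
```

For example, you might run env_changes "$SDC_DIST/libexec/sdcd-env-2600.sh" "$SDC_DIST/libexec/sdcd-env.sh" and then copy each customized value from a "<" line into the new file. Not every difference is a customization, because a new version can also change or add default entries.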

Step 5. Update the Configuration Files

A new Data Collector version can include new properties and configuration files required for Data Collector to start or function properly.

When you install the new RPM package, the configuration files are written to the same default directory as the previous version, /etc/sdc. The new versions of the configuration files are renamed with the following extension: .rpmnew. For example, the new version of the Data Collector configuration file is renamed to sdc.properties.rpmnew.

To update the configuration files, you must rename the previous and new versions of the files and then update the new files with any customized property values defined in the previous version.

Note: If the previous version used a customized value for $SDC_CONF, the new configuration files are written to a different directory than the previous version, and so do not require the .rpmnew file extension. In this case, you do not rename the configuration files, but must update the new files with any customized values defined in the previous version.

  1. In the working $SDC_CONF directory, /etc/sdc by default, rename all previous configuration files except for the application-token.txt file with the following extension: .old.
    The previous version of the application-token.txt file includes the authentication token that this Data Collector instance requires to issue authenticated requests to DPM. As a result, the new version must continue to use the previous version of the file.
  2. Remove the following extension from all new configuration files except for the application-token.txt file: .rpmnew.
  3. Compare the previous and new versions of the sdc.properties file, and update the new file as needed with the same customized property values.
  4. Compare the previous and new versions of the remaining files, and update the new files as needed with the same customized property values:
    • The appropriate realm.properties file, based on the authentication type that you use.
    • email-password.txt
    • keystore files
    • LDAP files
    • log4j properties file
    • security policy file
    • Vault properties file
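
The two renaming steps can be sketched as a short loop. This is a minimal sketch, not a supported tool: promote_rpmnew is a hypothetical helper name, and it handles only previous files that have a matching .rpmnew file, so rename any other previous configuration files by hand. It does not merge customized property values, so you still need to compare the files afterward.

```shell
# Hypothetical helper: for every <name>.rpmnew file in the configuration
# directory, keep the previous file as <name>.old and move the new file into
# place as <name>. The application-token.txt file is left untouched so that
# Data Collector keeps using the previous token.
# Usage: promote_rpmnew [configuration directory]   (default: /etc/sdc)
promote_rpmnew() {
  conf_dir=${1:-/etc/sdc}
  for new_file in "$conf_dir"/*.rpmnew; do
    [ -e "$new_file" ] || continue    # the directory has no .rpmnew files
    current=${new_file%.rpmnew}
    [ "$(basename "$current")" = "application-token.txt" ] && continue
    [ -e "$current" ] && mv "$current" "$current.old"
    mv "$new_file" "$current"
  done
}
```

Try the function on a copy of the configuration directory first. After it runs, each renamed pair, such as sdc.properties.old and sdc.properties, is ready for comparison.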

Step 6. Install Additional Libraries for the Core Installation

If you installed the core RPM package, install the individual stage libraries that the upgraded pipelines require.

For instructions on installing additional stage libraries, see Installing for RPM.

Step 7. Uninstall Previous Libraries

Uninstall all stage libraries used by the previous Data Collector version.

  1. Run the following command to list all stage libraries used by the previous Data Collector version:
    yum list installed | grep "datacollector" | grep "<version>"
    For example, to list all stage libraries used by Data Collector version 2.6.0.0, run the following command:
    yum list installed | grep "datacollector" | grep "2.6.0.0"
  2. Run the following command to uninstall all stage libraries used by the previous version:
    yum remove <library package name>,<library package name>,...

    Where <library package name> is the full name of a library package that you want to uninstall. Separate the names with commas, and do not include spaces between the names.
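
To avoid typing the names by hand, you can build the comma-separated list from the yum list installed output. The following is a minimal sketch: packages_for_version is a hypothetical helper name, and it assumes that each package occupies a single output line with the package name in the first column and the version in the second. yum can wrap long package names onto a second line, which this sketch does not handle.

```shell
# Hypothetical helper: read `yum list installed` output on standard input and
# print the names of the datacollector packages for the given version, joined
# by commas with no spaces.
# Usage: yum list installed | packages_for_version 2.6.0.0
packages_for_version() {
  awk -v ver="$1" '
    # index() matches fixed strings, so the periods in the version are literal.
    index($1, "datacollector") && index($2, ver) == 1 {
      names = (names == "" ? $1 : names "," $1)
    }
    END { print names }
  '
}
```

Review the printed list before you pass it to yum remove.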

Step 8. Start the New Version of Data Collector

Use the following command to start the new version of Data Collector:
  service sdc start