Using Labels to Support Pipeline Failover

When deciding how to group execution Data Collectors, consider whether you want to include one or more backup Data Collectors in the group to support pipeline failover for jobs.

You can enable pipeline failover for a job to minimize downtime due to unexpected pipeline failures and to help you achieve high availability. When you enable failover for a job, you must set the number of pipeline instances to a value less than the number of Data Collectors that have all of the labels assigned to the job. This reserves at least one available Data Collector for pipeline failover.
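The sizing rule can be sketched as a small check. This is an illustrative sketch only, not Control Hub code: the `matching_collectors` and `failover_capacity` helpers and the dictionary layout are hypothetical, and the sketch assumes that a Data Collector matches a job when its labels include every label assigned to the job.

```python
def matching_collectors(collectors, job_labels):
    """Return the names of Data Collectors that carry every label of the job.

    `collectors` is a hypothetical mapping of Data Collector name -> labels.
    """
    required = set(job_labels)
    return [name for name, labels in collectors.items()
            if required <= set(labels)]


def failover_capacity(collectors, job_labels, instances):
    """Return how many matching Data Collectors remain as backups.

    Raises ValueError when the instance count is not less than the number
    of matching Data Collectors, because no backup would be reserved.
    """
    available = len(matching_collectors(collectors, job_labels))
    if instances >= available:
        raise ValueError(
            f"{instances} instance(s) requested but only {available} "
            "matching Data Collector(s); no backup would be reserved"
        )
    return available - instances


# Hypothetical inventory used only for illustration.
inventory = {
    "dc1": ["Production"],
    "dc2": ["Production"],
    "dc3": ["Production", "West"],
    "dc4": ["Production"],
    "dc5": ["Test"],
}
backups = failover_capacity(inventory, ["Production"], instances=2)
```

With this inventory, four Data Collectors carry the Production label, so a job that runs two pipeline instances leaves two backups. Requesting four instances would raise an error, because no matching Data Collector would be left in reserve.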

For example, suppose you use Test and Production labels to designate the Data Collectors that run in those environments. Each production job must run two remote pipeline instances, and you want to minimize downtime in the production environment. You assign the Production label to four Data Collectors and configure each production job to run two pipeline instances. When you start the job, Control Hub identifies two available Data Collectors and starts a pipeline instance on each. The third and fourth Data Collectors serve as backups: if one of the running Data Collectors shuts down or a pipeline encounters an error, a backup is available to continue processing the pipeline.
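The scenario above can be walked through with a short simulation. This is a minimal sketch under stated assumptions, not Control Hub's actual behavior: the `start_job` and `fail_over` functions and the Data Collector names are hypothetical, and the sketch simply picks Data Collectors in list order, whereas the logic that Control Hub uses to choose a Data Collector is not described here.

```python
def start_job(collectors, instances):
    """Split labeled Data Collectors into running and backup lists.

    Assumption: instances go to the first `instances` entries in list order.
    """
    if instances >= len(collectors):
        raise ValueError("failover needs at least one spare Data Collector")
    return list(collectors[:instances]), list(collectors[instances:])


def fail_over(running, backups, failed):
    """Move the pipeline instance from `failed` to the first backup."""
    if failed not in running:
        raise ValueError(f"{failed} is not running a pipeline instance")
    if not backups:
        raise RuntimeError("no backup Data Collector is available")
    replacement = backups[0]
    new_running = [replacement if dc == failed else dc for dc in running]
    return new_running, backups[1:]


# Four hypothetical Data Collectors that carry the Production label.
production = ["dc1", "dc2", "dc3", "dc4"]

running, backups = start_job(production, instances=2)
# running is ["dc1", "dc2"]; backups is ["dc3", "dc4"]

running_after, backups_after = fail_over(running, backups, failed="dc1")
# running_after is ["dc3", "dc2"]; backups_after is ["dc4"]
```

After one failure, the job still runs two pipeline instances and one backup remains, which is why assigning the label to four Data Collectors tolerates two successive failures in this sketch.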

Note: In most cases, you should disable failover for a job that contains an edge pipeline. Origins in edge pipelines are tied to an SDC Edge running on a particular edge device, so a backup SDC Edge cannot continue processing from the last-saved offset recorded by the previous SDC Edge.