Creating a Job for a Pipeline

A pipeline is the design of the dataflow. A job is the execution of the dataflow.

A job defines the pipeline to run and the Data Collectors or Edge Data Collectors (SDC Edge) that run the pipeline. When you create a job, you specify the pipeline version to run and you select labels for the job. Labels indicate which group of Data Collectors or Edge Data Collectors should run the pipeline.

When you start a job that contains a standalone or cluster pipeline, Control Hub runs a remote pipeline instance on Data Collectors with matching labels. When you start a job that contains an edge pipeline, Control Hub runs a remote pipeline instance on Edge Data Collectors with matching labels.

When you create a job that includes a pipeline with runtime parameters, you can designate the job as a job template. A job template lets you run multiple job instances with different runtime parameter values from a single job definition.

For more information about jobs, see Jobs Overview.

In the Pipelines view, you can create a job for a single pipeline or for multiple pipelines at the same time.

  1. In the Navigation panel, click Pipeline Repository > Pipelines.
  2. To create a job for a single pipeline, hover over the pipeline that you want to create a job for, and then click the Create Job icon next to the pipeline.
    Or, to create jobs for multiple pipelines, select the pipelines in the list, and then click the Create Job icon at the top of the pipeline list.
  3. On the Add Job window, configure the following properties:
    Job Name - Name of the job.
    Description - Optional description of the job.
    Pipeline - Pipeline that you want to run.
    Pipeline Commit / Tag - Pipeline commit or pipeline tag assigned to the pipeline version that you want to run. You can create a job for any pipeline version. By default, Control Hub displays the latest pipeline version.
    Data Collector or Data Collector Edge Labels - Label or labels that determine the group of Data Collectors or Edge Data Collectors that run the pipeline. When you start the job, Control Hub can send an instance of the pipeline to each Data Collector or SDC Edge that is assigned all of the specified labels.
    Enable Job Template - Enables the job to work as a job template. A job template lets you run multiple job instances with different runtime parameter values from a single job definition. Enable only for jobs that include pipelines that use runtime parameters.
    Note: You can enable a job to work as a job template only during job creation. You cannot convert an existing job into a job template.
    Statistics Refresh Interval (ms) - Number of milliseconds to wait before automatically refreshing statistics when you monitor the job. The minimum and default value is 60,000 milliseconds.
    Enable Time Series Analysis - Enables Control Hub to store time series data, which you can analyze when you monitor the job. When time series analysis is disabled, you can still view the total record count and throughput for a job, but you cannot view the data over a period of time. For example, you cannot view the record count for the last five minutes or for the last hour.
    Number of Instances - Number of pipeline instances to run for the job. Increase the value only when the pipeline is designed to scale out. Default is 1, which runs one pipeline instance on the available Data Collector or SDC Edge running the fewest pipelines. An available Data Collector or SDC Edge is one that is assigned all of the labels specified for the job.
    Enable Failover - Enables Control Hub to restart a failed pipeline on another available Data Collector when the original Data Collector running the pipeline shuts down or when the pipeline encounters a Run_Error or Start_Error state. Default is disabled.
    Failover Retries per Data Collector - Maximum number of pipeline failover retries to attempt on each available Data Collector. Control Hub increments the failover retry count and applies the retry limit only when the pipeline transitions to an error state. If the Data Collector running the pipeline shuts down, failover always occurs and Control Hub does not increment the retry count. Use -1 to retry indefinitely. Use 0 to disable retries.
    Pipeline Force Stop Timeout - Number of milliseconds to wait before forcing remote pipeline instances to stop. In some situations when you stop a job, a remote pipeline instance can remain in a Stopping state. For example, if a scripting processor in the pipeline includes code with a timed wait or an infinite loop, the pipeline remains in a Stopping state until it is force stopped. Default is 120,000 milliseconds, or 2 minutes.
    Read Policy - Read protection policy to use for the job. Available with Data Protector only.
    Write Policy - Write protection policy to use for the job. Available with Data Protector only.
    Runtime Parameters - Runtime parameter values to start the pipeline instances with. These values override the default parameter values defined in the pipeline. Click Get Default Parameters to display the parameters and default values as defined in the pipeline, and then override the default values. You can configure parameter values using simple or bulk edit mode. In bulk edit mode, you configure parameter values in JSON format.
    Add to Topology - Topology to add the job to. Select one of the following options:
    • None - Do not add the job to a topology. You can add the job to a topology after job creation.
    • Default Topology - Add the job to a new topology named Default Topology. Control Hub creates the default topology and adds the job to it. You can rename the topology if needed.
    • An existing topology - Add the job to the selected existing topology.
  4. If you are creating a job for a single pipeline, optionally click Add Another to add another job for the same pipeline, and then configure the properties for the additional job.
    Click Save when you have finished configuring all jobs.
  5. If you are creating jobs for multiple pipelines, click Next to configure the properties for the next job. When you have configured a job for each selected pipeline, click Create.

    Control Hub displays the job or job template in the Jobs view.
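
When you configure the Runtime Parameters property in bulk edit mode, you supply a JSON object that maps parameter names to the values to start the pipeline instances with. A minimal sketch follows; the parameter names shown (JDBC_CONNECTION_STRING, BATCH_SIZE, OUTPUT_DIR) are hypothetical placeholders for whatever runtime parameters your pipeline actually defines:

```json
{
  "JDBC_CONNECTION_STRING": "jdbc:postgresql://example.com:5432/sales",
  "BATCH_SIZE": 1000,
  "OUTPUT_DIR": "/data/output"
}
```

If the job is a job template, each job instance started from the template can supply a different JSON object like this one, overriding the defaults defined in the pipeline.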