Monitoring Jobs

After you start a job, you can view real-time statistics, error information, logs, and alerts about all remote pipeline instances run from the job.

Important: To monitor job statistics and metrics, the pipeline must be configured to write statistics to Control Hub or to another system. Configure the pipeline to write statistics when you design the pipeline. For more information, see Pipeline Statistics.

To monitor a job, click the name of an active job in the Jobs view. Control Hub displays the pipeline in the canvas and displays statistics for the job in the Monitoring panel. Click the canvas to view statistics for the entire job. Select a stage in the canvas to view statistics for the stage.

Tip: You can also monitor job statistics, error information, and alerts from a topology.

The following image shows the job monitoring view:

The Summary tab in the Monitoring panel displays the real-time statistics for the job. The Summary tab also displays any metric, data, and data drift alerts configured on the pipeline and triggered during the pipeline run.

Tip: You can also view metrics through data delivery reports.

You'll likely spend most of your time in the Summary tab while you monitor a job. However, the Monitoring panel also includes the following tabs that provide additional information:
  • Job Status - Status of the job and of the system job. The system job includes the system pipeline, which is used to aggregate statistics for the job. For more information about system pipelines, see Pipeline Status.
  • Data Collectors or Edge Data Collectors - List of Data Collectors running each remote pipeline instance and the system pipeline, or list of Edge Data Collectors running each remote pipeline instance. Click View Logs to view the logs for the remote pipeline instance.
  • Configuration - Configuration details for the pipeline or selected stage.
  • Rules - Metric alert rules, data rules, and email IDs for alerts.
  • Info - General information about the pipeline or selected stage or link.
  • History - Job history, including all user actions completed on the job and the progress of all Data Collectors or Edge Data Collectors running a remote pipeline instance for the job.

Note the following icons that display in the top and bottom toolbars for the Monitoring panel when you monitor a job. You'll use these icons frequently as you analyze the real-time statistics for a job:

  • View Logs - View logs for the remote pipeline instance running on the selected Data Collector.
  • Auto Arrange - Arrange the stages in the pipeline.
  • Share - Share the job with other users and groups, as described in Permissions.
  • Synchronize Job - Synchronize an active job after you have updated the labels assigned to Data Collectors or Edge Data Collectors.
  • Stop Job - Stop the job.
  • Refresh - Refresh the statistics.
  • Select Data Collector - Select Aggregated to view aggregated statistics for all remote pipeline instances run for the job. Or select a specific Data Collector or SDC Edge to view statistics for the remote pipeline instance running on that Data Collector or SDC Edge.
    Note: At this time, edge pipelines cannot be configured to aggregate statistics.
  • Select Time Range - Select the time range for the statistics. For example, you can view statistics from the last 5 minutes or from the last 12 hours. Available when time series analysis is enabled for the job.
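
As a rough illustration of what the Aggregated option represents, the following Python sketch combines per-instance record counts into job-level totals. It is a conceptual model only: the instance names, the metric names, and the `aggregate_statistics` helper are hypothetical, and summing is assumed to be the appropriate way to combine counts. It does not describe how Control Hub computes aggregated statistics.

```python
def aggregate_statistics(per_instance):
    """Sum each metric across all remote pipeline instances.

    `per_instance` maps a (hypothetical) Data Collector name to a dict
    of metric name -> record count for the instance running there.
    """
    totals = {}
    for stats in per_instance.values():
        for metric, value in stats.items():
            totals[metric] = totals.get(metric, 0) + value
    return totals


# Hypothetical counts reported by two remote pipeline instances.
per_instance = {
    "sdc-1": {"input_records": 1200, "output_records": 1180, "error_records": 20},
    "sdc-2": {"input_records": 800, "output_records": 795, "error_records": 5},
}

aggregated = aggregate_statistics(per_instance)  # job-level view
single = per_instance["sdc-1"]                   # one selected Data Collector
```

Selecting a specific Data Collector corresponds to reading one entry of `per_instance`, while Aggregated corresponds to the combined `aggregated` totals.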

Time Series Analysis

When time series analysis is enabled for a job, you can view historical time series data when you monitor the job.

By default, all new jobs have time series analysis disabled. You might want to enable time series analysis for jobs for debugging purposes or to analyze dataflow performance. You can enable time series analysis for an inactive job when you edit the job.

When time series analysis is enabled, you can generate data delivery reports, monitor topologies with data SLAs, view the record count for a specific time period, and analyze time series charts for the record count, record throughput, and batch throughput. For example, the following image displays the location where you can select a time period for analysis and displays the Record Count Time Series chart:

When time series analysis is disabled, you can still view the total record count and throughput for a job, but you cannot view the data over a period of time. For example, you can’t view the record count for the last five minutes or for the last hour. You also cannot view the time series charts for the job, generate a data delivery report, or monitor topologies with data SLAs.
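
To illustrate the difference, the following Python sketch contrasts a single running total with a series of timestamped samples. It is a conceptual model only: the sample format and the `record_count_in_window` helper are assumptions made for illustration and do not describe how Control Hub stores or queries statistics.

```python
from datetime import datetime, timedelta


def record_count_in_window(samples, window, now):
    """Return the records processed during the last `window` before `now`.

    `samples` is a list of (timestamp, cumulative_record_count) tuples
    sorted by timestamp. This format is a hypothetical example.
    """
    start = now - window
    baseline = 0  # cumulative count at or before the window start
    latest = 0    # cumulative count at or before `now`
    for ts, total in samples:
        if ts <= start:
            baseline = total
        if ts <= now:
            latest = total
    return latest - baseline


now = datetime(2024, 1, 1, 12, 0)
samples = [
    (now - timedelta(minutes=60), 1_000),
    (now - timedelta(minutes=10), 9_000),
    (now - timedelta(minutes=5), 9_500),
    (now, 10_000),
]

# Without historical samples, only the running total is known.
total_only = samples[-1][1]

# With historical samples, counts for a specific period can be derived.
last_five_minutes = record_count_in_window(samples, timedelta(minutes=5), now)
last_hour = record_count_in_window(samples, timedelta(hours=1), now)
```

With only `total_only` available, there is no way to answer a question such as how many records were processed in the last five minutes. That is the kind of question the historical samples make answerable.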

Job Status

When you view the list of jobs in the Jobs view or when you monitor a job, you can view the job status. The job status is color-coded as an easy visual indicator of which jobs need your attention. A red status indicates that an error has occurred that you must resolve. An orange status indicates that a warning has occurred that you should look into. A green status indicates that all is well.

Note: A job template is simply a job definition and does not have a status.

The following list describes each job status:

  • Inactive - Job is inactive. A job transitions from an active to an inactive status when you stop the job or when all remote pipeline instances run from the job have reached a finished state. You can start, edit, reset the origin, and delete inactive jobs.
  • Activating - Control Hub is in the process of starting the job. You cannot perform actions on activating jobs.
  • Active (green) - Job is active and remote pipeline instances are running on the Data Collectors or Edge Data Collectors assigned the same labels as the job. You can monitor, synchronize, and stop active jobs.
  • Active (orange) - Job is active, but there are some issues that you should look into. You can monitor, synchronize, and stop active jobs. For example, an orange active status can indicate one of the following issues:
      • One of the Data Collectors or Edge Data Collectors assigned the same labels as the job is not currently running.
      • One of the Data Collectors or Edge Data Collectors encountered an error while running the pipeline.
      • The system pipeline is not running because the pipeline was not configured to aggregate statistics.
      • Permission enforcement is enabled for your organization, but one of the Data Collector versions is earlier than 2.4.0.0 and does not support pipeline permissions.
  • Active (red) - Job is active, but no remote pipeline instances are running for the job. For example, none of the Data Collectors assigned the same labels as the job are running, or all Data Collectors encountered an error running the pipeline. You can monitor, synchronize, and stop active jobs.
  • Deactivating - Control Hub is in the process of stopping the job. It is communicating with the Data Collectors or Edge Data Collectors to stop all remote pipeline instances. You cannot perform actions on deactivating jobs.
  • Inactive error - Job is inactive and has an error that you must acknowledge. This status can occur when at least one Data Collector or SDC Edge reported an error while attempting to stop the remote pipeline instance. For example, one Data Collector might have shut down and so could not properly stop the remote pipeline instance. You cannot perform actions on jobs with an inactive_error status until you acknowledge the error message. To acknowledge the error, view the job details or monitor the job and acknowledge reading the error message. For more information, see Acknowledging Job Errors.
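
The statuses and permitted actions described in this section can be summarized as a small lookup table. The following Python sketch is illustrative only: the `JobStatus` enum, the `ALLOWED_ACTIONS` mapping, and the `can_perform` helper are hypothetical names and are not part of any Control Hub API. The green and red labels for the active statuses are an assumption based on the color-coding described at the start of this section. Only the orange active status is named explicitly.

```python
from enum import Enum


class JobStatus(Enum):
    """Hypothetical labels for the job statuses described in this section."""
    INACTIVE = "inactive"
    ACTIVATING = "activating"
    ACTIVE_GREEN = "active (green)"    # assumed color label
    ACTIVE_ORANGE = "active (orange)"
    ACTIVE_RED = "active (red)"        # assumed color label
    DEACTIVATING = "deactivating"
    INACTIVE_ERROR = "inactive_error"


# Every active status permits the same actions, regardless of color.
_ACTIVE_ACTIONS = frozenset({"monitor", "synchronize", "stop"})

# Actions the section lists for each status.
ALLOWED_ACTIONS = {
    JobStatus.INACTIVE: frozenset({"start", "edit", "reset origin", "delete"}),
    JobStatus.ACTIVATING: frozenset(),
    JobStatus.ACTIVE_GREEN: _ACTIVE_ACTIONS,
    JobStatus.ACTIVE_ORANGE: _ACTIVE_ACTIONS,
    JobStatus.ACTIVE_RED: _ACTIVE_ACTIONS,
    JobStatus.DEACTIVATING: frozenset(),
    # No other action is possible until the error is acknowledged.
    JobStatus.INACTIVE_ERROR: frozenset({"acknowledge error"}),
}


def can_perform(status, action):
    """Return True if `action` is listed for a job in `status`."""
    return action in ALLOWED_ACTIONS[status]
```

For example, `can_perform(JobStatus.ACTIVE_RED, "stop")` is true, while `can_perform(JobStatus.INACTIVE_ERROR, "start")` is false because the error has not yet been acknowledged.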

Acknowledging Job Errors

When a job has an inactive error status, you cannot perform actions on the job until you acknowledge the error message. You can acknowledge job errors from the Jobs view or when monitoring a job.

Tip: You can also acknowledge job errors from a topology.

Acknowledging errors from the Jobs view

To acknowledge job errors from the Jobs view, click the row listing the job with the inactive error to display the job details. The details list the error message for the job and for the system job. Review the messages, taking action as needed, and then click Acknowledge Error in the bottom right corner of the job details.

For example, the following image displays the details of an inactive job that was force stopped, including the error messages for both jobs:

Acknowledging errors when monitoring a job

To acknowledge job errors when monitoring a job with an inactive error, click the Job Status tab in the Monitoring panel. The Job Status tab lists the error message for the job and for the system job. Review the messages, taking action as needed, and then click the Acknowledge Error icon in the top toolbar.

For example, the following image displays the monitoring view of the same inactive job as the previous image. The Job Status tab includes the error messages for the job:

Resetting Metrics for Jobs

When a job is inactive, you can reset the metrics for the job by resetting the origin for the job. You might want to reset metrics when you are testing jobs and want to view the metrics from the current job run only.

For more information about resetting the origin, see Resetting the Origin for Jobs.