Monitoring Data Collectors
When you view registered Data Collectors in the Execute view, you can monitor the performance of each Data Collector and the pipelines currently running on each Data Collector.
To monitor a Data Collector, simply expand the Data Collector details in the view.
Performance
When you view the details of a Data Collector version 3.4.0 or later in the Execute view, you can monitor the performance of the Data Collector. You can monitor the performance of both manually administered and automatically provisioned Data Collectors.
Control Hub does not display performance information for earlier versions of Data Collector.
For supported versions, Control Hub displays the following performance information:
- CPU Load
  Percentage of CPU being used by the Data Collector.
- Memory Used
  Amount of memory being used by the Data Collector out of the total amount of memory allocated to that Data Collector.
You can sort the list of Data Collectors by the CPU load or by the memory usage so that you can easily determine which Data Collectors are using the most resources.
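If you copy these figures out of Control Hub for offline analysis, the same ordering can be reproduced with a short script. The following Python sketch is illustrative only: the CollectorStats record, its field names, and the sample values are hypothetical and are not taken from Control Hub, and ranking memory by the fraction of allocated memory in use (rather than by the absolute amount) is a choice made here for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CollectorStats:
    """Hypothetical record; these field names are not Control Hub's."""
    name: str
    cpu_load_percent: float   # percentage of CPU in use
    memory_used_mb: float     # memory currently in use
    memory_total_mb: float    # memory allocated to the Data Collector

    @property
    def memory_fraction(self) -> float:
        """Memory used as a fraction of the memory allocated."""
        return self.memory_used_mb / self.memory_total_mb


def busiest_by_cpu(stats: List[CollectorStats]) -> List[CollectorStats]:
    """Return the records ordered from highest to lowest CPU load."""
    return sorted(stats, key=lambda s: s.cpu_load_percent, reverse=True)


def busiest_by_memory(stats: List[CollectorStats]) -> List[CollectorStats]:
    """Return the records ordered from highest to lowest memory fraction."""
    return sorted(stats, key=lambda s: s.memory_fraction, reverse=True)


# Made-up sample values, for illustration only.
collectors = [
    CollectorStats("sdc-a", 12.5, 512.0, 1024.0),
    CollectorStats("sdc-b", 48.0, 256.0, 2048.0),
    CollectorStats("sdc-c", 30.0, 900.0, 1024.0),
]

print([c.name for c in busiest_by_cpu(collectors)])
print([c.name for c in busiest_by_memory(collectors)])
```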
You can also analyze historical time series charts for the CPU load and memory usage. For example, you can view the performance information for the last hour or for the last seven days. The following image displays the location where you select a time period for analysis of the charts:
By default, registered Data Collectors send the CPU load and memory usage to Control Hub every minute. You can change the frequency with which each Data Collector sends this information to Control Hub by modifying the dpm.remote.control.status.events.interval property in the Control Hub configuration file, $SDC_CONF/dpm.properties.
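As a rough illustration of the configuration change described above, the following Python sketch rewrites the property in the text of a properties file. The sample file contents and the value 120000 are hypothetical. The sketch also assumes that the interval is expressed in milliseconds, so that the one-minute default would be 60000; confirm the unit against the comments in your own dpm.properties file before relying on it. Only simple key=value lines are handled.

```python
from typing import List

# Property named in the text above. The value unit (milliseconds) is an assumption.
INTERVAL_KEY = "dpm.remote.control.status.events.interval"


def set_property(properties_text: str, key: str, value: str) -> str:
    """Return properties_text with key set to value, appending the key if absent."""
    lines: List[str] = properties_text.splitlines()
    updated = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave blank lines and comment lines untouched.
        if not stripped or stripped.startswith(("#", "!")):
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Hypothetical file contents, for illustration only.
sample = "# Control Hub settings\ndpm.enabled=true\n" + INTERVAL_KEY + "=60000\n"

# Assuming milliseconds, 120000 would mean every two minutes.
result = set_property(sample, INTERVAL_KEY, "120000")
print(result)
```

In practice you would read the text from $SDC_CONF/dpm.properties, apply the change, and write the result back, keeping a backup of the original file.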
Pipeline Status
When you view the details of a Data Collector in the Execute view, Control Hub displays the list of pipelines currently running on this Data Collector.
Control Hub can display the following types of running pipelines for each Data Collector:
- Local pipelines
  A local pipeline is a test run of a draft pipeline or is a pipeline that is managed by a Data Collector and run locally on that Data Collector. Local pipelines should only be run on authoring Data Collectors. Use an authoring Data Collector to start, stop, and monitor local pipelines.
- Control Hub controlled pipelines
  A Control Hub controlled pipeline is a pipeline that is managed by Control Hub and run remotely on registered Data Collectors. Control Hub controlled pipelines should only be run on execution Data Collectors. Control Hub controlled pipelines include the following:
- Published pipelines run from Control Hub jobs.
After you publish or import pipelines to Control Hub, you add them to a job, and then start the job. When you start a job on a group of Data Collectors, Control Hub remotely runs an instance of the published pipeline on each Data Collector. Use Control Hub to start, stop, and monitor published pipelines that are run from jobs.
Control Hub uses the following format to name published pipelines:
<pipeline name>:<job ID>:<organization ID>
- System pipelines run from Control Hub jobs.
Control Hub automatically generates and runs system pipelines to aggregate statistics for jobs. System pipelines collect, aggregate, and push metrics for all of the remote pipeline instances run from a job. When you start a job on a group of Data Collectors, Control Hub picks one Data Collector to run the system pipeline.
Control Hub uses the following format to name system pipelines:
System Pipeline for Job <job name>:<system job ID>:<organization ID>
Note: Control Hub generates system pipelines as needed. Published pipelines that are not configured to aggregate statistics do not require system pipelines.
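The two naming formats above can be taken apart programmatically, for example when matching entries in the Pipeline Status list to jobs. The following Python sketch is one possible approach, not a Control Hub API: the function name, the RemotePipelineName type, and the sample IDs are hypothetical, and the code assumes that neither the job ID nor the organization ID contains a colon.

```python
from typing import NamedTuple, Optional

SYSTEM_PREFIX = "System Pipeline for Job "


class RemotePipelineName(NamedTuple):
    kind: str             # "published" or "system"
    name: str             # pipeline name, or job name for a system pipeline
    job_id: str           # job ID, or system job ID for a system pipeline
    organization_id: str


def parse_pipeline_name(display_name: str) -> Optional[RemotePipelineName]:
    """Split a displayed name into its parts; return None if it does not match."""
    # Split from the right so that colons inside the pipeline or job name are kept.
    parts = display_name.rsplit(":", 2)
    if len(parts) != 3 or not all(parts):
        return None
    name, job_id, org_id = parts
    if name.startswith(SYSTEM_PREFIX):
        return RemotePipelineName("system", name[len(SYSTEM_PREFIX):], job_id, org_id)
    return RemotePipelineName("published", name, job_id, org_id)


# Hypothetical names and IDs, for illustration only.
print(parse_pipeline_name("Orders ETL:job123:acme"))
print(parse_pipeline_name("System Pipeline for Job Orders Job:sys456:acme"))
print(parse_pipeline_name("Local Test"))
```

A name that fails to match is probably a local pipeline, but a match is not proof of the reverse: a local pipeline whose name happens to contain two colons would also be parsed as a published pipeline.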
The following image shows the Pipeline Status area for a Data Collector that is currently running a local pipeline and two published pipelines: