System Metrics

The System Metrics origin reads system metrics from the edge device where StreamSets Data Collector Edge (SDC Edge) is installed. Use the System Metrics origin only in pipelines configured for edge execution mode.

The System Metrics origin reads the metrics from the edge device at regular intervals, determined by the delay between batches that you configure. For example, if you set the delay to 10 minutes (600,000 milliseconds), the origin creates a new batch containing the selected system metrics every 10 minutes.
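The interval behavior can be pictured as a simple timed loop. The following is a minimal sketch in Go, not the origin's actual implementation: `collectBatches` and its `read` callback are hypothetical names, and reading the first batch immediately before waiting is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// collectBatches gathers maxBatches batches, waiting delayMillis milliseconds
// between consecutive batches. read is a hypothetical stand-in for whatever
// samples the device. delayMillis must be greater than zero.
func collectBatches(delayMillis int, maxBatches int, read func() string) []string {
	ticker := time.NewTicker(time.Duration(delayMillis) * time.Millisecond)
	defer ticker.Stop()

	batches := make([]string, 0, maxBatches)
	for i := 0; i < maxBatches; i++ {
		if i > 0 {
			<-ticker.C // wait out the configured delay before the next batch
		}
		batches = append(batches, read())
	}
	return batches
}

func main() {
	// 20 ms keeps the demo fast; a 10-minute delay would be 600000 ms.
	n := 0
	batches := collectBatches(20, 3, func() string {
		n++
		return fmt.Sprintf("batch %d", n)
	})
	fmt.Println(batches)
}
```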

Each batch contains a single record that includes the timestamp when the data was read and a map field for each selected system metric type. When you configure the origin, you select the types of system metrics to read, including host information and CPU, memory, disk, and network statistics.
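To make the record shape concrete, here is a rough sketch in Go of a record with a timestamp and one map field per selected metric type. The `newRecord` helper, the `timestamp` key, and the sample values are hypothetical; `hostInfo` is the map field name that this page mentions for host information, and the real field names can be checked in pipeline preview.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// newRecord builds one record: a read timestamp plus one map field per
// selected metric type. The "timestamp" key is an illustrative assumption.
func newRecord(readAtMillis int64, selected map[string]map[string]interface{}) map[string]interface{} {
	record := map[string]interface{}{"timestamp": readAtMillis}
	for metricType, values := range selected {
		record[metricType] = values
	}
	return record
}

func main() {
	now := time.Now().UnixNano() / int64(time.Millisecond)
	record := newRecord(now, map[string]map[string]interface{}{
		// Sample values only; real contents depend on the device.
		"hostInfo": {"hostname": "edge-01", "os": "linux"},
	})

	out, err := json.MarshalIndent(record, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```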

Example

You want to collect, monitor, and analyze the system metrics of all of your edge devices.

You install SDC Edge on each edge device. You use Data Collector to design an edge sending pipeline that includes the System Metrics origin and an HTTP Client destination that posts the system metrics to an HTTP endpoint. You deploy the edge sending pipeline to all of the edge devices and then run the pipeline on each device.

You design a Data Collector receiving pipeline that includes an HTTP Server origin that reads the system metrics posted to the HTTP endpoint. After reading the metrics, the Data Collector receiving pipeline performs additional processing on the data and then writes the data to Elasticsearch for analysis of the metrics. You run the Data Collector receiving pipeline on Data Collector.
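In this setup the HTTP Server origin plays the role of the endpoint. Purely to illustrate the exchange (a JSON record posted over HTTP and accepted on the other side), the sketch below uses a plain Go handler as a stand-in. It is not how Data Collector implements the origin; `metricsHandler`, `postSample`, and the JSON payload format are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// metricsHandler accepts one JSON record per POST and passes it to store.
func metricsHandler(store func(map[string]interface{})) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "POST only", http.StatusMethodNotAllowed)
			return
		}
		var record map[string]interface{}
		if err := json.NewDecoder(r.Body).Decode(&record); err != nil {
			http.Error(w, "invalid JSON", http.StatusBadRequest)
			return
		}
		store(record)
		w.WriteHeader(http.StatusOK)
	})
}

// postSample sends body to the handler in-process and returns the status code.
func postSample(h http.Handler, body string) int {
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	received := 0
	h := metricsHandler(func(map[string]interface{}) { received++ })

	fmt.Println(postSample(h, `{"hostInfo":{"os":"linux"}}`), received)
}
```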

For more information about designing edge sending pipelines and Data Collector receiving pipelines, see Meet StreamSets Data Collector Edge.

Collected System Metrics

The System Metrics origin uses the psutil package for the Go programming language (or Golang) to collect system metrics.

The values that the psutil package for Golang collects vary based on the operating system of the edge device. For a complete list of the metrics that the System Metrics origin collects for each operating system, run preview for the edge pipeline.

For example, the following image displays preview for a System Metrics origin configured to collect all system metrics types:

When you expand the hostInfo map field, preview displays the host information collected for a Linux operating system:
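For a feel of how host values differ from one device to another, the sketch below gathers a few of them using only the Go standard library. It does not use the psutil package and does not reproduce the origin's output; `hostSnapshot` and its keys are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// hostSnapshot collects a few host values with the standard library only.
// The key names are illustrative and are not the origin's field names.
func hostSnapshot() map[string]interface{} {
	info := map[string]interface{}{
		"os":    runtime.GOOS,     // for example "linux", "darwin", or "windows"
		"arch":  runtime.GOARCH,   // for example "amd64" or "arm64"
		"cores": runtime.NumCPU(), // logical CPUs usable by this process
	}
	if name, err := os.Hostname(); err == nil {
		info["hostname"] = name
	}
	return info
}

func main() {
	for key, value := range hostSnapshot() {
		fmt.Printf("%s: %v\n", key, value)
	}
}
```

Running the same program on different operating systems returns different values, which is why preview on the target device is the reliable way to see what a given pipeline collects.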

Configuring a System Metrics Origin

Configure a System Metrics origin to read system metrics from the edge device where SDC Edge is installed.

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property - Description
    Name - Stage name.
    Description - Optional description.
    On Record Error - Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the pipeline for error handling.
    • Stop Pipeline - Stops the pipeline.
  2. On the System Metrics tab, configure the following properties:
    System Metrics Property - Description
    Delay Between Batches - Number of milliseconds to wait before creating the next batch of data.
    Fetch Host Information - Includes host information from the edge device, such as the host name, operating system, and platform.
    Fetch CPU Stats - Includes CPU statistics from the edge device, such as the number of available cores and the percentage of CPU being used.
    Fetch Memory Stats - Includes memory statistics from the edge device, such as the amount of available and used memory on the device.
    Fetch Disk Stats - Includes disk statistics from the edge device, such as the serial number and disk partitions of the device.
    Fetch Network Stats - Includes network statistics from the edge device, such as information about the open connections on the device.
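
Because Delay Between Batches is expressed in milliseconds, it can help to work out the value before entering it. The sketch below models the System Metrics tab as a Go struct and converts minutes to milliseconds. The struct, its field names, and the helper functions are hypothetical and do not reflect how SDC Edge stores pipeline configuration.

```go
package main

import "fmt"

// SystemMetricsConfig mirrors the properties on the System Metrics tab.
// Field names are illustrative assumptions.
type SystemMetricsConfig struct {
	DelayBetweenBatchesMillis int64
	FetchHostInfo             bool
	FetchCPUStats             bool
	FetchMemoryStats          bool
	FetchDiskStats            bool
	FetchNetworkStats         bool
}

// minutesToMillis converts whole minutes to milliseconds.
func minutesToMillis(minutes int64) int64 {
	return minutes * 60 * 1000
}

// selectedTypes lists the metric types that are enabled, in tab order.
func (c SystemMetricsConfig) selectedTypes() []string {
	var types []string
	if c.FetchHostInfo {
		types = append(types, "host information")
	}
	if c.FetchCPUStats {
		types = append(types, "CPU")
	}
	if c.FetchMemoryStats {
		types = append(types, "memory")
	}
	if c.FetchDiskStats {
		types = append(types, "disk")
	}
	if c.FetchNetworkStats {
		types = append(types, "network")
	}
	return types
}

func main() {
	cfg := SystemMetricsConfig{
		DelayBetweenBatchesMillis: minutesToMillis(10), // 10 minutes
		FetchHostInfo:             true,
		FetchCPUStats:             true,
	}
	fmt.Println(cfg.DelayBetweenBatchesMillis, cfg.selectedTypes())
}
```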