Rules and Alerts

Rules and Alerts Overview

Define rules for a pipeline to capture information about the pipeline as it runs. Rules trigger alerts to notify you when a specified condition occurs.

You can create the following types of rules for a pipeline:
  • Metric rule - Gathers statistics about the pipeline such as pipeline idle time or error record counts. Provides an alert when enabled.
  • Data rule - Gathers details about data as it passes between two stages. Can provide a meter and alert when enabled.
  • Data drift rule - Gathers details about data drift as data passes between two stages. Can provide a meter and alert when enabled.
When a rule triggers an alert, you can be informed of the alert in the following ways:
  • Pipeline monitoring - By default, all triggered alerts display in the Notifications icon in the toolbar when you monitor the pipeline. The alerts display for the duration of the pipeline run. When you stop the pipeline, the alerts disappear.
  • Webhooks - You can optionally configure webhooks that are sent when alerts are triggered. A webhook is a user-defined HTTP request that the pipeline sends automatically when certain actions occur. Each triggered alert sends every webhook configured in the Rules tab for the pipeline.

Metric Rules and Alerts

Metric rules and alerts provide notifications about real-time statistics for pipelines.

When you run a pipeline, StreamSets Cloud displays real-time statistics about the pipeline in the Monitoring tab. You can define and enable metric rules so that you are alerted when a statistic reaches a certain threshold. By default, each enabled metric rule displays the alert in the Notifications icon in the toolbar when you monitor the pipeline. You can also configure each metric rule to send a webhook alert.

For example, the Record Count statistics in the Monitoring tab display the number of input, output, and error records that the pipeline has processed:

You can view the number of error records in these statistics. However, you might want to be notified when the number of error records reaches a certain threshold. You can enable the default rule for the Pipeline Error Records Counter metric to send an alert when the pipeline encounters more than 100 error records. When you enable the metric rule and the alert triggers, the Notifications icon indicates that an alert has triggered. When you click the Notifications icon, the alert text displays, as follows:

You configure metric rules when you configure the pipeline. StreamSets Cloud provides a set of default metric rules that you can edit and enable for any pipeline. Metric rules take effect after you enable them.

You can also create custom metric rules. When you create a custom metric rule, you select the metric type. The metric type determines which statistic triggers the alert. You configure the condition that triggers the alert, and enter the text to display in the alert.

Default Metric Rules

StreamSets Cloud provides a set of default metric rules that you can edit and enable for any pipeline.

You might want to edit a default metric rule to modify the alert text or the condition for the rule. By default, none of the rules are enabled. To enable a rule, select it and then click Enable.

StreamSets Cloud provides the following default metric rules:

Metric Types

You can use different metric types when you create a metric rule. The metric type determines which statistic triggers the alert.

After selecting a metric type, you select the metric ID which specifies the metric to use. For example, a metric ID can be a runtime statistics gauge or an input records meter. You then select the metric element that defines what the metric is measuring. A metric element can be a count, rate, median, minimum, maximum, or percentage. The possible metric ID and metric element vary by metric type.

Gauge

The gauge metric type provides alerts based on the number of input, output, or error records for the last processed batch. It also provides alerts on the age of the current batch, the amount of time a stage takes to process a batch, or the time that the pipeline last received a record from the origin.

The gauge metric type provides alerts about some of the runtime statistics displayed when you monitor a pipeline:

The gauge metric type includes a single metric ID, Runtime Statistics Gauge. You can configure the alert to trigger on the following metric elements:
  • Current Batch Age
  • Last Batch Input, Output, or Error Records Counts
  • Last Batch Error Messages Count
  • Time in Current Stage
  • Time of Last Received Record

For example, you can configure a gauge metric rule that triggers an alert when the pipeline has been processing a batch for more than 5 minutes.

Counter

The counter metric type provides alerts based on the memory usage or the number of input, output, or error records for the pipeline or for a stage in the pipeline.

The following counter metrics display the records processed by the pipeline and the memory usage of the pipeline:

The counter metric type includes the following metric IDs:
  • Pipeline batch count.
  • Number of input records, output records, error records, or stage errors for the pipeline or for a stage in the pipeline.
  • Memory usage for the pipeline or for a stage in the pipeline.

For any of the selected metric IDs, you can configure the alert to trigger on the count metric element.

For example, you can configure a counter metric rule that triggers an alert when a pipeline encounters more than 1,000 error records.

Histogram

The histogram metric type provides alerts based on a histogram of different record types and stage errors for the pipeline or for a stage in the pipeline.

The histogram metric type provides alerts about the Records Per Batch Histogram statistics displayed as you monitor the pipeline:

The histogram metric type includes metric IDs for the input records, output records, error records, or stage errors for the pipeline or for a stage in the pipeline. You can configure the alert to trigger on the metric elements displayed in the monitoring histogram: mean, standard deviation, percentage, or count.

For example, you can configure a histogram metric rule that triggers an alert when the mean number of input records per batch processed by the pipeline reaches 10,000.

Meter

The meter metric type provides alerts based on rates of different record types and stage errors for pipelines or for a stage in the pipeline.

The meter metric type can provide alerts about the number of batches processed by the pipeline. The meter metric type can also provide alerts about the Record Count and Record Throughput statistics for the pipeline or for a stage:

The meter metric type includes metric IDs for the pipeline batch count and for the input records, output records, error records, or stage errors for the pipeline or for a stage in the pipeline. You can configure the alert to trigger on the following metric elements displayed in the Record Count and Record Throughput statistics: count, time rates, or mean.

For example, you can configure a meter metric rule that triggers an alert when the number of output records that a stage processes reaches 5,000 in one minute.

Timer

The timer metric type provides alerts based on batch processing timers for the pipeline or for a stage in the pipeline.

The timer metric type provides alerts about the Batch Throughput and Batch Processing Timer statistics displayed for the pipeline or a stage.

The timer metric type includes the following metric IDs:
  • Pipeline Batch Processing Timer - Amount of time for the pipeline to process a batch.
  • <stage_name> Batch Processing Timer - Amount of time for a stage to process a batch.

You can configure the alert to trigger on the following metric elements displayed in the Batch Processing Timer statistics: mean, standard deviation, percentage, time rates, or count.

For example, you can configure a timer metric rule that triggers an alert when the mean amount of time that the pipeline takes to process a batch reaches 10 minutes.
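
The metric type sections above pair each type with a fixed set of metric elements. As a rough summary, the following Python lookup restates those pairings. The labels are copied from the prose on this page and are not product identifiers, and `is_valid_element` is a hypothetical helper rather than anything StreamSets Cloud provides.

```python
# Metric elements per metric type, as listed in the sections above.
# Labels are descriptive text from this page, not product identifiers.
METRIC_ELEMENTS = {
    "Gauge": {
        "Current Batch Age",
        "Last Batch Input, Output, or Error Records Counts",
        "Last Batch Error Messages Count",
        "Time in Current Stage",
        "Time of Last Received Record",
    },
    "Counter": {"count"},
    "Histogram": {"mean", "standard deviation", "percentage", "count"},
    "Meter": {"count", "time rates", "mean"},
    "Timer": {"mean", "standard deviation", "percentage", "time rates", "count"},
}


def is_valid_element(metric_type, element):
    """Return True when the element is listed for the given metric type."""
    return element in METRIC_ELEMENTS.get(metric_type, set())
```

For instance, `is_valid_element("Counter", "mean")` is false in this sketch, because the counter metric type only supports the count element.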

Metric Conditions

When you configure a metric rule, you configure the condition that defines the threshold at which the metric rule triggers an alert.

Use the expression language to configure the condition. The expression language provides the following functions for creating metric rule conditions:

value()
Returns the value of the current metric selected in the metric rule. Use in conditions for any type of metric rule.
For example, the default rule for the Pipeline Error Records Counter metric includes the following condition:
${value() > 100}

The alert is triggered when the pipeline encounters more than 100 error records.

time:now()
Returns the current time of the execution environment as a java.util.Date object. Use in conditions for gauge metric rules.
For example, the default rule for the Runtime Statistics Gauge metric that checks whether the pipeline is idle includes the following condition:
${time:now() - value() > 120000}

The alert is triggered when the current time exceeds the time of the last received record by more than 120,000 milliseconds (2 minutes).
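
To check the threshold arithmetic of the two example conditions outside the product, they can be modeled in a few lines of Python. This is only a sketch: the function names are hypothetical, the expression language itself is not Python, and the idle check assumes that both times are expressed as epoch milliseconds.

```python
import time


def error_count_alert(value, threshold=100):
    """Models ${value() > 100}: true once the counter exceeds the threshold."""
    return value > threshold


def idle_alert(last_record_ms, now_ms=None, max_idle_ms=120_000):
    """Models ${time:now() - value() > 120000}.

    Assumes both times are epoch milliseconds. When now_ms is omitted,
    the current system time is used.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - last_record_ms > max_idle_ms
```

Both comparisons are strict, so a counter value of exactly 100 or an idle period of exactly 120,000 milliseconds does not trigger the alert in this model.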

Configuring a Metric Rule and Alert

Create a custom metric rule to receive alerts when a real-time statistic reaches a certain threshold. You can create metric rules and alerts when you configure a pipeline.

After you create a custom metric rule, enable the rule to activate it.
  1. To view pipeline configuration options, click an unused section of the pipeline canvas.
  2. In the pipeline properties pane, click the Rules tab.
  3. Click the Metric Rules tab, and then click Create New Rule.
  4. Configure the following properties:
    • Alert Text - Text to display when the alert is triggered. Enter text that explains the reason for the alert. For example, "Over 1000 pipeline error records."
    • Metric Type - Type of metric information the alert is based on: Gauge, Counter, Histogram, Meter, or Timer.
    • Metric ID - Metric to use. Provides a list of available metrics based on the metric type.
    • Metric Element - Metric element to use. Provides a list of available elements based on the metric ID.
    • Condition - Condition to trigger the alert. Use the expression language to configure the condition.
  5. Click Create.
    The new metric rule displays in the list.
  6. To enable the rule, select the rule and then click Enable.

Data Rules and Alerts

Data rules define information that you want to see about the data that passes between stages. You can create data rules based on any link in the pipeline. You can also enable meters and create alerts for data rules.

For example, in the following sample running pipeline, the first link has a defined data rule, while the second link does not have one:

When you click a link with a defined data rule, the monitoring pane displays summary statistics, data rules, and information about the stream of data that the link represents. The summary statistics that display are based on the data rules that you create.

You configure data rules when you configure the pipeline. To create a data rule, you need familiarity with the data being processed. You might preview data or take a snapshot of data to help determine how to configure data rules.

Configuring a Data Rule and Alert

Create a data rule to view metrics, sample data, and alerts about data that passes between stages.

You can create data rules and alerts when you configure a pipeline. After you create a data rule, enable the rule to activate it.
  1. To view pipeline configuration options, click an unused section of the pipeline canvas.
  2. In the pipeline properties pane, click the Rules tab.
  3. Click the Data Rules tab, and then click Create New Rule.
  4. Configure the following properties:
    • Stream - Link selected for the data rule.
    • Label - Label to display for the data rule.
    • Alert Text - Text to display when the alert is triggered. Enter text that explains the reason for the alert. For example, "Over 100 missing phone numbers." You can use the expression language to define the alert text. For example, use the record:errorMessage() function to display the error message in the alert text.
    • Condition - Condition that defines the data rule. Use the expression language to configure the condition.
    • Sampling Percentage - Percentage of records to sample to generate information for the data rule.
    • Sampling Records to Retain - Number of sampled records to keep in memory for display.
    • Enable Alert - Enables an alert based on the data rule. Alerts display when the configured conditions occur.
    • Enable Meter - Enables gathering information for the data rule. The information gathered displays when you select the link while monitoring the pipeline.
    • Threshold Type - Type of threshold that defines when the alert becomes active: Count (a specified number of records) or Percentage (a specified percentage of records).
    • Threshold Value - Value that defines the threshold at which the rule triggers an alert.
    • Min Volume - Minimum number of records to process before evaluating a percentage threshold type.
  5. Click Create.
    The new data rule displays in the list.
  6. To enable the rule, select the rule and then click Enable.
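
The interaction between Threshold Type, Threshold Value, and Min Volume can be sketched as follows. This Python model is an illustration under stated assumptions, not the product's implementation: the function and argument names are hypothetical, and the strict "greater than" comparison is an assumption, since this page does not say whether an alert triggers at or above the threshold.

```python
def data_rule_alert(matched, evaluated, threshold_type, threshold_value, min_volume=0):
    """Decide whether a data rule alert would trigger in this simplified model.

    matched         - number of sampled records that met the rule condition
    evaluated       - number of sampled records evaluated so far
    threshold_type  - "COUNT" or "PERCENTAGE" (hypothetical labels)
    threshold_value - record count or percentage that defines the threshold
    min_volume      - records required before a percentage is evaluated
    """
    if threshold_type == "COUNT":
        # Assumption: the alert triggers once the count exceeds the threshold.
        return matched > threshold_value
    if threshold_type == "PERCENTAGE":
        # Min Volume only applies to percentage thresholds.
        if evaluated == 0 or evaluated < min_volume:
            return False
        return (matched / evaluated) * 100 > threshold_value
    raise ValueError(f"unknown threshold type: {threshold_type}")
```

In this model, 5 matches out of 10 records does not trigger a 20 percent threshold when Min Volume is 1,000. The early percentage is ignored until enough records have been processed.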

Viewing Data Rule Metrics and Sample Data

You can view the sample data generated by a data rule while monitoring a pipeline. For data rules with metering enabled, you can also view a graph that displays the metering information for the rule.

  1. While monitoring a pipeline, in the pipeline canvas, select the link with the data metrics that you want to view.
  2. If necessary, in the monitoring pane, click the Monitoring tab.
    Sample data displays. When enabled for the rule, metering information displays on the right.

    In the following example, a data alert is triggered, and the monitoring pane displays the sample records and metering graphic for a data rule between the MySQL Multitable Consumer and Expression Evaluator stages:

Data Drift Rules and Alerts

You can create data drift rules to indicate when the structure of data changes. You can create data drift rules on any link in the pipeline. You can also enable meters and create alerts for data drift rules.

The expression language provides data drift functions for creating data drift rules. You can use specific field types with each function. The following list describes the types of data drift rules that you can generate, the drift function for each, and the valid field data types:
  • Field name changes - drift:name(). Valid field data types: list-map, map.
  • Field order changes - drift:order(). Valid field data types: list-map.
  • Number of fields - drift:size(). Valid field data types: list, list-map, map.
  • Field data type - drift:type(). Valid field data types: any.

You can view the metrics and sample records for data drift rules in the same way that you view data rule metrics and records. For more information, see Viewing Data Rule Metrics and Sample Data.

For details about the data drift functions, see Data Drift Functions.

Data Drift Alert Triggers

Data drift alerts trigger when a change of the specified type occurs from record to record.

For example, you have an alert that triggers when the number of fields in the record changes. When processing the records with the following number of columns, an alert triggers for both the third and fourth records:
Record Number Number of Columns
1 10
2 10
3 15
4 10

Data drift functions include an ignoreWhenMissing flag to determine the behavior when the specified field does not exist. When the specified field is missing and ignoreWhenMissing is set to true, an alert is not triggered.

When the specified field is missing and the ignoreWhenMissing flag is set to false, the expression triggers an alert for the missing field, and again for the next record when the field is present.

For example, the following expression checks the data type of the UserID field with ignoreWhenMissing set to false:
${drift:type('/UserID', false)}

Say all records include the UserID field, and then a single record passes without the UserID field. This expression triggers an alert for the record with the missing field, and again when the next record arrives that includes the UserID field.
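
The record-to-record comparison described in this section can be modeled with a short Python sketch. It is a simplified illustration, not the drift functions themselves: `drift_alerts` is a hypothetical name, `None` stands in for a missing field, and the sketch assumes that a record skipped because of ignoreWhenMissing leaves the previously stored value unchanged.

```python
_UNSET = object()  # marks "no record seen yet"


def drift_alerts(observed, ignore_when_missing=False):
    """Return the 1-based record numbers at which a drift alert would trigger.

    observed holds the tracked property for each record, such as the number
    of fields or the data type of a field. None means the field is missing.
    """
    alerts = []
    previous = _UNSET
    for number, current in enumerate(observed, start=1):
        if current is None and ignore_when_missing:
            # Assumption: skipped records do not update the stored value.
            continue
        if previous is not _UNSET and current != previous:
            alerts.append(number)
        previous = current
    return alerts


# Field counts from the table above: alerts on the third and fourth records.
size_alerts = drift_alerts([10, 10, 15, 10])

# One record without the UserID field, with ignoreWhenMissing set to false.
type_alerts = drift_alerts(["STRING", "STRING", None, "STRING"])
```

With `ignore_when_missing=True`, the same UserID sequence produces no alerts in this model, because the record without the field is skipped and the next record matches the stored type.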

Configuring Data Drift Rules and Alerts

Create a data drift rule to view metrics, sample data, and alerts when the structure of data changes. You can create data drift rules and alerts when you configure a pipeline.
  1. To view pipeline configuration options, click an unused section of the pipeline canvas.
  2. In the pipeline properties pane, click the Rules tab.
  3. Click the Data Drift Rules tab, and then click Create New Rule.
  4. Configure the following properties:
    • Stream - Link selected for the data drift rule.
    • Label - Label to display for the data drift rule.
    • Alert Text - Text to display when the alert is triggered. You can use the expression language to define the alert text. For example, you might use the following expression to return text related to the drift alert: ${alert:info()}.
    • Condition - Condition that defines the data drift rule. You can use data drift functions and other aspects of the expression language to configure the condition.
    • Sampling Percentage - Percentage of records to sample to generate information for the data drift rule.
    • Sampling Records to Retain - Number of sampled records to keep in memory for display.
    • Enable Alert - Enables an alert based on the data drift rule. Alerts display when the configured conditions occur.
    • Enable Meter - Enables gathering information for the data drift rule. The information gathered displays when you select the link while monitoring the pipeline.
  5. Click Create.
    The new data drift rule displays in the list.
  6. To enable the rule, select the rule and then click Enable.

Alert Webhooks

You can configure webhooks that are sent when alerts are triggered.

A webhook is a user-defined HTTP callback - an HTTP request that the pipeline sends automatically when certain actions occur. You can use webhooks to automatically trigger external tasks based on an HTTP request. Tasks can be as simple as sending a message through an application API.

The pipeline sends all alert webhooks each time an alert is triggered. So when you configure an alert webhook, create a webhook payload that is applicable for all triggered alerts. You can configure a payload that includes the details of each alert.

Important: You must configure webhooks as expected by the receiving system. For details on how to configure incoming webhooks, check the receiving system's documentation. You might also need to enable webhook usage within that system.

When you configure an alert webhook, you specify the URL to send the request to and the HTTP method to use. Some HTTP methods allow you to include a request body or payload. In the payload, you can use parameters to include information about the cause of the trigger, such as the pipeline that triggered the alert and the alert details. You can also include request headers, content type, authentication type, user name, and password as needed.

Configuring an Alert Webhook

Configure an alert webhook to automatically send an HTTP request each time the pipeline triggers an alert.

  1. To view pipeline configuration options, click an unused section of the pipeline canvas.
  2. In the properties pane, click the Rules tab, and then click Notifications.
  3. Configure the following properties:
    • Webhooks - Webhook to send when an alert triggers. Using simple or bulk edit mode, click the Add icon to add webhooks.
    • Webhook URL - URL to send the HTTP request to.
    • Headers - Optional HTTP request headers.
    • HTTP Method - HTTP method. Use one of the following methods: GET, PUT, POST, DELETE, or HEAD.
    • Payload - Optional payload to include. Available for PUT, POST, and DELETE methods. Use any valid content type. You can use webhook parameters in the payload to include information about the triggering event, such as the alert name or condition. Enclose webhook parameters in double curly brackets as follows: {{ALERT_NAME}}.
    • Content Type - Optional content type of the payload. Configure this property when the content type is not declared in the request headers.
    • Authentication Type - Optional authentication type to include in the request. Use None, Basic, Digest, or Universal. Use Basic for Form authentication.
    • User Name - User name to include when using authentication.
    • Password - Password to include when using authentication.
  4. To create an additional webhook, click the Add icon.
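
To see what a receiving system might get, the parameter substitution and request described above can be approximated in Python. This is a sketch under assumptions, not the pipeline's behavior: only {{ALERT_NAME}} appears on this page, so {{ALERT_TEXT}} is an assumed parameter name that you should check against the product's parameter list. The URL, `render_payload`, and `build_request` are hypothetical. The request is built but never sent.

```python
import json
import re
import urllib.request


def render_payload(template, params):
    """Replace {{NAME}} tokens with values. Unknown tokens are left unchanged."""
    def substitute(match):
        name = match.group(1)
        return str(params.get(name, match.group(0)))
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


def build_request(url, payload, method="POST", content_type="application/json"):
    """Build an HTTP request that carries the payload, without sending it."""
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": content_type},
        method=method,
    )


# A JSON payload template of the kind you might enter in the Payload property.
template = json.dumps({"text": "Alert {{ALERT_NAME}} triggered: {{ALERT_TEXT}}"})

# Example values standing in for what the pipeline would supply.
payload = render_payload(
    template,
    {"ALERT_NAME": "Error records", "ALERT_TEXT": "Over 1000 pipeline error records."},
)
request = build_request("https://example.com/hooks/alerts", payload)
```

Because every alert sends every configured webhook, a template built from parameters like this stays applicable to any alert. Note that this simple substitution does not escape quotation marks, so alert text that contains them would produce invalid JSON.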