Understanding Classifiers

Classifiers identify the data that the classification rule applies to. You define classifiers for a rule from within the rule. You cannot move or reuse classifiers from one rule to another.

Classifiers can identify data based on field names, field values, or field paths. When you create a classification rule, the rule includes two classifiers that act as templates, one for field names and one for field values. You can edit or delete the classifiers as needed.

When you configure a classifier, you can use the RE2/J or Java regular expression engine to classify a range of values or paths. RE2/J is generally the faster regular expression engine, so it should be used when possible. To use advanced regular expression functionality not available in RE2/J, such as lookaheads and lookbehinds, use the Java regular expression engine.
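As an illustration of an expression that needs the Java engine, the following sketch uses java.util.regex with a lookbehind and a negative lookahead. The class name, method name, and pattern are hypothetical and chosen only for this example. How the product invokes the engine internally is not described here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundSketch {
    // Hypothetical pattern for illustration:
    //   (?<=c_)  lookbehind: "ID" must be directly preceded by "c_"
    //   (?!_)    negative lookahead: "ID" must not be followed by "_"
    static final Pattern ID_AFTER_PREFIX = Pattern.compile("(?<=c_)ID(?!_)");

    // Returns true if the field name contains "ID" in the required context.
    static boolean containsPrefixedId(String fieldName) {
        Matcher m = ID_AFTER_PREFIX.matcher(fieldName);
        return m.find();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"c_ID", "ID", "c_ID_old"}) {
            System.out.println(name + " -> " + containsPrefixedId(name));
        }
    }
}
```

In this sketch, only c_ID matches. An expression like this one relies on lookarounds, so it would need the Java regular expression engine rather than RE2/J.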

You can also configure a classifier to perform case-sensitive matching. By default, matching is not case-sensitive.

For example, your company ID has always been in a c_ID field, so you define a field name classifier to categorize data in those fields as company IDs. But then a new system comes online that places the company ID in an ID field. You can use a regular expression to identify both fields, such as (c_)?ID, or simply list both field names.
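To see how the (c_)?ID expression behaves, you can try it outside the product. The following is a minimal sketch that assumes the expression must match the entire field name. The class and method names are hypothetical, as are the order_ID and c_name sample fields.

```java
import java.util.List;
import java.util.regex.Pattern;

class FieldNameClassifierSketch {
    // Optional "c_" prefix followed by "ID".
    // This sketch is case-sensitive (the java.util.regex default).
    static final Pattern COMPANY_ID_FIELD = Pattern.compile("(c_)?ID");

    // Assumption: the expression must match the whole field name.
    static boolean isCompanyIdField(String fieldName) {
        return COMPANY_ID_FIELD.matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        for (String name : List.of("c_ID", "ID", "order_ID", "c_name")) {
            System.out.println(name + " -> " + isCompanyIdField(name));
        }
    }
}
```

Under the whole-name assumption, c_ID and ID match, while order_ID and c_name do not.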

However, if the company ID has a standard format, such as c-xxxxxx, you should create a classifier based on field values in addition to the field name classifier to ensure that company IDs are classified regardless of the fields that they're in. In the field value classifier, you can use the following regular expression: c-\d\d\d\d\d\d.
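The following sketch shows the idea behind a field value classifier: it checks every value in a record and ignores the field names. It assumes six digits after the hyphen, as in the c-xxxxxx format, and whole-value matching. The record contents, class name, and method name are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class FieldValueClassifierSketch {
    // "c-" followed by exactly six digits, per the c-xxxxxx format.
    static final Pattern COMPANY_ID_VALUE = Pattern.compile("c-\\d{6}");

    // Returns the names of all fields whose value looks like a company ID.
    static List<String> fieldsWithCompanyId(Map<String, String> record) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : record.entrySet()) {
            if (COMPANY_ID_VALUE.matcher(entry.getValue()).matches()) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical record; LinkedHashMap keeps insertion order.
        Map<String, String> record = new LinkedHashMap<>();
        record.put("c_ID", "c-235011");
        record.put("account_ref", "c-642209");
        record.put("phone", "555-0100");
        System.out.println(fieldsWithCompanyId(record));
    }
}
```

Here the account_ref field is flagged even though its name gives no hint that it holds a company ID, which is the gap that a field name classifier alone would leave.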

The following classifier uses an RE2/J expression to classify a six-digit company ID that starts with c, might include a hyphen, and is matched without regard to case:

This classifier identifies all of the following as company IDs: C-235011, C837444, c-870024, c642209.
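The classifier definition itself is not reproduced in this text. As a rough equivalent, the following sketch assumes an expression such as c-?\d{6} with case-insensitive matching applied to the whole value. It uses java.util.regex as a stand-in so that it runs without extra dependencies; the expression uses no lookarounds or other features that would require the Java engine. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class CompanyIdPatternSketch {
    // Assumed expression: "c", an optional hyphen, then exactly six digits.
    // CASE_INSENSITIVE mirrors a classifier with case-sensitive matching disabled.
    static final Pattern COMPANY_ID =
            Pattern.compile("c-?\\d{6}", Pattern.CASE_INSENSITIVE);

    // Assumption: the expression must match the entire value.
    static boolean isCompanyId(String value) {
        return COMPANY_ID.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"C-235011", "C837444", "c642209", "d-235011", "c-6422099"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isCompanyId(sample));
        }
    }
}
```

With these assumptions, C-235011, C837444, and c642209 match. The value d-235011 fails on the leading letter, and c-6422099 fails because it has seven digits.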

You can configure classifiers when you create a new classification rule or when viewing the details of an existing rule.