Understanding Classifiers

Classifiers identify the data that the classification rule applies to. You define classifiers for a rule from within the rule. You cannot move or reuse classifiers from one rule to another.

Classifiers can identify data based on field names, field values, or field paths. When you create a classification rule, the rule provides two classifiers that act as templates: one for field names and one for field values. You can edit or delete these classifiers as needed.

When you configure a classifier, you use a Java or RE2/J regular expression to define the field names, values, or paths to classify. RE2/J is recommended due to better performance.

You can configure a classifier to perform case-sensitive matching. By default, matching is not case-sensitive.

For example, your company ID has always been in a c_ID field, so you define a field name classifier to categorize data in those fields as company IDs. But then a new system comes online that places the company ID in an ID field. You can use a regular expression to identify both fields, such as (c_)?ID, or simply list both field names.
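As a rough illustration of how the (c_)?ID expression behaves, the following Java sketch tests it against a few field names. The class and method names are hypothetical and not part of the product, and the sketch assumes that the expression must match the entire field name, which might differ from how the classifier is actually evaluated.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a product API.
class FieldNameClassifierSketch {

    // Assumption: the expression must match the whole field name.
    // Case-insensitive matching is applied unless caseSensitive is true,
    // mirroring the default described above.
    static boolean matchesFieldName(String regex, boolean caseSensitive, String fieldName) {
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(regex, flags).matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        String regex = "(c_)?ID";
        String[] fieldNames = {"c_ID", "ID", "id", "user_ID"};
        for (String name : fieldNames) {
            System.out.println(name + " -> " + matchesFieldName(regex, false, name));
        }
    }
}
```

Under these assumptions, c_ID, ID, and id match, while user_ID does not because the expression allows only the optional c_ prefix.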

However, if the company ID has a standard format, such as c-xxxxxx, you should create a classifier based on field values in addition to the field name classifier. This ensures that company IDs are classified regardless of the fields that they're in. In the field value classifier, you can use c-\d\d\d\d\d\d or c-\d{6} to define the pattern.
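To show why a value-based pattern catches company IDs wherever they appear, the following Java sketch scans every field in a record and reports those whose values match c-\d{6}. The record layout, field names, and class name are hypothetical, and the sketch assumes that the whole value must match the pattern.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a product API.
class FieldValueScanSketch {

    // "c-" followed by exactly six digits; case-insensitive to mirror the default.
    static final Pattern COMPANY_ID = Pattern.compile("c-\\d{6}", Pattern.CASE_INSENSITIVE);

    // Returns the names of all fields whose entire value matches the pattern.
    static List<String> fieldsHoldingCompanyIds(Map<String, String> record) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> field : record.entrySet()) {
            String value = field.getValue();
            if (value != null && COMPANY_ID.matcher(value).matches()) {
                hits.add(field.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up record: the ID appears in an expected and an unexpected field.
        Map<String, String> record = new LinkedHashMap<>();
        record.put("c_ID", "c-235011");
        record.put("reference", "C-837444");
        record.put("name", "Acme");
        System.out.println(fieldsHoldingCompanyIds(record));
    }
}
```

In this sketch the reference field is reported even though its name gives no hint that it holds a company ID, which is the gap a field name classifier alone would leave.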

A field value classifier that uses a pattern such as c-?\d{6}, with case-sensitive matching left disabled, classifies a 6-digit company ID that starts with c and might use a hyphen.

This classifier identifies all of the following as company IDs: C-235011, C837444, c-870021, c642209.
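A classifier of this kind (starts with c, optional hyphen, six digits, not case-sensitive) can be sketched in Java as follows. The pattern c-?\d{6} and the class name are assumptions made for illustration, as is the choice to match the whole value. The sketch uses java.util.regex so that it runs without extra dependencies.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a product API.
class CompanyIdValueSketch {

    // "c", an optional hyphen, then exactly six digits.
    // CASE_INSENSITIVE mirrors the default of matching that is not case-sensitive.
    static final Pattern COMPANY_ID = Pattern.compile("c-?\\d{6}", Pattern.CASE_INSENSITIVE);

    // Assumption: the entire value must match, not just part of it.
    static boolean isCompanyId(String value) {
        return COMPANY_ID.matcher(value).matches();
    }

    public static void main(String[] args) {
        String[] values = {"C-235011", "C837444", "c642209", "c-12345", "d-235011"};
        for (String value : values) {
            System.out.println(value + " -> " + isCompanyId(value));
        }
    }
}
```

With whole-value matching, the first three values are accepted. The value c-12345 is rejected because it has only five digits, and d-235011 is rejected because it does not start with c.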

You can configure classifiers when you create a new classification rule or when viewing the details of an existing rule.

For more information about RE2/J regular expressions, see the RE2 documentation. For more information about Java regular expressions, see the Oracle documentation.