UDP Source

The UDP Source origin reads messages from one or more UDP ports. To use multiple threads for pipeline processing, use the UDP Multithreaded Source. For a discussion about the differences between the two origins, see Comparing UDP Source Origins.

UDP Source generates a record for every message. UDP Source can process collectd messages, NetFlow 5 and NetFlow 9 messages, and the following types of syslog messages:
  • RFC 5424
  • RFC 3164
  • Non-standard common messages, such as RFC 3339 dates with no version digit

When processing NetFlow messages, the stage generates different records based on the NetFlow version. When processing NetFlow 9, the records are generated based on the NetFlow 9 configuration properties. For more information, see NetFlow Data Processing.

The origin can also read binary or character-based raw data.

When you configure UDP Source, you specify the ports to listen on, the maximum batch size, and the batch wait time. When epoll is available, you can also specify the number of receiver threads to use to increase the throughput of packets to the pipeline.

You also specify the data format, and then configure any related properties.
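To try out a pipeline that uses this origin, it can help to send a known message to the configured port. The following is a minimal sketch, not part of Data Collector: `build_rfc5424` and `send_udp` are hypothetical helper names, and the host name, application name, and port used in the usage note are placeholder assumptions.

```python
import socket
from datetime import datetime, timezone


def build_rfc5424(msg, host="sender.example.com", app="demo", pri=14, when=None):
    """Build an RFC 5424 line: <PRI>VERSION TIMESTAMP HOST APP PROCID MSGID SD MSG."""
    # Default to the current time; always render the timestamp in UTC.
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Trim microseconds to milliseconds and mark the value as UTC with "Z".
    stamp = when.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    # "-" marks an absent PROCID, MSGID, and STRUCTURED-DATA value.
    return f"<{pri}>1 {stamp} {host} {app} - - - {msg}"


def send_udp(payload, host, port):
    """Send one datagram and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload.encode("utf-8"), (host, port))
```

For example, `send_udp(build_rfc5424("test message"), "localhost", 9995)` sends one syslog message to a pipeline assumed to listen on port 9995. Because the origin generates a record for every message, one call should produce one record.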

Processing Raw Data

Use the Raw/Separated Data data format to enable the UDP Source origin to generate records from binary or character-based raw data.

When processing raw data, the origin can generate a record for each UDP packet that it receives. Alternatively, if you specify a separator character, the origin can generate multiple records from each UDP packet.

When a packet yields multiple values, you specify the multiple values behavior: one record with only the first value, one record with all values as a list, or multiple records with one record for each value.

You can optionally specify an output field to use for the data. When not specified, the origin writes the raw data to the root field.

You might use the Raw/Separated Data data format to write raw data to a field that you later process using the Data Parser processor. This allows you to retain the raw data for another use.
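The three multiple values behaviors can be illustrated with a short sketch. This is an approximation of the described behavior, not the origin's implementation: `records_from_packet` and the behavior names `"first"`, `"list"`, and `"split"` are hypothetical, and the handling of empty values (kept as-is here) is an assumption.

```python
def records_from_packet(packet, separator=None, behavior="split"):
    """Return the record values produced from one UDP payload (bytes)."""
    if separator is None:
        # No separator configured: one record for the whole packet.
        return [packet]
    values = packet.split(separator)
    if behavior == "first":
        # First Value Only: one record holding the first value.
        return [values[0]]
    if behavior == "list":
        # All Values as a List: one record holding every value.
        return [values]
    if behavior == "split":
        # Split into Multiple Records: one record per value.
        return values
    raise ValueError(f"unknown behavior: {behavior!r}")
```

With the payload `b"a\nb\nc"` and a line feed separator, the sketch yields one record (`b"a"`) for `"first"`, one record containing a three-item list for `"list"`, and three records for `"split"`.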

Receiver Threads

Receiver threads are used to pass data from the UDP source system to the origin. By default, the origin uses a single receiver thread.

You can configure the UDP Source origin to use additional receiver threads when Data Collector runs on a machine enabled for epoll. Epoll requires native libraries and is only available when Data Collector runs on recent versions of 64-bit Linux. When you enable multiple receiver threads, you increase the volume of data that can be passed to the origin at one time.

To use additional receiver threads, select the Use Native Transports (epoll) property, and then configure Number of Receiver Threads.
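As a rough planning aid, the sketch below estimates whether a machine is a candidate for epoll and how many receiver threads a configuration uses in total. Both helpers are hypothetical and not part of Data Collector. The platform check only approximates the 64-bit Linux requirement; it cannot confirm that the operating system version is recent enough or that the native libraries are present. The thread total follows the rule that the configured number of threads applies to each port.

```python
import platform
import sys


def may_support_epoll(system, is_64bit):
    """Return True when the platform matches the 64-bit Linux requirement."""
    return system == "Linux" and is_64bit


def total_receiver_threads(port_count, threads_per_port=1):
    """Return the total receiver threads: threads per port times ports."""
    if port_count < 1 or threads_per_port < 1:
        raise ValueError("port_count and threads_per_port must be at least 1")
    return port_count * threads_per_port


def current_machine_may_support_epoll():
    """Check the machine running this script (64-bit Python assumed as a proxy)."""
    return may_support_epoll(platform.system(), sys.maxsize > 2**32)
```

For example, `total_receiver_threads(3, 2)` returns 6, which matches an origin that listens on three ports with two threads per port.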

Configuring a UDP Source

Configure a UDP Source origin to process messages from a UDP source.

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property Description
    Name Stage name.
    Description Optional description.
    On Record Error Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the pipeline for error handling.
    • Stop Pipeline - Stops the pipeline.
  2. On the UDP tab, configure the following properties:
    UDP Property Description
    Port Port to listen to for data. Using simple or bulk edit mode, click the Add icon to list additional ports.

    To listen to a port below 1024, Data Collector must be run by a user with root privileges. Otherwise, the operating system does not allow Data Collector to bind to the port.

    Note: No other pipelines or processes can already be bound to the listening port. The listening port can be used only by a single pipeline.
    Use Native Transports (epoll) Specifies whether to use multiple receiver threads for each port. Using multiple receiver threads can improve performance.

    Multiple receiver threads require epoll, which can be available when Data Collector runs on recent versions of 64-bit Linux.

    Number of Receiver Threads Number of receiver threads to use for each port. For example, if you configure two threads per port and configure the origin to use three ports, the origin uses a total of six threads.

    Use to increase the number of threads passing data to the origin when epoll is available on the Data Collector machine.

    Default is 1.

    Data Format Data format passed by UDP:
    • collectd
    • NetFlow
    • syslog
    • Raw/separated data
    Max Batch Size (messages) Maximum number of messages to include in a batch and pass through the pipeline at one time. Honors values up to the Data Collector maximum batch size.

    Default is 1000. The Data Collector maximum batch size also defaults to 1000.

    Batch Wait Time (ms) Milliseconds to wait before sending a partial or empty batch.
  3. On the syslog tab, define the character set for the data.
  4. On the collectd tab, define the following collectd properties:
    collectd Property Description
    TypesDB File Path Path to a user-provided types.db file. Overrides the default types.db file.
    Convert Hi-Res Time & Interval Converts the collectd high resolution time format interval and timestamp to UNIX time, in milliseconds.
    Exclude Interval Excludes the interval field from the output record.
    Auth File Path to an optional authentication file. Use an authentication file to accept signed and encrypted data.
    Charset Character set of the data.
  5. For raw data, on the Raw/Separated Data tab, define the following properties:
    Raw/Separated Data Property Description
    Data Separator Optional data separator used to split each UDP packet into multiple values. Specify byte literals using Java Unicode syntax, \u<character code>.

    For example, the default line feed character is expressed as follows: \u000A.

    Raw Data Mode Type of raw data to process: binary or string data.
    Charset Charset used by string data.
    Output Field Path Optional output field for the raw data. When not used, the origin writes the raw data to the root field.
    Multiple Values Behavior Action to take when the data separator generates multiple values from a UDP packet:
    • First Value Only - Returns one record with the first value.
    • All Values as a List - Returns one record with all values in a List.
    • Split into Multiple Records - Returns multiple records, one record for each value.
  6. For NetFlow 9 data, on the NetFlow 9 tab, configure the following properties:
    When processing earlier versions of NetFlow data, these properties are ignored.
    NetFlow 9 Property Description
    Record Generation Mode Determines the type of values to include in the record. Select one of the following options:
    • Raw Only
    • Interpreted Only
    • Both Raw and Interpreted
    Max Templates in Cache The maximum number of templates to store in the template cache. For more information about templates, see Caching NetFlow 9 Templates.

    Default is -1 for an unlimited cache size.

    Template Cache Timeout (ms) The maximum number of milliseconds to cache an idle template. Templates unused for more than the specified time are evicted from the cache. For more information about templates, see Caching NetFlow 9 Templates.

    Default is -1 for caching templates indefinitely.
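
The port requirements described for the Port property in step 2 can be checked before you start a pipeline. The following sketch is one possible pre-flight check, not a Data Collector feature: `needs_root` and `udp_port_is_free` are hypothetical helpers, and the check only reflects the state of the port at the moment it runs.

```python
import socket


def needs_root(port):
    """Return True for ports below 1024, which require root privileges to bind."""
    return 0 < port < 1024


def udp_port_is_free(port, host="0.0.0.0"):
    """Return True if a UDP socket can currently be bound to the port.

    Returns False for any bind failure, including a port that is already
    in use and a privileged port that the current user cannot bind.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

For example, `needs_root(514)` returns `True`, so a pipeline that listens on the standard syslog port requires Data Collector to be run by a user with root privileges. Run the free-port check on the Data Collector machine as the same user that runs Data Collector.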