KineticaDB

Supported pipeline types:
  • Data Collector

The KineticaDB destination writes data to a table in a Kinetica cluster using the Kinetica bulk inserter.

When you configure the KineticaDB destination, you specify the URL for the Kinetica head node, the credentials for the connection, and the table name. You specify the batch size for the bulk inserter and whether to compress the data before passing it to Kinetica.

When necessary, you can disable multihead ingest, and you can specify a regular expression to filter the IP addresses that the bulk inserter uses.

Multihead Ingest

By default, the KineticaDB destination writes to Kinetica using multihead ingest when possible.

When using multihead ingest, the destination can send data directly to the appropriate shard manager. When writing to a replicated table, the destination passes the data only to the head node, which then replicates the data as expected.

You can configure the KineticaDB destination to send data only to the Kinetica head node instead. You might need to disable multihead ingest, for example, when the Kinetica worker nodes reside behind a firewall.

To disable multihead ingest, select the Disable Multihead Ingest property on the Connection tab. For more information about multihead ingestion, see the Kinetica documentation.
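To make the routing idea concrete, here is a minimal sketch in Python of why multihead ingest helps. This is not Kinetica's actual sharding algorithm, and the worker URLs are hypothetical; it only illustrates that each record is mapped deterministically from its shard key to one worker, rather than funneling all writes through the head node.

```python
import zlib

# Hypothetical worker node URLs; a real cluster supplies its own ranks.
WORKERS = [
    "http://kinetica.acme.com:9191/gpudb-1",
    "http://kinetica.acme.com:9191/gpudb-2",
    "http://kinetica.acme.com:9191/gpudb-3",
]

def route(shard_key: str, workers=WORKERS) -> str:
    """Deterministically map a record's shard key to one worker URL.

    With multihead ingest enabled, a batch goes straight to its worker;
    with it disabled, every batch targets the head node instead.
    """
    return workers[zlib.crc32(shard_key.encode("utf-8")) % len(workers)]
```

The same key always routes to the same worker, so each worker receives only the shards it owns; disabling multihead ingest is equivalent to collapsing this function to always return the head node URL.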

Inserts and Updates

By default, when writing to Kinetica, the KineticaDB destination inserts all new records. If the destination finds an existing record in the table with the same primary key, it leaves the existing record as is and discards the new record.

You can configure the destination to replace the existing record instead. To replace existing records with the same primary key, select the Update on Existing PK property on the Table tab.
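The two behaviors can be sketched as follows, modeling the table as a dictionary keyed by primary key. The function name and flag are illustrative, not part of any Kinetica API; the flag mirrors the Update on Existing PK property described above.

```python
def write_records(table, records, update_on_existing_pk=False):
    """Mimic the destination's primary-key behavior.

    By default, a record whose primary key already exists in the table is
    discarded; with update_on_existing_pk=True, the existing row is
    replaced instead.
    """
    for pk, row in records:
        if pk in table and not update_on_existing_pk:
            continue  # default: keep the existing record, discard the new one
        table[pk] = row  # insert, or replace when updates are allowed
    return table

existing = {1: "old"}
print(write_records(dict(existing), [(1, "new"), (2, "fresh")]))
# {1: 'old', 2: 'fresh'}  -- default: record with duplicate PK discarded
print(write_records(dict(existing), [(1, "new"), (2, "fresh")],
                    update_on_existing_pk=True))
# {1: 'new', 2: 'fresh'}  -- update enabled: existing record replaced
```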

Configuring a KineticaDB Destination

Configure a KineticaDB destination to write data to a Kinetica cluster.

  1. In the Properties panel, on the General tab, configure the following properties:
    • Name - Stage name.
    • Description - Optional description.
    • Required Fields - Fields that must include data for the record to be passed into the stage.
      Tip: You might include fields that the stage uses.
      Records that do not include all required fields are processed based on the error handling configured for the pipeline.
    • Preconditions - Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions.
      Records that do not meet all preconditions are processed based on the error handling configured for the stage.
    • On Record Error - Error record handling for the stage:
      • Discard - Discards the record.
      • Send to Error - Sends the record to the pipeline for error handling.
      • Stop Pipeline - Stops the pipeline. Not valid for cluster pipelines.
  2. On the Connection tab, configure the following properties:
    • Kinetica URL - The URL for the head node of the Kinetica cluster. Use the following format:
      http://<host name>:<port number>
      For example:
      http://kinetica.acme.com:9191
    • Batch Size - The batch size to use for the Kinetica bulk inserter.
      Default is 10,000 records.
    • Transport Compression - Compresses data before writing to Kinetica.
    • Disable Multihead Ingest - Disables the default multihead ingest processing. When selected, the destination passes data to the Kinetica head node for redistribution.
    • IP Regex - Regular expression that specifies the IP addresses to write to. Use to filter out invalid IP addresses associated with multihomed hosts.
      For example, if Kinetica hosts have both internal and external IP addresses, you can enter a regular expression to allow writing to only the external IP addresses.
    • Custom Worker URL List - List of worker node URLs that overrides the default worker node URLs.
      You might configure a list of custom worker node URLs so that the destination uses host names instead of IP addresses to connect to the worker nodes.
      Using simple or bulk edit mode, click the Add icon and define each worker node URL. The URLs must be listed in order and must include all ranks.
      For example, if the Kinetica cluster contains three worker nodes, define a custom URL for each node as follows:
      http://kinetica.acme.com:9191/gpudb-1
      http://kinetica.acme.com:9191/gpudb-2
      http://kinetica.acme.com:9191/gpudb-3
  3. On the Credentials tab, configure the following properties:
    • Username - Username for the connection.
    • Password - Password for the connection.
  4. On the Table tab, configure the following properties:
    • Table Name - Table to write to. Table names are case sensitive.
    • Update on Existing PK - Determines the behavior when a record with the same primary key is already in the Kinetica table.
      Select to allow updates to existing records. By default, the destination does not write records when one with the same primary key already exists.
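The Batch Size property above controls how many records the bulk inserter accumulates before writing. A minimal Python sketch of that batching behavior (the class and its names are illustrative, not the Kinetica client API; the real bulk inserter also flushes remaining records when the pipeline stops):

```python
class BatchBuffer:
    """Toy model of bulk-inserter batching: records accumulate until the
    batch size is reached, then the whole batch is flushed in one write."""

    def __init__(self, batch_size=10_000):
        self.batch_size = batch_size
        self.buffer = []
        self.flushed = []  # stands in for writes sent to Kinetica

    def insert(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flushed.append(list(self.buffer))
            self.buffer.clear()

buf = BatchBuffer(batch_size=3)
for i in range(7):
    buf.insert(i)
buf.flush()  # flush the final partial batch
print(buf.flushed)  # [[0, 1, 2], [3, 4, 5], [6]]
```

A larger batch size means fewer, bigger writes to the cluster; the default of 10,000 records trades memory for round trips.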