Cassandra

The Cassandra destination writes data to a Cassandra cluster.

When you configure the Cassandra destination, you define connection information and map incoming fields to columns in the Cassandra table.

You configure whether the destination uses no authentication or username and password authentication to access the Cassandra cluster. If you install the DataStax Enterprise (DSE) Java driver, you can also configure the destination to use DSE username and password authentication or Kerberos authentication.

Batches written to Cassandra are atomic: the destination writes each batch of records in its entirety or not at all. If an error occurs with one or more records in a batch, Cassandra fails the entire batch. When a batch fails, all records in the batch are sent to the stage for error handling.
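
The all-or-nothing behavior can be illustrated with a minimal Python sketch. It does not use the Cassandra driver or Data Collector internals: write_atomic_batch, fake_execute, and the send_to_error callback are hypothetical names that stand in for the batch write and the stage error handling.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


def write_atomic_batch(
    batch: List[Record],
    execute_batch: Callable[[List[Record]], None],
    send_to_error: Callable[[Record, str], None],
) -> bool:
    """Write the whole batch or nothing; return True on success."""
    try:
        # One call stands in for a single atomic batch write.
        execute_batch(batch)
        return True
    except Exception as exc:
        # Any failure fails the entire batch, so every record,
        # not only the offending one, goes to error handling.
        for record in batch:
            send_to_error(record, str(exc))
        return False


def fake_execute(batch: List[Record]) -> None:
    """Hypothetical writer that rejects records without an "id" field."""
    for record in batch:
        if "id" not in record:
            raise ValueError("missing id")


errors: List[Record] = []
ok = write_atomic_batch(
    [{"id": 1}, {"name": "no id"}, {"id": 3}],
    fake_execute,
    lambda record, reason: errors.append(record),
)
```

In this sketch one bad record out of three causes all three records to reach the error callback, which mirrors why a single malformed record can produce a full batch of error records.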

Authentication

Configure the Cassandra destination to use one of the following authentication providers to access the Cassandra cluster:

  • None - Performs no authentication.
  • Username/Password - Uses Cassandra username and password authentication.
  • Username/Password (DSE) - Uses DataStax Enterprise username and password authentication. Requires that you install the DSE Java driver.
  • Kerberos (DSE) - Uses Kerberos authentication. Requires that you install the DSE Java driver.

Before selecting one of the DSE authentication providers, install the DSE Java driver version 1.2.4 or later. For a compatibility matrix, see the Cassandra documentation. For information about installing additional drivers, see Install External Libraries.

Kerberos (DSE) Authentication

If you install the DSE Java driver, you can use Kerberos authentication to connect to a Cassandra cluster. When you use Kerberos authentication, Data Collector uses the Kerberos principal and keytab to connect to the cluster. By default, Data Collector uses the user account that started it to connect.

The Kerberos principal and keytab are defined in the Data Collector configuration file, $SDC_CONF/sdc.properties. To use Kerberos authentication, configure all Kerberos properties in the Data Collector configuration file, install the DSE Java driver, and then enable Kerberos (DSE) authentication in the Cassandra destination.
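
Before enabling Kerberos (DSE) authentication, it can help to confirm that the configuration file is complete. The following Python sketch parses properties text and reports settings that look incomplete. The property names kerberos.client.enabled, kerberos.client.principal, and kerberos.client.keytab, and the sample values, are assumptions; confirm them against your own sdc.properties file.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def kerberos_problems(props: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the assumed Kerberos properties."""
    problems: List[str] = []
    # Assumed property name; verify against your configuration file.
    if props.get("kerberos.client.enabled", "").lower() != "true":
        problems.append("kerberos.client.enabled is not true")
    for key in ("kerberos.client.principal", "kerberos.client.keytab"):
        if not props.get(key):
            problems.append(f"{key} is not set")
    return problems


# Placeholder values for illustration only.
sample = """
# Kerberos client settings
kerberos.client.enabled=true
kerberos.client.principal=sdc/_HOST@EXAMPLE.COM
kerberos.client.keytab=sdc.keytab
"""

problems = kerberos_problems(parse_properties(sample))
```

The parser handles only plain key=value lines; it does not cover line continuations or other features of the Java properties format.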

Cassandra Data Types

Due to Cassandra requirements, the data types of the incoming fields must match the data types of the corresponding Cassandra columns. When appropriate, use a Field Type Converter processor earlier in the pipeline to convert data types.

For details about the conversion of Java data types to Cassandra data types, see the Cassandra documentation.

The Cassandra destination supports the following Cassandra data types:
  • ASCII
  • Bigint
  • Boolean
  • Counter
  • Decimal
  • Double
  • Float
  • Int
  • List
  • Map
  • Text
  • Timestamp
  • Varchar
  • Varint
The following data types are not supported at this time:
  • Blob
  • Inet
  • Set
  • Uuid
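
As a rough pre-flight check, the two lists above can be encoded in a small Python sketch that flags table columns the destination cannot write. The function name unsupported_columns and the input format (a mapping of column name to CQL type string) are hypothetical. Types that appear in neither list are flagged as well, because this page does not state whether they are supported.

```python
from typing import Dict, List

# Types listed as supported by the Cassandra destination.
SUPPORTED_TYPES = frozenset({
    "ascii", "bigint", "boolean", "counter", "decimal", "double", "float",
    "int", "list", "map", "text", "timestamp", "varchar", "varint",
})


def unsupported_columns(column_types: Dict[str, str]) -> List[str]:
    """Return the sorted names of columns whose type is not in the supported list."""
    flagged: List[str] = []
    for name, cql_type in column_types.items():
        # Drop collection parameters, for example "list<text>" becomes "list".
        base = cql_type.strip().lower().split("<", 1)[0]
        if base not in SUPPORTED_TYPES:
            flagged.append(name)
    return sorted(flagged)


flagged = unsupported_columns({
    "id": "uuid",
    "name": "text",
    "tags": "set<text>",
    "scores": "list<int>",
    "created": "Timestamp",
})
```

Here the id and tags columns are flagged because Uuid and Set are not supported, while the list and timestamp columns pass.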

Configuring a Cassandra Destination

Configure a Cassandra destination to write data to a Cassandra cluster.
  1. In the Properties panel, on the General tab, configure the following properties:
    General Property - Description
    Name - Stage name.
    Description - Optional description.
    Required Fields - Fields that must include data for the record to be passed into the stage.
      Tip: You might include fields that the stage uses.
      Records that do not include all required fields are processed based on the error handling configured for the pipeline.
    Preconditions - Conditions that must evaluate to TRUE to allow a record to enter the stage for processing. Click Add to create additional preconditions.
      Records that do not meet all preconditions are processed based on the error handling configured for the stage.
    On Record Error - Error record handling for the stage:
      • Discard - Discards the record.
      • Send to Error - Sends the record to the pipeline for error handling.
      • Stop Pipeline - Stops the pipeline. Not valid for cluster pipelines.
  2. On the Cassandra tab, configure the following properties:
    Cassandra Property - Description
    Cassandra Contact Points - Host names for nodes in the Cassandra cluster. Using simple or bulk edit mode, click the Add icon to enter several host names to ensure a connection.
    Cassandra Port - Port number for the Cassandra nodes.
    Authentication Provider - Determines the authentication provider used to access the cluster:
      • None - Performs no authentication.
      • Username/Password - Uses Cassandra username and password authentication.
      • Username/Password (DSE) - Uses DataStax Enterprise username and password authentication. Requires that you install the DSE Java driver.
      • Kerberos (DSE) - Uses Kerberos authentication. Requires that you install the DSE Java driver.
    Protocol Version - Native protocol version that defines the format of the binary messages exchanged between the driver and Cassandra. Select the protocol version that you are using.
      For information about determining your protocol version, see https://datastax.github.io/java-driver/manual/native_protocol/.
    Compression - Optional compression type for transport-level requests and responses.
    Fully-Qualified Table Name - Name of the Cassandra table to use. Enter a fully-qualified name using the following format: <keyspace>.<table name>.
    Field to Column Mapping - Mapping of fields from the record to Cassandra columns. Using simple or bulk edit mode, click the Add icon to create additional field mappings.
      Note: The record field data type must match the data type of the Cassandra column.
  3. To use username/password authentication, click the Credentials tab, and then enter a username and password.
    Tip: To secure sensitive information such as usernames and passwords, you can use runtime resources or credential stores.
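
To make the Fully-Qualified Table Name and Field to Column Mapping properties from step 2 more concrete, the following Python sketch splits a qualified name into its keyspace and table parts and applies a field mapping to one record. It is an illustration only, not how the destination is implemented: split_table_name and map_record are hypothetical helpers, the sample names are made up, and quoted CQL identifiers are not handled.

```python
from typing import Any, Dict, Tuple


def split_table_name(qualified: str) -> Tuple[str, str]:
    """Split "keyspace.table" into its two parts or raise ValueError."""
    keyspace, sep, table = qualified.partition(".")
    if not sep or not keyspace or not table or "." in table:
        raise ValueError(
            f"expected <keyspace>.<table name>, got {qualified!r}"
        )
    return keyspace, table


def map_record(
    record: Dict[str, Any], field_to_column: Dict[str, str]
) -> Dict[str, Any]:
    """Build a column-to-value row from the mapped fields.

    Fields that are not mapped are ignored; a mapped field that is
    missing from the record raises KeyError.
    """
    return {column: record[field] for field, column in field_to_column.items()}


keyspace, table = split_table_name("sales.orders")
row = map_record(
    {"id": 1, "customer": "acme", "debug": True},
    {"id": "order_id", "customer": "customer_name"},
)
```

Only mapped fields end up in the row, so the debug field in the sample record is dropped.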