Security in Kafka Stages

You can configure Kafka stages – Kafka Consumer, Kafka Multitopic Consumer, and Kafka Producer – to connect securely through the following methods:
  • Kerberos
  • SSL/TLS
  • Both SSL/TLS and Kerberos

Enabling security requires completing the prerequisite tasks and then configuring additional Kafka configuration properties in the stage.

When you use Kerberos, either alone or with SSL/TLS, you have several options for how to provide Kerberos credentials. The method that you choose determines the additional tasks to perform.

Prerequisite Tasks

Before enabling security for a Kafka stage, complete the following prerequisite tasks for the security method that you want to use:

Kerberos

Complete the following prerequisite tasks before using Kerberos to connect to Kafka:

  • Make sure Kafka is configured for Kerberos as described in the Kafka documentation.
  • Make sure that Kerberos authentication is enabled for Data Collector, as described in Kerberos Authentication.
  • If using a credential store for Kerberos keytabs, make sure that Data Collector is configured to use a supported credential store. Optionally configure Data Collector to require group secrets.

    For a list of supported credential stores and instructions on enabling each credential store, see Credential Stores.

  • If configuring a Kafka YARN cluster pipeline, store the JAAS configuration and Kafka keytab files in the same locations on the Data Collector machine and on each node in the YARN cluster.
SSL/TLS
Complete the following prerequisite tasks before using SSL/TLS to connect to Kafka:
  • Make sure Kafka is configured for SSL/TLS as described in the Kafka documentation.
  • If configuring a Kafka YARN cluster pipeline, store the SSL truststore and keystore files in the same location on the Data Collector machine and on each node in the YARN cluster.

Providing Kerberos Credentials

To use Kerberos to connect to Kafka, you must provide the Kerberos credentials to use.

You can provide Kerberos credentials in either of the following ways, or use both methods as needed:

JAAS file
Define Kerberos credentials in a Java Authentication and Authorization Service (JAAS) file when you want to use the same keytab and principal for every Kafka stage in every pipeline that you create. When configured, credentials defined in stage properties override JAAS file credentials.
You might use this method to provide a default keytab and principal. Then, use stage properties to specify different credentials, as needed.
To use this method, you define a Kerberos keytab and principal in a JAAS file. Then, you update the SDC_JAVA_OPTS environment variable to include the path to the JAAS file.
Stage properties
Define Kerberos credentials in stage properties when you want to use different credentials in different Kafka stages. This method requires that the Kafka stage use a stage library for Kafka 0.11.0.0 or later.
If you also configure a JAAS file to provide Kerberos credentials, the credentials that you enter in stage properties override those in the JAAS file.
To provide Kerberos credentials in stage properties, select the Provide Keytab property on the Kafka tab of the stage. Specify the principal in plain text, then specify the keytab in one of the following ways:
  • Enter a Base64-encoded keytab in the Keytab property.

    Encode the keytab before entering it in the stage property. Be sure to remove unnecessary characters, such as newline characters, before encoding the keytab.

  • Use a credential function to access a Base64-encoded keytab defined in a credential store.

    For more information, see Using a Credential Store.

Note: Configuring Kerberos credentials in stage properties is not supported in cluster pipelines at this time.

For details on enabling Kafka stages to use Kerberos authentication, see Enabling Kerberos.

Using a Credential Store

You can define Kerberos keytabs in a credential store, then call the appropriate keytab from a Kafka stage.

Defining Kerberos keytabs in a credential store allows you to store multiple keytabs for use by Kafka stages. It also provides flexibility in how you use the keytabs. For example, you might create two separate keytabs, one for Kafka origins and one for Kafka destinations. Or, you might provide separate keytabs for every Kafka stage that you define.

Using a credential store makes it easy to update keytabs without having to edit the stages that use them. This can simplify tasks such as recycling keytabs or migrating pipelines to production.

For an additional layer of security, you can require group access to credential store secrets.
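
For example, on Linux you might encode a keytab as a single line of Base64, store it as a secret, and then call it from the stage. The following sketch assumes a credential store registered with the ID azure, a keytab secret named readkeytab, and access granted to the devops group; all of these names are illustrative:

  # Encode the keytab as a single line of Base64 (GNU coreutils; -w 0 disables line wrapping).
  base64 -w 0 /etc/security/keytabs/sdc.keytab

Store the output as the readkeytab secret in the credential store, then retrieve it in the Keytab property of the stage:

  ${credential:get("azure", "devops", "readkeytab")}

To rotate the keytab, you then update only the readkeytab secret in the credential store; the stages that call it do not change.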

Enabling Kerberos

Kafka stages can connect to Kafka using Kerberos authentication. Before you enable Kafka stages to use Kerberos, make sure that you have performed all necessary prerequisite tasks.

The following steps describe how to provide Kerberos credentials using a JAAS file or stage properties. You can use either method or both. Skip the steps that are not relevant to your implementation.

  1. To use a Java Authentication and Authorization Service (JAAS) file to provide Kerberos credentials, add the configuration properties required for Kafka clients based on your installation and authentication type.
    • RPM, tarball, or Cloudera Manager installation without LDAP authentication - If Data Collector does not use LDAP authentication, create a separate JAAS configuration file on the Data Collector machine. Add the following KafkaClient login section to the file:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="<keytab path>"
          principal="<principal name>/<host name>@<realm>";
      };
      For example:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="/etc/security/keytabs/sdc.keytab"
          principal="sdc/sdc-01.streamsets.net@EXAMPLE.COM";
      };
      Then modify the SDC_JAVA_OPTS environment variable to include the following option that defines the path to the JAAS configuration file:
      -Djava.security.auth.login.config=<JAAS config path>

      Modify environment variables using the method required by your installation type. A sketch of this change for a tarball installation follows these steps.

    • RPM or tarball installation with LDAP authentication - If LDAP authentication is enabled in an RPM or tarball installation, add the properties to the JAAS configuration file used by Data Collector - the $SDC_CONF/ldap-login.conf file. Add the following KafkaClient login section to the end of the ldap-login.conf file:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="<keytab path>"
          principal="<principal name>/<host name>@<realm>";
      };
      For example:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="/etc/security/keytabs/sdc.keytab"
          principal="sdc/sdc-01.streamsets.net@EXAMPLE.COM";
      };
    • Cloudera Manager installation with LDAP authentication - If LDAP authentication is enabled in a Cloudera Manager installation, enable the LDAP Config File Substitutions (ldap.login.file.allow.substitutions) property for the StreamSets service in Cloudera Manager.

      If the Use Safety Valve to Edit LDAP Information (use.ldap.login.file) property is enabled and LDAP authentication is configured in the Data Collector Advanced Configuration Snippet (Safety Valve) for ldap-login.conf field, then add the JAAS configuration properties to the same ldap-login.conf safety valve.

      If LDAP authentication is configured through the LDAP properties rather than the ldap-login.conf safety valve, add the JAAS configuration properties to the Data Collector Advanced Configuration Snippet (Safety Valve) for generated-ldap-login-append.conf field.

      Add the following KafkaClient login section to the appropriate field:

      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="_KEYTAB_PATH"
          principal="<principal name>/_HOST@<realm>";
      };
      For example:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="_KEYTAB_PATH"
          principal="sdc/_HOST@EXAMPLE.COM";
      };

      Cloudera Manager generates the appropriate keytab path and host name.

  2. If using a credential store to call keytabs from stage properties, add the Base64-encoded keytabs that you want to use to the credential store.
    Note: Be sure to remove unnecessary characters, such as newline characters, before encoding the keytab.

    If you configured Data Collector to require group secrets, then for each keytab secret that you define, create a group secret and specify a comma-separated list of groups allowed to access the keytab secret.

    Name the group secret based on the keytab secret name, as follows: <keytab secret name>-groups.

    For details on defining secrets, see your credential store documentation.

  3. On the General tab of the Kafka stage, set the Stage Library property to the appropriate Kafka version.

    If configuring a Kafka Consumer origin for a Kafka YARN cluster pipeline, select a stage library for Kafka version 0.10.0.0 or later.

    If using stage properties to define Kafka credentials, select a stage library for Kafka version 0.11.0.0 or later.

  4. On the Kafka tab of the Kafka stage, for the Kafka Configuration property, use the Add icon to add the following properties:
    • Add the security.protocol Kafka configuration property, and set it to SASL_PLAINTEXT.
    • Add the sasl.kerberos.service.name configuration property, and set it to kafka.
  5. If using stage properties to provide Kerberos credentials, configure these additional properties on the Kafka tab:
    Note: Configuring Kerberos credentials in stage properties is not supported in cluster pipelines at this time.
    1. Select the Provide Keytab property.
    2. For the Keytab property, use one of the following options:
      • Enter a Base64-encoded keytab.

        Be sure to remove unnecessary characters, such as newline characters, before encoding the keytab.

      • If using a credential store, use the credential:get() or credential:getWithOptions() credential function to retrieve a Base64-encoded keytab.
        Note: The user who starts the pipeline must be in the Data Collector group specified in the credential function. When Data Collector requires a group secret, the user must also be in a group associated with the keytab.

        For more information about using keytabs in a credential store, see Using a Credential Store.

    3. For the Principal property, use the following format to specify the principal: <principal name>/<host name>@<realm>.
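
If you use a JAAS file to provide credentials (step 1), the change to SDC_JAVA_OPTS for a tarball installation might look like the following sketch, which assumes that the JAAS file is saved as /etc/sdc/kafka_client_jaas.conf and that environment variables are set in the libexec/sdc-env.sh file:

  # Append the JAAS option to the existing SDC_JAVA_OPTS value (file path is illustrative).
  export SDC_JAVA_OPTS="${SDC_JAVA_OPTS} -Djava.security.auth.login.config=/etc/sdc/kafka_client_jaas.conf"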

For example, the following properties allow the stage to connect to Kafka with Kerberos using a keytab in Azure Key Vault stored under the name readkeytab that allows access to users in the devops user group:
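(A sketch of the configuration; the credential store ID azure and the principal shown are illustrative.)

  Kafka Configuration:
    security.protocol = SASL_PLAINTEXT
    sasl.kerberos.service.name = kafka
  Provide Keytab: selected
  Keytab: ${credential:get("azure", "devops", "readkeytab")}
  Principal: sdc/sdc-01.streamsets.net@EXAMPLE.COM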

Enabling SSL/TLS

Kafka stages can connect to Kafka using SSL/TLS.

Before you enable Kafka stages to use SSL/TLS, make sure that you have performed all necessary prerequisite tasks. Then, perform the following steps to enable the Kafka stages to use SSL/TLS to connect to Kafka.

  1. On the General tab of the stage, set the Stage Library property to the appropriate Kafka version.

    If configuring a Kafka Consumer origin for a Kafka YARN cluster pipeline, set the property to Kafka version 0.10.0.0 or later.

  2. On the Kafka tab, for the Kafka Configuration property, use the Add icon to add the following property:
    • Add the security.protocol Kafka configuration property and set it to SSL.
  3. Then, add and configure the following SSL Kafka properties:
    • ssl.truststore.location
    • ssl.truststore.password
    When the Kafka broker requires client authentication – that is, when the ssl.client.auth broker property is set to required – add and configure the following properties:
    • ssl.keystore.location
    • ssl.keystore.password
    • ssl.key.password
    Some brokers might require adding the following properties as well:
    • ssl.enabled.protocols
    • ssl.truststore.type
    • ssl.keystore.type

    In Data Collector Edge pipelines, when you configure a Kafka Producer destination, use only the security.protocol, ssl.truststore.location, and ssl.keystore.location properties. The other properties are not valid. Also, when configuring the Kafka Producer, enter absolute paths to the truststore and keystore files, which must be in PEM format.

    For details about these properties, see the Kafka documentation.

For example, the following properties allow the stage to use SSL/TLS to connect to Kafka with client authentication:
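(A sketch of the configuration; file paths and passwords are illustrative placeholders.)

  security.protocol = SSL
  ssl.truststore.location = /etc/security/ssl/kafka.client.truststore.jks
  ssl.truststore.password = <truststore password>
  ssl.keystore.location = /etc/security/ssl/kafka.client.keystore.jks
  ssl.keystore.password = <keystore password>
  ssl.key.password = <key password>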

Enabling Kerberos and SSL/TLS

Kafka stages can connect to Kafka using Kerberos authentication and SSL/TLS.

Before you enable Kafka stages to use Kerberos and SSL/TLS, make sure that you have performed all necessary prerequisite tasks. Then, perform the following steps to enable connecting to Kafka using Kerberos authentication and SSL/TLS.

The following steps describe how to provide Kerberos credentials using a JAAS file or stage properties. You can use either method or both. Skip the steps that are not relevant to your implementation.
  1. To use a Java Authentication and Authorization Service (JAAS) file to provide Kerberos credentials, add the configuration properties required for Kafka clients based on your installation and authentication type.
    • RPM, tarball, or Cloudera Manager installation without LDAP authentication - If Data Collector does not use LDAP authentication, create a separate JAAS configuration file on the Data Collector machine. Add the following KafkaClient login section to the file:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="<keytab path>"
          principal="<principal name>/<host name>@<realm>";
      };
      For example:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="/etc/security/keytabs/sdc.keytab"
          principal="sdc/sdc-01.streamsets.net@EXAMPLE.COM";
      };
      Then modify the SDC_JAVA_OPTS environment variable to include the following option that defines the path to the JAAS configuration file:
      -Djava.security.auth.login.config=<JAAS config path>

      Modify environment variables using the method required by your installation type.

    • RPM or tarball installation with LDAP authentication - If LDAP authentication is enabled in an RPM or tarball installation, add the properties to the JAAS configuration file used by Data Collector - the $SDC_CONF/ldap-login.conf file. Add the following KafkaClient login section to the end of the ldap-login.conf file:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="<keytab path>"
          principal="<principal name>/<host name>@<realm>";
      };
      For example:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="/etc/security/keytabs/sdc.keytab"
          principal="sdc/sdc-01.streamsets.net@EXAMPLE.COM";
      };
    • Cloudera Manager installation with LDAP authentication - If LDAP authentication is enabled in a Cloudera Manager installation, enable the LDAP Config File Substitutions (ldap.login.file.allow.substitutions) property for the StreamSets service in Cloudera Manager.

      If the Use Safety Valve to Edit LDAP Information (use.ldap.login.file) property is enabled and LDAP authentication is configured in the Data Collector Advanced Configuration Snippet (Safety Valve) for ldap-login.conf field, then add the JAAS configuration properties to the same ldap-login.conf safety valve.

      If LDAP authentication is configured through the LDAP properties rather than the ldap-login.conf safety valve, add the JAAS configuration properties to the Data Collector Advanced Configuration Snippet (Safety Valve) for generated-ldap-login-append.conf field.

      Add the following KafkaClient login section to the appropriate field:

      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="_KEYTAB_PATH"
          principal="<principal name>/_HOST@<realm>";
      };
      For example:
      KafkaClient {
          com.sun.security.auth.module.Krb5LoginModule required
          useKeyTab=true
          keyTab="_KEYTAB_PATH"
          principal="sdc/_HOST@EXAMPLE.COM";
      };

      Cloudera Manager generates the appropriate keytab path and host name.

  2. If using a credential store to call keytabs from stage properties, add the Base64-encoded keytabs that you want to use to the credential store.
    Note: Be sure to remove unnecessary characters, such as newline characters, before encoding the keytab.

    If you configured Data Collector to require group secrets, then for each keytab secret that you define, create a group secret and specify a comma-separated list of groups allowed to access the keytab secret.

    Name the group secret based on the keytab secret name, as follows: <keytab secret name>-groups.

    For details on defining secrets, see your credential store documentation.

  3. On the General tab of the Kafka stage, set the Stage Library property to the appropriate Kafka version.

    If configuring a Kafka Consumer origin for a Kafka YARN cluster pipeline, select a stage library for Kafka version 0.10.0.0 or later.

    If using stage properties to define Kafka credentials, select a stage library for Kafka version 0.11.0.0 or later.

  4. On the Kafka tab of the Kafka stage, for the Kafka Configuration property, use the Add icon to add the following properties:
    • Add the security.protocol Kafka configuration property, and set it to SASL_SSL.
    • Add the sasl.kerberos.service.name configuration property, and set it to kafka.
  5. Then, add and configure the following SSL Kafka properties:
    • ssl.truststore.location
    • ssl.truststore.password
    When the Kafka broker requires client authentication – that is, when the ssl.client.auth broker property is set to required – add and configure the following properties:
    • ssl.keystore.location
    • ssl.keystore.password
    • ssl.key.password
    Some brokers might require adding the following properties as well:
    • ssl.enabled.protocols
    • ssl.truststore.type
    • ssl.keystore.type

    In Data Collector Edge pipelines, when you configure a Kafka Producer destination, use only the security.protocol, ssl.truststore.location, and ssl.keystore.location properties. The other properties are not valid. Also, when configuring the Kafka Producer, enter absolute paths to the truststore and keystore files, which must be in PEM format.

    For details about these properties, see the Kafka documentation.

  6. If using stage properties to provide Kerberos credentials, configure these additional properties on the Kafka tab:
    Note: Configuring Kerberos credentials in stage properties is not supported in cluster pipelines at this time.
    1. Select the Provide Keytab property.
    2. For the Keytab property, use one of the following options:
      • Enter a Base64-encoded keytab.

        Be sure to remove unnecessary characters, such as newline characters, before encoding the keytab.

      • If using a credential store, use the credential:get() or credential:getWithOptions() credential function to retrieve a Base64-encoded keytab.
        Note: The user who starts the pipeline must be in the Data Collector group specified in the credential function. When Data Collector requires a group secret, the user must also be in a group associated with the keytab.

        For more information about using keytabs in a credential store, see Using a Credential Store.

    3. For the Principal property, use the following format to specify the principal: <principal name>/<host name>@<realm>.

For example, the following properties allow the stage to connect to Kafka using SSL/TLS and Kerberos using the specified Base64-encoded keytab:
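(A sketch of the configuration; file paths, passwords, and the principal are illustrative, and the Keytab value is the single-line Base64-encoded keytab.)

  Kafka Configuration:
    security.protocol = SASL_SSL
    sasl.kerberos.service.name = kafka
    ssl.truststore.location = /etc/security/ssl/kafka.client.truststore.jks
    ssl.truststore.password = <truststore password>
  Provide Keytab: selected
  Keytab: <Base64-encoded keytab>
  Principal: sdc/sdc-01.streamsets.net@EXAMPLE.COM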