Dataflow Performance Blog

Fast, Easy Access to Secure Kafka Clusters

It’s simple to connect StreamSets Data Collector (SDC) to Apache Kafka through the Kafka Consumer Origin and Kafka Producer Destination connectors. And because those connectors support all Kafka Client options, including the secure Kafka (SSL and SASL) options, connecting to an SSL-enabled secure Kafka cluster is just as easy. In this blog post I'll walk through the steps required.

Secure Kafka Broker Configuration

First, the Kafka broker must be configured to accept client connections over SSL. Please refer to the Apache Kafka Documentation to configure your broker. If your Kafka cluster is already SSL-enabled, you can look up the port number in your Kafka broker configuration file (or the broker logs): look for the listeners=SSL://host.name:port configuration option. To check that the Kafka broker is correctly configured to accept SSL connections, run the following command from the same host that runs SDC. If SDC is running inside a Docker container, log in to that container and run the command there.

$ openssl s_client -debug -connect host.name:port -tls1

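The command should print “CONNECTED”, followed by the handshake debug output and the broker’s certificates, as shown below. (The -tls1 flag forces TLS 1.0; if your broker only enables newer protocol versions, substitute -tls1_2.)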

CONNECTED(00000003)
write to 0x7fae22c02e50 [0x7fae23816800] (100 bytes => 100 (0x64))
0000 - 16 03 01 00 5f 01 00 00-5b 03 01 59 94 d1 29 a1   ...._...[..Y..).
0010 - a3 77 37 3d f1 51 ea ab-eb 54 ee 64 36 1e 39 b5   .w7=.Q...T.d6.9.
...

If you don’t see output like this, double-check the broker configuration file. (Remember that you’ll need to restart the broker after changing the configuration.)
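If you’d rather script this check than eyeball the output, here is a minimal shell sketch. It assumes the SSL listener is literally named SSL, as in the listeners example above, and that you can read the broker’s properties file (the path in the usage comment is a placeholder). The function names ssl_listeners and tls_handshake_ok are my own, not part of Kafka or OpenSSL. Note that s_client prints CONNECTED as soon as the TCP connection opens, so the sketch also looks for a server certificate in the output.

```shell
# Print the host:port of every SSL listener in a broker properties file.
# Assumes a "listeners=..." line whose SSL entries look like SSL://host:port.
ssl_listeners() {
  grep -E '^[[:space:]]*listeners[[:space:]]*=' "$1" \
    | cut -d= -f2- \
    | tr ',' '\n' \
    | sed -n 's|^[[:space:]]*SSL://||p'
}

# Read "openssl s_client" output on stdin. Succeed only if it reports
# CONNECTED and also contains a PEM server certificate.
tls_handshake_ok() {
  out=$(cat)
  printf '%s\n' "$out" | grep -q '^CONNECTED' &&
    printf '%s\n' "$out" | grep -q 'BEGIN CERTIFICATE'
}

# Example usage (commented out because it needs a reachable broker):
#   addr=$(ssl_listeners /path/to/server.properties | head -n 1)
#   openssl s_client -connect "$addr" </dev/null 2>&1 | tls_handshake_ok \
#     && echo "SSL listener at $addr answered with a certificate"
```

Redirecting stdin from /dev/null makes s_client exit once the handshake is done instead of waiting for input, which is what lets the check run unattended.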

Secure Kafka Client Configuration

The next step is to prepare the keystore and truststore files that the Kafka clients and the SDC Kafka connectors will use. The Apache Kafka Documentation shows how to generate a Certificate Authority (CA) and self-signed certificates and import them into the keystore and truststore (JKS).

However, if you already have an existing certificate key pair, you will need to convert it to PKCS12 format before importing into JKS. The following steps describe this in detail.

  • Convert the key/cert to PKCS12 format and import into the JKS:

$ openssl pkcs12 -export -in {your-cert-file} -inkey {your-key-file} -out host.p12

$ keytool -importkeystore -srckeystore host.p12 -destkeystore client.keystore.jks -srcstoretype pkcs12

  • Import the CA into the client Keystore:

$ keytool -keystore client.keystore.jks -alias localhost -import -file CA

  • Import the CA into the client Truststore:

$ keytool -keystore client.truststore.jks -alias CARoot -import -file CA
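Before converting an existing certificate and key, it can save some head-scratching to confirm that the two actually belong together. The sketch below compares the public key embedded in the certificate with the one derived from the private key. It assumes both files are PEM-encoded; if the key is passphrase-protected, openssl will prompt for the passphrase. The function name cert_matches_key is hypothetical.

```shell
# Succeed if the certificate ($1) and private key ($2) share a public key.
# Returns 2 if either file cannot be read or parsed by openssl.
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Example usage (file names are placeholders):
#   cert_matches_key your-cert.pem your-key.pem \
#     && echo "certificate and key match"
```

Once the imports are done, running keytool -list -keystore client.keystore.jks shows the entries in the keystore, so you can confirm that it holds both a private key entry and the trusted CA certificate.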

Now your keystore and truststore files are ready for use. Verify that everything has been set up correctly with the built-in Kafka console consumer and producer, as follows:

  • Create a “client-ssl.properties” file and copy this configuration:

security.protocol=SSL
ssl.truststore.location=/full/path/to/your/client/truststore/client.truststore.jks
ssl.truststore.password=*****
ssl.keystore.location=/full/path/to/your/client/keystore/client.keystore.jks
ssl.keystore.password=*****
ssl.key.password=*****

(Note: “ssl.key.password” is the password of the private key. The private key password and keystore password must be the same when using JKS.)

  • Start the Kafka Console Consumer:

$ kafka-console-consumer.sh --bootstrap-server host.name:port --topic test --new-consumer --consumer.config client-ssl.properties

  • Start the Kafka Console Producer:

$ kafka-console-producer.sh --broker-list host.name:port --topic test --producer.config client-ssl.properties

Type a message in the producer console and verify that the consumer receives it. This confirms that your Kafka SSL setup is working.
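If the console clients refuse to connect, one thing worth ruling out before digging into the broker is a missing key or a stale path in client-ssl.properties. Here is a small pre-flight check, sketched under the assumption that the file uses plain key=value lines exactly as shown above (Java properties files allow other separators, which this does not handle). The function name check_client_props is my own.

```shell
# Check that a client properties file ($1) defines every SSL setting used
# in this post and that the keystore/truststore paths point at readable files.
check_client_props() {
  props=$1
  status=0
  for key in security.protocol \
             ssl.truststore.location ssl.truststore.password \
             ssl.keystore.location ssl.keystore.password ssl.key.password; do
    if ! grep -q "^${key}=" "$props"; then
      echo "missing: ${key}" >&2
      status=1
    fi
  done
  for key in ssl.truststore.location ssl.keystore.location; do
    path=$(sed -n "s|^${key}=||p" "$props")
    if [ -n "$path" ] && [ ! -r "$path" ]; then
      echo "not readable: ${path}" >&2
      status=1
    fi
  done
  return "$status"
}

# Example usage:
#   check_client_props client-ssl.properties && echo "properties look complete"
```

This only checks that the settings are present and the files exist; it does not verify passwords or certificate contents, so the console producer and consumer round trip remains the real test.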

StreamSets Kafka Connector Configuration

Finally, configure the SDC Kafka connectors, which is by far the easiest part! When writing to Kafka, use the “Kafka Configuration” option in the Kafka Producer Destination to pass the security-related options as shown in the screenshot below:

 

If you are reading data from Kafka, use the same options in your Kafka Consumer Origin. Please refer to the documentation for details on the options. As always, if you have questions while you are configuring your StreamSets Data Collector Pipelines, you can ask in the Community Slack or on the Community mailing list.

Hari Nayak