Kafka + TLS/Kerberos in Cluster Streaming Mode is here!

Posted in Data Integration, March 29, 2018

Spark Streaming + Data Collector + Secure Kafka

When we first introduced cluster streaming mode with Apache Spark Streaming 1.3 and Apache Kafka 0.8 several years ago, Kafka didn’t support security features such as TLS (transport encryption, authentication) and Kerberos (authentication). In Spark 2.1, an updated Kafka connector was introduced with support for these features when used with Kafka 0.10 or newer.
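As a rough illustration of what these security features look like on the client side, the sketch below builds the standard Kafka client properties for TLS and Kerberos. The property names (security.protocol, ssl.truststore.location, sasl.mechanism, sasl.kerberos.service.name) are standard Kafka client settings; the helper function, the truststore path, and the default service name are assumptions made for this example, and this is not necessarily how Data Collector sets them internally.

```python
def kafka_security_properties(use_tls, use_kerberos,
                              truststore_path=None,
                              truststore_password=None,
                              kerberos_service_name="kafka"):
    """Return Kafka client security settings for the requested features.

    Hypothetical helper for illustration only; the keys are standard
    Kafka client configuration names.
    """
    # security.protocol combines the transport (plaintext or TLS) with
    # whether SASL authentication (here, Kerberos) is layered on top.
    if use_tls and use_kerberos:
        protocol = "SASL_SSL"
    elif use_tls:
        protocol = "SSL"
    elif use_kerberos:
        protocol = "SASL_PLAINTEXT"
    else:
        protocol = "PLAINTEXT"

    props = {"security.protocol": protocol}

    if use_tls:
        # The truststore holds the CA certificates used to verify brokers.
        if truststore_path is not None:
            props["ssl.truststore.location"] = truststore_path
        if truststore_password is not None:
            props["ssl.truststore.password"] = truststore_password

    if use_kerberos:
        # GSSAPI is the SASL mechanism Kafka uses for Kerberos.
        props["sasl.mechanism"] = "GSSAPI"
        props["sasl.kerberos.service.name"] = kerberos_service_name

    return props


# Example: TLS plus Kerberos, with an assumed truststore path.
props = kafka_security_properties(
    use_tls=True,
    use_kerberos=True,
    truststore_path="/etc/security/kafka.truststore.jks",  # assumed path
)
for key, value in sorted(props.items()):
    print(f"{key}={value}")
```

A Kerberos-enabled client also needs JAAS login configuration (keytab and principal), which is left out of this sketch.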

In Data Collector 3.2.0.0 (now available), we have introduced support for these features! However, this also means that we’re deprecating support for Spark 1.x in 3.2.0.0 and removing it in 3.3.0.0. If you want to continue using cluster streaming execution mode, you’ll need to have Spark 2.x available.

Currently, all major Hadoop distribution vendors provide a way for Spark 1.x and Spark 2.x to coexist on the same cluster, so you can adopt Spark 2.x even if you haven’t already made the move. You can find details on Spark 2 from the vendor of each supported distribution below.

We’re always working to provide support for features our users need. The need for Kafka + TLS/Kerberos in cluster execution mode was heard loud and clear. Let us know what you’d like to see in the future by sending your ideas to product-feedback@streamsets.com!

Distribution Notes

Cloudera

The Cloudera distribution of Spark 2.1 release 1 is the earliest supported version.
Spark 2.x for Cloudera CDH

Hortonworks

Hortonworks Data Platform (HDP) since 2.6 ships with Spark 2.2.0.
Hortonworks HDP 2.6 release notes

MapR

MapR provides Spark 2.x in the MapR Ecosystem Pack (MEP) 3.0 and newer.
MapR with MapR Ecosystem Pack (MEP) 3.0 and newer
