Cluster Pipeline Limitations

Cluster pipelines have the following limitations:
  • Non-cluster origins - Do not use non-cluster origins in cluster pipelines. For a description of the origins to use, see Cluster Batch and Streaming Execution Modes.
  • Kafka stages - Kafka stages do not support using the Provide Keytab and related properties to specify credentials for Kerberos authentication. Use JAAS files to provide Kerberos credentials.
  • Pipeline events - You cannot use pipeline events in cluster pipelines.
  • Record Deduplicator processor - This processor is not supported in cluster pipelines at this time.
  • RabbitMQ Producer destination - This destination is not supported in cluster pipelines at this time.
  • Scripting processors - The state object is available only for the instance of the processor stage it is defined in. If the pipeline executes in cluster mode, the state object is not shared across nodes.
  • Spark Evaluator processor - In cluster mode, this processor is supported only in cluster streaming pipelines, not in cluster batch pipelines. You can also use the Spark Evaluator in standalone pipelines.
  • Spark Evaluator processor and Spark executor - When using Spark stages, the stages must use the same Spark version as the cluster. For example, if the cluster uses Spark 2.1, the Spark Evaluator must use a Spark 2.1 stage library.

    Both stages are available in several CDH and MapR stage libraries. To verify the Spark version that a stage library includes, see the CDH or MapR documentation. For more information about the stage libraries that include the Spark Evaluator, see Available Stage Libraries.
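    Because Kafka stages in cluster pipelines cannot use the Provide Keytab properties, Kerberos credentials must come from a JAAS configuration file. A minimal sketch is shown below; the keytab path and principal are placeholders, not values from this document, and the exact settings required depend on your Kafka and Kerberos setup:

    ```
    // Hypothetical JAAS file for a Kafka client using a keytab.
    // Replace the keyTab path and principal with your own values.
    KafkaClient {
        com.sun.security.auth.module.Krb5LoginModule required
        useKeyTab=true
        storeKey=true
        keyTab="/path/to/sdc.keytab"
        principal="sdc/host.example.com@EXAMPLE.COM";
    };
    ```

    The file is typically passed to the JVM with the `java.security.auth.login.config` system property; consult the Kafka and Data Collector documentation for how to apply it in your environment.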
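    The scripting-processor limitation above can be illustrated with a small sketch. This is not the processor API itself, just a hypothetical model of it: each cluster node holds an independent copy of the state object, so a counter kept in state reflects only the records that one node has processed, not the pipeline total.

    ```python
    # Sketch of why scripting-processor state is not shared in cluster
    # mode: each node keeps its own state object. Node names, batches,
    # and the process_batch helper are hypothetical.

    def process_batch(state, batch):
        """Mimics a scripting processor that counts records in 'state'."""
        for _ in batch:
            state["count"] = state.get("count", 0) + 1
        return state["count"]

    # Two cluster nodes, each with an independent state object.
    node_a_state, node_b_state = {}, {}
    print(process_batch(node_a_state, ["r1", "r2"]))  # node A sees 2
    print(process_batch(node_b_state, ["r3"]))        # node B sees 1, not 3
    ```

    In a standalone pipeline there is a single processor instance, so this divergence does not occur.
    
    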