Cluster Callback URL

You can configure a cluster callback URL for a pipeline to use instead of the default Transformer URL.

For cluster pipelines, the Spark cluster must be able to access Transformer to send the status, metrics, and offsets for running pipelines. Similarly, for local pipelines, the local Spark installation must be able to access the local Transformer instance.

In most cases, Spark can successfully communicate with Transformer using the URL configured in the Transformer configuration file, $TRANSFORMER_CONF/. However, in some situations, Spark cannot reach Transformer at the URL defined in the configuration file and therefore cannot communicate with Transformer. When this occurs, you must define a cluster callback URL in the pipeline properties to enable the Spark cluster to communicate with Transformer.
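One way to find out whether Spark can reach Transformer at a given URL is to attempt a TCP connection from a cluster node. The following is a minimal sketch in Python, not part of Transformer: the helper names host_and_port and can_reach are hypothetical, and the URL in the usage note is a placeholder. It only tests that the host and port accept connections, not that Transformer itself responds.

```python
import socket
from urllib.parse import urlsplit

# Standard ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(url):
    """Extract (host, port) from an http or https URL."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host found in URL: {url!r}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"no port and unknown scheme in URL: {url!r}")
    return parts.hostname, port


def can_reach(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, running can_reach("https://transformer.example.com") on a cluster node, where the URL is a placeholder for the value in your configuration file, returns False when the node cannot open a connection. That result suggests a cluster callback URL may be needed.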

For example, let's say that you've registered Transformer to work with Control Hub and you've set up a reverse proxy or a Kubernetes Ingress service for Transformer. You then set the transformer.base.http.url property in the Transformer configuration file to the reverse proxy or Ingress service URL. This way, the web browser that you use to access Control Hub can reach Transformer and use it as an authoring Transformer for pipeline design.

However, a Spark cluster that runs inside the internal network, such as a cluster on Kubernetes or YARN, cannot access Transformer using the reverse proxy or Ingress service URL. In this case, you must override the Transformer URL defined in the configuration file by setting the cluster callback URL for the pipeline.
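The override described above amounts to a simple precedence rule: the cluster callback URL is used when it is set, and the Transformer URL from the configuration file is used otherwise. The following Python sketch illustrates that rule only. It is not Transformer's implementation, the function name callback_url_for_spark is hypothetical, and both URLs in the usage note are placeholders.

```python
def callback_url_for_spark(base_http_url, cluster_callback_url=None):
    """Pick the URL that Spark uses to report back to Transformer.

    base_http_url: value of transformer.base.http.url in the
        Transformer configuration file.
    cluster_callback_url: value of the pipeline's Cluster Callback URL
        property, or None or an empty string when it is not defined.
    """
    override = (cluster_callback_url or "").strip()
    # A non-empty pipeline-level value takes precedence over the
    # Transformer-wide URL from the configuration file.
    return override if override else base_http_url
```

With a configuration-file URL of https://transformer.example.com (the Ingress address) and a cluster callback URL of http://transformer.internal:19630 (an internal address), the function returns the internal address. With no cluster callback URL defined, it returns the configuration-file URL.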

To define a cluster callback URL, in the pipeline properties panel, click the Advanced tab and define the URL in the Cluster Callback URL property.