Provisioning Agent Communication

A Provisioning Agent is a containerized application that runs in a container orchestration framework, such as Kubernetes. The agent automatically provisions Data Collector containers in the Kubernetes cluster on which it runs.

Provisioning includes deploying, starting, stopping, and scaling the Data Collector containers to work with StreamSets Control Hub. Use provisioning to reduce the overhead of managing individual Data Collector installations.

You can provision both authoring and execution Data Collectors as long as you provision them in unique deployments. When you provision an authoring Data Collector, you must associate the deployment with a Kubernetes service to expose the Data Collector container outside the cluster.

After you create a Provisioning Agent and deploy the application to a container orchestration framework, the Provisioning Agent uses encrypted REST APIs to communicate with Control Hub. Provisioning Agents initiate outbound connections to Control Hub on the port number configured in the Control Hub system. The connection must use the same protocol, HTTP or HTTPS, as the Control Hub system.
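The protocol and port requirement can be checked before the agent is deployed. The following is a minimal sketch, not part of Control Hub: the function name `connection_matches` and the example URLs are hypothetical, and it assumes that an omitted port means the scheme default (80 for HTTP, 443 for HTTPS).

```python
from urllib.parse import urlsplit

# Default ports assumed when a URL omits an explicit port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_matches(control_hub_url: str, agent_target_url: str) -> bool:
    """Return True when the agent's outbound target uses the same
    protocol and port as the Control Hub system (hypothetical helper)."""
    hub = urlsplit(control_hub_url)
    target = urlsplit(agent_target_url)

    # Only HTTP and HTTPS are valid protocols for this connection.
    if hub.scheme not in DEFAULT_PORTS or target.scheme not in DEFAULT_PORTS:
        return False

    hub_port = hub.port or DEFAULT_PORTS[hub.scheme]
    target_port = target.port or DEFAULT_PORTS[target.scheme]
    return hub.scheme == target.scheme and hub_port == target_port
```

For example, an agent configured with `http://sch.example.com:18631` would fail this check against a Control Hub system that listens on `https://sch.example.com:18631`, because the protocols differ.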

After the Provisioning Agent deploys authoring or execution Data Collector containers, the Data Collector containers communicate with Control Hub the same way that any registered Data Collector communicates with Control Hub.

The following image shows how a Provisioning Agent communicates with Control Hub to provision authoring and execution Data Collectors:

Provisioning Agent Requests

Provisioning Agents send requests and information to several of the Control Hub applications.

Control Hub applications do not directly send requests to Provisioning Agents. Instead, Control Hub applications send requests using encrypted REST APIs to a messaging queue managed by the Messaging application. Provisioning Agents periodically check with the queue to retrieve application requests.
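The pull-based pattern described above can be illustrated with a toy in-memory model. This is only a sketch: the `MessagingQueue` class, its `send` and `retrieve` methods, and the request fields are hypothetical names chosen for illustration and do not reflect the actual Control Hub REST APIs.

```python
from collections import defaultdict, deque


class MessagingQueue:
    """Toy stand-in for the queue managed by the Messaging application."""

    def __init__(self):
        # One FIFO queue of pending requests per Provisioning Agent.
        self._pending = defaultdict(deque)

    def send(self, agent_id, request):
        """Called by a Control Hub application; the agent is never contacted."""
        self._pending[agent_id].append(request)

    def retrieve(self, agent_id):
        """Called by the agent when it polls; returns and clears its requests."""
        pending = self._pending[agent_id]
        requests = list(pending)
        pending.clear()
        return requests


queue = MessagingQueue()

# A Control Hub application queues requests addressed to one agent.
queue.send("agent-1", {"action": "start", "deployment": "dep-a"})
queue.send("agent-1", {"action": "scale", "deployment": "dep-a", "replicas": 3})

# The agent retrieves them on its next check; a second check finds nothing new.
first_poll = queue.retrieve("agent-1")
second_poll = queue.retrieve("agent-1")
```

Because the agent always initiates the exchange, only outbound connections from the cluster to Control Hub are needed in this model.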

The container orchestration framework provides high availability for the Provisioning Agent.

Provisioning Agents communicate with the following Control Hub applications:

Security application

When the Provisioning Agent deploys a Data Collector container, the agent requests a new authentication token from the Security application. The Provisioning Agent sends the returned token to the Data Collector container, which enables the container to work with Control Hub. During startup, the Data Collector container registers itself with Control Hub.

The Provisioning Agent uses private keys to sign authentication tokens, and then Data Collector containers decrypt the tokens. As a result, Data Collector containers are not prone to distributed denial-of-service (DDoS) attacks where an impersonating agent attempts to send an invalid authentication token.
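The token flow can be sketched as a small simulation. All class and method names here (`SecurityApp`, `ProvisioningAgent`, `DataCollectorContainer`, `issue_token`, `is_valid`) are hypothetical, and the sketch deliberately leaves out token signing and encryption; it only shows the order of the steps and that a token not issued by the Security application is rejected.

```python
import secrets


class SecurityApp:
    """Hypothetical stand-in for the Security application."""

    def __init__(self):
        self._issued = set()

    def issue_token(self):
        # Generate a random token and remember that it was issued.
        token = secrets.token_hex(16)
        self._issued.add(token)
        return token

    def is_valid(self, token):
        return token in self._issued


class DataCollectorContainer:
    """Hypothetical Data Collector container."""

    def __init__(self, name):
        self.name = name
        self.token = None
        self.registered = False

    def start(self, security):
        # During startup, the container registers itself using its token.
        self.registered = self.token is not None and security.is_valid(self.token)
        return self.registered


class ProvisioningAgent:
    """Hypothetical Provisioning Agent."""

    def __init__(self, security):
        self.security = security

    def deploy(self, name):
        container = DataCollectorContainer(name)
        # 1. Request a new authentication token from the Security application.
        # 2. Pass the returned token to the container.
        container.token = self.security.issue_token()
        # 3. The container registers itself as it starts.
        container.start(self.security)
        return container
```

Deploying through the agent, for example `ProvisioningAgent(SecurityApp()).deploy("sdc-1")`, yields a registered container in this model, while a container that is handed a token the Security application never issued does not register.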

Messaging application

Every five seconds, Provisioning Agents send deployment status changes to the Messaging application. At the same time, Provisioning Agents check with the Messaging application to retrieve requests sent by the Provisioning application.

When you start, stop, or scale a deployment in Control Hub, the Provisioning application sends a deployment request for a specific Provisioning Agent to the Messaging application. The Messaging application retains the request until the receiving Provisioning Agent retrieves the request.

Every 60 seconds, the Provisioning application checks the Messaging application to retrieve deployment status changes.
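The two intervals determine roughly how long information takes to travel in each direction. The following sketch works through the arithmetic under simplifying assumptions that are not stated in the product documentation: both timers start at second 0, times are whole seconds, and network and processing delays are ignored. The function names are hypothetical.

```python
AGENT_INTERVAL_S = 5          # agent sends status and checks for requests
PROVISIONING_INTERVAL_S = 60  # Provisioning application checks for status


def next_tick(t, interval):
    """Return the first multiple of `interval` at or after time `t` (integers)."""
    return -(-t // interval) * interval  # ceiling division, then scale back up


def request_received_at(request_time):
    """Time at which the agent retrieves a request queued at `request_time`."""
    return next_tick(request_time, AGENT_INTERVAL_S)


def status_visible_at(change_time):
    """Time at which the Provisioning application sees a status change."""
    # The agent sends the change on its next five-second tick...
    sent = next_tick(change_time, AGENT_INTERVAL_S)
    # ...and the Provisioning application picks it up on its next 60-second check.
    return next_tick(sent, PROVISIONING_INTERVAL_S)
```

In this model, a request queued at second 12 reaches the agent at second 15, while a status change at second 61 is sent at second 65 and becomes visible to the Provisioning application at second 120. Requests therefore reach agents within a few seconds, whereas status changes can take up to about a minute to appear.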