Provisioning Agent Communication

A Provisioning Agent is a containerized application that runs in your corporate network within a container orchestration framework, such as Kubernetes. The agent automatically provisions Data Collector containers in the Kubernetes cluster on which it runs.

Provisioning includes deploying, starting, scaling, and stopping the Data Collector containers to work with StreamSets Control Hub. Use provisioning to reduce the overhead of managing individual Data Collector installations.

You can provision both authoring and execution Data Collectors as long as you provision them in unique deployments. When you provision an authoring Data Collector, you must associate the deployment with a Kubernetes service to expose the Data Collector container outside the cluster.

After you create a Provisioning Agent and deploy the application to a container orchestration framework, the Provisioning Agent uses encrypted REST APIs to communicate with Control Hub. Provisioning Agents initiate outbound connections to Control Hub over HTTPS on port 443.
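As a rough illustration of the outbound-only rule, the following Python sketch checks whether a target URL matches the HTTPS on port 443 requirement, for example when reviewing egress firewall rules. The function name and the example host are hypothetical and are not part of any StreamSets API.

```python
from urllib.parse import urlsplit


def is_outbound_https_443(url: str) -> bool:
    """Return True if the URL uses HTTPS and targets port 443."""
    parts = urlsplit(url)
    # urlsplit reports the port as None when the URL omits it;
    # HTTPS then defaults to port 443.
    port = parts.port if parts.port is not None else 443
    return parts.scheme == "https" and port == 443


# Hypothetical Control Hub base URL, used only for illustration.
CONTROL_HUB_URL = "https://controlhub.example.com"
allowed = is_outbound_https_443(CONTROL_HUB_URL)
```

Because the agent initiates every connection, a check like this only needs to cover egress from the cluster. No inbound port has to be opened for the Provisioning Agent.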

After the Provisioning Agent deploys authoring or execution Data Collector containers, the Data Collector containers communicate with Control Hub the same way that any registered Data Collector communicates with Control Hub.

The following image shows how a Provisioning Agent communicates with Control Hub to provision authoring and execution Data Collectors:

Provisioning Agent Requests

Provisioning Agents send requests and information to Control Hub.

Control Hub does not directly send requests to Provisioning Agents. Instead, Control Hub sends requests using encrypted REST APIs to a messaging queue managed by Control Hub. Provisioning Agents periodically check with the queue to retrieve Control Hub requests.
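The queue-based exchange can be pictured with a small in-memory model. This is a minimal sketch, not the actual Control Hub implementation: the class, method names, agent IDs, and request fields are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class MessagingQueue:
    """In-memory stand-in for the messaging queue managed by Control Hub."""

    def __init__(self):
        # Maps a hypothetical agent ID to the requests waiting for that agent.
        self._requests = defaultdict(deque)

    def send_request(self, agent_id, request):
        # Control Hub side: enqueue a request addressed to one agent.
        self._requests[agent_id].append(request)

    def retrieve_requests(self, agent_id):
        # Agent side: return and remove everything waiting for this agent.
        pending = list(self._requests[agent_id])
        self._requests[agent_id].clear()
        return pending


queue = MessagingQueue()
queue.send_request("agent-1", {"action": "scale", "deployment": "dep-a", "replicas": 3})
queue.send_request("agent-2", {"action": "stop", "deployment": "dep-b"})

first_poll = queue.retrieve_requests("agent-1")   # gets the scale request
second_poll = queue.retrieve_requests("agent-1")  # nothing left for agent-1
```

In this model Control Hub never contacts an agent. It only writes to the queue, and each agent reads its own requests when it next checks in.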

The container orchestration framework provides high availability for the Provisioning Agent.

Provisioning Agents communicate with Control Hub in the following areas:

When the Provisioning Agent deploys a Data Collector container, the agent makes a request to Control Hub for a new authentication token. The Provisioning Agent sends the returned authentication token to the Data Collector container, which enables the container to work with Control Hub. During startup, the Data Collector container registers itself with Control Hub.

The Provisioning Agent uses private keys to sign authentication tokens, and then Data Collector containers decrypt the tokens. As a result, Data Collector containers are not prone to distributed denial-of-service (DDoS) attacks where an impersonating agent attempts to send an invalid authentication token.
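The idea of rejecting tokens from an impersonating agent can be sketched as a sign-and-verify check. The sketch below is an assumption-heavy stand-in: the Python standard library has no asymmetric signatures, so it uses an HMAC with a shared key in place of the private keys described above, and every name in it is hypothetical.

```python
import hashlib
import hmac
import secrets

# Stand-in key material. The real scheme uses private keys, not a shared secret.
AGENT_KEY = secrets.token_bytes(32)


def sign_token(token: str, key: bytes) -> str:
    """Return a hex signature for the token."""
    return hmac.new(key, token.encode("utf-8"), hashlib.sha256).hexdigest()


def container_accepts(token: str, signature: str, key: bytes) -> bool:
    """Container-side check: accept only tokens that carry a valid signature."""
    expected = sign_token(token, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)


# Hypothetical token, standing in for the one returned by Control Hub.
token = secrets.token_hex(16)
signature = sign_token(token, AGENT_KEY)

genuine = container_accepts(token, signature, AGENT_KEY)
forged = container_accepts(token, sign_token(token, b"impersonator-key"), AGENT_KEY)
```

A token signed with any other key fails the check, so the container can discard it before doing further work.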

Every five seconds, Provisioning Agents send deployment status changes to the messaging queue. At the same time, Provisioning Agents check with the messaging queue to retrieve requests sent by Control Hub.

When you start, stop, or scale a deployment in Control Hub, Control Hub sends a deployment request for a specific Provisioning Agent to the messaging queue. The messaging queue retains the request until the receiving Provisioning Agent retrieves the request.

Every 60 seconds, Control Hub checks the messaging queue to retrieve deployment status changes.
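The two intervals imply rough upper bounds on how long a message can wait. The following sketch works through that arithmetic, assuming both sides poll on a fixed schedule and ignoring network and processing time. The constant and function names are illustrative only.

```python
AGENT_INTERVAL_S = 5         # agent sends status changes and checks for requests
CONTROL_HUB_INTERVAL_S = 60  # Control Hub checks for deployment status changes


def worst_case_request_delay() -> int:
    # A request enqueued just after an agent check waits one full agent interval.
    return AGENT_INTERVAL_S


def worst_case_status_delay() -> int:
    # A status change waits up to one agent interval to reach the queue,
    # then up to one Control Hub interval before Control Hub retrieves it.
    return AGENT_INTERVAL_S + CONTROL_HUB_INTERVAL_S


request_delay = worst_case_request_delay()  # 5 seconds
status_delay = worst_case_status_delay()    # 65 seconds
```

Under these assumptions, an agent picks up a start, stop, or scale request within about 5 seconds, while a status change can take up to about 65 seconds to reach Control Hub.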