SDC Edge Communication

StreamSets Control Hub works with Data Collector Edge (SDC Edge) to execute edge pipelines. SDC Edge is a lightweight agent that runs pipelines on edge devices with limited resources.

You install each SDC Edge on an edge device and then register it to work with Control Hub.

You use an authoring Data Collector to design edge pipelines. You can design edge pipelines in the Control Hub Pipeline Designer after selecting an available authoring Data Collector. Alternatively, you can log in to an authoring Data Collector directly and design edge pipelines in the Data Collector UI.

To preview and validate edge pipelines as you design them, the authoring Data Collector must connect to a registered SDC Edge. The SDC Edge accepts inbound connections from the authoring Data Collector over HTTP or HTTPS on the port number configured for the SDC Edge.

Registered Edge Data Collectors use encrypted REST APIs to communicate with Control Hub. Edge Data Collectors initiate outbound connections to Control Hub on the port number configured in the Control Hub system. The connection must use the same protocol, HTTP or HTTPS, as the Control Hub system.
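The protocol requirement can be checked before you register an SDC Edge. The following is a minimal sketch, not part of SDC Edge or Control Hub: the function name check_control_hub_url and the example URLs are hypothetical. It only compares the URL schemes and resolves the port, falling back to the standard defaults of 80 for HTTP and 443 for HTTPS when the URL names no port.

```python
from urllib.parse import urlsplit

# Standard default ports, used when the URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def check_control_hub_url(edge_target_url: str, control_hub_url: str) -> int:
    """Return the port the SDC Edge would connect to, or raise ValueError.

    Hypothetical helper: both arguments are illustrative URLs, not values
    read from a real SDC Edge or Control Hub configuration.
    """
    target = urlsplit(edge_target_url)
    hub = urlsplit(control_hub_url)
    if target.scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported protocol: {target.scheme!r}")
    if target.scheme != hub.scheme:
        raise ValueError(
            f"protocol mismatch: SDC Edge uses {target.scheme}, "
            f"Control Hub uses {hub.scheme}"
        )
    # Use the explicit port if present, otherwise the protocol default.
    return target.port or DEFAULT_PORTS[target.scheme]


# Example with made-up host names.
port = check_control_hub_url(
    "https://controlhub.example.com:18631",
    "https://controlhub.example.com:18631",
)
```

A mismatch, such as an SDC Edge configured with an http:// URL for a Control Hub system that uses HTTPS, raises ValueError in this sketch instead of failing later at connection time.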

The following image shows how each SDC Edge communicates with Control Hub and with the authoring Data Collector:

SDC Edge Requests

Just like Data Collector, a registered SDC Edge sends requests and information to several of the Control Hub applications.

Control Hub applications do not directly send requests to an SDC Edge. Instead, Control Hub sends requests using encrypted REST APIs to a messaging queue managed by the Messaging application. An SDC Edge periodically checks with the queue to retrieve Control Hub requests.
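The pull model described above can be illustrated with a small in-memory stand-in for the messaging queue. This is only a sketch under stated assumptions: the class MessagingQueue, the edge IDs, and the request fields are hypothetical, and the real queue is reached through encrypted REST APIs rather than direct method calls.

```python
from collections import defaultdict, deque


class MessagingQueue:
    """Toy in-memory stand-in for the queue managed by the Messaging application."""

    def __init__(self):
        # One FIFO of pending requests per SDC Edge ID (IDs are hypothetical).
        self._pending = defaultdict(deque)

    def post(self, edge_id: str, request: dict) -> None:
        """Control Hub side: queue a request for one SDC Edge."""
        self._pending[edge_id].append(request)

    def retrieve(self, edge_id: str) -> list:
        """SDC Edge side: fetch and clear every request waiting for this edge."""
        waiting = self._pending[edge_id]
        requests = list(waiting)
        waiting.clear()
        return requests


queue = MessagingQueue()
queue.post("edge-1", {"action": "START_PIPELINE", "pipeline": "sensor-readings"})

# Nothing is pushed to the device; the request waits until edge-1 polls.
first_poll = queue.retrieve("edge-1")
second_poll = queue.retrieve("edge-1")
other_edge_poll = queue.retrieve("edge-2")
```

Because the SDC Edge always initiates the exchange, the edge device does not need to accept inbound connections from Control Hub in this model.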

SDC Edge communicates with the following Control Hub applications:

Time Series
Every minute, an SDC Edge sends metrics for remotely running edge pipelines directly to the Time Series application.
Messaging
Edge Data Collectors send the following information to the Messaging application:
  • At startup, an SDC Edge sends the following information: SDC Edge version, HTTP URL of the SDC Edge, and labels configured in the SDC Edge configuration file, edge.conf.
  • Every five seconds, an SDC Edge sends a heartbeat.
  • Every minute, an SDC Edge sends the last-saved offsets of remotely running edge pipelines and the status of all running edge pipelines.
Job Runner
Every three seconds, the Job Runner application checks the Messaging application to retrieve pipeline status changes and last-saved offsets sent by each SDC Edge.
Every five seconds, each SDC Edge checks with the Messaging application to retrieve requests sent by the Job Runner application. When you start, stop, or delete a job, the Job Runner sends a pipeline request for a specific SDC Edge to the Messaging application. The Messaging application retains the request until the receiving SDC Edge retrieves the request.
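To get a feel for the traffic these intervals produce, the schedule can be laid out as a simple timeline. This is a rough sketch, not how SDC Edge is implemented: the names below are hypothetical, and it assumes each periodic task first runs one full interval after startup, which the description above does not specify.

```python
# Intervals in seconds, taken from the descriptions above.
EDGE_INTERVALS = {
    "metrics to Time Series": 60,
    "heartbeat to Messaging": 5,
    "offsets and status to Messaging": 60,
    "poll Messaging for Job Runner requests": 5,
}
JOB_RUNNER_POLL_INTERVAL = 3


def events_in_window(interval: int, window: int) -> list:
    """Return the times (seconds after startup) at which a periodic task fires.

    Assumption: the first run happens one full interval after startup.
    """
    return list(range(interval, window + 1, interval))


def edge_timeline(window: int) -> dict:
    """Map each periodic SDC Edge task to its firing times within the window."""
    return {
        name: events_in_window(step, window)
        for name, step in EDGE_INTERVALS.items()
    }


timeline = edge_timeline(60)
for name, times in timeline.items():
    print(f"{name}: {len(times)} time(s) in the first minute")

# The Job Runner checks the Messaging application on its own schedule.
job_runner_checks = events_in_window(JOB_RUNNER_POLL_INTERVAL, 60)
```

Under these assumptions, a single SDC Edge sends 12 heartbeats and polls for requests 12 times in its first minute, while metrics, offsets, and status are each sent once.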