Pipeline Event Generation

The event framework generates pipeline events at specific points in the pipeline lifecycle. You can configure the pipeline properties to pass each event to an executor for more complex processing.

The event framework generates the following pipeline-related events:
Pipeline Start
The pipeline start event is generated as the pipeline starts, before individual stages are initialized. This gives an executor time to perform a task before stage initialization begins.
Most executors wait for confirmation that a task completes. As a result, the pipeline waits for the executor to complete the task before continuing with stage initialization.
Pipeline Stop
The pipeline stop event is generated as the pipeline stops, whether manually, programmatically, or due to a failure. The stop event is generated after all stages have completed processing and cleaned up temporary resources, such as temporary files. This allows an executor to perform a task after pipeline processing is complete, before the pipeline fully stops.

As with start events, the behavior of the executor that consumes the stop event determines whether the pipeline waits for the executor task to complete before it fully stops. Also, if processing of the pipeline stop event fails for any reason, the pipeline transitions to a failed state even though the data processing was successful. The sketch below illustrates both the start and stop behavior.
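
The details differ by executor, but purely for illustration, the following Python sketch models the ordering and failure semantics described above. The class and field names are invented for the example and are not the product API.

```python
# Hypothetical model of the start/stop event contract described above.
# None of these names come from the product; they only illustrate the
# ordering and failure semantics.

class Stage:
    def __init__(self, name: str):
        self.name = name

    def init(self) -> None:
        print(f"initializing stage {self.name}")

    def cleanup(self) -> None:
        print(f"{self.name}: removing temporary files")


class Pipeline:
    def __init__(self, stages, start_executor=None, stop_executor=None):
        self.stages = stages
        self.start_executor = start_executor
        self.stop_executor = stop_executor
        self.state = "STARTING"

    def start(self) -> None:
        if self.start_executor:
            # Start event: the pipeline waits for the executor task to
            # complete before any stage initializes.
            self.start_executor({"type": "start"})
        for stage in self.stages:
            stage.init()
        self.state = "RUNNING"

    def stop(self) -> None:
        # The stop event fires only after every stage finishes cleanup.
        for stage in self.stages:
            stage.cleanup()
        if self.stop_executor:
            try:
                self.stop_executor({"type": "stop"})
            except Exception:
                # A failed stop task fails the run, even though data
                # processing itself succeeded.
                self.state = "FAILED"
                raise
        self.state = "FINISHED"


pipeline = Pipeline(
    [Stage("origin"), Stage("destination")],
    start_executor=lambda event: print("start task done"),
    stop_executor=lambda event: print("stop task done"),
)
pipeline.start()
pipeline.stop()
```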

Pipeline events differ from stage events as follows:
  • Virtual processing - Unlike stage events, pipeline events are not processed by stages that you configure in the canvas. Instead, they are passed to an event consumer that you configure in the pipeline properties.

    The event consumer does not appear on the pipeline canvas. As a result, pipeline events are not visualized in data preview or pipeline monitoring.

  • Single-use events - You can configure only one event consumer for each event type within the pipeline properties: one for the Start event and one for the Stop event.

    When necessary, you can pass pipeline events to another pipeline. In the event-consuming pipeline, you can include as many stages as you need for more complex processing, as in the sketch below.
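
For illustration only, and not the product's actual forwarding mechanism, the following Python sketch shows the idea: the originating pipeline emits a single event record, and a separate event-consuming pipeline picks it up and can route it through any number of processing steps. The queue and the record's fields are invented for this example.

```python
# Illustrative only: a stand-in for passing a pipeline event to a
# second, event-consuming pipeline.
import queue

event_intake: "queue.Queue[dict]" = queue.Queue()

def emit_start_event(pipeline_id: str) -> None:
    # The originating pipeline emits a single event record.
    event_intake.put({"type": "start", "pipeline_id": pipeline_id})

def event_consuming_pipeline() -> None:
    # The consuming pipeline can route the record through as many
    # stages as the processing requires.
    record = event_intake.get()
    print(f"received {record['type']} event from {record['pipeline_id']}")

emit_start_event("daily-adls-load")
event_consuming_pipeline()
```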

Using Pipeline Events

You can configure a pipeline to pass each event type to an executor, which allows you to trigger a task when the pipeline starts or stops. You configure the behavior for each event type separately, and you can discard any event that you do not want to use.

Note: If the specified executor fails to process the event, for example, if an Amazon S3 executor fails to perform a task in Amazon S3, the pipeline transitions to a failed state.
To pass a pipeline event to the executor, perform the following steps, summarized in the sketch after the list:
  1. In the pipeline properties, select the executor that you want to consume the event.
  2. In the pipeline properties, configure the executor to perform the task.
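
Expressed as data purely for illustration, the two steps amount to two settings per event type. The field names below are invented and do not match the product's configuration schema.

```python
# Illustrative only: the two configuration steps expressed as data.
pipeline_properties = {
    # Step 1: select the consumer for each event type.
    "start_event": {"consumer": "Shell executor"},
    "stop_event": {"consumer": "Discard"},
    # Step 2: configure the task that the selected executor performs.
    "start_event_task": {"script": "./prepare_working_dir.sh"},
}
```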

Example

Say you want to run a daily pipeline that writes to Azure Data Lake Storage Gen2. You want to remove the output directory and all of its contents before running the pipeline the next day.

First, you configure the pipeline to use the ADLS Gen2 File Metadata executor for the pipeline start event. Since you don't need the Stop event, you can use the discard option.

Then, also in the pipeline properties, you configure the ADLS Gen2 File Metadata executor on the Start Event tab. After specifying the connection information for Azure, you configure the executor to remove the output directory.
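
For comparison only, here is roughly what the equivalent standalone cleanup looks like with the Azure SDK for Python (the azure-storage-file-datalake package). The account, credential, container, and directory names below are placeholders.

```python
# Standalone equivalent of the executor's "remove directory" task,
# using the Azure SDK for Python. All names below are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key-or-sas-token>",
)
filesystem = service.get_file_system_client("output-container")

# Deletes the directory and all of its contents.
filesystem.get_directory_client("daily/output").delete_directory()
```

With the executor handling the start event, each daily run begins with a clean output directory.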