What is StreamSets Cloud?

StreamSets Cloud™ is a cloud-native DataOps platform that you can use to build, run, and monitor your dataflow pipelines.

A pipeline describes the flow of data from an origin system to destination systems and defines how to process the data along the way.
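As a rough illustration of this concept, the following minimal Python sketch models a pipeline as an origin, a list of processing steps, and one or more destinations. It is not the StreamSets Cloud API: the Pipeline class, the sample_origin and uppercase_name functions, and the in-memory sink list are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# A record is modeled as a plain dictionary (an assumption for this sketch).
Record = Dict[str, str]


@dataclass
class Pipeline:
    """Hypothetical model of a pipeline: origin -> processors -> destinations."""
    origin: Callable[[], Iterable[Record]]
    processors: List[Callable[[Record], Record]] = field(default_factory=list)
    destinations: List[Callable[[Record], None]] = field(default_factory=list)

    def run(self) -> int:
        """Read every record from the origin, process it, and write it out.

        Returns the number of records that reached the destinations.
        """
        count = 0
        for record in self.origin():
            # Apply each processing step in order.
            for process in self.processors:
                record = process(record)
            # Write the processed record to every destination.
            for write in self.destinations:
                write(record)
            count += 1
        return count


def sample_origin() -> Iterable[Record]:
    """Stand-in origin that yields two hard-coded records."""
    yield {"name": "ada"}
    yield {"name": "grace"}


def uppercase_name(record: Record) -> Record:
    """Stand-in processor that upper-cases the 'name' field."""
    return {**record, "name": record["name"].upper()}


# An in-memory list stands in for a destination system.
sink: List[Record] = []
pipeline = Pipeline(
    origin=sample_origin,
    processors=[uppercase_name],
    destinations=[sink.append],
)
processed = pipeline.run()
```

In this sketch, a single origin feeds every destination in the list, which mirrors the idea of data flowing from one origin system to one or more destination systems.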

Pipelines can access multiple types of external systems, including cloud storage systems such as Google Cloud Storage or Amazon S3 and storage systems installed on-premises such as relational databases. All systems must be accessible from the StreamSets Cloud IP addresses.

StreamSets Cloud dedicates resources to each pipeline run, so that one running pipeline does not affect other running pipelines. StreamSets ensures that your pipelines are secure.

As a pipeline runs, StreamSets Cloud displays real-time statistics and error information about the data as it flows from the origin to destination systems.

You'll complete the following main tasks to manage pipelines within StreamSets Cloud:

Build a pipeline to define how data flows from origin to destination systems and how the data is processed along the way.

Start with an origin and go from there.

Learn about designing pipelines.

Run a pipeline to start the flow of data from the origin to destination systems.

Learn about running pipelines.

Monitor the health and performance of a running pipeline.

Handle errors as data moves through a pipeline.