StreamSets supports your DataOps practice with a single place to design, operate and iterate on your pipelines. You design pipelines using a drag-and-drop GUI as part of a full-featured integrated development environment. You can design, preview, test and run your pipelines within the same interface, and get live metrics on dataflow performance.
With StreamSets, you specify only the fields you want to act on, so you can build pipelines much more quickly, and those pipelines are much more resilient to changes in the source schema (data drift).
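To illustrate the idea of acting only on named fields, here is a minimal Python sketch. It is not StreamSets code and does not use any StreamSets API; the function `apply_to_field` and the sample records are hypothetical. It shows one way a transformation can touch a single field and pass every other field through unchanged, so that a new field appearing at the source does not break the step.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]


def apply_to_field(record: Record, field: str, fn: Callable[[Any], Any]) -> Record:
    """Return a copy of the record with fn applied to one field.

    Hypothetical helper for illustration only. Fields that are not named
    are copied through untouched, and a missing field is simply skipped.
    """
    out = dict(record)
    if field in out:
        out[field] = fn(out[field])
    return out


# The second record has an extra field that the first one lacks.
old_shape = {"id": 1, "email": "A@Example.com"}
new_shape = {"id": 2, "email": "B@Example.com", "signup_source": "mobile"}

for rec in (old_shape, new_shape):
    print(apply_to_field(rec, "email", str.lower))
```

Because the step never lists the full schema, the added `signup_source` field flows downstream without any change to the transformation.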
StreamSets lets you connect batch and streaming sources of structured and unstructured data to a variety of big data platforms, including Hadoop, Kafka, Elastic, NoSQL databases and many more.
Unannounced changes to schema and semantics, known as data drift, can break pipelines and pollute data. StreamSets’ Intelligent Pipelines inspect the data to detect these changes, which can trigger alerts or actions that adjust downstream systems, such as adding a field to a Hive metastore.
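The detect-then-react pattern described here can be sketched in a few lines of Python. This is an assumption-laden illustration, not how StreamSets implements it: `detect_new_fields`, `process` and the `on_drift` callback are hypothetical names, and the sketch covers only the simplest kind of drift (a new field appearing). A real handler might raise an alert or alter a downstream table instead of appending to a list.

```python
from typing import Any, Callable, Dict, Iterable, List, Set

Record = Dict[str, Any]


def detect_new_fields(known: Set[str], record: Record) -> List[str]:
    """Return the field names in the record that have not been seen before."""
    return sorted(set(record) - known)


def process(
    records: Iterable[Record],
    known_fields: Iterable[str],
    on_drift: Callable[[List[str]], None],
) -> Set[str]:
    """Inspect each record and call on_drift once per newly seen set of fields."""
    known = set(known_fields)
    for record in records:
        added = detect_new_fields(known, record)
        if added:
            on_drift(added)      # e.g. alert, or add a column downstream
            known.update(added)  # remember the new fields so we alert once
    return known


# Hypothetical usage: collect drift events in a list instead of alerting.
alerts: List[List[str]] = []
stream = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com", "signup_source": "mobile"},
    {"id": 3, "email": "c@example.com", "signup_source": "web"},
]
final_schema = process(stream, ["id", "email"], alerts.append)
print(alerts)
print(sorted(final_schema))
```

Updating the known field set after each event means the callback fires only when something genuinely new appears, rather than on every record that carries the new field.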