Control. We always want it and regularly don’t get it, yet in business it’s a must-have for ensuring things run as expected. Control is particularly critical when it comes to moving data around your company. Without it, it’s difficult to know where data is coming from, where it’s going and how it’s been manipulated (and by whom!) along the way. At StreamSets, we specialize in helping customers effectively control data and move it around their business, from any source to any destination. Over time, we’ve observed that most organizations lack the controls needed to ensure that data pipelines are built, executed and operated in a manner that meets the needs of the application or business process they support.
Specifically, we see customers facing the following challenges when attempting to control data:
- Data engineering teams that are growing in number and working in silos need tools to help them collaborate more effectively on data pipeline development.
- Organizations need to scale up their development and architecture efforts by using toolkits, templates and APIs.
- As the number and complexity of data pipelines grow, keeping tabs on them all becomes an onerous task, especially as they evolve over their lifecycle.
- Understanding how pipelines are intertwined and interact becomes difficult. This is particularly important for situations requiring full visibility into end-to-end data movement for compliance reasons.
To address these challenges, we’re excited to announce StreamSets Control Hub as a way for teams to gain better control over all of their data pipelines. Control Hub extends the value of the StreamSets Data Operations Platform by delivering capabilities that help teams streamline how they develop, deploy and execute pipelines. Combined with StreamSets Data Collector and StreamSets Transformer, Control Hub helps customers mature their approach to data movement and data operations with better data pipeline management.
StreamSets Control Hub addresses the challenges noted above by bringing the following capabilities to the StreamSets platform:
- Cloud-based design tool & shared pipeline repository: A cloud-based design tool with the same UI functionality as StreamSets Data Collector, plus a shared pipeline repository for team collaboration and pipeline lifecycle management.
- Automated deployment and provisioning: Automatically deploy pipelines created in Data Collector, either locally or on AWS, Azure, or Google Cloud Platform, and deploy and elastically scale pipelines via Kubernetes.
- Architecture-wide visibility and control: Get a complete topology view by visualizing multiple pipelines as a dataflow topology, with the ability to set Data SLAs.
- Data governance support: Flow metadata through dataflow pipelines, with built-in processors that store metadata at any point in a dataflow. Metadata pushdown integration is available for Cloudera Navigator™ and Apache Atlas™.
We couldn’t be more excited about the introduction of StreamSets Control Hub. As we strive to help customers succeed in building, executing and operating data pipelines, Control Hub is an ideal solution for helping teams collaborate better and scale their efforts to effectively control data. We encourage you to learn more about our solution for managing data pipelines.