StreamSets Extends DataOps to Snowflake
New Partnership and StreamSets for Snowflake Bring Simple, High-Performance Data Integration to the Leading Cloud-Built Data Warehouse
Snowflake’s enterprise data warehouse is designed to easily amass all data, enabling rapid analytics that quickly make data insights available to all users, consumers and systems that need them. Snowflake customers enjoy benefits such as instant elasticity, secure data sharing and per-second pricing across multiple clouds for data analytics.
StreamSets for Snowflake greatly simplifies development of any-to-any pipelines that include Snowflake, while providing continuous visibility and control over pipeline performance to ensure data availability, integrity and protection. Enterprises can use StreamSets for Snowflake to continuously flow data in real time from relational databases, data lakes, search stores, logs, APIs and numerous other sources into Snowflake for use cases such as bulk upload and database migration, change data capture (CDC) and streaming data ingest across a hybrid multi-cloud architecture.
“As the only data warehouse built for the cloud, Snowflake brings a compelling combination of performance, scalability and simplicity to the enterprise. A data warehouse must integrate with enterprise data sources and other data platforms to form a cohesive architecture that best serves the business,” said Jobi George, vice president of business development and strategic alliances at StreamSets. “StreamSets for Snowflake helps enterprises extend their architecture to Snowflake with efficiency, performance and resilience in the face of change. Through our new partnership we look forward to extending the benefits of DataOps to Snowflake customers.”
StreamSets for Snowflake includes the following capabilities:
- Visual pipeline designer; configuration not coding
- High-performance synchronous and asynchronous data movement
- Change data capture (CDC) for more efficient update of Snowflake objects
- Automatic table creation and multi-table upload without prior schema specification
- Data drift handling: monitors for and automatically propagates source data schema and semantic changes to Snowflake
StreamSets for Snowflake also includes these additional DataOps features:
- Ability to integrate Snowflake with myriad data sources and other storage platforms, whether on premises or in the cloud
- Ability to perform numerous pre-built or custom data transformations on data-in-motion
- Centralized end-to-end monitoring of multi-platform dataflows with enforcement of Data SLAs for availability and quality
- Policy-based detection and protection of sensitive data before it lands in the Snowflake data warehouse
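As a rough illustration of policy-based protection of sensitive data before it lands in a warehouse, the sketch below masks values that match a small set of patterns. It does not reflect how StreamSets implements this feature; the `POLICIES` table, the `mask_record` function and the two regular expressions are hypothetical and deliberately simplistic.

```python
import re

# Hypothetical policies: name -> pattern. Illustrative only; real policies
# would be far more thorough than these two regular expressions.
POLICIES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_record(record):
    """Return (masked_copy, fired_policies) for a flat dict record.

    Only string fields are scanned; other values pass through unchanged.
    """
    masked = {}
    fired = set()
    for field, value in record.items():
        if isinstance(value, str):
            for name, pattern in POLICIES.items():
                # subn returns the rewritten string and the number of matches
                value, count = pattern.subn("****", value)
                if count:
                    fired.add(name)
        masked[field] = value
    return masked, sorted(fired)


if __name__ == "__main__":
    row = {"id": 7, "note": "contact a.b@example.com, SSN 123-45-6789"}
    print(mask_record(row))
```

Running the masking step inside the pipeline, rather than after loading, means the unmasked values never reach the target table.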
“Our customers are continuously looking to add big data and streaming sources to the datasets in their cloud platforms,” said Walter Aldana, vice president of alliances at Snowflake. “Our partnership with StreamSets and their new StreamSets for Snowflake integration gives enterprises an efficient and reliable way to deliver all of their data, traditional or modern, into Snowflake and maximize the potential for analytic insights.”
Red Pill Analytics, an analytics services company, relies on StreamSets and Snowflake to deliver the best elements of the cloud for agile development-as-a-service and continuous delivery to its customers.
“We are excited for both StreamSets and Snowflake, who together enable our customers’ big data modernization projects,” said Stewart Bryson, CEO of Red Pill Analytics. “StreamSets’ intelligent data pipelines, now integrated with Snowflake’s unique architecture, mean our customers can accelerate insights delivery from the premier cloud-built data warehouse.”
StreamSets for Snowflake is available immediately in StreamSets DataOps Platform version 3.7 and is included as part of a StreamSets Standard or Enterprise subscription.
DataOps is the application of DevOps practices to data management and integration to reduce the cycle time of data analytics, with a focus on automation, collaboration and monitoring. DataOps is essential for a data landscape marked by architectural complexity with accelerating change. DataOps is characterized by the following core capabilities:
- Cross-platform data integration that enables flexible selection of fit-for-purpose storage and compute platforms
- Data SLAs for continuous monitoring, measurement and enforcement of business standards for data availability, quality and protection
- Continuous integration and delivery (CI/CD) of dataflows for agility (proactive change management)
- Data drift resilience: automated detection of, and response to, unexpected changes to schema and semantics (reactive change management)
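To make the data drift idea concrete, the following minimal sketch compares an incoming record with the columns a target table is known to have and drafts `ALTER TABLE ... ADD COLUMN` statements for any new fields. It is an assumption-laden illustration, not the StreamSets implementation: the function names, the Python-to-Snowflake type mapping and the `users` table are hypothetical, and identifiers are not quoted or validated.

```python
def detect_drift(known_columns, record):
    """Return (new_columns, missing_columns) for one incoming record."""
    incoming = set(record)
    known = set(known_columns)
    return sorted(incoming - known), sorted(known - incoming)


def alter_statements(table, record, new_columns):
    """Draft one ADD COLUMN statement per newly seen field.

    The type mapping is a naive assumption based on the Python type of the
    first value seen; anything unrecognized falls back to VARIANT.
    """
    type_map = {bool: "BOOLEAN", int: "NUMBER", float: "FLOAT", str: "VARCHAR"}
    return [
        f"ALTER TABLE {table} ADD COLUMN {col} "
        f"{type_map.get(type(record[col]), 'VARIANT')}"
        for col in new_columns
    ]


if __name__ == "__main__":
    known = ["id", "name"]
    row = {"id": 1, "name": "a", "signup_ts": "2019-01-01", "score": 1.5}
    new_cols, missing_cols = detect_drift(known, row)
    for stmt in alter_statements("users", row, new_cols):
        print(stmt)
```

A production pipeline would also need to handle type changes, renamed fields and semantic changes, which this sketch does not attempt.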
StreamSets built the industry’s first multi-cloud DataOps platform for modern data integration, helping enterprises to continuously flow big, streaming and traditional data to their data scientists and data-intensive applications. It uniquely handles data drift, those frequent and unexpected changes to data that break pipelines and damage data integrity. The platform combines the open source StreamSets Data Collector™ for execution of any-to-any pipelines (the data plane) with a cloud-native StreamSets Control Hub™ for the design, monitoring and performance management of multi-pipeline topologies (the control plane). Founded in 2014 by Girish Pancha, former chief product officer of Informatica, and Arvind Prabhakar, a former engineering leader at Informatica and Cloudera, StreamSets is backed by top-tier Silicon Valley venture capital firms, including Battery Ventures, New Enterprise Associates (NEA), and Accel Partners. For more information, visit https://streamsets.com/partners/snowflake.