The DataOps Blog

Where Change Is Welcome

Apache Kafka Invades the Bay

By October 23, 2018

Last week, a sizable group of professionals descended on Pier 27 in San Francisco for the 3rd Annual Kafka Summit. StreamSets was excited to support the event, especially because of the focus our customers have on Kafka as a critical…

Adaptive Data Integration and Operations on Oracle Cloud using StreamSets

By October 17, 2018

StreamSets is pleased to announce a new partnership with Oracle Cloud Infrastructure (OCI). As enterprises move their big data workloads to the cloud, it becomes imperative that their data operations become more resilient and adaptive so that they continue to serve the business’s needs. This is why StreamSets Data Collector™ is now easily deployable on OCI.

What led us to this point? It starts with fundamental questions: ‘What good is an Enterprise Data Hub (EDH) without the most current data?’ and ‘What good is the EDH without lots of data sources feeding it?’ These lead to follow-up questions: ‘How do you manage data engineering as quickly as software development in a fast-paced DevOps world?’ and ‘How do you manage change data capture (CDC) from Oracle, streaming log files, and batch SFTP dumps without using large and confusing toolsets?’

To answer all of these questions, StreamSets has created the first complete DataOps (DevOps for data integration) platform to complement the fail-fast world of DevOps toolsets commonly found in places like a cloud-based EDH deployment. Running StreamSets in the Oracle Cloud to support a Cloudera EDH provides an excellent example of DevOps being applied to data to harness the value of a big data project.

Streamline Hybrid Cloud Data Integration with DataOps

By September 11, 2018

As cloud adoption grows, so does the complexity of the data architectures that serve as the backbone for modern enterprise applications. This complexity, if not planned for, can cripple any cloud initiative. According to research firm Gartner (subscription required), by 2021, at least 75% of large and global organizations will implement a multi-cloud capable hybrid integration platform, up from less than 25% in 2018. Taking a DataOps approach to data infrastructure can help streamline how data is moved around the business and ensure integration initiatives support the cloud-oriented goals of any organization.

DataOps in Healthcare

By August 28, 2018

In healthcare, data is delivering life-saving results with predictive capabilities that can address preventable outcomes. The intelligence guiding these initiatives relies on timely data delivery to applications and reviewers. This may involve complex, high-velocity data forms with the expectation…

StreamSets Enhances its DataOps Platform

By August 6, 2018

Today, StreamSets announced the immediate availability of StreamSets Data Collector 3.4.0 and StreamSets Control Hub 3.3.0. These enhancements are aimed at delivering a better and more connected cloud experience for users of the StreamSets Data Collector and a refined…

Automating Pipeline Development with the StreamSets SDK for Python

By May 15, 2018

When it comes to creating and managing your dataflow pipelines, the graphical user interfaces of StreamSets Control Hub and StreamSets Data Collector put the complete power of our robust Data Operations Platform at your fingertips. There are times, however, when a more programmatic approach may be needed, and those times will be significantly more enjoyable with the release of version 3.2.0 of the StreamSets SDK for Python. In this post, I’ll describe some of the SDK’s new functionality and show examples of how you can use it to enable your own data use cases.

StreamSets Announces Control Hub version 3.2

By May 14, 2018

Today we are pleased to announce the general availability of StreamSets Control Hub version 3.2. StreamSets has built the industry’s only DataOps platform. We call it DataOps because our platform makes it easy to iteratively update dataflows when technology changes.…

Using StreamSets Control Hub with Minikube

By April 26, 2018

Hari Nayak's recent blog post provides a quickstart for using StreamSets Control Hub to deploy multiple instances of StreamSets Data Collector on Google's Kubernetes Engine (GKE). This post modifies the core scripts from that project in order to run on Minikube rather than GKE. As Minikube can run…
