The DataOps Blog

Where Change Is Welcome

Automating Kerberos KeyTab Generation for Kubernetes-based Deployments

Engineering

A major challenge when deploying dataflow pipelines on Kubernetes is how to handle the Kerberos principals and keytabs needed when pipelines write to secure Hadoop. One approach, using Kerberos keytabs for principals of the form user@REALM (without a host field), incurs security risks, as a keytab for such a principal could be used on any host in the…

By August 21, 2019

Announcing StreamSets Data Collector 3.10.0 and StreamSets Data Collector Edge 3.10.0

StreamSets News

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.10.0 and StreamSets Data Collector Edge 3.10.0. StreamSets Data Collector is a powerful design and execution engine, open source under Apache License 2.0. It moves data between any source and destination, performing transformations and pushdown analytics along the way. To download, click here. StreamSets Data…

By August 1, 2019

How DataOps is Adding Value to Data Lakes

StreamSets News, StreamSets Partners

For those of you who joined us on June 6th, you dialed into a forward-thinking conversation between three industry experts. They waxed poetic about topics including big data, DataOps, governance, data science, and more in an effort to help modern data architects and analytics professionals better understand the emerging practices and themes around DataOps. If you missed the videocast you…

By July 29, 2019

A New Definition of DataOps

Industry, StreamSets News

This is a short post, but a relevant one. Ever since DataOps got started (about 5 years ago), it hasn’t had a common, well-adopted definition. Wikipedia is partially OK at: DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. But it is not great and simply builds on…

By July 18, 2019

Enhanced Error Diagnostics in StreamSets Data Collector 3.9.0

StreamSets News

StreamSets Data Collector reads from and writes to a wide variety of data stores and messaging platforms. Any interaction with an external system brings with it the risk of an error, and error messages are often less than helpful at pinpointing the root cause of the problem. Version 3.9.0 of Data Collector, released a few weeks ago, includes an extensible…

By July 13, 2019