
The DataOps Blog

Where Change Is Welcome

5 Best Practices for Building Data Pipelines


Whether you are building your very first pipeline or you’re an old pro, these best practices for building data pipelines can help you make pipelines that are easy to understand and therefore easy to maintain and extend. Design Data Pipelines for Simplicity: Reduce complexity in the design wherever possible. This is a concept borrowed from software development. When reviewing your…

By Brenna Buuck, August 24, 2022

Data Wrangling for Machine Learning


One can imagine the catastrophe of using inaccurate machine learning models in business: accidents, investment losses, and erroneous analysis. Because the use cases for machine learning algorithms are numerous and their impact can be positive or negative, much depends on the quality of the data fed into these models. Before machine learning engineers build machine learning models, the data must…

By August 16, 2022

Documenting the Steps in Your Data Migration Process


Every organization will inevitably migrate data between locations at some point. Data migration refers to the movement of data between storage locations and data platforms. For example, you might need data migration when you introduce new database systems or migrate applications from on-premises to the cloud. Before the evolution of data migration tools, the data migration process was inefficient, lengthy,…

By August 9, 2022

How Operational Data Stores (ODS) and Data Warehouses Work Together


Data lacks value until organizations can gain business intelligence and insights from it. Transforming an organization's data and maximizing its value can be challenging for most businesses. Data storage options must hold company data and be available for querying as needed. Why? Statistics indicate that fast and easy data access increases business performance by up to…

By Brenna Buuck, August 1, 2022

The Costs and Disadvantages of Building an ETL From Scratch


ETL and pipelines are at the center of DataOps as they determine a company's success in managing data. One way you can increase your chances of failing at data management is by building an ETL process from scratch without using a platform like StreamSets. In-house ETL may provide specific custom functions, but it is error-prone and requires more time to…

By July 25, 2022