June 2017

Triggering Databricks Notebook Jobs from StreamSets Data Collector

By Pat Patterson June 21, 2017

Last December, I covered Continuous Data Integration with StreamSets Data Collector and Spark Streaming on Databricks. In StreamSets Data Collector (SDC) version 2.5.0.0 we added the Spark Executor, which allows your pipelines to trigger a Spark application running on Apache YARN or Databricks. I’m going to cover the latter in this blog post, showing you how to trigger a notebook job on Databricks from events in a pipeline, generating analyses and visualizations on demand.

Introducing the Data Collector Support Bundle

By Wagner Camarao June 13, 2017

Hi, my name is Wagner Camarao and I’m a Software Engineer at StreamSets focusing on the user-facing aspects of our products. Today I’m going to talk about a new feature in the StreamSets Data Collector to optimize the interactions with our support team.

In version 2.6.0.0 of Data Collector, we’ve added a feature called Support Bundle. It allows you to generate an archive file with the most common information required to troubleshoot various issues with Data Collector, such as precise build information, configuration, thread dump, pipeline definitions and history files, and most recent log files.

Announcing Data Collector ver 2.6.0.0

By Kirit Basu, Head of Strategy June 12, 2017

We are excited to announce version 2.6 of StreamSets Data Collector. This release adds important functionality for customers modernizing their enterprise data warehouses on Hadoop, as well as for cybersecurity, IoT and Spark use cases.

This release has 6 new features, 20 improvements and 72 bug fixes. For a full list, see What’s New. For a list of bug fixes and known issues, see the Release Notes.

Embrace Diversity in Your Data Architecture Pipelines

Data Integration

By Jonathan Natkins June 9, 2017

Over the last ten years, the data management landscape has changed dramatically — on that, I think we can all agree. The rise of big data and the new data management ecosystem has created an abundance of new patterns and…
