
StreamSets Data Integration Blog

Where change is welcome.

Elasticsearch plus StreamSets for Reliable Data Ingestion

Industry, StreamSets News

StreamSets Data Collector is open source software that lets you easily build continuous data ingestion pipelines for Elasticsearch. Because it is resistant to "data drift", StreamSets minimizes ingest-related data loss and helps keep indexes optimized, so that Elasticsearch and Kibana users can perform real-time analysis with confidence.

By Arvind Prabhakar, November 18, 2015

Ingesting Streaming Data from JMS into HDFS and Solr using StreamSets

Engineering, Use Cases, StreamSets News, StreamSets Partners

Now we’ll start publishing messages to a JMS queue. They are simple text messages with random words. Periodically the program outputs two types of bad records: records without an id message header and records with empty content. We will use two of the StreamSets error handling facilities later on to catch these bad records.

By November 10, 2015

Introducing the StreamSets Data Collector (video)

Engineering, StreamSets News

Wondering how the StreamSets Data Collector works? Have a look at this quick four-minute introduction to the software.

By October 8, 2015

What Is StreamSets?

StreamSets News

This 2015 blog post has been updated. The original post is preserved below. StreamSets is a modern data integration platform dedicated to building the smart data pipelines needed to power DataOps across hybrid and multi-cloud architectures. StreamSets was founded in 2015 by a former Cloudera engineer and Informatica product leader to better manage data integration in the modern world. By…

By October 5, 2015