Ingesting Streaming Data from JMS into HDFS and Solr using StreamSets

Ingesting Streaming Data from JMS into HDFS and Solr using StreamSets

A step-by-step walkthrough of how Mac Noland implemented StreamSets to move away from hand-coded ETL and scale out an increasingly complex ingestion pipeline. Mac is a Solution Architect for phData, a Twin Cities services firm focused on Hadoop. He has spent 17 years as a software engineer and architect for projects in the legal, accounting, risk and medical device industries.

Using Open Source StreamSets to Tackle Data Drift (video)

Watch StreamSets Field Engineer Jonathan “Natty” Natkins demonstrate how you can use the open source StreamSets Data Collector to flexibly handle painful “data drift” – the inevitable evolution of infrastructure, semantics and schema that leads to corrupted data and broken pipelines.   Download Open Source StreamSets Data Collector at www.streamsets.com/opensource.

State of the Art Data Ingestion

Forward-looking, data-driven enterprises increasingly leverage Big Data platforms, such as Hadoop, Elasticsearch and Amazon Web Services, to derive insights from non-­transactional, machine­-generated data. Many tools have emerged to power next ­generation data pipelines and provide specialized analytic capabilities. To get value from these technologies, data must reside in intermediate data stores in a consumable form. However, existing […]

Receive Updates

Receive Updates

Join our mailing list to receive the latest news from StreamSets.

You have Successfully Subscribed!