Mini MapR Academy: How the ACT Government Uses Data Collector w/ MapR (videos)

Posted in Data Integration, April 23, 2018

Selvaraaju (‘Selva’) Murugesan is Senior Manager for Innovation and Data Analytics in the Australian Capital Territory (ACT) Government. Selva focuses on data management practices and data analytics, using StreamSets Data Collector to extract data from different databases, cleanse it on the fly, and push it to the ACT Government’s Open Data Portal. Over the past few months, Selva has assembled a short playlist of videos demonstrating various aspects of Data Collector. From the basics of installation to advanced topics such as configuring impersonation for MapR-FS, Selva’s mini MapR Academy provides a great introduction to Data Collector. We’re excited to feature the videos in this blog post!

Get Started with the Mini MapR Academy

Installing StreamSets with MapR

In this first video, Selva installs Data Collector on Red Hat Enterprise Linux 7 via the full RPM package, configures Data Collector to work with MapR, and sets up an admin user.

Documentation

 

Installing MapR Libraries for StreamSets Data Collector

Selva installs the necessary libraries for Data Collector to integrate with MapR 6.0.0.

Documentation

 

Configuring Impersonation for MapR-FS

By default, Data Collector writes to MapR-FS as the currently logged-in Data Collector user. However, you can configure MapR-FS impersonation so that data is written as the user configured in the MapR-FS destination settings.

Documentation

 

Reading and Writing Data to the Local File System

Selva creates a simple pipeline that reads CSV data from a local file, removes most of the fields, and writes the result to another local file.
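For readers who want a feel for what such a pipeline does before watching the video, here is a minimal sketch of the same read, trim, and write flow in plain Python. It is not how Data Collector implements the pipeline; the function name `keep_fields`, the column names, and the sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io


def keep_fields(src, dst, fields_to_keep):
    """Copy CSV rows from src to dst, keeping only the named columns.

    src and dst are text file objects (hypothetical helper, for illustration).
    """
    reader = csv.DictReader(src)
    # extrasaction="ignore" silently drops every column not in fields_to_keep.
    writer = csv.DictWriter(
        dst,
        fieldnames=fields_to_keep,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Made-up sample data standing in for the local input file.
sample = (
    "id,name,email,city\n"
    "1,Ann,ann@example.com,Canberra\n"
    "2,Bob,bob@example.com,Sydney\n"
)
out = io.StringIO()
keep_fields(io.StringIO(sample), out, ["id", "city"])
print(out.getvalue())
```

To work with real files, the in-memory buffers could be swapped for `open(path, newline="")` handles.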

Documentation

 

Masking Fields in the Pipeline

Data engineers often need to mask sensitive data when moving it between systems. Here, Selva shows how to use Data Collector’s Field Masker processor.
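As a rough illustration of the idea behind field masking, the sketch below hides all but the last few characters of selected fields in a record. This is plain Python under our own assumptions, not the Field Masker processor's implementation or configuration; `mask_value`, `mask_record`, and the sample record are hypothetical.

```python
def mask_value(value, visible=4, mask_char="x"):
    """Replace all but the last `visible` characters of a string with mask_char.

    Values no longer than `visible` are masked entirely so short values
    are never exposed in full.
    """
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def mask_record(record, fields, **kwargs):
    """Return a copy of record with the named fields masked."""
    return {
        key: mask_value(val, **kwargs) if key in fields else val
        for key, val in record.items()
    }


# Made-up record for illustration only.
record = {"name": "Ann", "card": "4111111111111111"}
masked = mask_record(record, {"card"})
print(masked)
```

Masking a copy rather than mutating the input keeps the original record available to any branch of a flow that is allowed to see it.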

Documentation

 

Ingesting Data from a Web Service

In what is currently the last video in the series, Selva shows how Data Collector can read CSV data from a web service and write it to a local file.

Documentation

 

Conclusion

Many thanks to Selva for his permission to share these videos in our mini MapR Academy!

Have something to share yourself? Join us in our Community!
