How the ACT Government Uses Data Collector w/ MapR (videos)

Selvaraaju ('Selva') Murugesan is Senior Manager for Innovation and Data Analytics in the Australian Capital Territory (ACT) Government. Selva focuses on data management practices and data analytics, using StreamSets Data Collector to extract data from different databases, cleanse it on the fly, and push it to the ACT Government's Open Data Portal. Over the past few months, Selva has assembled a short playlist of videos demonstrating various aspects of Data Collector. From the basics of installation to advanced topics such as configuring impersonation for MapR-FS, Selva's videos provide a great introduction to Data Collector. We're excited to feature them in this blog post!
 

Installing StreamSets with MapR

In this first video, Selva installs Data Collector on Red Hat Enterprise Linux 7 via the full RPM package, configures Data Collector to work with MapR, and sets up an admin user.

https://www.youtube.com/watch?v=fLjTkbc5vZ8

 


Installing MapR Libraries for StreamSets Data Collector

Selva installs the necessary libraries for Data Collector to integrate with MapR 6.0.0.

https://www.youtube.com/watch?v=aR8DXYD0-sk

 


Configuring Impersonation for MapR-FS

By default, Data Collector writes to MapR-FS as the currently logged-in Data Collector user. However, you can configure MapR-FS impersonation so that data is written as the user configured in the MapR-FS destination settings.

https://www.youtube.com/watch?v=9-pwAs5AydU

 


Reading and Writing Data to the Local File System

Selva creates a simple pipeline that reads CSV data from a local file, removes most of the fields, and writes the result to another local file.

https://www.youtube.com/watch?v=Ur3Dtw5hYDs
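
In Data Collector this pipeline is assembled in the UI rather than written as code, but the transformation itself is easy to picture. The following is a minimal standalone Python sketch of the same idea (read CSV, keep a few fields, write CSV), not Data Collector's implementation. The function name `keep_fields` and the column names in the usage note are hypothetical.

```python
import csv
import io


def keep_fields(src, dst, fields):
    """Copy CSV rows from src to dst, keeping only the named columns.

    src and dst are text file objects; fields is the list of column
    names to keep, in output order. All other columns are dropped.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(
        dst,
        fieldnames=fields,
        extrasaction="ignore",  # silently drop columns not listed in fields
        lineterminator="\n",
    )
    writer.writeheader()
    row_count = 0
    for row in reader:
        writer.writerow(row)
        row_count += 1
    return row_count
```

With real files, this could be called as `keep_fields(open("in.csv", newline=""), open("out.csv", "w", newline=""), ["id", "suburb"])`, where the file and column names are placeholders.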

 


Masking Fields in the Pipeline

Data engineers often need to mask sensitive data when moving it between systems. Here, Selva shows how to use Data Collector's Field Masker processor.

https://www.youtube.com/watch?v=7kRR3HILakY
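
To make the idea of masking concrete, here is a small plain-Python sketch that replaces characters in selected fields of a record. It illustrates the general technique only and is not how the Field Masker processor is implemented or configured. The names `mask_value` and `mask_fields`, and the sample field names, are assumptions made for this example.

```python
def mask_value(value, visible_suffix=0, mask_char="x"):
    """Replace all but the last visible_suffix characters with mask_char."""
    if visible_suffix < 0:
        raise ValueError("visible_suffix must be non-negative")
    hidden = max(len(value) - visible_suffix, 0)
    return mask_char * hidden + value[hidden:]


def mask_fields(record, sensitive_fields, visible_suffix=0, mask_char="x"):
    """Return a copy of record with the named fields masked.

    record is a dict of field name to value; values of sensitive
    fields are converted to strings before masking.
    """
    return {
        name: mask_value(str(value), visible_suffix, mask_char)
        if name in sensitive_fields
        else value
        for name, value in record.items()
    }
```

Note that in this sketch a value no longer than `visible_suffix` is left unmasked, so a real pipeline would want to pick the visible length with the shortest expected value in mind.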

 


Ingesting Data from a Web Service

In what is currently the last video in the series, Selva shows how Data Collector can read CSV data from a web service and write it to a local file.

https://www.youtube.com/watch?v=pj04rVdLJ80
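
For readers who want to see the shape of this flow outside Data Collector, the sketch below fetches CSV over HTTP with Python's standard library and writes the rows to a text file object. It is an illustration under stated assumptions, not the pipeline from the video: the function name `copy_csv_from_url` is hypothetical, the response is assumed to be UTF-8, and the `opener` parameter exists only so the fetch can be swapped out for testing.

```python
import csv
import io
import urllib.request


def copy_csv_from_url(url, dst, opener=urllib.request.urlopen):
    """Fetch CSV from url and write its rows to the text file object dst.

    opener must return a binary, context-managed response object, as
    urllib.request.urlopen does. Returns the number of rows written,
    including any header row.
    """
    with opener(url) as response:
        # Decode the byte stream as text so the csv module can parse it.
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        rows = list(csv.reader(text))
    csv.writer(dst).writerows(rows)
    return len(rows)
```

To write to a local file, open it with `newline=""` and pass it as `dst`, for example `with open("out.csv", "w", newline="") as f: copy_csv_from_url("https://example.com/data.csv", f)`, where the URL and file name are placeholders.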

 


Conclusion

Many thanks to Selva for his permission to share these videos!

Is there an aspect of StreamSets Data Collector, or any of the StreamSets products, that you would like to see demonstrated in a video? Let us know in the comments!
