Creating a Custom Origin for StreamSets Data Collector
Since writing tutorials for creating custom destinations and processors for StreamSets Data Collector (SDC), I've been looking for a good use case for a custom origin tutorial. It's been trickier than I expected, partly because the list of out-of-the-box origins is so extensive, and partly because the HTTP Client origin can access most web service APIs, rendering a custom origin redundant. Then, last week, StreamSets software engineer Jeff Evans suggested Git. Creating a custom origin to read the Git commit log turned out to be the perfect tutorial.
“Why?” I hear you ask. Well, there are many reasons:
- Git is familiar to most developers
- The Git commit log is an ordered sequence of entries, each with a unique identifier – the commit hash
- JGit offers an easy way to read the commit log, either in its entirety, or across a range of entries
- It's easy to create a repository and add commits to test the origin
- Git is free – and who doesn't love free?
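To make the second and third points concrete, here is a minimal sketch of why an ordered log with unique identifiers suits an origin: the reader can remember the last hash it emitted and resume from there on the next batch. It deliberately uses neither the SDC API nor JGit. The `CommitLogSketch` class, the `Commit` type, the `nextBatch` method, and the sample hashes are all hypothetical names invented for illustration, and the in-memory list stands in for a real commit log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommitLogSketch {

    /** Hypothetical stand-in for a Git commit: a unique hash plus a message. */
    static final class Commit {
        final String hash;
        final String message;

        Commit(String hash, String message) {
            this.hash = hash;
            this.message = message;
        }
    }

    /**
     * Returns up to maxBatchSize commits that follow lastOffset in the log.
     * A null or empty offset means "start from the beginning". In this
     * sketch, an offset that is not found also restarts from the beginning.
     */
    static List<Commit> nextBatch(List<Commit> log, String lastOffset, int maxBatchSize) {
        int start = 0;
        if (lastOffset != null && !lastOffset.isEmpty()) {
            for (int i = 0; i < log.size(); i++) {
                if (log.get(i).hash.equals(lastOffset)) {
                    start = i + 1; // resume just after the last commit already read
                    break;
                }
            }
        }
        int end = Math.min(log.size(), start + maxBatchSize);
        return new ArrayList<>(log.subList(start, end));
    }

    public static void main(String[] args) {
        // Oldest commit first; the hashes are made up for the example.
        List<Commit> log = Arrays.asList(
            new Commit("a1f3", "Initial commit"),
            new Commit("b27c", "Add README"),
            new Commit("c9e0", "Fix typo"));

        String offset = null;
        List<Commit> batch;
        // Read in batches of two, carrying the last hash forward as the offset.
        while (!(batch = nextBatch(log, offset, 2)).isEmpty()) {
            for (Commit c : batch) {
                System.out.println(c.hash + " " + c.message);
            }
            offset = batch.get(batch.size() - 1).hash;
        }
    }
}
```

Because the offset is just the last commit hash, the reader needs no other state to pick up where it left off, which is what makes the commit log a convenient teaching example.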
If you've been wondering how to get started writing a custom origin, wonder no more: head on over to the article, Creating a Custom StreamSets Origin, and get started today!