A key differentiator of StreamSets Data Collector (SDC) is that it operates in continuous mode – set a pipeline running and it will continue to read files from a directory or take messages from a queue. A Twitter conversation with Richard Tuttle, a solution architect at CRM Science, prompted me to wonder: would it be possible to ingest Apache Web Server log data, look up the geolocation from the client IP address, and plot the results on a map… in Minecraft?
The StreamSets Tutorial Repository gave me a huge head start – the first tutorial covers reading Apache Web Server logs, performing GeoIP lookups, and sending the results to Elasticsearch, while the second describes how SDC can publish records to an Apache Kafka queue. It was a snap to combine the two, reading web server log records, looking up the geolocation of the requester, and writing results to Kafka.
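To make the data flow concrete, here is a minimal Python sketch of what happens to a single log line on its way to Kafka. In SDC these steps are configured in the pipeline rather than hand-coded, so this is only an illustration of the record's shape, not the pipeline itself. The field names (`client_ip`, `lat`, `lon`), the `enrich` helper, and the dictionary standing in for a GeoIP database are all assumptions made for the example, and the coordinates are invented.

```python
import json
import re

# Apache Common Log Format; extra Combined-format fields after the byte
# count are simply ignored because the pattern is not anchored at the end.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def enrich(line, geo_lookup):
    """Parse one log line, add a location, and return a JSON message.

    geo_lookup is any callable mapping an IP string to (lat, lon) or None.
    It is a stand-in for a real GeoIP database lookup.
    Returns None when the line does not look like an access log entry.
    """
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    location = geo_lookup(record["client_ip"])
    if location is not None:
        record["lat"], record["lon"] = location
    return json.dumps(record)


# Hypothetical lookup table; 203.0.113.7 is a documentation-only address
# and the coordinates are made up for the example.
FAKE_GEO_DB = {"203.0.113.7": (37.77, -122.42)}

SAMPLE_LINE = (
    '203.0.113.7 - - [10/Oct/2015:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326'
)

if __name__ == "__main__":
    print(enrich(SAMPLE_LINE, FAKE_GEO_DB.get))
```

Each JSON string returned by `enrich` corresponds to one message that a pipeline like this would write to the Kafka topic.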
This architecture made it very straightforward to implement a Kafka consumer in a Minecraft plugin. The plugin creates a map of the world in Minecraft, subscribes to a Kafka topic, and renders each log record as a block of sand falling onto the map in the appropriate location. Over time, this forms a three-dimensional histogram of requests. You can see at a glance where in the world your requests are coming from, and even fly around the map to take a closer look.
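The core of the rendering step is turning a latitude and longitude into a position on the map and letting the blocks pile up. The sketch below shows one way to do that, assuming an equirectangular world map and JSON messages carrying `lat` and `lon` fields. It is written in Python for brevity and is not the plugin's code; the map dimensions, field names, and function names are hypothetical, and the Kafka consumer is replaced by a plain list of messages.

```python
import json
from collections import Counter

MAP_WIDTH = 256   # hypothetical map width in blocks (longitude axis)
MAP_HEIGHT = 128  # hypothetical map height in blocks (latitude axis)


def to_block(lat, lon, width=MAP_WIDTH, height=MAP_HEIGHT):
    """Project latitude/longitude onto an (x, z) cell of an equirectangular map.

    x grows eastward from longitude -180, z grows southward from latitude 90.
    """
    x = int((lon + 180.0) / 360.0 * width)
    z = int((90.0 - lat) / 180.0 * height)
    # Clamp so that lon == 180 or lat == -90 still lands on the map.
    return min(max(x, 0), width - 1), min(max(z, 0), height - 1)


def stack_heights(messages):
    """Count records per map cell; each count is the height of a sand column."""
    heights = Counter()
    for message in messages:
        record = json.loads(message)
        if "lat" not in record or "lon" not in record:
            continue  # no location was found for this request
        heights[to_block(record["lat"], record["lon"])] += 1
    return heights


if __name__ == "__main__":
    sample = [
        '{"lat": 37.77, "lon": -122.42}',
        '{"lat": 37.77, "lon": -122.42}',
        '{"lat": 51.51, "lon": -0.13}',
    ]
    for (x, z), height in stack_heights(sample).items():
        print(f"column at x={x}, z={z} is {height} block(s) tall")
```

In the game, each increment would correspond to dropping one more block of sand at that column, which is what produces the histogram effect over time.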
Watch the plugin in action in this short video:
What integration would you like to see in StreamSets Data Collector? Let us know in the comments!