The DataOps Blog

Create Microservice Pipelines with StreamSets Data Collector (Tutorial)

Posted in Engineering, August 8, 2018

A microservice is a lightweight component that implements a relatively small part of a larger system, for example providing access to user data. A microservice architecture comprises a set of independent microservices, often implemented as RESTful web services communicating via JSON over HTTP, that together implement a system’s functionality, rather than a single monolithic application. Think of an e-commerce web site: we might have separate microservices for searching inventory, managing the shopping cart, and recommending items based on the shopping cart’s content. Compared to monolithic applications, the microservice approach promotes fine-grained modularity, allowing agile implementation of components by independent teams, which may even use different technologies.

Now, one of those technologies can be StreamSets Data Collector. Data Collector 3.4.0, released earlier this week, introduces microservice pipelines, with a new REST Service origin and Send Response to Origin destination that allow you to implement RESTful web services completely within Data Collector.

The REST Service origin accepts requests via HTTP, parsing incoming data into an in-memory record, including metadata such as the query string and HTTP request headers. As in any other Data Collector pipeline, processors such as Field Flattener, JDBC Lookup, or any of the script Evaluators (take your pick of Groovy, JavaScript or Jython) can enrich, filter and transform the data, handing the resulting record off to the Send Response to Origin destination to be returned via HTTP as a JSON-formatted response to the original request.
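From the client's point of view, a microservice pipeline looks like any other JSON-over-HTTP service. The sketch below shows one way a Python client might build and send such a request. The URL, port, path, and payload are hypothetical, and the `X-SDC-APPLICATION-ID` header name and its value are assumptions that should be checked against the REST Service origin documentation for your pipeline.

```python
import json
import urllib.request

# Hypothetical endpoint: host, port, and path depend on how the
# REST Service origin is configured in your pipeline.
SERVICE_URL = "http://localhost:8000/rest/v1/user"


def build_request(payload, app_id="microservice"):
    """Build (but do not send) a JSON POST for a microservice pipeline."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        SERVICE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header name and value for the origin's application ID;
            # verify both against the REST Service origin documentation.
            "X-SDC-APPLICATION-ID": app_id,
        },
    )


def call_service(payload):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(payload), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `call_service({"name": "Ada", "city": "London"})` against a running pipeline would return whatever record the Send Response to Origin destination emits, decoded into a Python dictionary.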

Although any existing processors or destinations can be included in a microservice pipeline, this initial implementation is particularly well suited to integrations via JDBC. The JDBC Lookup and JDBC Tee processors can interact with a wide variety of data stores, and both return data that can enrich the record on its way through the pipeline. In fact, you can use the JDBC Lookup and JDBC Tee processors to implement create, read, update and delete (CRUD) operations, and a new tutorial guides you through the process of Creating a CRUD Microservice Pipeline. A short video accompanying this post shows the resulting pipeline and how it responds to REST requests.

Another application of microservice pipelines is as a RESTful web service proxy. A microservice pipeline can receive a request, perform lookups, rename fields, query a different service via the HTTP Client processor, and similarly process the response before passing it back to the REST client.
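The proxy flow described above can be modeled in a few lines of plain Python. This is a conceptual sketch, not Data Collector's API: each function stands in for a pipeline stage, the field names are invented for illustration, and `fake_upstream` is a hypothetical substitute for the service that the HTTP Client processor would call.

```python
import json


def rename_fields(record, mapping):
    """Return a copy of the record with keys renamed per the mapping."""
    return {mapping.get(key, key): value for key, value in record.items()}


def proxy_request(request_body, call_upstream):
    """Model a proxy pipeline: reshape, call upstream, reshape, respond.

    call_upstream stands in for the HTTP Client processor; it takes a
    dict and returns a dict.
    """
    # The origin parses the incoming JSON into a record.
    record = json.loads(request_body)
    # Rename fields to match what the upstream service expects.
    outbound = rename_fields(record, {"userId": "id"})
    # Query the other service.
    upstream_reply = call_upstream(outbound)
    # Process the upstream response before returning it to the client.
    result = rename_fields(upstream_reply, {"full_name": "name"})
    return json.dumps(result, sort_keys=True)


def fake_upstream(query):
    """Hypothetical upstream service used only for this illustration."""
    return {"id": query["id"], "full_name": "Ada Lovelace"}


response = proxy_request('{"userId": 42}', fake_upstream)
print(response)
```

Passing the upstream call in as a function keeps the sketch runnable without a network, and mirrors the way the pipeline treats the external service as just another stage.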

Because Data Collector pipelines can send data to multiple destinations, a microservice pipeline can write data almost anywhere while also returning a response to the REST client. At present, however, it is not possible to define an ordering between Send Response to Origin and another destination. This is why processors such as JDBC Tee and HTTP Client are so useful in microservice pipelines: they do their work mid-pipeline, before the record reaches the response destination. In the future, microservice pipelines will be able to indicate in the response that the record was successfully written to another destination.
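To see why a tee-style processor helps here, consider a rough Python model of a "create" request. It is only an analogy under stated assumptions: SQLite stands in for the JDBC data store, the `users` table and its columns are hypothetical, and `tee_insert` imitates the general idea of writing mid-pipeline and returning data to the record, not the JDBC Tee processor's actual behavior or configuration.

```python
import json
import sqlite3


def tee_insert(conn, record):
    """Write the record, then copy the generated key back into it.

    The write happens mid-pipeline, so the response built afterwards
    can include data returned by the database.
    """
    cursor = conn.execute(
        "INSERT INTO users (name, city) VALUES (?, ?)",
        (record["name"], record["city"]),
    )
    conn.commit()
    enriched = dict(record)
    enriched["id"] = cursor.lastrowid  # key generated by the database
    return enriched


def handle_create(conn, request_body):
    """Parse the request, run the tee step, and build the JSON response."""
    record = json.loads(request_body)
    return json.dumps(tee_insert(conn, record), sort_keys=True)


# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
reply = handle_create(conn, '{"name": "Ada", "city": "London"}')
print(reply)
```

Because the insert completes before the response is assembled, the reply can only be produced after a successful write, which is the guarantee that two unordered destinations cannot give.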

What will you implement in a microservice pipeline? Download StreamSets Data Collector, follow the tutorial, and get to work on your own use case!
