Use Cases

Using StreamSets Control Hub for Scalable Deployment via Kubernetes

In my previous blog entry, I explained how to spin up Data Collectors as Kubernetes deployments along with Dataflow Performance Manager. I recommended using a deployment with one replica as the design environment and a deployment with many replicas for execution. We recently announced StreamSets Control Hub, which makes the Kubernetes integration way smoother! StreamSets Control Hub adds […]

Getting Started with Cloudera’s Cybersecurity Solution (feat. StreamSets, Arcadia Data and Centrify)

This post was originally published on the Cloudera VISION blog by Sam Heywood. StreamSets configurations and images of Apache Spot Open Data Model ingest pipelines can be found here on GitHub. A quick conversation with most Chief Information Security Officers (CISOs) reveals they understand they need to modernize their security architecture and the correct answer […]

Scaling out StreamSets with Kubernetes

UPDATE – Since this blog post was written, StreamSets Control Hub added a Control Agent for Kubernetes that supports creating and managing Data Collector deployments and a Pipeline Designer that allows designing pipelines without having to install Data Collectors. This blog entry has full details: Using StreamSets Control Hub for Scalable Deployment via Kubernetes. In today’s microservice […]

Read and Write JSON to MapR DB with StreamSets Data Collector

MapR-DB is an enterprise-grade, high-performance NoSQL database management system. As a multi-model NoSQL database, it supports both JSON document models and wide column data models. MapR-DB stores JSON documents in tables; documents within a table in MapR-DB can have different structures. StreamSets Data Collector enables working with MapR-DB documents with its powerful schema-on-read and […]

The Challenge of Fetching Data for Apache Spot (incubating)

Reposted from the Cloudera Vision blog. What do Sony, Target and the Democratic Party have in common? Besides being well-respected brands, they’ve all been subject to some very public and embarrassing hacks over the past 24 months. Because cybercrime is no longer driven by angst-ridden teenagers but rather professional criminal organizations and state-sponsored hacker groups, the […]

Creating a Post-Lambda World with Apache Kudu

Apache Kudu and Open Source StreamSets Data Collector Simplify Batch and Real-Time Processing. As originally posted on the Cloudera VISION Blog. At StreamSets, we come across dataflow challenges for a variety of applications. Our product, StreamSets Data Collector, is an open-source any-to-any dataflow system that ensures that all your data is safely delivered in the […]
