2018 | Page 3 of 5 | StreamSets

DataOps Principles Start to Get Attention (thanks, Gartner!)

By Clarke Patterson August 7, 2018

The fact is, our founders started our organization on the foundation of DataOps principles, and StreamSets was a DataOps company before the term was even coined in late 2015. Our founders recognized the serious operational challenges that unstructured, streaming data and hybrid cloud infrastructures would pose to enterprises used to static, batch structured data integration. Since our inception, we’ve been focused on enabling teams to operationalize data movement, and we created the StreamSets Data Integration Platform to empower customers to capitalize on a DataOps approach.

StreamSets Enhances its DataOps Platform

By Sean Anderson August 6, 2018

Today, StreamSets has announced the immediate availability of StreamSets Data Collector 3.4.0 and StreamSets Control Hub 3.3.0. These enhancements are aimed at delivering a better and more connected cloud experience for users of the StreamSets Data Collector and a refined…

Getting Started with StreamSets Control Hub (videos)

Data Integration

By Pat Patterson July 23, 2018

StreamSets solutions architect Alex Woolford is a data engineer with deep experience building robust and scalable solutions using technologies such as the StreamSets DataOps Platform, Apache Kafka, and the Cloudera and Hortonworks Hadoop distributions. In his role at StreamSets, Alex…

Synchronize HDFS Data into S3 Using the Hadoop FS Standalone Origin

Cloud Data Migration

By Ji Sun Kim July 10, 2018

Introduction: from HDFS Data to S3

I am very excited to announce the new Hadoop FS Standalone origin in StreamSets Data Collector 3.2.0.0. Data Collector has long supported the Hadoop FS origin, but only in cluster mode. The Hadoop FS (HDFS) Standalone origin does not need MapReduce or YARN installed and can run in multithreaded mode, with each thread reading one file at a time in parallel.
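To make the "one file per thread" idea concrete, here is a minimal Python sketch of that reading pattern. It is not how Data Collector implements the origin: it reads from a local directory rather than HDFS, and the names `count_records` and `read_directory` are hypothetical helpers invented for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import tempfile


def count_records(path):
    """Read a single file from start to finish; return (file name, line count)."""
    with path.open("r", encoding="utf-8") as handle:
        return path.name, sum(1 for _ in handle)


def read_directory(directory, num_threads=4):
    """Process every file in a directory with a fixed pool of worker threads.

    Each worker picks up one file at a time, so at most num_threads files
    are being read in parallel. (Illustrative only: local files, not HDFS.)
    """
    files = sorted(p for p in directory.iterdir() if p.is_file())
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return dict(pool.map(count_records, files))


if __name__ == "__main__":
    # Build a few small sample files, then read them in parallel.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for i in range(3):
            (root / f"part-{i}.txt").write_text("a\nb\n" * (i + 1), encoding="utf-8")
        print(read_directory(root, num_threads=2))
```

The pool size plays the role of a thread-count setting: raising it lets more files be read at once, while each individual file is still consumed by exactly one thread.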

Preview and Snapshot Features in StreamSets Data Collector

Data Integration

By Dash Desai July 6, 2018

Hello from your newly-appointed community champion and technical evangelist here at StreamSets! My name is Dash Desai and you will find me writing blog posts and cruising the community forums answering questions about StreamSets Data Collector as well as learning…

Using Docker Wrong: My Journey to a Better Container

Data Integration

By Pat Patterson July 3, 2018

Following on from last week's guest post from MapR's Ian Downard on integrating StreamSets Data Collector with MapR Persistent Application Client Container (PACC), MapR Distinguished Technologist John Omernik offers a cautionary tale on examining your assumptions before jumping into the world of…

Using StreamSets and MapR Together in Docker

Data Integration

By Pat Patterson June 26, 2018

Today's guest blogger is Ian Downard, a Senior Developer Evangelist at MapR Technologies. Ian focuses on machine learning and data engineering, and recently documented how he brought together the MapR Persistent Application Client Container (PACC) with StreamSets Data Collector and Docker to build pipelines…

Streaming Extreme Data Made Simple with Kinetica and StreamSets

By Pat Patterson June 21, 2018

Kinetica, just one of dozens of origins and destinations supported by StreamSets Data Collector, is a distributed, in-memory, GPU database designed for geospatial analysis, machine learning, predictive analytics, and other workloads requiring high performance parallel processing. Mathew Hawkins, a Principal Solutions Architect at Kinetica, recently…

Extract Data from Google Analytics using StreamSets Data Collector

Operational Analytics

By Pat Patterson June 19, 2018

Angel Alvarado is a senior software engineer at One Degree, a San Francisco-based non-profit, and also helps run the Molanco data engineering community. Angel previously contributed a Fun Example of Streaming Data into Minecraft; this time he gets serious with the Google Analytics API. Many…

RingCentral Scales Out Big Data Streaming with StreamSets

Stream Data Processing

By Pat Patterson June 14, 2018

RingCentral is an award-winning global provider of cloud-unified communications and collaboration solutions. RingCentral solutions empower today’s mobile and distributed workforces to be connected anywhere and on any device through voice, video, team messaging, collaboration, SMS, conferencing, online meetings, contact center,…
