Kirit Basu

Kirit is StreamSets' Community Champion and loves to get users up and running quickly with their data ingest challenges.

Announcing Data Collector ver 2.2.0.0

And here it is, folks: the last release of 2016, StreamSets Data Collector version 2.2.0.0. We've put in a host of important new features and resolved 120+ bugs.

We're gearing up for a solid roadmap in 2017, enabling exciting new use cases and bringing in some great contributions from customers and our community.

Announcing StreamSets Data Collector version 2.0

Last October, we publicly announced StreamSets Data Collector version 1.0. Over the last 12 months we have seen an awesome (a word we don't use lightly) amount of adoption of our first product – from individual developers simplifying their day-to-day work, to small startups building the next big thing, to the very largest companies building global-scale enterprise architectures with StreamSets Data Collector at their core.

Announcing Data Collector ver 1.6.0.0

It's been a busy summer here at StreamSets: we've been enabling some exciting use cases for our customers, partners, and the community of open-source users all over the world. We are excited to announce the newest version of the StreamSets Data Collector.

This version has a host of new features and over 100 bug fixes.

Download it now.

Announcing Data Collector ver 1.5.1.0

We're happy to announce a new release of StreamSets Data Collector. This is a relatively minor mid-term update with a number of important bug fixes, yet it packs in a couple of fun features.

  • Support for Azure Blob storage using the WASB protocol. Customers can now use Data Collector to write directly to Azure HDInsight.
  • Support for Apache Solr 6 (for both standalone and cluster mode).

Please be sure to check out the Release Notes for detailed information about this release. And download the Data Collector now.

Announcing Data Collector ver 1.4.0.0

We are excited to announce the release of the next version of StreamSets Data Collector. With this release we have a number of new features and enhancements and 60+ bug fixes.

New Features:

  • SFTP/FTP origin to read files from an SFTP or FTP server.
  • MapR DB destination to write data to MapR DB using the HBase API.
  • XML Flattener processor to flatten XML data in a string field.
  • HBase Lookup processor to perform key-value lookups from HBase.
  • Redis Lookup processor to perform key-value lookups from Redis.
  • Static Lookup processor to perform key-value lookups from local memory.
  • Updated pipeline library view on the application Home page to organize and filter by pipeline labels, and start/stop multiple pipelines at once.

Improvements:

  • Support for free form labels to help organize and filter pipelines.
  • Hive Streaming support for MapR.
  • Support for Elasticsearch 2.0 and HDP 2.4.0.
  • Support for rate limiting on a pipeline to control the maximum number of records to read per second.
  • Updates to the Amazon S3 origin and destinations: the S3 origin now supports glob patterns in folder paths.
  • The Directory origin now supports reading files based on last modified timestamp, in addition to the previous lexicographically ascending filename order.
  • The Directory and File Tail origins now add source file metadata to record headers to further refine provenance tracking.
  • The HBase destination now supports a configurable time basis for the timestamp value added to each column. You can choose from processing time, record time, or system time.
  • The Kudu destination allows writing to table names based on expressions.
  • Thread pool size is now configurable through the sdc.properties file, which allows you to run several more pipelines within the same Data Collector.
  • A number of new custom metrics have been added to the monitoring console.

There are also a number of other features and improvements. Please be sure to check out the Release Notes for detailed information about this release.
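
To illustrate what the per-pipeline rate limit in the list above means in practice, here is a minimal Python sketch. It is not how Data Collector implements the feature; the `RateLimiter` class, its parameters, and the `throttled` helper are hypothetical names chosen for this example. The idea is simply to space reads so that no more than a fixed number of records pass per second.

```python
import time


class RateLimiter:
    """Hypothetical helper: spaces calls so at most max_per_second pass per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second  # minimum gap between records
        self.clock = clock                    # injectable for testing
        self.sleep = sleep                    # injectable for testing
        self.next_allowed = clock()           # earliest time the next record may pass

    def acquire(self):
        """Block until the next record may be read; return the time waited."""
        now = self.clock()
        wait = self.next_allowed - now
        if wait > 0:
            self.sleep(wait)
            now = self.next_allowed
        self.next_allowed = now + self.interval
        return max(wait, 0.0)


def throttled(records, limiter):
    """Yield records from any iterable, no faster than the limiter allows."""
    for record in records:
        limiter.acquire()
        yield record
```

For example, `throttled(source, RateLimiter(1000))` would yield at most roughly 1,000 records per second from `source`.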

Download Data Collector now.

Announcing Data Collector ver 1.3.0.0

With this release we have a number of exciting new features and integrations. And as usual, we've fixed a number of bugs.

Integrations:

Improvements:

  • Late directory support for File Tail and Directory. You can configure the origins to read from directories and files that show up after you start the pipeline.
  • With external JMX tools, you can view additional metrics for the File Tail origin that let you know how many files are pending in the directory, and how much of the active file remains to be read.
  • The Field Hasher processor now allows hashing in place, hashing to a target field or header, and hashing the entire record.
  • A couple of new processors that support encoding and decoding Base64 data.
  • The HBase destination now supports implicit field mappings.
  • The Kinesis Consumer origin now supports AWS proxy settings.
  • The JMS Consumer origin provides configurable custom JNDI properties.
  • Users with the Admin role can restart Data Collector from the console.
  • Configurable timeout for inactive user sessions.
  • REST API support for cross-origin resource sharing (CORS).
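
The three Field Hasher modes mentioned above can be pictured with a short Python sketch. This is only an illustration under assumptions: records are modeled as plain dictionaries, SHA-256 is picked arbitrarily, and the function names are hypothetical rather than part of Data Collector.

```python
import hashlib
import json


def hash_value(value, algorithm="sha256"):
    """Return the hex digest of a value's string form."""
    return hashlib.new(algorithm, str(value).encode("utf-8")).hexdigest()


def hash_in_place(record, field):
    """Replace the field's value with its hash (returns a copy)."""
    out = dict(record)
    out[field] = hash_value(record[field])
    return out


def hash_to_target(record, field, target):
    """Keep the original field and write its hash to a separate target field."""
    out = dict(record)
    out[target] = hash_value(record[field])
    return out


def hash_record(record):
    """Hash the entire record, using sorted keys so field order does not matter."""
    canonical = json.dumps(record, sort_keys=True)
    return hash_value(canonical)


record = {"user": "alice", "email": "alice@example.com"}
masked = hash_in_place(record, "email")                   # email replaced by digest
tagged = hash_to_target(record, "email", "email_hash")    # email kept, digest added
fingerprint = hash_record(record)                         # one digest for the record
```

Hashing in place suits masking sensitive values, while hashing to a target field keeps the original available for downstream stages.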

Download the Data Collector to get started now.

How Trend Micro Uses StreamSets – An Interview with the Threat Research Team

The Forward-Looking Threat Research team at Trend Micro were early adopters of StreamSets Data Collector. They use StreamSets to ingest data from a wide variety of sources to create a Threat Assessment Dashboard in Elasticsearch. In this interview, we talk with members of their team about how they evaluated StreamSets and implemented it in their production environment in a short period of time.