Dash Desai, Director of Platform and Technical Evangelism

Transformer for Snowflake

By Dash Desai June 29, 2021

The StreamSets platform provides an end-to-end enterprise solution to maximize the value of your Snowflake Data Cloud. The platform can ingest data into Snowflake using batch, streaming, and change data capture pipelines. With the preview of the StreamSets Engine for…

Announcing StreamSets Transformer Engine 4.0.0

Data Integration

Data Transformation

By Dash Desai June 24, 2021

StreamSets is excited to announce the immediate availability of StreamSets Transformer Engine 4.0.0. It is a modern ETL engine that enables developers and data engineers to build data pipelines and transformations that execute on Apache Spark. Highlights: This is our…

Alexa, Start My Data Pipeline

Change Data Capture

Data Integration

By Dash Desai April 29, 2021

Imagine asking Amazon Alexa or Google Home to run your ETL and data processing jobs and automate your data pipelines. For example, "Start my data pipeline on Amazon EMR", "How many active jobs do I have running on Databricks?", or "Stop my…

Load Change Data Capture Data from PostgreSQL to Redshift Using StreamSets

Operational Analytics

Change Data Capture

By Dash Desai April 8, 2021

Change Data Capture is becoming essential to migrating to the cloud. In this blog, I have outlined detailed explanations and steps to load Change Data Capture (CDC) data from PostgreSQL to Redshift using StreamSets Data Collector, a fast data ingestion…

Load Data from S3 to Snowflake and Use TensorFlow Model

Data Integration

Operational Analytics

Data Transformation

By Dash Desai March 11, 2021

Learn how to load data from S3 to Snowflake and serve a TensorFlow model in a data pipeline built with StreamSets Data Collector, a fast data ingestion engine, to score data as it flows from S3 to Snowflake. Data and analytics are helping us…

How To Load Data Into Google BigQuery on Dataproc and AutoML

Data Transformation

Data Integration

Cloud Data Migration

By Dash Desai February 23, 2021

Load Data Into Google BigQuery and AutoML: In this blog, we will review an ETL data pipeline in StreamSets Transformer, a Spark ETL engine, to ingest real-world data from the Fire Department of New York (FDNY) stored in Google Cloud Storage (GCS),…

StreamSets for Data Engineers: Year 2020 In Review

Data Integration

By Dash Desai December 30, 2020

Times are definitely not “normal”, and more challenging for some than others, but I hope everyone is continuing to stay safe and I wish everyone a very happy holiday season. More importantly, I also want to take this opportunity and…

13 Data Engineering Best Practices At DNB

Data Integration

Cloud Data Migration

By Dash Desai November 17, 2020

DNB is Norway's largest financial services group, and has a reputation as a trusted financial institution throughout the region. In this guest post, the DNB Data Engineering Centre of Practice team--Saleem Pothiwala, Operations Lead - Customer Insights, Jones Mabea Agwata,…

Ingest Salesforce Data Into Amazon S3 Data Lake

Batch Data Processing

Data Integration

By Dash Desai November 5, 2020

In this blog, you will learn how to ingest Salesforce data using the Bulk API (optimized to process large sets of data) and store it in an Amazon Simple Storage Service (Amazon S3) data lake using StreamSets Data Collector, a fast data…

Demystifying Kerberos Authentication on Hadoop Clusters

Data Integration

Data Transformation

By Dash Desai September 29, 2020

Guest post by Rishi Jain, Technical Support Engineer III, StreamSets. In this blog post, you'll learn the recommended way of enabling and using Kerberos authentication when running StreamSets Transformer, a modern data transformation engine, on Hadoop clusters. Generally speaking, the --proxy-user…

