Advanced Analytics Archives

3 Skills You Need to Succeed in Data and Analytics Today, According to Dr. Beverly

By Judy Ko May 24, 2022

We recently kicked off our Women in DataOps series with Dr. Beverly Wright, a data and analytics thought leader with 30 years of experience. Beverly has spent many years teaching data science and analytics to undergraduates, master’s students, PhDs, and…

3 Sessions to Stream at Gartner Data & Analytics Summit

Advanced Analytics

Cloud Data Migration

By Sean Anderson April 30, 2021

This year the Gartner Data and Analytics Summit goes virtual with a strong focus on how businesses create agility and value. That means we won’t get to cross paths on the show floor or start up a conversation after a…

A Cost Comparison of a Cloudera Hadoop Cluster with StreamSets Ingestion Framework on Oracle Cloud Infrastructure

Advanced Analytics

Data Integration

Cloud Data Migration

By Mike Carley April 18, 2019

It should come as no surprise that a Hadoop cluster and the public cloud go together like peanut butter and jelly because of scale, agility, and economy. It should come as even less of a surprise that a software…

Building a Data Science Pipeline at IBM Ireland with StreamSets

Advanced Analytics

By Pat Patterson September 12, 2016

Guglielmo Iozzia, a big data infrastructure engineer on the Ethical Hacking Team at IBM Ireland, recently spoke at Hadoop User Group Ireland about building a data science pipeline using StreamSets Data Collector Engine. Afterward, I invited him to contribute a blog post outlining how he discovered StreamSets Data Collector (SDC) Engine and the kinds of problems he and his team are solving with it. Read on to discover how SDC is saving time and making life a whole lot easier for Guglielmo and his team.

Analyzing Salesforce Data with StreamSets, Elasticsearch, and Kibana

Advanced Analytics

Data Transformation

By Pat Patterson June 3, 2016

UPDATE – Salesforce origin and destination stages, as well as a destination for Salesforce Wave Analytics, were released in StreamSets Data Collector 2.2.0.0. Use the supported, shipping Salesforce stages rather than the unsupported code mentioned below!

After I published a proof-of-concept Salesforce Origin for StreamSets Data Collector (SDC), I noticed an article on the Elastic blog, Analyzing Salesforce Data with Logstash, Elasticsearch, and Kibana. In the blog entry, Elastic systems architect Russ Savage (now at Cask Data) explains the motivation for ingesting Salesforce data into Elasticsearch:

Working directly with sales and marketing operations, we outlined a number of challenges they had that might be solved with this solution. Those included:

Interactive time-series snapshot analysis across a number of dimensions. By sales rep, by region, by campaign and more.

Which sales reps moved the most pipeline the day before the end of month/quarter? What was the progression of Stage 1 opportunities over time.

Correlating data outside of Salesforce (like web traffic) to pipeline building and demand. By region/country/state/city and associated pipeline.

It’s very challenging to look back in time and see trends in the data. Many companies have configured Salesforce to save reporting snapshots, but if you’re like me, you want to see the data behind the aggregate report. I want the ability to drill down to any level of detail, for any timeframe, and find any metric. We found that Salesforce snapshots just aren’t flexible enough for that.

Since we have first-class support for Elasticsearch as a destination in SDC, I decided to recreate the use case with the Salesforce Origin and see if we could fulfill those same requirements while taking advantage of StreamSets’ interactive pipeline IDE and ability to continuously monitor origins for new data.

Integrating StreamSets with Salesforce Wave Analytics

Advanced Analytics

Operational Analytics

By Pat Patterson April 4, 2016

UPDATE – Salesforce origin and destination stages, as well as a destination for Salesforce Wave Analytics, were released in StreamSets Data Collector 2.2.0.0. Use the supported, shipping Salesforce stages rather than the unsupported code mentioned below!

In my last blog entry, I explained how you can write custom destinations to send data to systems not currently supported by StreamSets Data Collector. As you might know, my last gig was as a developer evangelist at Salesforce, so I put my experience to work writing a destination for Salesforce Wave Analytics. Now you can ingest data from any of a variety of origins, operate on it in a StreamSets pipeline, and write the results into a Wave dataset. Once the dataset is uploaded, you can combine it with CRM and other data in Salesforce for analysis.
