Overview: You have options when bulk loading data into Redshift from relational database (RDBMS) sources. These options include manual processes or one of the numerous hosted as-a-service offerings. But if you have broader requirements than simply importing, you need another option. Your company may have requirements such as adhering to enterprise security policies which […]
Angel Alvarado is a senior software engineer at One Degree, a San Francisco-based non-profit, and also helps run the Molanco data engineering community. In his spare time, Angel enjoys playing Minecraft with his 11-year-old cousin. Recently, Angel found a fun way to combine his gaming with data engineering. This blog entry, reposted from the original with Angel’s […]
A couple of weeks ago, as May the 4th approached, a lively Star Wars debate brewed at StreamSets: “Do new school characters get as much play as old favorites like Darth Vader, Yoda and Han Solo?” “Does the Dark Side of the Force dominate the Light?” “Does Yoda prevail over Darth Vader?” It occurred to us […]
A key differentiator of StreamSets Data Collector (SDC) is that it operates in continuous mode – set a pipeline running and it will continue to read files from a directory or take messages from a queue. A Twitter conversation with Richard Tuttle, a solution architect at CRM Science, prompted me to wonder: would it be possible to ingest […]
You can now install StreamSets Data Collector in minutes using Cloudera Manager. Watch this short video clip to see how easy it is. Download Open Source StreamSets Data Collector at www.streamsets.com/opensource.
Watch StreamSets Field Engineer Jonathan “Natty” Natkins demonstrate how you can use the open source StreamSets Data Collector to flexibly handle painful “data drift” – the inevitable evolution of infrastructure, semantics and schema that leads to corrupted data and broken pipelines. Download Open Source StreamSets Data Collector at www.streamsets.com/opensource.
StreamSets is an open source, enterprise-grade, continuous big data ingest infrastructure that accelerates time to analysis by bringing unprecedented transparency and processing to data in motion. Watch co-founders Girish Pancha (CEO) and Arvind Prabhakar (CTO) talk about the problems they are trying to solve with StreamSets.