Every year, the insideBigData team puts together a list of the companies making the biggest impact on the big data space, and for the third year in a row StreamSets is one of these 50 prominent disruptors. StreamSets is ranked #22 alongside notable ecosystem pillars like Nvidia, Snowflake, Cloudera, DataRobot, and Databricks.
Opinions are abundant in today’s economy, which is why insideBigData has taken a different approach: it uses inference and machine intelligence to help choose these top players. According to the Impact 50 List:
“The selected companies come from our massive data set of vendors and industry metrics. Yes, we use machine learning to analyze the industry in a detailed manner to determine a ranking for this list. We’re using a custom RankBoost algorithm adapted specifically for the big data community along with a plethora of proprietary data sources. The rankings include an indicator for upward movement in the list and also new companies.”
How cool is that? Using machine learning to decide which companies are making the biggest impact in machine learning!
insideBigData is a publication deeply entrenched in the ecosystem of converging topics including big data, data science, and machine learning. None of its columnists is closer to the pulse of these ecosystems than managing director and practicing data scientist Daniel Gutierrez. Daniel’s experience dates back to the days when these topics first came into vogue, and his opinion reflects both that context and how the field has evolved since.
While it is hard to dive into the mind of a finely tuned algorithm, this past year has seen StreamSets become not only a de facto tool for ingestion into big data platforms but also a critical capability for developing and operating Apache Spark-native applications.
This year at DataOps Summit (a conference dedicated to the people, processes, and technology enabling agile data movement), companies like Shell talked about how StreamSets is enabling their data science teams to develop self-service data access, and about the impact this has had in accelerating exploratory data science and machine learning. Also, with the recent addition of StreamSets Transformer and custom processing stages, users are able to design pipelines that perform machine learning scoring directly in the data flow.
As StreamSets tackles the world’s most complex data problems, we hope to continue making a much-needed impact for our customers, marketplace users, and the industry at large.
To read the full list of companies and rankings, please visit the original article.