Throughout the month of March, G2 Crowd’s research department will share our insights around the topic of big data. Check back here, on the blog and on our social channels to read the latest.
Big data has been discussed constantly in the tech world for a number of years, but it is still as relevant as ever because it remains underutilized. The main reason businesses are still not taking advantage of what big data has to offer is that they simply haven’t figured out how to understand their data. This is where the data scientist becomes essential for companies trying to make intelligent, actionable decisions based on their vast amounts of data.
The data scientist’s role is unique because it requires a variety of skills, including computer programming, advanced knowledge of machine learning and mathematics, and sharp business acumen for making decisions. Data scientists need to be able to access and gather extremely large, unstructured datasets and manipulate them so that they are understandable to the average employee or executive. This means pulling from a data source based on specific parameters, using a programming language to find the data they need, and then building that data into understandable charts and graphs, which requires the help of a business intelligence or advanced analytics tool. Once the data is digestible, they have to be able to find trends, anomalies and other valuable insights within it that can help improve business practices.
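As a rough illustration of that workflow, the sketch below pulls rows matching a parameter and flags values that sit unusually far from the average. It is a minimal example under stated assumptions, not a description of any particular team's process: the sales figures, the `select` and `flag_anomalies` helpers, and the two-standard-deviation threshold are all hypothetical, and a real project would read from a database and use a dedicated analytics or business intelligence tool for charting.

```python
from statistics import mean, stdev

# Hypothetical daily sales figures; real data would come from a database or data lake.
EAST = [100.0, 102.0, 98.0, 101.0, 99.0, 103.0, 97.0, 250.0]
WEST = [80.0, 82.0, 79.0, 81.0]
RECORDS = (
    [{"region": "east", "day": d, "sales": s} for d, s in enumerate(EAST, start=1)]
    + [{"region": "west", "day": d, "sales": s} for d, s in enumerate(WEST, start=1)]
)


def select(records, region):
    """Pull only the rows that match the requested parameter."""
    return [r for r in records if r["region"] == region]


def flag_anomalies(records, threshold=2.0):
    """Return the days whose sales lie more than `threshold` sample
    standard deviations away from the mean of the given records."""
    values = [r["sales"] for r in records]
    if len(values) < 2:
        return []  # not enough data to estimate spread
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [r["day"] for r in records if abs(r["sales"] - mu) / sigma > threshold]


if __name__ == "__main__":
    east_rows = select(RECORDS, "east")
    print("Average east sales:", mean(r["sales"] for r in east_rows))
    print("Days worth a closer look:", flag_anomalies(east_rows))
```

A simple standard-deviation rule like this is only one of many ways to surface outliers; it is used here because it is short enough to read at a glance.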
Due to the benefits data scientists can provide, they are well compensated, with median base pay of roughly $110,000 and salaries as high as $200,000 in New York City or Silicon Valley. According to Business Insider, the data scientist role is not only considered the best tech job in America, but also the best job in any field heading into 2017. Part of the reason for the awesome salary is that there has been a massive shortage of data scientists for some time. Many businesses did not foresee the big data boom and therefore did not see the need for someone to handle all their data.
Back in 2011, the Harvard Business Review pointed out that the demand for data scientists was rising rapidly and the supply was greatly lacking. A year later, the publication further emphasized why the job is so crucial to businesses, and since then universities have been adding post-graduate data science programs. Those with computer programming or math skills have been using short online courses to boost their resumes and take advantage of the shortage. In 2012, Gartner estimated that by 2020 the shortage of data scientists would reach 100,000, and others gave even higher predictions. Today, companies are facing the reality of those estimates.
Can Companies Supplement the Need for Data Scientists?
Forced by the data scientist shortage, companies may need to look for alternative options. Although we’ve harped on the fact that artificial intelligence (AI) will not steal human jobs, in this case it could erase the need for a data scientist altogether. Machine learning algorithms can consume the enormous datasets big data offers much faster than a human can. These same algorithms could then analyze the data and surface the same kinds of trends, patterns and one-offs that a data scientist would point out. The initial challenge would be making sure the algorithm returns the data you actually want, but vendors are already advancing in this area with the use of natural language processing (NLP).
Data scientists take requests from certain departments or executives to find out specific information within the company’s data. Requesting the same thing from a machine learning algorithm can be complicated, but with NLP a user can communicate with the AI the same way they would with a data scientist, or any other human, by simply asking. IBM Watson, which has a variety of NLP functions, can interact with business intelligence tools to perform data science tasks. Similarly, Microsoft recently integrated the Cortana AI virtual assistant into its Power BI tool, so that users can interact with big data by speaking out loud. With these NLP capabilities and the growth of self-service business intelligence solutions, which allow users without data science skills to manipulate data and build visualizations, there is potential to remove the need for data scientists.
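To make the idea of "simply asking" concrete, here is a deliberately tiny sketch of turning a plain-language question into a filter and an aggregate. It is a toy keyword matcher written for illustration only and says nothing about how IBM Watson, Cortana or Power BI work internally; production NLP systems are far more sophisticated. The `SALES` data, the `AGGREGATES` vocabulary and the `answer` function are all hypothetical.

```python
import re

# Hypothetical sales rows standing in for a company's data.
SALES = [
    {"region": "east", "sales": 120.0},
    {"region": "west", "sales": 80.0},
    {"region": "east", "sales": 100.0},
    {"region": "west", "sales": 90.0},
]

# Words the toy matcher understands, mapped to the calculation they request.
AGGREGATES = {
    "total": sum,
    "average": lambda xs: sum(xs) / len(xs),
    "highest": max,
    "lowest": min,
}


def answer(question, rows):
    """Answer a plain-language question by spotting one aggregate keyword
    and, optionally, one region name that appears in the data."""
    words = re.findall(r"[a-z]+", question.lower())

    agg_name = next((w for w in words if w in AGGREGATES), None)
    if agg_name is None:
        raise ValueError("Question must mention one of: " + ", ".join(AGGREGATES))

    # If no known region is mentioned, the question applies to every row.
    known_regions = {r["region"] for r in rows}
    region = next((w for w in words if w in known_regions), None)

    values = [r["sales"] for r in rows if region is None or r["region"] == region]
    if not values:
        raise ValueError("No data available to answer the question")
    return AGGREGATES[agg_name](values)


if __name__ == "__main__":
    print(answer("What were the total sales in the east?", SALES))
    print(answer("What is the average for the west region?", SALES))
```

Even this toy version shows why the approach appeals to non-specialists: the person asking never has to write a query or know how the data is stored.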
In most circumstances, these AI functionalities will provide a tool for automating the basic tasks that a data scientist performs, the same way that AI helps HR professionals or project managers. However, with the shortage of data scientists, it is not out of the question that AI and business intelligence vendors will attempt to replace the position entirely. A company may badly want a data scientist, but at the end of the day, it would probably rather save that $200,000 a year by talking to a machine.