Closing the Big Data Talent Gap

Curtis Breville

DM/IST, Dell Technologies, Denver, Colorado, USA

Manufacturing aside, predictive analytics derived from big data is affecting, and will continue to affect, every industry on earth. Artificial/Augmented Intelligence (AI) has demonstrated an ability to improve quality and efficiency across multiple industries as well as to create new business offerings. Despite the incredible promise big data brings, there has been a lack of available specialists in this discipline, causing many businesses to lose market share and customers and to miss opportunities. Eighty-four percent of organizations believe that insights generated from their data provide them with a competitive advantage (Columbus, 2014).

Let’s get to it: If there’s a lack of big data specialists, where are businesses finding their big data resources? Are they recruiting from universities, training existing employees, hiring consultants, or searching other resource pools for qualified big data specialists? The answer to this question will help businesses and schools identify opportunities to close this gap of specialists and bring new discoveries and insights, faster, to the world. Back in 2014, a lack of available big data expertise left the business world with a shortage of 1.5 million people needed to analyze the massive amount of data being generated every minute (Manyika et al., 2011; “SAP”, 2014). In 2018, almost five years later, the gap continued, with IDC reporting that the US alone had a shortage of 181,000 people to fill deep analytic roles and over 900,000 data management and interpretation roles. A LinkedIn Workforce Report written in 2018 supports the IDC numbers with a similar gap of 151,717 people with data science skills (LinkedIn, 2018). On a global scale, in 2017, PricewaterhouseCoopers (PwC) estimated that 2.7 million data science and analytics positions would be available, with a 40%-60% shortage of qualified people to fill them (Rivett, 2017).

Technical challenges have traditionally been met with a surplus of new college graduates, armed with the latest technology knowledge. Perhaps because of the relative newness of big data, it is generally accepted that most undergraduate-level college graduates lack the domain or business experience to step into a true data scientist position. Big data analytics involves the application of new technology to businesses across multiple industries to provide new insights. This requires experience with, and an understanding of, businesses and industries, which most recent graduates have yet to develop.

The lack of explanation of what companies are doing to cope with this resource shortage can be deafening, especially to those businesses desperate to gain insights. If students are not graduating with the business knowledge needed, and the new technologies, including programming languages, hardware, and software tools, are neither standardized nor common across businesses, where are organizations finding qualified specialists with the ability to apply big data to gain new and beneficial insights?

A person tasked with an understanding of how to capture, store, manipulate, cleanse, model, analyze, identify patterns, and communicate the benefits of big data is referred to as a “data scientist.” Such a person must be fluent in statistics and computer programming/coding and have strong interpersonal and communication skills. People who possess all of these qualities are often referred to as “unicorns” because of how rare it is for one person to possess all of them (Bakhshi, Mateos-Garcia, & Whitby, 2014). People who work in the big data analytic space but have only one or two of those qualities are often referred to as big data specialists, and “data scientist” is reserved for those who have all of them. Throughout this chapter, both titles will be used interchangeably.

Desperation drives innovation, right? A lack of available qualified big data specialists has forced IT Managers to find innovative ways to get the value of big data for their respective businesses. Prior to my research, little was known about how they are doing this, which methods are successful, and which are not. In response to this lack of knowledge, a qualitative exploratory case study using surveys and interviews was conducted to answer how US organizations were meeting their data science needs and to explore the insights and perceptions of IT Managers in the United States about the big data resource gap.

With these results, from the first academic research conducted to answer where organizations are finding their big data resources, businesses needing such specialists now have research-backed information available to justify recruiting efforts. Colleges and other educational institutions have a primary source of information on the big data resource gap to make more informed decisions about how to address the demand for these highly sought-after specialists.

Research Benefits | What’s in It for Me?


If you have been looking for, or will be looking for, big data specialists, you now have the answer to the question of how to find the resources to derive value from the massive amounts of internally and externally generated data that you currently are not able to analyze. Must you look to new places to find external resources? Should you invest in developing resources internally? Should you hire consultants? Where is the best place to invest to get what you are looking for? This research provides clarity on the state of big data specialist resource availability and the effectiveness of the practices being followed by IT Managers today.

New degree programs in universities require investments in resources: instructors, literature, technology, infrastructure, and university publications, just to name a few. Such investments often require justification for a new program. Prior to this research, data justifying the need for data scientists in business had been provided only through anecdotal comments and nonacademic surveys. These research-based results, defining and elaborating on the problem of a lack of qualified resources, are the seminal scholarly supporting data that academic institutions have to justify the expense needed to develop data science programs.

The State of Big Data Education

The term “unicorn” is used to describe a data scientist because of how difficult it currently is to find an individual with the analytical, technical, and business skills the role requires (Bakhshi et al., 2014, p. 32). Dwoskin (2014) stated that the two websites, LinkedIn™ and, listed approximately 30,000 openings for positions with “Data Scientist” in their titles. Although a data scientist position does not always require a postgraduate degree, it quite commonly does. In 2012, only 2500 doctoral degrees were awarded in statistics or computer science (Dwoskin, 2014). Jeonghyun Kim (2016) showed that only twenty-five schools in the United States provided postgraduate classes that were data analytic-specific. Table 10.1 lists the number of schools offering data-centric programs:

With over 4726 universities in the United States offering multiple program degrees in a multitude of disciplines, the number of advanced degrees offered in data-centric programs, as listed in Table 10.1, is tiny, but growing.

TABLE 10.1

2016 Data-Centric Programs by Academic Level

Data Scientist vs Data Analyst

“Data Scientist” is a relatively new title that differs from the traditional data analyst role held by those who find trends and model results from traditional business intelligence systems. Where data analysts use structured query language (SQL) to pull information from relational databases, data scientists use SQL as well as machine learning tools and statistical models to find correlations between different variables (Harris, Shetterley, Alter, & Schnell, 2014). SAS (2015) differentiates data scientists even from statisticians by explaining that data scientists move beyond descriptive statistics and the reporting of past results to predictive modeling of what is likely to happen in the future. Statistician and data analyst roles are traditionally processing-type positions: business leaders ask these roles for supporting statistics or diagrams. To be a data scientist, one must be front and center in the business goal discussion (Redman, 2013).
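
The contrast drawn here, between an analyst's SQL report on past results and a data scientist's statistical model of what is likely to happen, can be illustrated with a minimal sketch. It assumes Python and only its standard library; the table name, column names, and figures are invented for illustration and do not come from any of the studies cited.

```python
import math
import sqlite3

# Hypothetical sample data: (region, ad_spend, sales). Invented for illustration.
ROWS = [
    ("east", 10.0, 25.0),
    ("east", 20.0, 45.0),
    ("west", 30.0, 65.0),
    ("west", 40.0, 85.0),
]


def load(rows):
    """Create an in-memory relational table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, ad_spend REAL, sales REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def total_sales_by_region(conn):
    """Analyst-style descriptive report: a SQL aggregate over past results."""
    cur = conn.execute(
        "SELECT region, SUM(sales) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def spend_and_sales(conn):
    """Pull the two variables with SQL, as a data scientist also would."""
    rows = conn.execute("SELECT ad_spend, sales FROM sales").fetchall()
    return [r[0] for r in rows], [r[1] for r in rows]


if __name__ == "__main__":
    conn = load(ROWS)
    print("Report on past results:", total_sales_by_region(conn))

    xs, ys = spend_and_sales(conn)
    print("Correlation between spend and sales:", pearson(xs, ys))

    slope, intercept = fit_line(xs, ys)
    # Predictive step: estimate sales for a spend level not yet observed.
    print("Predicted sales at spend 50:", slope * 50.0 + intercept)
```

The first function stops at summarizing what already happened; the remaining functions look for a relationship between variables and use it to estimate an unobserved case, which is the shift from descriptive to predictive work described above. A real project would use far richer data and purpose-built modeling libraries rather than a hand-written line fit.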

Chen, Chiang, and Storey (2012) expand on the unique responsibilities of the data scientist by listing knowledge of accounting rules, finance, management practices, marketing approaches, logistic methods, and operations administration within the domain in which the data scientist is working. The result is an individual with strong math and statistics skills, excellent communication abilities, programming and software expertise, and great business awareness.

Data scientists do not just help run software against the data; they find and pull in the data sets themselves to determine whether correlations exist that can be beneficial to the organization. This process of bringing in data sets requires a data filtering process, or cleansing, to make sure the data are trustworthy.

Business skills are as critical as technical skills for data scientists (Debortoli, Muller, & Vom Brocke, 2014). This requirement, together with the previously mentioned skill set attributed to data scientists, is what has made finding individuals with all of these skills so difficult and is the reason an individual with such a broad array of skills is referred to as a “unicorn.” When it comes to studies addressing how businesses are meeting their big data needs with such a lack of data scientists, Gupta and George (2016) reveal that little is known about how organizations are achieving their big data capabilities.

Compared with so many other topics that have decades, if not centuries, of background history, big data analytics is less than two decades old and still in its infancy as a business practice. By nature, big data is a quantitative discipline, using mathematics and statistics to identify patterns in mountains of data sets of different sizes and shapes. The business value of data scientists and the business practice of finding these big data specialists, however, require a qualitative approach to understand the decisions and actions IT Managers are taking to find them. To date, no scholarly articles have been published on this specific hiring practice, making this the seminal scholarly work to answer what IT Managers are doing to fill their big data resource needs as well as what types of skills are most valuable to IT Managers for these positions.

Going back almost a decade, in 2011, the lack of big data specialists was identified as a major hurdle to businesses’ ability to exploit the benefits big data analytics could bring to them (Manyika et al., 2011). Five years later, in 2016, the demand for big data resources continued to grow faster than the supply of data scientists capable of meeting the demand; in the United Kingdom, Kim (2016) reported a continued increase in the market demand for individuals with big data skills starting in 2013 and continuing through 2020. The data also showed that most (77%) of the roles requiring big data skills were considered difficult to fill.

Five years after Manyika et al. (2011) reported on the lack of big data resources available, Henke, Bughin, and Chui (2016) claimed that most businesses were still nowhere near realizing their potential benefits from big data. A critical reason for this is that data scientists are still in high demand. According to a 2018 report by IT Pro, Europe would need 346,000 more data scientists by 2020 (IT Pro Team, 2018).
