The Rise of the PhD in Data Science

Historically, people who pursued a PhD almost exclusively sought an academic position as an assistant professor at a research university after they successfully defended their dissertation. This was followed by five years of research and publication (“publish or perish”) as they progressed through tenure and promotion to associate professor. Academics who are not granted tenure are typically dismissed from (or voluntarily leave) the university.

While some academics will stay at the associate professor level for the remainder of their career, some will pursue promotion to "full" professor, which is almost always a function of the impact of their research (more about that later). Then they die. Really.

Tenured and tenure-track faculty are older than typical U.S. workers at analogous career points. According to the U.S. Bureau of Labor Statistics, the median age in the U.S. labor force is 42 years compared to the median tenure-track faculty age of 49 [4] (although increasingly 49 seems very young, but we digress). In addition, just 23% of all U.S. workers are 55 or older, compared to 37% of university faculty, with 13% of faculty over the age of 65 [5].

In 2018, the National Center for Education Statistics [6] reported the following distribution for faculty rank in the United States.

With the rise of data science, this progression from PhD studies to assistant professor ... associate professor ... full professor ... followed by death ... is changing.

In our PhD program, more than half of the applicants express an interest in pursuing a position in the private sector after graduation.

According to the magazine Science, the year 2017 was a milestone, with an equal number of PhDs going into the private sector as entering academia. See Figure 5.2. While the reasons for this shift are many and vary by field, the unprecedented challenges and opportunities that have emerged as the forms, volume, and velocity of data have evolved are contributing to the demand for well-trained data scientists who can engage in research, development, and innovation in a corporate environment [8].

Figure 5.2 Where PhDs work after graduation (2010-2017).

Sadly, there are still some in academia who will steer their doctoral students away from the private sector and represent that any career outside of a research university is a “consolation prize”. Case in point:

At a recent national meeting of university analytics program directors, the authors listened to a professor from a large northeastern university utter the words, "If we place PhD graduates into the private sector, we failed".

Yep. He said that out loud. While there was a combination of gasps and giggles, there were people in the room who nodded in agreement. Those nodding then proceeded to adjust the leather patches on their elbows and light their pipes.

Illustrations have been created especially for this book by Charles Larson.

To be fair to our nodding colleagues, the sentiment that placement outside of a research university equals “failure” for a PhD is grounded in the degree’s emphasis on research. And not just any research, but “peer-reviewed” research, where the investigator submits their work to their “peers” for consideration for publication in a journal or for presentation at a national or international conference. A panel of reviewers who are recognized as experts in the area in question - the “peers” - consider the submission and provide feedback. The results of this feedback include “Reject”, “Accept” (this almost never happens), or “Revise and Resubmit”. Journals (and conferences) with lower rates of acceptance are, as expected, more prestigious. The level of prestige - or in academic terms “the impact factor” - of an academic’s publications becomes one of the primary metrics upon which they are considered for university positions. The other metric is the amount of funding their research has received - more about that later.

Historically, only academics published peer-reviewed research. However, a study conducted by the authors in 2019 [9] demonstrated that over a quarter of all publications in “academic” journals aligned with analytics and data science included researchers with no academic affiliation. Many PhD researchers in innovation labs at companies like Google, Facebook, IBM, Equifax, and Hewlett Packard are publishing their work in high impact “academic” journals in data science, thereby contributing to a more permeable research membrane between academia and particularly innovative private sector companies. So, while peer-reviewed publication continues to be the “coin of the realm” for PhDs, those coins have value outside of academia.

As an example, consider Google (of course). Google maintains a repository [10] of published research from their PhD scientists in data science, computer science, engineering, and mathematics. From Google’s website:

Our teams aspire to make discoveries that impact everyone, and core to our approach is sharing our research and tools to fuel progress in the field. Our researchers publish regularly in academic journals, release projects as open source, and apply research to Google products.

Because PhD research scientists are so important to the cutting-edge research and thought leadership at Google, in 2019, they announced that they would start conferring their own PhD degrees through the Google AI research division [11]. While most organizations do not have the innovation and R&D infrastructure of Google, organizations in all sectors of the economy are increasingly seeking doctoral-level talent in analytics and data science for all of the same reasons.

Consider a second example - Amazon (of course). Amazon launched the “Amazon Research Awards” program in 2017 [12]. The program offers awards of up to $80,000 to faculty members (and doctoral students) at academic institutions worldwide for research related to topics such as Artificial Intelligence, Knowledge Management, Machine Learning, Natural Language Processing, Robotics, and Online Security and Privacy. Each funded proposal is assigned an Amazon research contact from their research division. Amazon Research Award recipients are expected to publish their research in publicly accessible peer-reviewed outlets - with the Amazon research contact. The program has also partnered with the National Science Foundation - which has historically been the primary funding source of university research - to solicit research proposals from universities to investigate “Fairness in Artificial Intelligence” [13]. Another demonstration of that increasingly permeable membrane between academia and the private (and public) sectors.

In academia, the first formal PhD program in Analytics and Data Science was launched in 2015 [14]. Since then, the number of programs offered through academic institutions has continued to grow to several dozen across the United States [15]. It is worth noting that these PhD programs (and these are PhD programs, rather than professional doctorates) are housed in different places across their respective universities. Common locations for data science programs include departments of computer science, statistics, engineering, or business. Some programs are housed in research units (i.e., centers or institutes). All programs, regardless of the academic housing unit, have the potential to be effective collaborative research partners but may have different orientations and different priorities.

For example, programs with a stronger science orientation (e.g., computer science, statistics) will likely have a bias towards more algorithmic, process, and mathematical research. Alternatively, professional doctorate programs with a stronger application orientation (e.g., business, DBAs, engineering) will likely have a bias towards more defined problem-centered research. Programs that are housed in research units typically have a mission related to a particular goal (e.g., renewable energy, manufacturing, consumer finance).

As an example, the PhD Program in Analytics and Data Science at Kennesaw State University is housed in the School of Data Science and Analytics. The doctoral program was structured to be completed in four years. See Figure 5.3.

As you consider doctoral programs for research collaboration, spend some time understanding what kind of research initiatives the students and faculty are currently focused on, how those initiatives are currently funded, and where the output of the research is being published.

Unlike one-semester projects, which are pervasive in undergraduate and master’s-level programs (see Chapters 3 and 4, respectively), productive research initiatives do not readily lend themselves to transactional models that require renegotiation every time another project is proposed; university research collaboration works best under a durable, multi-year cooperative model (research lab) that enables continuity throughout all stages of research. And longer-term engagements more easily facilitate the translation of that research into new products that drive economic growth.

Figure 5.3 Kennesaw State University PhD program in analytics and data science curriculum (78 Credit Hours).