Contemporary research on the Big Three HERS and their rankings
Rankings include a small set of indicators, whose meaning in terms of the overall educational activity of universities is questionable (Saisana, d’Hombres, & Saltelli, 2011). As discussed in earlier chapters, proxies for education quality such as the number of Nobel and Fields Medal prizes (De Witte & Hudrlikova, 2013; Billaut, Bouyssou, & Vincke, 2010; Ioannidis et al., 2007), student-staff ratios (Bekhradnia, 2017; Huang, 2012) and proportions of international staff/students (Anowar et al., 2015) are all considered to lack validity and are deemed unreliable by most analysts. Additionally, they have been shown to be inconsistent across the various rankings (Saisana et al., 2011).
A number of studies cast doubt on the statistical properties of the rankings (Harvey, 2008; Bookstein et al., 2010) irrespective of their substantive content, whilst other studies show that rankings systematically alter the representation in favour of large and/or established universities (Daraio et al., 2014; Soh, 2015). Anowar et al. (2015) critically analysed the construct validity of some of the indicators of multiple ranking systems. Both QS and THE reported strong construct validity for their opinion surveys and moderate levels for citation analyses. ARWU also reported moderate levels for its citation analyses of Nature and Science articles. However, ARWU scored high levels of construct validity on the Quality of Faculty indicator based on Nobel Prize and Fields Medal awards (Anowar et al., 2015). Interestingly, the overall score of the THE ranking, and the reputation indicators obtained through survey responses, show serious statistical problems when year-to-year shifts are examined in detail (Bookstein et al., 2010).
Soh (2015) analysed the THE criteria and discovered a high degree of multicollinearity between indicators. Multicollinearity signals considerable overlap among the indicators, such that some of them are redundant and make the overall score unstable (Soh, 2015). Soh (2015) found that the Teaching and Research indicators show the greatest multicollinearity, which implies that one of the two could be a redundant measure of academic excellence. Surprisingly, the correlation between the Citation and Research indicators was weak (Soh, 2015), which might suggest that high research productivity does not necessarily translate into publications in leading journals. The QS and ARWU criteria are yet to be analysed in this manner.
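The overlap Soh describes can be illustrated with a small synthetic sketch. The figures below are invented for illustration only, not THE's actual indicator scores: two indicators built from the same underlying signal correlate strongly (and inflate the variance inflation factor), while an independently generated indicator does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Invented indicator scores for n hypothetical universities.
teaching = rng.normal(50, 10, n)
research = teaching + rng.normal(0, 3, n)   # shares most of its signal with teaching
citations = rng.normal(60, 15, n)           # generated independently

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

r_teach_res = pearson(teaching, research)   # near 1: largely redundant indicators
r_cit_res = pearson(citations, research)    # near 0: little shared signal

# Variance inflation factor for research given teaching; values well above 5
# are a conventional red flag for multicollinearity.
vif = 1.0 / (1.0 - r_teach_res ** 2)
```

Two indicators this collinear carry almost the same information, which is exactly why one of them can be dropped (or down-weighted) without changing the ordering much, while their joint presence makes the composite score sensitive to small weighting changes.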
Furthermore, Soh (2015, 13) suggests that because THE and QS include both academic and what might be regarded as administrative measures, their overall scores can be seen as a less ‘pure’ indication of academic excellence. Whilst both systems include students and staff in their indicators, this researcher argues that, given that the academic achievement of a university depends to a large extent on the quality of its students and teachers, these deserve more attention than these ranking systems accord them. Kaychen (2013) analysed THE and ARWU and found that position in the rankings is predominantly determined by underlying factors such as age, scope, activity in the hard sciences, being a university in the U.S. or another English-speaking country, annual income, orientation towards research, and reputation. Universities may aspire to become world class, but they have control over only a limited number of factors, such as research and reputation, whilst established institutions can rely on historical indicators. Bowman and Bastedo (2011) tested the anchoring effect by examining THE data and illustrated that rankings themselves might substantially influence assessments of institutional reputation.
Kaychen (2013) examined the existence of an underlying dimension (via a Principal Component Analysis) in the variables used in the ARWU and THE rankings. The results of this study show that 73.36% of the variance of the ranking formed by combining the ARWU and THE rankings might be explained by six factors: activity in the hard sciences, annual income, being a university in the US, being in an English-speaking country other than the US, an orientation toward research, and reputation.
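The logic of this approach, extracting a few components that account for most of the variance across many indicators, can be sketched as follows. The data are synthetic (three invented latent factors driving eight invented indicators), chosen only to show how explained variance is read off a PCA; the numbers bear no relation to the 73.36% reported above.

```python
import numpy as np

rng = np.random.default_rng(1)
n_univ, n_ind = 300, 8                  # hypothetical universities x indicators

# Indicator scores driven by three invented latent factors plus a little noise.
latent = rng.normal(size=(n_univ, 3))
loadings = rng.normal(size=(3, n_ind))
X = latent @ loadings + 0.3 * rng.normal(size=(n_univ, n_ind))

# PCA via SVD of the centred data matrix.
Xc = X - X.mean(axis=0)
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s ** 2 / np.sum(s ** 2)     # variance share of each component
cum_top3 = float(np.cumsum(explained)[2])  # variance explained by the top 3
```

Because the synthetic data contain only three real factors, the first three components capture nearly all of the variance; in an analysis like Kaychen's, the analogous cumulative share tells you how few underlying dimensions the many published indicators actually measure.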
Researchers such as Sorz, Fieder, Wallner, & Seidler (2015) and Dobrota, Bulajic, Bornmann, & Jeremic (2015) address yearly data fluctuation. They argue that the THE rankings in their current form have very limited value for the management of universities ranked below 50, because the described fluctuations in rank and score probably do not reflect actual performance, meaning the results cannot be used to assess the impact of long-term strategies (Sorz et al., 2015). Dobrota et al. (2015) aim to overcome the yearly instability and subjective weighting of the QS ranking with a new weighting scheme based on a statistically grounded multivariate method. Their method produced more stable ranking results with less uncertainty and sensitivity, although their approach is still subject to further significant methodological refinement. Sorz et al. (2015) compared year-to-year fluctuations in the ARWU rankings with those in the THE, concluding that the ARWU appears more stable. Furthermore, the correlation between the ranks of THE and ARWU is very low, especially for institutions ranked below 50 (Sorz et al., 2015), which calls into question the value of rankings information in the lower levels of the league table.
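The instability described by Sorz et al. has a simple mechanical explanation: score gaps between adjacent institutions shrink rapidly down the table, so the same measurement noise moves lower-ranked institutions much further. A sketch with invented scores (not THE or ARWU data) makes the effect visible:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Invented "true quality" with geometrically shrinking gaps down the table.
quality = 100 * 0.97 ** np.arange(n)
year1 = quality + rng.normal(0, 0.5, n)      # two noisy annual measurements
year2 = quality + rng.normal(0, 0.5, n)

def ranks(scores):
    """Rank 0 = best (highest score)."""
    order = np.argsort(-scores)
    r = np.empty(len(scores), dtype=int)
    r[order] = np.arange(len(scores))
    return r

shift = np.abs(ranks(year1) - ranks(year2))  # year-to-year rank movement
top_shift = shift[:50].mean()                # institutions truly in the top 50
low_shift = shift[50:].mean()
```

With identical noise applied everywhere, institutions below rank 50 move many places between "years" while the top 50 barely move, mirroring the finding that rank fluctuations in the lower part of the table say little about actual performance change.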
Among other methodological propositions, Daraio et al. (2014) present an approach that draws on an original and comprehensive database of European university microdata, integrated with bibliometric data on scientific production, and applies recently developed techniques in efficiency analysis. De Witte and Hudrlikova (2013) suggest using an endogenous weighting system in which higher weights are given to outputs a university is relatively good at, and lower weights to outputs in which it performs relatively less well. They assert that the weights are data dependent and potentially enhance the fairness of rankings for diverse (heterogeneous) institutions. Limitations of their study include a lack of transparency and the fact that small changes in a variable may produce big changes in the ranking outputs. Goglio (2016) calls for a plurality of rankings, highlighting the numerous stakeholders with different needs and priorities and suggesting that HERS move away from a one-size-fits-all approach.
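The flavour of endogenous weighting can be conveyed with a deliberately simplified toy: each institution's weights are set proportional to its own (already normalised) indicator values, so its strengths count more. This is only a stand-in for the idea; De Witte and Hudrlikova's actual method is a more elaborate, DEA-style data-driven optimisation, and the matrix below is invented.

```python
import numpy as np

# Invented indicator matrix: rows = universities, columns = outputs
# (say teaching, research, citations), each already scaled to [0, 1].
X = np.array([
    [0.9, 0.2, 0.3],   # teaching-focused institution
    [0.3, 0.9, 0.8],   # research-intensive institution
    [0.5, 0.5, 0.5],   # balanced institution
])

def endogenous_scores(X):
    """Weight each row by its own indicator values (normalised to sum to 1),
    so outputs an institution is relatively strong in count more."""
    W = X / X.sum(axis=1, keepdims=True)   # row-specific weights
    return (W * X).sum(axis=1)

def equal_weight_scores(X):
    return X.mean(axis=1)

endo = endogenous_scores(X)
flat = equal_weight_scores(X)
```

Under this rule every institution scores at least as high as it would with equal weights (by the Cauchy-Schwarz inequality), which captures the "benefit of the doubt" spirit: specialised institutions are judged on what they do well, while a perfectly balanced institution scores the same either way.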