Response Distortions

The final predictive validity discussion concerns the persistent problem of response distortions. Personality items are designed to measure respondents’ characteristic patterns of thinking, feeling and behaving. The utility of any measure corresponds to its reliability and validity, both of which are attenuated by measurement error (i.e., measuring things other than the intended target). Self-report questionnaire items are susceptible to a wide variety of measurement errors, from differences in item and response scale interpretations to systematic differences in response styles (e.g., acquiescence; extreme responding; Furnham, 1986). Many of these sources of measurement error are well known. However, of more interest in the selection literature is measurement error arising from response distortions due to low self-awareness and deliberate faking.
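Response styles of this kind can be screened for directly in raw item data. The sketch below is a minimal illustration assuming a 1-5 Likert scale; the index definitions are our own simplifications for demonstration, not a standard scoring scheme.

```python
# Minimal sketch: crude response-style indices for a single respondent on a
# 1-5 Likert scale. The index definitions are illustrative simplifications.

def response_style_indices(responses, scale_min=1, scale_max=5):
    """Return (acquiescence, extremity) as proportions of all responses."""
    n = len(responses)
    midpoint = (scale_min + scale_max) / 2
    # Acquiescence: tendency to agree regardless of item content,
    # approximated here as the share of responses above the midpoint.
    acquiescence = sum(r > midpoint for r in responses) / n
    # Extreme responding: share of responses at either scale endpoint.
    extremity = sum(r in (scale_min, scale_max) for r in responses) / n
    return acquiescence, extremity

acq, ext = response_style_indices([5, 4, 5, 2, 5, 5, 1, 5])
print(f"acquiescence = {acq:.2f}, extremity = {ext:.2f}")  # 0.75, 0.75
```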

There are a number of excellent discussions of faking in the literature, covering areas such as how much people fake, how successful people are at faking and how faking influences reliability and validity (see Birkeland, Manson, Kisamore, Brannick & Smith, 2006; Morgeson et al., 2007a; Mueller-Hanson, Heggestad & Thornton, 2003; Ones & Viswesvaran, 1998; Tett & Christiansen, 2007), and we do not intend to reproduce those discussions here.

For the current authors, the bottom line is that response distortions such as faking almost certainly occur and that some individuals fake more than others, which is of course problematic from measurement, validity and ethical perspectives (Birkeland et al., 2006). Nevertheless, personality measures retain predictive validity, as discussed above, regardless of response distortions (Ones & Viswesvaran, 1998). Thus, personality measures are still useful in a world where response distortions are quite common. This does not mean, however, that we should ignore response distortion or, as some have suggested, see it as a desirable social skill (Morgeson et al., 2007a). Rather, we should aim to measure, model and prevent it. If we can reduce response distortions, then we may be able to improve the predictive validity of personality tests further and certainly reduce the associated ethical issues.

Numerous solutions to combat response distortions have been suggested, including social desirability scales (Feeney & Goffin, 2015), forced-choice or ipsative measures, which present candidates with multiple trait statements matched for social desirability and allow them to indicate only the one most like them (Heggestad, Morrison, Reeve & McCloy, 2006; Johnson, Wood & Blinkhorn, 1988; Meade, 2004), and imposing time limits on candidates (Holden, Wood & Tomashewski, 2001; Komar, Komar, Robie & Taggar, 2010). While each of these methods shows some promise, none has yet attracted genuinely compelling empirical support.

Surprisingly, although forced-choice personality measures appear more difficult to fake, they have little influence on social desirability ratings (Heggestad et al., 2006). The prevailing view in the organizational psychology community has also been that ipsative measures produce lower predictive validity than Likert-type formats. Recent studies, however, provide evidence to the contrary (Bartram, 2007; Salgado, Anderson & Tauriz, 2014; Salgado & Tauriz, 2014).

Forced-choice measures come in two broadly different formats: fully ipsative (e.g., rank order four items/traits beginning with the one most like you) and partially ipsative, which contain a forced-choice element while retaining some flexibility (e.g., choose from a list of four the item/trait least and most like you; see Hicks, 1970). A recent meta-analysis by Salgado and colleagues (2014) suggests that fully ipsative measures perform poorly with regard to predictive validity, but that partially ipsative measures produce impressive levels of predictive validity. Compared to validity estimates derived predominantly from Likert-type measures (Barrick et al., 2001), the validities of partially ipsative assessments of emotional stability, openness and conscientiousness are considerably larger, while those for extraversion and agreeableness are equivalent across formats (see Table 8.2).
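To make the "most/least like me" format concrete, the sketch below scores a hypothetical partially ipsative instrument in which each block contains four desirability-matched statements, each keyed to a different trait. The example blocks and the +1/-1 scoring rule are illustrative assumptions, not the scoring key of any published measure.

```python
# Illustrative scoring of a partially ipsative ("most/least like me")
# instrument. The blocks, trait keys and the +1/-1 rule are hypothetical,
# invented for this example rather than taken from any published measure.
from collections import defaultdict

def score_block(block_traits, most_idx, least_idx, totals):
    """Add +1 to the trait picked as 'most like me', -1 to 'least like me'."""
    totals[block_traits[most_idx]] += 1
    totals[block_traits[least_idx]] -= 1

# Each block lists the trait keyed by each of its four matched statements.
blocks = [
    ("conscientiousness", "extraversion", "agreeableness", "openness"),
    ("openness", "conscientiousness", "emotional_stability", "extraversion"),
]
picks = [(0, 3), (1, 2)]  # candidate's (most, least) choice per block

totals = defaultdict(int)
for traits, (most, least) in zip(blocks, picks):
    score_block(traits, most, least, totals)
print(dict(totals))
# {'conscientiousness': 2, 'openness': -1, 'emotional_stability': -1}
```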

Salgado and colleagues (2014) also examined associations within eight job roles: clerical, customer service, health, managerial, military, sales, skilled and supervisory. The primary study numbers (k = 2-11) and sample sizes (N = 171-3,007) are small and vary markedly across job roles. Equally, estimates for each of the Big Five were not available for all roles (e.g., emotional stability was not reported for customer service roles). We suggest that these notable limitations preclude firm conclusions regarding which traits best predict which role; however, the pattern of results remains very interesting. Particularly striking is the range of validities reported, which vary from 0 to 0.4 in raw correlations and from 0 to 0.7 in corrected validities. Table 8.2 includes the highest and lowest predictive validity reported for each of the Big Five. The difference in variance explained between the mean and largest validity estimates is substantial, with the largest estimates roughly 3.5 to 11.6 times as large as the mean.
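The arithmetic behind this comparison is straightforward: the r2 column in Table 8.2 is simply the squared corrected validity expressed as a percentage. A quick sketch using the corrected validities (r1) from Table 8.2:

```python
# Quick check of the variance-explained arithmetic in Table 8.2: the r2
# column is the squared corrected validity (r1) expressed as a percentage.
# Values are the mean and highest |r1| per trait from Salgado et al. (2014).

corrected = {
    "emotional stability": (0.20, 0.68),
    "extraversion":        (0.12, 0.34),
    "openness":            (0.22, 0.44),  # highest is -0.44; sign ignored here
    "agreeableness":       (0.16, 0.42),
    "conscientiousness":   (0.38, 0.71),
}
for trait, (mean_r, high_r) in corrected.items():
    mean_var, high_var = 100 * mean_r ** 2, 100 * high_r ** 2
    print(f"{trait}: mean {mean_var:.1f}% vs highest {high_var:.1f}% "
          f"({high_var / mean_var:.1f}x)")
```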

The variation in predictive validity provides compelling support for the arguments put forward in the ‘Job analysis and the selection of relevant traits’ section, specifically, that universally valid personality-performance relations do not exist and that the nature of the role moderates the correlations between personality and job performance. Further research using job-relevant, partially ipsative personality measures identified through personality-oriented job analysis or theoretical frameworks appears warranted.

The results from partially ipsative measures appear compelling. However, self-report distortions remain resilient. One potential avenue for mitigating the problems with self-ratings altogether is not to rely on them at all, but instead to have ‘others’ rate candidates’ personality. Two meta-analyses indicate that other ratings of personality might offer improved predictive validity over self-ratings. Connelly and Ones (2010) conducted a meta-analysis of 44,178 target individuals rated across 263 independent samples, each target having at least one set of other ratings for the Big Five. Similarly, Oh, Wang and Mount (2011) conducted a meta-analysis of some 2,000 target individuals from 18 independent samples. Table 8.3 displays a summary of the main findings of these meta-analyses.

Table 8.2 Mean, lowest and highest predictive validities of partially ipsative measures of the Big Five.

                                 Correlations with job performance
                                 Partially ipsative           Likert-type
Trait                            r       r1      r2           r       r1      r2
Emotional stability
  Highest: Supervisory           0.37    0.68    46.2
  Lowest: Managerial            -0.01   -0.02     0.0
  Mean                           0.11    0.20     4.0         0.09    0.10     1.0
Extraversion
  Highest: Managerial            0.21    0.34    11.6
  Lowest: Sales                  0.05    0.08     0.6
  Mean                           0.07    0.12     1.4         0.06    0.13     1.7
Openness
  Highest: Clerical             -0.27   -0.44    19.4
  Lowest: Sales                  0.11    0.17     2.9
  Mean                           0.14    0.22     4.8         0.02    0.03     0.0
Agreeableness
  Highest: Skilled               0.28    0.42    17.6
  Lowest: Managerial            -0.04   -0.07     0.5
  Mean                           0.10    0.16     2.6         0.07    0.17     2.9
Conscientiousness
  Highest: Skilled               0.43    0.71    50.4
  Lowest: Supervisory            0.09    0.18     3.2
  Mean                           0.22    0.38    14.4         0.10    0.23     5.3

Likert-type estimates taken from Barrick et al. (2001); r = uncorrected correlation; r1 = corrected for unreliability in the criterion only and indirect range restriction in the predictor; r2 = percentage of variance explained.


The predictive validities of other ratings were substantially higher than those of self-ratings, regardless of whether or what type of correction was applied. In many instances the predictive validity of other ratings is two, three or four times the magnitude of self-ratings; in the case of openness, it is six times the magnitude. The magnitudes of these relationships are impressive. If we use the estimates provided by Schmidt and Hunter (1998) as a guide, the univariate validities are equivalent to some of our most valid selection methods, the multivariate validity would no doubt surpass many of these other methods, and the potential incremental predictive validity over and above other methods is substantial.
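For readers unfamiliar with these corrections, the corrected columns in Table 8.3 rest on the classical Spearman attenuation formulas: divide the observed correlation by the square root of the relevant reliability estimates. The sketch below illustrates the mechanics only; the reliability values are assumptions chosen for demonstration, not those used in the meta-analyses.

```python
# Sketch of the classical (Spearman) attenuation corrections behind columns
# like r1 and r2 in Table 8.3: divide the observed validity by the square
# root of the relevant reliabilities. The reliability values below are
# assumptions for demonstration, not those used by Connelly and Ones (2010).
import math

def correct_criterion(r_xy, r_yy):
    """Correct for unreliability in the criterion only."""
    return r_xy / math.sqrt(r_yy)

def correct_both(r_xy, r_xx, r_yy):
    """Correct for unreliability in both predictor and criterion."""
    return r_xy / math.sqrt(r_xx * r_yy)

r_obs = 0.23  # observed validity (e.g., conscientiousness other rating)
r_yy = 0.52   # assumed reliability of supervisory job performance ratings
r_xx = 0.60   # assumed reliability of the personality ratings
print(f"criterion-corrected: {correct_criterion(r_obs, r_yy):.2f}")  # 0.32
print(f"fully corrected:     {correct_both(r_obs, r_xx, r_yy):.2f}")  # 0.41
```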

Oh and colleagues’ (2011) meta-analysis provides two further findings of particular interest for the selection domain. First, combining self-ratings with other ratings appears to be of little value, as self-reports offer little incremental predictive validity over other reports. Second, while predictive validity increases with the number of other ratings, the increment is generally small. Specifically, the increase from one to three other ratings ranges from 0.04 to 0.06 (uncorrected) and from 0.05 to 0.09 (corrected), suggesting that while multiple other ratings are optimal, the value of a single other rating is still substantial (Oh et al., 2011).
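This diminishing return is what classical psychometrics predicts: the validity of an average of k parallel ratings grows with k but is damped by the correlation among raters. A minimal sketch of the standard composite-validity formula, with the single-rater validity and inter-rater agreement chosen as illustrative assumptions:

```python
# Why extra raters help, with diminishing returns: the validity of the mean
# of k parallel ratings, given single-rater validity r and inter-rater
# correlation rho. Both input values are illustrative assumptions.
import math

def composite_validity(r_single, rho, k):
    """Validity of the average of k parallel ratings (composite formula)."""
    return k * r_single / math.sqrt(k + k * (k - 1) * rho)

r_single, rho = 0.23, 0.40  # assumed single-rater validity and agreement
for k in (1, 2, 3, 5):
    print(f"k = {k}: validity = {composite_validity(r_single, rho, k):.2f}")
# k = 1: 0.23, k = 2: 0.27, k = 3: 0.30, k = 5: 0.32 - gains shrink quickly
```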

Table 8.3 Correlations between job performance and personality as assessed by self-ratings and other ratings.

                          Correlations with job performance
                          Connelly and Ones (2010)      Oh et al. (2011)
Trait and rating type     r       r1      r2            r       r3
Emotional stability
  Other rating            0.14    0.17    0.37          0.17    0.24
  Self-rating             0.06    0.11    0.12          0.09    0.14
Extraversion
  Other rating            0.08    0.11    0.18          0.21    0.29
  Self-rating             0.06    0.11    0.12          0.06    0.09
Openness
  Other rating            0.18    0.22    0.45          0.20    0.29
  Self-rating             0.03    0.04    0.05          0.03    0.05
Agreeableness
  Other rating            0.13    0.17    0.31          0.23    0.34
  Self-rating             0.06    0.11    0.13          0.07    0.10
Conscientiousness
  Other rating            0.23    0.29    0.55          0.31    0.41
  Self-rating             0.12    0.20    0.23          0.15    0.22

r = uncorrected correlation; r1 = corrected for unreliability in the criterion only; r2 = corrected for unreliability in the predictor and criterion; r3 = corrected for unreliability in the criterion measure and range restriction in the predictor. Other ratings for Oh et al. (2011) refer to the mean predictive validity taken from three observers.

Clearly, other ratings offer a marked improvement over self-ratings in predicting job performance. One likely contributor to this improvement is that other ratings mitigate the response distortions commonly associated with self-ratings, which is highly desirable. Perhaps less desirable is the possibility that observer ratings and job performance ratings are highly correlated owing to an element of common method bias: it is plausible that other ratings of personality assess reputation and likeability, which is arguably what supervisor ratings of overall job performance also assess. Whether this shared variance is a good or a bad thing remains open to debate. Nevertheless, the results from studies of other ratings are highly promising.

 