As a scale’s ability to correspond with other, “maximally dissimilar” (Netemeyer et al., 2003) measures of the same construct is an important scale property, the following research question was asked:

RQ 3: Does the developed measure of eWOM trust correlate significantly and considerably with other methods to measure eWOM trust?

To assess convergent validity, the eWOM trust scale was correlated with four different measures of trust in online customer reviews: (1) a non-diagnostic single-item measure (previously discussed), (2) a Likert-format, multi-item measure of overall eWOM trust, (3) a semantic differential, multi-item measure of overall eWOM trust, and (4) a qualitative measure of overall eWOM trust. Sample 4 (n = 526) provided the necessary data for the first three approaches. Table 36 reports the correlations between the new eWOM trust scale (measured by a composite value obtained by averaging the respondents’ scores across the five sub-dimensions, which were themselves calculated by averaging the dedicated observable items) and the alternative measures. The correlations were all strong (ranging from .78 to .89) and significant at the .001 level. Additionally, the strong correlations among the alternative measures themselves suggested that all approaches assess the same construct.
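The two-step averaging behind the composite value, and its correlation with an alternative measure, can be illustrated with a minimal Python sketch. The respondents, the number of items per sub-dimension, and the single-item scores are hypothetical and are not data from this study; only the scoring logic follows the description above.

```python
from statistics import mean

def composite_score(subdimensions):
    """Average the items within each sub-dimension, then average the sub-dimension means."""
    return mean(mean(items) for items in subdimensions.values())

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical answers of three respondents (0-6 scale), two items per sub-dimension.
respondents = [
    {"ability": [6, 4], "integrity": [5, 5], "benevolence": [4, 4], "rely": [6, 6], "depend": [5, 5]},
    {"ability": [3, 3], "integrity": [2, 4], "benevolence": [3, 3], "rely": [2, 2], "depend": [4, 4]},
    {"ability": [1, 1], "integrity": [2, 2], "benevolence": [1, 1], "rely": [0, 0], "depend": [1, 1]},
]
single_item = [5, 3, 2]  # hypothetical scores on an alternative single-item measure

composites = [composite_score(r) for r in respondents]
r_convergent = pearson_r(composites, single_item)
print(composites, round(r_convergent, 2))
```

Averaging within sub-dimensions first gives each sub-dimension equal weight in the composite, regardless of how many items it contains.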

To evaluate the association between the new scale and a qualitative measure of eWOM trust, a separate survey was conducted. Using a paper-and-pencil questionnaire, 117 respondents were asked to answer the following open-ended question: “Why do you trust/distrust information given in online customer reviews?” Of these business and nonbusiness students, 54% were female, and the average age was 24.1 years (ranging from 19 to 30 years). In the course of the survey, participants also had to indicate their personal level of eWOM trust by answering the 22-item eWT-S. Two independent judges, unfamiliar with the research purpose, content-coded the responses to the open-ended question on a 7-point scale from -3 (no trust), through 0 (neutral), to +3 (high trust). Subsequently, the scores of the two raters were averaged for each respondent. The correlation between the eWOM trust scale and this average value was r = .90, p < .001. Taken together, the correspondence between the eWOM trust scale and the responses to the four alternative measures strongly supported the scale’s convergent validity.

Notes: ^{1}= Internal consistency estimate (Cronbach’s alpha); *** p < .001; n.a. = not available.

Cook and Campbell (1979) claim that, to assure construct validity, the researcher has to make two different assessments: first, testing the new scale’s convergence with alternative measures of the same construct, and second, ascertaining that it discriminates from other measures that are supposed to assess related but conceptually different concepts. While research question three targeted the first assessment, the following addresses the psychometric properties of the eWOM trust scale that support its discriminant validity from two well-established constructs:

RQ 4: Is the developed measure of eWOM trust significantly different from the measurements of (a) review credibility and (b) attitude towards reviews in general?

The following discussion is conducted on two different levels of complexity: first, this thesis investigates the divergence on the overall construct level of the three concepts; after that, a more restrictive investigation targets the discriminant validity of the five sub-dimensions of eWOM trust in relation to review credibility (Rcred), as well as attitude (RAtt).

One of the most frequently used methods to test discriminant as well as convergent validity on the construct level is the multitrait-multimethod matrix (MTMM matrix) introduced by Campbell and Fiske (1959). At its core, this procedure demands that, in addition to the trait in focus, at least a second trait be measured with two alternative methods. In this study, three conceptually different traits were included: (i) trust in online reviews (eWOM trust), (ii) eWOM credibility (Rcred), and (iii) eWOM attitude in general (RAtt). These three traits were assessed with two alternative measurement methods, namely a multi-item Likert scale and a multi-item semantic differential scale. The latter was used because it is a measurement method frequently used in social research that is, at the same time, maximally different from Likert scales. The author is aware that the further inclusion of a third trait and a third measurement method would have enabled this study to investigate possible systematic influences (e.g., method effects) by means of CFA (Hildebrandt & Temme, 2006). However, the author refrained from doing so, as the burden on the survey respondents was regarded as too high.

Consumer trust in online reviews and recommendations was assessed with the new Likert-formatted eWOM trust scale, as well as with an adapted version of the scale which enabled the measurement of the same construct with a semantic differential. In both cases, the respondents’ answers to the 22 scale items were averaged in order to derive the Likert (Tl) and the semantic differential (Ts) score. The Likert score for eWOM credibility (Cl) was obtained by averaging the five scale items adapted from Boush, Friestad and Rose (1994), and the construct’s semantic differential score (Cs) by averaging the responses to the scale proposed by Beltramini and Evans (1985). The measurement scores for consumers’ eWOM attitude were produced in a similar manner. Here, seven items taken from Pollay and Mittal (1993) formed the Likert score (Al), while the average over participants’ responses to the scale introduced by Olney, Holbrook and Batra (1991) formed the semantic differential score (As). Before averaging, the scales’ unidimensionality and reliability were verified. The MTMM matrix (see Table 37) represents the (unstandardized) correlation coefficients between the scores of all measures.

                                     Method I - Likert Scale               Method II - Semantic Differential
                                     Tl          Cl          Al            Ts          Cs          As
Method I - Likert Scale
  eWOM Trust (Tl)                    (.95)^{1}
  eWOM Credibility (Cl)              .72^{2}     (.81)^{1}
  eWOM Attitude (Al)                 .79^{2}     .59^{2}     (.86)^{1}
Method II - Semantic Differential
  eWOM Trust (Ts)
  eWOM Credibility (Cs)

As mentioned, the MTMM matrix can also provide additional insights concerning a scale’s convergent validity. Campbell and Fiske (1959) state that evidence for convergent validity is provided where the coefficients of the reliability diagonal are consistently the highest in the matrix. Table 37 demonstrates that these coefficients met this standard and surpassed a desirable threshold. The two scales applied to measure eWOM trust performed especially well, as the Likert-format scale (Tl) and the semantic differential scale (Ts) exhibited the highest reliabilities: α = .95 and .96, respectively. The internal consistency of the remaining four measurement approaches ranged from .81 to .88 and, hence, also reached respectable Cronbach’s alphas. Campbell and Fiske (1959) also demand that the monotrait-heteromethod coefficients be statistically significant and sufficiently large in order to support scale convergence. Similarly, Bagozzi and Yi (1988) argue that only where this condition is fulfilled can one plausibly argue that all alternative instruments are likely to measure the same underlying construct. This condition was also fulfilled, as all monotrait-heteromethod correlations were significant (p < .001) and large. Specifically, the correlation between Tl and Ts was .86, between Cl and Cs was .75, and between Al and As was .76. While some other correlations also turned out to be large relative to other correlations in the matrix (e.g., the correlation between Ts and Al was .76), these results, together with the earlier-discussed findings, provide strong evidence for the new scale’s convergent validity.

Concerning discriminant validity, Campbell and Fiske (1959) propose that an MTMM matrix should be reviewed with respect to three criteria. First, the entries in the validity diagonal should be higher than the entries in the heteromethod block that share the same row and column. The empirical data show that the monotrait-heteromethod coefficient of eWOM trust (Tl - Ts; r = .86) is larger than the corresponding heterotrait-heteromethod coefficients, that is, r = .60 (Ts - Cl), r = .76 (Ts - Al), r = .76 (Cs - Tl), and r = .56 (As - Tl). The same held true for the other two constructs, with one exception: the monotrait-heteromethod coefficient of eWOM attitude (r = .76) was equal to, rather than higher than, the correlation between Ts and Al.

The second criterion demands that the correlations between different measures of a trait be higher than the correlations among traits which have methods in common (Campbell & Fiske, 1959). The MTMM matrix only partially passed this second discriminant validity standard. While the monotrait-heteromethod coefficient of eWOM trust (r = .86) was higher than all heterotrait-monomethod coefficients associated with this trait (.72 for Cl and Tl; .79 for Al and Tl; .81 for Cs and Ts; and .68 for As and Ts), the coefficient was sometimes lower for the other two constructs. Here, the monotrait-heteromethod coefficient of eWOM credibility (r = .75) was lower than the correlation between Cs and Ts (r = .81). Similarly, the coefficient for eWOM attitude (r = .76) was smaller than the correlation of .79 between Al and Tl.
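The first two Campbell and Fiske criteria amount to simple comparisons, which can be sketched in Python as follows. The sketch uses only the coefficients quoted in the text; coefficients that are not quoted are left out, so the check is partial by construction. The data layout and the function name are illustrative assumptions.

```python
# Coefficients as quoted in the text for each trait.
quoted = {
    "eWOM trust": {
        "validity": 0.86,                                      # Tl - Ts
        "heterotrait_heteromethod": [0.60, 0.76, 0.76, 0.56],
        "heterotrait_monomethod": [0.72, 0.79, 0.81, 0.68],
    },
    "eWOM credibility": {
        "validity": 0.75,                                      # Cl - Cs
        "heterotrait_heteromethod": [],                        # not quoted in the text
        "heterotrait_monomethod": [0.81],                      # Cs - Ts
    },
    "eWOM attitude": {
        "validity": 0.76,                                      # Al - As
        "heterotrait_heteromethod": [0.76],                    # Ts - Al
        "heterotrait_monomethod": [0.79],                      # Al - Tl
    },
}

def exceeds_all(validity, others):
    """True if the validity coefficient is strictly higher than every comparison
    value, False otherwise, and None if no comparison value is available."""
    if not others:
        return None
    return all(validity > value for value in others)

results = {
    trait: {
        "criterion_1": exceeds_all(c["validity"], c["heterotrait_heteromethod"]),
        "criterion_2": exceeds_all(c["validity"], c["heterotrait_monomethod"]),
    }
    for trait, c in quoted.items()
}

for trait, outcome in results.items():
    print(trait, outcome)
```

The strict inequality treats the tie for eWOM attitude (.76 versus .76) as not meeting the first criterion, which mirrors the exception noted in the text.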

According to Campbell and Fiske’s (1959) third criterion, the pattern of the correlations should be the same in all the monomethod and heteromethod triangles. This was hardly achieved, suggesting that some method influence existed. Although this was not pursued in this research, Hildebrandt and Temme (2006) propose several models that strive to uncover various method effects by means of confirmatory factor analysis. Some of these may also be applicable to this research.

This thesis also considered additional assessments to evaluate the scale’s discriminant validity on the construct level (using a single-method approach). Although the MTMM matrix showed that the correlations between the new Likert-formatted eWOM trust scale (Tl) and the two related constructs ranged from .56 to .79, depending on the measurement instrument, and were therefore desirably low, it is advisable to provide more restrictive evidence for the scale’s discriminant validity. To ensure that the constructs are less than perfectly related, Anderson and Gerbing (1988) recommend that the confidence interval (±2 standard errors) around the constructs’ correlations not contain a value of 1. The upper bound of the 95% confidence interval of the correlation between eWOM trust (Tl) and eWOM credibility was .76 when the construct was measured with a Likert scale and .77 when it was measured with a semantic differential. For the correlation of eWOM trust with eWOM attitude, it was .82 (Tl - Al) and .62 (Tl - As). MacKenzie et al. (2011), on the other hand, propose that testing whether the constructs’ intercorrelation is less than .71 represents a more stringent method to assess discriminant validity. Here, the (unstandardized) correlations were regularly slightly above this threshold. That is, the correlation between eWOM trust and eWOM credibility was .72 [.73], and the correlation with eWOM attitude was .79 [.56] (values for the semantic differential measures in brackets). Taken together, the above findings indicate that the three constructs are separate and that the new scale possesses discriminant validity on the construct level. However, the disadvantage of using composite scores for the involved constructs (as was done up to this point) is that this ignores measurement error. This approach is generally regarded as problematic, since the resulting attenuation of the correlations makes the constructs appear more distinct than they may be. Hence, further analyses were necessary to identify potential threats to discriminant validity on the sub-dimensional level.
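The two construct-level checks can be retraced with a short Python sketch. The text does not state which standard error was used; the sketch assumes the large-sample approximation SE = (1 - r^2) / sqrt(n - 1), under which the upper bounds round to the values reported above. The function name and the labels are illustrative.

```python
import math

def upper_bound(r, n, k=2.0):
    """Upper bound of r + k standard errors, assuming the large-sample
    approximation SE = (1 - r^2) / sqrt(n - 1) (an assumption of this sketch)."""
    se = (1.0 - r ** 2) / math.sqrt(n - 1)
    return r + k * se

n = 526  # sample 4
correlations = {"Tl-Cl": 0.72, "Tl-Cs": 0.73, "Tl-Al": 0.79, "Tl-As": 0.56}

checks = {}
for pair, r in correlations.items():
    ub = upper_bound(r, n)
    checks[pair] = {
        "upper_bound": round(ub, 2),
        "excludes_one": ub < 1.0,   # Anderson and Gerbing (1988)
        "below_.71": r < 0.71,      # MacKenzie et al. (2011)
    }
    print(pair, checks[pair])
```

All four intervals exclude 1, whereas only the Tl - As correlation falls below the stricter .71 threshold.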

To assess discriminant validity on the sub-construct level, three different tests using CFA were performed. First, for each possible pairing of constructs, a one-factor model was compared with a hypothesized two-factor model which separates the individual eWOM trust sub-dimension from eWOM credibility and eWOM attitude, respectively. For this comparison, a chi-square difference test was performed between a model allowing the phi (Φ) correlation to vary freely (i.e., the two-factor model) and a model constraining the phi correlation to unity (i.e., the one-factor model) (Anderson & Gerbing, 1988). This criterion was met by all competing models. The chi-square value of the unconstrained models was always significantly lower (p < .001) than that of the one-factor models, providing first evidence for discriminant validity on the sub-dimensional level.
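The logic of this test can be sketched in Python for a single pairing. Constraining one correlation to unity costs one degree of freedom, so the difference is evaluated against a chi-square distribution with df = 1, for which the survival function has a closed form. The chi-square values below are hypothetical and are not results of this study.

```python
import math

def chi_square_difference_df1(chi2_constrained, chi2_free):
    """Chi-square difference test with one degree of freedom.
    Uses P(X > x) = erfc(sqrt(x / 2)) for a chi-square variable with df = 1."""
    diff = chi2_constrained - chi2_free
    if diff < 0:
        raise ValueError("the constrained model cannot fit better than the free model")
    return diff, math.erfc(math.sqrt(diff / 2.0))

# Hypothetical fit of one sub-dimension paired with one related construct.
diff, p_value = chi_square_difference_df1(chi2_constrained=310.4, chi2_free=250.1)
print(round(diff, 1), p_value < 0.001)
```

A significant difference means that forcing the two factors to correlate perfectly worsens the fit, which is what discriminant validity requires.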

Next, it was tested whether the cross-construct correlations for each pairing are significantly less than one (Bagozzi & Heatherton, 1994). This was done by examining the confidence interval (CI) of the (completely standardized) correlation estimates. Support for discriminant validity was provided, as none of the confidence intervals for the pairwise correlation estimates (±2 standard errors) included the value of one. The highest upper bound was found for the correlation between integrity/honesty and eWOM credibility (.90), followed by the correlation between ability and eWOM attitude (.89). Both relationships can be attributed to the conceptual overlap discussed earlier and were therefore expected. Finally, Fornell and Larcker (1981) suggest that discriminant validity is supported if the squared phi correlation between two constructs is less than the average variance extracted (AVE) of each involved construct. Accordingly, the AVEs and the squared correlations between the individual eWOM trust dimensions and the two related constructs were compared. Here, only three of ten possible pairings passed this standard. Benevolence was the only sub-dimension that discriminated from both related constructs: its highest squared phi correlation (.32) was smaller than the AVEs (ranging from .50 to .59). The rest of the sub-dimensions seemed to be more intertwined. Apart from the results for this most restrictive criterion, the tests provided sufficient evidence to assume that all five eWOM trust sub-dimensions discriminate from both eWOM credibility (Rcred) and eWOM attitude (RAtt).
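The Fornell and Larcker (1981) comparison can be sketched in Python as follows. The benevolence check uses the values reported above (squared phi of .32, AVEs of .50 and .59); the standardized loadings and the second squared correlation are hypothetical and only illustrate how an AVE is computed and how a pairing can fail.

```python
def average_variance_extracted(std_loadings):
    """AVE as the mean of the squared completely standardized loadings."""
    return sum(loading ** 2 for loading in std_loadings) / len(std_loadings)

def passes_fornell_larcker(phi_squared, ave_a, ave_b):
    """True if the squared correlation is below the AVE of both constructs."""
    return phi_squared < min(ave_a, ave_b)

# Values reported in the text for benevolence.
benevolence_ok = passes_fornell_larcker(0.32, 0.50, 0.59)

# Hypothetical pairing with a stronger factor correlation (phi = .80).
intertwined_ok = passes_fornell_larcker(0.80 ** 2, 0.50, 0.59)

# Hypothetical standardized loadings of a three-item factor.
ave_example = average_variance_extracted([0.80, 0.75, 0.70])
print(benevolence_ok, intertwined_ok, round(ave_example, 2))
```

Because the criterion compares shared variance with the variance each construct explains in its own indicators, it is stricter than the two preceding tests.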

It may be reasonable to question whether generalized trust in online customer reviews differs from dispositional trust. Hence, the following research question was proposed:

RQ 4: (c) Is the developed measure of eWOM trust significantly different from the measurement of dispositional trust? (d) Do consumers develop trust that is specific to online customer reviews?

Empirical evidence for the important distinction between a consumer’s disposition to trust (TDispo) and his/her general tendency to trust eWOM was provided by data from this research’s main study (sample 4). In total, 526 respondents, of whom 48% were female (average age: 40 years; ranging from 16 to 74 years), answered an online questionnaire which included, besides the new eWOM trust scale and additional items measuring other constructs, nine items adapted from McKnight et al. (2004) and Gefen (2000) that were intended to measure dispositional trust. These items were measured on a 7-point Likert scale with the endpoints 0 (I strongly disagree) and 6 (I strongly agree). Higher values on this scale indicated a stronger disposition to trust. The Cronbach’s alpha of this construct was .90. Whether the two constructs were distinguishable was assessed with an approach similar to that described above.
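For readers less familiar with the reliability coefficient reported here, a minimal Python sketch of Cronbach’s alpha follows. The responses are hypothetical (five respondents, three items on the 0 to 6 scale) and are not data from this study; the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores:
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses; each inner list holds one respondent's item scores.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 6],
    [1, 2, 1],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha approaches 1 when the items covary strongly, so that most of the variance of the total score is shared across items.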

The first hint that the two constructs are distinct was obtained from an inspection of the means (composite values), which were 3.73 (SD = .89) for eWOM trust and 3.35 (SD = 1.04) for dispositional trust. These were significantly different, t(525) = 7.85, p < .001, with less trust in generalized others and more trust in information given in online customer reviews. The mean values suggest that this sample discriminated between targets of trust. Further evidence was obtained by constructing a confidence interval (±2 standard errors) around the correlation coefficient (r = .36). (The significant correlation between the two constructs (p < .001) finds support in this thesis’ nomological framework, which proposes disposition to trust as a determinant of eWOM trust. Hence, H1 is supported.) The upper bound of the 95 percent confidence interval of this correlation was .43 and, hence, did not include the value of 1. Although a correlation of less than 1 is a necessary condition for discriminant validity (Anderson & Gerbing, 1988), testing whether the construct intercorrelation is less than .71 is regarded as a more stringent method to ensure the constructs’ distinction (MacKenzie et al., 2011). While the correlation between the constructs was significant, it was far below this recommended threshold. Additional evidence for discriminant validity was gathered by means of confirmatory factor analysis (CFA). Here, a one-factor model, in which all items were assumed to load on a single factor, was compared with the hypothesized two-factor model, which separates eWOM trust from dispositional trust. Discriminant validity exists if the chi-square value of the two-factor model is significantly lower than the chi-square value of the one-factor model (Anderson & Gerbing, 1988). Table 38 presents the indicators of model performance for the alternative models. In general, the two-factor model showed satisfactory fit (χ^{2} = 1,075.77 (df = 426, p < .001); absolute fit indices: GFI = .88, AGFI = .85, RMSEA = .06, RMR = .10, SRMR = .06; incremental fit indices: CFI = .94, NNFI = .94, NFI = .91; parsimonious fit index: normed chi-square = 2.53). Additionally, the fit of the two-factor model was significantly better than the fit of the one-factor model (χ^{2} difference = 3,819.16; df difference = 8; p < .001).
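The normed chi-square values and the chi-square differences reported for the competing models follow directly from the chi-square values and degrees of freedom, as the following Python sketch retraces. The p-value uses the closed-form survival function that holds for even degrees of freedom; the dictionary layout and function names are illustrative assumptions.

```python
import math

# Chi-square value and degrees of freedom as reported for the three models.
models = {
    "null": (12050.83, 465),
    "one_factor": (4894.93, 434),
    "two_factor": (1075.77, 426),
}

def normed_chi_square(name):
    chi2, df = models[name]
    return chi2 / df

def difference(restricted, general):
    """Chi-square and df difference between a more and a less restricted model."""
    (chi2_r, df_r), (chi2_g, df_g) = models[restricted], models[general]
    return chi2_r - chi2_g, df_r - df_g

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with even df:
    exp(-x/2) * sum_{i < df/2} (x/2)^i / i!"""
    if df % 2 != 0:
        raise ValueError("closed form implemented for even df only")
    half, term, total = x / 2.0, 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

normed = {name: round(normed_chi_square(name), 2) for name in models}
diff_1_2, df_1_2 = difference("one_factor", "two_factor")
p_1_2 = chi2_sf_even_df(diff_1_2, df_1_2)
print(normed, round(diff_1_2, 2), df_1_2, p_1_2 < 0.001)
```

For a difference of this size the p-value underflows to zero in floating-point arithmetic, which is consistent with the reported p < .001.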


Model Fit Indices

Model                  Chi Square   df    p      Chi Square/df   RMSEA   NFI    NNFI   CFI    RMR    SRMR   GFI    AGFI
Null Model (N)         12,050.83    465          25.92           n.a.    n.a.   n.a.   n.a.   n.a.   n.a.   n.a.   n.a.
One Factor Model (1)    4,894.93    434   .001   11.28           .19     .59    .59    .62    .25    .13    .49    .41
Two Factor Model (2)    1,075.77    426   .001    2.53           .06     .91    .94    .94    .10    .06    .88    .85

Chi Square Difference

Competing Models   Chi Square Difference   df Difference   Sign.
(N-1)               7,155.90               31              ***
(N-2)              10,975.06               39              ***
(1-2)               3,819.16                8              ***

Table 38: Model Comparison (eWOM Trust - Disposition to Trust)

Notes: *** = p < .001; n.a. = not available.

               Factor Loading
Item           Factor 1   Factor 2   Factor 3   Factor 4   Factor 5   Communality   MSA
Dp1                                             .82                   .66           .91
Dp2                                             .79                   .69           .91
Dp3                                             .61                   .63           .93
Dp4                                             .56                   .59           .93
Dp5                                  .86                              .73           .90
Dp6                                  .85                              .72           .92
Dp7                                  .88                              .73           .89
Dp8                                  .73                              .73           .93
Dp9                                  .51                              .51           .95
Ab7            .54                                                    .69           .96
Ab8            .51                                                    .68           .97
Ab9            .49                                                    .68           .96
Ab10           .64                                                    .72           .97
Ab11           .46                                                    .64           .97
In2            .84                                                    .71           .98
In3            .90                                                    .71           .97
In4            .91                                                    .75           .97
In5            .88                                                    .75           .97
In6            .91                                                    .77           .97
In9            .76                                                    .67           .98
In10           .96                                                    .62           .97
Be1                                                        .91        .77           .89
Be2                                                        .76        .67           .94
Be3                                                        .87        .73           .90
Wi1                       .97                                         .78           .93
Wi4                       .94                                         .75           .95
Wi5                       .98                                         .76           .95
Wi8                       .94                                         .78           .95
Wi2                       .44                                         .51           .95
Wi6                       .52                                         .60           .97
Wi7                       .45                                         .58           .95
Eigenvalue     12.44      4.07       2.50       1.15       1.10
% of Variance  40.14%     13.13%     8.08%      3.72%      3.55%

Notes: Total variance explained: 68.61%; Factor loadings below .30 not shown. Extraction method: Principal Component Analysis; Rotation method: Promax with Kaiser Normalization; Rotation converged in 7 iterations; Kaiser-Meyer-Olkin Measure of Sampling Adequacy (MSA) .95 and Bartlett’s test of Sphericity: sig. .001.

Exploratory factor analysis (EFA) provides additional insight into the relationship between the two constructs. This approach can be regarded as even more demanding, because no a priori assignment of the items to their hypothesized latent constructs takes place; instead, the individual items are grouped according to the relationships inherent in the data. Hence, principal components analysis (PCA) with Promax rotation was applied to the data. An oblique rotation was used because of the theorized linear relationship between eWOM trust and disposition to trust, which is assumed to be one of its antecedents (Mayer et al., 1995; McKnight & Chervany, 2002, 2006; Rotter, 1971). The PCA resulted in five factors with eigenvalues greater than 1, explaining 68.61% of total variance (see Table 39). Factors 1, 2, and 5 represented the cognitive, behavioral and emotional aspects of eWOM trust and included solely items theorized to belong to the eWOM trust construct. That is, the ability and integrity/honesty items all loaded substantially and significantly on the first factor (loadings ranging from .46 to .96). All willingness to rely and willingness to depend items showed a significant relationship with the second factor, which represents the behavioral aspects of eWOM trust. Wi2 was the item with the weakest, but nevertheless significant, loading (.44). The emotional/social aspect of trust is mirrored in the fifth factor, which includes the three benevolence items with strong loadings (≥ .76). Disposition to trust is represented by the third and the fourth factor. According to the literature, dispositional trust consists of two separate dimensions: (1) faith in humanity and (2) trusting stance (Gefen, 2000; McKnight et al., 2002b; McKnight et al., 2004). The PCA showed that the items intended to measure faith in humanity (Dp1-Dp4) all loaded on a single factor (Factor 4), while the remaining items hypothesized to measure trusting stance (Dp5-Dp9) loaded significantly on the third factor (item loadings ranging from .51 to .88). None of the items showed notable cross-loadings. Taken together, the above results provided strong evidence in favour of a two-construct perspective.
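The eigenvalue criterion and the shares of explained variance can be retraced from the reported eigenvalues with a few lines of Python. The sketch assumes that the PCA was run on the correlation matrix of the 31 items (nine dispositional trust items plus the 22 eWOM trust items), so that the total variance equals the number of items; small deviations from the reported percentages stem from the rounding of the eigenvalues.

```python
eigenvalues = [12.44, 4.07, 2.50, 1.15, 1.10]      # Factors 1 to 5 as reported
reported_pct = [40.14, 13.13, 8.08, 3.72, 3.55]    # reported % of variance
n_items = 31                                        # 9 + 22 items (assumed total variance)

# Kaiser criterion: retain components with an eigenvalue greater than 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Share of total variance explained by each component.
pct_variance = [100.0 * ev / n_items for ev in eigenvalues]
total_pct = sum(pct_variance)

for ev, pct in zip(eigenvalues, pct_variance):
    print(f"eigenvalue {ev:5.2f} -> {pct:5.2f}% of variance")
print(f"total: {total_pct:.2f}%")
```

The fourth and fifth eigenvalues exceed 1 only narrowly, so their retention depends on the eigenvalue-greater-than-one rule being applied strictly.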

In order to address the second part of the research question (RQ4d), this research compared the new scale assessing eWOM trust (eWOMTrust) with scales targeting the measurement of trust in other objects. For this purpose, the new scale was modified by replacing the term “online customer reviews” with two other targets: friends and family members (i.e., word-of-mouth/WOM; WOMTrust) and salespersons (SPTrust). Besides these two alternative forms of market communication, trust in online advertising (OADTrust) was measured by an adapted 19-item version of the scale advanced by Soh (2007). All items were measured on a 7-point Likert scale, ranging from 0 (“I strongly disagree”) to 6 (“I strongly agree”). Consequently, three different questionnaires containing these scales were sent to a convenience sample. The survey was conducted online in spring 2013. A total of 133 usable responses were collected. Of the respondents, 73.7% were female, and the average age of the sample was 22.6 years (ranging from 18 to 29 years). In total, 45 usable responses were collected for the eWOMTrust - WOMTrust questionnaire, 42 for the eWOMTrust - SPTrust questionnaire, and 46 for the eWOMTrust - OADTrust questionnaire. For the analysis, a composite measure for each construct was calculated. All scales achieved the desired level of internal consistency, as the Cronbach’s alphas surpassed the recommended .70 threshold: .95 for trust in WOM, .98 for trust in salespersons, and .96 for trust in online advertising. The means of the various scales (with higher scores indicating higher trust) were 3.47 (SD = .78) for eWOMTrust, 4.63 (SD = .71) for WOMTrust, 3.76 (SD = 1.16) for SPTrust, and 1.89 (SD = 1.11) for OADTrust. The differences in means between eWOM trust and the other measures were all significant, with consumers having the highest trust in offline word-of-mouth and the lowest trust in online advertising. These differences suggest that the sample discriminated in their willingness to trust different objects. Additional evidence of discrimination was given by a review of the correlations between the eWOM trust scale and the other measures. The only correlation that was significantly different from zero was the moderate correlation between eWOM trust and WOM trust, r = .33, p < .05. This may be due to consumers’ perceptions that these two kinds of information have a similar basic nature. In sum, however, the results suggest that consumers who are willing to trust online reviews are not necessarily more confident in other sources of market communication. Therefore, from this perspective, the eWOM trust scale does not reflect a general tendency to trust per se, but a consumer’s tendency to trust a specific kind of market information, that is, online customer reviews and recommendations.