Group Differences with Blocking

This chapter concerns testing and estimation of differences in location among multiple groups, as raised in the previous chapter, in the presence of blocking. Techniques useful when data are known to be approximately Gaussian are first reviewed, for comparison purposes. The simplest example of blocking is that of paired observations; paired observations are considered after Gaussian techniques. More general multi-group inference questions are addressed later.

Gaussian Theory Approaches

Techniques appropriate when observations are well-approximated by a Gaussian distribution are well-established. These are first reviewed to give a point of departure for nonparametric techniques, which are the focus of this volume.

Paired Comparisons

Suppose pairs of values $(X_{i1}, X_{i2})$ are observed on $n$ subjects, indexed by $i$, and suppose further that $(X_{i1}, X_{i2})$ is independent of $(X_{j1}, X_{j2})$ for $i, j \in \{1, \ldots, n\}$, $i \neq j$, and that all of the vectors $(X_{i1}, X_{i2})$ have the same distribution. Under the assumption that the observations have a Gaussian distribution, one often calculates the differences $Z_i = X_{i1} - X_{i2}$ and applies either the testing procedure of §2.1.2, or the associated confidence interval procedure. Such a test is called a paired t-test.
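The chapter's examples use R, but the computation itself is simple; as an illustrative sketch (the function name and data below are my own, not from the text), the paired t statistic can be computed directly from the differences:

```python
import math
from statistics import mean, stdev

def paired_t(x1, x2):
    """Paired t statistic: difference the pairs, then one-sample t on the differences."""
    z = [a - b for a, b in zip(x1, x2)]
    n = len(z)
    return mean(z) / (stdev(z) / math.sqrt(n))

# hypothetical paired measurements on 5 subjects
t = paired_t([5, 7, 9, 11, 13], [4, 5, 6, 7, 8])
```

The p-value then comes from the t distribution with $n - 1$ degrees of freedom.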

Multiple Group Comparisons

Suppose one observes independent random variables $X_{kli}$, with $k \in \{1, \ldots, K\}$, $l \in \{1, \ldots, L\}$, and $i \in \{1, \ldots, M_{kl}\}$. Here index $k$ represents treatment, and index $l$ represents block. A test of equality of treatment means is desired, without an assumption about the effect of block. When these variables are Gaussian, and have equal variance, one can construct a statistic analogous to the $F$ test (4.3), in which the numerator is a quadratic form in differences between the group-specific means $\bar{X}_{k\cdot\cdot} = \sum_{l=1}^{L} \sum_{i=1}^{M_{kl}} X_{kli}/M_{k\cdot}$ and the overall mean. Formulas are available for the statistic in closed form only when the replicate numbers satisfy

$$M_{kl} = M_{k\cdot} M_{\cdot l}/M_{\cdot\cdot}$$

for $M_{k\cdot} = \sum_{l=1}^{L} M_{kl}$, $M_{\cdot l} = \sum_{k=1}^{K} M_{kl}$, and $M_{\cdot\cdot} = \sum_{k=1}^{K} \sum_{l=1}^{L} M_{kl}$.

Nonparametric Paired Comparisons

Suppose pairs of values $(X_{i1}, X_{i2})$ are observed on $n$ subjects, indexed by $i$, and suppose further that $(X_{i1}, X_{i2})$ is independent of $(X_{j1}, X_{j2})$ for $i, j \in \{1, \ldots, n\}$, $i \neq j$, and that all of the vectors $(X_{i1}, X_{i2})$ have the same distribution. Consider the null hypothesis that the marginal distribution of the $X_{i1}$ is the same as that of the $X_{i2}$, versus the alternative hypothesis that the distributions are different. This null hypothesis often, but not always, implies that the differences $Z_i = X_{i1} - X_{i2}$ have zero location. One might test these hypotheses by calculating the differences $Z_i$ between values, and applying the one-sample technique of Chapter 2, the sign test; this reduces the problem to one already solved.
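The sign-test reduction can be sketched in a few lines (Python for illustration; the helper name is hypothetical): drop zero differences, count the positive ones, and compute an exact two-sided binomial p-value.

```python
from math import comb

def sign_test(z):
    """Exact two-sided sign test on differences z, dropping zeros."""
    z = [v for v in z if v != 0]
    n, s = len(z), sum(v > 0 for v in z)
    # two-sided p-value: double the smaller binomial tail, capped at 1
    tail = min(s, n - s)
    p = 2 * sum(comb(n, k) for k in range(tail + 1)) / 2 ** n
    return min(p, 1.0)
```

Under the null, each nonzero difference is positive with probability 1/2, which is all the binomial calculation uses.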

After applying the differencing operation, one might expect the differences to be more symmetric than either of the two original variables, and one might exploit this symmetry. To derive a test under the assumption of symmetry, let $R_j$ be the rank of $|Z_j|$ among the absolute values $|Z_1|, \ldots, |Z_n|$. Let

$$S_j = \begin{cases} 1 & \text{if } Z_j > 0 \\ 0 & \text{otherwise.} \end{cases}$$

Define the Wilcoxon signed-rank test in terms of the statistic

$$T_{SR} = \sum_{j=1}^{n} S_j R_j.$$

Under the null hypothesis that the distribution of the $X_{i1}$ is the same as the distribution of the $X_{i2}$, and again assuming symmetry of the differences, $(S_1, \ldots, S_n)$ and $(R_1, \ldots, R_n)$ are independent random vectors, because $S_j$ and $|Z_j|$ are pairwise independent under $H_0$.

Components of the random vector $(R_1, \ldots, R_n)$ are dependent, and hence calculation of the variance of $T_{SR}$ via (5.2) requires calculation of the variance of a sum of identically distributed but not independent random variables. An alternative formulation, as the sum of independent but not identically distributed random variables, will prove more tractable. Let

$$V_j = \begin{cases} j & \text{if the difference with absolute rank } j \text{ is positive} \\ 0 & \text{otherwise.} \end{cases}$$

Hence $T_{SR} = \sum_{j=1}^{n} V_j$, and the null expectation and variance are

$$\mathrm{E}[T_{SR}] = \sum_{j=1}^{n} j/2 = n(n+1)/4, \qquad (5.3)$$

$$\mathrm{Var}[T_{SR}] = \sum_{j=1}^{n} j^2/4 = n(n+1)(2n+1)/24. \qquad (5.4)$$
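These moments give the usual large-sample test; a minimal Python sketch (illustrative names, assuming no ties and no zero differences) computes $T_{SR}$ and its normal-approximation z value:

```python
import math

def signed_rank_z(z):
    """Wilcoxon signed-rank statistic and its normal-approximation z value."""
    n = len(z)
    # ranks of |z_j| (assumes no ties or zeros, as in the derivation above)
    order = sorted(range(n), key=lambda j: abs(z[j]))
    rank = [0] * n
    for r, j in enumerate(order, start=1):
        rank[j] = r
    t_sr = sum(rank[j] for j in range(n) if z[j] > 0)
    e = n * (n + 1) / 4                  # null expectation
    v = n * (n + 1) * (2 * n + 1) / 24   # null variance
    return t_sr, (t_sr - e) / math.sqrt(v)
```

For example, `signed_rank_z([1, -2, 3, -4, 5])` sums the ranks 1, 3, and 5 of the positive differences.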
One can also calculate exact probabilities for $T_{SR}$ recursively, as one could for the two-sample statistic. There are $2^n$ ways to assign signs to ranks $1, \ldots, n$. Let $f(t, n)$ be the number of such assignments yielding $T_{SR} = t$ with $n$ observations. Again, as in §3.4.1, summing the counts for shorter random vectors with alternative final values,

$$f(t, n) = f(t, n-1) + f(t - n, n-1),$$

with $f(t, 0)$ equal to 1 if $t = 0$ and 0 otherwise, and $f(t, n) = 0$ for $t < 0$. This provides a recursion that can be used to calculate exact p-values.
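The recursion translates directly into a short dynamic program; the following Python sketch (my own, not the text's code) tabulates $f(t, n)$ and converts counts to exact lower-tail probabilities:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(t, n):
    """Number of sign assignments to ranks 1..n with signed-rank statistic t."""
    if n == 0:
        return 1 if t == 0 else 0
    if t < 0:
        return 0
    # rank n is either negative (contributes 0) or positive (contributes n)
    return f(t, n - 1) + f(t - n, n - 1)

def p_le(t, n):
    """Exact P(T_SR <= t) under the null."""
    return sum(f(s, n) for s in range(t + 1)) / 2 ** n
```

For $n = 10$ this gives, for instance, $P(T_{SR} \le 8) = 25/1024 \approx 0.0244$.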

Example 5.2.1 Consider data calculated on the size of brains of twins (Tramo et al., 1998). This data set from

http://lib.stat.cmu.edu/datasets/IQ_Brain_Size

contains data on 10 sets of twins. Each child is represented by a separate line in the data file, for a total of 20 lines. We investigate whether brain volume (in field 9) is influenced by birth order (in field 4). Brain volumes for the first and second child, and their difference, are given in Table 5.1. The rank sum statistic is $3 + 4 + 10 + 8 + 7 = 32$, the null expected rank sum is $10 \times 11/4 = 27.5$, the null variance is $10 \times 11 \times 21/24 = 96.25$, and so the two-sided p-value is $2 \times \Phi(-(32 - 0.5 - 27.5)/\sqrt{96.25}) = 2 \times \Phi(-0.408) = 0.683$. There is no evidence that twin brain volume differs by birth order. This calculation might also have been done in R using



TOTSA=0,TOTVOL=0,WEIGHT=0),skip=27,nmax=20))
fir<-twinbrain[twinbrain$ORDER==1,]
fir$v1<-fir$TOTVOL
sec<-twinbrain[twinbrain$ORDER==2,]
sec$v2<-sec$TOTVOL

brainpairs<-merge(fir,sec,by="PAIR")[,c("v1","v2")]
brainpairs$diff<-brainpairs$v2-brainpairs$v1
wilcox.test(brainpairs$diff)

giving an exact p-value of 0.695. Compare these to the results of the sign test and t-test:

TABLE 5.1: Twin brain volume

[Table 5.1 listed the first-child volume, the second-child volume, and their difference for each of the 10 twin pairs; the entries did not survive extraction.]
library(BSDA)#Needed for the sign test.
SIGN.test(brainpairs$diff)
t.test(brainpairs$diff)

giving p-values of 1 and 0.647 respectively.
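The arithmetic in Example 5.2.1 is easy to re-check; here is a Python transcription (illustrative only) of the normal-approximation calculation, with $\Phi$ built from the error function:

```python
import math

def phi(x):
    """Standard Gaussian CDF via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

t_sr = 3 + 4 + 10 + 8 + 7                      # ranks of the positive differences
e = 10 * 11 / 4                                # null expectation, 27.5
v = 10 * 11 * 21 / 24                          # null variance, 96.25
p = 2 * phi(-(t_sr - 0.5 - e) / math.sqrt(v))  # continuity-corrected two-sided p
```

This reproduces the two-sided p-value of roughly 0.683.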

As with the extension from two-sample testing to multi-sample testing referred to in §3.4.2, one can extend the other rank-based modifications of §3.4.2 and §3.6 to the blocking context as well. Ties may be handled by using mean ranks, and testing against a specific distribution for the alternative may be tuned using the appropriate scores. The more general score statistic for paired data is $T_{GP} = \sum_j a_j S_j$; in this case, $\mathrm{E}[T_{GP}] = \sum_j a_j/2$, and $\mathrm{Var}[T_{GP}] = \sum_j a_j^2/4$.
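A compact sketch of this general score statistic (Python for illustration; the function name is mine, and ties and zero differences are assumed absent):

```python
import math

def paired_score_test(z, scores):
    """Score statistic sum_j a_j S_j, with score a_j attached to absolute rank j.

    `scores` is the vector (a_1, ..., a_n), ordered by absolute rank.
    """
    n = len(z)
    order = sorted(range(n), key=lambda j: abs(z[j]))
    t = sum(scores[r] for r, j in enumerate(order) if z[j] > 0)
    e = sum(scores) / 2            # null expectation
    v = sum(a * a for a in scores) / 4  # null variance
    return t, (t - e) / math.sqrt(v)

# with a_j = j this reduces to the Wilcoxon signed-rank statistic
t, z_value = paired_score_test([1, -2, 3, -4, 5], [1, 2, 3, 4, 5])
```

Normal or Savage scores are obtained by passing the appropriate score vector, as in the R example below.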

Example 5.2.2 Perform the asymptotic score tests on the brain volume differences of Example 5.2.1.


cat("Asymptotic test using normal scores ")
brainpairs$normalscores<-qqnorm(seq(length(brainpairs$diff)))$x
symscorestat(brainpairs$diff,brainpairs$normalscores)
cat("Asymptotic test using savage scores ")
brainpairs$savagescores<-cumsum(
   1/rev(seq(length(brainpairs$diff))))
symscorestat(brainpairs$diff,brainpairs$savagescores)

giving p-values of 0.863 and 0.730 respectively.

Permutation testing can also be done, using raw data values as scores.

This procedure uses the logic that $X_j$ and $Y_j$ having the same marginal distribution implies that $Y_j - X_j$ has a symmetric distribution. Joint distributions exist for which this is not true, but these examples tend to be contrived.

Estimating the Population Median Difference

Estimation of the median of the observations, which is their point of symmetry, based on inversion of the Wilcoxon signed-rank statistic mirrors that based on inversion of the Mann-Whitney-Wilcoxon statistic of §3.10. Let $T_{SR}(\theta)$ be the Wilcoxon signed-rank statistic calculated from the data set $Z_j(\theta) = Z_j - \theta = Y_j - X_j - \theta$, after ranking the $Z_j(\theta)$ by their absolute values, and summing the ranks with positive values of $Z_j(\theta)$. Estimate $\theta$ as the quantity $\hat{\theta}$ equating $T_{SR}$ to the median of its sampling distribution; that is, $\hat{\theta}$ satisfies $T_{SR}(\hat{\theta}) = n(n+1)/4$.

After sorting the values $Z_i$, denote the value at position $j$ in the ordered list by the order statistic $Z_{(j)}$. One can express $\hat{\theta}$ in terms of the $Z_{(j)}$, by considering the behavior of $T_{SR}(\theta)$ as $\theta$ varies. For $\theta > Z_{(n)}$, $T_{SR}(\theta) = 0$. When $\theta$ decreases past $Z_{(n)}$, the $Z_i - \theta$ with the lowest absolute value switches from negative to positive, and $T_{SR}(\theta)$ moves up to 1. When $\theta$ decreases past $(Z_{(n)} + Z_{(n-1)})/2$, then $Z_{(n)} - \theta$ goes from having the smallest absolute value to having the second smallest absolute value, and $T_{SR}$ goes to 2. The next jump in $T_{SR}$ occurs either when one more observation becomes positive (at $Z_{(n-1)}$) or when the absolute value of the lowest shifted observation passes the absolute value of the next observation (at $(Z_{(n)} + Z_{(n-2)})/2$).

Generally, the jumps happen at averages of two observations, including averages of an observation with itself. These averages are called Walsh averages. They play the same role as differences in the two-sample case.

First, note that $T_{SR}$ is the number of positive Walsh averages. This can be seen by letting $Z_{[j]}$ be the observations ordered by absolute value, and letting $W_{ij} = (Z_{[i]} + Z_{[j]})/2$ for $i \le j$. Suppose that $Z_{[j]} > 0$. Then $Z_{[i]} + Z_{[j]} > 0$ for $i < j$, and so all $W_{ij} > 0$ for $i < j$, and $W_{jj} > 0$. On the other hand, if $Z_{[j]} < 0$, then $Z_{[i]} + Z_{[j]} < 0$ for $i < j$, and then all $W_{ij} < 0$ for $i < j$, and $W_{jj} < 0$.
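The identity between $T_{SR}$ and the count of positive Walsh averages is easy to verify numerically; a short Python check (illustrative, with made-up differences):

```python
def t_sr(z):
    """Signed-rank statistic via ranks of absolute values (no ties assumed)."""
    order = sorted(range(len(z)), key=lambda j: abs(z[j]))
    return sum(r for r, j in enumerate(order, start=1) if z[j] > 0)

def positive_walsh(z):
    """Count of positive Walsh averages (z_i + z_j)/2 over i <= j."""
    n = len(z)
    return sum(z[i] + z[j] > 0 for i in range(n) for j in range(i, n))

diffs = [1, -2, 3, -4, 5]   # made-up differences
```

Both computations agree on any tie-free set of differences.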

$\hat{\theta}$ has half of the Walsh averages below it, and half above, and so $\hat{\theta}$ is the median of the Walsh averages. This estimator is given by Hodges and Lehmann (1963), in the same paper giving the analogous estimator for the two-sample context of §3.10, and is called the Hodges-Lehmann estimator.
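The resulting estimator is one line in most languages; a Python sketch (the function name is mine) of the Hodges-Lehmann estimate as the median of the Walsh averages:

```python
from statistics import median

def hodges_lehmann(z):
    """Median of the n(n+1)/2 Walsh averages of the differences z."""
    n = len(z)
    walsh = [(z[i] + z[j]) / 2 for i in range(n) for j in range(i, n)]
    return median(walsh)
```

The pairs include each observation averaged with itself, giving $n(n+1)/2$ averages in all.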

Example 5.2.3 Walsh averages may be extracted in R using

aves<-outer(brainpairs$diff,brainpairs$diff,"+")/2
sort(aves[upper.tri(aves,diag=TRUE)])

to obtain the $10 \times (10 + 1)/2 = 55$ pairwise averages

Their median is observation 28, which is 10.0. Estimate the median difference as 10.0.

Confidence Intervals

Confidence intervals may also be constructed using the device of (1.16), similarly as with the one-sample location interval construction of §2.3.3 and the two-sample location shift interval construction of §3.10. Let $W_{(j)}$ be the ordered Walsh averages. Find the largest $t_l$ such that $P[T_{SR} < t_l] \le \alpha/2$; then $t_l$ is the $\alpha/2$ quantile of the distribution of the signed-rank statistic. Using (5.3) and (5.4), one might approximate the critical value using a Gaussian approximation $t_l \approx n(n+1)/4 - z_{\alpha/2}\sqrt{n(n+1)(2n+1)/24}$; note that this approximation uses the distribution of the signed-rank statistic, which does not depend on the distribution of the underlying data as long as symmetry holds. Recall that $z_\beta$ is the $1 - \beta$ quantile of the standard Gaussian distribution, for any $\beta \in (0,1)$; in particular, $z_{\alpha/2}$ is positive for any $\alpha \in (0,1)$, and $z_{0.05/2} = 1.96$. By symmetry, $P[T_{SR} > t_u] \le \alpha/2$ for $t_u = n(n+1)/2 - t_l + 1$.

As noted above, $T_{SR}(\theta)$ jumps by one each time $\theta$ passes a Walsh average. Hence the confidence interval has Walsh averages as its endpoints; by symmetry, this interval is $(W_{(t_l)}, W_{(t_u)})$. See Figure 5.1.
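Putting the pieces together, a Python sketch (illustrative; it hard-codes the 95% Gaussian critical value 1.96 rather than using an exact signed-rank quantile) reads the interval off the ordered Walsh averages:

```python
import math

def walsh_interval(z):
    """Approximate 95% CI for the center of symmetry from ordered Walsh averages."""
    n = len(z)
    walsh = sorted((z[i] + z[j]) / 2 for i in range(n) for j in range(i, n))
    # Gaussian approximation to the alpha/2 quantile of the signed-rank statistic
    t_l = math.floor(n * (n + 1) / 4
                     - 1.96 * math.sqrt(n * (n + 1) * (2 * n + 1) / 24))
    t_l = max(t_l, 1)                       # guard for very small n
    t_u = n * (n + 1) // 2 - t_l + 1
    return walsh[t_l - 1], walsh[t_u - 1]   # 1-based order statistics
```

For very small samples the approximation degenerates to the full range of the Walsh averages.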

FIGURE 5.1: Construction of Median Estimator

Example 5.2.4 Refer again to the brain volume data of Example 5.2.1. Find a 95% confidence interval for the difference in brain volume. The 0.025 quantile of the Wilcoxon signed-rank statistic with 10 observations is $t_l = 9$; this can be calculated from R using

qsignrank(0.025, 10)

and confidence interval endpoints are observations 9 and $55 + 1 - 9 = 47$. As tabulated above, Walsh averages 9 and 47 are $-32.0$ and $49.5$ respectively. Hence the confidence interval is $(-32.0, 49.5)$. This might have been calculated directly in R using

wilcox.test(brainpairs$diff,conf.int=TRUE)
Similar techniques to those of this section were used in §2.3.3 to give confidence intervals for the median, and in §3.10 to give confidence intervals for median differences. In each case, a statistic dependent on the unknown parameter was constructed, and estimates and confidence intervals were constructed by finding values of the parameter equating the statistic to appropriate quantiles. However, the treatment of this section and that of §2.3.3 differ, in that the resulting statistic in this section is non-increasing in $\theta$, while in §2.3.3 it was non-decreasing. The increasing parameterization of §2.3.3 was necessary to allow the same construction to be used in §2.3.4, when quantiles other than the median were estimated; these quantile estimates are an increasing function of each observation separately. A parallel construction might have been used in the current section, by inverting the statistic formed by summing ranks of negative observations; as this definition runs contrary to the common definition of the Wilcoxon signed-rank statistic, it was avoided.

Signed-Rank Statistic Alternative Distribution

Consider alternative hypotheses $H_\theta$, specifying that the median of the distribution of differences is $\theta$.

To apply asymptotic relative efficiency calculations, scale the test statistic to have an expectation that varies with the parameter value specifying the null hypothesis, and is approximately independent of sample size, as in (2.15), by switching to $S = 2T_{SR}/(n(n-1))$. In this case, $\mu(\theta) \approx P_0[Z_1 + Z_2 > 2\theta]$, and the statistic variance times the sample size is

$$n \operatorname{Var}_0[S] = (n+1)(2n+1)/(6(n-1)^2) \to 1/3,$$

to match the notation of (2.15) and (2.19). Note that $\mu'(0)$ is now twice the value from the Mann-Whitney-Wilcoxon statistic, for distributions symmetric about 0. This will be used for asymptotic relative efficiency calculations in the exercises.
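A quick numerical check (Python, illustrative; the function names are mine) that with the scaling $S = 2T_{SR}/(n(n-1))$ the null expectation of $S$ tends to 1/2 and $n$ times its null variance tends to 1/3:

```python
def e_s(n):
    """Null expectation of S = 2 T_SR / (n (n - 1)); tends to 1/2."""
    return 2 * (n * (n + 1) / 4) / (n * (n - 1))

def n_var_s(n):
    """n times the null variance of S, using Var[T_SR] from (5.4); tends to 1/3."""
    var_t = n * (n + 1) * (2 * n + 1) / 24
    return n * var_t * (2 / (n * (n - 1))) ** 2
```

Algebraically, $n \operatorname{Var}_0[S] = (n+1)(2n+1)/(6(n-1)^2)$, which the function reproduces.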
