# Relative-Accuracy Confidence Set Estimation

The estimator $G_n$ in (8.2), based on $n$ i.i.d. observations $X_1, \ldots, X_n$ from $F$, is a strongly consistent estimator of the Gini index $G$ (Chattopadhyay and De, 2016) and is used throughout this paper. For a prescribed level $\alpha \in (0,1)$, we would like to have a $(1-\alpha)$ relative-accuracy confidence set for $G$ of the form $J_n = [G_n \pm d\mu^{-a}]$ such that

It is understood that (8.10) may be satisfied only approximately when $n$ becomes large. To this end, we note that $G_n$ is asymptotically normal (Hoeffding, 1948, 1961; Xu, 2007) when $E_F[X^2] < \infty$. That is,

where the asymptotic variance $\xi^2$ is given by and

Thus, for large n, we may conclude: provided that

Here, $z_{\alpha/2}$ is the upper $(\alpha/2)$th quantile of the standard normal distribution and $\beta(d) = d^{-2} z_{\alpha/2}^{2}$.

Clearly, $n_d$ is the optimal (minimal) sample size required to construct a relative-accuracy confidence set for $G$ with approximately $1-\alpha$ coverage probability. We tacitly disregard that $n_d$ may not be an integer.
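As a small numerical illustration of the quantities just defined, the following sketch evaluates $z_{\alpha/2}$ and $\beta(d) = d^{-2} z_{\alpha/2}^{2}$ with Python's standard library. The function name `beta_d` is our own and does not come from the source. The full expression for $n_d$ in (8.14) also involves $\mu$ and $\xi$ and is deliberately not reproduced here.

```python
from statistics import NormalDist


def beta_d(d: float, alpha: float) -> float:
    """Return beta(d) = z_{alpha/2}^2 / d^2 (hypothetical helper name).

    z_{alpha/2} is the upper (alpha/2)th quantile of N(0, 1), i.e. the
    point with probability 1 - alpha/2 to its left.
    """
    if d <= 0:
        raise ValueError("d must be positive")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (z / d) ** 2


if __name__ == "__main__":
    # Smaller d (tighter accuracy) inflates beta(d) quadratically.
    for d in (0.2, 0.1, 0.05):
        print(d, beta_d(d, alpha=0.05))
```

Halving $d$ multiplies $\beta(d)$ by four, which is why the required sample size grows quickly as the accuracy requirement tightens.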

Remark 8.1. In customary applications, it is not unrealistic to expect that $\mu$ would exceed 1. Under this reasonable perception, as we compare the requirements proposed via (8.4) and (8.7), we readily observe that under (8.7) we are controlling a larger estimation error, namely $\mu|G_n - G|$, instead of requiring that $|G_n - G|$ alone fall below the preassigned half-width $d > 0$.

The optimal sample size in (8.14) is unknown since $\mu$ and $\xi$ remain unknown. We must estimate them first to develop the methodology for relative-accuracy confidence interval estimation. We first formulate a strongly consistent estimator of $\xi^2$. Toward that end, we let:

that is, $S_n^2$ represents the sample variance, which estimates $\sigma^2$. Next, recall that $\sigma_1^2 = V_F[E_F\{|X_1 - X_2| \mid X_1\}]$. For each $j = 1, \ldots, n$, we obtain:

where $T_j = \{(i_1, i_2) : 1 \le i_1 < i_2 \le n \text{ and } i_1, i_2 \ne j\}$. Finally, we define:

given in Sproule (1969, 1985), where $W_{jn} = n\Delta_n - (n-1)\Delta_{n-1}^{(j)}$ for $j = 1, \ldots, n$, and $\bar{W}_n = n^{-1}\sum_{j=1}^{n} W_{jn}$. We note that $S_{wn}^2$ is a strongly consistent estimator of $4\sigma_1^2$.
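The jackknife-type construction above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: we take $\Delta_n$ to be Gini's mean difference (the average of $|X_{i_1} - X_{i_2}|$ over all pairs), $\Delta_{n-1}^{(j)}$ to be the same statistic with $X_j$ removed, and $S_{wn}^2 = (n-1)^{-1}\sum_{j}(W_{jn} - \bar{W}_n)^2$. The displayed definition is missing from this excerpt, so the $(n-1)^{-1}$ normalization is an assumption, and all function names are hypothetical.

```python
from itertools import combinations


def gini_mean_difference(x):
    """Average absolute difference over all unordered pairs (a U-statistic)."""
    x = list(x)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(abs(a - b) for a, b in combinations(x, 2))
    return total / (n * (n - 1) / 2)


def pseudo_values(x):
    """W_jn = n * Delta_n - (n - 1) * Delta_{n-1}^{(j)} for j = 1, ..., n."""
    x = list(x)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    delta_n = gini_mean_difference(x)
    values = []
    for j in range(n):
        leave_one_out = x[:j] + x[j + 1:]  # drop the j-th observation
        values.append(n * delta_n - (n - 1) * gini_mean_difference(leave_one_out))
    return values


def s2_wn(x):
    """Sample variance of the pseudo-values (assumed (n - 1) normalization)."""
    w = pseudo_values(x)
    n = len(w)
    w_bar = sum(w) / n
    return sum((w_j - w_bar) ** 2 for w_j in w) / (n - 1)


if __name__ == "__main__":
    data = [1.0, 2.0, 4.0, 7.0]
    print(gini_mean_difference(data), s2_wn(data))
```

This direct version recomputes every leave-one-out statistic from scratch, so it costs on the order of $n^3$ operations; it is meant for checking the definition on small samples rather than for use inside a sequential loop.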

Now, combining (8.15)-(8.17) with (8.12)-(8.13), we propose to estimate $\xi^2$ with the following strongly consistent estimator:

This expression of $V_n^2$ was originally developed in Chattopadhyay and De (2016). For a number of other plug-in estimators of the asymptotic variance parameter $\xi^2$ from (8.12), one may refer to Davidson (2009) and Langel and Tillé (2013).

## Purely Sequential Sampling Methodology

Since the optimal sample size $n_d$ given by (8.14) is unknown up to $\mu$ and $\xi$, we must draw samples in at least two stages to achieve approximate or asymptotic $1-\alpha$ coverage probability. In the first stage, we estimate $n_d$ by estimating $\mu$ and $\xi$ from a pilot sample of size $m$; in the subsequent stages, we collect samples until a stopping criterion is satisfied.

In this paper, we propose a purely sequential procedure for $1-\alpha$ relative-accuracy confidence set estimation of $G$ in the spirit of Chattopadhyay and De (2016). Suppose $m$ denotes the pilot sample size. We discuss a specific choice of the pilot sample size in Remark 8.2.

The purely sequential procedure collects i.i.d. observations $X_1, \ldots, X_m$ from $F$ in the first stage and then collects one observation at a time (independent of the previous ones) until the current sample size is at least the estimate of the optimal sample size $n_d$ from (8.14). We therefore define the stopping rule, that is, the final sample size of the procedure, as follows:

where $\gamma$ is some known positive constant.

After sampling is terminated, based on the finally accrued data $\{N, X_1, \ldots, X_N\}$, we propose to estimate $G$ with

Note that the estimator $V_n^2$ from (8.18) or $\bar{X}_n$ can be close to zero with positive probability, leading to early termination of sequential sampling, which in turn may cause a loss of coverage probability. The term $n^{-\gamma}$ in (8.19) is introduced to avoid this potential problem of early stopping. Chattopadhyay and De (2016) used $\gamma = 1$ in the definition of their stopping rule and for their ensuing sequential estimation methodology.
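The mechanics of a purely sequential rule with an $n^{-\gamma}$ safeguard can be sketched as follows. Since the display (8.19) is not reproduced in this excerpt, the sketch assumes a boundary of the generic form $\beta(d)\,(\text{plug-in}_n + n^{-\gamma})$ and accepts the plug-in estimator as a user-supplied function. In the demonstration, the ordinary sample variance is used purely as a stand-in for the plug-in; it is not the quantity built from $\bar{X}_n$ and $V_n^2$ in the text. All names (`purely_sequential`, `make_boundary`, `max_n`) are hypothetical.

```python
import random
from statistics import NormalDist, variance


def make_boundary(d, alpha, gamma, plug_in):
    """Build n -> beta(d) * (plug_in(sample) + n**(-gamma)) (assumed form)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    beta = (z / d) ** 2

    def boundary(sample):
        n = len(sample)
        # The n**(-gamma) term keeps the boundary away from zero even when
        # the plug-in estimate happens to be very small early on.
        return beta * (plug_in(sample) + n ** (-gamma))

    return boundary


def purely_sequential(draw, m, boundary, max_n=1_000_000):
    """Take a pilot sample of size m, then add one observation at a time
    until the current sample size is at least the estimated requirement."""
    sample = [draw() for _ in range(m)]
    while len(sample) < boundary(sample) and len(sample) < max_n:
        sample.append(draw())
    return sample


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in plug-in: the sample variance (NOT the estimator in the text).
    boundary = make_boundary(d=0.2, alpha=0.05, gamma=1.0, plug_in=variance)
    final = purely_sequential(lambda: rng.expovariate(1.0), m=10, boundary=boundary)
    print("final sample size:", len(final))
```

The `max_n` cap is only a practical guard for the illustration; the theoretical termination guarantee is the subject of Lemma 8.1.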

Remark 8.2. From (8.19), note that $N \ge \beta(d) N^{-\gamma}$ since $\bar{X}_N^{2a} V_N^2 > 0$ w.p. 1. Hence $N^{1+\gamma} \ge \beta(d)$, that is, the final sample size $N$ must be at least $\beta(d)^{1/(1+\gamma)}$. Therefore, in order to implement the purely sequential procedure (8.19), we select the pilot sample size as $m = \max\{4, \lceil \beta(d)^{1/(1+\gamma)} \rceil\}$, where $\lceil x \rceil$ denotes the smallest integer $\ge x$.
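The pilot-size recommendation of Remark 8.2 is easy to evaluate numerically. The sketch below computes $m = \max\{4, \lceil \beta(d)^{1/(1+\gamma)} \rceil\}$ with $\beta(d) = d^{-2} z_{\alpha/2}^{2}$; the function name `pilot_size` is our own.

```python
import math
from statistics import NormalDist


def pilot_size(d: float, alpha: float, gamma: float) -> int:
    """m = max(4, ceil(beta(d) ** (1 / (1 + gamma)))) as in Remark 8.2."""
    if d <= 0 or gamma <= 0 or not 0 < alpha < 1:
        raise ValueError("need d > 0, gamma > 0 and alpha in (0, 1)")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile
    beta = (z / d) ** 2
    return max(4, math.ceil(beta ** (1.0 / (1.0 + gamma))))


if __name__ == "__main__":
    for d in (0.5, 0.1, 0.05):
        print(d, pilot_size(d, alpha=0.05, gamma=1.0))
```

With $\gamma = 1$, the pilot size grows like $z_{\alpha/2}/d$, so a larger $\gamma$ yields a smaller pilot sample for the same $d$.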

## Asymptotic First-Order Properties

Before establishing the major asymptotic properties (Theorems 8.1-8.2) of the proposed sequential procedure (8.19)-(8.20), we state the following lemma, which ensures termination w.p. 1 under the sequential sampling procedure (8.19).

Lemma 8.1. For any fixed $d > 0$, the stopping time $N_d$ from (8.19) is finite w.p. 1, that is, $P_F(N_d < \infty) = 1$, if we assume $0 < \xi^2 < \infty$.

We omit its proof since it follows directly from Lemma A1 of Chattopadhyay and De (2016). Next, we establish the asymptotic properties of the proposed sequential procedure in the spirit of Theorems 1 and 2 of Chattopadhyay and De (2016).

Theorem 8.1. If the parent distribution $F$ of $X$ is such that $E_F[X^2] < \infty$, then the stopping rule in (8.19) yields as $d \downarrow 0$:

where $G$, $n_d$, and $G_{N_d}$ come from (8.1), (8.14), and (8.20), respectively.

We omit its proof, noting that it follows along the lines of the proof of Theorem 1 in Chattopadhyay and De (2016). The following remark summarizes important information.

Remark 8.3. Hoeffding (1948) suggested that we only need finiteness of the second moment, i.e., $E_F[X^2] < \infty$, to ensure the existence of the asymptotic variance $\xi^2$. Moreover, it follows from Sproule (1969, 1985) that $S_{wn}^2$ is a strongly consistent estimator of $4\sigma_1^2$ if $E_F[X^2] < \infty$, and hence we only need finiteness of the second moment to guarantee that $V_n^2$ is a strongly consistent estimator of $\xi^2$. This consistency property is the key to proving part (i), and part (i) is used in the proof of part (ii) of Theorem 8.1. In this context, note that a sufficient condition for Theorem 1 in Chattopadhyay and De (2016) was given as $E_F[X^4] < \infty$. In light of the above discussion, we conclude that the moment condition in Theorem 1 of Chattopadhyay and De (2016) can be relaxed: the existence of the second moment is sufficient for their theorem to hold.

Theorem 8.2. If the parent distribution $F$ of $X$ is such that $E_F[X^b] < \infty$ for $b = \max\{4, 4(a-1)\}$ and $a > 1$, then the stopping rule in (8.19) yields as $d \downarrow 0$:

where $n_d$ comes from (8.14).

Proof: By part (i) of Theorem 8.1 and the Lebesgue dominated convergence theorem, it suffices to show that $E_F[\sup_{d > 0} N_d/n_d] < \infty$.

From the stopping rule in (8.19), we obtain the following upper bound for $N_d$ w.p. 1:

Dividing both sides of (8.21) by $n_d$, using $(N_d - 1)^{-\gamma} \le (m - 1)^{-\gamma}$, and then taking the supremum over $d > 0$, it remains to show that $E_F[\sup_{n \ge m} \bar{X}_n^{2a} V_n^2] < \infty$. Using Lemma A2 of Chattopadhyay and De (2016), we observe w.p. 1 for all fixed $n \ge m$:

where we denote $T_{1n} = \bar{X}_n^{2(a-1)} S_n^2$, $T_{2n} = \bar{X}_n^{2a}$, and $T_{3n} = S_{wn}^2$.

We need to show that $E_F[\sup_{n \ge m} T_{in}] < \infty$ for $i = 1, 2, 3$. First, using the Cauchy-Schwarz inequality, we have:

Since $S_n^2$ and $\bar{X}_n$ are both U-statistics, a direct application of Lemma 9.2.4 from Ghosh et al. (1997) yields:

provided $E_F[X^b] < \infty$.

Similarly, we can claim:

provided $E_F[X^4]$ and $E_F[X^{2a}]$ are finite, respectively. Note that $2a > 1$ as $a > 1$.

To deal with the term $E_F[\sup_{n \ge m} T_{3n}]$, we may again use the Cauchy-Schwarz inequality, Lemma 9.2.4 of Ghosh et al. (1997), and Sen and Ghosh (1981) to argue that $E_F[\sup_{n \ge m} T_{3n}] < \infty$ provided $E_F[X^b] < \infty$ and $E_F[X^4] < \infty$. Note that

This completes the proof. ■

Theorem 8.2 and the first part of Theorem 8.1 establish that the estimated final sample size $N_d$ is asymptotically close to the unknown optimal sample size $n_d$. In the terminology of Ghosh and Mukhopadhyay (1981), the result highlighted by Theorem 8.2 is well known in the literature as the first-order asymptotic efficiency property of the stopping rule. The last part of Theorem 8.1 establishes that the coverage probability attained by the relative-accuracy confidence set $J_{N_d}$ from (8.20) is asymptotically close to the target $1-\alpha$. This property is customarily referred to as asymptotic consistency of the constructed confidence set (Chow and Robbins, 1965). One may gain additional insights from Anscombe (1952, 1953) and Ray (1957).

These asymptotic properties hold as the accuracy parameter $d$ tends to zero. In practice, however, $d$ may not be close to zero, so it becomes essential to assess the performance of the proposed estimation methods for some reasonable choices of $d$ via simulation studies. We present our summary findings in Section 8.4.

De and Chattopadhyay (2017), the approximate expected cost turns out to be (ignoring the $O(n^{-3/2})$ term):

which is minimized if

We refer to $n_c$ as the optimal (minimal) sample size to minimize the expected cost for point estimation of $G$ and tacitly disregard that $n_c$ may not be an integer. The approximate (asymptotically) minimum risk with $n_c$ samples is

up to an $O(c^{1/2})$ term.

However, the optimal sample size $n_c$ from (8.24) and the minimum risk in (8.25) are unknown up to the parameters $\mu$ and $\xi$. Therefore, we need to collect samples in at least two stages: $\mu$ and $\xi$ are estimated in the first stage from a pilot sample, and additional samples are drawn in the subsequent stages until a stopping criterion is satisfied.

In this section, we propose a purely sequential procedure along the lines of De and Chattopadhyay (2017). With some predetermined pilot sample size $m$, we initially collect i.i.d. observations $X_1, \ldots, X_m$ from $F$ in the first stage and then collect one observation at a time (independent of the previous ones) until the current sample size is at least the estimated optimal sample size from (8.24). Thus, the stopping rule, that is, the final sample size of the procedure, is defined as:

where $\gamma$ is some known positive constant and $V_n^2$ from (8.18) is a strongly consistent estimator of $\xi^2$ shown in (8.12). Here, $\gamma$ plays the same role as it did in (8.19). After sampling is terminated, based upon the finally accrued data $\{N_c, X_1, \ldots, X_{N_c}\}$, we propose:

in the spirit of (8.20).

Remark 8.4. Note from (8.26) that $N_c \ge (A/c)^{1/2} N_c^{-\gamma}$ w.p. 1. In other words, the final sample size must be at least $(A/c)^{1/(2(1+\gamma))}$. This indicates that one should choose the pilot sample size as $m = \max\{4, \lceil (A/c)^{1/(2(1+\gamma))} \rceil\}$ to implement the purely sequential procedure (8.26).
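The pilot-size recommendation for the minimum-risk procedure can be evaluated in the same way as for the confidence-set procedure. The sketch below computes $m = \max\{4, \lceil (A/c)^{1/(2(1+\gamma))} \rceil\}$, reading $A$ as the constant that appears in the ratio $A/c$ of this remark and $c$ as the cost per observation; the function name `pilot_size_min_risk` is hypothetical.

```python
import math


def pilot_size_min_risk(A: float, c: float, gamma: float) -> int:
    """m = max(4, ceil((A / c) ** (1 / (2 * (1 + gamma))))) as in Remark 8.4."""
    if A <= 0 or c <= 0 or gamma <= 0:
        raise ValueError("A, c and gamma must all be positive")
    lower_bound = (A / c) ** (1.0 / (2.0 * (1.0 + gamma)))
    return max(4, math.ceil(lower_bound))


if __name__ == "__main__":
    # As the cost per observation shrinks, the pilot size grows slowly.
    for c in (1.0, 0.01, 0.0001):
        print(c, pilot_size_min_risk(A=100.0, c=c, gamma=0.5))
```

Because the exponent $1/(2(1+\gamma))$ is below $1/2$, the pilot size grows much more slowly than $(A/c)^{1/2}$ as $c$ decreases.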

The following lemma, similar to Lemma 8.1, guarantees that sampling will stop at some finite time w.p. 1.

Lemma 8.2. For any $c > 0$, the stopping time $N_c$ is finite, that is, $P_F(N_c < \infty) = 1$, provided $\xi < \infty$.

Proof: Follows directly from Lemma 2 of De and Chattopadhyay (2017). ■

Below, we establish some asymptotic optimality properties of the proposed sequential procedure in the spirit of Theorem 1 from De and Chattopadhyay (2017).

Theorem 8.3. Suppose observations $X_1, X_2, \ldots$ are i.i.d. copies of $X$ having distribution function $F$. The stopping rule in (8.26) yields the following properties as $c \downarrow 0$:

Theorem 8.4. Suppose observations $X_1, X_2, \ldots$ are i.i.d. copies of $X$ having distribution function $F$. For any $\gamma \in (0, 2)$ and different values of $a$ used in (8.23), the purely sequential estimation strategy (8.26)-(8.27) satisfies the asymptotic first-order risk efficiency property, that is,

where $n_c$ comes from (8.24), under the sufficient conditions laid down as follows:

When $a = 1, 2$, or $3$: either (i) the support of $F$ is $(t, \infty)$ for some $t > 0$ and $E_F[X^6]$ is finite, or (ii) $E_F[X^6]$ and $E_F[X^{-s}]$ are finite for some $s > 12$;

When a e {4,5,...}: Ef [Хтах16-Й_2>] is finite.

Proof: See Appendix. ■

Without assuming a special nature of the distribution function F generating the incoming data, Theorem 8.3 ensures that the average sample size of the proposed, purely sequential procedure approaches the oracle optimal fixed sample size that minimizes the risk function which is a combination of expected squared error loss due to estimation and linear cost. Theorem 8.4 establishes that the expected overall risk for estimating the population Gini index by the sample Gini index using the proposed sequential procedure is asymptotically close to the expected risk for estimating the population Gini index using the oracle optimal sample size. These asymptotic properties hold as the cost c per unit observation tends to zero. However, in practice, c may not be close to zero, and thus, it becomes essential to assess the performance of the proposed methods for some reasonable choices of c via simulation studies. We present our summary findings in Section 8.4.