Simulation Studies

In Sections 8.2 and 8.3, we established that the proposed, purely sequential procedures for relative-accuracy confidence set estimation as well as MRRPE enjoy a number of relevant asymptotic optimality properties as d or c approaches 0 respectively. In this section, we attempt to assess finite sample performances of our proposed estimation methodologies for moderate to small values of d and c via Monte Carlo simulations.

Confidence Set Estimation

Let us first consider the sequential procedure (8.19) for relative-accuracy confidence set estimation of the Gini index from Section 8.2. With some preassigned values of d, we implemented the sequential procedure corresponding to the stopping rule (8.19) by choosing the pilot sample size \(m = \max\{4, \lceil \eta(d)^{1/(1+\gamma)} \rceil\}\) and fixing \(\gamma = 0.2\) (for the Pareto case) and \(\gamma = 1\) (for the rest). We evaluated the performances in a number of cases, that is, by drawing independent samples from several distributions specifically relevant to income or wealth data, namely gamma, log-normal, Pareto, and exponential. Tables 8.1 and 8.2 correspond to the cases a = 1 and a = 2, respectively, with a given in (8.6), and show summaries in a limited number of scenarios.

TABLE 8.1

Performance of the Purely Sequential Procedure for Relative-Accuracy Confidence Set Estimation of G for Different Distributions with a = 1: 10,000 replications

Distribution                  \(d\)           \(\lceil n_d \rceil\) \(\bar{N}_d\)         \(\bar{N}_d/n_d\)     CP           \(G\)
(parameters)                  \(1-\alpha\)                          \(s(\bar{N}_d)\)                            \(s(CP)\)    \(\widehat{G}_{N_d}\)

Gamma                         0.03            461                   463.98                1.0066                0.8981       0.4244
shape = 1.5, rate = 1         0.90                                  0.6133                                      0.0030       0.4240

Log-normal                    0.025           301                   314.33                1.0466                0.9492       0.1403
meanlog = 0.75, sdlog = 0.25  0.95                                  0.4452                                      0.0022       0.1400

Pareto                        0.09            721                   693.92                0.9635                0.8817       0.1424
scale = 5, shape = 4.01       0.90                                  2.6246                                      0.0032       0.1401

Exponential                   0.05            800                   799.7                 0.9992                0.9513       0.5000
rate = 0.4                    0.95                                  0.9167                                      0.0021       0.4998
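As an illustration of the sampling step, the following minimal Python sketch draws one sample from each of the four distributions with the Table 8.1 parameters and computes a sample Gini index. It assumes the familiar estimator formed by Gini's mean difference (a U-statistic) divided by twice the sample mean, which may differ in detail from the estimator defined in Section 8.2. The function names, the seed, and the sample size of 20,000 are our own choices for illustration and are not taken from the simulation study.

```python
import random


def sample_gini(xs):
    """Sample Gini index: Gini's mean difference divided by twice the mean."""
    x = sorted(xs)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    # Mean difference via order statistics:
    # (2 / (n (n - 1))) * sum_{i=1}^{n} (2 i - n - 1) x_(i)
    weighted = sum((2 * i - n - 1) * xi for i, xi in enumerate(x, start=1))
    mean_diff = 2.0 * weighted / (n * (n - 1))
    return mean_diff / (2.0 * sum(x) / n)


def make_samplers(rng):
    """Samplers using the Table 8.1 parameters (assumed parametrizations)."""
    return {
        # gammavariate takes (shape, scale), so scale = 1 / rate.
        "gamma": lambda: rng.gammavariate(1.5, 1.0 / 1.0),
        # lognormvariate takes (meanlog, sdlog).
        "log-normal": lambda: rng.lognormvariate(0.75, 0.25),
        # paretovariate(shape) has scale 1; multiply by the scale parameter.
        "Pareto": lambda: 5.0 * rng.paretovariate(4.01),
        # expovariate takes the rate.
        "exponential": lambda: rng.expovariate(0.4),
    }


rng = random.Random(12345)  # arbitrary seed, for reproducibility only
estimates = {}
for name, draw in make_samplers(rng).items():
    sample = [draw() for _ in range(20000)]
    estimates[name] = sample_gini(sample)
    print(f"{name:12s} sample Gini = {estimates[name]:.4f}")
```

With samples this large, each estimate should land close to the population value of G listed for the corresponding distribution in Table 8.1.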

These tables present the estimated expected sample size \(\bar{N}_d\) (an estimator of \(E_F[N_d]\)), its standard error \(s(\bar{N}_d)\), the estimated coverage probability CP (an estimator of the probability in part (ii) of Theorem 8.1), its standard error \(s(CP)\), and the final estimator \(\widehat{G}_{N_d}\) of the Gini index, all based on 10,000 replications. For different values of d, a, and the given parameters of the respective distributions, we exhibit the values of the population Gini index G and the optimal oracle sample size \(\lceil n_d \rceil\). The fifth columns in Tables 8.1 and 8.2 illustrate that the ratios of average sample sizes to optimal sample sizes are nearly 1 under all scenarios. These validate the asymptotic first-order efficiency property of \(N_d\) given in Theorem 8.2.

TABLE 8.2

Performance of the Purely Sequential Procedure for Relative-Accuracy Confidence Set Estimation of G for Different Distributions with a = 2: 10,000 replications

Distribution                  \(d\)           \(\lceil n_d \rceil\) \(\bar{N}_d\)         \(\bar{N}_d/n_d\)     CP           \(G\)
(parameters)                  \(1-\alpha\)                          \(s(\bar{N}_d)\)                            \(s(CP)\)    \(\widehat{G}_{N_d}\)

Gamma                         0.04            305                   304.25                0.9987                0.8929       0.3750
shape = 2, rate = 1.5         0.90                                  0.6249                                      0.0031       0.3744

Log-normal                    0.03            587                   582.86                0.9932                0.8952       0.1680
meanlog = 0.6, sdlog = 0.3    0.90                                  0.8005                                      0.0031       0.1676

Pareto                        0.09            1311                  1268.54               0.9680                0.8805       0.1424
scale = 2.25, shape = 4.01    0.90                                  4.2764                                      0.0032       0.1409

Exponential                   0.07            1046                  1036.86               0.9919                0.9439       0.5000
rate = 0.5                    0.95                                  1.5595                                      0.0023       0.4998
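The population values of G reported in these tables can be cross-checked against standard closed-form expressions for the Gini index. Because the index is scale invariant, only the shape-type parameter of each family matters (the gamma shape, the log-normal sdlog, the Pareto shape), and the exponential distribution always gives 1/2. The sketch below is our own check and is not part of the reported simulations; the function names are illustrative.

```python
import math


def gini_gamma(shape):
    """Gini index of a gamma distribution (free of the rate parameter)."""
    return math.gamma(shape + 0.5) / (math.sqrt(math.pi) * math.gamma(shape + 1.0))


def gini_lognormal(sdlog):
    """Gini index of a log-normal distribution: 2*Phi(sdlog/sqrt(2)) - 1 = erf(sdlog/2)."""
    return math.erf(sdlog / 2.0)


def gini_pareto(shape):
    """Gini index of a Pareto (type I) distribution; requires shape > 1."""
    if shape <= 1.0:
        raise ValueError("the mean, and hence the Gini index, needs shape > 1")
    return 1.0 / (2.0 * shape - 1.0)


def gini_exponential():
    """Gini index of an exponential distribution (free of the rate)."""
    return 0.5


# Parameters taken from Tables 8.1 and 8.2.
print(f"gamma, shape 1.5:      {gini_gamma(1.5):.4f}")
print(f"gamma, shape 2:        {gini_gamma(2.0):.4f}")
print(f"log-normal, sd 0.25:   {gini_lognormal(0.25):.4f}")
print(f"log-normal, sd 0.3:    {gini_lognormal(0.3):.4f}")
print(f"Pareto, shape 4.01:    {gini_pareto(4.01):.4f}")
print(f"exponential:           {gini_exponential():.4f}")
```

These expressions apply to the untruncated distributions; a truncated support changes the value slightly.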

The standard errors of the estimator \(\bar{N}_d\) are small in all scenarios. The estimated coverage probabilities in column six of Tables 8.1 and 8.2 are very close to the target level \(1 - \alpha\) for all chosen distributions, which validates the asymptotic consistency property of the sequential procedure. In the Pareto case, we observe that the expected final sample sizes are, on average, slightly smaller than the corresponding optimal sample sizes, leading to a small loss in the attained coverage probabilities. This may be because, for the given shape and scale parameters of the Pareto distributions, the values of \(\xi^2\) and its estimator V are close to zero, which leads to early stopping of the purely sequential procedure. One way to deal with this problem may be to choose a smaller value of \(\gamma\) to prevent undersampling. However, if the chosen value of \(\gamma\) is too small, it may also lead to oversampling. The optimal selection of \(\gamma\), for a given scenario or in general, remains an open question and is beyond the scope of this article. The last columns in Tables 8.1 and 8.2 illustrate that the final estimator \(\widehat{G}_{N_d}\) is very accurate in estimating the population Gini index G. Overall, the simulation results validate the theoretical properties, and the finite-sample performance of the purely sequential procedure of Section 8.2 is very encouraging.
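The chapter does not state how the standard errors were computed, so the following sketch simply assumes the usual Monte Carlo formulas: the binomial standard error for an estimated proportion such as CP, and the sample standard deviation divided by the square root of the number of replications for an average such as the mean stopping time. For instance, a coverage estimate of 0.8981 over 10,000 replications gives a standard error of about 0.0030, in line with the gamma row of Table 8.1. The function names and the toy stopping times are hypothetical.

```python
import math


def proportion_se(p_hat, replications):
    """Binomial standard error of an estimated proportion."""
    return math.sqrt(p_hat * (1.0 - p_hat) / replications)


def mean_and_se(values):
    """Sample mean and its standard error (sample s.d. / sqrt(number of values))."""
    r = len(values)
    mean = sum(values) / r
    variance = sum((v - mean) ** 2 for v in values) / (r - 1)
    return mean, math.sqrt(variance / r)


# Coverage estimate from the gamma row of Table 8.1, 10,000 replications.
print(round(proportion_se(0.8981, 10000), 4))

# Toy stopping times, made up for illustration (not simulation output).
toy_stopping_times = [455, 470, 462, 468, 459]
print(mean_and_se(toy_stopping_times))
```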

Point Estimation

We implemented the purely sequential procedure corresponding to (8.26) for MRRPE with A = $500000, c = $0.5, \(\gamma = 1\), and pilot sample size \(m = \max\{4, \lceil (A/c)^{1/(2(1+\gamma))} \rceil\}\). The performances of the sequential procedure were evaluated for small to moderate sample sizes by drawing random samples from four different income distributions, namely exponential, gamma, log-normal, and Pareto. Tables 8.3 and 8.4 correspond to a = 1 and a = 2, respectively, where a is given in (8.8).
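To make the pilot size concrete, read the pilot-size expression as the maximum of 4 and the ceiling of (A/c) raised to the power 1/(2(1 + γ)). With A = $500000, c = $0.5, and γ = 1, the ratio A/c equals 10^6, its fourth root is about 31.6, and the pilot size is therefore 32. The short Python sketch below carries out this arithmetic under that reading of the formula; the function and argument names are ours.

```python
import math


def pilot_size(A, c, gamma, minimum=4):
    """Pilot sample size max{minimum, ceil((A / c) ** (1 / (2 * (1 + gamma))))}."""
    exponent = 1.0 / (2.0 * (1.0 + gamma))
    return max(minimum, math.ceil((A / c) ** exponent))


# Settings used for Tables 8.3 and 8.4.
print(pilot_size(A=500000.0, c=0.5, gamma=1.0))
```

A smaller γ increases the exponent and hence the pilot size, so the procedure starts from more data before the stopping rule is first checked.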

Since negative moments do not exist for the exponential and gamma distributions under consideration, we assumed truncated support in such cases; that is, we assumed that the data came from truncated exponential or truncated gamma distributions having support \((t, \infty)\) with t = 0.001. In the cases of log-normal and Pareto, since all negative moments exist, we assumed full support \((0, \infty)\). Tables 8.3 and 8.4 summarize the estimated expected sample size \(\bar{N}_c\) (which estimates \(E_F[N_c]\)), its standard error \(s(\bar{N}_c)\), the average risk \(\bar{R}_{N_c}\) (which estimates \(R_{N_c}(G)\)), its standard error \(s(\bar{R}_{N_c})\), and the final estimator \(\widehat{G}_{N_c}\) of the Gini index G from 10,000 replications.
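One simple way to draw from a distribution truncated to \((t, \infty)\) is rejection: sample from the untruncated distribution and discard any value that does not exceed t. The chapter does not say how the truncated samples were generated, so the sketch below is only one possible approach; the helper name, the seed, and the sample size are hypothetical, while the parameters are those listed in Table 8.3.

```python
import random


def draw_truncated(sampler, lower, n):
    """Draw n values from `sampler`, keeping only those strictly above `lower`."""
    kept = []
    while len(kept) < n:
        x = sampler()
        if x > lower:
            kept.append(x)
    return kept


rng = random.Random(2024)  # arbitrary seed
t = 0.001                  # truncation point used in Tables 8.3 and 8.4

# Truncated exponential with rate 0.5 (expovariate takes the rate).
exp_sample = draw_truncated(lambda: rng.expovariate(0.5), t, 1000)

# Truncated gamma with shape 7.5 and rate 1.5 (gammavariate takes shape, scale).
gamma_sample = draw_truncated(lambda: rng.gammavariate(7.5, 1.0 / 1.5), t, 1000)

print(min(exp_sample), min(gamma_sample))
```

With t = 0.001 almost no draws are rejected for these parameters, so the rejection step costs very little.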

For the given distributions, we also provide the values of the population Gini index G and the optimal sample size \(\lceil n_c \rceil\) that minimized the expected cost. The ratios of the estimated expected sample sizes to the optimal sample sizes under the given scenarios are nearly 1, validating the asymptotic first-order efficiency property stated in Theorem 8.3. The last columns of Tables 8.3 and 8.4 show that, on average, the overall risk of estimating G using the final sample size \(N_c\) and the accrued data is approximately equal to the minimum possible risk \(R^*(G)\) given in (8.25). This validates the asymptotic first-order risk efficiency property stated in Theorem 8.4. Moreover, we observed that the final estimator \(\widehat{G}_{N_c}\) accurately estimated the population Gini index G under all four scenarios. Based on our broad-ranging simulations, we feel comfortable concluding that the finite-sample performance of the proposed purely sequential procedure (8.26) is very encouraging.

TABLE 8.3

Performance of the Purely Sequential Procedure for Minimum Relative Risk Point Estimation of G with a = 1, A = $500000, \(\gamma = 1\), and c = $0.5: 10,000 replications

Distribution                          \(G\)                   \(\lceil n_c \rceil\)   \(\bar{N}_c\)       \(\bar{N}_c/n_c\)     \(\bar{R}_{N_c}\)       \(\bar{R}_{N_c}/R^*(G)\)
(parameters)                          \(\widehat{G}_{N_c}\)                           \(s(\bar{N}_c)\)                          \(s(\bar{R}_{N_c})\)

Exponential                           0.4998                  409                     411.12              1.0073                408                     0.9996
rate = 0.5, t = 0.001                 0.4996                                          0.2665                                    0.2680

Gamma                                 0.2026                  314                     315.82              1.0075                312.01                  0.9953
shape = 7.5, rate = 1.5, t = 0.001    0.2023                                          0.1986                                    0.2006

Log-normal                            0.1403                  280                     280.92              1.005                 276.71                  0.9902
meanlog = 2, sdlog = 0.25             0.1401                                          0.2271                                    0.2299

Pareto                                0.0907                  913                     884.56              0.9692                882.69                  0.9672
scale = 45, shape = 6.01              0.0903                                          1.2633                                    1.2650

TABLE 8.4

Performance of the Purely Sequential Procedure for Minimum Relative Risk Point Estimation of G with a = 2, A = $500000, \(\gamma = 1\), and c = $0.5: 10,000 replications

Distribution                          \(G\)                   \(\lceil n_c \rceil\)   \(\bar{N}_c\)       \(\bar{N}_c/n_c\)     \(\bar{R}_{N_c}\)       \(\bar{R}_{N_c}/R^*(G)\)
(parameters)                          \(\widehat{G}_{N_c}\)                           \(s(\bar{N}_c)\)                          \(s(\bar{R}_{N_c})\)

Exponential                           0.4992                  61                      66.70               1.0959                613.31                  1.0077
rate = 1.5, t = 0.001                 0.4982                                          0.0979                                    1.0388

Gamma                                 0.5194                  77                      80.99               1.0618                763.63                  1.0012
shape = 0.9, rate = 1.1, t = 0.001    0.5182                                          0.1195                                    1.2478

Log-normal                            0.1955                  226                     223.19              0.9912                2211.01                 0.9819
meanlog = 1.55, sdlog = 0.35          0.1949                                          0.2445                                    2.4597

Pareto                                0.0907                  754                     726.37              0.9635                7252.22                 0.9619
scale = 16, shape = 6.01              0.0902                                          1.1460                                    11.46651

 