Simulation Studies
In Sections 8.2 and 8.3, we established that the proposed purely sequential procedures for relative-accuracy confidence set estimation and for MRRPE enjoy a number of asymptotic optimality properties as \(d\) or \(c\) approaches 0, respectively. In this section, we assess the finite-sample performance of the proposed estimation methodologies for moderate to small values of \(d\) and \(c\) via Monte Carlo simulations.
Confidence Set Estimation
Let us first consider the sequential procedure (8.19) for relative-accuracy confidence set estimation of the Gini index from Section 8.2. With some preassigned values of \(d\), we implemented the sequential procedure corresponding to the stopping rule (8.19) by choosing the pilot sample size \(m = \max\{4, \lceil \eta(d)^{1/(1+\gamma)} \rceil\}\) and fixing \(\gamma = 0.2\) (for the Pareto case) and \(\gamma = 1\) (for the rest). We evaluated the performance in a number of cases, that is, by drawing independent samples from several distributions that are specifically relevant to income or wealth data, namely gamma, log-normal, Pareto, and exponential. Tables 8.1 and 8.2 correspond to the cases \(a = 1\) and \(a = 2\), respectively, with \(a\) given in (8.6), and show summaries for a limited number of scenarios.

TABLE 8.1
Performance of the Purely Sequential Procedure for Relative-Accuracy Confidence Set Estimation of G for Different Distributions with a = 1: 10,000 replications

| Distribution | \(d\) | \(\lceil n_d \rceil\) | \(\bar{N}_d\) | \(\bar{N}_d/n_d\) | CP | \(G\) | \(1-\alpha\) | \(s(\bar{N}_d)\) | \(s(\mathrm{CP})\) | \(\hat{G}_{N_d}\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Gamma (shape = 1.5, rate = 1) | 0.03 | 461 | 463.98 | 1.0066 | 0.8981 | 0.4244 | 0.90 | 0.6133 | 0.0030 | 0.4240 |
| Log-normal (meanlog = 0.75, sdlog = 0.25) | 0.025 | 301 | 314.33 | 1.0466 | 0.9492 | 0.1403 | 0.95 | 0.4452 | 0.0022 | 0.1400 |
| Pareto (scale = 5, shape = 4.01) | 0.09 | 721 | 693.92 | 0.9635 | 0.8817 | 0.1424 | 0.90 | 2.6246 | 0.0032 | 0.1401 |
| Exponential (rate = 0.4) | 0.05 | О 00 | 799.7 | 0.9992 | 0.9513 | 0.5000 | 0.95 | 0.9167 | 0.0021 | 0.4998 |
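The data-generation step described above can be sketched as follows. This is only an illustration under stated assumptions: the estimator used here is the familiar sample Gini index built from Gini's mean difference, which may differ in detail from the estimator defined in Section 8.2, and the function names `sample_gini` and `draw`, the seed, and the sample size of 20,000 are all hypothetical choices. The distribution parameters are those listed for Table 8.1.

```python
import random


def sample_gini(xs):
    """Sample Gini index: Gini's mean difference divided by twice the mean."""
    n = len(xs)
    ordered = sorted(xs)
    # sum_i (2i - n - 1) * x_(i) equals the sum of |x_i - x_j| over pairs i < j
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(ordered, start=1))
    mean = sum(ordered) / n
    return weighted / (n * (n - 1) * mean)


def draw(name, n, rng):
    """Draw n observations with the Table 8.1 parameter settings."""
    if name == "gamma":        # shape = 1.5, rate = 1; gammavariate takes a scale
        return [rng.gammavariate(1.5, 1.0 / 1.0) for _ in range(n)]
    if name == "lognormal":    # meanlog = 0.75, sdlog = 0.25
        return [rng.lognormvariate(0.75, 0.25) for _ in range(n)]
    if name == "pareto":       # paretovariate has scale 1, so rescale by 5
        return [5.0 * rng.paretovariate(4.01) for _ in range(n)]
    if name == "exponential":  # rate = 0.4
        return [rng.expovariate(0.4) for _ in range(n)]
    raise ValueError(f"unknown distribution: {name}")


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
names = ("gamma", "lognormal", "pareto", "exponential")
estimates = {name: sample_gini(draw(name, 20000, rng)) for name in names}

for name in names:
    print(f"{name:12s} sample Gini = {estimates[name]:.4f}")
```

With a fixed sample size this is not the sequential procedure itself; it only shows how samples from the four families can be generated and summarized by a Gini estimate.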
These tables present the estimated expected sample size \(\bar{N}_d\) (an estimator of \(E_F[N_d]\)), its standard error \(s(\bar{N}_d)\), the estimated coverage probability CP (an estimator of the probability in part (ii) of Theorem 8.1), its standard error \(s(\mathrm{CP})\), and the final estimator of the Gini index \(\hat{G}_{N_d}\), all based on 10,000 replications. For different values of \(d\), \(a\), and the given parameters of the respective distributions, we also exhibit the values of the population Gini index \(G\) and the optimal oracle sample size \(\lceil n_d \rceil\). The fifth columns of Tables 8.1 and 8.2 show that the ratios of average sample sizes to optimal sample sizes are nearly 1 under all scenarios. This supports the asymptotic first-order efficiency property of \(N_d\) given in Theorem 8.2.

TABLE 8.2
Performance of the Purely Sequential Procedure for Relative-Accuracy Confidence Set Estimation of G for Different Distributions with a = 2: 10,000 replications

| Distribution | \(d\) | \(\lceil n_d \rceil\) | \(\bar{N}_d\) | \(\bar{N}_d/n_d\) | CP | \(G\) | \(1-\alpha\) | \(s(\bar{N}_d)\) | \(s(\mathrm{CP})\) | \(\hat{G}_{N_d}\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Gamma (shape = 2, rate = 1.5) | 0.04 | 305 | 304.25 | 0.9987 | 0.8929 | 0.3750 | 0.90 | 0.6249 | 0.0031 | 0.3744 |
| Log-normal (meanlog = 0.6, sdlog = 0.3) | 0.03 | 587 | 582.86 | 0.9932 | 0.8952 | 0.1680 | 0.90 | 0.8005 | 0.0031 | 0.1676 |
| Pareto (scale = 2.25, shape = 4.01) | 0.09 | 1311 | 1268.54 | 0.9680 | 0.8805 | 0.1424 | 0.90 | 4.2764 | 0.0032 | 0.1409 |
| Exponential (rate = 0.5) | 0.07 | 1046 | 1036.86 | 0.9919 | 0.9439 | 0.5000 | 0.95 | 1.5595 | 0.0023 | 0.4998 |
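The population Gini values reported for these four families can be checked against the standard closed-form expressions for the untruncated distributions. The sketch below uses those textbook formulas; the function names are illustrative and not taken from the chapter. The Gini index does not depend on the scale (or rate) parameter in any of these families, which is why only the shape or sdlog parameter appears.

```python
import math


def gini_gamma(shape):
    """Gamma(shape, rate): G = Gamma(shape + 1/2) / (sqrt(pi) * Gamma(shape + 1))."""
    return math.gamma(shape + 0.5) / (math.sqrt(math.pi) * math.gamma(shape + 1.0))


def gini_lognormal(sdlog):
    """Log-normal: G = 2 * Phi(sdlog / sqrt(2)) - 1, which equals erf(sdlog / 2)."""
    return math.erf(sdlog / 2.0)


def gini_pareto(shape):
    """Pareto (type I) with shape > 1: G = 1 / (2 * shape - 1)."""
    if shape <= 1.0:
        raise ValueError("the mean, and hence the Gini index, needs shape > 1")
    return 1.0 / (2.0 * shape - 1.0)


def gini_exponential():
    """Exponential: G = 1/2 for every rate (gamma with shape 1)."""
    return 0.5


print(gini_gamma(1.5), gini_gamma(2.0))
print(gini_lognormal(0.25), gini_lognormal(0.3))
print(gini_pareto(4.01))
print(gini_exponential())
```

These values agree with the \(G\) columns of Tables 8.1 and 8.2 up to rounding in the fourth decimal place.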
The standard errors of the estimator \(\bar{N}_d\) are small in all scenarios. The estimated coverage probabilities in column six of Tables 8.1 and 8.2 are very close to the target level \(1 - \alpha\) for all chosen distributions, which supports the asymptotic consistency property of the sequential procedure. In the Pareto case, we observe that the average final sample sizes are slightly smaller than the corresponding optimal sample sizes, leading to a small loss in the attained coverage probabilities. This may be because, for the given shape and scale parameters of the Pareto distributions, the values of \(\xi^2\) and its estimator \(V\) are close to zero, which leads to early stopping of the purely sequential procedure. One way to deal with this problem may be to choose a smaller value of \(\gamma\) to prevent undersampling. However, if the chosen value of \(\gamma\) is too small, it may lead to oversampling instead. The optimal selection of \(\gamma\), for a given scenario or in general, is an open question and is beyond the scope of this article. The last columns in Tables 8.1 and 8.2 show that the final estimator \(\hat{G}_{N_d}\) estimates the population Gini index \(G\) very accurately. Overall, the simulation results support the theoretical properties, and the finite-sample performance of the purely sequential procedure of Section 8.2 is very encouraging.
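As a rough plausibility check on the reported \(s(\mathrm{CP})\) values, one can compute the usual binomial standard error of a proportion estimated from 10,000 independent replications. This is an assumption about how the standard errors were obtained, not something the text states; the function name `coverage_standard_error` is illustrative.

```python
import math


def coverage_standard_error(cp, replications):
    """Binomial standard error of an estimated coverage probability."""
    return math.sqrt(cp * (1.0 - cp) / replications)


# Gamma row of Table 8.1: CP = 0.8981, reported s(CP) = 0.0030
se_gamma = coverage_standard_error(0.8981, 10_000)

# Exponential row of Table 8.2: CP = 0.9439, reported s(CP) = 0.0023
se_exponential = coverage_standard_error(0.9439, 10_000)

print(f"{se_gamma:.4f} {se_exponential:.4f}")
```

Under this assumption the computed values match most of the tabulated entries to four decimal places, and an estimated coverage within two or three such standard errors of the target level would be unremarkable.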
Point Estimation
We implemented the purely sequential procedure corresponding to (8.26) for MRRPE with \(A\) = $500000, \(c\) = $0.5, \(\gamma = 1\), and pilot sample size \(m = \max\{4, \lceil (A/c)^{1/(2(1+\gamma))} \rceil\}\). The performance of the sequential procedure was evaluated for small to moderate sample sizes by drawing random samples from four different income distributions, namely exponential, gamma, log-normal, and Pareto. Tables 8.3 and 8.4 correspond to \(a = 1\) and \(a = 2\), respectively, where \(a\) is given in (8.8).
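For the settings just stated, the pilot sample size can be worked out directly. The sketch below assumes the exponent in the pilot-size formula is \(1/(2(1+\gamma))\), which is the reading that gives a pilot size below the optimal sample sizes reported in the tables; the function name `pilot_sample_size` is illustrative.

```python
import math


def pilot_sample_size(A, c, gamma):
    """Pilot size m = max{4, ceil((A / c) ** (1 / (2 * (1 + gamma))))}."""
    return max(4, math.ceil((A / c) ** (1.0 / (2.0 * (1.0 + gamma)))))


# A = $500000, c = $0.5, gamma = 1: (10**6) ** 0.25 is about 31.62
m = pilot_sample_size(500000.0, 0.5, 1.0)
print(m)  # → 32
```

A smaller \(\gamma\) enlarges the exponent and therefore the pilot sample, for example \(\gamma = 0.2\) would give a pilot size in the hundreds for the same \(A\) and \(c\).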
Since negative moments do not exist for the exponential and gamma distributions under consideration, we assumed truncated support in such cases, that is, we assumed that the data came from truncated exponential or truncated gamma distributions having support \((t, \infty)\) with \(t = 0.001\). In the cases of log-normal and Pareto, since all negative moments exist, we assumed full support \((0, \infty)\). Tables 8.3 and 8.4 summarize the estimated expected sample size \(\bar{N}_c\) (an estimate of \(E_F[N_c]\)), its standard error \(s(\bar{N}_c)\), the average risk \(\bar{R}_{N_c}\) (an estimate of \(R_{N_c}(G)\)), its standard error \(s(\bar{R}_{N_c})\), and the final estimator \(\hat{G}_{N_c}\) of the Gini index \(G\) from 10,000 replications.
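One simple way to generate data from the truncated distributions described above is sketched here. It is an assumption about implementation, not the authors' stated method: the truncated exponential uses the memoryless property (an exponential variable conditioned to exceed \(t\) is \(t\) plus a fresh exponential), and the truncated gamma uses plain rejection sampling, which is cheap because the probability of falling below \(t = 0.001\) is very small. The function names and the seed are hypothetical.

```python
import random


def truncated_exponential(rate, t, rng):
    """Exponential(rate) conditioned on X > t, via the memoryless property."""
    return t + rng.expovariate(rate)


def truncated_gamma(shape, rate, t, rng):
    """Gamma(shape, rate) conditioned on X > t, via rejection sampling."""
    while True:
        x = rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
        if x > t:
            return x


rng = random.Random(7)  # arbitrary seed, for reproducibility only
t = 0.001

# Parameter settings from the exponential and gamma rows of Table 8.3
exp_sample = [truncated_exponential(0.5, t, rng) for _ in range(5000)]
gamma_sample = [truncated_gamma(7.5, 1.5, t, rng) for _ in range(5000)]

print(min(exp_sample) > t, min(gamma_sample) > t)
```

Because the truncation point is so close to zero, the population Gini index changes only in the fourth decimal place, which is consistent with the value 0.4998 shown for the truncated exponential in Table 8.3 against 0.5 for the untruncated case.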
For the given distributions, we also provide the values of the population Gini index \(G\) and the optimal sample size \(\lceil n_c \rceil\) that minimizes the expected cost. The ratios of the estimated expected sample sizes to the optimal sample sizes are nearly 1 under the given scenarios, which supports the asymptotic first-order efficiency property stated in Theorem 8.3. The ratios of average risk to minimum risk reported in Tables 8.3 and 8.4 show that, on average, the overall risk of estimating \(G\) using the final sample size \(N_c\) and the accrued data is approximately equal to the minimum possible risk \(R^*(G)\) given in (8.25). This supports the asymptotic first-order risk efficiency property stated in Theorem 8.4. Moreover, we observed that the final estimator \(\hat{G}_{N_c}\) accurately estimated the population Gini index \(G\) under all four scenarios. Based on our broad-ranging simulations, we feel comfortable concluding that the finite-sample performance of the proposed purely sequential procedure (8.26) is very encouraging.

TABLE 8.3
Performance of the Purely Sequential Procedure for Minimum Relative Risk Point Estimation of G with a = 1, A = $500000, \(\gamma = 1\), and c = $0.5: 10,000 replications

| Distribution | \(G\) | \(\lceil n_c \rceil\) | \(\bar{N}_c\) | \(\bar{N}_c/n_c\) | \(\bar{R}_{N_c}\) | \(\bar{R}_{N_c}/R^*(G)\) | \(\hat{G}_{N_c}\) | \(s(\bar{N}_c)\) | \(s(\bar{R}_{N_c})\) |
|---|---|---|---|---|---|---|---|---|---|
| Exponential (rate = 0.5, t = 0.001) | 0.4998 | 409 | 411.12 | 1.0073 | 408 | 0.9996 | 0.4996 | 0.2665 | 0.2680 |
| Gamma (shape = 7.5, rate = 1.5, t = 0.001) | 0.2026 | 314 | 315.82 | 1.0075 | 312.01 | 0.9953 | 0.2023 | 0.1986 | 0.2006 |
| Log-normal (meanlog = 2, sdlog = 0.25) | 0.1403 | 280 | 280.92 | 1.005 | 276.71 | 0.9902 | 0.1401 | 0.2271 | 0.2299 |
| Pareto (scale = 45, shape = 6.01) | 0.0907 | 913 | 884.56 | 0.9692 | 882.69 | 0.9672 | 0.0903 | 1.2633 | 1.2650 |
TABLE 8.4
Performance of the Purely Sequential Procedure for Minimum Relative Risk Point Estimation of G with a = 2, A = $500000, \(\gamma = 1\), and c = $0.5: 10,000 replications

| Distribution | \(G\) | \(\lceil n_c \rceil\) | \(\bar{N}_c\) | \(\bar{N}_c/n_c\) | \(\bar{R}_{N_c}\) | \(\bar{R}_{N_c}/R^*(G)\) | \(\hat{G}_{N_c}\) | \(s(\bar{N}_c)\) | \(s(\bar{R}_{N_c})\) |
|---|---|---|---|---|---|---|---|---|---|
| Exponential (rate = 1.5, t = 0.001) | 0.4992 | 61 | 66.70 | 1.0959 | 613.31 | 1.0077 | 0.4982 | 0.0979 | 1.0388 |
| Gamma (shape = 0.9, rate = 1.1, t = 0.001) | 0.5194 | 77 | 80.99 | 1.0618 | 763.63 | 1.0012 | 0.5182 | 0.1195 | 1.2478 |
| Log-normal (meanlog = 1.55, sdlog = 0.35) | 0.1955 | 226 | 223.19 | 0.9912 | 2211.01 | 0.9819 | 0.1949 | 0.2445 | 2.4597 |
| Pareto (scale = 16, shape = 6.01) | 0.0907 | 754 | 726.37 | 0.9635 | 7252.22 | 0.9619 | 0.0902 | 1.1460 | 11.46651 |