Regression estimator in ranked set sampling

P L Yu; K Lam

Biometrics. 1997 Sep;53(3):1070-80.

Authors

P L Yu¹, K Lam

Affiliation

¹ Department of Statistics, University of Hong Kong.

PMID: 9333340

Abstract

Ranked set sampling (RSS) utilizes inexpensive auxiliary information about the ranking of the units in a sample to provide a more precise estimator of the population mean of the variable of interest Y, which is either difficult or expensive to measure. However, the ranking may not be perfect in most situations. In this paper, we assume that the ranking is done on the basis of a concomitant variable X. Regression-type RSS estimators of the population mean of Y are proposed that utilize this concomitant variable X in both the ranking of the units and the estimation process when the population mean of X is known. When X has unknown mean, double sampling is used to obtain an estimate of the population mean of X. It is found that when X and Y jointly follow a bivariate normal distribution, our proposed RSS regression estimator is more efficient than the RSS and simple random sampling (SRS) naive estimators unless the correlation between X and Y is low (|rho| < 0.4). Moreover, it is always superior to the regression estimator under SRS for all rho. When normality does not hold, this approach can still perform reasonably well as long as the distribution of the concomitant variable X departs only slightly from symmetry. For heavily skewed distributions, a remedial measure is suggested. An example of estimating the mean plutonium concentration in surface soil on the Nevada Test Site, Nevada, U.S.A., is considered.
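
As an illustration of the idea summarized above, the following is a minimal Python sketch, not the authors' implementation. It assumes the regression-type estimator takes the familiar form ybar + b * (mu_X - xbar), with b the least-squares slope fitted to the ranked set sample; the abstract does not state the formula, so this form is an assumption. The function names, set size, number of cycles, and rho = 0.8 are all hypothetical choices made for the example. Units are ranked on X only, and X and Y are simulated as standard bivariate normal with known mean of X.

```python
import random
import statistics


def make_bivariate_normal(rho, mu_x=0.0, mu_y=0.0):
    """Return a function that draws one (x, y) pair with correlation rho (unit variances)."""
    def draw(rng):
        zx = rng.gauss(0.0, 1.0)
        zy = rng.gauss(0.0, 1.0)
        x = mu_x + zx
        y = mu_y + rho * zx + (1.0 - rho ** 2) ** 0.5 * zy
        return x, y
    return draw


def ranked_set_sample(draw_pair, set_size, cycles, rng):
    """Draw a ranked set sample of (x, y) pairs, ranking each set on x only."""
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set of units and order them by the inexpensive variable x.
            units = sorted((draw_pair(rng) for _ in range(set_size)),
                           key=lambda unit: unit[0])
            # Only the unit holding the target rank is kept, i.e. has y "measured".
            sample.append(units[rank])
    return sample


def regression_estimate(sample, mu_x):
    """Assumed estimator form: ybar + b * (mu_x - xbar), b = least-squares slope."""
    xs = [x for x, _ in sample]
    ys = [y for _, y in sample]
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in sample)
    slope = sxy / sxx
    return y_bar + slope * (mu_x - x_bar)


def simulate(rho=0.8, set_size=3, cycles=4, reps=2000, seed=1):
    """Compare Monte Carlo variances of the naive RSS mean and the regression estimate."""
    rng = random.Random(seed)
    draw = make_bivariate_normal(rho)
    naive, reg = [], []
    for _ in range(reps):
        sample = ranked_set_sample(draw, set_size, cycles, rng)
        naive.append(statistics.fmean(y for _, y in sample))
        reg.append(regression_estimate(sample, mu_x=0.0))
    return statistics.pvariance(naive), statistics.pvariance(reg)


if __name__ == "__main__":
    var_naive, var_reg = simulate()
    print("variance of naive RSS mean:      ", var_naive)
    print("variance of regression estimate: ", var_reg)
```

Lowering rho in simulate (for example to 0.2) gives a rough check of the low-correlation regime the abstract mentions. When the mean of X is unknown, mu_x could be replaced by the mean of X from a larger first-phase sample, in the spirit of the double sampling described above; that step is not sketched here.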

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Biometry / methods
Models, Statistical*
Nevada
Nuclear Warfare
Plutonium / analysis
Power Plants
Regression Analysis*
Soil Pollutants, Radioactive / analysis
Statistics, Nonparametric*

Substances

Soil Pollutants, Radioactive
Plutonium