Regression estimator in ranked set sampling

Biometrics. 1997 Sep;53(3):1070-80.

Abstract

Ranked set sampling (RSS) utilizes inexpensive auxiliary information about the ranking of the units in a sample to provide a more precise estimator of the population mean of the variable of interest Y, which is either difficult or expensive to measure. In most situations, however, the ranking may not be perfect. In this paper, we assume that the ranking is done on the basis of a concomitant variable X. Regression-type RSS estimators of the population mean of Y are proposed that use this concomitant variable X both in ranking the units and in the estimation step when the population mean of X is known. When the mean of X is unknown, double sampling is used to obtain an estimate of it. It is found that when X and Y jointly follow a bivariate normal distribution, our proposed RSS regression estimator is more efficient than the naive RSS and simple random sampling (SRS) estimators unless the correlation between X and Y is low (|rho| < 0.4). Moreover, it is always superior to the regression estimator under SRS for all rho. When normality does not hold, this approach can still perform reasonably well as long as the distribution of the concomitant variable X departs only slightly from symmetry. For heavily skewed distributions, a remedial measure is suggested. An example of estimating the mean plutonium concentration in surface soil on the Nevada Test Site, Nevada, U.S.A., is considered.
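
The idea in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the usual regression-type form, mean(y) + b * (mu_X - mean(x)), with b the least-squares slope computed from the ranked set sample, and the paper's exact estimator may differ. The function names, the set size k, the number of cycles, and the unit-variance bivariate normal population are all illustrative assumptions.

```python
import numpy as np


def draw_rss(rng, k, cycles, rho, mu_x=0.0, mu_y=0.0):
    """Draw a ranked set sample, ranking each set on the concomitant X.

    Assumption: (X, Y) is bivariate normal with unit variances and
    correlation rho. In each cycle, k sets of k units are drawn; from the
    i-th set only the unit with the i-th smallest X has its Y measured.
    """
    cov = [[1.0, rho], [rho, 1.0]]
    xs, ys = [], []
    for _ in range(cycles):
        for rank in range(k):
            units = rng.multivariate_normal([mu_x, mu_y], cov, size=k)
            order = np.argsort(units[:, 0])  # rank the set on X only
            chosen = units[order[rank]]
            xs.append(chosen[0])
            ys.append(chosen[1])
    return np.array(xs), np.array(ys)


def rss_regression_estimate(x, y, mu_x):
    """Regression-type estimate of the mean of Y, given a known mean of X."""
    x_bar, y_bar = x.mean(), y.mean()
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    return y_bar + slope * (mu_x - x_bar)


def simulate_variances(rho, k=3, cycles=4, reps=1000, seed=0):
    """Monte Carlo variances of the naive RSS mean and the regression estimate."""
    rng = np.random.default_rng(seed)
    naive, reg = [], []
    for _ in range(reps):
        x, y = draw_rss(rng, k, cycles, rho)
        naive.append(y.mean())  # naive RSS estimator: plain sample mean
        reg.append(rss_regression_estimate(x, y, mu_x=0.0))
    return np.var(naive), np.var(reg)
```

Calling simulate_variances(rho=0.9) and then simulate_variances(rho=0.2) gives a rough feel for the pattern described above, in which the regression estimator gains over the naive RSS mean when the correlation is high and loses its advantage when the correlation is low. A small simulation like this only illustrates the comparison and does not reproduce the paper's efficiency results.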

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biometry / methods
  • Models, Statistical*
  • Nevada
  • Nuclear Warfare
  • Plutonium / analysis
  • Power Plants
  • Regression Analysis*
  • Soil Pollutants, Radioactive / analysis
  • Statistics, Nonparametric*

Substances

  • Soil Pollutants, Radioactive
  • Plutonium