where $\hat\chi(X_i)$ is an estimator of the conditional expectation $\chi(X_i) = E(h(X_i, Y_i) \mid X_i)$.

An alternative to the partially imputed estimator is the fully imputed estimator

$$\frac{1}{n}\sum_{i=1}^n \hat\chi(X_i). \tag{3}$$

An extreme case would be that the conditional distribution of $Y$ given $X$ is known. It is then easy to see that the fully imputed estimator $\frac{1}{n}\sum_{i=1}^n \chi(X_i)$ is at least as good as the partially imputed estimator, and strictly better unless $Z(h(X, Y) - \chi(X))$ is zero almost surely.
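The comparison in this extreme case can be sketched in a few lines. The sketch assumes that the partially imputed estimator has the usual form $\frac{1}{n}\sum_{i=1}^n \bigl(Z_i h(X_i, Y_i) + (1 - Z_i)\chi(X_i)\bigr)$ with $\chi$ known, and that "missing at random" means $Z$ and $Y$ are conditionally independent given $X$; both are assumptions made here for illustration.

$$
\begin{aligned}
\frac{1}{n}\sum_{i=1}^n \bigl(Z_i h(X_i, Y_i) + (1 - Z_i)\chi(X_i)\bigr)
  &= \frac{1}{n}\sum_{i=1}^n \chi(X_i) + \frac{1}{n}\sum_{i=1}^n Z_i\bigl(h(X_i, Y_i) - \chi(X_i)\bigr), \\
E\bigl(Z(h(X, Y) - \chi(X)) \mid X\bigr)
  &= E(Z \mid X)\, E\bigl(h(X, Y) - \chi(X) \mid X\bigr) = 0, \\
n \operatorname{Var}(\text{partially imputed})
  &= \operatorname{Var}\chi(X) + E\bigl[Z\bigl(h(X, Y) - \chi(X)\bigr)^2\bigr]
  \ge \operatorname{Var}\chi(X) = n \operatorname{Var}(\text{fully imputed}).
\end{aligned}
$$

The second line shows that the extra term has conditional mean zero given $X$, so it is uncorrelated with $\chi(X)$ and only adds variance. Equality holds exactly when $Z(h(X, Y) - \chi(X))$ vanishes almost surely.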

We show that the fully imputed estimator (3) is usually better than the partially imputed estimator (2). We restrict attention to the situation where $\pi$ is bounded away from zero but otherwise completely unknown. We also impose no structural assumptions on the covariate distribution. We consider four different models for the conditional distribution of $Y$ given $X$.

Suppose first that the conditional distribution $Q(X, dy)$ of $Y$ given $X$ is completely unknown. For the case $h(X, Y) = Y$, Cheng [Che94] shows that the partially and fully imputed estimators are asymptotically equivalent, and obtains their asymptotic distribution. He estimates $E(Y \mid X)$ by a truncated kernel estimator. Wang and Rao [WR02] obtain a similar result with a differently truncated kernel estimator. Cheng and Chu [CC96] study estimation of the response distribution function and quantiles. We generalize Cheng's result to arbitrary functions $h$ and prove efficiency.

Suppose now that we have a parametric model $Q_\vartheta(X, dy)$ for the conditional distribution of $Y$ given $X$. In this case the conditional expectation is of the form $\chi_\vartheta(x) = \int h(x, y)\, Q_\vartheta(x, dy)$. This suggests estimating $\chi_\vartheta$ by $\chi_{\hat\vartheta}$. The natural estimator for $\vartheta$ is the conditional maximum likelihood estimator. We show that the fully imputed estimator $\frac{1}{n}\sum_{i=1}^n \chi_{\hat\vartheta}(X_i)$ is efficient, and better than the corresponding partially imputed estimator except in degenerate cases. This is related to Tamhane [Tam78], who assumes a parametric model for the joint distribution of $X$ and $Y$. Then $E[h(X, Y)]$ is a smooth function of the parameter; hence it can be estimated efficiently by plugging in an efficient estimator, such as the maximum likelihood estimator.

Next we consider a model for $Q$ that lies between the fully nonparametric and the parametric ones: a linear regression model with independent covariates and errors. For simplicity we take $Y = \vartheta X + \varepsilon$. We do not assume that $\varepsilon$ has mean zero but require $X$ to have positive variance for identifiability. Here $Q(x, dy) = f(y - \vartheta x)\, dy$, where $f$ is the (unknown) density of the errors. Then $\chi(x) = \int h(x, \vartheta x + u) f(u)\, du$. Exploiting this representation, we estimate $\chi(x)$ by $\sum_{j=1}^n Z_j h(x, \hat\vartheta x + Y_j - \hat\vartheta X_j) / \sum_{j=1}^n Z_j$. We show that the corresponding fully imputed estimator is efficient if an efficient estimator for $\vartheta$ is used. Again the partially imputed estimator will not be efficient in general, even if an efficient estimator for $\vartheta$ is used.
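The residual-based estimator of $\chi$ described above can be sketched numerically as follows. This is a minimal illustration, not the construction of Section 3: the function names are hypothetical, the slope is fitted by ordinary least squares with intercept on the complete cases (a convenient stand-in that is not claimed to be efficient), and the simulated design is an arbitrary choice.

```python
import numpy as np

def fit_slope(X, Y, Z):
    """OLS slope (with intercept) on the complete cases; a stand-in, not an efficient estimator."""
    obs = Z == 1
    xc = X[obs] - X[obs].mean()
    return float(np.sum(xc * Y[obs]) / np.sum(xc ** 2))

def chi_hat(x, h, theta, X, Y, Z):
    """sum_j Z_j h(x, theta*x + Y_j - theta*X_j) / sum_j Z_j, evaluated at each point of x."""
    obs = Z == 1
    resid = Y[obs] - theta * X[obs]                      # residuals of the observed pairs
    x = np.atleast_1d(np.asarray(x, dtype=float))[:, None]
    values = np.asarray(h(x, theta * x + resid[None, :]))
    return values.mean(axis=1)

def fully_imputed(h, X, Y, Z):
    """Average of chi_hat over all covariates, observed response or not."""
    theta = fit_slope(X, Y, Z)
    return float(chi_hat(X, h, theta, X, Y, Z).mean())

def partially_imputed(h, X, Y, Z):
    """Use h(X_i, Y_i) where the response is observed and chi_hat(X_i) otherwise."""
    theta = fit_slope(X, Y, Z)
    chi = chi_hat(X, h, theta, X, Y, Z)
    return float(np.where(Z == 1, h(X, Y), chi).mean())

# Illustrative simulation (all design choices here are assumptions).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(1.0, 1.0, n)
eps = rng.exponential(1.0, n)             # errors independent of X, mean not zero
pi = np.where(X > 1.0, 0.9, 0.4)          # P(Z = 1 | X), bounded away from zero
Z = (rng.uniform(size=n) < pi).astype(int)
Y = Z * (2.0 * X + eps)                   # only Z * Y is observed
h = lambda x, y: y ** 2
print(fully_imputed(h, X, Y, Z), partially_imputed(h, X, Y, Z))
```

Both functions share the same `chi_hat`; they differ only in whether the observed values $h(X_i, Y_i)$ are kept or replaced, which is exactly the distinction between the two estimators.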

Finally we consider a linear regression model without assuming independence between covariates and errors. For simplicity we take $Y = \vartheta X + \varepsilon$ with $E(\varepsilon \mid X) = 0$. This can be written as a constraint on the conditional distribution of $Y$ given $X$, namely $\int y\, Q(X, dy) = \vartheta X$. For $h(X, Y) = Y$ this suggests the estimator $\hat\vartheta\, \frac{1}{n}\sum_{i=1}^n X_i$, which happens to be the fully imputed estimator. Matloff [Mat81] has shown that such an estimator improves upon the partially imputed estimator for his choice of $\hat\vartheta$. We show that the fully imputed estimator of $E[h(X, Y)]$ for general $h$ is efficient if an appropriate estimator for $\chi$ is used. This requires an efficient estimator $\hat\vartheta$ for $\vartheta$ and a correction term to the nonparametric estimator of $\chi$. An efficient estimator of $\vartheta$ can be obtained as a weighted least squares estimator with estimated optimal weights, based on the fully observed pairs. Efficient estimation of $\vartheta$ for more general regression models and various models for $\pi$ has been studied in Robins, Rotnitzky and Zhao [RRZ94], Robins and Rotnitzky [RbRt95], and Rotnitzky and Robins [RtRb95], among others. Efficient score functions for $\vartheta$ are calculated by Nan, Emond and Wellner [NEW04] and Yu and Nan [YN03]. The partially imputed estimator will not be efficient in general. In view of this, partially imputed estimators such as the one by Wang, Härdle and Linton [WHL04] for $E[Y]$ in a partly linear model are not efficient.
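For the special case $h(X, Y) = Y$, the two estimators of $E[Y]$ in this model can be sketched as below. The function names are hypothetical, and the slope is a least squares fit through the origin on the complete cases with user-supplied weights; estimating the optimal weights, as the efficient version requires, is not attempted here. The small Monte Carlo loop uses an arbitrary heteroscedastic design chosen only for illustration.

```python
import numpy as np

def slope_through_origin(X, Y, Z, weights=None):
    """Weighted least squares slope for Y = theta * X + eps, using complete cases only."""
    w = np.ones_like(X, dtype=float) if weights is None else np.asarray(weights, dtype=float)
    obs = Z == 1
    return float(np.sum(w[obs] * X[obs] * Y[obs]) / np.sum(w[obs] * X[obs] ** 2))

def fully_imputed_mean(X, Y, Z, weights=None):
    """theta_hat times the average of all covariates."""
    return slope_through_origin(X, Y, Z, weights) * float(np.mean(X))

def partially_imputed_mean(X, Y, Z, weights=None):
    """Observed responses are kept; missing ones are replaced by theta_hat * X_i."""
    theta = slope_through_origin(X, Y, Z, weights)
    return float(np.mean(np.where(Z == 1, Y, theta * X)))

# Small Monte Carlo comparison (design is an assumption, unit weights throughout).
rng = np.random.default_rng(1)
full, part = [], []
for _ in range(500):
    X = rng.uniform(1.0, 3.0, 200)
    eps = X * rng.normal(size=200)                 # E(eps | X) = 0, but eps depends on X
    Z = (rng.uniform(size=200) < 0.5).astype(int)
    Y = Z * (1.5 * X + eps)                        # only Z * Y is observed
    full.append(fully_imputed_mean(X, Y, Z))
    part.append(partially_imputed_mean(X, Y, Z))
print("empirical variances:", np.var(full), np.var(part))
```

Passing `weights` proportional to the inverse conditional error variance gives the weighted fit the text refers to, provided such weights are available or have been estimated separately.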

The paper is organized as follows. In Section 2 we characterize efficient estimators for linear functionals of arbitrary regression models with responses missing at random, in particular for the four cases above. Our results show that the model is adaptive in the sense that we can estimate $E[h(X, Y)]$ as well not knowing $\pi$ as knowing $\pi$. In Section 3 we construct efficient fully imputed estimators of $E[h(X, Y)]$ in these four models.
