Proxy Controls and Panel Data (Revise & Resubmit Review of Economic Studies)
We present a flexible approach to the identification and estimation of causal objects in nonparametric, non-separable models with confounding. Key to our analysis is the use of ‘proxy controls’: covariates that do not satisfy a standard ‘unconfoundedness’ assumption but are informative proxies for variables that do. Our analysis applies to both cross-sectional and panel models. Our identification results motivate a simple and ‘well-posed’ nonparametric estimator and we analyze its asymptotic properties. In panel settings, our methods provide a novel approach to the difficult problem of identification with non-separable general heterogeneity and fixed T. In panels, observations from different periods serve as proxies for unobserved heterogeneity and our key identifying assumptions follow from restrictions on the serial dependence structure. We apply our methodology to two empirical settings. We estimate causal effects of grade retention on cognitive performance using cross-sectional variation and we estimate a structural Engel curve for food using panel data.
Nonparametric Instrumental Variables Estimation Under Misspecification (Formerly Revise & Resubmit Econometrica)
We show that nonparametric instrumental variables (NPIV) estimators are highly sensitive to misspecification: an arbitrarily small deviation from instrumental validity can lead to large asymptotic bias for a broad class of estimators. One can mitigate the problem by placing strong restrictions on the structural function in estimation. If the true function does not obey the restrictions then imposing them imparts bias. Therefore, there is a trade-off between the sensitivity to invalid instruments and bias from imposing excessive restrictions. In response, we present a method that allows researchers to empirically assess the sensitivity of their findings to misspecification. We apply our procedure to the empirical demand setting of Blundell (2007) and Horowitz (2011).
Estimation and inference in dynamic discrete choice models often rely on approximation to lower the computational burden of dynamic programming. Unfortunately, the use of approximation can impart substantial bias in estimation and result in invalid confidence sets. We present a method for set estimation and inference that explicitly accounts for the use of approximation and is thus valid regardless of the approximation error. We show how one can account for the error from approximation at low computational cost. Our methodology allows researchers to assess the estimation error due to the use of approximation and thus more effectively manage the trade-off between bias and computational expedience. We provide simulation evidence to demonstrate the practicality of our approach.
A recent literature considers causal inference using noisy proxies for unobserved confounding factors. The proxies are divided into two sets that are independent conditional on the confounders. One set consists of ‘negative control treatments’ and the other of ‘negative control outcomes’. Existing work applies to low-dimensional settings with a fixed number of proxies and confounders. In this work we consider linear models with many proxy controls and possibly many confounders. A key insight is that if the number of proxies in each group is strictly larger than the number of confounding factors, then a matrix of nuisance parameters has a low-rank structure and a vector of nuisance parameters has a sparse structure. We can exploit the rank restriction and sparsity to reduce the number of free parameters to be estimated. The number of unobserved confounders is not known a priori but we show that it is identified, and we apply penalization methods to adapt to this quantity. We provide an estimator with a closed form as well as a doubly-robust estimator that must be evaluated using numerical methods. We provide conditions under which our doubly-robust estimator is uniformly root-$n$ consistent and asymptotically centered normal, and under which our suggested confidence intervals have asymptotically correct coverage. We provide simulation evidence that our methods achieve better performance than existing approaches in high dimensions, particularly when the number of proxies is substantially larger than the number of confounders.
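One way to see why a low-rank structure can arise is the following sketch. It assumes an illustrative linear factor model; the symbols $Z$, $W$, $U$, $A$, $B$, $e_Z$, $e_W$ are introduced here for illustration, and the matrix shown is not necessarily the nuisance matrix used in the paper.

```latex
% Illustrative assumption: two groups of proxies load linearly on r confounders U,
% with errors uncorrelated with U and with each other.
\begin{align*}
Z &= A U + e_Z, \qquad Z \in \mathbb{R}^{p_Z},\ A \in \mathbb{R}^{p_Z \times r},\\
W &= B U + e_W, \qquad W \in \mathbb{R}^{p_W},\ B \in \mathbb{R}^{p_W \times r},\\
\operatorname{Cov}(Z, W) &= A \operatorname{Var}(U) B^{\top}
  \quad\Longrightarrow\quad
  \operatorname{rank}\bigl(\operatorname{Cov}(Z, W)\bigr) \le r .
\end{align*}
% If p_Z > r and p_W > r, the p_Z x p_W cross-covariance is rank-deficient.
% If in addition A and B have full column rank and Var(U) is nonsingular,
% the rank equals r, so r can be read off the distribution of the observables.
```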
Research In Progress
Ridge Estimation of Panel Average Effects (with Whitney Newey, Jerry Hausman, and Ying Gao)
We present and analyze a ridge-regularized estimator of the average structural parameters in a linear panel model with general heterogeneity. Price coefficients may differ both between individuals and across time, and may be correlated with the regressors as long as income effects are time-stationary. We allow for a combination of multiple discrete and continuous regressors. We also describe a debiased version of our estimator that corrects for the regularization bias imposed by applying ridge at the level of the individual. We present asymptotic analysis of the estimator under a growing number of individuals and time periods. This approach, used in “Demand Analysis with Many Prices”, provides a promising method for estimating average coefficients, including panel average treatment effects, in other settings with many regressors.
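As a rough illustration of applying ridge at the level of the individual and then averaging, the minimal sketch below uses a single regressor with no intercept. It is a toy under these stated assumptions, not the estimator analyzed in the paper, and it omits the debiasing step; the function names `ridge_slope` and `average_ridge_slope` and the penalty `lam` are hypothetical.

```python
def ridge_slope(x, y, lam):
    """Ridge estimate of a scalar slope with no intercept.

    Minimizes sum((y_t - b * x_t)^2) + lam * b^2, giving
    b = sum(x_t * y_t) / (sum(x_t^2) + lam).
    """
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / (sxx + lam)


def average_ridge_slope(panel, lam):
    """Average of individual-level ridge slopes.

    panel: list of (x, y) pairs, one per individual, where x and y
    are equal-length lists over that individual's time periods.
    """
    slopes = [ridge_slope(x, y, lam) for x, y in panel]
    return sum(slopes) / len(slopes)


# Two individuals with noise-free data and true slopes 2 and 3.
panel = [([1.0, 2.0], [2.0, 4.0]), ([1.0, 1.0], [3.0, 3.0])]
unpenalized = average_ridge_slope(panel, lam=0.0)  # average of the true slopes
penalized = average_ridge_slope(panel, lam=5.0)    # shrunk toward zero
```

In this toy, the gap between `unpenalized` and `penalized` is the regularization bias from penalizing each individual's estimate, which is the kind of bias a debiasing step is meant to correct.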
Identification, Estimability, and Unbiasedness in Panel Models
Identification and estimation in panel models with a fixed time dimension T is a topic that has received considerable attention. Much work on this topic aims to develop sufficient conditions under which population parameters are identified and consistently estimable at some rate. We explore the converse question: what conditions are necessary for identification and consistent estimability? In particular, we consider parameters of the form E[μi], where μi is some individual-specific parameter (specific to individual i). It is well-known that the existence of an unbiased estimator of μi that can be evaluated with fewer than T observations is sufficient for identification of E[μi] and existence of a consistent estimator. We consider conditions on the parameter space under which a converse holds: existence of an unbiased estimator or existence of an estimator with arbitrarily small bias is necessary for identification and consistent estimation. Results of this kind provide a path towards developing non-identification results which apply for a range of models. In addition, these results aid in the development of positive identification results because they clarify when it is enough to look for unbiased estimators of the individual-specific parameters.
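The well-known sufficiency direction can be sketched as follows. The notation $\hat{\mu}_i$ for an estimator computed from individual $i$'s observations, and the assumptions of i.i.d. sampling across individuals and a finite first moment, are introduced here for illustration.

```latex
% Assumption: \hat{\mu}_i depends only on individual i's observed data
% and is unbiased conditional on the individual-specific parameter.
\begin{align*}
\mathbb{E}[\hat{\mu}_i \mid \mu_i] = \mu_i
  \quad &\Longrightarrow \quad
  \mathbb{E}[\hat{\mu}_i]
    = \mathbb{E}\bigl[\mathbb{E}[\hat{\mu}_i \mid \mu_i]\bigr]
    = \mathbb{E}[\mu_i], \\
\frac{1}{n} \sum_{i=1}^{n} \hat{\mu}_i
  \;&\xrightarrow{\;p\;}\; \mathbb{E}[\hat{\mu}_i] = \mathbb{E}[\mu_i]
  \qquad (n \to \infty,\ T \text{ fixed}).
\end{align*}
% The first line shows E[\mu_i] is a functional of the distribution of
% observables (identification); the second is the law of large numbers.
```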