I am an econometrician studying data-driven decision making and causal inference in markets. My research designs experiments and empirical methods for policy analysis and optimization in the presence of strategic incentives, equilibrium effects and dynamic effects.
Efficient Estimation of Causal Effects under Interference through Designed Markets.
It is often useful for a decision-maker to evaluate the effect of an intervention that affects agent preferences on allocations from a centralized mechanism. For example, if a policymaker aims to decrease socio-economic segregation in schools, they might evaluate whether informing low-income families about school performance raises their access to good-quality schools. We define this type of counterfactual as a Global Treatment Effect (GTE) in a general model of causal inference under interference. When interference operates through an allocation mechanism that is truthful and has a cutoff structure, the target estimand can be defined in the asymptotic limit as a moment condition model with missing data. Under a selection-on-observables assumption, we propose a two-step doubly-robust estimator for the GTE and show that it is asymptotically normal with variance that meets the semi-parametric efficiency bound. We derive a theory of heterogeneous treatment effects in this setting, including an estimation method for the optimal targeting rule. We use these methods to analyze the effect of an information intervention in the Chilean school assignment system.
Revise & Resubmit at the American Economic Review
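As a rough illustration of what "two-step doubly-robust" means, the sketch below implements the textbook augmented inverse-propensity-weighted (AIPW) estimator for an ordinary average treatment effect without interference. It is not the GTE estimator from the paper. The function name `aipw_ate` and its arguments are hypothetical, and the first step (fitting the propensity and outcome models) is assumed to have been done elsewhere.

```python
import numpy as np

def aipw_ate(y, w, e_hat, mu1_hat, mu0_hat):
    """Textbook AIPW estimate of an average treatment effect.

    y        observed outcomes
    w        binary treatment indicators (0 or 1)
    e_hat    first-step estimates of P(W = 1 | X), strictly inside (0, 1)
    mu1_hat  first-step estimates of E[Y | X, W = 1]
    mu0_hat  first-step estimates of E[Y | X, W = 0]

    Illustrative only: this ignores interference and is not the GTE
    estimator described in the abstract.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    e_hat = np.asarray(e_hat, dtype=float)
    mu1_hat = np.asarray(mu1_hat, dtype=float)
    mu0_hat = np.asarray(mu0_hat, dtype=float)

    # Second step: outcome-model contrast plus inverse-propensity-weighted
    # residual corrections for the treated and control arms.
    scores = (
        mu1_hat - mu0_hat
        + w * (y - mu1_hat) / e_hat
        - (1.0 - w) * (y - mu0_hat) / (1.0 - e_hat)
    )
    estimate = scores.mean()
    # Plug-in standard error from the sample variance of the scores.
    std_error = scores.std(ddof=1) / np.sqrt(scores.shape[0])
    return estimate, std_error
```

The estimate remains consistent if either the propensity model or the outcome model is correctly specified, which is the sense in which such estimators are called doubly robust.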
We introduce a stochastic model of potential outcomes in market equilibrium, where the market price is an exposure mapping. We prove that average direct and indirect treatment effects converge to interpretable mean-field treatment effects, and provide estimators for these effects through a unit-level randomized experiment augmented with randomization in prices. We also provide a central limit theorem for the estimators.
Revise & Resubmit at Management Science
We consider the problem of learning how to optimally allocate treatments whose cost is uncertain and can vary with pre-treatment covariates. This setting may arise in medicine if we need to prioritize access to a scarce resource that different patients would use for different amounts of time, or in marketing if we want to target discounts whose cost to the company depends on how much the discounts are used. Here, we show that the optimal treatment allocation rule under budget constraints is a thresholding rule based on priority scores, and we propose a number of practical methods for learning these priority scores using data from a randomized trial. Our formal results leverage a statistical connection between our problem and that of learning heterogeneous treatment effects under endogeneity using an instrumental variable. We find our method to perform well in a number of empirical evaluations.
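The thresholding rule described above can be sketched as follows. The sketch assumes that each unit's priority score is the ratio of its estimated treatment effect to its estimated cost, which is an assumption made here for illustration; the abstract does not spell out the score. The function name `allocate_by_priority` and its inputs are hypothetical, and the effect and cost estimates are taken as given.

```python
import numpy as np

def allocate_by_priority(effect, cost, budget):
    """Treat units in decreasing order of effect / cost until the budget binds.

    effect  estimated treatment effect per unit
    cost    estimated (strictly positive) cost of treating each unit
    budget  total cost that may be spent

    Assumption for illustration: priority score = effect / cost.
    Returns a boolean treatment mask and the total cost spent.
    """
    effect = np.asarray(effect, dtype=float)
    cost = np.asarray(cost, dtype=float)
    if np.any(cost <= 0):
        raise ValueError("costs must be strictly positive")

    score = effect / cost
    order = np.argsort(-score, kind="stable")  # highest priority first

    treat = np.zeros(effect.shape[0], dtype=bool)
    spent = 0.0
    for i in order:
        # Stop at the threshold: no benefit left, or budget exhausted.
        if score[i] <= 0 or spent + cost[i] > budget:
            break
        treat[i] = True
        spent += cost[i]
    return treat, spent


if __name__ == "__main__":
    mask, spent = allocate_by_priority(
        effect=[3.0, 1.0, 2.0, -1.0], cost=[1.0, 1.0, 4.0, 1.0], budget=2.0
    )
    print(mask, spent)
```

Stopping at the first unit that does not fit keeps the rule a pure threshold on the score, rather than a knapsack search that might skip ahead to cheaper, lower-priority units.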
Winner of the MIT Sports Analytics Research Competition 2020
We study the problem of a planner who wants to reduce inequality by awarding prizes to the worst contestants in a tournament without incentivizing shirking. We design an approximately optimal, incentive-compatible mechanism that targets low-ranked contestants based on the tournament's history up to an endogenous stopping time. We describe applications to eligibility for remedial education, retraining benefits for the unemployed, and draft lotteries in sports.
Treatment Allocation with Strategic Agents. Accepted at Management Science. 2023.
There is increasing interest in allocating treatments based on observed individual characteristics: examples include targeted marketing, individualized credit offers, and heterogeneous pricing. Treatment personalization introduces incentives for individuals to modify their behavior to obtain a better treatment. This shifts the distribution of covariates, which means the Conditional Average Treatment Effect (CATE) now depends on how treatments are targeted. The optimal rule without strategic behavior allocates treatments only to those with a positive CATE. With strategic behavior, we show that the optimal rule can involve randomization, allocating treatments with less than 100% probability even to those with a positive CATE induced by that rule. We propose a sequential experiment based on Bayesian Optimization that converges to the optimal treatment rule without parametric assumptions on individual strategic behavior.
Using Wasserstein Generative Adversarial Networks for the Design of Monte Carlo Simulations (with Susan Athey, Guido Imbens, and Jonas Metzger). Forthcoming at Journal of Econometrics. 2021.
We discuss using Wasserstein Generative Adversarial Networks (WGANs) as a method for systematically generating artificial data that closely mimic any given real data set without the researcher having many degrees of freedom. We apply the methods to compare twelve different estimators for average treatment effects under unconfoundedness in three different settings.
Latent Dirichlet Analysis of Categorical Survey Expectations (with Serena Ng). Journal of Business and Economic Statistics. 2022.
We propose using a Bayesian hierarchical latent class model to summarize and interpret observed heterogeneity in categorical expectations data. We show that the statistical model corresponds to an economic structural model of information acquisition, which guides interpretation and estimation of the model parameters.
Causal Estimation of User Learning in Personalized Systems (with David Jones, Jennifer Brennan, Roland Nelet, Vahab Mirrokni and Jean Pouget-Abadie). The Twenty-Fourth ACM Conference on Economics and Computation (EC'23). 2023.
In online platforms, the impact of a treatment on an observed outcome may change over time as 1) users learn about the intervention, and 2) the system's personalization, such as individualized recommendations, changes over time. We introduce a non-parametric causal model of user actions in a personalized system. We show that the Cookie-Cookie-Day (CCD) experiment, designed for the measurement of the user learning effect, is biased when there is personalization. We derive new experimental designs that intervene in the personalization system to generate the variation necessary to separately identify the causal effect mediated through user learning and personalization. Making parametric assumptions allows for the estimation of long-term causal effects based on medium-term experiments. In simulations, we show that our new designs successfully recover the dynamic causal effects of interest.