Overcoming the Winner’s Curse: Leveraging Bayesian Inference to Improve Estimates of the Impact of Features Launched via A/B Tests

Conference on Digital Experimentation (CODE) at MIT (2024)
Draft, October 2024

Abstract: Many data-driven companies measure the impact of product groups and allocate resources across them based on the estimated impacts of features they launch via A/B tests. In this paper, we show that, when based on a standard frequentist estimator of the impact of features, this practice can significantly overstate the impact of product groups and distort the allocation of resources. When this practice is instead based on a Bayesian estimator of the impact of features, these problems do not arise, provided the underlying prior beliefs regarding the distribution of true impacts are correctly specified. To help assess the performance of the estimators in practice, we conduct simulations, allowing for different forms of misspecification in prior beliefs regarding the distribution of true impacts. In these simulations, we find that the Bayesian estimator generally outperforms the frequentist estimator, even under certain forms of misspecification. We use both the frequentist and Bayesian estimators to measure cumulative impacts across A/B tests at Amazon, highlighting differences in their overall magnitude and their distribution across product groups.
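
The mechanism described in the abstract can be illustrated with a minimal simulation sketch. It is not the paper's simulation design. Every modeling choice here is an assumption made for illustration: true impacts drawn from a normal prior centered at zero, normally distributed measurement noise with a known standard error, a launch rule that ships a feature when its one-sided z-statistic exceeds 1.96, and a correctly specified prior for the Bayesian estimator. The function name `simulate_winners_curse` and its parameters are hypothetical.

```python
import numpy as np


def simulate_winners_curse(n_tests=20000, prior_sd=1.0, noise_sd=2.0,
                           z_crit=1.96, seed=0):
    """Compare summed frequentist and Bayesian estimates over launched features.

    Assumed setup (illustrative, not taken from the paper):
      true impact      ~ Normal(0, prior_sd^2)
      A/B estimate     = true impact + Normal(0, noise_sd^2)
      launch decision  = estimate / noise_sd > z_crit
    """
    rng = np.random.default_rng(seed)
    true_effect = rng.normal(0.0, prior_sd, n_tests)
    estimate = true_effect + rng.normal(0.0, noise_sd, n_tests)

    # Selecting on a large estimate also selects on favorable noise,
    # so the raw estimates of launched features are biased upward.
    launched = estimate / noise_sd > z_crit

    # Normal-normal posterior mean with a zero-mean prior: shrink the
    # raw estimate toward zero by the signal share of total variance.
    shrink = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    posterior_mean = shrink * estimate

    return {
        "n_launched": int(launched.sum()),
        "true_total": float(true_effect[launched].sum()),
        "frequentist_total": float(estimate[launched].sum()),
        "bayesian_total": float(posterior_mean[launched].sum()),
    }


if __name__ == "__main__":
    for name, value in simulate_winners_curse().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions, the summed raw estimates of launched features exceed their summed true impacts, while the summed posterior means track the true total closely. Changing the prior used in `shrink` so that it no longer matches the distribution generating `true_effect` is one simple way to explore misspecification.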

The Effect of SNAP on the Composition of Purchased Foods: Evidence and Implications

with Justine Hastings and Jesse Shapiro
American Economic Journal: Economic Policy 13, no. 3 (2021): 277-315
Manuscript; Appendix

Abstract: We use detailed data from a large retail panel to study the effect of participation in the Supplemental Nutrition Assistance Program (SNAP) on the composition and nutrient content of foods purchased for at-home consumption. We find that the effect of SNAP participation is small relative to the cross-sectional variation in most of the outcomes we consider. Estimates from a model relating the composition of a household’s food purchases to the household’s current level of food spending imply that closing the gap in food spending between high- and low-SES households would not close the gap in summary measures of food healthfulness.

Does Punishment Compel Payment? Driver’s License Suspensions and Fine Delinquency

Draft, March 2020

Abstract: Many state and local governments use the threat of driver’s license suspension (DLS) to compel the payment of fines and fees. In this paper, I provide the first quasi-experimental evidence on the efficacy of such threats. Using administrative records from the City of Chicago, I estimate the effect of receiving a threat of DLS on the payment of traffic fines. To isolate the causal effect of receiving a threat, I exploit cross-sectional variation in exposure to a change in the enforcement of DLS policy in a fuzzy difference-in-differences research design. Receiving a threat of DLS increases traffic fine payment by $658 on average over the four years following receipt, representing 40 percent of the average traffic fine debt among drivers in my sample. The effect is significantly smaller among drivers with vehicles registered to higher-poverty ZIP Codes. My estimates suggest that eliminating DLS for the non-payment of traffic fines would reduce annual traffic fine revenue in the city by 4.5 percent.
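
As a rough illustration of what a fuzzy difference-in-differences design estimates, the sketch below computes a Wald-style ratio: the difference-in-differences in the outcome divided by the difference-in-differences in treatment receipt. The abstract does not give the paper's specification, so this is one common form of the estimator under assumed conditions, not the author's method. The data are simulated, and the names `wald_did` and `simulate_fuzzy_did`, the group and period structure, the treatment probabilities, and the constant effect `tau` are all hypothetical.

```python
import numpy as np


def wald_did(y, d, exposed, post):
    """Wald-style fuzzy DiD: DiD in the outcome over DiD in treatment receipt."""
    def did(v):
        return ((v[exposed & post].mean() - v[exposed & ~post].mean())
                - (v[~exposed & post].mean() - v[~exposed & ~post].mean()))
    return did(y) / did(d)


def simulate_fuzzy_did(n=200_000, tau=500.0, seed=0):
    """Simulate a two-group, two-period setting (assumed, not the paper's data).

    The share of drivers receiving a threat rises in both groups after the
    policy change, but rises much more in the more-exposed group, which is
    what makes the design fuzzy rather than sharp.
    """
    rng = np.random.default_rng(seed)
    exposed = rng.random(n) < 0.5   # more-exposed group indicator
    post = rng.random(n) < 0.5      # post-change period indicator

    # Probability of receiving a threat in each group-period cell.
    p_threat = 0.10 + 0.05 * post + 0.40 * (exposed & post)
    d = rng.random(n) < p_threat

    # Payments: group effect + period effect + constant effect of a threat.
    y = 100.0 * exposed + 50.0 * post + tau * d + rng.normal(0.0, 100.0, n)
    return wald_did(y, d.astype(float), exposed, post)


if __name__ == "__main__":
    print(f"Wald-DiD estimate: {simulate_fuzzy_did():.1f}")
```

With a constant treatment effect and parallel trends in untreated outcomes, as assumed in this simulation, the ratio recovers `tau` up to sampling noise. With heterogeneous effects, the ratio requires additional assumptions to have a causal interpretation.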