We are grateful to Rachel Glennerster and Michael Kremer for their clear and concise explanation of the contributions of randomized control trials and behavioral economics to development. Our understanding of the field has been radically improved by the focus on empirical work, which experimentalists, including the authors, have made a prominent part of the profession. But, despite all they teach us, the insights of behavioral economics are extremely hard to translate into public policy.

Randomized control trials typically tell you what would happen if you were to intervene: for instance, what would happen if the government were to provide free access to water. Randomized control trials do not tell you whether you should intervene. The answer to that question, at least with regard to government action, lies in welfare economics. According to welfare economics, governments should intervene for two basic reasons: to correct market failures and to ensure a more equitable distribution of wealth in society.

If the rationale for intervention is market failure, such as a positive externality, then the first thing we need to know is the extent to which the additional social benefit of an intervention exceeds the additional private benefit. If we vaccinate for flu, for example, the benefits go not only to the person vaccinated, but also to the others who are not exposed as a result. Yet the wedge between social and private benefits is not measured in these experiments (though, to be fair, it is rarely measured by anyone). Depending on how big it is, the policy recommendations will be very different. For example, if you give out free mosquito nets, lots more people will use them, but there is no demonstrated impact on the prevalence of malaria or anemia, as in a recent study from Orissa, India.

Because resources are always limited, policymakers also need to know about the cost of a program. And knowing the cost requires knowing how responsive users are to shifts in prices: is there a really big difference between giving something away and selling it for ten cents? If not, then the program can save money by selling a product for ten cents. The cases presented by Glennerster and Kremer suggest a range, from consumers who are unresponsive to price changes to consumers who are strongly responsive. So we do not get much guidance on policy.
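To make the arithmetic concrete, here is a minimal sketch of the comparison between giving a product away and selling it for ten cents. Every number in it is hypothetical (the population, the supply cost, and the uptake rates are illustrative assumptions, not figures from Glennerster and Kremer), and the function name is ours.

```python
def program_outcome(population, unit_cost_cents, price_cents, uptake_pct):
    """Return (users_reached, net_program_cost_cents) for one pricing scenario.

    All inputs are hypothetical illustrations, not estimates from any study.
    """
    users = population * uptake_pct // 100
    # The program pays the supply cost and recovers the price from each user.
    net_cost = users * (unit_cost_cents - price_cents)
    return users, net_cost


# Assumed setting: 1,000 households, a product that costs 50 cents to supply.
free = program_outcome(1000, 50, 0, 80)

# Assumed unresponsive demand: uptake barely falls at a price of 10 cents.
unresponsive = program_outcome(1000, 50, 10, 75)

# Assumed strongly responsive demand: uptake collapses at 10 cents.
responsive = program_outcome(1000, 50, 10, 20)

for label, (users, cost) in [("free", free),
                             ("10 cents, unresponsive", unresponsive),
                             ("10 cents, responsive", responsive)]:
    print(f"{label}: {users} users, net cost {cost} cents")
```

Under these assumed numbers, charging ten cents saves a quarter of the budget and loses few users when demand is unresponsive. When demand is strongly responsive, most of the saving comes from reaching far fewer people. Which case applies is exactly what the policymaker needs to know.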

These cases do show that individuals may not respond to subsidies as rational, utility-maximizing agents: maybe there is something especially salient to consumers about a price of zero, about getting a basic good for free. If so, what are the policy implications? There is no current standard for how big the zero-price effect has to be to justify free goods. Every good would be consumed more if it were free. Determining the right subsidy when consumers appear irrational is new terrain for welfare economics, but the mere existence of an inverse relation between prices and consumption can’t possibly be sufficient to justify any particular subsidy.

The second purpose of government intervention is redistribution. Policymakers need to know how much poor people will benefit from a particular redistributive intervention, such as a subsidy for school uniforms. This can usually be determined using methods cheaper and easier than randomized trials, and in a manner that applies to a wider range of beneficiaries than the specific context in which an experiment occurs. Furthermore, if the interventions Glennerster and Kremer discuss are ever taken to scale, they will operate in a market, as opposed to a setting where an NGO delivers a product to villagers. At the market level, individually irrational behavior can contribute to a rational aggregate—the market as a whole works as expected. Moreover, market experience reduces irrational behavior, even among teenagers shopping in American malls.

Even if experiments show us what to do, we don’t know that government intervention will be effective.

Suppose we have identified a welfare-improving rationale for government intervention—be it market failure or redistribution. The important question remains, “What should be done?” Glennerster and Kremer argue that randomized control trials tell you what government should and should not subsidize. But this assumes the government’s intervention will, in practice, improve welfare. Should we be so confident?

Many of the poor outcomes we see in development today arise from badly performing government institutions that are resistant to change. Provider absenteeism and low-quality public facilities are everywhere. In India this inefficacy has led to a widespread movement away from free schools and clinics and toward the private sector, even as subsidies to public institutions have increased. Some of the experiments the authors describe and the policies they imply have proven impossible to implement because of government behavior. As Glennerster and her colleagues found, checking absenteeism among health workers and teachers in India with time-stamp machines failed because the workers broke the machines.

Randomized control trials can actually help overcome some kinds of government failure by publicizing results and building public support for interventions on behalf of the poor. There is some evidence that the publication of the robust results of Mexico’s conditional cash transfer scheme, Progresa, helped avoid elite capture, especially when the party in power changed.

But the general point is that before overriding the behavior of individuals, we should have a good idea of the relevant policymakers’ likely behavior. Individuals are myopic, but so are politicians, who might look no farther than the next election or possible coup. The best health or education intervention that economists can devise to improve people’s behavior may not be practical given the real-life officials dealing with real-life doctors and teachers.

Finally, because the experimental approach is based on what is feasible to randomize, it may lead us in the wrong direction. Experimental interventions must be completely under the control of the experimenter. Traditional project evaluation would evaluate the actual policy under the control of the policymaker. But, in contrast to the experimenter, the minister of health does not check to make sure mosquito nets are being used. She makes pronouncements and issues budget figures and government orders. Whether these pronouncements affect reality is not completely under her control. In the experimental approach, the necessity of knowing when an intervention is actually undertaken usually requires that it be easily controlled and monitored, narrowing the scope of possible activity to assessment of private goods, such as mosquito nets, chlorine pills, and school uniforms. Yet those interventions easiest to control and measure may be the ones least in need of government effort.