Wow. As a former social scientist with an axe to grind this hits hard.
I'd like to provide the HN community with some context as to what this means.
There are some 300 “research” departments in each of the major social sciences: psychology, sociology, economics and anthropology. If you believe what they say, about half of their mission is teaching and the other half is research. That’s a lot, tens of billions of dollars.
The nudge findings were among the few to not only reach the level of public knowledge but, more importantly, to directly influence public policy. To use the one I'm most familiar with: the so-called default for defined-contribution retirement plans, e.g. the 401(k). The governing regulations assumed, for good reason, that maximizing contributions was in the public interest. Based on the nudge findings, after much debate and effort, they were updated so that the maximum-contribution option was pre-selected on enrollment forms, in the belief that more individuals would opt for it rather than contribute zero.
So far so good, right? In fact, nudge has become a canonical example in introductory public policy courses of how this research can, in some sense, make things better.
This meta-analytic finding turns on the authors’ method for measuring publication bias. Because I accept that, I must believe that this entire body of research, probably the signal behavioral economics work, is essentially worthless! Thus, all that effort has not only been wasted but the credibility of social science in general is damaged.
Adding this to the well-known gamesmanship in peer review, debates over tenure, etc. means it's past time to reform a large chunk of academia.
Isn't this just one specific analysis using a very narrow definition of "nudge", one that doesn't even begin to encompass the work being done at those "300 'research' departments"?
Further, isn't this using the same data as the original meta-analysis, which did find an effect "of small to medium size"[0]?
Why would this, alone, undo decades of research and clear, bright-line conclusions such as the ones cited in my sibling comments? In other words, why is this letter the final word on the topic of "nudge", to you, and not the original meta-analysis? Sounds like you think everyone should pack it up and go home, all because of one letter using an alternative set of definitions and analysis.
Just seems like an overreaction on your part, especially given how vocal and... you-sounding (for lack of a better term) the "anti-nudge" crowd often is.
To take a wider view, a comment like yours is a more malicious form of nerd-sniping[1], especially on HN. Claim to have relevant credentials, voice a contrarian-but-popular-here opinion, and make a wild conclusion to give those reading it a feeling of "inside baseball."
I think social scientists have lost the right to the benefit of the doubt. They don't preregister their trials. They don't publish their negative results. They're notoriously bad at statistics, notoriously let their political beliefs distort their conclusions, notoriously scatter their work over thousands of arbitrary, hard-to-compare small-sample-size studies instead of concentrating their resources. I don't think it matters if this criticism is right or not. The fact that it's even possible to make such a criticism is already a condemnation of the scientific incompetence of social scientists.
First of all, preregistration is not a requirement for the scientific method, which has functioned well for centuries. That is a recent trend in response to the overflowing amount of haphazardly published science.
Second, it is up to the individual scientist to decide to preregister or not. Some social scientists may preregister.
Third, small sample sizes may be a fair critique; however that overlooks how difficult it is to collect such data.
You've made a lot of generalizations here that amount to, "social scientists aren't as rigorous as other areas of science, therefore we should only believe studies that disagree with their results". I don't think throwing the baby out with the bath water is helpful. You can take results of studies with small sample sizes with a grain of salt, watch for replication, etc. Lambasting the field as a whole doesn't make sense to me.
Finally, readers should note that this isn't a new argument. People have been making this claim about social science for 120 years, if not longer, but at least since Freud and contemporaries began publishing.
> You've made a lot of generalizations here that amount to, "social scientists aren't as rigorous as other areas of science, therefore we should only believe studies that disagree with their results".
I think it’s more “ignore them completely.” It brings “science” into disrepute to let social science associate with the other sciences.
> I don't think throwing the baby out with the bath water is helpful.
There is no baby!
> People have been making this claim about social science for 120 years, if not longer, but at least since Freud and contemporaries began publishing.
Doesn’t that prove the point? It wasn’t science then and isn’t science today.
> It brings “science” into disrepute to let social science associate with the other sciences.
So you just want it renamed to "social studies" or what? What is your proposal, that nobody research this topic, or that they be separated in journals etc? I doubt that will have much impact on whether it makes the news. If you want that to change, you may need to get yourself onto the board of a journal you care about.
> There is no baby!
That's reductive. Just because you don't see the baby doesn't mean it doesn't exist.
> Doesn’t that prove the point? It wasn’t science then and isn’t science today.
No, it just proves it's an old disagreement, like nature vs nurture.
There is plenty of work in social science that contributes to humanity. It will always have smaller sample sizes due to the nature of collecting the data. The work can be considered useful nonetheless.
> So you just want it renamed to "social studies" or what? What is your proposal, that nobody research this topic, or that they be separated in journals etc? I doubt that will have much impact on whether it makes the news.
Or maybe people just want to shine light on the fact that social science is harder than other science for a bunch of different reasons, bring social scientists' attention towards the tools that help mitigate this, and bring the journals that seek profit over reliable results into disrepute?
Most of the papers published in current social science journals are not science, and this has been a problem for those 120 years precisely because the techniques used in chemistry or physics are inadequate for the problem domain, so applying them blindly does not produce scientific outcomes.
> So you just want it renamed to "social studies" or what? What is your proposal, that nobody research this topic, or that they be separated in journals etc
Yes. And that the rest of us stop treating it as science, citing it as science, and relying on it as science.
For example, there is a major trend in the law of treating social sciences as having truth value the way real sciences do. That’s the kind of thing we need to stop doing.
I found the hard/soft science discussion here [1] to be informative.
It seems unlikely that all of social science will one day be declared as "not science". Aristotle's methods, for example, did not require a certain sample size.
You're applying far too strict of a definition to science. Basic forms of science can be practiced by a child at home. Journals publish more in-depth analyses, and it's up to them what to publish, at the risk or reward of gains and losses of readership.
> You're applying far too strict of a definition to science.
This isn’t a fun theoretical exercise. In the public sphere, “science” tends to get invoked with dispositive weight. And it should in many cases. But for that to work, “science” must meet the level of rigor people associate with “science.” To be called “science” it should be like physics in terms of providing truth value, not psychology.
Aristotle didn’t do “science.” He was a philosopher. His ideas were precursors to science, but weren’t science.
> To be called “science” it should be like physics in terms of providing truth value, not psychology.
It's possible to form a truth about human behavior, for example, I did X because Y. If someone points my behavior out to me, then in the future I may do Z in response to Y. Human behavior can change upon observation [1]. That doesn't make the study of human behavior "not science" in my opinion.
> Aristotle didn’t do “science.” He was a philosopher. His ideas were precursors to science, but weren’t science.
In that case, I suppose you will acknowledge that Galileo did science, despite not having a lot of data points. I think Aristotle did too, because I draw the line at hypothesis, observation and conclusion, which may or may not result in some ultimate truth that remains constant.
I suppose you would also say that the double-slit experiment is science. Yet that result changes, and there is no truth value that we can explain, except by noting that the result changes when the experiment is observed.
> science: the intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through observation and experiment. [1]
Given the definition, why wouldn't you consider observing human behavior to be science?
It's not useful. In fact, it's actively harmful. The constant churn of findings and retractions undermines the public's belief in science. When people stop believing in science, you get things like flat earthers, global warming deniers, and vaccine skeptics.
Humanity is observable. It's just hard to collect the data. I think you can make the case that some science isn't as rigorous, or that the jury is still out, but to say that it isn't science at all is wrong in my opinion. Even Feynman acknowledges that conclusions may be drawn later.
Aristotle philosophized quite a bit and is considered an early contributor to the scientific method. More food for thought:
>First of all, preregistration is not a requirement for the scientific method, which has functioned well for centuries.
As has been said before, the problem is that the scientific method is only right eventually. It can and often does get stuck for decades at a time, if someone with, shall we say, durable beliefs gets tenure, amasses political power and shoves their rivals out of a field. The amyloid hypothesis is just the most recent example.
Modern metascience practices (preregistration, blinding, banning "garden of forking paths" subgroup analyses, demanding stricter p-value thresholds and larger n) don't replace the scientific method, they're supposed to speed it up! But, by definition, these are all political issues, so they attract political arguments.
> It can and often does get stuck for decades at a time,
I think this is just more data. If we're all wrong for a longer time, then the impact will be more clear.
I agree that modern additions like preregistration are helpful. I only wanted to remark that it is not a prerequisite for science.
> by definition, these are all political issues, so they attract political arguments.
People are good at gaming systems. We are naturals at recognizing patterns and will adjust our behavior to meet our goals. In that sense, social science may be targeting a moving object, almost like the difference between observing and not observing the particles in a double-slit experiment.
As hard as it may be in social science, the process of hypothesizing, observing and forming conclusions is still science. For some, it appears that is not science because a definite conclusion never arrives.
Which viewpoint is correct? I think it's up to you to decide. And, when you don't grant people that choice, you get an anti-science response, because people naturally reject being told what to think. Science, for me, is about asking questions, not necessarily arriving at a definitive result.
> however that overlooks how difficult it is to collect such data.
It is very difficult in physics as well. Do you know how hard it is and how much effort is involved in building the LHC? Or LIGO? Or the JWST? Or ITER? They take billions of dollars, thousands of scientists, and decades to plan and build before you even get science data. Science is hard! You need to put the work and effort in, because otherwise you can't say anything about the nature of things.
>> I don't think it matters if this criticism is right or not.
> Okay, you and I care about very different things.
Clearly it is being suggested that it doesn't matter with respect to the social studies departments being in dire need of drastic reform. If you don't care about that either way, why are you commenting on this thread? Do you actually care one way and are just "point scoring" to further the argument? Something else? Am I misreading something that I think I'm reading clearly?
So are social studies departments and funding in dire need of reform in your opinion? How does the "correctness" in your view of the original submission affect that need or non-need?
and yet here you are responding in a thread with the conclusion clearly stated:
>means it’s past time to reform a large chunk of academia.
And you're taking exception to it, but now just claiming you're actually not, that these one-sentence "you're wrong" responses are really something much more modest.
Out of interest do you work in the field? Have ties as a graduate to one of the departments? Or are you completely disinterested when assessing the research?
If I have an interest, it's that quality research is performed that advances human knowledge with some kind of efficiency in the resources spent, i.e. fund something that is being done properly and well to some effect, over something that has been shown to be run by those who seem to be utterly incompetent or fraudulent shysters, or something else that engenders zero confidence.
Hey that's just what I said to you! Great minds etc.
Hitch your wagon to Cass Sunstein et al. by all means... Famously successful and successfully famous. What else do you need to know? ESP might be a thing too, nobody has proved it isn't. But we have shut down the research departments at universities involved in that BS...
I agree. I have worked at 5 separate tech companies and have conducted hundreds of statistically significant experiments changing what people select by default. These methods have been effective at helping hundreds of millions of people improve their choices. It's common practice, think: pricing on Amazon, default tips on Uber, default purchase price in a video game, etc.
I don’t see how this would make me reinterpret all those successful results. Maybe I don’t understand what this is saying.
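For concreteness, here is roughly the shape of the default-change tests I mean. This is only a sketch: the opt-in counts are invented, and it's a plain two-proportion z-test, not any particular company's numbers or tooling.

    # Rough sketch of a default-change A/B test; counts are invented.
    from statistics import NormalDist

    def two_proportion_ztest(success_a, n_a, success_b, n_b):
        """Two-sided z-test for the difference between two proportions."""
        p_a, p_b = success_a / n_a, success_b / n_b
        p_pool = (success_a + success_b) / (n_a + n_b)
        se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
        z = (p_a - p_b) / se
        p_value = 2 * (1 - NormalDist().cdf(abs(z)))
        return p_a - p_b, z, p_value

    # Treatment: option preselected by default; control: no preselection.
    lift, z, p = two_proportion_ztest(4300, 10_000, 3900, 10_000)
    print(f"lift = {lift:.3f}, z = {z:.2f}, p = {p:.2g}")

With made-up counts like these, the lift is obviously "significant"; the interesting question upthread is whether such results generalize beyond each individual product.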
How much of this is “nudging” vs. “clearly explaining trade-offs”?
I’ve not done any rigorous research, but I’ve participated in projects that resulted in dramatic shifts towards customers choosing what the dev team thought was the “best” outcome, just by altering wording, or making “dangerous” choices harder (such as by requiring more clicks to enable).
One interpretation is that it is very hard to extract value from the nudge literature. When reading research articles, one must estimate and adjust for biases. This adjustment shifts the p-values by unclear amounts. So a positive result may just be a fluke.
As for your own AB tests, you have seen the processes that go into them and do not need to adjust for unknown biases. So when they demonstrate a nudge effect, you can believe it.
In the example you cited I don't think that's a nudge? Or is it?
I ask because I am sure that changing defaults DEFINITELY works, especially if the user does not have a strong existing preference.
You're not really changing user behaviour most of the time, you're changing the outcome of what they're trying to do, which is to reduce their cognitive load by ignoring as much as they possibly can.
Another poster said they must have a pretty specific definition of nudge, because it defies credulity that defaults don't change outcomes, if only because half the time I don't read the defaults or know where to find them.
I mean, I just found out two weeks ago you could change the Hacker News banner color. Are you telling me I'm in a statistically insignificant minority of Hacker News users?
Also how many settings are there in the average application, you can’t tell me most users go through all of those settings to get exactly what they want.
Defaults very clearly work in matters such as consent to organ donation. In countries where you need to opt out of organ donation, few people bother to do so.
Another question is whether this increases the total amount of successful donations. I was looking around for studies and found this one [1], which basically says "in some countries, yes".
I've heard people argue that that effect isn't a nudge, it's deceit.
That is, all you're doing is tricking people who didn't read carefully. People don't know they've opted in and would opt out if you called and told them that they checked the box.
I find it generally plausible that defaults don't matter much for what people consider very important decisions. I have minimal experience in this area, though.
There is also some research suggesting that defaults in organ donation (so called presumed consent) may decrease rates of actual donations in those countries. I can't find the original podcast where I heard about it (I assume related to either Planet Money or Freakonomics), but found this source:
That's not a conspiracy. When two companies, or countries, or individuals have business dealings on many different levels, a lot of things can be negotiated at the same time.
It seems quite possible for two things to be true: 1) The common sense notion that manipulation works; and 2) Social science couldn't find the signal above the noise.
> Some other poster posted that they must have a pretty specific definition of nudge, because it defies credulity that defaults don’t change outcome if only because half the time I don’t read the defaults or know where to find them.
My pet theory is that these results hinge on "does it scale?"
Like, yes, you can do nudges and see behavioral changes. But what about when everyone is doing it constantly? Then people will get fatigued and form countermeasures.
Imagine this dynamic in another context:
“Guys, guys check this out, people are guaranteed to buy your product if you show arguments for it to random people!”
But, oops, centuries of marketing later, advertising isn’t automatically effective enough to cover its costs, people don’t automatically believe the ads.
I'm guilty of not reading this paper in any detail but it feels that the default setting "nudge" idea should work as described. So if you e.g. nudge people by setting up a pension plan by default (that they can opt out of) does that seriously fail to cause more people to have a pension? Or is this claiming something else?
Another similar example is jurisdictions that switched to assuming an individual consents to organs donation when they die, rather than having an opt-in system, see much higher rates of organ donation.
The hacker news banner color doesn't matter and few have ever wanted to change it. But your financial position and needs, what % of your salary you can afford to money-hole until retirement, does matter and is pretty individual. It doesn't defy credulity to me that generally people would make a choice about this (when can I retire?), and that the default doesn't influence it.
I grant that it would be surprising if it had no influence at all, but I think the effect is more the social signal that you should want to save the max, that your neighbors probably do (it's the default after all), etc., rather than people completely ignoring/missing it.
I believe you. If you know it influenced you though, that means you didn't ignore it or not even realize you could change it, which is the idea I was replying to.
Again, it would be surprising if it didn't matter at all, but not unimaginable. What you're saying is that almost everybody in your company would have contributed a lesser amount if not for the default. It means you can all afford to give up $20k or whatever in income this year. There are other factors.
There is no truth to the matter of "whether defaults change behavior". This thread started about 401ks and was then taken into color preferences on the web. If someone with a gun to their head is asked whether they want to die, I'm sure we'll agree that whatever the default is doesn't matter. Whether defaults do anything depends on what we're talking about. Nudges might work in web UX but not economics; why is that so hard to believe?
I'm not sure it is useful to lump nudging on decisions users don't want to or don't know how to make in with nudging on decisions users either want to or have to make.
It's a no brainer that defaults will alter outcomes for users who aren't willing or capable of making a selection for the choice in question.
Yeah, I am also confused by the statement that nudge theory doesn't replicate, and I'm afraid that statement won't replicate haha. Or rather, there are basic, indisputable findings with mind-boggling effect sizes showing that countries with different defaults for organ donors have different donation rates.
Now, you can say all day long that those aren’t causal studies, but there is just no way that confounding factors like different cultures explain it, because cultures just aren’t sufficiently different, or rather cultures that are otherwise pretty similar have vastly different donation rates.
A lot of the replication crisis, imo, is just realizing that landmark studies were underpowered. That is, they don't prove what they meant to prove, but that is very different from whether the effect exists; an effect may exist yet be hard to prove, and social scientists are rarely rigorous in study design, both from training and from the inherent difficulty.
Nudges are often imagined as just how choices are presented, but yes the default option is considered part of nudge theory. As also is social proof ("Your friends picked this choice").
also, the question is how much structural elements influence outcomes (not merely decisions), not whether they do or not. that’s the extra complexity of a social system built atop a biological system built atop a chemical environment built atop a physical one. we’re complicated. physics is nigh child’s play in comparison.
This seems to be the standard response to anything that seeks to debunk nudge. Any time you say ‘This example of a nudge doesn’t work / isn’t replicable / isn’t actually socially helpful’ someone will say ‘Ah but that’s not really nudge tactics.’
Also the other way will come up in these kinds of "arguments" without doubt.
"How can these horrible critics say nudges don't work? Have they never been nudged with a loaded revolver? Can they not imagine that working?"
Those of us who don't follow the controversy closely in the field, but are interested in what has actually been found out and understood about the world with some degree of confidence across many fields of study, are left scratching our heads, unable to see through the viewing window for all the mud getting flung.
401k is a good example. I have had one for my whole career, but if there had been a form at any point asking me how much of my pay I wanted to contribute, I would have said 0, because I prefer cash in hand over cash someday, and all the b.s. health insurance is already taking a lot. But the 401k doesn't bother me enough to change the default, so I leave it be as some kind of rainy day fund. I didn't like paying the penalty to withdraw it; unless I turn 65, it will always be worth significantly less than it says on paper, and I am not even convinced it is beating inflation. My point is, just because people don't change the default, it does not mean they have accepted it or like it; that is an incorrect conclusion.
Food is another example, I like cheese sometimes but when there is an option for it I take it out of the food most times but I won't go out of my way to ask for its removal otherwise, this has a real health impact.
If you're worried about beating inflation, have a long time horizon, and don't mind some risk, you might look into investing in total market index funds. An index fund for the S&P 500 has averaged ~10% returns when looking back 30 years [1].
If you're just concerned about inflation, don't like risk, and don't mind locking your money up for a little bit, Treasury Inflation Protected Securities [2] are also a thing. Their returns are tied to the official measurement of inflation (CPI).
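For a rough sense of scale, a small worked example of the compounding involved. The rates and starting amount are illustrative assumptions only, not advice or a forecast:

    # Illustrative compounding only: nominal vs. inflation-adjusted growth
    # of $10,000 over 30 years at assumed average rates.
    principal = 10_000
    years = 30
    nominal_rate = 0.10   # assumed average nominal return
    inflation = 0.03      # assumed average inflation

    nominal_value = principal * (1 + nominal_rate) ** years
    real_rate = (1 + nominal_rate) / (1 + inflation) - 1   # ~6.8% real
    real_value = principal * (1 + real_rate) ** years

    print(f"nominal: ${nominal_value:,.0f}  in today's dollars: ${real_value:,.0f}")
    # roughly $174k nominal, about $72k after inflation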
> This meta-analytic finding turns on the authors’ method for measuring publication bias. Because I accept that, I must believe that this entire body of research, probably the signal behavioral economics work, is essentially worthless!
I strongly disagree with this statement, even as someone who believes “nudge” effects are wildly overblown.
It means “these studies failed to find evidence” - NOT that there is nothing to find.
The distinction is important because, as it turns out, the policies that the research influenced did work, in many cases. 401k contributions did go up, in many cases. More people became organ donors. More Europeans got stronger privacy protections.
“The power of defaults” is such a cliche because, in many cases, it works.
The problem with these studies is overstating the effect - not spewing worthless BS.
Let me defend and elucidate. Worthlessness I would define here relative to the amount of public and policy attention the nudge findings have received versus the net value added, as modified by these results.
Perhaps due to the PR efforts of leading researchers, it was much more than "set defaults intelligently." The interpretations were more like: we can use social science to shape people's behavior at the margins. Further, these marginal changes would accumulate into substantive and lasting societal improvement.
On reflection, it seems to me that the value of this paper stems from its attempt to measure or quantify publication bias. In this case, the bias was positive, in the direction of studies confirming nudge effects.
Taking that a step further implies that the actual net nudge effects across published and unpublished studies were statistically and therefore substantively insignificant. Hence the use of the term worthless, i.e. non-findings.
To say that it is costless to implement a nudge scheme in the behavioral economics sense is simply untrue. In the retirement case it required a lengthy ethical and legal debate; study and political argument as to the best outcome, which is in part a redistributive question; hard costs associated with revising or developing messages and other materials; etc.
Worse, I believe, is the damage done from attention and action predicated on now seemingly faulty social science. What could've been done instead, and what will happen the next time a social scientist claims an "easy" way to make things better, are costs.
> statistically and therefore substantively insignificant
This is not what statistical significance implies. This misunderstanding, and its inverse, leads to the very errors for which you criticize the "nudge" papers.
More to the broader point, "set defaults intelligently" in fact implies the ability to "shape peoples' behavior at the margins." Otherwise, why bother thinking about them?
That's why what is actually at issue with "nudges" is effect size & context: how much of a difference can we have, and where?
And to that question, this paper provides little insight. It aggregates too much & ignores real-world policy evidence.
Now, it's still a good paper - people have gone WAY overboard with nudges in silly places - it just needs to be understood as "let's rein in expectations" and not "this field is bunk"
I think the issue here is using science / evidence to push for policy changes when there isn't actually sound science or evidence. That can be done with sound policies that work just as well as it can be done with bad policies. But we should always be concerned when unsound science gets used. It can be used to shut down valid policy debates. And eventually, on a long enough time line, it will get abused by bad actors.
Can these effects be explained without inventing a new term? Because if they can then these studies didn’t really find anything did they?
Whenever I see a new term being introduced as an explanation I am hesitant to accept it, as it may turn out to be like explaining the planetary motions with epicycles, when the motions can be more easily explained by moving the sun to the center of the solar system instead of the earth.
Not very scientific, but isn't it just laziness? Most people (including me) are too lazy to think about all the choices they could make, so they just stay with the default choice most of the time. Not because they actually prefer it, mainly because they never even read it.
Indeed. Your alternative theory doesn’t require an extra construct, and instead uses a pretty established cognitive behavior (the tendency of inaction) to explain the same phenomenon. I would say your explanation has the advantage of Occam’s razor, whereas Nudge Theory doesn’t.
That we need to “create” the idea of a “nudge effect” when it’s clear people take on commonly encountered social behaviors is bizarre.
Cognitive experience is a for loop with memory; for time spent in situation X, memory forms at rate Y. Social science solved.
Social science derives all its conclusions by studying the same old physical world as physical science. It's a restatement of science customized to cultural tradition. It's cultural tradition to overhype our specialness, selling books and big ideas, when the math is the same everywhere. Creating cultural objects out of obvious math is a commodity now.
Except that's not what the study says. Quoting a comment below, "The linked study (and the Mertens study it's built on top of) classifies defaults as 'structural' interventions. In the linked meta-analysis, after stripping out estimates of publication bias, structural interventions have the most 'robust' evidence left (95% CI of 0.00-0.43)"
I left psychology around the time that nudging was gaining traction and I haven’t really been following it. But it seems to have a couple of red flags:
First, the definition:
> A nudge is a function of (condition I) any attempt at influencing people’s judgment, choice or behavior in a predictable way (condition a) that is motivated because of cognitive boundaries, biases, routines, and habits in individual and social decision-making posing barriers for people to perform rationally in their own self-declared interests, and which (condition b) works by making use of those boundaries, biases, routines, and habits as integral parts of such attempts.
I find this definition overly permissive and overly reliant on unnecessary cognitive terms (like judgment and choice, which can be shortened to behavior) and economic terms (like rationality and self-interest). As a fan of behaviorism, this feels like an attempt to introduce epicycles into a theory that doesn't need them. This effect—if it exists—can probably be adequately explained with good old classical conditioning and conditioned reinforcement. This is the first red flag. That is not to say we can't look for specific cognitive functions which make some reinforcement contingencies more effective than others, but nudge feels a bit too general to actually be of any use in a model. It in fact reminds me of Albert Bandura's theory of self-efficacy, a theory that seems to have reached a dead end at this point.
The second red flag is the economic presuppositions. When I skim through the literature, it feels like they are putting a band-aid on the thoroughly debunked notion of Homo economicus (the belief that human individuals always behave in a rational way optimized for their own self-interest). So instead of recognizing the fact that human behavior is more complicated, what they try to do is invent a new term to counteract the instances where biases are "preventing" such a behavior pattern. I find such an effort to be doomed to fail, as—despite the persistence of economists—rational behavior means a different thing for each individual, and there is no "patch" for what economists call "biases".
How does a meta-analysis of something like this avoid, I don't know what it would be called, something like regression to the mean? A "nudge" isn't a singular thing; it's a very diverse process requiring a competent administrator. My gut says that when you average all those out, you'd see no effect, because you're experimenting: some work, some don't, some backfire. It seems like you'd have to do a meta-analysis on a specific nudge, not on groups of nudges.
They aren’t summing effects. An effect is not cancelled by an inverse effect or as you put it, backfire.
The methodology should (I haven’t investigated theirs in detail) not be susceptible to this, and I doubt a mean of effects would make it through peer review for reasons including the ones you’ve mentioned.
> Thus, all that effort has not only been wasted but the credibility of social science in general is damaged.
I don't think that's entirely true. If anything, this just highlights how complex behavioural science really is, as they're dealing with surprisingly complex humans and their surprisingly complex lives. Behavioural science is a young field.
Hah. Reform academia. Good one. When people have tenure, they'll be there teaching their version of this stuff for a long time and it's all but impossible to reform them without shutting down the departments.
In many schools, these social science departments are a favorite for the weaker students who don't really do so well with math. They're usually filled with athletes. They love to absorb pop psych results like Amy Cuddy's Power Pose and so they don't want to listen to anyone question their results with lots of meta analysis. They want some basic ideas from class in between lots of time on the playing field.
I'm afraid that their demands will far outweigh any desire to force the fields to search for absolute truth.
> This meta-analytic finding turns on the authors’ method for measuring publication bias. Because I accept that, I must believe that this entire body of research, probably the signal behavioral economics work, is essentially worthless!
This is a pretty short article, how are you confident of such a broad conclusion? What makes you that confident that this meta-analysis is decisive?
I think the worst offense of the social sciences recently is their quest to correct bias, and with it the creation of idiots who believe themselves able to do that.
Because now you have said idiots running around screaming about how terrible that bias is, completely neglecting the fact that everyone is subject to it.
Yes, nudging is an extremely well established concept, all the way from theory to policy - there's a Nobel (memorial) prize for the theory, and the UK government explicitly established a 'nudge unit' (the Behavioural Insights Team) to turn it into policy.
> Based on the nudge findings, after much debate and effort, they were updated so that the maximum-contribution option was pre-selected on enrollment forms, in the belief that more individuals would opt for it rather than contribute zero.
I would be shocked if that wasn't true though. Is there any evidence it's not true in that specific case, that pre-selecting the max options causes more individuals to opt for that? Have individuals opting for that indeed gone up since this was done?
No offense, but the research in social sciences has very little credibility. From the replication crisis to the blatant inability to research things that go against what the professors want, it's just untrustworthy. Academia needs to fix itself.
Reading this, I don't get how you can take all "nudging" and declare "No evidence". Surely "nudging" encompasses a whole range of different actual actions. Some nudges work, some don't. You can't just average across all of them.
I'm probably totally misunderstanding, but it sounds similar to saying "there is no evidence for medicine" because you've averaged all the papers describing medical interventions that work and those that don't.
I thought the point of "nudges" is that they are so cheap to implement you can easily afford to try many. Most won't work, some will.
This is my interpretation as well. Also weird to think about publication bias in this way: "these studies about the effectiveness of snake oil as a drug weren't published, so we must be overestimating how effective drugs are".
The authors do mention that there is likely to be heterogeneity in (real) effect sizes, but somehow still go with this title/abstract.
Maybe there is a valid conclusion that some of the many nudge studies are probably claiming effects that don't exist. That could be interesting in itself. But rejecting the whole field based on this kind of argument seems wrong.
While the headline might be there to catch the reader, I don't think it is wrong. The larger problem is that we had policies implemented to nudge people in certain directions. Apart from the ethical question, there needs to be hard evidence before we employ authoritarianism like that. So the headline should be pardoned, but not those who employ nudging, for the time being.
I have the same concern. Nudging is an umbrella term for a vast number of very different activities. For example, nudging is a term used for motivating more carefully designed road markings. I find it hard to believe that some of these newer designs don't "work" better than the old ones, some of them are quite ingenious. At the same time "nudging" is also used for all sorts of public policy framing issues that are more questionable and have probably harder to measure effects. As you say, each "nudge" needs to be evaluated individually.
Agree with you. Nudging is a type of user experience design. UX designers nudge with every design decision they make, and the effectiveness of those decisions is quantifiable. So it's hard to argue that all nudges are ineffective, just like one can't argue that all UX design is ineffective.
There are a number of posts that address this issue, as well as issues raised by responses to this post. I posted "no true Scotsman" somewhere--I'm not claiming my post was enlightening in itself, but the post I was responding to (and the entire thread) was, IMO, enlightening.
As for averaging, yes, you can: if a nudge is ineffective, then its result will be random, and a bunch of ineffective nudges will average zero effectiveness. The effective nudges will then push the overall average above zero. We don't see that. (The same would be true for medical interventions, unless some cause harm.)
As for being able to try lots of them: in some circumstances, maybe. But when a government is trying to nudge people towards some desired behavior (vaccination, say, to take a random example), they don't try sending out a bunch to different groups of people, then polling each of those groups to see which groups--and therefore which nudges--moved. And it's not always practical, anyway (and the vaccination example is a case where it's almost certainly not practical).
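A toy simulation of that averaging argument, with invented numbers: mix studies of truly null nudges with studies of a few genuinely effective ones, and the pooled mean of all the estimates should still sit above zero.

    # Toy illustration of the averaging argument; all numbers are invented.
    import random

    random.seed(0)
    n_studies = 1_000
    share_effective = 0.2   # assume 20% of nudges truly work
    true_effect = 0.3       # standardized effect size for the effective ones
    noise_sd = 0.5          # per-study sampling noise

    estimates = []
    for _ in range(n_studies):
        true = true_effect if random.random() < share_effective else 0.0
        estimates.append(random.gauss(true, noise_sd))

    pooled_mean = sum(estimates) / len(estimates)
    print(f"pooled mean effect = {pooled_mean:.3f}")
    # expected to land near share_effective * true_effect = 0.06, i.e. above zero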
> if a nudge is ineffective, then its result will be random
Ineffective and random are different. Ineffective means that the effect size is smaller than required.
For example, if you read "Paracetamol was ineffective for pain relief after surgery" it doesn't necessarily mean that the effect of paracetamol is unpredictable, inconsistent, immeasurable, or that it had no effect or negative effect. You would most likely interpret it to mean that the paracetamol did have an effect but it was insufficient - the patient was still in too much pain.
Similarly, if a nudge intervention was ineffective, it doesn't tell you how it failed, only that it didn't reach the threshold for success. And it certainly doesn't tell you anything about how well an aggregate of some effective and some ineffective results would perform.
I agree with other commenters that it's unlikely nudges never have an impact.
We should also be wary of high-profile debunkings, now that they're increasingly in fashion due to the replication crisis and the general dour mood. It's easy to p-hack a result into significance, but you can just as easily hack results into insignificance.
These days, both findings and debunkings need a skeptical eye.
That's different from the existence of the phenomenon.
The same thing happened to Daniel Kahneman and his 2011 book Thinking, Fast and Slow. He acknowledges that several pieces of evidence he presents in the book have not held up and can't be replicated.
He still thinks he is right, he just admits that he does not have strong evidence anymore.
What is left is a theory with less and less evidence supporting it.
I just think it's implausible that there's been no good research showing solid evidence of choice architecture mattering. Could be, I'm not an expert, but I'd like to see where things stand after a couple more years of research and debate.
I'm a UI designer, and my experience of implementing 'nudges' is that sometimes they work, and sometimes they don't.
The reality is that the way people make decisions is stupidly complex, because people have stupidly complex lives. Some tweak will work great for one project, and do nothing on the next one. It's hard to even say if it was the nudge that worked the first time.
I really view nudge theory as one of many ideas of things you can try, a tool in a toolkit. But the only tool I really feel confident works is the design-test-iterate loop.
I was working on a fintech project (gonna be vague as it's not yet released).
The legal team told us we couldn't use default choices anywhere, as it could count as giving financial advice. Fair enough. So we designed the onboarding, and there was this choice the user had to make before we could create their account.
During testing, we found people were getting really stuck on this choice, to the point of giving up. The choice actually had quite low impact, but it was really technical - a lot of people just didn't understand it. Which makes sense: our users weren't financial experts, and that was exactly our target user. This choice was a new concept for the market, so we couldn't relate it to other products they might know. The options inside also had quite a lot of detail when you started digging into them, detail we had to provide if somebody went looking for it. Our testers would hit this choice, get stuck, feel the urge to research the decision, get overwhelmed, and give up.
We spent so long trying to reframe this choice, explaining it better in a nice succinct way, we even tried to get this feature removed entirely - but nothing stuck.
Eventually after lots of discussion with legal we were allowed to have a 'base' choice, which the user could optionally change. We tested the new design, and it made a significant difference in conversion rates.
Huzzah for nudge theory! Right? Well, maybe. I think it's a bit more complicated.
- The new design was faster. There were fewer screens with simpler choices. It went from 'pick one of 5' to 'here's the default, would you like to change it?'. Was it just the speed that made a difference?
- The user was not a financial expert, and the company behind the product was. In some sense was the user just thinking 'these guys probably know more than me I'll leave it at that'. Imagine trying to implement this exact change on something the user is an expert in - say like your meal choice in an airplane. I imagine most people would think "How rude choosing for me! I'm an expert in what I feel like eating I want to see all the options".
- It had less of a cognitive load. Like the whole onboarding flow was already really complicated, just reducing the overall mental strain to make an account may have just improved the whole experience. E.g. if we had removed decisions earlier in the flow, would this one still have been as big of an issue? We never had time to test it, so I can't say for sure.
- Lack of confusion == confidence. For the users who didn't look at the options and took the default, did they just feel more in control and confident because they weren't exposed to unfamiliar terms and choices? They never experienced the urge to research.
Like at the surface level this new design worked great, so job done. But it's hard to say definitively it was because of nudge theory. I don't think you can really blindly say "oh yeah defaults == always good" and slap them on every problem - which is why the design-test-iterate loop is so important.
I think all of those things you listed are kinds of nudges. They are changes in the "choice architecture" that steer the user to an action.
In the context of government, a nudge means influencing people to choose something desirable, while still leaving open the option for people to choose what they want (hence preserving liberty). In contrast, a non-nudge solution would be a law or regulation that forces people into the desirable option or perhaps a tax on a certain choice.
In your UI, an example of a non-nudge solution would be removing the other options, effectively forcing their decision. Another example of a non-nudge would be charging different fees depending on their decision.
"A nudge, as we will use the term, is any aspect of the choice architecture that alters people's behavior in a predictable way without forbidding any options or significantly changing their economic incentives. To count as a mere nudge, the intervention must be easy and cheap to avoid. Nudges are not mandates. Putting fruit at eye level counts as a nudge. Banning junk food does not."
A nudge has to push the user towards one certain decision over another, that's the whole point. It's opinionated. The factors I listed aren't inherently opinionated, we could've tried to improve them without pushing the user to a specific choice:
E.g. speed: We could've removed earlier parts of the onboarding to make the overall experience shorter, or compacted the UI so it was visually easier to skim the choices.
Expertise: we could've assured the user before the choice that all the options were good, because we're the experts and we wouldn't give you a bad option - so don't agonise.
Cognitive load: We could've reduced the info we showed about each option, or hidden it away behind a modal, or re-written it in plain English. The legal team told us we had to use the legal descriptions of the choices, which included technical language.
Confusion: We could've made a visualisation of the impact of their choice that changed as they swapped between each option, showing them a more tangible outcome of their choice. It was a complicated concept to get, so the addition of a visual aid instead of just written descriptions might've helped.
To be clear - I'd be surprised if these things alone would've worked, and I'm certain setting a default made a difference. The point I'm making is that I don't know for sure how much of a difference. The change to implement the default, to my eyes, also improved the overall design in these other ways. We didn't isolate it down to exactly what made the improvement, we were just happy it happened.
The point I'm making is you could quickly skim read this story of a team stuck on a problem, who after implementing defaults found their conversion rates jumped 11x holy shiiiiiiii- and it sounds like it's all thanks to nudge theory. It's exactly like a case study you'd see in a co-design agency's portfolio.
But in the actual real messy world of designing interfaces, it's just always a bit more complicated than that. No change is truly isolated, tested in a controlled, academic fashion. You just design your best shot each time and see what works. Because of this, it's hard to truly definitively say an improvement was because of a nudge. Best I can do is, "I mean probably" haha.
There are some good points you raise, and I think you're really testing the nuance of what nudging is. But nudges are perhaps more basic than you're thinking.
Nudges don't need to steer the user to a specific choice, just a behaviour change. Sticking with a conversion flow counts as a behaviour change.
Nudges don't need to be simple or understandable. They can be a set of complex changes where causation isn't clear. They just need to get results.
The only really hard requirement that would rule out a nudge is if you forced a choice or used financial incentives.
If you read the Nudge book you'll see that it's a political book, really. The authors introduce nudges as an alternative to hard regulation. Instead they propose that governments consider influencing behaviour in a softer way, but still leave the escape hatch open for people with strong preferences to choose what they want. This strikes a balance between state involvement and principles of liberty. (Or at least that's their argument.)
Because of this framing a nudge is defined mostly by what it isn't. It's not a nudge if it forces a user to a choice; a nudge is anything you do that changes what users do without forcing them.
This is what you've done with your series of changes that resulted in increased conversion. You've left all the choices open still, so users have as much freedom as before, but you've managed to predictably change user behaviour in a way that aligns with your goals. In other words, you've nudged them.
Oh interesting, I had no idea it had such a wide scope, thanks for the explanation. I learnt about nudges via a uni course, it sounds like parts were lost in translation. I should check out the original book.
I've always understood this part of the description to be more than a single one-off choice, so none of that around the decision point in the financial product would count.
> The new design was faster. There were fewer screens with simpler choices. It went from 'pick one of 5' to 'here's the default, would you like to change it?'. Was it just the speed that made a difference?
If you're just going from "pick one of 5" to "pick one of 5 but there's a default", I wouldn't expect one or the other to be "faster". Was the new design more different than that?
As for the rest, I think the beneficial features of the design are predicted by nudge theory. "Providing a credible default reduces cognitive load and confusion on the path to a decision, as the user can just trust the defaults have been set up reasonably" has always been the theory for why nudges work.
The first version was a screen with 5 choices, and detail about each choice that you'd have to scroll through. The second version was a simple "We've set this up for you" screen with two options, continue or customise. If you hit customise, you'd get shown the original five choice screen.
What I mean by it being faster is you could get to the next step of the process with both reading less text, and seeing less choices (just two buttons not 5). Cause if you just slapped the continue button (which most people did), you'd skip the whole explanation of all the choices.
> However, all intervention categories and domains apart from “finance” show evidence for heterogeneity, which implies that some nudges might be effective, even when there is evidence against the mean effect.
I think some people careen from fully trusting one thing to fully trusting the opposite thing. If you're not one of those people, you'll never understand dismissing things on the basis of "you should be critical of this, because not everything that people say is true."
I do feel like that, even though being critical is something we should always do, that in cases where
1) the only reason you started paying attention to something was an intuitive hunch that it could matter, and
2) the only reason you started treating that hunch as established science is because you did experiments that had significant results, then
3) later you found that significance could be entirely accounted for by the file-drawer effect,
you need to adjust your expectation that there actually is an effect to lower than your expectation was at step 1). It isn't that the theory hasn't been tested (although you can argue it hasn't been tested for ingeniously enough yet), it's that it has been tested and no effect has been shown.
If you allow the existence of interest in a theory (represented in amount of ink spilled and number of experiments done) to raise your expectation that the theory is true, despite experimental indications to the contrary, you're not really doing science, you're just throwing good money after bad, probably motivated by a desire to protect the researchers and institutions that are heavily committed to the truth of the theory and/or the desire to protect other theories that depend on the one that hasn't shown results.
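To put the updating argument in rough Bayesian terms (the probabilities here are invented purely for illustration, not measured):

    # Tiny Bayes sketch of the updating argument above; numbers are invented.
    prior_effect = 0.5          # step 1: the hunch that the effect is real
    p_data_if_effect = 0.3      # chance of seeing only file-drawer-level evidence if real
    p_data_if_null = 0.7        # chance of seeing that evidence if there is no effect

    posterior = (p_data_if_effect * prior_effect) / (
        p_data_if_effect * prior_effect + p_data_if_null * (1 - prior_effect))
    print(f"P(effect | data) = {posterior:.2f}")   # 0.30, lower than the 0.50 prior

Whatever numbers you plug in, if the observed significance is better explained by the file-drawer effect than by a real effect, the posterior has to come out below where your hunch started.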
There's a lot of junk behavioral science out there, but things like "people often go with a default or recommended option so they can move on with their day" seem so obvious to me that I become suspicious of this debunking for debunking too much.
That's just refusing to be convinced by evidence, though. It's good to have hunches, but it's good to let them go after you've done the experiments. Come up with a new hunch and a new experiment that shows why the expected effects weren't seen, and you're right back in there.
"Refusing to be convinced by evidence" is a simplistic false dichotomy. The evidence is interesting but I have several reasons not to immediately take it as definitive.
Are you really certain that a big debunking in PNAS, surfing a wave of other celebrated debunkings, should be taken as definitive, when a good deal of the research being debunked was published to similar fanfare in PNAS back when a different kind of research was fashionable?
I take neither the original research nor the debunking as particularly credible. Without technical expertise, I'm left to educated guess. It's just my guess.
The idea that we should always be convinced by evidence regardless of context is a vast overgeneralization, impossible ("the evidence" overall rarely points only one way, even if the latest chunk of new evidence does), and in contradiction with Bayesian epistemology.
My problem isn't that I think people should be credulous of everything, it's that I don't think "it's just obvious" is a proper counter to experiments that show nothing. If the effect is so obvious, it should be obvious how to design an experiment that would show it.
I don't even know what you're defending here other than believing your first impulse above any subsequent evidence. Nobody is preventing anyone from proving an effect, in fact they poured money into the attempt.
The linked study (and the Mertens study it's built on top of) classifies defaults as "structural" interventions. In the linked meta-analysis, after stripping out estimates of publication bias, structural interventions have the most "robust" evidence left (95% CI of 0.00-0.43), and as the paper text says, "whereas the evidence is undecided for 'structure' interventions". Other structural interventions include making it easier to select the desired outcome (or harder to switch away from the desired outcome), changing the range of options to facilitate evaluation, or trying to compensate for biases and loss aversion in choice structure. As you can see, this is a broad range of interventions.
A little bit further on they say "However, all intervention categories and domains apart from “finance” show evidence for heterogeneity, which implies that some nudges might be effective, even when there is evidence against the mean effect", which makes sense. People generally understand stakes, and will likely apply different care/effort in different contexts, modifying the context-specific effect of any given intervention.
I think the paper makes a reasonable argument:
1. There is significant publication bias in nudging studies
2. The effect of providing additional information at time of selection, or providing reminders/affirmations for self control is basically non-existent
3. The effect of modifying choice structure is inconclusive. Likely we'll find that some structural modifications have strong effects in some contexts, but others have little or no effect in other contexts.
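For readers wondering where numbers like that 95% CI and the heterogeneity claim come from, here is a bare-bones sketch of meta-analytic pooling with made-up per-study effects. It's a simple fixed-effect inverse-variance pool plus Cochran's Q and I^2, not the exact model the paper fits:

    # Bare-bones meta-analysis sketch; effect sizes and SEs are hypothetical.
    effects = [0.40, 0.05, -0.02, 0.35, 0.10, 0.02]   # per-study effect sizes
    ses     = [0.15, 0.08,  0.10, 0.12, 0.09, 0.07]   # per-study standard errors

    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100   # % of variation beyond chance

    print(f"pooled effect = {pooled:.2f}, Q = {q:.1f} on {df} df, I^2 = {i_squared:.0f}%")
    # A sizable I^2 is the "heterogeneity" referred to above: individual
    # nudges can differ a lot even when the pooled mean is small.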
There's a lot of confusion here about what this article is talking about.
Nudges aren't just defaults. We've known for over a century that people are influenced by defaults. Nudges also aren't anchoring, where choices influence one another. Kahneman won a Nobel prize for that and other behavioral economics ideas (work done with Tversky) a decade before the idea of nudges.
Nudges are a bigger idea: that many small changes lead to huge behavioral changes. Like providing a social reference point (see the average electricity use of your neighbors), surfacing hidden information (a red light to remind you to change your A/C filter), changing the financial effort involved in something (deposit your drinking money into an account that you lose when you drink again; health plans that pay you to stay healthy), changing the physical effort of making bad choices (a motorcycle license for people who don't want to wear helmets that is much harder to get), changing the consequences of options (pay a teenager $1/day to not get pregnant), providing reminders (check if an email is rude and have someone confirm they want to send it), making public commitments (saying you are doing X makes you more likely to do X), etc.
There are various examples of each of these working to some extent in specific circumstances.
But we have a lot of other tools for changing people's behavior. We have education campaigns. We have fines. We have taxes. We have tax breaks. The idea behind nudges is that they're an easy replacement for many of these other tools.
But the meta-analysis shows that nudges aren't a general-purpose tool that leads to significant changes in people's behavior. The behavioral changes are small, the same as we get from a fine, a tax, or an education campaign.
Aside from specific circumstances, nudges don't work any better than (and may work much worse than) our usual tools for getting people to behave.
I'm not a social scientist, so please help me understand this. The way you have defined nudges, it seems like a very broad category. Some nudges are passive (defaults), some are active (red lights), and there are a whole lot of others.
If a category covers such a broad range of phenomena, then shouldn't we be analysing individual phenomena instead of the category as a whole? For example: defaults may work and the red-light thing may not work. Why place them both under the same bucket at all? Why not study them in isolation?
> For example: defaults may work and the red-light thing may not work. Why place them both under the same bucket at all? Why not study them in isolation?
And that's exactly how these meta analyses work! If you look in figure 1, they break down nudges both by the kind of intervention and by the domain. Maybe some types of nudges are much better than others. And maybe nudges work much better for say food vs finance.
Yes. Defaults have an effect, most other nudge types don't. But the domain doesn't matter much it seems.
Reading through the article, it seems the actual claim is much, much weaker than the title:
> However, all intervention categories and domains apart from “finance” show evidence for heterogeneity, which implies that some nudges might be effective, even when there is evidence against the mean effect
So the article is saying that when you look at studies of all "nudges" as a whole and adjust for publication bias[0], there isn't evidence for nudges as a whole. Of course individual nudges could still have a positive impact.
Maybe I'm misinterpreting what it's saying, but as an analogy, that would be like doing a study of all "diets", determining that when you combine data for all studies on diets there isn't a positive effect, then writing an article with the claim "no evidence for diets". There's no way you could reasonably make the claim that no diets work.
[0] If you look at the studies on "nudges", studies with smaller sample sizes detected a larger positive effect size. This is because smaller studies that get a positive effect are more likely to be published than smaller studies with a negative effect. The article uses this to analyze just how strong the publication bias is and adjust for it.
You are misinterpreting what it is saying. It's like multiple-comparisons bias: if you do a bunch of comparisons and find one significant at 0.05, you could have called it significant had you looked at it alone. But when you consider all of the experiments you are running, you would expect to find one that looks significant just by random chance, even if there is no real effect.
No evidence for nudging =/= nudging doesn't exist.
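To put rough numbers on that multiple-comparisons point, here is a back-of-the-envelope sketch (my own illustration, assuming independent tests at alpha = 0.05, not anything from the paper):

    # Chance of at least one "significant" result among k independent null tests
    # at alpha = 0.05: 1 - (1 - 0.05)^k. Nothing real needs to be going on.
    for k in (1, 5, 14, 20):
        print(f"{k:2d} tests -> P(at least one false positive) = {1 - 0.95**k:.2f}")
    # 1 -> 0.05, 5 -> 0.23, 14 -> 0.51, 20 -> 0.64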
I'm fairly sure anyone who has done A/B testing at scale has plenty of evidence that nudging works. Perhaps not up to the standard of science, but there are literally people who manipulate choice architecture for a living and I'm fairly convinced a lot of that stuff actually works.
"... evidence that nudging works. Perhaps not up to the standard of science..." That's pretty close to saying it doesn't work. The point of this meta-study was precisely to show that the evidence claimed to support nudging was probably attributable to random variation + unnatural selection, where the unnatural selection was publication choice: either the researchers who got negative (null) results chose not to bother writing it up and submitting it, or papers that reported negative were rejected by publishers.
There are lots of people who do X for a living, but where X doesn't work: palm readers, fortune tellers, horoscope writers, and so on. I'm not even sure that fund managers reliably obtain results much above random.
I think what’s not clear is what’s in those papers, what exactly they have to say about nudging, and what definition they’re using. It strains credulity to think that changing defaults in software doesn’t change behavior, if only because most users aren’t technically savvy enough to change their settings.
On the other hand, the dream of nudge theory is something like a study done in the UK suggesting that adding the line “most of your fellow citizens pay their taxes” will increase the likelihood that people pay taxes. Here I’d be more inclined to believe the benefits are unclear and, more importantly, difficult to replicate across time and culture.
It seems that trying to do a meta-analysis on all of nudge theory (or large categories of it) would indeed show no impact. It’s not like you’re testing one thing; you’re comparing well-designed programs with ones that aren’t.
To say things a different way, I don't think this study will change anything for people actually doing choice architecture in applied settings. They have results that speak for themselves.
This is exactly how a midwife explained to me why she uses magic crystals. She told me that there's science, and there's results, and that she's seen the crystals work.
Obviously they don't work by magical vibration, but are you sure they don't work at all? If the midwife feels and acts more confident from having that tool or the mother feels more relaxed because she thinks they will make the process easier, then the crystals do, in fact, work. They just don't work through the mechanism those individuals think they do.
I mean, yeah, if she has solid RCT data on thousands to millions of childbirths and has found a statistically significant impact from using the magic crystals, I would support their use. A/B testing rests on the same basis as scientific research.
The issue is that in fact the midwife will not have such data. The comparison being made is that A/B testing, if run competently, is pretty close to scientific research, in particular for research related to nudging.
I wonder how many engineers crack open a statistics book to find the correct test versus just plotting box plots and saying "see looks pretty different"
"I don't think this study will change anything for people actually doing choice architecture in applied settings." Probably true, but then evidence that horoscopes etc. don't work, doesn't prevent people from drawing horoscopes, or other people from relying on their horoscope to plan out their day.
"They have results that speak for themselves." Let me put my point differently. Suppose that nudges don't have any effect at all (null hypothesis). More concretely--and just to take a random number--suppose that 50% of the time when a nudge is used, the nudgees happen to behave in the direction that the nudge was intended to move them, and 50% of the time they don't move, or they move in the opposite direction. And suppose there are a number of nudgers, maybe 100. Then some nudgers will get better than random results, while others will get no result, or negative results. The former nudgers will have results that appear to speak for themselves, even if the nudges actually have no effect whatsoever.
This is the same as asking if a fair coin is tossed ten times, what is the probability that you'll get at least 7 heads. The probability of such a number of heads in a single run is ~17%. So 17% of those nudgers could be getting apparently significant results, even if their results are actually random.
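For anyone who wants to check the ~17% figure, it's just a binomial tail probability; a quick sketch in Python:

    # P(at least 7 heads in 10 tosses of a fair coin)
    from math import comb
    p = sum(comb(10, k) for k in range(7, 11)) / 2**10
    print(f"{p:.3f}")  # 0.172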
I think gp and you probably see eye to eye, but gp has a problem with your phrasing.
If the effect does not live up to scientific rigour, that (more or less) implies that the effect is roughly indistinguishable from randomness.
If folks have results that speak for themselves, then the effect is more than likely testable with scientific rigour. It may already have been tested - by those very results.
Seriously, what about that kind of publication bias: A/B tests don’t get published.
If you run a useful system where it would be meaningful and interesting to know whether a social science theory actually applied, you might run an A/B test to see if it works. If it works, it is adopted—but it is almost never published. And that is for two reasons: 1. no incentive to publish and 2. major incentive not to publish. #2 is recent (post Facebook experiment) and it is specifically because a large portion of the educated public accepts invisible A/B testing but recoils with moral indignation at the use of A/B testing results in published science. Too bad: Facebook keeps testing social science theories, but no longer publishes the results.
The standards for accepting the result of an A/B test are less stringent than those for publication for the advancement of knowledge. For publication, the goal is to determine whether a model is accurate. For A/B testing, the goal is to select the best design/intervention. The difference is that for scientific testing "inconclusive" means that there isn't enough evidence to consider it a solved problem and it should have more research, while in A/B testing "inconclusive" means that any effect is small so you should pick an option and move on.
As an example, suppose I flip a coin 1000 times and get heads 525 times. The 95% confidence interval for the probability of heads is [0.494, 0.556], so from a scientific standpoint I cannot conclude that the coin is biased. If, however, I am performing an A/B test, I would conclude that I'll bet on heads, because it is at worst equivalent to tails.
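(For the curious, that interval is just the normal-approximation confidence interval for a binomial proportion; the arithmetic is easy to reproduce:)

    # 525 heads in 1000 flips: 95% CI for P(heads) via the normal approximation
    from math import sqrt
    n, heads = 1000, 525
    p_hat = heads / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    print(f"[{p_hat - 1.96*se:.3f}, {p_hat + 1.96*se:.3f}]")  # [0.494, 0.556]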
I think you are missing the point. With academic publication bias, sometimes an unbiased coin gets heads 600 times by chance. Those studies get published. But, if you ran the test again, you might only get 525. That study won’t get published.
And, in opposition to your assumption: there is nothing to prevent A/B tests being published with high academic standards— like a low p value and tons of n. In an academic context, that’s just fine— it’s a small but significant effect.
A/B tests are simply controlled experiments—which are the gold standard of scientific evidence generation in psychology. My point is that the main generators of this evidence are only permitted to use this evidence to inform commerce not public knowledge. That is a loss for science and public policy, in my opinion.
They note that there is no evidence for nudging as being generally effective. So any individual nudge could be effective (except in finance in which they found that none are effective).
From what I’ve seen there is even more incentive to focus on positive A/B tests. It’s the way you get credit for your work at a company. A negative test is counted as barely anything. So your incentive is to run tons of tests, then cherry pick only the positive ones and announce them widely. Another strategy is to track multiple metrics for each test and not adjust for that when computing p values. But then at the end you only report the one metric that was positive.
What exactly is "nudging" here? For example, it has been shown that for organ donation, if the default is to be a donor (an opt-out system, where you have to actively opt out), then organ donors double https://www.science.org/doi/10.1126/science.1091721 . I think this is one of the "nudge" examples in pop-science books.
> A nudge, according to Thaler and Sunstein is any form of choice architecture that alters people's behaviour in a predictable way without restricting options or significantly changing their economic incentives. To count as a mere nudge, the intervention must require minimal intervention and must be cheap.
Thaler and Sunstein wrote the book on nudges, quite literally. So their definition counts, and it's the one from the article. The opt-in/out decision you mention isn't a nudge in this sense. You're not asked what you prefer, you have to be aware that you can opt-in/out and then actively pursue that option.
In the case of organ donation, Thaler has written that he actually prefers mandated choice--i.e. there is no default but you have to either opt in or opt out--in this case. [1] But I'm not sure why a system where the government creates a default of either opt-in or opt-out (which you can change) wouldn't be a nudge.
The system is that you're by default in some register. The choice has already been made. Many people aren't even aware of it, or only remotely. You have to undertake action to change it. That's not a "choice architecture" in the nudge sense. That would require that you are presented with both options simultaneously, and are forced to choose. A nudge, e.g. on a web form, would then be to have one option already checked.
>A nudge, e.g. on a web form, would then be to have one option already checked.
Yes. But an alternative is to present choices on a web form without one being pre-selected but with a choice mandatory. Which is essentially what I understand Thaler to be arguing for.
>The choice has already been made.
I'd say that still is a default but one which requires more effort to change than a pre-selected option on a webform. And arguably sufficient effort that it may no longer be reasonable to default to organ donation in that manner.
Wouldn't it not be a nudge because it outright changes the unspoken costs? If the public is largely apathetic about what happens to their bodies after death, then having to take an action (versus going with inertia) is an added cost. That could make the decision feel "not worth it" for such a low reward, quite apart from people preferring not to think about the possibility of their own demise.
A nudge is naive, if not circularly defined, since it presumes at least two permanently distinct classes of humans: informed humans who can architect nudges and learn about them, and other humans who must respond the same way every time and cannot do this meta-learning.
I think that's more narrowly defined as "status quo bias" - people tend to take the lowest energy path, so generally accept default choices. The definition of nudging that I could determine from the original book's Wikipedia page includes that, as well as other forms of nudges. I wonder if separating out these nudges by type would result in different results in this metaanalysis. But that is also analogous to p-hacking, isn't it?
I'm not very familiar with how "nudging" is defined in behavioral economics, and perhaps someone can enlighten me, but personally I find it hard to believe that the way a choice is presented plays no role in one's decision. The Goldilocks principle is well-known: most people instinctively choose the middle option when given three things to choose from.
Does this study imply that choice architecture plays no role in our decisions? Or am I mis-understanding it?
Defaults are one example of a nudge. One of Thaler's examples is having some default 401(k) contribution for new employees that's greater than zero. While I'm sure there are cases where defaults are less powerful than in others, the idea that defaults don't really matter certainly flies in the face of everything I know about the world.
You give another example of choice architecture, though I'm not sure if that's a nudge in the literature or not.
Absence of evidence is evidence of absence. It's just not proof of absence.
For example, if I search your entire house for drugs, using drug sniffing dogs and so on and I don't find any at all, that's pretty good evidence that you don't have any. It's not proof though - you might have just stashed them really well.
Similarly, if people have been looking for nudge effects for ages, doing loads of studies on it for years, and none of them have found any effect, then that's pretty good evidence that the effects don't exist. It's not proof though; they might just have not been very good experiments.
Well said. I'd just change your first sentence to "Absence of evidence can be (but isn't necessarily) evidence of absence.", which is more in line with the rest of your post.
OTOH, people have been looking for evidence of nudging, and didn't find it. Since a more-than-marginal effect of nudging is a priori unlikely, we can conclude that it's much more likely that nudging doesn't "work" than that it does.
Obama would have appointed Sunstein no matter what book he wrote, because he was one of UChicago's superstar professors (and was at the same time being aggressively courted by schools around the country; he ended up, of course, at Harvard). His appointment was notable not so much for "nudges" as for the fact that Sunstein was probably one of the most conservative appointments Obama made.
Your comment makes it seem as if the Nudge work had qualified him for OIRA, when in fact he was probably one of the country's most obvious lay-up candidates for that role. As I think you know, given his own background, it would have been weirder if Obama hadn't found a role in the government for Sunstein.
He released a book called “Nudge” the year before his appointment. I’m not saying he wasn’t qualified for OIRA without it but “making society better through subconscious manipulation” was definitely pitched as part of his shtick at the time.
Reading this article, it is good that the UK COVID policy wasn't based on behavioural nudging /sarcasm. The UK COVID policy heavily relied on this, and one of the unwanted side effects was scaring a certain section of the population into submission. Although that may have been effective during COVID, it made it a lot harder for that segment of society to return to normal.
Isn't locking down the obvious departure from a "nudge policy"?
I've seen the claim a lot but it all goes back to documents like this one which discusses strategies for communicating to increase compliance with lockdowns.
Yes, the UK government received such shocking insights as "Messaging needs to emphasise and explain the duty to protect others", and "Messaging about actions need to be framed positively in terms of protecting oneself and the community, and increase confidence that they will be effective".
Of course, the government did pick and choose what to follow, so it would be absurd to say the entire COVID policy was "based on behavioural nudging". The UK's adherence to isolating after positive tests was thought to be one of the lowest of any country. When SPI-B pointed out that financial support would increase adherence to isolating, no reaction from the government. https://www.instituteforgovernment.org.uk/blog/government-su...
The first one is from before the first lockdown. The strategy was completely replaced when lockdown happened. It was unrecognisable from then on.
The second one is about "deploying fear, shame and scapegoating" which the document I linked specifically calls out as a communication strategy with more downsides than any of the others they mentioned. However, Priti Patel just can't resist such activities.
At this point, seeing mask usage e.g. outdoors on a hiking trail is a little disturbing, because people are thinking they are fighting the good fight but are now on the other side of the evidence (which says you are pretty safe outdoors, or in a big room, or while merely interacting briefly in passing with people, as one does with strangers in public). I wonder what the messaging will be, given that this supposedly "scientifically minded" mask-wearing subset of the population is no longer listening to the science.
I hope this doesn't lead to weakened immunity overall in the population. If you wear a mask every time you go out into the world, that doesn't give you much of a chance to build up acquired immunity to all the other bugs that are out there. There are stories from the early 1900s of Native Americans coming out of the woods and joining western society. They, of course, had spent decades in isolation rather than just two years, but that was enough for them to end up perennially sick and in poor health when they were actually integrated into western society, and eventually die young of common diseases. A lack of acquired immunity is what killed Ishi: https://en.wikipedia.org/wiki/Ishi
If publication bias is the exclusion from publication of results that don't support your hypothesis, how are they taking that into account?
If I’m interpreting this correctly (and I by no means am sure that I am), I infer that they are saying that in a fair publishing environment you’d expect to see more results that are less decisive; therefore the current set of results is likely biased.
Couldn’t this bias also happen in the other direction? It sounds like they’re saying the results are too good and don’t match other scientific patterns of publishing results.
The easiest-to-understand diagnostic used to measure publication bias is the funnel plot. Suppose the true effect of interest is theta = 0.2. Then the observed effects in studies should be centered at 0.2; some will be higher, some will be lower. Assuming no systematic error, the degree to which study results vary around 0.2 should be inversely related to the precision of the study (think sampling error given a sampling design). A hypothetical study of an infinitely large meta-population would produce an effect estimate of exactly 0.2, infinitely precisely. A series of very small studies will likely show quite divergent results, just on the basis of precision.
A funnel plot plots effect sizes on the x axis and precision on the y axis. The most precise studies should be tightly grouped around the meta-analytic average effect; the least precise studies should be spread more widely. This forms a triangular, funnel shape. If no publication bias exists, the spread of studies below the magnitude of the average effect should be comparable to the spread of studies above the magnitude of the average effect.
If there is publication bias, then the points that would form the left (without loss of generality; right if negative effect size) portion of the funnel will not be observable.
There are issues with funnel plots and there are other diagnostics but I hope this provides insight into one of the tools used. Notably, as a diagnostic, funnel plots work whether the true effect is positive, negative, or null; they assume only that the underlying assumptions of meta-analysis are true (that studies represent a sample of the same, true underlying effect -- other diagnostics and corrections exist when this is violated)
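If it helps to see the intuition in code, here is a minimal toy simulation (my own sketch, not the diagnostics or RoBMA model the authors actually use): generate studies of a true null effect, "publish" only the positive significant ones, and the funnel asymmetry shows up as the smallest studies reporting the biggest effects.

    # Toy illustration of publication bias and funnel-plot asymmetry.
    # True effect is zero; only positive, "significant" studies get published.
    import numpy as np

    rng = np.random.default_rng(0)
    n_studies = 2000
    sample_sizes = rng.integers(20, 500, size=n_studies)
    std_errors = 1 / np.sqrt(sample_sizes)            # rough per-study standard error
    observed = rng.normal(0.0, std_errors)            # each study's estimated effect
    z = observed / std_errors

    published = observed[z > 1.96]                    # survivors of the file drawer
    published_se = std_errors[z > 1.96]

    print(f"mean effect, all studies:    {observed.mean():+.3f}")   # ~ 0.000
    print(f"mean effect, published only: {published.mean():+.3f}")  # clearly inflated
    # On a funnel plot (effect on x, precision 1/SE on y), the published points all
    # sit to the right of zero, and the least precise studies show the largest
    # effects -- exactly the asymmetry the bias diagnostics look for.
    print(f"corr(effect, SE), published: {np.corrcoef(published, published_se)[0, 1]:+.2f}")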
Interesting -- the problem may be a misapplication of funnel plots for metaanalysis.
I'm not sure what theta is representing and I only skimmed the paper, but especially in social scenarios and across social-science papers, it seems unrealistic to assume the same distributions and parameters across tasks & populations. Sometimes it comes down to 'is there any effect??' and sometimes to a precise notion of effect size in a lucky/clever specific scenario. Likewise, social science is one of the hardest fields in which to set up a good experiment, and few publications accept negative results, so mostly only 'good' p-value ranges getting published seems normal. The Wikipedia page on funnel plots shows, afaict, the same criticism of the technique.
Whether about the effect size or how it is reported, funnel plots seem an inappropriate choice for debunking something as general as 'nudges' across heterogeneous studies. Skimming made the meta-analysis feel rather lazy (lack of cross-validation, interpretation, ...).
Not my field, but I would have had to do some digging before accepting this meta-analysis in review, and my default would be 'not ready'.
Both the authors of this study and meta-analysts generally have followup responses and other diagnostics and models that work under degraded assumptions, so I don't think it makes a ton of sense to respond to "this is the absolute most introductory example of why this kind of intuition works" with "it doesn't work in <XYZ> case". There are a variety of ways to think through what data looks like under publication bias or not irrespective of what the "correct" effect would be.
I'd also add that this is a paper responding to an existing meta-analysis, so the claim that it's impossible to get a quantity of interest for a meta-analysis because the constituent studies are being inappropriately aggregated is itself an argument in favour of this paper's rejection of the original paper's finding.
I personally just rejected an m-a at a social science journal on thursday because I thought it suffered from unaddressed garbage-in garbage-out along the lines you mention (non-experimental data, no attempt to pin down causal identification, inadequate qualitative discussion of risks of publication bias, unclear QOI), so I am sympathetic to the criticisms, but just know that there are next steps. :)
it's one thing to reject a meta-analysis, and i'm rarely surprised at that. i'm not a fan in general, as i've seen many break down when you dig in, even in seemingly 'nice' areas like whether medicine x works on disease y.
it's another to reject the underlying phenomena because of a meta-analysis.
"Couldn’t this bias also happen in the other direction?" The general notion is that positive results ("we tried nudging and succeeded in getting people to behave more in X way") are more publishable than negative results ("we tried nudging, and nothing much happened"). It is a common and well-attested problem in many areas of science, but probably particularly in behavioral sciences; I have not heard of cases where publication of negative results is more likely than publication of positive results, although there are obviously heated debates over some results.
Whether publication bias is the explanation in this example, I don't pretend to know.
There's also an implicit assumption with nudges/defaults that you're nudging people towards a reasonable place for some combination of policy and preference reasons.
But imagine that a company would just as soon not pay out more 401k matching than it has to, so it makes the default zero. (Which of course is often the norm for different reasons.) That's as valid a nudge as anything but we shouldn't be surprised if a lot of people don't go with the default.
We probably also shouldn't be surprised if a lot of people maybe wouldn't go with a maxed out default.
Defaults wouldn't be nearly so powerful if they weren't typically chosen to be fairly reasonable for the average person in the target audience.
But I don’t think the goal is “most people can be nudged” so much as “a nudge is a cheap way to increase a behavior by 5-10%”, which is probably quite significant in policy circles. A low-single-digit percentage increase in the number of people who pay their taxes would be huge.
I guess the question would be: could results be missing for another reason, like the effects being harder to test for, so that the data looks better than it should because the published results are better in aggregate? But yeah, this is probably unlikely.
There's a little-known theory (whose name eludes me) that states: any outcome in behaviour is highly dependent on the immediate and unpredictable interplay of various environmental variables and their real and perceived effect on the person. This alone cancels the efficacy of any nudges (but some variables may push in the same direction as the nudge--hence the original but misguided nudge research: they were fooled by randomness). I have seen married women who were loyal to their husbands (and who had no idea they would fall for a guy who practiced "seduction" on them) become bewildered and surprised by their own behaviour, even though the behaviour went against their firmly-held opinions about themselves (that is: I love my husband and I am loyal to him). The environmental variables used by the person (who is a marketer) were too strong for their opinions to hold out against. As an example, he would invite them to his studio, which he had decorated (and cleaned) and made so homely and snug and comfy that the first lines of defense were broken before they had a chance to realize what trap they were in. The marketer also tried to brainwash me (but failed) because I knew the power of variables, and this knowledge alone saved me--even though he seemed irresistibly charming in the capacity of a father figure I never had.
Wait until you start reading about genotype by environment interactions and realizing the implications that has on just about everything in biology and society
The issue with this kind of meta analysis is that, as the author, you get to decide what the groupings are. No two studies will be identical, so you can invent bucketing strategies until you find that some buckets have the results you want and focus on those.
In addition, they don't seem to have shown that the technique they're applying actually works for modelling the distributions that they're analyzing.
Keep in mind that the people who publish a correction like this have a lot to gain from getting a lot of attention.
They succeed when lots of people say “what?! This broadly accepted idea is wrong?”
It’s the equivalent of a Buzzfeed headline, even if backed by thoughtful research. The new research may be correct in invalidating the prior experiment’s evidence but the reality is that we all know the “nudge” idea is useful at a practical level. If I ask my two-year old son if he wants milk, the odds that he is drinking milk 5 minutes later skyrocket. The same principle applies to people making all kinds of choices - from buying insurance to picking a University.
And boy am I struggling - I am amazed it's even possible to group all of these studies under the same umbrella, unless that umbrella is "misc".
Claiming that how people choose to treat their cancer, portion sizes at restaurants, rural Kenyan maternal health, and Dutch children's vegetable choices are even in the same field seems - incredible.
Maybe I am agreeing with the study in a roundabout way. If all of these things are under the heading "nudge", then it is too broad a heading. It's probably impossible to say one way or another that nudging works because you can never unpick all the confounding factors. Did the Dutch children have a popular TV show about vegetables while the Kenyan media ran months-long articles about unsanitary hospital conditions?
With my cynical hat on, Nudge is a way for politicians to try something even when the real fix is intractable. I don't oppose "do something positive" - I just oppose abusing power, violence for political gain, and all the other reasons why we can agree on a nudge in the right direction but cannot agree on a structural fix in the right direction.
I guess if they worked then they would solve the problems without structural change and so would defeat the forces that benefit from the status quo. so yeah. it does not work.
looks like we will have to go back to the old politics and revolt.
I'm one of the authors of the reply and it was very interesting reading so many diverse thoughts and comments. I would love to respond to all of them, but it would take ages. Luckily, Stuart Ritchie (@StuartJRitchie) wrote an awesome post on his substack (https://stuartritchie.substack.com/p/nudge-meta) that goes much deeper and addresses many questions and the fair critique raised here.
Also, note that there is only a limited amount of information and nuance you can fit into a strict 500-word reply limit in PNAS, which is the reason we focus only on one aspect of the original meta-analysis -- publication bias.
Anyone who has ever run an installer with an annoying setting hidden in a collapsed-by-default "advanced" section knows that nudges do work. Perhaps nudges do not work on you, but when you see your parents' computer with all the desktop icons and browser toolbars of days gone by, you will see the evidence of them working.
This is why I think nudging must work on some level. What are trends, advertising, fads, etc. if not nudges? Is the existence of an intelligent nudger required or can the hive nudge itself?
I'm just in the process of reading 'Nudge - the final edition' - definitely worth a read - it's thought-provoking, funny, insightful and enjoyable. It would be a shame if it all turned out to be bullshit as the examples given in the book seem straightforward and plausibly effective.
> A newly proposed bias correction technique, robust Bayesian metaanalysis (RoBMA) (6), avoids an all-or-none debate over whether or not publication bias is “severe.”
Absence of evidence doesn't mean it's not true. It doesn't even imply it.
It seems like what they are saying is: here are some statistics suggesting there's a bunch of missing data that would show that nudging doesn't have an effect. Like most science, it seems like the conclusion should be "go do more research to see if we're right", not "you should conclude that nudging doesn't work".
They're implying that some such research has already been done, but wasn't published because of publication bias against negative results. So the data exists, but is missing from publication. (I suspect it's also possible that more skeptical researchers have been dissuaded from even doing the research, for fear they'd be wasting their time and research budget because after they did the research, it wouldn't get published.)
Not a great journal if you are trying to publish something with potentially large import. It's reasonable to guess that something is seriously wrong with the study for it not to get into a good journal. This publication does not move my opinion on the matter.
Whether 'nudging' works or not, the concept is unacceptable to me.
First, the term 'nudging' is a misnomer. Let's call it what it is - manipulation. Manipulating the options or defaults to some other set in order to achieve a better outcome for someone...
Well, who is that someone? The government?
Who says that their values align with mine? I wouldn't have responded as the government did to the pandemic, but their nudge units went into overdrive nudging people into vaccinations, etc. Is preventing access to bank accounts for protesting government actions (as in Canada) a 'nudge'?
Can I challenge the promoted values? If the state apparatus has its own values and agenda, how do I get to state mine - where is the values/ethics discussion being had, and how do I get my say? I find the promoted values Orwellian, communistic, overly progressive - one for all, but not all for one... is that opinion fair to hold? Or must I be nudged over the cliff?
Aren't we really just talking about soft-sell authoritarianism here? Weren't we just meant to vote for people, not have a perpetual nanny state guiding us?
Your vote says their values align with yours, or at least most people's. Governments have to do things all the time that step on a few toes for the overall good.
What about all other public health campaigns? Drink driving, cancer screening, anti-smoking? Not everyone will want what they're pushing but we mostly agree that's a good thing for people, and that's why we let the government promote those things.
I can’t help but see social science as humans attempting to modernize memory of imperialism and religious belief embedded by prior experience.
Think of how popular it became as a field in the last 50-100 years as the populace became less religious. The US adult population recently crossed a threshold where fewer than 50% believe in a higher power now.
No science gives social scientists higher powers of forecasting the human future, yet we took their ideas and applied them with the same conviction with which some believe in gods, and in the same way: a network of randos spreading their gossip, wrapping it in technical jargon biased by past ignorance.
Consider how much of this work was being leveraged against an ignorant public with no opt out button, via print and TV. How is that informed consent?
Social media comes along, upends those forms of media, and creates a new meta-awareness that we lived in a society policed by high-minded but normal people. That awareness means we can opt out of being influenced by intentional nudges, the same as we can opt out of believing the intentional nudges that tell us to abide by higher powers.
Social science “worked” when the masses were unaware it was happening to them. As the public has become more aware of how it works, it’s all Soylent green; just people.