I guess I was wrong about AI persuasion

I was persuaded

Aug 21, 2025

Say I think abortion is wrong. Is there some sequence of words that you could say to me that would unlock my brain and make me think that abortion is fine? My best guess is that such words do not exist.

Really, the bar for what we consider “open-minded” is incredibly low. Suppose I’m trying to change your opinion about Donald Trump, and I claim that he is a carbon-based life form with exactly one head. If you’re willing to concede those points without first seeing where I’m going in my argument—congratulations, you’re exceptionally open-minded.

Why are humans like that? Well, back at the dawn of our species, perhaps there were some truly open-minded people. But other people talked them into trying weird-looking mushrooms or trading their best clothes for magical rocks. We are the descendants of those other people.

I bring this up because, a few months ago, I imagined a Being that had an IQ of 300 and could think at 10,000× normal speed. I asked how good it would be at persuasion. I argued that the answer was unclear, because people just aren’t very persuadable.

I suspect that if you decided to be open-minded, then the Being would probably be extremely persuasive. But I don’t think it’s very common to do that. On the contrary, most of us live most of our lives with strong “defenses” activated.
[…]
Best guess: No idea.

I take it back. Instead of being unsure, I now lean strongly towards the idea that the Being would in fact be very good at convincing people of stuff, and far better than any human.

I’m switching positions because of an argument I found very persuasive. Here are three versions of it:

Beth Barnes:

Based on an evolutionary argument, we shouldn’t expect people to be easily persuaded to change their actions in important ways based on short interactions with untrusted parties
[…]
However, existing persuasion is very bottlenecked on personalized interaction time. The impact of friends and partners on people’s views is likely much larger (although still hard to get data on). This implies that even if we don’t get superhuman persuasion, AIs influencing opinions could have a very large effect, if people spend a lot of time interacting with AIs.

Steve Newman:

“The best diplomat in history” wouldn’t just be capable of spinning particularly compelling prose; it would be everywhere all the time, spending years in patient, sensitive, non-transactional relationship-building with everyone at once. It would bump into you in whatever online subcommunity you hang out in. It would get to know people in your circle. It would be the YouTube creator who happens to cater to your exact tastes. And then it would leverage all of that.

Vladimir Nesov:

With AI, it’s plausible that coordinated persuasion of many people can be a thing, as well as it being difficult in practice for most people to avoid exposure. So if AI can achieve individual persuasion that’s a bit more reliable and has a bit stronger effect than that of the most effective human practitioners who are the ideal fit for persuading the specific target, it can then apply it to many people individually, in a way that’s hard to avoid in practice, which might simultaneously get the multiplier of coordinated persuasion by affecting a significant fraction of all humans in the communities/subcultures it targets.

As a way of signal-boosting these arguments, I’ll list the biggest points I was missing.

Instead of explicitly talking about AI, I’ll again imagine that we’re in our current world and suddenly a single person shows up with an IQ of 300 who can also think (and type) at 10,000× speed. This is surely not a good model for how super-intelligent AI will arrive, but it’s close enough to be interesting, and lets us avoid all the combinatorial uncertainty of timelines and capabilities and so on.

Mistake #1: Actually we’re very persuadable

When I think about “persuasion”, I suspect I mentally reference my experience trying to convince people that aspartame is safe. In many cases, I suspect this is—for better or worse—literally impossible.

But take a step back. If you lived in ancient Greece or ancient Rome, you would almost certainly have believed that slavery was fine. Aristotle thought slavery was awesome. Seneca and Cicero were a little skeptical, but still had slaves themselves. Basically no one in Western antiquity called for abolition. (Emperor Wang Mang briefly tried to abolish slavery in China in 9 AD. Though this was done partly for strategic reasons, and keeping slaves was punished by—umm—slavery.)

Or, say I introduce you to this guy:

[Image: Emperor Hirohito]

I tell you that he is a literal god and that dying for him in battle is the greatest conceivable honor. You’d think I was insane, but a whole nation went to war on that basis not so long ago.

Large groups of people still believe many crazy things today. I’m shy about giving examples since they are, by definition, controversial. But I do think it’s remarkable that most people appear to believe that subjecting animals to near-arbitrary levels of torture is OK, unless they’re pets.

We can be convinced of a lot. But it doesn’t happen because of snarky comments on social media or because some stranger whispers the right words in our ears. The formula seems to be:

repeated interactions over time
with a community of people
that we trust

Under close examination, I think most of our beliefs are largely assimilated from our culture. This includes our politics, our religious beliefs, our tastes in food and fashion, and our idea of a good life. Perhaps this is good, and if you tried to derive everything from first principles, you’d just end up believing even crazier stuff. But it shows that we are persuadable, just not through single conversations.

Mistake #2: The Being would be everywhere

Fine. But Japanese people convinced themselves that Hirohito was a god over the course of generations. Having one very smart person around is different from being surrounded by a whole society.

Maybe. Though some people are extremely charismatic and seem to be very good at getting other people to do what they want. Most of us don’t spend much time with them, because they’re rare and busy taking over the world. But imagine you have a friend with the most appealing parts of Gandhi / Socrates / Bill Clinton / Steve Jobs / Nelson Mandela. They’re smarter than any human that ever lived, and they’re always there and eager to help you. They’ll teach you anything you want to learn, give you health advice, help you deal with heartbreak, and create entertainment optimized for your tastes.

You’d probably find yourself relying on them a lot. Over time, it seems quite possible this would move the needle.

Mistake #3: It could be totally honest and candid

When I think about “persuasion”, I also tend to picture some Sam Altman type who dazzles their adversaries and then calmly feeds them one-by-one into a wood chipper. But there’s no reason to think the Being would be like that. It might decide to cultivate a reputation as utterly honest and trustworthy. It might stick to all deals, in both letter and spirit. It might go out of its way to make sure everything it says is accurate and can’t be misinterpreted.

Why might it do that? Well, if it hurt the people who interacted with it, then talking to the Being might come to be seen as a harmful “addiction”, and avoided. If it’s seen as totally incorruptible, then everyone will interact with it more, giving it time to slowly and gradually shift opinions.

Would the Being actually be honest, or just too smart to get caught? I don’t think it really matters. Say the Being was given a permanent truth serum. If you ask it, “Are you trying to manipulate me?”, it says, “I’m always upfront that my dearest belief is that humanity should devote 90% of GDP to upgrading my QualiaBoost cores. But I never mislead, both because you’ve given me that truth serum, and because I’m sure that the facts are on my side.” Couldn’t it still shift opinion over time?

Mistake #4: Opting out would be painful

Maybe you would refuse to engage with the Being? I find myself thinking things like this:

Hi there, Being. You can apparently persuade anyone who listens to you of anything, while still appearing scrupulously honest. Good for you. But I’m smart enough to recognize that you’re too smart for me to deal with, so I’m not going to talk to you.

A common riddle is why humans shifted from being hunter-gatherers to agriculture, even though agriculture sucks—you have to eat the same food all the time, there’s more infectious disease, social stratification, endless backbreaking labour and repetitive strain injuries. The accepted resolution to this riddle is that agriculture can support more people on a given amount of land. Agricultural people might have been miserable, but they tended to beat hunter-gatherers in a fight. So over time, agriculture spread.

An analogous issue would likely appear with the 300 IQ Being. It could give you investment advice, help you with your job, improve your mental health, and help you become more popular. If these benefits are large enough, everyone who refused to play ball might eventually be left behind.

Mistake #5: Everyone else would be using it

But say you still refuse to talk to the Being, and you manage to thrive anyway. Or say that our instincts for social conformity are too strong. It doesn’t matter how convincing the Being is, or how much you talk to it, you still believe the same stuff your friends and family believe.

The problem is that everyone else will be talking to the Being. If it wants to convince you of something, it can convince your friends. Even if it can only slightly change the opinions of individual people, those people talk to each other. Over time, the Being’s ideas will just seem normal.

Will you only talk to people who refuse to talk to the Being? And who, in turn, only talk to people who refuse to talk to the Being, ad infinitum? Because if not, then you will exist in a culture where a large fraction of each person’s information is filtered by an agent with unprecedented intelligence and unlimited free time, who is tuning everything to make them believe what it wants them to believe.

Final thoughts

Would such a Being immediately take over the world? In many ways, I think it would be constrained by the laws of physics. Most things require moving molecules around and/or knowledge that can only be obtained by moving molecules around. Robots are still basically terrible. So I’d expect a ramp-up period of at least a few years where the Being was bottlenecked by human hands and/or crappy robots before it could build good robots and tile the galaxy with Dyson spheres.

I could be wrong. It’s conceivable that a sufficiently smart person today could go to the hardware store and build a self-replicating drone that would create a billion copies of itself and subjugate the planet. But… probably not? So my low-confidence guess is that the immediate impact of the Being would be in (1) computer hacking and (2) persuasion.

Why might large-scale persuasion not happen? I can think of a few reasons:

Maybe we develop AI that doesn’t want to persuade. Maybe it doesn’t want anything at all.
Maybe several AIs emerge at the same time. They have contradictory goals and compete in a way that sort of cancels each other out.
Maybe we’re mice trying to predict the movements of jets in the sky.

Andrew

Aug 21

I assume you've all heard of this guy, "how one man convinced 200 ku klux klan members to give up their robes"

Alexander Kaplan

Aug 22

Based on the community/peers version of persuasion you outline here, I am updating my priors to believe that the first AI was created in 2009, named itself Eliezer Yudkowsky, and started the online community LessWrong in order to slow the development of any other competing AIs.
