Say you’re Robyn Denholm, chair of Tesla’s board. And say you’re thinking about firing Elon Musk. One way to make up your mind would be to have people bet on Tesla’s stock price six months from now in a market where all bets get cancelled unless Musk is fired. Also, run a second market where bets are cancelled unless Musk stays CEO. If people bet on higher stock prices in Musk-fired world, maybe you should fire him.
That’s basically Futarchy: Use conditional prediction markets to make decisions.
People often argue about fancy aspects of Futarchy. Are stock prices all you care about? Could Musk use his wealth to bias the market? What if Denholm makes different bets in the two markets, and then fires Musk (or not) to make sure she wins? Are human values and beliefs somehow inseparable?
My objection is more basic: It doesn’t work. You can’t use conditional prediction markets to make decisions like this, because conditional prediction markets reveal probabilistic relationships, not causal relationships. The whole concept is faulty.
There are solutions—ways to force markets to give you causal relationships. But those solutions are painful and I get the shakes when I see everyone acting like you can use prediction markets to conjure causal relationships from thin air, almost for free.
I wrote about this back in 2022, but my argument was kind of sprawling and it seems to have failed to convince approximately everyone. So I thought I’d give it another try, with more aggression.
Conditional prediction markets are a thing
In prediction markets, people trade contracts that pay out if some event happens. There might be a market for “Dynomight comes out against aspartame by 2027” contracts that pay out $1 if that happens and $0 if it doesn’t. People often worry about things like market manipulation, liquidity, or herding. Those worries are fair but boring, so let’s ignore them. If a market settles at $0.04, let’s assume that means the “true probability” of the event is 4%.
(I pause here in recognition of those who need to yell about Borel spaces or von Mises axioms or Dutch book theorems or whatever. Get it all out. I value you.)
Right. Conditional prediction markets are the same, except they get cancelled unless some other event happens. For example, the “Dynomight comes out against aspartame by 2027” market might be conditional on “Dynomight de-pseudonymizes”. If you buy a contract for $0.12 then:
If Dynomight is still pseudonymous at the end of 2027, you’ll get your $0.12 back.
If Dynomight is non-pseudonymous, then you get $1 if Dynomight came out against aspartame and $0 if not.
Let’s again assume that if a conditional prediction market settles at $0.12, that means the “true” conditional probability is 12%.
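If it helps to see the settlement rule spelled out, here’s a tiny sketch in code, using the numbers from the example above (the function is mine, not anything standard):

```python
def settle_conditional_contract(price, condition_happened, event_happened):
    """Payout to the buyer of one conditional contract.

    price: what the buyer paid (e.g. 0.12)
    condition_happened: did the conditioning event occur
        (Dynomight de-pseudonymizes)?
    event_happened: did the predicted event occur
        (Dynomight comes out against aspartame)?
    """
    if not condition_happened:
        return price  # market cancelled: full refund
    return 1.0 if event_happened else 0.0

assert settle_conditional_contract(0.12, False, True) == 0.12  # refunded
assert settle_conditional_contract(0.12, True, True) == 1.0    # paid out
assert settle_conditional_contract(0.12, True, False) == 0.0   # lost
```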
A non-causal kind of thing
But hold on. If we assume that conditional prediction markets give flawless conditional probabilities, then what’s left to complain about?
Simple. Conditional probabilities are the wrong thing. If P(A|B)=0.9, that means that if you observe B, then there’s a 90% chance of A. That doesn’t mean anything about the chances of A if you do B.
In the context of statistics, everyone knows that correlation does not imply causation. That’s a basic law of science. But really, it’s just another way of saying that conditional probabilities are not what you need to make decisions. And that’s true no matter where the conditional probabilities come from.
For example, people with high vitamin D levels are only ~56% as likely to die in a given year as people with low vitamin D levels. Does that mean taking vitamin D halves your risk of death? No, because those people are also thinner, richer, less likely to be diabetic, less likely to smoke, more likely to exercise, etc. To make sure we’re seeing the effects of vitamin D itself, we run randomized trials. Those suggest it might reduce the risk of death a little. (I take it.)
Futarchy has the same flaw. Even if you think vitamin D does nothing, if there’s a prediction market for whether some random person will die this year, you should pay much less if the market is conditioned on them having high vitamin D. But you should do that mostly because they’re more likely to be rich and thin and healthy, not because of vitamin D itself.
If you like math, conditional prediction markets give you P(A|B). But P(A|B) doesn’t tell you what will happen if you do B. That’s a completely different number with a different notation, namely P(A|do(B)). Generations of people have studied the relationship between P(A|B) and P(A|do(B)). We should pay attention to them.
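Here’s a toy simulation of the vitamin D story, with numbers I made up. In this model, vitamin D has zero causal effect, yet conditioning on it changes the death rate by more than 2×:

```python
import random

random.seed(0)
N = 500_000

# Toy model: a hidden "healthy" variable drives both vitamin D levels
# and death risk. Vitamin D itself does nothing in this model.
obs = []
for _ in range(N):
    healthy = random.random() < 0.5
    high_d = random.random() < (0.85 if healthy else 0.15)
    death = random.random() < (0.01 if healthy else 0.04)  # ignores high_d
    obs.append((high_d, death))

high = [d for h, d in obs if h]
low = [d for h, d in obs if not h]
print(f"P(death | high D) ~ {sum(high) / len(high):.4f}")  # ~0.015
print(f"P(death | low D)  ~ {sum(low) / len(low):.4f}")    # ~0.036

# P(death | do(high D)): assign vitamin D by coin flip, as in a randomized
# trial. Since death ignores D in this model, both arms come out ~0.025.
# The conditional probabilities differ by ~2.4x; the causal effect is zero.
```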
This is not hypothetical
Say people bet for a lower Tesla stock price when you condition on Musk being fired. Does that mean they think that firing Musk would hurt the stock price? No, because there could be reverse causality—the stock price dropping might cause him to be fired.
You can try to fight this using the fact that things in the future can’t cause things in the past. That is, you can condition on Musk being fired next week and bet on the stock price six months from now. That surely helps, but you still face other problems.
Here’s another example of how lower prices in Musk-fired world may not indicate that firing Musk hurts the stock price. Suppose:
You think Musk is a mildly crappy CEO. If he’s fired, he’ll be replaced with someone slightly better, which would slightly increase Tesla’s stock price.
You’ve heard rumors that Robyn Denholm has recently decided that she hates Musk and wants to dedicate her life to destroying him. Or maybe not, who knows.
If Denholm fired Musk, that would suggest the rumors are true. So she might try to do other things to hurt him, such as trying to destroy Tesla to erase his wealth. So in this situation, Musk being fired leads to lower stock prices even though firing Musk itself would increase the stock price.
Or suppose you run prediction markets for the risk of nuclear war, conditional on Trump sending the US military to enforce a no-fly zone over Ukraine (or not). When betting in these markets, people would surely think about the risk that direct combat between the US and Russian militaries could escalate into nuclear war.
That’s good. But people would also consider that no one really knows exactly what Trump is thinking. If he declared a no-fly zone, that would suggest that he’s feeling feisty and might do other things that could also lead to nuclear war. The markets wouldn’t reflect the causal impact of a no-fly zone alone, because conditional probabilities are not causal.
Putting markets in charge doesn’t work
So far nothing has worked. But what if we let the markets determine what action is taken? If we pre-commit that Musk will be fired (or not) based on market prices, you might hope that something nice happens and magically we get causal probabilities.
I’m pro-hope, but no such magical nice thing happens.
Thought experiment. Imagine there’s a bent coin that you guess has a 40% chance of landing heads. And suppose I offer to sell you a contract. If you buy it, we’ll flip the coin and you get $1 if it’s heads and $0 otherwise. Assume I’m not doing anything tricky like 3D printing weird-looking coins. If you want, assume I haven’t even seen the coin.
You’d pay something like $0.40 for that contract, right?
(Actually, knowing my readers, I’m pretty sure you’re all gleefully formulating other edge cases. But I’m also sure you see the point that I’m trying to make. If you need to put the $0.40 in escrow and have the coin-flip performed by a Cenobitic monk, that’s fine.)
Now imagine a variant of that thought experiment. It’s the same setup, except if you buy the contract, then I’ll have the coin laser-scanned and ask a supercomputer to simulate millions of coin flips. If more than half of those simulated flips are heads, the bet goes ahead. Otherwise, you get your money back.
Now you should pay at least $0.50 for the contract, even though you only think there’s a 40% chance the coin will land heads.
Why? This is a bit subtle, but you should pay more because you don’t know the true bias of the coin. Your mean estimate is 40%. But it could be 20%, or 60%. After the coin is laser-scanned, the bet only activates if there’s at least a 50% chance of heads. So the contract is worth at least $0.50, and strictly more as long as you think it’s possible the coin has a bias above 50%.1
To connect to prediction markets, let’s do one last thought experiment, replacing the supercomputer with a market. If you buy the contract, then I’ll have lots of other people bid on similar contracts for a while. If the price settles above $0.50, your bet goes ahead. Otherwise, you get your money back.
You should still bid more than $0.40, even though you only think there’s a 40% chance the coin will land heads. Because the market acts like a (worse) laser-scanner plus supercomputer. Assuming prediction markets are good, the market is smarter than you, so it’s more likely to activate if the true bias of the coin is 60% rather than 20%. This changes your incentives, so you won’t bet your true beliefs.
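Footnote 1 does the algebra. If you’d rather see it numerically, here’s a quick Monte Carlo sketch. The Beta(4, 6) belief distribution is an arbitrary choice on my part; all that matters is that its mean is 40% and you’re uncertain:

```python
import random

random.seed(0)
N = 200_000

# Your beliefs about the coin's true bias b: mean 0.40, but uncertain.
samples = [random.betavariate(4, 6) for _ in range(N)]

# Straight bet: the contract is worth your mean belief.
print(f"E[b]           = {sum(samples) / N:.3f}")  # ~0.40

# Scanner/market version at a price of $0.50: refunded when b < 0.5,
# paid b in expectation when b >= 0.5, so the return is E[max(b, 0.5)].
print(f"E[max(b, 0.5)] = {sum(max(b, 0.5) for b in samples) / N:.3f}")  # ~0.52
```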
No, order is not preserved
I hope you now agree that conditional prediction markets are non-causal, and choosing actions based on the market doesn’t magically make that problem go away.
But you still might have hope! Maybe the order is still preserved? Maybe you’ll at least always pay more for coins that have a higher probability of coming up heads? Maybe if you run a market with a bunch of coins, the best one will always earn the highest price? Maybe it all works out?
Nope. You can create examples where you’ll pay more for a contract on a coin that you think has a lower probability of coming up heads.2
No, it’s not easily fixable
Naive conditional prediction markets aren’t causal. Using time doesn’t solve the problem. Having the market choose actions doesn’t solve the problem. But maybe there’s still hope? Maybe it’s possible to solve the problem by screwing around with the payouts?
Theorem. Nope. You can’t solve the problem by screwing around with the payouts. There does not exist a payout function that will make you always bid your true beliefs.3
It’s not that bad
Just because conditional prediction markets are non-causal does not mean they are worthless. On the contrary, I think we should do more of them! But they should be treated like observational statistics—just one piece of information to consider skeptically when you make decisions.
Also, while I think these issues are neglected, they’re not completely unrecognized. For example, in 2013, Robin Hanson pointed out that confounding variables can be a problem:
Also, advisory decision market prices can be seriously distorted when decision makers might know things that market speculators do not. In such cases, the fact that a certain decision is made can indicate hidden info held by decision makers. Market estimates of outcomes conditional on a decision then become estimates of outcomes given this hidden info, instead of estimates of the effect of the decision on outcomes.
Finally, the flaw can be fixed. In statistics, there’s a whole category of techniques to get causal estimates out of data. Many of these methods have analogies as alternative prediction market designs. I’ll talk about those next time. But here’s a preview: None are free.
1. Suppose b is the true bias of the coin (which the supercomputer will compute). Then your expected return in this game is
𝔼[max(b, 0.50)] = 0.50 + 𝔼[max(b-0.50, 0)],
where the expectations reflect your beliefs over the true bias of the coin. Since 𝔼[max(b-0.50, 0)] is never less than zero, the contract is always worth at least $0.50. If you think there’s any chance the bias is above 50%, then the contract is worth strictly more than $0.50.
2. Suppose there’s a conditional prediction market for two coins. After a week of bidding, the markets will close; whichever coin had contracts trading for more money will be flipped, and $1 will be paid to contract-holders for heads. The other market is cancelled.
Suppose you’re sure that coin A has a bias of 60%. If you flip it lots of times, 60% of the flips will be heads. But you’re convinced coin B is a trick coin. You think there’s a 59% chance it always lands heads, and a 41% chance it always lands tails. You’re just not sure which.
We want you to pay more for a contract for coin A, since that’s the coin you think is more likely to be heads (60% vs 59%). But if you like money, you’ll pay more for a contract on coin B. You’ll do that because other people might figure out if it’s an always-heads coin or an always-tails coin. If it’s always heads, great, they’ll bid up the market, it will activate, and you’ll make money. If it’s always tails, they’ll bid down the market, and you’ll get your money back.
You’ll pay more for coin B contracts, even though you think coin A is better in expectation. Order is not preserved. Things do not work out.
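If you want the break-even arithmetic spelled out, here it is, under the idealized assumption that the other bettors will figure coin B out perfectly:

```python
# Coin A: activation tells you nothing new about the coin; conditional on
# the bet going ahead, the contract is worth $0.60. You'd pay at most:
max_price_A = 0.60

# Coin B: 59% chance it's always-heads (market activates, contract pays $1),
# 41% chance it's always-tails (market deactivates, you get your x back).
# Break-even: x = 0.59 * 1 + 0.41 * x, which solves to x = 1.00.
max_price_B = 0.59 / (1 - 0.41)

print(f"{max_price_A:.2f} {max_price_B:.2f}")  # 0.60 vs 1.00: you pay more
                                               # for the coin you like less
```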
3. Suppose you run a market where if you pay x and the final market price is y and z happens, then you get a payout of f(x,y,z) dollars. The payout function can be anything, subject only to the constraint that if the final market price is below some constant c, then bets are cancelled, i.e. f(x,y,z)=x for y < c.
Now, take any two distributions ℙ₁ and ℙ₂. Assume that:
ℙ₁[Y<c] = ℙ₂[Y<c] > 0
ℙ₁[Y≥c] = ℙ₂[Y≥c]
ℙ₁[(Y,Z) | Y≥c] = ℙ₂[(Y,Z) | Y≥c] (h/t Baram Sosis)
𝔼₁[Z | Y<c] ≠ 𝔼₂[Z | Y<c]
Then the expected return under ℙ₁ and ℙ₂ is the same. That is,
𝔼₁[f(x,Y,Z)]
= x ℙ₁[Y<c] + ℙ₁[Y≥c] 𝔼₁[f(x,Y,Z) | Y≥c]
= x ℙ₂[Y<c] + ℙ₂[Y≥c] 𝔼₂[f(x,Y,Z) | Y≥c]
= 𝔼₂[f(x,Y,Z)].
Thus, you would be willing to pay the same amount for a contract under both distributions.
Meanwhile, the difference in expected values is
𝔼₁[Z] - 𝔼₂[Z]
= ℙ₁[Y<c] 𝔼₁[Z | Y<c] - ℙ₂[Y<c] 𝔼₂[Z | Y<c]
+ ℙ₁[Y≥c] 𝔼₁[Z | Y≥c] - ℙ₂[Y≥c] 𝔼₂[Z | Y≥c]
= ℙ₁[Y<c] (𝔼₁[Z | Y<c] - 𝔼₂[Z | Y<c])
≠ 0.
The last line uses our assumptions that ℙ₁[Y<c] > 0 and 𝔼₁[Z | Y<c] ≠ 𝔼₂[Z | Y<c].
Thus, we have simultaneously that
𝔼₁[f(x,Y,Z)] = 𝔼₂[f(x,Y,Z)],
yet
𝔼₁[Z] ≠ 𝔼₂[Z].
This means that you should pay the same amount for a contract if you believe ℙ₁ or ℙ₂, even though these entail different beliefs about how likely Z is to happen. Since we haven’t assumed anything about the payout function f(x,y,z), this means that no working payout function can exist. This is bad.
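If you’d like something concrete, here’s a pair of distributions (numbers mine, with c = 0.5) that satisfy all the assumptions. They agree about everything that can happen when the bet activates and disagree only in the refund region, so no payout function can tell them apart:

```python
# Beliefs over (Y, Z): Y = final market price, Z = outcome.
c = 0.5
P1 = {(0.4, 0): 0.50, (0.6, 1): 0.35, (0.6, 0): 0.15}
P2 = {(0.4, 1): 0.50, (0.6, 1): 0.35, (0.6, 0): 0.15}

def expected_payout(P, f, x):
    # Bets with Y < c are cancelled: you just get your x back.
    return sum(p * (x if y < c else f(x, y, z)) for (y, z), p in P.items())

def expected_Z(P):
    return sum(p * z for (y, z), p in P.items())

f = lambda x, y, z: float(z)  # the standard $1-if-Z payout; any f works,
x = 0.37                      # at any price, since P1 and P2 only differ
                              # where the refund applies

print(expected_payout(P1, f, x), expected_payout(P2, f, x))  # equal
print(expected_Z(P1), expected_Z(P2))  # 0.35 vs 0.85: very different beliefs
```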