45 Comments

I think there's a mindset from the non-social sciences that spills into social science in an unproductive way. No scientific finding that involves human subjects can divorce itself from human behavior (in this case, selection in who complies with treatment assignment). Instead of thinking about whether colonoscopies work, you think about whether the policy of nudging people to get colonoscopies works.

This type of evidence answers policy questions and is not necessarily applicable to individual choices or to the science of a specific procedure. Maybe in some cases, for some reason, you have very high compliance, double-blinding, etc. But studies that lack these unusually ideal experimental conditions don't lend themselves well to extracting the strict science of a procedure; it will always be obfuscated by the social stuff.

Expand full comment

This post reminds me of Scott Alexander's excellent article, "Getting Eulered":

https://slatestarcodex.com/2014/08/10/getting-eulered/

Expand full comment

@dynomight - did you intend for the brick estimates to disagree? I ask because the disagreement does not seem to be part of the analogy.

Alice = 1,000 kg, Bob = 1,985.5 kg

Expand full comment

OK, but after all is said and done, do you have an opinion on the effectiveness (or lack thereof) of colonoscopies? I don't know math from mashed potatoes, so I can't really tell from your write-up what you now think. I recently finished reading an interesting book called "Outlive: The Science and Art of Longevity", by Peter Attia, MD. He seems to know what he's talking about, and he comes out strongly in favor of colonoscopies, to the point where I'm thinking of getting another one even though I already had one about six years ago.

Expand full comment

I wish I had access to the full paper, but isn't the disagreement between your logic and the proposed solution (instrumental variables) over the validity of the instrumental variable selected? You seem to be arguing that the selected instrument was invalid, because "receives colonoscopy" actually does correlate with lower risk of having colon cancer, while the paper argues it doesn't and only instrumentally causes lower risk.

What was the justification provided in the paper for why that instrumental variable doesn't have an impact on the final variable?

Expand full comment

Here's a point I heard from a doctor on a podcast:

If everyone got a colonoscopy every three months, no one would die of colon cancer.

(Oops, I see that Alex C said roughly the same thing — it was Attia who made this point.)

Expand full comment

Economist here. I read the initial post and was very tempted to send a message exactly in line with what economists have sent. But then I read your post carefully and noticed your claim was that the 0.443% is biased _for the entire_ population, noted that is correct, and I didn't have any further objections. I fully endorse your "everyone is right" conclusion, and kudos for figuring it out.

Economists sometimes downplay the issue that the IV estimates are specific to the complier population, but that's something they should know. One classic reference about this is Imbens (2010) "Better LATE than nothing".

One note: you ask "I already figured out this number using the obvious assumptions and grade school algebra, why would I need stinking *instrumental variables*?" Turns out IV can get complicated in some cases, but it boils down to exactly the computation you did when you have a binary instrument and a binary independent variable. I think the intuition that "IV is rescaling" is underappreciated among economists.
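In the binary-instrument, binary-treatment case, the IV (Wald) estimate really is just that rescaling: the intention-to-treat effect divided by the difference in take-up rates. A minimal sketch using the figures quoted in this thread (a 0.186 percentage-point ITT effect and 42% compliance; the zero take-up in the control arm is an assumption for illustration):

```python
# Wald / IV estimate with a binary instrument and binary treatment:
# effect on compliers = ITT effect / difference in take-up rates.
itt_effect = -0.186       # invited vs. not invited, percentage points
takeup_invited = 0.42     # 42% of the invited got a colonoscopy
takeup_control = 0.0      # assumed: none of the uninvited did

wald = itt_effect / (takeup_invited - takeup_control)
print(round(wald, 3))     # -0.443 percentage points, the number in the post
```

This is exactly the "grade school algebra" computation, just written in IV notation.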

Expand full comment

Great post as usual, thank you.

But if you believe this 👇

> If two people disagree, it should be the responsibility of Dr. Fancy to explain what’s wrong with Dr. Simple, not the reverse.

Then how would you deal with Brandolini's law?

If we demand that scientists explain every axiom or theory to the public, then we may keep them from doing new research and making discoveries, because they'll be busy explaining to Dr. Simple that their eyes are biased and the earth is not flat.

Expand full comment

@dynomight - in the brick analogy, it seems like Occam's Razor would fit pretty well as a logical heuristic to put the explanatory burden on Dr. Fancy.

Expand full comment

A couple of things that made me scratch my head.

You 'assume the “decrease” for refusers is zero', but then later say 'refusers had less colorectal cancer than controls'. Is the control group the group of people not invited? If so, wouldn't these two statements be contradictory?

It seems reasonable to guess that the acceptors are biased, and that also means the refusers would be biased. But if so, then surely you can't make that assumption in your calculation?

Expand full comment

I think you and (from what you describe) your economist friends are wrong.

In short: the issue you point out is that being called for colonoscopy likely affects the probability of getting cancer; if so, that means the exclusion restriction fails and the LATE formula does not apply, so your economist friends are wrong. Your argument correctly shows that the probability of cancer in the colonoscopied population differs from that in the baseline population by less than 0.443%, but you are wrong to conclude from that that the decrease in probability from getting a colonoscopy in a random population is less than 0.443%.

Let me explain (sorry for the length).

Let Pi be the prob of getting cancer for those invited (at random) to get colonoscopy.

Pa the prob for those invited that accepted.

Pr the prob for those that declined.

Pb the baseline probability (for those not randomly invited).

We have:

Pi = 0.42 Pa +0.58 Pr. Subtracting Pb from both sides:

Pi - Pb = 0.42 (Pa -Pb) + 0.58 (Pr - Pb) (note that -0.42 Pb -0.58 Pb = -Pb )

So we find:

Pa - Pb = (Pi - Pb) / 0.42 - 0.58/0.42 (Pr - Pb)

Now, as you very nicely point out, and your economist friends seem to miss, Pr is possibly < Pb (those invited who initially refused the colonoscopy might gain some awareness of the importance of colonoscopy). THIS IS A KEY POINT: BEING INVITED MIGHT HAVE A CAUSAL EFFECT ON THE REFUSERS; this means that the EXCLUSION RESTRICTION FAILS and so the typical LATE formula is invalid. On the other hand, the bias could go the other way: those refusing the colonoscopy are likely less health-conscious, and less healthy, than the average population, and so might get more cancer. But the data seems to indicate that the first effect dominates.

Under your assumption that Pr < Pb, it does follow that Pa - Pb > (Pi - Pb)/0.42 = -0.186%/0.42; that is, the probability for those accepting the colonoscopy is larger than Pb - 0.186%/0.42 = Pb - 0.443%.
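The decomposition above can be checked numerically. A sketch with made-up values for Pa, Pr, and Pb (chosen only so that Pr < Pb, per the key point above; none of these are figures from the study):

```python
# Illustrative probabilities, in percent (assumed, not from the study).
Pb = 1.20   # baseline probability of cancer
Pr = 1.10   # refusers: assumed < Pb, per the exclusion-restriction worry
Pa = 0.90   # acceptors
share_a, share_r = 0.42, 0.58

Pi = share_a * Pa + share_r * Pr   # probability among the invited

# The decomposition: Pi - Pb = 0.42 (Pa - Pb) + 0.58 (Pr - Pb)
lhs = Pi - Pb
rhs = share_a * (Pa - Pb) + share_r * (Pr - Pb)
assert abs(lhs - rhs) < 1e-12

# When Pr < Pb, the simple rescaling (Pi - Pb)/0.42 understates Pa - Pb:
assert Pa - Pb > (Pi - Pb) / share_a
```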

But from that you cannot conclude that getting a colonoscopy in a random population decreases the probability of cancer by less than 0.443%.

What the economists get right is that Pa - Pb is not the (causal) effect of the colonoscopy; there is also a selection bias. Those accepting the colonoscopy are likely more health-conscious, so likely Pa < Pb + the causal effect of the colonoscopy on a random population (note: the causal effect is likely negative). On the other hand, maybe those accepting already have some minor symptom; then the selection bias goes the other way around, Pa > Pb + the causal effect.

This selection bias is what the LATE formula nicely solves, BUT ONLY IF THE EXCLUSION RESTRICTION HOLDS (i.e. treatment assignment affects the outcome only through its effect on treatment); in addition we need a monotonicity assumption. If the exclusion restriction fails, I'm not sure we can say anything about the causal effect (even in a subset of the population), but I'd have to think a bit more about it.

In the wiki article on LATE, the exclusion restriction is somewhat weirdly called the "excludability condition":

https://en.wikipedia.org/wiki/Local_average_treatment_effect

Also, LATE usually refers to local average treatment effect, not latent average ... since it is the average treatment effect in a particular population: those who change treatment due to the intervention.

Expand full comment

It's been a while since I did econometrics in undergrad, but I'll try to take a stab at this. Basically, if the procedure is unbiased the estimate is unbiased, and we can show that this experimental method eliminates selection bias.

In summary:

1. We want to know the effect of a colonoscopy on cancer.

2. We can randomly assign people to the colonoscopy group or the not-colonoscopy group, but we cannot randomly assign people to actually get the colonoscopy (they can refuse).

3. The difference between the colonoscopy group and the not-colonoscopy group is an unbiased estimate, because this is randomized. There is a fundamental difference between people who accept colonoscopy once assigned and people who don't accept colonoscopy once assigned, but because of randomization, there is an equal amount of this "fundamental difference factor" in both treatment and control. Hence, there is no bias.

4. The only way being randomly assigned to the colonoscopy group can impact cancer is from actually getting the colonoscopy. Hence, the effect of being randomly assigned to the colonoscopy group gives us an unbiased estimate of the effect of colonoscopy.

The key bit here is that the randomization process controls for selection bias in step 3 already, so as long as we don't introduce any additional bias in step 4, we have an unbiased estimate of what we really care about (the effect of colonoscopy on cancer). It is critical that assignment can only impact cancer via colonoscopy itself — if that's not true, it's biased again.

Part of why your discussion with the economists may have been less productive is that bias is a property of the estimation procedure: by definition, if the procedure is unbiased, the estimate is unbiased. So you can only refute a claim that the estimate is biased by pointing to the process and saying "look, we can prove there's no bias". Rather than being a flawed "parallel argument", I think it's that bias is process-dependent, so you can only really talk about it in those terms, plus there were too many layers of specialist terminology and math making it confusing.

(In more technical terms, the instrumental variable here is "assignment to the colonoscopy group". We use assignment to the colonoscopy group as an instrument to help estimate the effect of colonoscopy on cancer, since we cannot get an unbiased estimate of colonoscopy on cancer directly).
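The logic in steps 1–4 can be checked with a toy simulation (every parameter below is a made-up illustration, not a figure from the study): when assignment affects cancer only through the colonoscopy itself, the IV ratio recovers the true effect on compliers.

```python
import random
random.seed(0)

# Toy model (all assumed): 2% baseline cancer risk, colonoscopy cuts
# risk by 1 percentage point, 42% of the invited comply.
N = 500_000
y = {0: [], 1: []}   # cancer outcome, keyed by assignment z
d = {0: [], 1: []}   # got a colonoscopy, keyed by assignment z

for _ in range(N):
    z = int(random.random() < 0.5)        # randomly invited?
    complier = random.random() < 0.42     # would accept if invited
    got = z == 1 and complier             # actually treated
    risk = 0.02 - (0.01 if got else 0.0)  # exclusion restriction holds:
                                          # z matters only through `got`
    y[z].append(int(random.random() < risk))
    d[z].append(int(got))

mean = lambda xs: sum(xs) / len(xs)
itt = mean(y[1]) - mean(y[0])             # step 3: unbiased ITT effect
compliance = mean(d[1]) - mean(d[0])      # difference in take-up
iv = itt / compliance                     # approximately -0.01, the true effect
```

If you instead gave assignment a direct effect on the refusers' risk, the same ratio would no longer recover -0.01, which is exactly the exclusion-restriction concern raised in other comments.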

Expand full comment

Dr. Simple,

Your point that Dr. Big Brains needs to explain the problem is most excellent. Made my day complete to see how you laid out the argument.

Call me Molecule.

Expand full comment

I believe that you are making invalid assumptions regarding burden of proof.

You seem to be assuming that a rebuttal of a simple argument must necessarily be simple (and, less importantly, that the rebuttal of a complex argument must be complex). This does not follow, and if the rebuttal of a simple argument is actually complex (and I find this to be the most common case in my life), then there is little to be gained by rebutting it, as many people will view the complex rebuttal the same way they view a complex argument. You just end up wasting everybody's time and changing no minds. Contrariwise, I often find simple rebuttals to complex arguments, usually by simply disagreeing with one or more of the premises on which they are based.

Expand full comment

This comment ended up being snarkier than I had originally intended. Sorry.

Disclaimer: I did not read the big study because it was paywalled.

I think this is mostly (entirely?) a problem of context. In the context of going from data to a summary statistic, bias means estimator bias, not selection bias. Would you still have a problem if everyone explicitly said estimator bias? Do you think you would get a better response if you consistently said selection bias / biased by selection effects? I think Recht, in the sensible-med article you link, knows that IV only gives the treatment effect for those who are treated: "There is no perfect way to correct for the misestimates inherent to the intention-to-treat analyses." So does Wikipedia (https://en.wikipedia.org/wiki/Instrumental_variables_estimation#Interpretation_under_treatment_effect_heterogeneity): "Generally, different subjects will respond in different ways to changes in the "treatment" x. When this possibility is recognized, the average effect in the population of a change in x on y may differ from the effect in a given subpopulation."

The sensible-med article does include the line "If it varies from the per protocol analysis, we know the IV estimate is closer to the effect size and we know the per protocol analysis has bias." which I think is suspect, but it's not written by Recht but by the editor Prasad, so I'm willing to ignore it.

Also note, estimation means estimation of a parameter in a model. Everything is a model. Your model is the standard IV model plus a difference in response rates correlated with underlying risk. Granted, this is probably a better model. But do you know how to take the data and estimate all the parameters in your model? No. You've used the standard IV model and then reasoned that in your model the treatment effect parameter would be smaller, but not by how much. The advantage of using simpler, standard models is that you know which calculations to do to get the estimates, and even the confidence intervals of those estimates. Maybe you are unimpressed by that, because the standard model is "too simple". As the adage goes, all models are wrong but some are useful.

Expand full comment

Bob's task was to give a weight, not prove Alice wrong. So, why does Bob need to rebut anything? "There’s zero obligation to do anything else." He was asked for a weight and gave an answer of 1985kg. Yes, one of them has to be wrong. But, they can also both be wrong. Bob's pointing out of errors in Alice's method would not prove his method correct.

Not all simple mistakes are easy to find. Counting bricks sounds simple, but how can an error in the count be found other than by duplicating the work or offering proof through more complicated math?

Expand full comment