28 Comments
Elizabeth Thinks:

> (If you’re wondering why the absolute risk of cancer is so much higher for women than for men, I think it’s mostly that men are at much higher risk of dying from other stuff like cardiovascular disease. You can’t get cancer if you die from something else first.)

I don't think this is the correct explanation here. The graphic only shows risk of "alcohol-related cancer", which it defines as "breast, colorectum, esophagus, liver, mouth, throat, and voice box cancers." Women have a 13% lifetime risk of breast cancer, so that would be the primary driver of the increased risk for women. The reason the increased risk for women is less than 13 percentage points is the fact that men have a higher risk of cancer in general, especially esophageal and voice box/larynx (https://pmc.ncbi.nlm.nih.gov/articles/PMC9987657) which are counted as "alcohol-related cancer".

dynomight:

I think you're right! The table in that paper doesn't seem to include breast, mouth, or throat cancer, but I'm pretty much convinced my explanation is totally wrong, and I wish I'd hedged more than just writing "I think". I'll retract that statement, thank you!

Brian Moore:

Re: #10, "alcohol causes oxidative stress": this is the first time I've seen regular normal outlets use that term. Normally it's just weirdos on the internet? If "things that cause oxidative stress" are now conventional-wisdom bad + carcinogenic, then uh... aren't there a lot of other things that do as well? Is the Surgeon General endorsing applying the same "declared bad" criteria to those things?

dynomight:

I'd say that oxidative stress alone is only weak evidence. For alcohol we also know that the oxidative stress actually leads to DNA damage, plus we have all the other mechanisms, plus we have the observational data hinting that alcohol does in fact increase cancer.

My impression is that the major mechanism of alcohol causing cancer is acetaldehyde, because alcohol seems much better at causing cancer in people with genes that lead to more acetaldehyde buildup (mostly East Asians).

Emil O. W. Kirkegaard:

If you are trying to aggregate information, and the availability of that information depends on its direction or size, then the aggregation will be biased. This is the essence of publication bias, p-hacking, etc.

Douglas:

Well... I have to ask. Who is Doug from #4?

dynomight:

Doug is someone I contacted to ask whether they wanted to be acknowledged or linked in more detail, but I never heard back, so I went with the safe default of first name only. :(

(It wasn't you, was it?)

Douglas:

I doubt it, as I'm unaware of any messages. But the intersection of being named Doug, sharing a link to the HiTOP conceptualization of psychopathology paper (in a link list a few months ago), and hanging around this general part of the internet seems like a small overlap. It's a big small world, I guess!

dynomight:

Wait, is this you? You were a different Doug then!

https://dynomight.substack.com/p/arguments-3/comment/82302426

If that's you, then I'd be happy to change Doug and add a link to your blog (or whatever)

Douglas:

Nope, a different Doug! There's just that many of us out there, I guess!

dynomight:

Huh. Well, I certainly can't blame you for asking!

Steve Newman:

> Does it matter how information gets to you? For example, say a paper comes out that tries yelling at kids in school and finds that yelling is not effective at making them learn faster. You might worry: If the experiment had found that yelling was effective, would it have been published? Would anyone dare to talk about it?

> How much should you downweight evidence like this, where the outcome changes the odds you’d see it?

> If you’re a true believer fundamentalist Bayesian, the answer is: None. You ignore all those concerns and give the evidence full weight. At a high level, that’s because Bayesianism is all about probabilities in this world, so what could have happened in some other world doesn’t matter.

I'm confused by this: if you receive information via a channel that you know applies selective filtering, your knowledge of the filtering *must* affect how you update, mustn't it?

I'm not practiced in formal Bayesian reasoning, so I can't express this in the appropriate technical language; with that caveat:

Suppose I believe that the New York Times is selective in their reporting of natural disasters: they are more likely to report disasters in some countries and less likely in others. They report on a disaster in Japan. I can 100% update that there is, in fact, a disaster in Japan.

Then I open my morning edition of Weird School Experiments Daily and I see the report of a finding that yelling at kids is not effective. It's just one study, so even setting aside my concerns regarding reporting bias, I can't fully trust it. I should update somewhat in the direction of believing that yelling at kids is not effective to make them learn faster.

How much should I update? Well, imagine a world where:

- This particular question is the subject of 10 studies each year.

- The reality is that yelling is *slightly* effective, such that any given study has a 50% chance of concluding it is effective.

- My prior happened to be precisely correct – I believed that yelling is slightly effective.

If I update on published studies, and only the studies which find that yelling is ineffective are published, then 5 times each year I'll update somewhat in the direction of yelling being ineffective – pushing me away from the truth. If I update on published studies, and all of the studies are published, then I'll update back and forth but my belief will tend to stay in the vicinity of the truth.
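
Here's a minimal sketch of that world in Python. The 10% false-positive rate under "yelling is useless" and the binary works/doesn't-work hypothesis space are illustrative simplifications of mine, not part of the setup above:

```python
import random

random.seed(0)

# Per the setup above: if yelling works, a study concludes "ineffective"
# half the time. The 10% false-positive rate under "yelling is useless"
# is an assumption, just to give the alternative hypothesis a likelihood.
P_INEFF_IF_WORKS = 0.5
P_INEFF_IF_USELESS = 0.9

def belief_after(n_studies=50, publish_all=False):
    """Naively update P(yelling works) on each *published* study.
    The simulated truth is that yelling works."""
    p_works = 0.5  # prior
    for _ in range(n_studies):
        found_ineffective = random.random() < P_INEFF_IF_WORKS
        if not found_ineffective and not publish_all:
            continue  # the channel silently drops "yelling works" findings
        # naive Bayes update, treating the published result as an unfiltered draw
        if found_ineffective:
            l_works, l_useless = P_INEFF_IF_WORKS, P_INEFF_IF_USELESS
        else:
            l_works, l_useless = 1 - P_INEFF_IF_WORKS, 1 - P_INEFF_IF_USELESS
        p_works = l_works * p_works / (l_works * p_works + l_useless * (1 - p_works))
    return p_works

print("all studies published:        ", round(belief_after(publish_all=True), 3))
print("only 'ineffective' published: ", round(belief_after(), 3))
```

With everything published, the back-and-forth updates converge on the truth (in this binary toy model, toward "works"). With the filtered channel, every update points the same way, and belief in "yelling works" collapses, which is exactly the drift described above.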

So it seems that it is unambiguously incorrect to ignore known (or strongly suspected) selective filtering in an information channel? If I know about the selective reporting of yelling studies, and I see a study finding that yelling is ineffective, it seems like I could respond to this in one of two ways:

A) Ignore it. It contains zero bits of information, because I know a priori that all studies I read on this subject will contain this conclusion.

B) Perform some complicated analysis of how many studies I suspect are conducted on this topic, how many studies I see published, and what the actual effectiveness of yelling must be to result in the observed number of stories being published.

All of this ignores the fact that when we require a study to find that yelling is ineffective, we have not fully constrained the contents of that study – it might find that the effect size was almost large enough to be statistically significant, or it might find a much smaller (or even negative) effect size; there are all the details of how the study was conducted (e.g. sample size), etc. So I could get more sophisticated than B. But I think the point stands that knowledge of the properties of the information channel is important.

My presumption is that you could fit all this into formal Bayesian reasoning if you expand your analysis to include priors around what experiments are performed and which of those will be published. But also that your head would explode.
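
For what it's worth, here's roughly what that head-exploding analysis might look like (my notation, under the simplifying assumption that every "ineffective" finding is published and nothing else is):

```latex
% k   = number of published "ineffective" findings you observe
% n   = unknown number of studies actually run (needs a prior P(n))
% r_H = P(a single study finds "ineffective" | hypothesis H)
% Assumption: exactly the "ineffective" findings get published.
P(H \mid k) \propto P(H) \sum_{n \ge k} P(n) \binom{n}{k} r_H^{\,k} (1 - r_H)^{\,n-k}
```

The published count k is still informative, but only after you put a prior P(n) on how many studies get run, which is exactly the sort of analysis option B calls for.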

dynomight:

Hmmm, good example! I think *part* of the explanation is the issue I talked about here: https://dynomight.net/datasets/. And I think part of the issue is the complexity that comes up when you start treating "number of studies observed per year" as the random variable, rather than "value of a single study". But I don't think that fully resolves it either. I need to think about this more.

Steve Newman:

Agreed, the Datasets issue seems related.

I feel like it might be possible to express the thing I'm gesturing at in rigorous mathematical terms, but I'm too lazy to try. I think one of the key parameters would be the reliability of the information I'm receiving. "NYTimes says there is a natural disaster in Japan" is highly reliable. "Study says X" is less reliable, because there are many ways in which studies can be flawed or simply subject to bad statistical luck. And (I meant to say this the first time but forgot) I feel like knowledge that an information channel is selective interacts somehow with the reliability of the information being conveyed through that channel.

Inching in the direction of a mathematical formulation: if I read about a yelling study via a selective channel, I can be highly confident that the yelling study was performed; channel selectivity doesn't undermine that. But I need to be careful how I update my beliefs about yelling.

Michael Dickens:

> If you’re a true believer fundamentalist Bayesian, the answer is: None.

How do you argue against this as a non-true-believer non-fundamentalist? Like, your argument can't be "reject Bayes' theorem", because it's a theorem; it's definitely true. The standard anti-Bayes position is something like "priors aren't real", but I don't see how that helps you here. I think you would have to go as far as "reject subjective probabilities", which effectively means it's impossible to coherently reason about uncertain questions.

(I might be missing something because I realize that "subjective probabilities aren't real" is a popular position even among statisticians, but I don't understand how it's possible to have coherent beliefs if subjective probability isn't a thing.)

dynomight:

Personally, I think the fundamental idea of Bayesianism is "mixing together epistemic and aleatoric uncertainty: Good." There are valid reasons to not want to mix those together, in which case you probably don't want to be Bayesian.
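
The standard coin illustration: the randomness of a single flip is aleatoric, while uncertainty about the coin's bias θ is epistemic, and the Bayesian predictive probability happily blends the two into one number:

```latex
% Predictive probability of heads for a coin with unknown bias theta:
% the flip itself is aleatoric; the posterior p(theta) is epistemic.
P(\text{heads}) = \int_0^1 \theta \, p(\theta) \, d\theta = \mathbb{E}[\theta]
```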

Now, if I understand you, you're asking if it's possible to be "mostly Bayesian" but not accept the extreme view implied by the Voltmeter story? I've wondered that myself. I don't know of any coherent philosophy that would allow you to do that, or how exactly it would work. But I'm not sure it doesn't exist, or couldn't be invented!

MoltenOak:

> Personally, I think the fundamental idea of Bayesianism is "mixing together epistemic and aleatoric uncertainty: Good."

I've never encountered this characterization. Did you come up with this? Any further reading you can point to? (If you did come up with it, or even if not, I'd love some writing on this by you!)

dynomight:

Well, people do sometimes discuss this as part of Bayesianism, but I don't think it often gets nearly the centrality that it deserves. Thanks for the encouragement, I've actually got like a dozen Bayesian essays I'm theoretically working on...

Adam Mastroianni:

#4 strikes me as an exercise in epicycle-fitting, but I mean that in a descriptive way rather than a disparaging way. One clue we’re still dealing with geocentric psychology here is that the taxonomy is based on symptoms rather than causes. Imagine doing this for physical diseases instead—if you get really good at measuring coughing, sneezing, aching, wheezing, etc. you may ultimately get pretty good at distinguishing between, say, colds and flus. But you’d have a pretty hard time distinguishing between flu and covid, and you’d have no chance of ever developing vaccines for them, because you have no concept of the systems that produce the symptoms.

I think approaches like this, which administer questionnaires and then try to squeeze statistics out of them, are going to top out at that level. They’ll probably make us better at treating mental disorders, but not much better.

dynomight:

Interesting, thanks. That brings up something I always wonder about psychology, which is... Do we have good reasons to suspect that "differences in kind" exist? To take the disease analogy, we know that covid and flu are different in kind because they're caused by different kinds of viruses. But with psychopathy vs., I don't know, BPD, is there any reason there is any simple boundary to be drawn? To talk about epicycles is to imply that you're building an overly complex model when a simpler one exists, but what if there is no simpler model to be found and epicycles are all we've got?

Adam Mastroianni:

That’s a good question that I wish we all thought about more!

I don’t know if I have a principled way of measuring complexity, but I wonder whether geocentrism really was the simpler model. “Planets go in circles around the Earth because circles are perfect and the Earth is important” is pretty straightforward as long as you don’t ask too many questions. “Planets go in ellipses around the sun because something something gravity spacetime??” seems harder and what’s gravity anyway?

Regardless, I think the reason we should expect better explanations to exist is that the ones we have are not that good. Mysteries still abound—we don’t really know why people get sick in the head, or how to make them better. But we do know that it can’t just be random, because we’re able to pick up signal well enough to make graphs like the one in the post. So either we’ve detected all the signal there is and it’s random number generators from here on out, or there’s signal left to detect. I’m betting there’s signal left because so far in history that’s always been true, even when it didn’t seem like it could possibly be true.

We can ask the same question about the past. For a few thousand years, alchemists thought everything was a mix of sulphur and mercury. Should they have suspected differences in kinds that they hadn’t discovered yet? I’m honestly not sure; their theory seemed to work for some things and obviously it didn’t work for other things, but maybe those other things were always going to be impossible. You can’t know for sure until you try, but then, ain’t that the point of science?

Andrew Lekashman:

I had the pleasure of meeting Ted Nelson at the Internet Archive, and he walked me through exactly what he was trying to accomplish with Xanadu. Interestingly enough, now that AI foundation model companies are attempting to grab every last piece of data on the internet, they may actually wind up creating something resembling his dream, in the form of a massive vector database of the internet. The part that Mr. Nelson didn't put into the idea was the concept of a fast daemon that matches the intent of the user with content from Xanadu.

dynomight:

Yeah, I'm personally of the opinion that AI is going to force us to rethink our intellectual property regime. So I do wonder if Xanadu-esque ideas are due for a rebound, as we invent some kind of distributed royalty scheme so that we can both have AI and also incentivize people to create (more of) the data that AI needs.

ZFC:

Claude tells me that cancer risks are dominated by total consumption, not the distribution of consumption. Is this actually true?

dynomight:

You mean, is drinking 1 drink/day the same as having 7 drinks once per week? To the best of my knowledge, no one really knows. (After all, it's all kinda shaky, since we're relying on observational data. But there's surely some causal impact since the mechanisms are well established.)

Jason Crawford:

One of my all-time favorite articles is “The Curse of Xanadu,” by Gary Wolf, WIRED, 1995.

I wrote about it here: https://jasoncrawford.org/the-lessons-of-xanadu

slice:

Related, a fun post about this and other Xanadus: https://www.tumblr.com/fipindustries/627667590962151424/the-dream-of-xanadu

> i hereby propose that the Xanadu software was the third instance of this phenomenon Borges describes thusly:

> Perhaps an archetype not yet revealed to men, an eternal object (to use Whitehead’s term), is gradually entering the world; its first manifestation was the palace; its second was the poem. Whoever compared them would have seen that they were essentially the same.

dynomight:

Thanks, this is great! Extremely helpful both in understanding what Xanadu was/is and the general context. This quote is the best description I've seen anywhere:

> In contrast to the later Web, links in Xanadu did not point to entire documents, but to any arbitrary range of characters within any document. Links were to be bi-directional, so they could not be broken. And there was an advanced feature in which “parts of documents could be quoted in other documents without copying”

I had no idea there was supposed to be a royalty and copyright scheme integrated into the platform. I wonder if there's a group of crypto/web3 people somewhere who have latched on to all these ideas...
