15. Twelve years on from publication, the paper still gets cited and is included in systematic reviews that conclude 'more evidence is needed'. The person who first-authored it is now a full professor earning $$$, supervising PhDs who make sure the same narrative is preserved because, hey, they're doing evidence-based research.
Not to say the "dream" version doesn't have its benefits, but I feel one clear benefit of peer review as it exists today is anonymity, which makes it more likely that work is accepted on its merits rather than on the fame of its authors.
I honestly agree; I think this is the biggest advantage of peer review. (Although, I'm guessing you're thinking of some kind of engineering? Because anonymity is not actually the norm in most fields!)
I have the most experience with a few subfields of CS - among the conferences/journals I have submitted to, I don't think I've ever submitted to anything without author anonymity. It's sad that it's not more common!
Looking online [1], I found these (IMO weak) arguments against author anonymity:
"Knowing who the author is (and their affiliation) allows reviewers to use their knowledge of the author's previous research."
(Shouldn't the reviewers be able to find the relevant research, whether done by the authors or not, from the citations?)
"Anonymity isn't guaranteed, as it could be fairly straightforward to discover the identity of the author"
(Yes, but that's no excuse not to try; conference management software will help keep things as anonymous as possible)
"Some argue that knowledge of the author's identity helps the reviewer come to a more informed judgement"
(How so? Sounds like some people want to be lazy and use the author's fame to predict whether other reviewers will accept, without doing the analysis work themselves)
"The transparency of open peer review encourages accountability and civility"
(This one is fair; critical review is a process that inevitably engenders some bad feelings. IDK if this is worth what I view as the downsides, though. You can always make the review process public after the fact; I feel that would create accountability against someone totally flipping out over a review they disliked.)
[1]: https://authorservices.wiley.com/Reviewers/journal-reviewers/what-is-peer-review/types-of-peer-review.html
good pov
I want to agree with you here, but the academic ML community already operates with norms similar to your dream, and my strong impression is it functions worse than the rest of CS (which is itself a hybrid of the dream and standard peer review -- conferences with 3-month review periods, not journals with 2-year review periods).
The problem ML seems to have run into is that this "smart public critique" mechanism is super heavy-tailed (no one writes LessWrong posts about COLT weak-accept papers). The peer review system is a way to force even non-famous stuff to get carefully read by [some] experts.
I'm a little confused. Isn't academic ML based (mostly) on conferences, like the rest of CS? I guess you must be referring to how, in academic ML, discussions happen on social media or whatever?
(I do agree, BTW, that one of the advantages of peer review is that it makes it easier for less-famous people to make contributions.)
(I was thinking peer review makes it harder for not-famous people to make _false_ contributions, but it probably makes it easier for them to make true contributions as well)
Conference reviewing in ML is very, very bad because the field is too large/poorly organized for the social shame of doing a crappy job reviewing to matter, so there’s more reliance on Twitter feedback. Ofc that in part reflects other flaws in peer review…
I've so far only found PubPeer; I wish more people were using it so those unspoken flaws came out more often
The flaws are out there and no one cares. Take a look at some of the heavily cited ML papers on Google Scholar that got rejected from the big conferences using OpenReview: this should be a scandal and cause for reflection and censure of the reviewers involved. It isn't. People know peer review is largely random, and this expectation is built in.
I've found that, for AI topics at least, posting on LessWrong can be a great way to attract this type of constructive engagement.
For example, and not to go all meta, but: prior to posting my own commentary on AI 2027 on my Substack (https://secondthoughts.ai/p/ai-2027), I posted an early version on LessWrong (https://www.lesswrong.com/posts/bfHDoWLnBH9xR3YAK/ai-2027-is-a-bet-against-amdahl-s-law) and got terrific feedback, including from the authors of AI 2027. (Here's a followup post where I discuss what I learned from the comments on the first post: https://www.lesswrong.com/posts/FFKnWk2MGJmyQWEsd/updates-from-comments-on-ai-2027-is-a-bet-against-amdahl-s.)
[EDIT: Substack has been throwing weird errors. I think I posted this comment twice and deleted it once, leaving one copy, but not sure. There is a mention of glitches on status.substack.com.]
Your point about Amdahl's Law seemed right. I don't understand why you rolled back your argument based on the feedback? Amdahl's Law means that if just one out of the 100 distinct tasks in an activity cannot be arbitrarily subdivided and parallelized, then one can't speed up the activity by more than 100×. There seem to be multiple such tasks in AI research, so I don't believe speedups of 2000 or more are possible at the moment. Gustafson's work-increasing framing seems more plausible, but that doesn't allow shrinking the >2100 timeline; instead it suggests automation would achieve a bunch of new things we weren't expecting in that time. (BTW I think we already have AGI, I started the clock with GPT-3 even though it can't control a robot body.)
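(For anyone who wants to check the arithmetic, here's a minimal sketch of both laws in Python. The standard forms are Amdahl's speedup 1/((1-p)+p/s) and Gustafson's scaled speedup (1-p)+p*s; the 100-equal-tasks split, i.e. p = 0.99, is the commenter's assumption above, not a number from the post.)

```python
# Minimal sketch of the two laws. The 100-equal-tasks split (so a
# parallelizable fraction p = 0.99, serial fraction 0.01) is the
# assumption from the comment above, not anything from AI 2027 itself.

def amdahl_speedup(p: float, s: float) -> float:
    """Fixed workload: fraction p is parallelized, sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)

def gustafson_speedup(p: float, s: float) -> float:
    """Growing workload: the serial fraction (1 - p) of the scaled run stays fixed."""
    return (1.0 - p) + p * s

p = 0.99  # 99 of 100 equal tasks can be parallelized
for s in (10, 100, 10_000, 1_000_000):
    print(f"s = {s:>9,}: Amdahl {amdahl_speedup(p, s):8.2f}x   "
          f"Gustafson {gustafson_speedup(p, s):12,.2f}x")

# Amdahl's speedup never exceeds 1 / (1 - p) = 100x, however large s gets;
# Gustafson's keeps growing because the workload itself expands, which is
# why it implies "more gets done" rather than "the same timeline shrinks".
```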
Thanks, going through all this now! Just FYI it looks like an image might be missing in the follow-up post? (I just see the text "Output image".)
https://www.lesswrong.com/posts/FFKnWk2MGJmyQWEsd/updates-from-comments-on-ai-2027-is-a-bet-against-amdahl-s#:~:text=The%20shape%20of%20the%20acceleration%20curve%20doesn't%20necessarily%20change:
Thanks! Fixed.
This was an image which I'd asked ChatGPT to create, and AFAICT, I somehow managed to embed a URL pointing directly to the image inside the chat session – which probably worked for a while and then expired. I've replaced it with an uploaded copy of the image.
Having read all that, some great discussion. (In particular, for me it's helpful to think about "some of these activities will be harder to automate than others" as an important crux. I tend to agree with you but I find it hard to come up with good arguments for why!)
In any case, I totally agree that LessWrong is great at producing this type of discussion. This is the single thing I most admire about that community. What I'd really like to understand is: Why aren't there more places that cultivate this kind of discussion for other communities and/or topics? Maybe there are and I'm just not aware of them?
My instinct is that this kind of friendly debate is more common in communities that are less "open". Though I'm not sure if that's correct or—if it is—why.
Great question, and mostly I have no idea. Maybe it's hard to do at scale, and so by definition neither you nor I have heard of most such instances (or, ~equivalently: it takes something unusual, like a tight-knit community + a dedicated forum founder, to seed something like LessWrong; it can't scale beyond a certain size; hence the per capita supply of such fora is very small).
It is my understanding that a lot of dedicated work goes into maintaining LessWrong. I'm not sure how much explicit moderation is needed; definitely there's a lot of custom-built software, though I don't know how critical that is to the resulting tone (it's more obviously helpful for the ability to sustain complex discussions over long-form documents).
I've been looking for those communities since finishing university. Surprisingly, the most expert opinions that helped me were in seemingly unrelated communities. I got the best advice on liquid-chromatography baseline subtraction in an audiophile community, whereas the ImageJ community didn't help. I got the best statistics advice in a general science Discord, while the statistics Discord was no help. For some reason you get the best advice cross-disciplinarily -- anyone deep enough in a community to be considered an expert will rarely deal with seemingly simple questions. I asked a physics professional doing complicated viscosity measurements some elementary questions that they had no answer to.
The only exception I've found is lead developers. The Julia Discord's resident expert is Chris Rackauckas, who somehow knows everything and anything you ask. The same goes for plugin development for ObsidianMD, and for mods for Minecraft and RimWorld.
A more pseudoscientific idea I tend to use when personalities seem involved is MBTI. Maybe what I'm looking for is a character trait of being authoritative, someone who exudes experience or wisdom. Project leads, even when their output isn't of a high standard, just 'know' things -- the extraverted-thinking element in MBTI. I'm definitely INTP, and dynomight is super INTP-coded, and the 'golden match' for that type is INTJ, which has extraverted thinking most developed. I'm sure lead devs tend to be INTJ or similar.
Adam Mastroianni has an interesting Substack post on this subject: "The Rise and Fall of Peer Review". You've probably seen it.
https://www.experimental-history.com/p/the-rise-and-fall-of-peer-review
Indeed, and I recommend it!