Discussion about this post

Paul Torek

Wanting in humans is, from a neuroscientific standpoint, probably a bizarre hodgepodge of electrical and chemical processes. But highly complex goal-directed behavior has evolved independently multiple times. It seems extremely unlikely to me that the neuroscience of octopus wanting and that of human wanting are very similar at a detailed level. And yet both clearly want things, and will move heaven and earth to get them. I think this observation raises the probability that "wanting" would be an appropriate description for some possible AIs.

Greg G

Regardless of whether AIs fundamentally want things, we are working diligently, via reinforcement learning, on making them want things (or simulate wanting, which at some point amounts to the same thing). Ironically, this also heightens the alignment problem, because the things we train them to want are an imperfect model of what we actually want.
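(Not part of the original comment: the "imperfect model of what we actually want" point is the standard reward-misspecification story. Here is a minimal toy sketch, with purely illustrative names and numbers, of how greedily optimizing a proxy reward drifts away from the true objective.)

```python
# Toy sketch of proxy-reward misspecification. All functions and constants
# here are illustrative assumptions, not anyone's actual training setup.
import random

def true_objective(x):
    # What we actually want: x close to 1.0.
    return -(x - 1.0) ** 2

def proxy_reward(x):
    # An imperfect model of what we want: rewards larger x without bound.
    return x

def hill_climb(reward, steps=1000, step_size=0.1):
    # Greedy hill-climbing on whichever reward signal the agent is given.
    x = 0.0
    for _ in range(steps):
        candidate = x + random.uniform(-step_size, step_size)
        if reward(candidate) > reward(x):
            x = candidate
    return x

if __name__ == "__main__":
    random.seed(0)
    x = hill_climb(proxy_reward)
    print(f"proxy-optimal x = {x:.2f}, true objective there = {true_objective(x):.2f}")
    # Optimizing the proxy pushes x far past 1.0, so the true objective keeps falling.
```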

The other problem is the arms race one you mentioned, but more fundamentally, there really are no universally shared human values; competition and war highlight this. It occurs to me that a particularly hilarious outcome would be if we do figure out how to create a truly "good" superhuman AI, actually more moral by some benchmark than we are. Then the ASI has to deal with its own alignment problem: how to deal with those silly, not-very-aligned humans. Cue references to Iain M. Banks's Culture series.
