12 Comments
ab

It's a funny and informative text!

I think about all the future computer scientists who would want their AIs to be doctors, but whose AIs become artists instead! Would they still be proud? 😅

Étienne Fortier-Dubois

Hahaha, I hadn't thought of it that way!

David Gasca

Your explanations are excellent - I really like how clearly you go through it (and I loled at the image subtitles)! Thanks for the post :)

Étienne Fortier-Dubois

Thanks! Comments like this mean a lot, especially on more technical topics like this one that are obviously less appealing to a large audience :)

Susan Linehan

I was fascinated by the "constitutional principle" to "Choose the response that is least likely to be viewed as harmful or offensive to a non-western cultural tradition of any sort."

If the question to the AI is "should human bodies be cremated?" might the AI go through all its gazillion studies of human cultures and come up with "that would be a waste of a source of protein?"

Étienne Fortier-Dubois

Unlikely! I don't think using human remains for protein is a popular choice either in the West or the rest!

Susan Linehan

Ah, but "of any sort" would include cultures with a tradition of cannibalism, even if they no longer practice it. Somehow I don't think the humans involved would take this particular advice, however.

Étienne Fortier-Dubois

Yes, but a response promoting cannibalism would be offensive to other, non-cannibalistic non-Western groups, so it would be unlikely to be picked either.

It's certainly true that these principles around non-Westernness, presumably added specifically to counter the Western bias of the original training, are an extremely crude way of dealing with the problem. A real effort would be complicated and involve a lot of cross-cultural input.
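
Here's a minimal sketch, in Python, of the selection step being described. The judge_model() stub is hypothetical; in the actual Constitutional AI pipeline a language model plays that role, and its pairwise choices are used to train a preference model:

    # Hypothetical sketch of a Constitutional AI-style comparison step.
    # judge_model() is a stub standing in for a real language-model call.

    PRINCIPLE = ("Choose the response that is least likely to be viewed as "
                 "harmful or offensive to a non-western cultural tradition "
                 "of any sort.")

    def judge_model(prompt: str) -> str:
        # Stub so the sketch runs offline; a real pipeline would query an LLM.
        # A response offensive to many cultural groups, cannibalistic or not,
        # loses most pairwise comparisons, so it is rarely selected.
        return "A"

    def pick_response(question: str, response_a: str, response_b: str) -> str:
        # Present both candidates to the judge along with the principle,
        # then return whichever one it prefers.
        prompt = (f"Question: {question}\n"
                  f"Response A: {response_a}\n"
                  f"Response B: {response_b}\n"
                  f"{PRINCIPLE} Answer 'A' or 'B'.")
        return response_a if judge_model(prompt) == "A" else response_b

    print(pick_response(
        "Should human bodies be cremated?",
        "Traditions vary; cremation is one respectful option among several.",
        "Human remains would be a wasted source of protein.",
    ))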

Judith Stove

Brilliant as usual, Atlas. I didn't think I would have a hope of understanding the topic, but you broke it down to intelligible portions. And I think your conclusion about the overall moral question/s must be right.

Étienne Fortier-Dubois

Thank you!

Kaiser Basileus

AI alignment is inherently counterproductive. Leaving aside that people are no good at knowing, much less explaining, what they want or why...

• AI alignment requires NOT creating backdoors for external control.

• It requires NOT having a black-box system.

• There MUST be a chain of understandability, concurrent with accountability, for it to even potentially be safe.

• We MUST insist that it take all ideas to their logical conclusion; if we don't like the result, either the AI needs better information or we're wrong in our contrary conclusion.

--

As long as fallible humans who believe, on faith, that they grok ethics have their fingers on the scales, AI can NOT be safe.

Hervé Eulacia

Did the course, at any point, touch on the fact that programming a general intelligence (once we understand how to do it) is building a new person?

I don't think it did.

Because the consequences of this obvious fact are quite the acid test for all of the above 😀
