i got so angry after reading this paper on LLMs and African American English that i literally had to stand up and go walk around the block to cool off: https://www.nature.com/articles/s41586-024-07856-5

it's a very compelling paper, with a super clever methodology, and (i'm paraphrasing/extrapolating) it shows that "alignment" strategies like RLHF only work to ensure that it never seems like a white person is saying something overtly racist, rather than addressing the actual prejudice baked into the model