@UlrichJunker In that piece I think I’m talking about instruction tuning rather than RLHF as such? It was an earlier advance, although the purposes are similar. But I agree with you that this whole topic is under-discussed. One way to put it is that the tuned models addressed the Stochastic Parrots critique that language models weren’t grounded in a communicative situation. But it served no one’s polemical purpose to take note of that.