"So the downward pressure on model size is putting upward pressure on training compute. In effect, developers are trading off training cost and inference cost."
Models need to shrink to make inference economically viable in the long term (we already see a focus on smaller models in the market). Meanwhile, capabilities don't seem to be increasing; they have basically plateaued. A rough sketch of the trade-off follows.
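To make the quoted trade-off concrete, here is a minimal back-of-the-envelope sketch using the standard approximations (training compute ≈ 6·N·D FLOPs for N parameters and D training tokens; inference ≈ 2·N FLOPs per generated token). The specific parameter counts, token budgets, and serving volume below are illustrative assumptions, not figures from the article.

```python
# Sketch of the training-vs-inference compute trade-off.
# Approximations: training FLOPs ~ 6 * N * D; inference FLOPs ~ 2 * N per token.
# All concrete numbers are illustrative assumptions.

def training_flops(n_params: float, n_train_tokens: float) -> float:
    """Approximate total training compute."""
    return 6 * n_params * n_train_tokens

def inference_flops(n_params: float, n_served_tokens: float) -> float:
    """Approximate total inference compute over deployment."""
    return 2 * n_params * n_served_tokens

# A large model trained near the compute-optimal ratio (~20 tokens/param)
# vs. a smaller model "overtrained" on far more data.
big_n, big_d = 70e9, 1.4e12      # 70B params, 1.4T training tokens
small_n, small_d = 8e9, 15e12    # 8B params, 15T training tokens

served_tokens = 1e13             # assumed lifetime inference volume

for name, n, d in [("big", big_n, big_d), ("small", small_n, small_d)]:
    train = training_flops(n, d)
    infer = inference_flops(n, served_tokens)
    print(f"{name}: train={train:.2e}  inference={infer:.2e}  "
          f"total={train + infer:.2e} FLOPs")
```

Under these assumptions the smaller model actually costs more to train (7.2e23 vs 5.9e23 FLOPs) because it is fed many more tokens, but at high serving volume its inference bill is roughly a tenth of the large model's, so total compute comes out lower: exactly the "upward pressure on training compute" the quote describes.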
(Original title: AI scaling myths)