"Perhaps the best evidence of cost-cutting is the fact that GPT-5 isn't actually one model. It's a collection of at least two models: a lightweight LLM that can quickly respond to most requests and a heavier duty one designed to tackle more complex topics. Which model prompts land in is determined by a router model, which acts a bit like an intelligent load balancer for the platform as a whole": OpenAI's GPT-5 is a cost cutting exercise https://www.theregister.com/2025/08/13/gpt_5_cost_cutting/?utm_medium=share&utm_content=article&utm_source=linkedin