It's a great example of how unequal access gets compounded, and bias essentially multiplies with every layer that is stacked on top of it.
The bias present in large language models is a combination of the colonial heritage of English as an imperial language, the ascent of the American Empire after World War II, accumulated wealth, the allocation of network resources, and so forth.
Take IPv4 address allocation, for example; it's a pretty solid proxy for how biased new tech is going to be.