Yes, and...!
> It's not very good at simple arithmetic.
This is a recurring example that illustrates the difference between bare LLMs and the products built on top of them. E.g., ChatGPT is a product built on top of a system. That system has a lot of components. One of those components is an LLM. Another is a Python interpreter. LLMs can write Python quite well, and Python can do math quite well.
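To make the division of labor concrete, here's a rough sketch of that kind of system. It is not how ChatGPT is actually wired up: `fake_llm_write_code` is a made-up stand-in for the LLM (a real system would call a model there), and `run_arithmetic` is a tiny hypothetical "interpreter" that only accepts arithmetic expressions.

```python
import ast
import operator

# Hypothetical stand-in for the LLM component. A real system would call a
# model here; this stub just returns canned Python for one known question.
def fake_llm_write_code(question: str) -> str:
    canned = {"What is 123456789 * 987654321?": "123456789 * 987654321"}
    return canned[question]

# Binary operators the toy interpreter is willing to execute.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}

def run_arithmetic(expr: str):
    """Evaluate a pure arithmetic expression; reject anything else."""
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -ev(node.operand)
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval"))

def answer(question: str):
    code = fake_llm_write_code(question)  # the "LLM" writes the code
    return run_arithmetic(code)           # the "interpreter" does the math

print(answer("What is 123456789 * 987654321?"))
```

The "model" never multiplies anything. It only has to produce the right expression, and Python's arbitrary-precision integers handle the rest.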
Seems like a pretty intelligent system to me!