Causal reasoning ability is crucial for numerous NLP applications. Despite the impressive emergent abilities of ChatGPT across various NLP tasks, it remains unclear how well ChatGPT performs on causal reasoning. In this paper, we conduct the first comprehensive evaluation of ChatGPT's causal reasoning capabilities.
Experiments show that ChatGPT is not a good causal reasoner, but rather a good causal interpreter. Moreover, ChatGPT suffers from serious hallucination in causal reasoning, possibly due to reporting biases between causal and non-causal relationships in natural language, as well as ChatGPT's upgrading processes, such as Reinforcement Learning from Human Feedback (RLHF). In-Context Learning (ICL) and Chain-of-Thought (CoT)
techniques can further exacerbate such causal hallucination. Additionally, the
causal reasoning ability of ChatGPT is sensitive to the words used to express
the causal concept in prompts, and closed-ended prompts perform better than open-ended prompts. For events in sentences, ChatGPT excels at capturing explicit causality rather than implicit causality, and performs better on sentences with lower event density and a smaller lexical distance between events.