Paper on The perils and promises of fact-checking with large language models

I am very happy to share the first paper of my PhD student Dorian Quelle, "The perils and promises of fact-checking with large language models," published in Frontiers in Artificial Intelligence.

People are using LLMs, such as ChatGPT, to verify facts, so it is essential to understand how well they perform at this task. That is what we did in this article. We also propose a framework that enables LLMs to retrieve contextual data and allows users to verify their reasoning and the sources they use to reach a verdict. We find that GPT-4 performs well, but accuracy varies with the language and the veracity of the claim. Because these models still make mistakes, it is important to integrate mechanisms for verifying their verdicts. In particular, they hold potential as tools for accelerating the work of human fact-checkers.