It’s 2017, and everyone is wrestling with fake news. Given the wide-ranging nature of artificial intelligence (AI) use cases, it is only natural to ask whether machine learning can help us combat fake news. It should be fairly simple to build ourselves a fact-checker, right? After all, software is able to flag spam email or pornographic images with a high degree of accuracy. It turns out that machine learning techniques are useful, but they have their limitations—and we are far away from a fully automated fact-checking tool.

Machine learning techniques, whether supervised, unsupervised, or deep-learning methods, are pattern detectors. In the supervised case, you provide a large set of labeled input and output data and train the algorithm to grasp the underlying patterns. This works excellently for narrowly defined tasks in which the corpus of knowledge is well-defined.
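
To make "pattern detector" concrete, here is a minimal sketch of a supervised text classifier, in the spirit of the spam filters mentioned earlier. It is a toy naive Bayes model written for illustration only; the training examples, the labels, and the function names `train` and `predict` are all made up for this sketch.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts = model
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for w in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, hand-written training data (an assumption for illustration).
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
]

model = train(EXAMPLES)
print(predict(model, "free prize money"))
print(predict(model, "team meeting monday"))
```

The model only learns which words co-occur with which label. It has no notion of whether a statement is true, which is why the approach works for a narrow task like spam but does not carry over directly to fact-checking.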

No reference dataset for detecting falsehoods exists. Suppose you use the entire web as your corpus for automated fact-checks. With that approach, you can only check whether a given claim has been made before, which tells you little about its veracity. Furthermore, such a system cannot handle breaking news, because there is nothing earlier to compare it against.

Alternatively, instead of using the entire web, you can create your own dataset. To train the fact-checking algorithm, such a dataset has to contain both real and fake news items. But in reality, the truth is often contested, particularly in the political realm. Any bias or ideology inherent in the sources involved is transferred to the dataset.

Another issue pertains to fact-checking at the news-article level. Each article consists of more granular claims, and you have to extract and evaluate each of these claims individually. But as students of dialectic know, some of the individual claims can be incorrect while the overall conclusion is still true.

Another significant challenge is the ability to transfer our intuitive sense of context to the algorithms. Most humans understand satire, humor, exaggeration, and other rhetorical devices and have a shared understanding of reality—that The New Yorker’s Borowitz Report or The Onion’s articles are false, but not false in the sense of fake news. It’s not possible for us to codify all of this context and our communication nuances. In short, when we try to develop an overarching fact-checker, we quickly realize that our current machine learning tools are excellent pattern-matchers and pattern-detectors, but not full-fledged reasoning machines.

So is there any way that we can employ machine learning in the fight against fake news? It’s not my contention that automated fact-checkers are useless. But given the particular nature of fake news, we should strive for domain-specific or narrowly targeted tools, for instance those that identify whether the headline matches the article content or flag seemingly dubious articles for further inspection by human fact-checkers. Ultimately, the fight against fake news cannot be won by technology alone. We have to bring in machine learning tools, but human and institutional resources are more important.
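
As a rough illustration of one such narrow tool, the sketch below flags an article for human review when its headline shares almost no vocabulary with its body. This is only a crude lexical-overlap heuristic (cosine similarity over word counts), not a real stance-detection model; the stopword list, the `threshold` value, and the function names are assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Small hand-picked stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "for", "on", "that", "by"}

def tokens(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def needs_review(headline, body, threshold=0.1):
    """Flag the article for a human when headline and body barely overlap."""
    return cosine_similarity(headline, body) < threshold

body = "The city council voted on Tuesday to approve a new budget for schools."
print(needs_review("City council approves new budget", body))
print(needs_review("Aliens land in Paris", body))
```

The output is a flag, not a verdict. A low score only says the headline and body use different words, so the decision about whether the article is misleading stays with a human fact-checker.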

Facebook’s response to calls for curbing fake news dissemination via its platform illustrates a multi-pronged approach. Users now have the ability to flag content as fake news, and once flagging passes a threshold level, such articles are sent for review by professional, non-partisan fact-checkers. If expert human reviewers deem an article to be false, it is labeled as such, and readers are alerted. Facebook’s machine learning algorithms also down-rank dubious articles so that they do not appear prominently in the News Feed, which reduces their reach. Here, humans and machines work together, rather than relying on the brute force of the machines themselves.
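
The flow described above can be sketched in a few lines. This is a hypothetical illustration of a human-plus-machine triage loop, not Facebook's actual system: the `Article` class, the `REVIEW_THRESHOLD` and `DOWNRANK_FACTOR` values, and the label text are all invented for the example, and down-ranking is tied to the reviewers' verdict here purely as a simplification.

```python
from dataclasses import dataclass
from typing import Optional

REVIEW_THRESHOLD = 3    # hypothetical number of user flags that triggers review
DOWNRANK_FACTOR = 0.2   # hypothetical multiplier applied to disputed articles

@dataclass
class Article:
    title: str
    base_score: float              # ranking score before any adjustment
    flags: int = 0                 # number of user reports
    verdict: Optional[str] = None  # set by human reviewers: "false" or "ok"

def needs_human_review(article):
    """Machines only route: enough flags and no verdict yet means send to humans."""
    return article.verdict is None and article.flags >= REVIEW_THRESHOLD

def feed_score(article):
    """Down-rank articles that human reviewers have judged false."""
    if article.verdict == "false":
        return article.base_score * DOWNRANK_FACTOR
    return article.base_score

def warning_label(article):
    """Text shown to readers; empty when there is nothing to warn about."""
    return "Disputed by fact-checkers" if article.verdict == "false" else ""

article = Article("Example story", base_score=10.0, flags=5)
print(needs_human_review(article))   # routed to reviewers
article.verdict = "false"            # the human decision
print(feed_score(article), warning_label(article))
```

Note where the judgment sits: the code counts flags and adjusts a score, while the `verdict` field is only ever set by a person.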

Fact-checking (so far) involves uniquely human faculties of reasoning and critical thinking. Automated fact-checkers are useful as a first line of defense, identifying dubious content for further inspection by humans and increasing their efficiency. Thus, in the fact-checking domain, humans lead, and machine learning algorithms play the supporting role.