Turning fake news against itself: AI tool can detect disinformation with 92% accuracy

Fake news is already a massive problem worldwide, and with continuing improvements in AI-powered content generation tools, we are not far from the era of neural fake news, i.e., fake news generated by AI. That would make it an even more formidable challenge for publishers.

Currently, bots are being used to spread fake news, but advanced AI models capable of consistently generating convincing pieces of disinformation are not yet widely available.

Researchers are already working to counter such a scenario. An artificial intelligence model developed by researchers from the University of Washington and the Allen Institute for AI (AI2) can spot fake news with 92% accuracy, according to the team that developed it.

This is a substantial improvement compared to other similar detectors, the best of which have an accuracy rate of 73%.

The algorithm, Grover (short for “Generating aRticles by Only Viewing mEtadata Records”), can analyze more aspects of a news article than other tools. These include the body of the article, the headline, author name, publication name, and other details that could indicate foul play.

It could prove to be a powerful tool for publishers dealing with fake news.

“More trustworthy than the original”

Grover was trained on a 120GB library of real news articles from the top 5,000 publications tracked by Google News between the end of 2016 and March of this year. What makes it so effective at spotting fake content is that it is also adept at creating it. Given a headline prompt, Grover can generate an entire news article written in the style of a legitimate news outlet or author.

Part of a fake article generated by Grover. Source: Defending Against Neural Fake News

There’s nothing hard coded — we haven’t told the model who Paul Krugman is. But it learns from reading a lot. The system is just trying to make sure that the generated article is sufficiently like the other data it associates with that domain and author. And it’s going to learn things like, ‘Paul Krugman’ tends to talk about ‘economics,’ without us telling it that he’s an economist.
Rowan Zellers, Project Lead, to TechCrunch

In an experiment, the researchers found that the system can also generate propaganda that readers rate more trustworthy than the original, human-generated versions.

Example of human-written and machine-written articles arguing against fluoride, with the average ratings from human rating study. Source: Defending Against Neural Fake News

“Attack their own systems”

According to the team, this capability can be used for modeling potential threats from those who use AI to create fake news. The approach is similar to that used by cybersecurity professionals, who regularly attack their own systems in order to discover weaknesses.

Our work on Grover demonstrates that the best models for detecting disinformation are the best models at generating it. The fact that participants in our study found Grover’s fake news stories to be more trustworthy than the ones written by their fellow humans illustrates how far natural language generation has evolved — and why we need to try and get ahead of this threat.
Yejin Choi, Professor at the Allen School’s Natural Language Processing group and Researcher at AI2

Manohar Paluri, Director, Facebook AI, echoes Choi: “If you have the generative model, you have the ability to fight it.”

So how does Grover tell fake news from real news? The researchers say that even the best examples of neural fake news are based on learned style and tone, rather than a real understanding of language and the world. As a result, they contain evidence of the true source of the content.

Despite how fluid the writing may appear, articles written by Grover and other neural language generators contain unique artifacts or quirks of language that give away their machine origin. It’s akin to a signature or watermark left behind by neural text generators. Grover knows to look for these artifacts, which is what makes it so effective at picking out the stories that were created by AI.
Rowan Zellers, Lead Author, Defending Against Neural Fake News

Although Grover will recognize its own quirks, the ability to detect evidence of AI-generated fake news is not limited to its own content. The algorithm is better at detecting fake news written by both humans and machines than any system that came before it. The paper states that Grover can detect human-written fake news with 98% accuracy.

Give a line, get a page

Many publishers already use AI-powered automated systems to produce articles at scale. Forbes’ CMS, called Bertie, suggests content and titles. Bloomberg uses Cyborg for content creation and management. The Washington Post’s Heliograf can generate articles from quantitative data. The Guardian, Associated Press, and Reuters are also working with AI tools.

These systems are becoming steadily more advanced. OpenAI, a non-profit artificial intelligence research company, developed a sophisticated language model earlier this year. The model, called GPT-2, excels at language modeling, which is the ability to predict the next word in a given sentence.

It just requires a line or two of input to generate several paragraphs of plausible content. It can write across different subjects, and even mimic specific styles and tone to produce text that is rich in context and nuance.
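To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is not how GPT-2 works internally: GPT-2 is a large neural network, while this toy simply counts which word most often follows another in a tiny sample text. The function names (`train_bigram`, `predict_next`, `generate`) and the sample sentence are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows, prompt, max_words=10):
    """Extend the prompt one word at a time by repeated next-word prediction."""
    out = prompt.lower().split()
    if not out:
        return ""
    for _ in range(max_words):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break  # no known continuation for the last word
        out.append(nxt)
    return " ".join(out)


# Tiny illustrative training text (an assumption, not real news data).
corpus = "the model reads the news and the model writes the story"
model = train_bigram(corpus)

print(predict_next(model, "the"))            # most common word after "the"
print(generate(model, "news", max_words=3))  # continue a one-word prompt
```

The same loop (predict a word, append it, predict again) is what lets a short prompt grow into a longer passage. A real language model replaces the simple counts with probabilities learned from a very large body of text.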

GPT-2 can generate an entire article from a headline, complete with fake quotes and data. OpenAI declined to release the full version, deeming it too dangerous.

“The best defense we have”

It’s just a matter of time before the purveyors of fake news start using advanced AI to generate it at scale. That could make detection and containment even more difficult for news organizations, because AI could emulate legitimate news sources in enormous volume.

It is already virtually impossible for publishers to fact-check every news story, given the sheer number of fake news stories and their rapid growth. Automating detection so that such stories are flagged as early as possible appears to be the best solution.

According to the researchers, in the long run tools like Grover, driven by clear methodologies to reproduce and deconstruct disinformation, are necessary to prevent the spread of fake news.

Choi said, “People want to be able to trust their own eyes when it comes to determining who and what to believe, but it is getting more and more difficult to separate real from fake when it comes to the content we consume online.

“As AI becomes more sophisticated, a tool like Grover could be the best defense we have against a proliferation of AI-generated fake news.”