Facebook AI: Fighting Misinformation and Hate Speech

November 19, 2020
Facebook’s effort to combat the spread of false information is ongoing, and the company acknowledges the challenge will likely persist indefinitely. That has not diminished its commitment to continuously improving its automated systems, which are essential for minimizing harmful content and misinformation on the platform. Chief Technology Officer Mike Schroepfer recently highlighted the latest advancements in a series of updates.

These enhancements focus on the AI-powered systems Facebook utilizes to proactively identify and address issues like spam, inaccurate news stories, and offensive language – all before they are seen by users or even Facebook’s content moderation teams.

A key area of improvement lies in the language analysis technologies employed to detect hate speech. Schroepfer explained that this is a particularly sensitive area, requiring a cautious approach. While identifying potential scams through automated systems carries minimal risk, incorrectly flagging legitimate content as hate speech can have significant consequences. Therefore, a high degree of certainty is crucial when making such determinations.

The nature of hate speech and related content can be remarkably subtle. Even seemingly clear instances of racism can be altered in meaning with the addition or modification of a single word. Developing machine learning systems capable of understanding the nuances and complexities of language demands increasingly substantial computing power.

To manage the growing computational cost of scanning the billions of posts published daily, Facebook has developed a new tool called Linformer (“linear” + “transformer”). Rather than computing the attention mechanism at the heart of transformer-based language models exactly, the tool approximates it, achieving comparable performance at a lower computational cost. (Those familiar with the technical details will appreciate this innovation.)

This translates to better language comprehension without a substantial increase in processing requirements. Facebook no longer has to, for example, run a less sophisticated model as a first pass and reserve the more powerful model for content that looks potentially problematic.
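For readers who want to see the idea in concrete terms, here is a minimal sketch of Linformer-style attention in plain Python. It is an illustration under stated assumptions, not Facebook's implementation: the function names, the tiny matrix sizes, and the random projection matrices `e_proj` and `f_proj` are all hypothetical (in a trained model the projections would be learned). The point is only that projecting the keys and values down to a fixed number of rows shrinks the attention score matrix from n x n to n x k.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def linformer_attention(q, k, v, e_proj, f_proj):
    """Approximate attention using projected keys and values.

    q, k, v are n x d; e_proj and f_proj are k_dim x n (hypothetical,
    random here). The score matrix is n x k_dim instead of n x n.
    """
    d = len(q[0])
    k_small = matmul(e_proj, k)             # k_dim x d
    v_small = matmul(f_proj, v)             # k_dim x d
    scores = matmul(q, transpose(k_small))  # n x k_dim
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = softmax_rows(scaled)
    return matmul(weights, v_small), weights


rng = random.Random(0)
n, d, k_dim = 8, 4, 2  # sequence length, feature size, projected length

q = random_matrix(n, d, rng)
k = random_matrix(n, d, rng)
v = random_matrix(n, d, rng)
e_proj = random_matrix(k_dim, n, rng)
f_proj = random_matrix(k_dim, n, rng)

output, weights = linformer_attention(q, k, v, e_proj, f_proj)

full_score_entries = n * n           # exact attention compares every pair
linformer_score_entries = n * k_dim  # grows linearly with n for fixed k_dim
print(full_score_entries, linformer_score_entries)
```

With the projected length held fixed, doubling the length of the input doubles the number of scores to compute rather than quadrupling it, which is where the savings come from.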

Facebook’s researchers are also tackling the more complex task of understanding how text, images, and text within images interact. Identifying manipulated screenshots, memes, and other visual content is proving difficult for computers, yet these formats are a significant source of misinformation. A single altered word can drastically change the meaning of an image while leaving the visual elements largely unchanged.

This illustration demonstrates two variations of the same piece of misinformation, differing slightly in their visual presentation. The system successfully identified the second instance after recognizing the first. Image Credits: Facebook

Schroepfer stated that Facebook is making progress in detecting these variations. While the task remains challenging, the company has had notable success in identifying COVID-19 misinformation, such as fabricated news reports claiming masks are harmful, even when these images are altered and redistributed by users.

The deployment and maintenance of these models are a continuous process, involving iterative prototyping, implementation, online testing, and the integration of feedback into new prototypes. The Reinforcement Integrity Optimizer represents a new strategy: it continuously monitors the effectiveness of new models on live content and feeds that data back into the training system, rather than relying on periodic reports.

Assessing Facebook’s success in these endeavors is not straightforward. The statistics it releases suggest a positive trend: more hate speech and misinformation are being removed, and greater volumes of harmful content were addressed than in the previous quarter.

I inquired with Schroepfer about how Facebook can more accurately measure and communicate its progress, given that increases in removal rates could be attributed to either improved detection mechanisms or simply a larger volume of content being addressed at a consistent rate.

“The initial conditions are constantly evolving, so it’s essential to consider all metrics collectively. Ultimately, our primary focus is on prevalence,” he explained, referring to the actual frequency with which users encounter specific types of content, rather than whether it was proactively removed. “Removing a thousand pieces of content that no one would have seen is inconsequential. Removing the single piece of content that was about to go viral is a significant achievement.”
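The distinction Schroepfer draws between removal counts and prevalence can be sketched with a toy calculation. Everything here is hypothetical: the `view_prevalence` helper, the post records, and the view counts are invented for illustration and are not Facebook's data or methodology. The sketch also simplifies by treating a removal as if it prevented all of a post's views.

```python
def view_prevalence(posts):
    """Share of all content views that landed on violating posts."""
    total_views = sum(p["views"] for p in posts)
    if total_views == 0:
        return 0.0
    bad_views = sum(p["views"] for p in posts if p["violating"])
    return bad_views / total_views


# Hypothetical feed: 999 benign posts with 1,000 views each,
# 1,000 violating posts nobody sees, and one violating post with 1,000 views.
benign = [{"views": 1000, "violating": False} for _ in range(999)]
unseen_bad = [{"views": 0, "violating": True} for _ in range(1000)]
viral_bad = [{"views": 1000, "violating": True}]

baseline = view_prevalence(benign + unseen_bad + viral_bad)

# Removing the 1,000 unseen posts inflates the removal count
# but leaves what users actually encounter unchanged.
after_bulk_removal = view_prevalence(benign + viral_bad)

# Removing the single widely seen post is one removal,
# yet it eliminates every violating view in this toy feed.
after_viral_removal = view_prevalence(benign + unseen_bad)

print(baseline, after_bulk_removal, after_viral_removal)
```

In this toy feed, a thousand removals leave prevalence unchanged while a single removal drives it to zero, which is why a rising removal count on its own says little about what users see.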

Facebook now incorporates hate speech prevalence into its quarterly “community standards enforcement report,” defining it as follows:

And for its initial measurement of this metric:

This figure suggests that approximately one in every thousand pieces of content on Facebook currently qualifies as hate speech. This appears to be a relatively high proportion. (I have requested further clarification from Facebook regarding this statistic.)

The completeness of these estimates is also open to question. Reports from conflict zones, such as Ethiopia, indicate a significant presence of undetected, unreported, and unaddressed hate speech. Furthermore, the proliferation of white supremacist and nationalist militia content and groups on Facebook has been widely documented.

Schroepfer clarified that his role is primarily focused on “implementation” and that decisions regarding policy, staffing, and other aspects of the social network’s operations fall outside his purview. This response is somewhat disappointing, considering his position as CTO of a globally influential company and his apparent commitment to these issues. However, it is also reasonable to suggest that without the diligent pursuit of technical solutions like those described above, Facebook might have been overwhelmed by harmful and misleading content.