
OpenAI Misses Opt-Out Tool Deadline

January 1, 2025

OpenAI's Delayed Media Manager Tool

In May, OpenAI announced a tool that would let content creators control whether their work is included in its artificial intelligence training datasets. Seven months later, that promised functionality remains unavailable.

The tool, known as Media Manager, was described by OpenAI as capable of “identifying copyrighted text, images, audio, and video.” Its purpose was to honor creators’ preferences “across multiple sources.”

Addressing Concerns and Potential Legal Issues

The introduction of Media Manager was largely seen as a response to significant criticism leveled against the company. It was also intended to potentially mitigate legal risks related to intellectual property.

Sources close to the matter have indicated to TechCrunch that the tool’s development wasn't considered a high priority within OpenAI. One former employee stated, “I don’t believe it was a key focus.” They further added, “Frankly, I have no recollection of anyone actively working on it.”

Lack of Recent Progress

An individual collaborating with OpenAI, but not directly employed by them, shared with TechCrunch in December that previous discussions regarding the tool had occurred. However, they noted a complete absence of recent updates.

(Both of these sources requested anonymity due to the confidential nature of the business information discussed.)

Key Personnel Changes

Fred von Lohmann, a member of OpenAI’s legal team involved in the Media Manager project, transitioned to a part-time consulting position in October. OpenAI’s public relations team confirmed this change to TechCrunch via email.

Missed Deadlines

OpenAI has not provided any recent updates regarding the progress of Media Manager. The company also failed to meet its own stated deadline for the tool’s implementation, which was set for “by 2025.”

It’s important to note that the phrasing “by 2025” could be interpreted as including the entirety of the year 2025. However, TechCrunch understood OpenAI’s communication to indicate completion before January 1, 2025.

Intellectual Property Concerns in AI

Artificial intelligence models, such as those developed by OpenAI, work by identifying patterns within extensive datasets in order to make predictions. A simple illustration: from many examples, a model can learn that a burger someone has bitten into will bear a bite mark.

This process enables these models to gain a limited understanding of the world through observation. Consequently, tools like ChatGPT are capable of generating remarkably persuasive emails and essays, while Sora, OpenAI’s video generation platform, can produce footage that appears convincingly realistic.

The capacity to utilize examples from writing, cinema, and other sources to create novel content bestows significant power upon AI. However, this capability is inherently imitative.

Under specific prompting conditions, these models – largely trained on a vast collection of web pages, videos, and images – can replicate the source data with striking accuracy. Despite this data being “publicly accessible,” its intended use does not encompass such reproduction.

For instance, Sora can generate video segments that prominently feature the TikTok logo and recognizable characters from popular video games. Furthermore, The New York Times has demonstrated ChatGPT’s ability to quote its articles verbatim, an occurrence OpenAI attributed to a “security breach.”

This has, understandably, caused considerable distress among creators whose work has been incorporated into AI training datasets without their consent. Many affected parties are now seeking legal counsel.

OpenAI is currently defending itself against multiple class action lawsuits initiated by a diverse group including artists, writers, YouTubers, computer scientists, and news organizations. These plaintiffs allege that their copyrighted materials were used illegally during the training process.

Notable individuals involved in these legal challenges include authors Sarah Silverman and Ta-Nehisi Coates, alongside visual artists and prominent media companies such as The New York Times and Radio-Canada.

While OpenAI has engaged in licensing agreements with certain partners, the proposed terms have not been universally accepted by creators.

Controlling Media Usage with OpenAI

OpenAI provides various methods for creators to exclude their content from being used in AI model training. A submission form was introduced last September, enabling artists to request the removal of their work from future training datasets. Furthermore, website owners have consistently been able to prevent OpenAI’s web crawlers from accessing and indexing their sites.
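The crawler-blocking option works through the standard robots.txt convention: a site owner disallows OpenAI's documented crawler user agent (GPTBot) in the file served at the site root. As a minimal sketch, the snippet below uses Python's standard-library robots.txt parser to show how such a rule behaves; the rule text and the example URL are illustrative.

```python
from urllib import robotparser

# A minimal robots.txt that blocks OpenAI's documented crawler, GPTBot,
# from the entire site while leaving other crawlers unaffected.
RULES = """\
User-agent: GPTBot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(RULES.splitlines())

# GPTBot is disallowed everywhere; a crawler not named in the file
# falls through to the default (allowed).
print(rp.can_fetch("GPTBot", "https://example.com/article"))    # False
print(rp.can_fetch("OtherBot", "https://example.com/article"))  # True
```

In practice the rule file simply lives at `https://example.com/robots.txt`; compliance is voluntary, which is part of why critics consider this mechanism insufficient on its own.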

However, these approaches have faced criticism for being fragmented and insufficient. Currently, there are no dedicated opt-out options available for text-based content, video files, or audio recordings. The existing image opt-out process necessitates submitting each image individually, alongside a descriptive explanation, which is a burdensome requirement.

Media Manager was initially presented as a comprehensive overhaul and broadening of OpenAI’s opt-out capabilities.

In a May announcement, OpenAI stated that Media Manager would leverage “advanced machine learning research” to empower creators and copyright holders to identify their owned content to OpenAI. The company, asserting collaboration with regulatory bodies during the tool’s development, expressed its aspiration for Media Manager to establish an industry benchmark.

Since the initial announcement, OpenAI has not made any further public statements regarding Media Manager.

A company spokesperson informed TechCrunch in August that the tool remained “under development,” but did not provide a response to a subsequent inquiry in mid-December.

Currently, OpenAI has not communicated any timeline for the potential launch of Media Manager, nor has it detailed the specific features or functionalities it may include.

Fair Use and AI Training Data

Even if OpenAI’s proposed Media Manager is eventually launched, many experts remain skeptical about its ability to fully address creators’ anxieties or resolve the complex legal issues surrounding artificial intelligence and intellectual property rights.

Adrian Cyhan, an intellectual property lawyer from Stubbs Alderton & Markiles, highlighted the significant ambition of the Media Manager project. He questioned whether OpenAI could surpass the content identification capabilities of established platforms like YouTube and TikTok, which already face challenges in managing content at a large scale.

“Meeting the legal requirements for creator protections and potential compensation, which are currently under discussion, presents considerable hurdles,” Cyhan explained to TechCrunch. “This is particularly true considering the rapidly changing and often inconsistent legal frameworks across different countries and regions.”

Ed Newton-Rex, who founded the nonprofit Fairly Trained to certify AI companies that respect creators’ rights, argues that Media Manager could unfairly place the responsibility of controlling AI training on creators themselves. He suggests that choosing not to utilize the tool might be interpreted as implicit consent for their work to be used.

“The majority of creators will likely remain unaware of its existence, let alone actively use it,” Newton-Rex stated to TechCrunch. “However, it could still be leveraged to justify the widespread use of creative work against the wishes of its originators.”

Mike Borella, co-chair of MBHB’s AI practice group, noted that opt-out systems often fail to account for alterations made to original works, such as images that have been reduced in resolution. Joshua Weigensberg, an IP and media lawyer at Pryor Cashman, added that these systems also don’t address the frequent situation of third-party platforms hosting copies of creators’ content.

“Copyright owners often lack control over the online distribution of their work, and may not even be aware of where it appears on the internet,” Weigensberg explained. “Even if a creator informs every AI platform of their opt-out preference, those companies could still train on copies found on external websites and services.”

From a legal perspective, Media Manager may not offer substantial benefits to OpenAI. Evan Everist, a copyright law partner at Dorsey & Whitney, believes that while the tool could demonstrate to a court that OpenAI is attempting to limit training on copyrighted material, it likely wouldn’t protect the company from liability if infringement is proven.

“Copyright holders are not obligated to proactively notify others against infringing on their work before such infringement takes place,” Everist clarified. “The fundamental principles of copyright law remain in effect – namely, avoiding the unauthorized use of another’s creations. This feature may be more focused on public relations and positioning OpenAI as a responsible content user.”

The Current Situation

With the Media Manager tool currently unavailable, OpenAI has put in place filtering systems – though these are not without flaws – to stop its models from directly repeating content from their training data.

Furthermore, the company is actively defending itself in ongoing legal battles, consistently arguing for the application of fair use principles.

OpenAI maintains that its models generate new, transformative content, rather than engaging in plagiarism.

Potential Legal Outcomes

It is quite possible that OpenAI will succeed in its copyright-related legal challenges.

The judicial system might determine that OpenAI’s AI possesses a “transformative purpose,” mirroring the outcome of a case from approximately ten years ago involving the publishing industry and Google.

In that prior instance, a court ruled that Google’s digitization of millions of books for the Google Books project, essentially a large digital archive, was legally permissible.

The Necessity of Copyrighted Material

OpenAI has publicly stated that training competitive AI models would be unfeasible without utilizing copyrighted materials, regardless of whether authorization is obtained.

The company argued in a January submission to the U.K.’s House of Lords that restricting training data to works in the public domain – those created over a century ago – would be an insufficient basis for developing modern AI systems.

Such a limitation would only produce an “interesting experiment,” but would not meet the requirements of contemporary users.

Implications for Media Manager

If the courts ultimately side with OpenAI, the Media Manager tool would lose much of its legal justification.

OpenAI appears prepared to accept this possibility – or to re-evaluate its current approach to allowing content creators to opt-out of having their work used for training purposes.

Tags: openai, opt-out, data privacy, ai, artificial intelligence, 2025