AI-Generated Research: Sakana's Experiment and the Nuances of Peer Review
Japanese AI firm Sakana has announced that an AI system it developed produced a scientific paper that passed peer review. A closer look, however, reveals significant caveats to that claim.
Debate over AI's role in the scientific process is intensifying. Some researchers believe AI is not yet ready to work as a genuine scientific collaborator, while others see real potential but acknowledge the field remains in its early days.
Sakana's Approach and the AI Scientist-v2
Sakana falls into the latter camp. The company used its AI system, The AI Scientist-v2, to generate a paper, which it then submitted to a workshop at ICLR, a long-running and well-regarded AI conference.
Sakana reports that the workshop organizers, together with ICLR's leadership, agreed to an experiment in which AI-generated manuscripts would undergo double-blind review, with the goal of assessing the viability of AI-authored research.
Working with researchers from the University of British Columbia and the University of Oxford, Sakana submitted three papers for peer review, each generated end-to-end by The AI Scientist-v2: scientific hypotheses, experimental designs, code, data analyses, visualizations, and the final text and titles.
“We formulated research ideas by providing the workshop abstract and description to the AI,” explained Robert Lange, a research scientist and founding member of Sakana, in an email to TechCrunch. “This ensured the generated papers were relevant and appropriate for submission.”
Acceptance and Subsequent Withdrawal
One of the three submissions was accepted to the ICLR workshop: a paper offering a critical evaluation of current techniques for training AI models.
Sakana withdrew the paper before publication, citing transparency and respect for ICLR's conventions.
“The accepted paper introduces a novel method for training neural networks and identifies ongoing empirical challenges,” Lange stated. “It serves as a valuable starting point for further scientific inquiry.”
Caveats and Limitations of the Achievement
The achievement comes with caveats, however. Sakana acknowledges that the AI occasionally made citation errors, for example attributing work to a 2016 publication rather than the original 1997 source.
The paper also did not receive the level of scrutiny typical of peer-reviewed publications. Because Sakana withdrew it before formal publication, it skipped the "meta-review" stage, during which the workshop organizers could, in principle, have rejected it.
It’s also important to note that workshop acceptance rates are generally higher than those for the main conference track, a point Sakana openly admits. The company confirmed that none of its AI-generated studies met the criteria for publication in the main ICLR conference.
Expert Perspectives on Sakana's Results
Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, described Sakana’s findings as “somewhat misleading.” He emphasized that the company selected papers from a larger pool generated by the AI, indicating human judgment played a role in choosing potentially successful submissions.
“This demonstrates the effectiveness of combining human expertise with AI, rather than AI independently driving scientific advancement,” Guzdial explained.
Mike Cook, a research fellow at King’s College London specializing in AI, questioned the thoroughness of the peer review process at the workshop. He noted that newer workshops often rely on more junior researchers for reviews.
“These workshops often focus on negative results and challenges, which can make it easier for an AI to convincingly articulate a failure,” Cook added. He also pointed out that AI’s proficiency in generating human-like text is not a new phenomenon, and the ethical implications for science are already being discussed.
AI's Shortcomings and Potential for Noise
AI’s inherent technical limitations, such as its propensity to hallucinate information, raise concerns among scientists regarding its suitability for rigorous research. Experts also fear that AI could introduce unnecessary noise into the scientific literature, hindering genuine progress.
“We must consider whether Sakana’s result reflects AI’s ability to design and conduct experiments, or its skill in persuading humans – a task AI already excels at,” Cook stated. “There’s a distinction between passing peer review and genuinely contributing to a field’s knowledge base.”
Sakana's Perspective and Future Considerations
Sakana clarifies that it does not claim its AI can produce groundbreaking or highly innovative scientific work. The primary objective of the experiment was to “evaluate the quality of AI-generated research” and to emphasize the need for establishing clear guidelines for AI-generated science.
The company argues that AI-generated science should be judged on its own merits to avoid bias against it. Sakana says it will continue engaging with the research community on the state of the technology, to ensure it does not develop to a point where its sole purpose is passing peer review, which would hollow out the scientific review process.