
Datacurve Raises $15M to Compete with Scale AI

October 9, 2025

The Intensifying Competition for AI Training Data

As AI companies mature, access to high-quality data has become one of the most fiercely contested areas of the industry. That contest has given rise to companies such as Mercor, Surge, and, most notably, Scale AI, founded by Alexandr Wang.

Now that Wang has moved on to lead AI initiatives at Meta, many investors see an opening. They are willing to back companies with new strategies for gathering training data.

Datacurve Secures Series A Funding

Datacurve, a graduate of Y Combinator, is one such enterprise. It concentrates on delivering high-caliber data specifically for software development purposes.

The company recently announced that it has closed a $15 million Series A funding round. The round was led by Mark Goldberg of Chemistry, with additional participation from employees of DeepMind, Vercel, Anthropic, and OpenAI.

This Series A investment follows an initial $2.7 million seed round, which had previously attracted funding from Balaji Srinivasan, the former CTO of Coinbase.

A "Bounty Hunter" Approach to Data Acquisition

Datacurve employs a unique “bounty hunter” system. This system is designed to attract accomplished software engineers to contribute to the creation of datasets that are particularly challenging to obtain.

The company financially rewards these contributions, having already distributed over $1 million in bounties to date.

Prioritizing User Experience

However, Serena Ge, who co-founded the company with Charley Lee, emphasizes that financial incentives aren’t the primary driver. Compensation for data-related tasks will invariably be lower than for traditional software development roles.

Therefore, the company’s key differentiator lies in providing a positive and engaging user experience.

“Our approach centers on building a consumer product, rather than simply operating a data labeling service,” Ge explained. “We dedicate significant effort to optimizing the platform to attract and retain the skilled individuals we seek.”

The Growing Complexity of Post-Training Data

This focus is increasingly crucial as the demands for post-training data become more intricate. Earlier AI models were trained using relatively simple datasets.

Contemporary AI products, however, depend on complex reinforcement learning (RL) environments. These environments necessitate careful and strategic data collection.

As these environments become more sophisticated, the requirements for both the volume and quality of data intensify. This trend could provide a competitive advantage to companies specializing in high-quality data collection, such as Datacurve.

Potential for Expansion Beyond Software Engineering

Currently in its early stages, Datacurve is concentrating on the software engineering sector. However, Ge believes the underlying model is readily applicable to other fields.

These include finance, marketing, and even the medical profession.

“Our current focus is on establishing an infrastructure for post-training data collection that effectively attracts and retains highly skilled professionals within their respective domains,” Ge stated.
