Snorkel AI Raises $35M Series B to Automate Data Labeling

April 7, 2021
Snorkel AI Secures $35 Million Series B Funding

Data labeling, which is essential for training models, remains one of the major challenges in machine learning. Snorkel AI aims to streamline it by letting subject matter experts apply labels programmatically. The company recently announced a $35 million Series B funding round.

Introducing Application Studio

Alongside the funding announcement, Snorkel AI unveiled Application Studio, a tool for building common machine learning applications from templates and pre-built components.

Investment Details

Lightspeed Venture Partners led the round, with participation from existing investors Greylock, GV, In-Q-Tel, and Nepenthe Capital. Walden and BlackRock joined as new investors. The startup has now raised $50 million in total.

The Data Labeling Bottleneck

According to Alex Ratner, co-founder and CEO of Snorkel AI, data labeling remains a major obstacle to the advancement of machine learning and artificial intelligence across industries: it is expensive, labor-intensive, and hard for experts to find enough time for.

The Cost of Manual Labeling

“A frequently overlooked aspect of current AI development is that, despite advancements in technology and tooling, approximately 80 to 90% of the time and expense associated with an average AI project is dedicated to the manual labeling, collection, and re-labeling of training data,” Ratner explained.

Simplifying the Labeling Process

Snorkel AI's approach lets subject matter experts apply labels programmatically rather than by hand. Depending on the complexity of the data, this can cut labeling time from months to hours or days.
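To make the idea of programmatic labeling concrete, here is a minimal sketch in plain Python. It does not use Snorkel's product or API. The function names, keywords, and sample documents are all hypothetical, and the simple majority vote is an assumption standing in for whatever aggregation a real system applies. Each "labeling function" encodes one expert rule and either votes for a label or abstains.

```python
from collections import Counter

# Label values (hypothetical scheme for a contract-vs-other classifier).
ABSTAIN = -1
OTHER = 0
CONTRACT = 1

# Each labeling function encodes one rule an expert might write down.
def lf_mentions_agreement(text):
    return CONTRACT if "agreement" in text.lower() else ABSTAIN

def lf_has_parties_clause(text):
    return CONTRACT if "hereinafter" in text.lower() else ABSTAIN

def lf_looks_like_invoice(text):
    return OTHER if "invoice" in text.lower() else ABSTAIN

LABELING_FUNCTIONS = [
    lf_mentions_agreement,
    lf_has_parties_clause,
    lf_looks_like_invoice,
]

def majority_vote(text):
    """Combine the non-abstaining votes; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # rules disagree evenly, so leave the item unlabeled
    return counts[0][0]

documents = [
    "This Agreement is made between Acme Corp, hereinafter the Supplier, and Beta LLC.",
    "Invoice 1001: payment due in 30 days.",
    "Lunch menu for Friday.",
    "Invoice issued under the service agreement.",
]

labels = [majority_vote(doc) for doc in documents]
print(labels)  # [1, 0, -1, -1]
```

The appeal of this style is that a handful of rules can label an arbitrarily large collection in one pass, and changing a rule relabels everything at once. The last two documents show the limits: one matches no rule and the other triggers conflicting rules, so both stay unlabeled.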

Expanding Beyond Labeling: Application Studio

As the company’s methodology matured, customers asked for help with the next stage of machine learning: building applications on top of the labeled data and trained models. That demand led to Application Studio, which can be used for applications such as contract classification in banking or anomaly detection in telecommunications, helping companies move beyond data labeling.

A No-Code Interface for Machine Learning

“Our focus extends beyond programmatic data labeling to encompass models, preprocessors, and post processors. We’ve now made these accessible through a templated, visual, and no-code interface,” Ratner stated.

Origins and Growth

The company’s core products grew out of research that began at the Stanford AI Lab in 2015. The founders spent four years on that research before officially launching Snorkel in 2019. The startup currently has 40 employees.

Commitment to Diversity and Inclusion

Ratner acknowledges the technology industry’s historical challenges regarding diversity and inclusion. He emphasizes a deliberate effort to cultivate a diverse and inclusive company culture.

Prioritizing DEI

“From the outset, we prioritized diversity at the company, team, and board levels, and backed this commitment with action. We’ve collaborated with external firms for internal training, audits, and DEI strategy development. Pipeline diversity is a mandatory requirement in all our agreements with recruiting firms,” he said.

Addressing Bias in Machine Learning

Ratner also recognizes that automation can introduce bias into machine learning models. He believes that simplifying the labeling process makes such bias easier to detect when it occurs.

Auditing and Mitigating Bias

“Even when starting with a dozen or two dozen labeling functions within Snorkel, vigilance and proactive bias detection remain crucial. However, auditing becomes simpler by reviewing a few hundred lines of code to understand what influenced the model’s learning,” he explained.
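As a rough illustration of what auditing a small set of labeling functions might involve, the sketch below computes how often each function fires (coverage) and how often it disagrees with another function on the same item (conflicts). This is plain Python, not Snorkel's tooling. The functions, keywords, and sample texts are hypothetical, and these two statistics are only a starting point for review, not a bias-detection method in themselves.

```python
ABSTAIN = -1

# Hypothetical labeling functions: 1 = "priority", 0 = "routine".
def lf_keyword_urgent(text):
    return 1 if "urgent" in text.lower() else ABSTAIN

def lf_keyword_thanks(text):
    return 0 if "thanks" in text.lower() else ABSTAIN

def lf_all_caps(text):
    return 1 if text.isupper() else ABSTAIN

LFS = [lf_keyword_urgent, lf_keyword_thanks, lf_all_caps]

def audit(lfs, texts):
    """Return per-function coverage and conflict counts over the texts."""
    # One row per text, one column per labeling function.
    matrix = [[lf(t) for lf in lfs] for t in texts]
    report = {}
    for j, lf in enumerate(lfs):
        fired = [row for row in matrix if row[j] != ABSTAIN]
        # A conflict is a row where some other function voted differently.
        conflicts = [
            row for row in fired
            if any(v not in (ABSTAIN, row[j]) for v in row)
        ]
        report[lf.__name__] = {
            "coverage": len(fired) / len(texts),
            "conflicts": len(conflicts),
        }
    return report

texts = [
    "URGENT REPLY NEEDED",
    "Thanks, this is urgent",
    "Thanks for the update",
    "See you tomorrow",
]

report = audit(LFS, texts)
for name, stats in report.items():
    print(name, stats)
```

A rule with unexpectedly high coverage or many conflicts is a natural place to begin reading the code, which is the kind of review a short list of labeling functions makes practical.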
