Hugging Face Launches Training Data for Self-Driving Machines | LeRobot

March 11, 2025

Hugging Face and Yaak Enhance LeRobot with New Autonomous Driving Dataset

Last year, AI development platform Hugging Face introduced LeRobot, a collection of open AI models, datasets, and tools for building real-world robotics systems.

Hugging Face has now teamed up with AI startup Yaak to expand LeRobot. The expansion centers on a new training dataset for robots and vehicles that navigate complex environments autonomously.

Introducing the Learning to Drive (L2D) Dataset

The new dataset, called Learning to Drive (L2D), is more than one petabyte in size. It contains data from sensors installed in cars used by German driving schools.

L2D captures data from cameras, GPS, and “vehicle dynamics” measurements. The recordings cover both driving instructors and students as they navigate real-world scenarios.
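
To make the three data sources concrete, here is a minimal sketch of how one recorded drive could be represented in code. It is not the actual L2D or LeRobot schema: the class names (`Frame`, `Episode`), every field name, and the sample values are assumptions invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Frame:
    """One timestep of a drive. All field names are hypothetical, not L2D's schema."""
    timestamp_s: float                  # seconds since the start of the drive
    camera_files: Dict[str, str]        # camera name -> image file for this timestep
    gps_lat_lon: Tuple[float, float]    # GPS position in degrees
    speed_mps: float                    # "vehicle dynamics": speed in metres per second
    steering_angle_deg: float           # "vehicle dynamics": steering wheel angle


@dataclass
class Episode:
    """A sequence of frames driven by an instructor or a student."""
    driver_role: str                    # "instructor" or "student"
    frames: List[Frame]

    def duration_s(self) -> float:
        """Time between the first and last frame; 0.0 if fewer than two frames."""
        if len(self.frames) < 2:
            return 0.0
        return self.frames[-1].timestamp_s - self.frames[0].timestamp_s


# Made-up sample values, for illustration only.
episode = Episode(
    driver_role="student",
    frames=[
        Frame(0.0, {"front": "front_000.jpg"}, (52.5200, 13.4050), 8.3, -1.5),
        Frame(0.1, {"front": "front_001.jpg"}, (52.5201, 13.4051), 8.4, -1.2),
    ],
)
print(f"{len(episode.frames)} frames, {episode.duration_s():.1f} s")
```

Grouping frames into episodes mirrors the “episodes” wording the dataset’s creators use; the real on-disk format may differ substantially.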

Addressing Limitations of Existing Datasets

While several open self-driving training sets are already available, including ones from Alphabet’s Waymo and Comma AI, many focus on intermediate perception and planning tasks.

These tasks, such as object detection and tracking, require high-quality annotations. According to L2D’s creators, that annotation requirement makes such datasets difficult to scale.

In contrast, L2D is designed to support the development of “end-to-end” learning approaches.

The Benefits of End-to-End Learning

The creators say end-to-end learning helps predict actions (for example, how a pedestrian is about to move) directly from sensor inputs such as camera footage, without hand-labeled intermediate steps.
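
The idea of going straight from sensor input to action can be sketched with a toy example. The code below is not how L2D or LeRobot models work; it assumes a four-pixel "image", a linear policy, and an invented steering rule (`toy_human_steering`) that stands in for a recorded human driver. It shows only the principle: the model learns from (input, human action) pairs, with no object labels in between.

```python
import random


def predict_steering(weights, bias, pixels):
    """Map raw pixel intensities directly to a steering command (linear policy)."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def train_behavior_cloning(samples, n_features, lr=0.1, epochs=200):
    """Fit the policy to (pixels, human_steering) pairs with SGD on squared error."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for pixels, human_steering in samples:
            error = predict_steering(weights, bias, pixels) - human_steering
            for i in range(n_features):
                weights[i] -= lr * error * pixels[i]
            bias -= lr * error
    return weights, bias


def toy_human_steering(pixels):
    """Invented rule standing in for a recorded driver:
    steer toward the brighter side of a 4-pixel 'image'."""
    return 0.5 * (pixels[3] - pixels[0])


# Build a small synthetic training set of (sensor input, human action) pairs.
rng = random.Random(0)
samples = []
for _ in range(50):
    pixels = [rng.random() for _ in range(4)]
    samples.append((pixels, toy_human_steering(pixels)))

weights, bias = train_behavior_cloning(samples, n_features=4)

# Compare the learned policy with the toy driver on an unseen input.
test_pixels = [0.9, 0.2, 0.4, 0.1]
print(predict_steering(weights, bias, test_pixels), toy_human_steering(test_pixels))
```

Real end-to-end driving models replace the linear policy with deep networks and the four pixels with multi-camera video, but the training signal is the same kind of pairing: what the sensors saw and what the driver did.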

As Harsimrat Sandhawalia, co-founder of Yaak, and Remi Cadene of Hugging Face’s AI for robotics team explained in a blog post, “The AI community can now build end-to-end self-driving models.”

They further stated that “L2D aims to be the largest open-source self-driving data set that empowers the AI community with unique and diverse ‘episodes’ for training end-to-end spatial intelligence.”

Future Testing and Community Involvement

Hugging Face and Yaak plan to run real-world, “closed-loop” tests of models trained with L2D and LeRobot this summer. The models will be deployed in a vehicle with a safety driver on board.

The companies are calling on the AI community to contribute. They are asking for submissions of models, along with tasks on which those models should be evaluated, such as navigating roundabouts and parking.

L2D data sources at a glance:

  • Camera footage from the driving-school vehicles.
  • GPS data.
  • “Vehicle dynamics” measurements.