Sesame Releases Base Model for Realistic Voice Assistant Maya
The AI firm Sesame has made available the foundational model that drives Maya, its remarkably lifelike voice assistant.
CSM-1B: A Commercially Usable Model
This model, designated CSM-1B, has 1 billion parameters, the learned numerical weights that make up the model. It is released under the permissive Apache 2.0 license, which allows commercial use with few restrictions.
According to Sesame’s documentation on the AI development platform Hugging Face, CSM-1B generates “RVQ audio codes” from both textual and audio inputs.
Understanding RVQ Technology
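RVQ stands for “residual vector quantization,” a method for converting audio into discrete tokens known as codes. It works in stages: each stage picks the closest entry from its own codebook and passes the leftover error (the residual) to the next stage, so every additional code refines the approximation. The technique is increasingly common in modern AI audio systems.
Examples of its use include Google’s SoundStream and Meta’s EnCodec.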
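To make the idea concrete, here is a minimal toy sketch of residual vector quantization in Python with NumPy. It is not Sesame's implementation: the function names, the two hand-picked codebooks, and the 2-dimensional input are all assumptions chosen for illustration. Real codecs learn their codebooks during training and apply them to encoder outputs for each audio frame.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Return one code (an index) per stage for the vector x."""
    residual = np.asarray(x, dtype=np.float64)
    codes = []
    for cb in codebooks:  # each cb has shape (codebook_size, dim)
        # Pick the entry closest to whatever is still unexplained.
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        codes.append(idx)
        # Pass the leftover error on to the next stage.
        residual = residual - cb[idx]
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruct the vector by summing the chosen entry of each stage."""
    return np.sum([cb[i] for cb, i in zip(codebooks, codes)], axis=0)

# Hypothetical codebooks: a coarse stage followed by a finer stage.
codebooks = [
    np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
    np.array([[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [0.25, 0.25]]),
]

x = np.array([1.2, 0.3])
codes = rvq_encode(x, codebooks)     # [1, 3]
approx = rvq_decode(codes, codebooks)  # [1.25, 0.25]
print(codes, approx)
```

In this toy case the first code alone reconstructs the input as (1, 0), and the second code tightens that to (1.25, 0.25). A model that generates "RVQ audio codes" outputs sequences of such indices, which a codec decoder then turns back into a waveform.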
Model Architecture
CSM-1B uses a model from Meta’s Llama family as its backbone, paired with an audio “decoder” component. Maya, Sesame’s voice assistant, is powered by a fine-tuned version of CSM-1B.
Sesame clarifies that the open-sourced model is a base generation model. While capable of producing diverse voices, it hasn’t been specifically fine-tuned for any particular voice.
Language Capabilities and Limitations
The model has some limited capacity for languages other than English. Sesame attributes this to non-English data that was incidentally present in the training set, and it expects performance in those languages to be poor.
The specific dataset used to train CSM-1B remains undisclosed by the company.
Safeguards and Ethical Considerations
Currently, the model lacks robust safeguards. Sesame relies on an honor system, asking developers and users not to mimic a person’s voice without consent, create deceptive content, or engage in harmful activities.
In a test of the Hugging Face demo, cloning a voice took less than a minute. After that, it was possible to generate speech in the cloned voice on a range of subjects, including sensitive ones.
Concerns Regarding Voice Cloning
Consumer Reports has recently cautioned that many readily available AI-powered voice cloning tools lack adequate safeguards against fraud and misuse.
Sesame’s Breakthrough Technology
Co-founded by Brendan Iribe, a co-founder of Oculus, Sesame gained significant attention in late February for its assistant technology. Maya and Miles, Sesame’s other assistant, exhibit realistic characteristics such as natural breathing patterns and speech disfluencies.
Furthermore, these assistants can be interrupted mid-sentence, mirroring the behavior of OpenAI’s Voice Mode.
Funding and Future Developments
Sesame has secured an undisclosed amount of funding from Andreessen Horowitz, Spark Capital, and Matrix Partners. The company is also developing AI glasses designed to be worn all day, which are intended to be equipped with its custom models.