Google Cloud AI: New Music Generation Model

Google Enhances Generative AI Models on Vertex AI

Google rolled out updates to several of its first-party media-generation AI models on Wednesday through its Vertex AI cloud platform.

New Capabilities Across Models

Lyria, Google’s text-to-music model, is now accessible in preview to a limited number of customers. Furthermore, the Veo 2 video-creation model has received enhancements, including expanded editing and visual effects customization options.

A voice-cloning feature powered by Chirp 3, Google’s audio understanding model, has also launched for a select group of “allow-listed” users. In addition, Google says the Imagen 3 image generator now delivers “significantly” improved performance.

Competition in the Enterprise AI Market

These updates, coinciding with Cloud Next, represent Google’s ongoing effort to establish a leading position in the enterprise generative AI market. The company’s primary competitor in this space is Amazon, which provides a similar cloud AI platform, Bedrock, featuring its own proprietary generative AI models.

Lyria: A Royalty-Free Music Alternative

Google is positioning Lyria as a viable alternative to traditional royalty-free music libraries. The model enables users to compose songs across diverse styles and genres, ranging from jazz piano pieces to lo-fi compositions.

Chirp 3: Voice Cloning and Transcription

Chirp 3 can synthesize speech in approximately 35 languages. Instant Custom Voice, a feature first previewed earlier this year, uses Chirp 3 to clone a voice from just 10 seconds of audio and is now generally available.

A new tool, Transcription with Diarization, is also launching in preview, leveraging this model to separate and identify individual speakers within multi-participant recordings.

Safeguards for Voice Cloning

To mitigate potential misuse, Google says access to Instant Custom Voice is gated behind a “diligence” process that confirms “appropriate voice usage permissions.”

Veo 2: Advanced Video Editing Features

The Veo 2 model can now remove backgrounds, logos, and unwanted objects from videos. It can also expand the frame of existing footage, for example to turn a landscape video into a portrait one.

Additionally, users can adjust camera angles and pacing within AI-generated scenes to create effects like time lapses and drone footage. The model can also interpolate between defined start and end frames.

These Veo features are currently available in preview.

Imagen 3: Enhanced Image Manipulation

The upgrades to Imagen 3 improve the model’s capabilities in removing objects and reconstructing missing or damaged areas within images.

Watermarking and Safety Measures

All media generated by Imagen, Veo, and Lyria, though not Chirp, is watermarked using Google’s SynthID technology. The company asserts that all of its generative AI models incorporate “built-in safeguards” to prevent the creation of harmful content.

Data Training Transparency

Google still does not disclose the specific data used to train its models. Training data is often a contentious issue because of intellectual property concerns.

Some companies train their models on copyrighted material without obtaining prior authorization from copyright holders. While these companies invoke the U.S. fair use doctrine, many creators dispute this claim and are pursuing legal action.

Copyright Protection for Users

Google has previously informed TechCrunch that it provides opt-out options for model training and an indemnity policy to protect Google Cloud and Vertex AI customers from potential AI-related copyright disputes.
