Stability AI’s new Stable Audio model allows users to generate various types of high-quality audio by simply inputting text descriptions.
Main Points
- Generative Capabilities: Stable Audio can produce a wide range of audio, from instrumental sounds to ambient noises, using text prompts.
- Technical Specifications: The model is based on a U-Net diffusion architecture, trained on over 19,500 hours of audio data.
- Usage and Accessibility: Users can access the service with a free tier, generating up to 20 audio clips per month, though commercial use is restricted.
Summary
Stability AI has introduced Stable Audio, a generative AI model capable of creating high-quality audio from text descriptions. Leveraging a U-Net-based diffusion model, Stable Audio can produce diverse audio outputs, including single instruments, full ensembles, and ambient sounds. Training on a dataset of over 19,500 hours of audio gives the model high fidelity and responsiveness to prompts.
Stable Audio utilizes a text-to-audio embedding approach similar to that used in other generative models by Stability AI. Users provide a text prompt and specify the desired audio length, which the model then uses to generate the corresponding sound. While currently available in a limited, non-commercial capacity, the tool offers significant potential for future development in audio and music generation. Stability AI plans to release open-source versions and custom training capabilities, enhancing the tool’s accessibility and adaptability for various creative and professional uses.
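The workflow described above, a text prompt plus a desired clip length yielding a waveform, can be sketched as follows. This is an illustrative mock, not Stability AI's actual API: the `generate_audio` function, the sample rate, and the silent placeholder output are all assumptions standing in for the real conditioned diffusion model.

```python
import numpy as np

SAMPLE_RATE = 44_100  # assumed CD-quality rate; the real model's rate may differ


def generate_audio(prompt: str, seconds: float) -> np.ndarray:
    """Illustrative stand-in for a text-to-audio call.

    The caller supplies a text prompt and a target length and receives a
    waveform of that length. Here a silent placeholder replaces the actual
    model; in the real system, the prompt would be embedded and used to
    condition a U-Net diffusion process that denoises into audio.
    """
    _ = prompt  # the prompt would condition the diffusion process
    num_samples = int(seconds * SAMPLE_RATE)
    return np.zeros(num_samples, dtype=np.float32)


clip = generate_audio("ambient rainfall with distant thunder", seconds=10.0)
print(clip.shape)  # (441000,)
```

The point of the sketch is the interface shape: generation is parameterized only by the text description and the requested duration, with everything else handled inside the model.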
Source: Stability AI releases a sound generator
Keep up to date on the latest AI news and tools by subscribing to our weekly newsletter, or by following us on Twitter and Facebook.