
Stability AI announced today that it partnered with Arm Holdings on Stable Audio Open, a generative AI audio model for mobile.
“We’ve partnered with Arm to bring generative audio to mobile devices, enabling high-quality sound effects and audio sample generation directly on-device with no internet connection required,” Stability AI explains in the announcement post. “Leveraging Arm KleidiAI libraries and Stability AI’s cutting-edge technology, Stable Audio Open, can now run 30 times faster on smartphones with Arm CPUs, reducing generation time from minutes to seconds.”
Stable Audio Open is a local, on-device, text-to-audio AI model that runs entirely on the Arm CPU with no Internet connection required. The firm says that AI audio generation on Armv9-based CPUs was a significant challenge, with the initial results taking 240 seconds for an 11-second clip. But by distilling the model using Arm’s software stack, it was able to reduce that time to just 8 seconds, a 30x improvement.
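The figures quoted above can be checked with a little arithmetic. The following is only an illustrative sketch that uses the three numbers from the article (240 seconds, 8 seconds, and an 11-second clip); the function names are hypothetical and are not part of any Stability AI or Arm API. It also computes a real-time factor, meaning generation time divided by clip length, which is an added framing here rather than a metric the announcement reports.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Ratio of the original generation time to the optimized one."""
    return before_s / after_s


def real_time_factor(generation_s: float, clip_s: float) -> float:
    """Generation time divided by clip length; below 1.0 is faster than playback."""
    return generation_s / clip_s


# Figures quoted in the article.
baseline_s = 240.0   # initial generation time on an Armv9 CPU
optimized_s = 8.0    # generation time after optimization
clip_s = 11.0        # length of the generated clip

print(f"speedup: {speedup(baseline_s, optimized_s):.0f}x")
print(f"real-time factor before: {real_time_factor(baseline_s, clip_s):.2f}")
print(f"real-time factor after: {real_time_factor(optimized_s, clip_s):.2f}")
```

By this measure, the optimized model generates audio faster than the clip takes to play back, whereas the initial version took roughly 22 times the clip's length.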
Stability AI is showing off real-world examples of this model at MWC this week in Barcelona. The most obvious is converting text to clear and believable audio, but the model can also be used to generate sound effects, short music clips, and other audio content. The model is open and was trained on audio data from Freesound and the Free Music Archive. Developers, sound designers, musicians, and audio enthusiasts can interact with it now on Hugging Face.
Stability AI notes that it will continue working with Arm to bring new local AI capabilities to Arm-based devices across image, video, and 3D too.
You can learn more on the Stability AI website.