Nvidia AI model for modifying audio and voices

The Future of Audio: Nvidia’s New AI Model

On November 26, 2024, Nvidia unveiled Fugatto, a new artificial intelligence model for generating and modifying audio, aimed at the music, film, and gaming industries.

The model can produce sound effects and music from text descriptions, including novel sounds such as a trumpet that barks like a dog. Its ability to modify audio as well as generate it sets Nvidia’s offering apart from other AI audio technologies, which typically focus on generation alone.

Nvidia’s Bryan Catanzaro said, ‘If we think about synthetic audio over the past 50 years, music sounds different now because of computers, because of synthesizers.’ He believes generative AI could open up new possibilities not only in music but also in video games and beyond.

Nevertheless, Nvidia has made it clear that it has no immediate plans to release Fugatto publicly, citing concerns about possible misuse. ‘Any generative technology always carries some risks, because people might use that to generate things that we would prefer they don’t,’ Catanzaro said, a remark that reflects the ongoing debate in the tech community over the ethical implications of such technologies.

Nvidia’s model was trained on open-source data, and the company is still evaluating how to balance accessibility against the controls needed to prevent abuse. As AI audio systems continue to develop, the industry remains alert to the effects these tools may have across music, film, and gaming.