
If you’ve been involved in open-source development recently, you might recognize Stability AI, known for kicking off the image generation revolution with Stable Diffusion and for developing the Stable Diffusion XL model. Moving into AI audio generation, Stability AI introduced Stable Audio Open last year, which quickly made waves in the audio generation landscape. Their latest contribution, Stable Audio Small, along with the updated Stable Audio Tools library, is a significant advance in open-source audio generation.
This article delves into the features and capabilities of Stable Audio Small, a model that stands out for its innovative approach and performance. We will explore how to utilize this model on a DigitalOcean GPU Droplet and why it’s a tool worth getting excited about.
What is Stable Audio Small
Stable Audio Small introduces the Adversarial Relativistic-Contrastive (ARC) post-training method. This technique is groundbreaking as it represents the first adversarial acceleration algorithm tailored for diffusion and flow models that does not rely on expensive distillation methods. Typically, model distillation involves training a smaller model to emulate a larger one, a process that often proves to be costly and requires a solid small model architecture.
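The article does not spell out the ARC objective, so the following is only a rough illustration of what the "relativistic" part of the name usually means in the GAN literature: the discriminator scores a real sample and a generated sample as a pair, and each side is trained on the difference between the two scores. This is a generic relativistic pairing loss written in plain Python, not Stability AI's implementation; the function names and the assumption that ARC's adversarial term resembles it are illustrative only.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def relativistic_losses(real_scores, fake_scores):
    """Return (discriminator_loss, generator_loss) for paired scores.

    Assumption: each real score is paired with one generated score,
    as in a generic relativistic pairing GAN. This is not taken from
    the ARC paper or from stable_audio_tools.
    """
    if not real_scores or len(real_scores) != len(fake_scores):
        raise ValueError("need two non-empty score lists of equal length")
    n = len(real_scores)
    pairs = list(zip(real_scores, fake_scores))
    # The discriminator wants real scores above generated scores.
    d_loss = sum(softplus(-(r - f)) for r, f in pairs) / n
    # The generator wants generated scores above real scores.
    g_loss = sum(softplus(-(f - r)) for r, f in pairs) / n
    return d_loss, g_loss


d_loss, g_loss = relativistic_losses([2.0, 1.5], [-1.0, 0.5])
print(f"discriminator loss: {d_loss:.4f}, generator loss: {g_loss:.4f}")
```

Because only the score difference matters, the generator is pushed to beat the paired real sample rather than to reach a fixed "real" label, which is the property the name refers to.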
By applying ARC to enhance their Stable Audio Open model, Stability AI has successfully created a robust yet compact version, Stable Audio Small. This model generates 10-second audio samples on an NVIDIA H100 GPU in less than 7 milliseconds, with diverse output styles, particularly excelling in music production.
Using Stable Audio Small on DigitalOcean GPU Droplets
To get started, the first step is to prepare your environment following the instructions from a relevant tutorial. Launch your Jupyter Notebook for the project using the terminal command:
jupyter notebook --allow-root
This command opens Jupyter in your browser. From there, you can create a new notebook and install the necessary packages by running:
!pip install stable_audio_tools einops
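Before moving on, it can help to confirm that the packages are actually importable from the notebook kernel. The sketch below uses only the standard library; `missing_modules` is a hypothetical helper name, and the module list is an assumption based on the imports used in the generation code that follows.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Assumed list, taken from the imports in the generation cell.
required = ["torch", "torchaudio", "einops", "stable_audio_tools"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported missing, rerunning the install cell and restarting the kernel is usually the first thing to try.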
After a brief wait for the installation, you’re ready to generate audio. Input the following code in a new cell of your notebook:
import torch
import torchaudio
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond

# Determine device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-small")
sample_rate = model_config["sample_rate"]
sample_size = model_config["sample_size"]
model = model.to(device)

# Conditioning setup
conditioning = [{
    "prompt": "128 BPM tech house drum loop",
    "seconds_total": 11
}]

# Generate stereo audio
output = generate_diffusion_cond(
    model,
    steps=8,
    conditioning=conditioning,
    sample_size=sample_size,
    sampler_type="pingpong",
    device=device
)

# Rearrange audio batch to single sequence
output = rearrange(output, "b d n -> d (b n)")

# Normalize and save audio
output = output.to(torch.float32).div(torch.max(torch.abs(output))).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
torchaudio.save("output.wav", output, sample_rate)
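The last two steps of the cell are easy to misread, so here is a small pure-Python illustration of what they do to the data: the `"b d n -> d (b n)"` pattern joins every clip in the batch end to end, channel by channel, and the normalization scales the loudest sample to full scale before converting to 16-bit integers. The names `flatten_batch` and `to_int16` are hypothetical, nested lists stand in for tensors, and the silent-input guard is an addition that the original one-liner does not have.

```python
def flatten_batch(batch):
    """Pure-Python equivalent of rearrange(x, "b d n -> d (b n)").

    batch is a list of clips, each clip a list of channels,
    each channel a list of samples.
    """
    num_channels = len(batch[0])
    return [
        [sample for clip in batch for sample in clip[ch]]
        for ch in range(num_channels)
    ]


def to_int16(channels):
    """Peak-normalize to [-1, 1], then scale to the signed 16-bit range."""
    peak = max(abs(s) for ch in channels for s in ch)
    if peak == 0:
        # Added guard: a silent clip would otherwise divide by zero.
        return [[0 for _ in ch] for ch in channels]
    return [
        # int() truncates toward zero, like a float-to-int16 cast.
        [int(max(-1.0, min(1.0, s / peak)) * 32767) for s in ch]
        for ch in channels
    ]


# Two clips, two channels, two samples each.
batch = [
    [[0.5, -0.25], [0.1, 0.2]],
    [[1.0, 0.0], [-2.0, 0.4]],
]
flat = flatten_batch(batch)
pcm = to_int16(flat)
print(flat)
print(pcm)
```

Because the scaling is relative to the peak, every generated clip ends up at the same maximum loudness regardless of how quiet the raw model output was.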
You can experiment with the conditioning list by adjusting various prompts. Changes can lead to different outcomes, ranging from realistic sounds like "a cat meowing" to more intricate musical themes.
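One way to try several prompts in a row is to build each conditioning list with a small helper and loop over them. The sketch below only constructs the conditioning data in the same shape as the generation cell; `build_conditioning` is a hypothetical helper, and the default of 11 seconds simply mirrors the value used in that cell.

```python
def build_conditioning(prompt, seconds_total=11):
    """Build a one-item conditioning list in the shape used by the generation cell."""
    text = prompt.strip()
    if not text:
        raise ValueError("prompt must not be empty")
    if seconds_total <= 0:
        raise ValueError("seconds_total must be positive")
    return [{"prompt": text, "seconds_total": seconds_total}]


prompts = ["128 BPM tech house drum loop", "a cat meowing"]
for text in prompts:
    conditioning = build_conditioning(text)
    print(conditioning)
    # Pass `conditioning` to generate_diffusion_cond(...) as in the
    # generation cell, saving each result under a different file name.
```

Keeping the prompt list separate from the generation call makes it easier to compare outputs for the same settings.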
To listen to the generated audio, incorporate the following command in another cell:
import IPython
IPython.display.Audio("output.wav")
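If the player shows nothing or the clip sounds cut off, inspecting the file header is a quick sanity check. This sketch uses Python's standard `wave` module and assumes the file is 16-bit PCM WAV, which is what saving an int16 tensor normally produces; `wav_info` is a hypothetical helper name.

```python
import os
import wave


def wav_info(path):
    """Return channel count, sample rate, bit depth, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_seconds": frames / rate,
        }


# Only inspect the file if the generation cell has already written it.
if os.path.exists("output.wav"):
    print(wav_info("output.wav"))
```

A stereo clip should report two channels, and the duration should roughly match the `seconds_total` value used for conditioning.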
Closing Thoughts
In summary, Stable Audio Small is an impressive model given its lightweight design and versatility. Its development hints at exciting future projects within AI and beyond, particularly with the promising capabilities of the ARC post-training technique.