Our AI Song Contest 2022 Submission: Noise To Water
We used a range of tools and models: MIDI-based as well as audio-based, some that we configured ourselves and some ready-made.
For our GAN models, we both drew heavily on our musical “heroes” when curating the training data. As long-standing fans and producers of electronic and ambient music, it was clear to us that we wanted the AI to take inspiration from the same artists that inspire us. We therefore trained the GANs on datasets of artists such as Aphex Twin, Boards of Canada and Steve Reich. We also wanted to see what was possible if we combined the ideas and textures typical of these artists, so we trained GANs on combined datasets as well. This turned out to be especially successful in our case: many of the leading components in our song come from models trained on Steve Reich and Boards of Canada songs simultaneously.
More specifically, our models are based on adapted code from Chris Donahue’s WaveGAN (TensorFlow version), trained in Colab. The models were trained on thousands of kick drums from Chris Liebing or Truncate sample packs, on 7 hours of Aphex Twin, 9 hours of Steve Reich, 8 hours of Boards of Canada, or on combinations of all of these, for between 6,000 and 25,000 epochs.
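One preprocessing step WaveGAN requires is cutting each artist’s recordings into equal-length training examples. The sketch below shows one way to do this, assuming 16 kHz mono audio and WaveGAN’s default 16384-sample slice length; the function name and hop choice are ours, not part of Donahue’s code.

```python
import numpy as np

def slice_for_wavegan(audio, slice_len=16384, hop=16384):
    """Cut a mono waveform into fixed-length training examples.

    WaveGAN expects equal-length slices (16384 samples is roughly
    one second at 16 kHz); the trailing remainder that does not
    fill a whole slice is dropped.
    """
    starts = range(0, len(audio) - slice_len + 1, hop)
    return np.stack([audio[s:s + slice_len] for s in starts])

# Example: 5 seconds of placeholder audio at 16 kHz -> 4 whole slices
audio = np.random.uniform(-1, 1, 16000 * 5)
slices = slice_for_wavegan(audio)
print(slices.shape)  # (4, 16384)
```

With overlapping hops (hop < slice_len) the same audio yields more training examples, which can help when a dataset is short.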
Our lyric model is based on a vanilla LSTM implementation, written in PyTorch following standard textbook descriptions. Given our limited computational resources, we deprioritised lyric generation, knowing that we could not train a model well enough to deliver outstanding results. The training data came from the Music O-Net billboard chart library, which includes metadata such as lyrics and various song qualifiers and descriptors. To suit the song we produced, the model was trained on energetic songs that featured few vocals and were labelled as intense, while songs labelled “explicit” were removed from the corpus to avoid sexist or racist vocabulary.
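The corpus-filtering step can be sketched as a simple metadata filter. The field names below (`tags`, `vocal_level`, `explicit`) are hypothetical stand-ins for the Music O-Net descriptors; the real library’s schema may differ.

```python
def filter_corpus(songs):
    """Keep intense, instrumental-leaning, non-explicit songs.

    `songs` is a list of dicts with assumed metadata keys standing
    in for the billboard library's qualifiers and descriptors.
    """
    return [
        s for s in songs
        if "intense" in s["tags"]        # energetic / intense material
        and s["vocal_level"] < 0.3       # little vocal content
        and not s["explicit"]            # drop explicit lyrics
    ]

songs = [
    {"tags": ["intense"], "vocal_level": 0.1, "explicit": False},
    {"tags": ["intense"], "vocal_level": 0.1, "explicit": True},
    {"tags": ["calm"],    "vocal_level": 0.0, "explicit": False},
]
print(len(filter_corpus(songs)))  # 1
```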
Melody: We entered our favourite passage from Chopin’s Nocturnes into the MuseNet model, using its Chopin corpus, and generated several melody continuations. This gave us a Chopin-like melody that does not exist in his catalogue. We then layered the piano line with a sax line generated by Google’s ToneTransfer.
Source Song
Drone: With our GAN model we generated audio snippets based on the music of Boards of Canada and Steve Reich. We looped our favourite output and processed it with effects such as resonators, delays, reverbs and distortion units, then modulated some of the parameters over time to give the element an evolving, dynamic character.
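The idea of looping a short output and slowly modulating an effect parameter can be sketched with a one-pole low-pass filter swept by an LFO. This is a minimal NumPy stand-in for the effect chains described above, not our actual processing; the function name and parameter ranges are ours.

```python
import numpy as np

def modulated_lowpass(loop, repeats=4, lfo_hz=0.25, sr=16000):
    """Loop a buffer and sweep a one-pole low-pass filter with a slow LFO.

    The smoothing coefficient `a` is modulated between nearly open
    (0.05) and nearly closed (0.95), so the timbre evolves over time.
    """
    x = np.tile(np.asarray(loop, dtype=float), repeats)
    t = np.arange(len(x)) / sr
    a = 0.5 + 0.45 * np.sin(2 * np.pi * lfo_hz * t)  # coefficient sweep
    y = np.empty_like(x)
    state = 0.0
    for i in range(len(x)):
        state = a[i] * state + (1 - a[i]) * x[i]     # one-pole low-pass
        y[i] = state
    return y

loop = np.random.uniform(-1, 1, 8000)  # placeholder for a GAN output
out = modulated_lowpass(loop)
```

Because the filter output is a convex combination of its input and state, the processed signal never exceeds the loop’s peak level.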
Pads & Stabs: We loaded some of our GAN outputs into Ableton’s Simpler to create a chord progression and melodic stabs.
Drums: We used our GAN model and Donahue’s WaveGAN demo. One hi-hat element is the result of a concatenative-synthesis tool that tried to recreate a drum pattern we programmed.
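Concatenative synthesis rebuilds a target sound from the closest-matching grains in a corpus. The toy sketch below matches grains by RMS energy only, a deliberately crude one-feature version of what such tools do; real tools use richer spectral features, and the names here are ours.

```python
import numpy as np

def concat_resynthesize(target, corpus_grains):
    """Rebuild `target` from the nearest-matching grains in a corpus.

    Both signals are cut into equal-length grains; each target grain
    is replaced by the corpus grain with the closest RMS energy.
    """
    grain = len(corpus_grains[0])
    corpus_rms = np.array([np.sqrt(np.mean(g ** 2)) for g in corpus_grains])
    out = []
    for i in range(len(target) // grain):
        seg = target[i * grain:(i + 1) * grain]
        rms = np.sqrt(np.mean(seg ** 2))
        out.append(corpus_grains[np.argmin(np.abs(corpus_rms - rms))])
    return np.concatenate(out)

# Toy example: a loud-then-silent pattern rebuilt from two grains
corpus = [np.zeros(4), np.ones(4)]
target = np.concatenate([0.9 * np.ones(4), np.zeros(4)])
rebuilt = concat_resynthesize(target, corpus)
```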
Lyrics: As a last step, we generated lyrics with our own model and had them read out by an online translation tool.
Ambient Loop
We looped and pitched a file generated by the BOC Reich model. The loop was then treated with a wide range of audio effects such as delays, saturators, compressors, reverbs and resonators, modulated over time to create a dynamic performance that evolves throughout the entire track.
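Pitching a loop in the simplest, varispeed-style way just resamples it: playing faster raises the pitch and shortens the loop, as when repitching a sample in a DAW. The linear-interpolation sketch below is an assumption about the general technique, not the exact processing we used.

```python
import numpy as np

def repitch(loop, semitones):
    """Pitch a loop up or down by resampling (varispeed style).

    A positive `semitones` value speeds playback up, raising the
    pitch and shortening the loop; negative values do the opposite.
    Linear interpolation keeps the sketch dependency-free.
    """
    ratio = 2 ** (semitones / 12)           # frequency ratio per semitone
    new_len = int(len(loop) / ratio)
    idx = np.arange(new_len) * ratio        # fractional read positions
    return np.interp(idx, np.arange(len(loop)), loop)

loop = np.sin(np.linspace(0, 40 * np.pi, 16000))  # placeholder loop
up_octave = repitch(loop, 12)                     # one octave up
```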
Original
Background Ambient Drone
Modified
Background Ambient Drone Modulated