video: Yansu Kim

“AI DJ Project - A dialogue between AI and a human” is a live performance featuring an Artificial Intelligence (AI) DJ playing alongside a human DJ. Utilizing deep neural network technology, the AI system selects and mixes songs and performs other musical tasks. Playing back to back, each DJ selects one song at a time, embodying a dialogue between the human and AI through music.

In “Back to Back,” the AI system and the human DJ perform under conditions that are as similar as possible. For example, the AI daringly uses physical vinyl records and turntables. The system listens to the songs played by the human DJ, detects the tempo, judges the genre, and processes this information on the spot. Following this process, the AI chooses the next record to be played. Of course, since the AI does not have a physical body, it cannot set the new record on the turntable, so it requires some human assistance. After a record is set, the AI begins the process again, adjusting the tempo of the next song to the tempo of the song played by its human counterpart. The beats of both songs are matched by controlling the pitch of the turntable via MIDI.
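That last step hints at a simple control loop: the detected tempo difference is translated into a MIDI message that offsets the turntable's pitch fader. Below is a minimal sketch of what this could look like in Python, assuming the `mido` library; the port name, control-change number, and pitch range are illustrative assumptions, not details taken from the project.

```python
import mido  # assumes the mido library (python-rtmidi backend)

# Illustrative assumptions: the actual turntable interface, port name,
# and CC mapping are not documented on this page.
PITCH_CC = 1             # hypothetical control-change number for pitch
PORT_NAME = "Turntable"  # hypothetical MIDI port name

def tempo_to_cc(current_bpm: float, target_bpm: float, pitch_range: float = 0.08) -> int:
    """Map a relative tempo adjustment onto a 0-127 CC value.

    pitch_range is the turntable's +/- pitch range (8% here, a common value).
    """
    ratio = target_bpm / current_bpm - 1.0                 # e.g. +0.03 = speed up 3%
    ratio = max(-pitch_range, min(pitch_range, ratio))     # clamp to the fader's range
    return int(round((ratio / pitch_range + 1.0) * 63.5))  # -range..+range -> 0..127

with mido.open_output(PORT_NAME) as port:
    # Nudge the AI's deck toward the tempo detected from the human DJ's track.
    cc_value = tempo_to_cc(current_bpm=120.0, target_bpm=124.0)
    port.send(mido.Message("control_change", control=PITCH_CC, value=cc_value))
```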

Occasionally the AI performs in unexpected ways. Sometimes its selection suits the atmosphere very well, but at times it appears random. For example, someone watching the performance may expect a fast-tempo techno track to be queued after the previous one; however, the AI system has a mind of its own and might play a free-jazz song next. The system has affinities and rhythms unknown to the audience. This unpredictability brings an amusing tension into the performance, one made possible only by the interaction between the human and AI DJs.

Artificial Intelligence should not be considered an imitation or emulation of humans. Rather, it is an “Alternative Intelligence” with a logic different from that of humans. “Back to Back” serves as a critical investigation into the unique relationship between humans and machines. In the performance, the AI is not a replacement for the human DJ; instead, it is a partner able to think and play alongside its human counterpart, bringing forth a wider perspective on our own relationship to contemporary technologies.

SOFTWARE

The software consists of two parts:
1. Deep learning system for selecting and mixing songs
2. Visualization

1. Deep learning system for selecting and mixing songs


1-a: Selecting songs

We trained a neural network to extract auditory features from spectrogram images of song snippets. We collected over 10,000 songs labeled by genre (e.g., hip-hop, techno, dubstep, house) and trained a convolutional neural network (CNN) to infer the genre of a given song from a spectrogram image of that song. Once the network is trained, the same model can be used to extract auditory features as a high-dimensional vector, which is then squeezed into 3D space using the t-SNE dimensionality reduction algorithm. While the human DJ is playing, the system feeds the incoming audio into the model and maps it into the same space, so that it can select the closest song, which presumably has a similar musical tone/mood, as the next song to play.
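The page does not publish the exact architecture, but a minimal sketch of this pipeline could look like the following, assuming TensorFlow/Keras for the CNN, librosa for spectrograms, and scikit-learn for t-SNE; the layer sizes, library path, and training details are illustrative assumptions. One practical caveat is reflected in the code: t-SNE cannot place unseen points into an existing embedding, so the sketch searches for the nearest neighbor in the high-dimensional feature space and keeps the 3D embedding for display.

```python
import glob

import numpy as np
import librosa
from sklearn.manifold import TSNE
from tensorflow.keras import layers, models

N_GENRES = 10  # hip-hop, techno, dubstep, house, ... (illustrative count)
song_paths = sorted(glob.glob("library/*.mp3"))  # labeled library; path is illustrative

def spectrogram(path, sr=22050, duration=30.0):
    """Log-scaled mel spectrogram of a fixed-length snippet, shaped (mels, frames, 1)."""
    y, _ = librosa.load(path, sr=sr, duration=duration)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
    return librosa.power_to_db(mel)[..., np.newaxis]

def build_cnn(input_shape):
    """Small CNN genre classifier; layer sizes are illustrative, not the original's."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(2),
        layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(2),
        layers.Conv2D(128, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(256, activation="relu", name="features"),
        layers.Dense(N_GENRES, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    return model

cnn = build_cnn(input_shape=(128, 1292, 1))  # ~30 s at sr=22050, hop=512
# ... train with cnn.fit(...) on the labeled spectrograms, then:

# Reuse the penultimate layer as a high-dimensional feature extractor.
extractor = models.Model(cnn.input, cnn.get_layer("features").output)
features = extractor.predict(np.stack([spectrogram(p) for p in song_paths]))

# 3D embedding for display; nearest-neighbor search stays in the 256-D space.
embedded = TSNE(n_components=3).fit_transform(features)

def next_song(incoming_spec):
    """Map the human DJ's track into feature space and return the closest song."""
    f = extractor.predict(incoming_spec[np.newaxis])[0]
    return song_paths[int(np.argmin(np.linalg.norm(features - f, axis=1)))]
```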

1-b: Matching the beat

The second neural network controls the speed of the turntable to match the beat of the music the human DJ plays. We used a machine learning technique called “reinforcement learning” to “teach” the model how to speed up or slow down and nudge or pull the turntable to align the downbeats, through millions of trials and errors.
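The references below point to DQN-style deep reinforcement learning; as a compact illustration of the same trial-and-error loop, here is a tabular Q-learning sketch on a toy turntable simulator. The action set, dynamics, noise, and reward are illustrative assumptions, and the state is reduced to the downbeat phase error alone.

```python
import random

import numpy as np

# Toy simulator: the state is the downbeat phase error in [-0.5, 0.5) beats,
# discretized into bins. Actions mimic a DJ's controls.
ACTIONS = ["speed_up", "slow_down", "nudge", "pull", "hold"]
N_BINS = 21

def step(phase_err, action):
    """Apply an action; return (new phase error, reward)."""
    delta = {"speed_up": -0.04, "slow_down": 0.04,
             "nudge": -0.1, "pull": 0.1, "hold": 0.0}[action]
    new_err = (phase_err + delta + random.gauss(0, 0.01) + 0.5) % 1.0 - 0.5
    return new_err, -abs(new_err)  # reward: stay close to the downbeat

def to_bin(phase_err):
    return int((phase_err + 0.5) * N_BINS) % N_BINS

q = np.zeros((N_BINS, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(10_000):  # the real system reportedly needed millions of trials
    err = random.uniform(-0.5, 0.5)
    for _ in range(100):
        s = to_bin(err)
        a = random.randrange(len(ACTIONS)) if random.random() < eps else int(np.argmax(q[s]))
        err, r = step(err, ACTIONS[a])
        s2 = to_bin(err)
        q[s, a] += alpha * (r + gamma * q[s2].max() - q[s, a])  # Q-learning update
```

After enough episodes, the learned policy nudges or pulls whenever the phase error is large and holds once the downbeats align, which is essentially what a DJ's hands do.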

2. Visualization

We also visualize the two processes described above during the performance.
The visualization software has multiple scenes, including:

  • a. T-SNE 3D space of sound features (a minimal rendering sketch follows this list).
  • b. Activations inside the neural network model for selecting the next song.
  • c. Input (timeline of beats) and output (actions) of the beat-matching model.
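As an example, scene (a) could be rendered as a 3D scatter plot of the embedded library, with the currently playing track highlighted. A minimal matplotlib sketch, with random placeholder points standing in for the t-SNE output above:

```python
import matplotlib.pyplot as plt
import numpy as np

# Placeholder for the TSNE(n_components=3) output of the song library;
# the highlighted index is hypothetical.
embedded = np.random.randn(1000, 3)
current = embedded[42]  # the track the system just mapped into the space

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(embedded[:, 0], embedded[:, 1], embedded[:, 2], s=4, alpha=0.4)
ax.scatter(*current, s=80, color="red")  # currently playing song
ax.set_title("t-SNE space of CNN sound features")
plt.show()
```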

CREDIT

Concept/Programming

Nao Tokui

Visualization

Shoya Dozono

Project Management

Miyu Hosoi

Assistance

Yuma Kajihara

Robot

TASKO, inc.

REFERENCES

1-a: Selecting songs

- Automatic Tagging Using Deep Convolutional Neural Networks – Keunwoo Choi et al. | https://arxiv.org/abs/1606.00298

- Deep Content-Based Music Recommendation – Aaron van den Oord et al. | https://papers.nips.cc/paper/5004-deep-content-based-music-recommendation

- Recommending music on Spotify with deep learning (blog) | http://benanne.github.io/2014/08/05/spotify-cnns.html

- Explaining Deep Convolutional Neural Networks on Music Classification | https://keunwoochoi.wordpress.com/2016/03/23/what-cnns-see-when-cnns-see-spectrograms/

1-b: Matching the beat

- Playing Atari with Deep Reinforcement Learning | https://arxiv.org/abs/1312.5602

- Evaluating the Evaluation Measures for Beat Tracking | http://telecom.inesctec.pt/~mdavies/pdfs/DaviesBock14-ismir.pdf

- Measuring the performance of beat tracking algorithms using a beat error histogram | https://www.eecs.qmul.ac.uk/~markp/2011/DaviesDegaraPlumbley11-spl_accepted_postprint.pdf

SPECIAL THANKS

wasabeat, Chris Romero, Yansu Kim, Rakutaro Ogiwara, Ametsub

PRESS

DOWNLOAD HERE

Please credit “photo by Rakutaro Ogiwara”
