In a collaborative research project with AlphaTheta Corporation, we developed a deep-learning model that detects the presence and position of vocals in music. The model was integrated into Pioneer DJ's rekordbox software.
Today's DJs perform with digital audio data. Waveforms can be visualized on screen, making a song's structure much easier to grasp. Even so, identifying where the vocals begin from the waveform alone is no easy task.
In this project, a deep-learning system analyzes audio from songs that contain vocals (singing, rap, chorus, etc.) and songs that do not, and learns to predict where vocals appear in a track. The software displays the timing of the detected vocals as an overlay on the waveform, so DJs can tell at a glance where the vocals occur and use this information in mixing and other performances.
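As a rough illustration of this kind of pipeline, the sketch below shows frame-level vocal activity detection followed by merging active frames into time segments, which is the information a waveform overlay would display. It is not the actual rekordbox engine: the classifier here is a placeholder stub, and the file name, threshold, and minimum segment length are assumptions chosen only to keep the example self-contained.

```python
# Minimal sketch: per-frame vocal activity estimates -> (start, end) segments.
# The real model and its parameters are proprietary; this is illustrative only.
import numpy as np
import librosa

SR = 22050
HOP = 512          # hop length in samples -> one estimate per hop
THRESHOLD = 0.5    # assumed decision threshold for "vocals present"
MIN_SEG_SEC = 0.5  # assumed minimum segment length to keep


def vocal_probabilities(mel_db: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a trained classifier returning a per-frame
    probability of vocal presence. A real system would run a neural network
    here; this stub uses normalized spectral energy so the script runs as-is."""
    energy = mel_db.mean(axis=0)
    return (energy - energy.min()) / (energy.max() - energy.min() + 1e-8)


def detect_vocal_segments(path: str):
    y, sr = librosa.load(path, sr=SR, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, hop_length=HOP, n_mels=80)
    mel_db = librosa.power_to_db(mel, ref=np.max)

    probs = vocal_probabilities(mel_db)
    active = probs > THRESHOLD  # boolean mask, one value per frame

    # Merge consecutive active frames into (start_sec, end_sec) segments,
    # the data a DJ UI could overlay on the waveform.
    segments, start = [], None
    for i, flag in enumerate(active):
        t = i * HOP / sr
        if flag and start is None:
            start = t
        elif not flag and start is not None:
            if t - start >= MIN_SEG_SEC:
                segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, len(y) / sr))
    return segments


if __name__ == "__main__":
    # "track.wav" is a placeholder file name.
    for s, e in detect_vocal_segments("track.wav"):
        print(f"vocals: {s:7.2f}s - {e:7.2f}s")
```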
The engine developed in this project was integrated into "Pioneer DJ rekordbox for Mac/Windows (ver. 6.0.1)," released in May 2020.
Client
AlphaTheta Corporation
Credits
Machine Learning / Technical Direction: Nao Tokui (Qosmo, Inc.), Max Frenzel (Qosmo, Inc.)
Project Management: Jun Kato (Dentsu Craft Tokyo), Yumi Takahashi (Qosmo, Inc.)
Product
This project uses Qosmo Music & Sound AI.