Neural Beatbox



Rhythm is one of the most ancient means of communication. Neural Beatbox enables anyone to collectively create beats and rhythms using their own sounds. The AI segments and classifies sounds into drum categories and continuously generates new rhythms. By combining the contributions of viewers, it creates an evolving musical dialogue between people. The AI’s slight imperfections enrich the creative expression by generating unique musical experiences.


Check out the website of Neural Beatbox.


Through interactive design principles, this audiovisual installation allows for collaboration between a user and the AI system. While the AI guides the creative process and makes decisions, the content itself comes only from humans.

This resonates with the practice of beatboxing, where the instruments are removed from the equation to put the emphasis on the creative potential of the individual.

Here, the AI becomes a tool and enabler for natural expression, trying to make the best out of any human-produced content. Despite the computational process involved, it remains imperfect and produces results that may or may not meet expectations, adding a creative element of surprise and novelty.


One critical aspect of the installation was to propose a setup intuitive enough to allow for a fully interactive experience, while still showcasing the potential of machine learning in the context of music production.

In order to do so, the system was designed end to end, from the server to the client side.

The AI system is divided into two parts:


The first step is to receive sound files (recorded from users), divide them into meaningful segments, and use a neural network classifier to classify them into one of eight possible drum categories (e.g. “kick”, “snare”, “clap”, etc).
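As a rough illustration of this first step, here is a minimal sketch of how a recording could be cut into segments and handed to a classifier. It is not the installation's actual code: the energy-threshold segmentation, the function names, and the parameter values are all assumptions made for the example, and the `classifier` argument is a stand-in for the trained neural network. Only three of the eight categories are named in the text, so only those appear here.

```python
from typing import Callable, List, Sequence, Tuple

# The text names three of the eight drum categories; the others are not listed.
NAMED_CATEGORIES = ("kick", "snare", "clap")


def frame_energies(samples: Sequence[float], frame_size: int) -> List[float]:
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        energies.append(sum(x * x for x in frame) / len(frame))
    return energies


def segment(
    samples: Sequence[float],
    frame_size: int = 256,    # assumed value, for illustration only
    threshold: float = 0.01,  # assumed value, for illustration only
    min_frames: int = 2,      # runs shorter than this are discarded
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs of frames above the threshold."""
    segments = []
    run_start = None
    energies = frame_energies(samples, frame_size)
    # The trailing 0.0 closes a run that lasts until the end of the recording.
    for i, energy in enumerate(energies + [0.0]):
        if energy > threshold and run_start is None:
            run_start = i
        elif energy <= threshold and run_start is not None:
            if i - run_start >= min_frames:
                end = min(i * frame_size, len(samples))
                segments.append((run_start * frame_size, end))
            run_start = None
    return segments


def classify_segments(
    samples: Sequence[float],
    segments: Sequence[Tuple[int, int]],
    classifier: Callable[[Sequence[float]], str],
) -> List[str]:
    """Apply a classifier (hypothetical stand-in for the neural network) to each segment."""
    return [classifier(samples[start:end]) for start, end in segments]
```

In use, `segment(recording)` would return the sample ranges of the detected sounds, and `classify_segments(recording, ranges, model)` would return one category label per range, where `model` is whatever callable wraps the trained classifier.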


The second step is to generate beats — sequences of drum patterns — that will be played.

This generative part also relies on a trained neural network, a Variational Autoencoder (VAE), but it involves extra processes that give us more control over the choice of sequence. For instance, each drum can be weighted in order to pull a beat with more relevant items for the current context (e.g. if a user records a kick, we will want to update the beat with a sequence containing a kick).
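The drum weighting can be pictured with a small sketch. The text does not say how the weights are applied, so the following is only one plausible reading: several candidate beats are assumed to be available (in the real system they would come from the VAE; here they are written by hand), each is scored by summing the weights of the drums it contains, and the highest-scoring one is kept. The `Beat` representation and the function names are hypothetical.

```python
from typing import Dict, List, Sequence, Set

# A beat is modelled as a list of time steps, each holding the drums hit at that step.
Beat = List[Set[str]]


def score_beat(beat: Beat, weights: Dict[str, float]) -> float:
    """Sum the weight of every hit in the beat; unweighted drums count as 0."""
    return sum(weights.get(drum, 0.0) for step in beat for drum in step)


def pick_beat(candidates: Sequence[Beat], weights: Dict[str, float]) -> Beat:
    """Return the candidate with the highest weighted score (the first one wins ties)."""
    if not candidates:
        raise ValueError("need at least one candidate beat")
    return max(candidates, key=lambda beat: score_beat(beat, weights))
```

For example, after a user records a kick, calling `pick_beat(candidates, {"kick": 1.0})` favours the candidates that contain the most kick hits, which matches the behaviour described above.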

The client side runs as a web app, and makes use of the modern features provided by browsers: media recording API, advanced graphics, etc.

This allows the application to be accessible from any modern machine with minimal configuration.

In the context of the event at the Barbican Centre, the system behind Neural Beatbox was first designed to run as an exhibition app, and thus features reduced interactions to fit the simple interface provided on-site. In the long term, though, it is planned to evolve into a proper browser-friendly application, to allow anyone to access it from a browser and experiment with it.

While the data processing could be done in the browser with modern APIs, delegating the heavier computations to the server lets us have more control over the smoothness of the process. Being centralized, it also opens doors for further iterations in the future; collaboration could now potentially happen across users, establishing a connection between them, with the AI acting as a “curator” in the middle. 

Neural Beatbox at AI: More than Human


In order to convey the intent of the experience and make interaction with it intuitive, the visual setup had to go through a proper design process. We collaborated with Alvaro Arregui (Nuevo.Studio) for that purpose, and developed a set of animations to reflect the dynamics of the music and give a rhythm to the piece, visually as well as audibly.


2019/05 – 2019/08
AI: More than Human
Barbican Centre, London, UK



  • Concept / Direction

    Nao Tokui (Qosmo, Inc.)

  • Research / Management

    Max Frenzel (Qosmo, Inc.)

  • Development

    Robin Jungers (Qosmo, Inc.)

  • Design

    Alvaro Arregui (Nuevo.Studio)
