Qosmo updates “Imaginary Soundscape”, a web service where AI selects soundscapes for images and scenes


Qosmo offers commercial licenses for the updated multi-modal AI technology

Tokyo, Japan, Apr 27, 2021 – Qosmo, Inc., which advances the creativity of mankind with Art and Technology, today publicly releases an updated version of “Imaginary Soundscape”, an online service that uses multi-modal deep learning technology to find a sound clip matching an uploaded image. The service is now available for free in English and Japanese. Qosmo also announces commercial licensing of the service’s core technology, Img2Sound Engine.

Photo: Top page of the Imaginary Soundscape website

■ What is Imaginary Soundscape?
We all have the ability to imagine the sounds we would hear when looking at a photo. A beach brings to mind the sound of waves, and a busy crossing brings to mind the sound of cars. This project is an attempt to externalize that human capability. Given an image uploaded by the user, the AI model finds the best-matching sound in a large library of over 60 thousand sound clips. In Google Street View mode, users can walk around a place of their choice and experience the soundscape “imagined” by the AI. Since its initial launch in 2017, the project has been well received and has been used by more than 500 thousand users.

Google Street View mode lets you experience the soundscape selected by the AI

■ Features of the Latest Version
This release brings three major updates to the service: improved model accuracy, an expanded sound library, and an improved UI. Internally, the previous AI model, which was based on object recognition technology, has been replaced with a multi-modal AI model built with the Contrastive Learning technique. Together with the expanded sound library, these updates allow more contextualized matching that is sensitive to subtle differences in nuance.
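For readers unfamiliar with Contrastive Learning, the idea can be sketched with a toy loss function. This is a common symmetric formulation from the research literature and is only an assumption here; the release does not describe the actual training objective, and the names `contrastive_loss` and `temperature` are hypothetical. The loss is small when each image vector is closest to the vector of its own paired sound and large otherwise.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def contrastive_loss(image_vecs, sound_vecs, temperature=0.1):
    """Symmetric contrastive loss over a batch where image i is paired with sound i.

    Hypothetical illustration only, not Qosmo's implementation.
    """
    images = [normalize(v) for v in image_vecs]
    sounds = [normalize(v) for v in sound_vecs]
    count = len(images)

    # logits[i][j] = scaled cosine similarity between image i and sound j
    logits = [
        [sum(a * b for a, b in zip(img, snd)) / temperature for snd in sounds]
        for img in images
    ]

    def cross_entropy(rows):
        # The correct "class" for row k is column k (the true pair).
        total = 0.0
        for k, row in enumerate(rows):
            top = max(row)
            log_sum_exp = top + math.log(sum(math.exp(x - top) for x in row))
            total += log_sum_exp - row[k]
        return total / count

    columns = [list(col) for col in zip(*logits)]
    # Average the image-to-sound and sound-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(columns))

# Toy batch: two image vectors and two sound vectors in a shared 2-D space.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < mismatched)
```

Training a model to minimize a loss of this kind pulls matching image and sound vectors together and pushes non-matching ones apart, which is what makes the two content types comparable afterwards.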

■ Commercial licensing of “Img2Sound Engine”
We are now starting commercial licensing of Img2Sound Engine, the core technology of Imaginary Soundscape. This technology is an application of a Deep Learning technique called Contrastive Learning. Both the input image and the candidate sound files are converted into vector representations, which allows similarity to be compared across different content types. The technique also works in the other direction (sound to image) and with other modalities such as text, music and video. Qosmo has experience with multi-modal AI technologies, having helped clients incorporate them into their products, services and projects on various occasions.
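The matching step described above can be illustrated with a minimal sketch. It assumes the image and the sound clips have already been converted into vectors in a shared space; the vectors, file names, and the function `best_matching_sound` are made up for illustration and do not reflect the actual Img2Sound Engine API. Cosine similarity is used here as one common way to compare such vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_matching_sound(image_vec, sound_library):
    """Return the name of the sound whose vector is most similar to the image vector."""
    return max(
        sound_library,
        key=lambda name: cosine_similarity(image_vec, sound_library[name]),
    )

# Hypothetical 3-D embeddings; a real model would produce much longer vectors.
sound_library = {
    "waves.wav": [0.9, 0.1, 0.0],
    "traffic.wav": [0.1, 0.9, 0.2],
    "birdsong.wav": [0.0, 0.2, 0.9],
}
beach_image_vec = [0.8, 0.2, 0.1]

match = best_matching_sound(beach_image_vec, sound_library)
print(match)  # waves.wav
```

Because the comparison only needs vectors, the same lookup works in the other direction: a sound vector can be compared against a library of image vectors without changing the code.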

Diagram: Img2Sound Engine system schematic

■ Past Exhibitions and Awards

2017 Submitted to and selected for NIPS 2017 Machine Learning for Creativity and Design
2018 Exhibited “Imaginary Soundwalk” at Media Ambition Tokyo 2018
Favorite Website Award (FWA) Site of the Day
Posted to “Experiments with Google — AI Experiments”

Get in touch with us here!

CONTACT