Advances in artificial intelligence (AI) have brought the technology into a range of industries. We have already seen how Warner Bros. has adopted it. Now Deezer, a music streaming service, has built an AI model that can spot explicit lyrics in a song so that the offending words can be removed and the song made suitable for every age group.

Human operators can find explicit lyrics in a song, but the process is tedious. To simplify it, the company built Spleeter, a tool that separates a song into its individual tracks so that the vocal part can be analysed.

Spleeter was built to aid researchers working in Music Information Retrieval (MIR). It uses a “state-of-the-art” source separation algorithm to help MIR researchers extract the separate tracks, such as the vocals, from a “Mix track”.

When a song is tagged as “explicit” (generally with the “Parental Advisory” label), it is considered unsuitable for children. When a song is submitted to a streaming service like Deezer without that tag, it may well be suitable for all age groups. However, it can also mean that the music label never checked the song for explicit content and released it as is.

Existing machine learning models for this task require text-based transcripts of the lyrics, which can be hard to find. An alternative approach uses deep neural networks, but these can be unreliable because researchers cannot always tell how the system arrives at its results.

With Spleeter, the researchers can separate the vocal track from a song quickly, as the tool runs 100 times faster than real time. That makes it a good option for processing huge datasets. The system then applies a technique called keyword spotting to find the “explicit” words in the vocal track.
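
To put the speed figure in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: the 4-minute track length, the one-million-track catalogue, and the single-worker setup are illustrative assumptions, not figures from Deezer.

```python
def separation_time_seconds(audio_seconds: float, speedup: float = 100.0) -> float:
    """Wall-clock time needed to separate audio at `speedup` times real time."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


# One 4-minute track (assumed length) at the quoted 100x rate.
one_track = separation_time_seconds(4 * 60)

# A hypothetical catalogue of one million such tracks on a single worker.
catalogue_days = separation_time_seconds(1_000_000 * 4 * 60) / 86_400

print(f"{one_track:.1f} s per track, {catalogue_days:.1f} days per million tracks")
# prints "2.4 s per track, 27.8 days per million tracks"
```

Even at that rate, a large catalogue would still take weeks on one machine, so in practice the work would likely be spread across several workers.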

The researchers compared the model against a “black-box” model and an “Oracle” system that detects the keywords directly from the text of the lyrics. The Spleeter-based system beat the black-box model but fell short of the “Oracle” system. Even so, beating the black-box model is a notable result for the tool.
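
To illustrate the idea behind a text-based baseline like the “Oracle”, here is a minimal sketch that looks up flagged words directly in the lyrics. The word list, the function names, and the tokenisation rule are all assumptions made for this example; the article does not describe how the researchers' system is actually implemented.

```python
import re
from typing import FrozenSet, List

# Placeholder terms for illustration; a real system would use a curated dictionary.
EXPLICIT_TERMS: FrozenSet[str] = frozenset({"damn", "hell"})


def find_explicit_terms(lyrics: str, terms: FrozenSet[str] = EXPLICIT_TERMS) -> List[str]:
    """Return the flagged terms found in the lyrics, in order of first appearance."""
    found: List[str] = []
    # Lowercase the text and split it into simple word tokens.
    for word in re.findall(r"[a-z']+", lyrics.lower()):
        if word in terms and word not in found:
            found.append(word)
    return found


def is_explicit(lyrics: str) -> bool:
    """A song counts as explicit if at least one flagged term appears."""
    return bool(find_explicit_terms(lyrics))


print(find_explicit_terms("Well, damn. What the hell? Damn!"))
# prints ['damn', 'hell']
```

Such a lookup is only possible when the lyrics are available as text, which is exactly the limitation the audio-based approach tries to work around.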

With the help of Spleeter, MIR researchers can separate a song's vocal track and analyse it for “explicit” keywords, which makes it easier to turn an explicit song into a kid-friendly one.
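
The overall flow can be sketched roughly as follows. Everything here is hypothetical: the `Detection` record, the `flag_explicit` function, the 0.5 threshold, and the two stub stages stand in for the real separation and keyword-spotting components, whose interfaces the article does not describe.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Detection:
    keyword: str
    start_seconds: float
    score: float  # assumed spotter confidence in [0, 1]


def flag_explicit(
    mix_path: str,
    separate_vocals: Callable[[str], str],
    spot_keywords: Callable[[str], List[Detection]],
    threshold: float = 0.5,
) -> List[Detection]:
    """Separate the vocals, spot keywords in them, and keep confident detections."""
    vocals_path = separate_vocals(mix_path)
    detections = spot_keywords(vocals_path)
    return [d for d in detections if d.score >= threshold]


# Stub stages so the sketch runs without audio files or trained models.
def fake_separate(mix_path: str) -> str:
    return mix_path.replace(".mp3", "_vocals.wav")


def fake_spot(vocals_path: str) -> List[Detection]:
    return [Detection("damn", 12.5, 0.91), Detection("hell", 48.0, 0.20)]


hits = flag_explicit("song.mp3", fake_separate, fake_spot)
print([(d.keyword, d.start_seconds) for d in hits])
# prints [('damn', 12.5)]
```

Passing the two stages in as functions keeps the decision logic separate from the models, so either stage could be swapped out without touching the rest.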