Google has introduced MusicLM, a model that generates music from text

A team of Google researchers has presented a new music-generating AI called MusicLM. The model creates high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff." It works in a similar way to DALL·E, which generates images from text.

MusicLM uses the multi-stage autoregressive modeling of AudioLM as its generative component, extending it to text conditioning. To address the main challenge of scarce paired training data, the researchers used MuLan, a joint music-text embedding model trained to project music and its corresponding text descriptions close together in a shared embedding space.
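
MuLan itself has not been released publicly, but the core idea of pulling paired music and text toward nearby points in a shared embedding space can be sketched with a standard contrastive objective. The snippet below is a minimal illustration only: the random vectors are hypothetical stand-ins for the outputs of the two encoder towers, and the loss shows the training signal rather than the actual MuLan implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for MuLan's two encoder towers: in the real model these
# are learned audio and text networks; here a batch of N hypothetical
# music/description pairs is represented by random vectors.
N, D = 8, 128
audio_emb = rng.normal(size=(N, D))
text_emb = rng.normal(size=(N, D))

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

audio_emb = l2_normalize(audio_emb)
text_emb = l2_normalize(text_emb)

# Contrastive (InfoNCE-style) objective: each clip should be closest to
# its own description and far from the other descriptions in the batch,
# which is what pulls matched pairs together in the shared space.
temperature = 0.07
logits = audio_emb @ text_emb.T / temperature   # (N, N) similarity matrix
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.diag(log_probs).mean()
print(f"contrastive loss: {loss:.3f}")
```

Once such a model is trained, a text prompt can be embedded at generation time and used as a conditioning signal even though the generator itself was trained mostly on unlabeled audio.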

MusicLM is trained on a large dataset of unlabeled music. The model treats conditional music generation as a hierarchical sequence-to-sequence modeling task and generates music at 24 kHz that remains consistent over several minutes. To address the lack of evaluation data, the researchers also released MusicCaps, a new high-quality music-captioning dataset of 5,500 music-text pairs prepared by professional musicians.
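
The hierarchical, multi-stage generation that MusicLM inherits from AudioLM can be sketched roughly as follows. Both stages below are random stand-ins for the real autoregressive Transformers, and the token counts and vocabulary sizes are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

SEMANTIC_VOCAB, ACOUSTIC_VOCAB = 1024, 4096  # illustrative sizes

def sample_semantic(conditioning, n_tokens):
    # Stage 1: coarse "semantic" tokens capture long-range structure
    # (melody, rhythm), conditioned on the embedding of the text prompt.
    return rng.integers(0, SEMANTIC_VOCAB, size=n_tokens)

def sample_acoustic(semantic_tokens, conditioning):
    # Stage 2: fine "acoustic" tokens add timbre and detail; several are
    # produced per semantic token, and a neural codec then decodes them
    # into a 24 kHz waveform.
    return rng.integers(0, ACOUSTIC_VOCAB, size=4 * len(semantic_tokens))

text_embedding = rng.normal(size=128)  # stands in for the MuLan text embedding
semantic = sample_semantic(text_embedding, n_tokens=250)
acoustic = sample_acoustic(semantic, text_embedding)
print(len(semantic), "semantic tokens ->", len(acoustic), "acoustic tokens")
```

Splitting generation into a coarse stage and a fine stage is what lets the model stay coherent over minutes of audio: long-range structure is decided cheaply on short semantic sequences before the expensive acoustic detail is filled in.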

Experiments show that MusicLM outperforms previous systems in both audio quality and adherence to the text description. In addition, MusicLM can be conditioned on both text and melody: it can generate music in the style described by the text prompt while following a given melody, even one that was whistled or hummed.
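
Melody conditioning can be sketched in the same toy style. Everything here is hypothetical (the paper does not expose a public API): the point is only that a hummed or whistled input is reduced to its own token sequence and passed to the generator alongside the text embedding.

```python
import numpy as np

rng = np.random.default_rng(0)

def tokenize_melody(waveform: np.ndarray, n_tokens: int = 250) -> np.ndarray:
    # Crude stand-in for a learned melody tokenizer: quantize the
    # envelope of the input into a small vocabulary of discrete tokens.
    frames = np.array_split(np.abs(waveform), n_tokens)
    levels = np.array([f.mean() for f in frames])
    return np.digitize(levels, np.linspace(0, levels.max() + 1e-9, 64))

hummed = rng.normal(size=24_000)        # 1 s of fake audio at 24 kHz
text_embedding = rng.normal(size=128)   # stands in for the MuLan embedding
conditioning = (text_embedding, tokenize_melody(hummed))
print(len(conditioning[1]), "melody tokens alongside the text embedding")
```

In the actual system the melody representation is designed to capture the tune while being insensitive to instrumentation and acoustic detail, which is what allows a whistled input to be re-rendered in the style named by the text prompt.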

Demo examples of the model are available on the project website.

The system learned to create music through training on a dataset of five million audio clips, representing 280,000 hours of music. MusicLM can create pieces of different lengths: it can generate a quick riff or a whole song, and it can even go further, producing pieces with alternating sections, as is common in symphonies, to create a sense of narrative. The system can also handle specific requests, such as calling for particular instruments or a particular genre, and it can generate an approximation of vocals.

The creation of MusicLM is part of a wave of deep-learning AI applications designed to reproduce human abilities such as speaking, writing documents, drawing, taking tests, or writing proofs of mathematical theorems.

For now, the developers have announced that Google will not release the system for public use. Tests showed that about 1% of the music generated by the model was copied directly from real recordings in the training data, raising concerns about content misappropriation and potential lawsuits.
