
Automatic Music Composition Based on Cooperative Generation with Multiple Transformer Networks

Information Technology and Network Security, Issue 5

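Wang Songchao, Li Jinlong

(School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, China)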

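Abstract: Multi-track automatic composition algorithms must balance the coherence of each individual sequence against the harmony among multiple sequences. Previous work typically either merges the sequences or runs multiple generators in parallel; neither approach can both fully capture the dependencies between notes and preserve the continuity of each individual sequence. This paper proposes the MuseTransformer framework, which includes a generator pool module composed of multiple Transformers, together with an asynchronous execution strategy and a synchronization mechanism for the generators, to ensure that fine-grained dependencies are captured. For the sequence representation of scores, the Key Position Symbol (KPS) is proposed to improve representation efficiency. Experiments on multiple music-domain evaluation metrics show that the multi-track sequences generated by the proposed model match or outperform other state-of-the-art methods in harmony, coherence, and the space efficiency of the sequence representation.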

Keywords: music generation; sequence representation; sequence model

CLC number: TP37
Document code: A
DOI: 10.19358/j.issn.2096-5133.2022.05.008
Citation format: Wang Songchao, Li Jinlong. Automatic music composition based on multi-Transformer cooperation[J]. Information Technology and Network Security, 2022, 41(5): 51-58.

Automatic music composition based on multi-Transformer cooperation

Wang Songchao, Li Jinlong

(School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China)

Abstract: Multi-track music generation algorithms need to account for both the coherence of each single track and the strong dependencies among multiple tracks. Previous methods either merge multiple sequences into one long sequence or run multiple generators in parallel; these approaches either fail to capture the complete dependencies among tokens or lose the completeness of individual tracks. In this paper, we propose MuseTransformer, which contains multiple Transformer generators, one for each track. To capture dependencies among tracks in a fine-grained manner, we design an asynchronous execution strategy that enables cooperation and synchronization among all generators. For music sequence representation, we design the Key Position Symbol (KPS) to improve representation efficiency. Experiments on multiple music-domain metrics show the advantages of our model in multi-track harmony, coherence, and spatial compactness compared with other state-of-the-art methods.

Keywords: music generation; sequence representation; sequence model

0 Introduction

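Multi-target sequence generation has important applications in tasks such as multi-track music generation, which requires ensuring both the continuity of each generated sequence and strong correlation among the sequences. This paper focuses on the multi-sequence generation problem in the context of music generation. Modern songs usually contain multiple tracks, including a melody track and several instrument tracks for accompaniment. Early studies [1-2] focused on melody generation with only a single track, whereas recent work [3-4] has begun to explore multi-track music generation. This paper considers only multi-track music generation using sequence-based methods.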

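Sequence-based methods first serialize the score into one or more symbol sequences and feed them to a sequence model. Typically, a sequence format similar to the MIDI protocol is designed to represent a single-track music sequence [1-2, 5]. Compared with single-track generation, multi-track generation requires the generated tracks to be strongly correlated with one another while each maintains its own continuity.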
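The serialization step described above can be illustrated with a minimal sketch. It is not the paper's representation (the KPS format is not specified on this page); it assumes a generic MIDI-like vocabulary of `NOTE_ON_<pitch>`, `NOTE_OFF_<pitch>`, and `TIME_SHIFT_<ticks>` tokens, and the `Note` type and `serialize_track` function are hypothetical names introduced only for illustration. Each track is serialized into its own token sequence, which corresponds to the "one or more symbol sequences" case.

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    pitch: int     # MIDI pitch number, 0-127
    start: int     # onset time in ticks
    duration: int  # length in ticks


def serialize_track(notes: List[Note]) -> List[str]:
    """Turn one track into a MIDI-like token sequence (illustrative format)."""
    # Collect (time, order, token) triples. NOTE_OFF uses order 0 and NOTE_ON
    # order 1, so at the same tick a note is released before the next starts.
    events = []
    for n in notes:
        events.append((n.start, 1, f"NOTE_ON_{n.pitch}"))
        events.append((n.start + n.duration, 0, f"NOTE_OFF_{n.pitch}"))
    events.sort()

    tokens: List[str] = []
    now = 0
    for time, _, token in events:
        if time > now:
            # Emit the elapsed time since the previous event as its own token.
            tokens.append(f"TIME_SHIFT_{time - now}")
            now = time
        tokens.append(token)
    return tokens


# Two toy tracks, each serialized into a separate sequence.
melody = [Note(60, 0, 4), Note(64, 4, 4)]
bass = [Note(36, 0, 8)]
sequences = {"melody": serialize_track(melody), "bass": serialize_track(bass)}
```

Keeping one sequence per track preserves each track's continuity, but the sequences by themselves carry no information about which tokens in different tracks sound at the same time; that cross-track alignment is the dependency a multi-track model has to capture.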

For the full text of this paper, please download: http://m.ihrv.cn/resource/share/2000004247


