Improved Capsule Routing for Weakly Labeled Sound Event Detection

December 1, 2022 · Haitao Li, Shuguo Yang (杨树国), Wenwu Wang
Abstract
Polyphonic sound event detection aims to identify the types of sound events that occur in a given audio clip, together with their onset and offset times, where multiple sound events may occur simultaneously. Deep learning-based methods such as convolutional neural networks (CNNs) have achieved state-of-the-art results in polyphonic sound event detection. However, two open challenges remain: overlap between events and a tendency to overfit. To address both problems, we propose a capsule network-based method for polyphonic sound event detection. Through so-called dynamic routing, capsule networks are well suited to handling overlapping objects and generalize in a way that reduces overfitting. However, dynamic routing also greatly slows down training. To speed up training, we propose a weakly labeled polyphonic sound event detection model based on improved capsule routing. The proposed method is evaluated on Task 4 of the DCASE 2017 challenge and compared with several baselines, demonstrating competitive results in terms of F-score and computational efficiency.
Type
Publication
EURASIP Journal on Audio, Speech, and Music Processing
Authors
Shuguo Yang (杨树国)
Full Professor
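Professor, doctoral supervisor, and former postdoctoral researcher at Harbin Institute of Technology. Director of the Research Center for Data Science and Information Technology, Director of the Shandong Provincial Engineering Research Center for Scenario-Based Applications of AI Ocean Technology, Director of the Qingdao AI Ocean Technology Innovation Center, and Dean of the Institute of Mathematics and Interdisciplinary Research at Qingdao University of Science and Technology. Senior visiting scholar at the Georgia Institute of Technology (USA), The Chinese University of Hong Kong, and Beijing Jiaotong University. Executive council member of the Shandong Mathematical Society and of the Shandong Applied Statistics Society, and standing committee member of the Artificial Intelligence Oceanography Professional Committee. In recent years, principal investigator of or participant in more than 40 research projects at various levels, funded by the National Natural Science Foundation of China, the Commission of Science, Technology and Industry for National Defense, the Ministry of Electronics Industry, the provincial natural science foundation, the provincial key research program, the provincial university research program, the provincial fund for outstanding young and middle-aged scientists, and the Qingdao Science and Technology Development Program.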