Underwater acoustic target recognition has important applications in ship navigation safety, marine animal protection, and intelligent underwater situational awareness. To address the insufficient accuracy of existing recognition methods and the difficulty of deploying them on embedded modules, this paper proposes a lightweight underwater acoustic target recognition model improved with delta feature fusion and an attention mechanism. The model adopts a CNN-Transformer hybrid architecture. Mel-frequency cepstral coefficients (MFCCs) extracted from the raw audio signal, together with their first-order (delta) and second-order (delta-delta) differences, form a three-channel fused feature that serves as the model input. The Shuffle Attention module is introduced to improve the MobileViT model, and transfer learning is used to optimize the model weights. Experimental results show that, on the public ShipsEar dataset, the proposed method achieves a test accuracy of 98.49% with only 952.79 K parameters, outperforming the other methods compared on both measures.
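The three-channel input described in the abstract can be illustrated with a minimal sketch. It assumes the MFCC matrix has already been computed (one list of coefficients per frame) and uses the common regression formula for delta features with repeated edge frames. The abstract does not state how the deltas are computed, so the function names `delta` and `three_channel_features` and the window parameter `width` are hypothetical choices for illustration.

```python
def delta(frames, width=2):
    """Regression-based time derivative of a feature matrix.

    frames: list of frames, each a list of coefficients (e.g. MFCCs).
    width:  number of frames on each side used in the regression
            (assumed value; not taken from the paper).
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                # Frames beyond either end are replaced by the edge frame.
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def three_channel_features(mfcc, width=2):
    """Stack MFCC, delta and delta-delta as three channels."""
    d1 = delta(mfcc, width)   # first-order difference
    d2 = delta(d1, width)     # second-order difference
    return [mfcc, d1, d2]     # shape: 3 x n_frames x n_coefficients


# Usage: a toy "MFCC" matrix whose coefficients grow linearly over time.
mfcc = [[float(t), float(t)] for t in range(6)]
features = three_channel_features(mfcc)
```

In practice the MFCCs and deltas would more likely come from an audio library; librosa, for example, provides `librosa.feature.mfcc` and `librosa.feature.delta`, although its delta uses a different (Savitzky-Golay) filter. The resulting three-channel array has the same layout as an RGB image, which is presumably what allows an image backbone such as MobileViT to consume it.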
2025, 47(16): 122-127. Received: 2024-10-08
DOI:10.3404/j.issn.1672-7649.2025.16.019
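CLC number: TB56
Funding: National Key R&D Program of China, special topic (2023YFC3011704-2); General Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KM202210017006); Scientific Research Program of the Beijing Municipal Education Commission (KM202410017006); Natural Science Foundation of Ningxia (2022AAC03757)
About the author: LI Jing (李晶; born 1987), male, Ph.D., associate professor. Research interests: underwater attack and defense in the ocean, and diversified target recognition.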