ARTICLE

Volume 2, Issue 1

Fall 2025

14 January 2025

Research and Design of an AI Screening Model for Depression Based on Speech Features

宵 张¹, 雪俊 白¹, 琳 唐¹, 乐伊 张¹, 雪 苏¹, 金社 王¹, 宇星 沈², 彦华 陈³
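1 Ningxia Medical University, China
2 Chengdu Haomai Technology Co., Ltd. (成都好麦科技有限公司), China
3 Mental Health Center, General Hospital of Ningxia Medical University, China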
TACS 2025, 2(1), 120–121; https://doi.org/10.61369/TACS.2025010025
© 2025 by the Author. Licensee Art and Design, USA. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
Abstract

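To address the low screening efficiency of traditional depression rating scales [2], this study proposes an automatic screening model based on speech features. Speech samples were collected from 200 clinical patients and healthy individuals. After preprocessing and feature extraction, a deep learning model was built that combines LSTM temporal modeling with an attention mechanism. In testing, the model reached an accuracy of 84.62% and an F1 score of 0.86, and it outperformed traditional scales in efficiency and consistency.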

Keywords
Depression
Speech features
LSTM-Attention mechanism
Deep learning
Mental health screening
References

[1] World Health Organization. Depression and other common mental disorders: Global health estimates [R]. Switzerland: World Health Organization, 2022.

[2] World Health Organization. The ICD-10 classification of mental and behavioural disorders: Clinical descriptions and diagnostic guidelines[M]. Geneva: WHO, 1992.

[3] Kroenke K, Spitzer R L. The PHQ-9: A new depression diagnostic and severity measure[J]. Psychiatric Annals, 2002, 32(9): 509-515.

[4] Hamilton M. A rating scale for depression[J]. Journal of Neurology, Neurosurgery & Psychiatry, 1960, 23(1): 56-62.

[5] Donoho D L, Johnstone I M. Ideal spatial adaptation by wavelet shrinkage[J]. Biometrika, 1994, 81(3): 425-455.

[6] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780.

[7] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate[J]. arXiv preprint arXiv:1409.0473, 2014.

[8] Cooley J W, Tukey J W. An algorithm for the machine calculation of complex Fourier series[J]. Mathematics of Computation, 1965, 19(90): 297-301.

[9] Kingma D P, Ba J. Adam: A method for stochastic optimization[J]. arXiv preprint arXiv:1412.6980, 2014.

[10] Vaswani A, et al. Attention is all you need[C]. Advances in Neural Information Processing Systems, 2017: 5998-6008.

[11] Cummins N, et al. A review of depression and suicide risk assessment using speech analysis[J]. Speech Communication, 2015, 71: 10-49.

[12] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 770-778.

[13] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[J/OL]. arXiv:1810.04805, 2018.

[14] Valstar M, Schuller B, Smith K, et al. AVEC 2016: Depression, mood, and emotion recognition workshop and challenge[C]//Proceedings of the 6th International Workshop on Audio/Visual Emotion Challenge. Amsterdam, Netherlands: ACM, 2016: 3-10.

[15] Li X, Pang T, Liu Y, et al. Multimodal fusion for mental health assessment[J]. IEEE Transactions on Affective Computing, 2021, 12(3): 582-595.
