Multimedia Research, ISSN: 2582-547X

Hybrid Particle Swarm Optimization-Deep Neural Network Model for Speaker Recognition

Volume 3 | Issue 1 | January 2020

Abstract

Speaker recognition is an active research topic. Voice biometrics, derived from a speaker's behavioral or physical characteristics, yield data patterns that contain sensitive information about the speaker. The performance of speaker recognition systems degrades rapidly under mismatch conditions such as noise and channel degradation. To provide secure and effective recognition, a Hybrid Particle Swarm Optimization-based Deep Neural Network (Hybrid PSO-based DNN) classifier is used to identify a speaker, exploiting frequency-dependent features such as MKMFCC, autocorrelation, and spectral skewness. The DNN classifier operates on the extracted features, and the classifier is optimized using the proposed Particle Swarm Optimization. Finally, the proposed technique is compared in simulation with LM, SVM, GMM, and BSW. The results show that the proposed technique outperforms the conventional techniques in terms of accuracy, FAR, and FRR.
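The abstract does not state the network architecture, the PSO variant, or any hyperparameters, so the following is only a minimal sketch of the general idea: a particle swarm searches the weight space of a small neural network, and the resulting scores are evaluated with FAR and FRR. The one-hidden-layer network, the mean-squared-error cost, the names `forward`, `loss`, `pso_train`, and `far_frr`, and all default values (inertia, `c1`, `c2`, swarm size) are assumptions made for illustration, not the authors' settings. Input rows stand in for per-utterance feature vectors.

```python
import numpy as np


def forward(weights, X, n_hidden):
    """One-hidden-layer network; `weights` is a flat parameter vector."""
    n_in = X.shape[1]
    w1_end = n_in * n_hidden
    W1 = weights[:w1_end].reshape(n_in, n_hidden)
    b1 = weights[w1_end:w1_end + n_hidden]
    W2 = weights[w1_end + n_hidden:w1_end + 2 * n_hidden]
    b2 = weights[-1]
    hidden = np.tanh(X @ W1 + b1)
    z = np.clip(hidden @ W2 + b2, -50.0, 50.0)  # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-z))


def loss(weights, X, y, n_hidden):
    """Mean squared error between network scores and 0/1 labels."""
    return float(np.mean((forward(weights, X, n_hidden) - y) ** 2))


def pso_train(X, y, n_hidden=4, n_particles=30, n_iters=100,
              inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard global-best PSO over the flat weight vector (assumed variant)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + 2 * n_hidden + 1
    pos = rng.normal(0.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([loss(p, X, y, n_hidden) for p in pos])
    g = int(np.argmin(pbest_cost))
    gbest, gbest_cost = pbest[g].copy(), float(pbest_cost[g])
    history = [gbest_cost]
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal and global bests.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([loss(p, X, y, n_hidden) for p in pos])
        improved = cost < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        g = int(np.argmin(pbest_cost))
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), float(pbest_cost[g])
        history.append(gbest_cost)
    return gbest, history


def far_frr(scores, is_target, threshold=0.5):
    """False acceptance rate and false rejection rate at a score threshold."""
    scores = np.asarray(scores, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    accepted = scores >= threshold
    far = float(np.mean(accepted[~is_target]))   # impostor trials accepted
    frr = float(np.mean(~accepted[is_target]))   # target trials rejected
    return far, frr
```

Calling `pso_train(X, y)` with a feature matrix `X` and 0/1 labels `y` returns the best weight vector found and the best cost per iteration; passing `forward(best, X, n_hidden)` and the labels to `far_frr` gives the two error rates at the chosen threshold. A gradient-free search like this needs only cost evaluations, which is one common reason PSO is paired with neural networks.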


DOI: https://doi.org/10.46253/j.mr.v3i1.a1


Keywords: Speaker Recognition; Feature Extraction; Classification; DNN; PSO.

Publisher: Resbee Info Technologies Pvt Ltd