Md Sahidullah (মুহাম্মদ শহীদুল্লাহ)

Md Sahidullah
Assistant Professor, Institute for Advancing Intelligence, TCG CREST, India
Phone: +91-9433289799
Email: md.sahidullah@tcgcrest.org OR sahidullahmd@gmail.com

About me

I am a researcher with more than eight years of post-qualification experience in speech processing and a general interest in speech science and technology. My activities include writing monographs, mentoring doctoral students, creating teaching materials for post-graduate courses, reviewing research articles, and promoting scientific awareness.

Specific areas of interest

Signal processing and machine learning for acoustic pattern analysis

Speaker recognition and spoofing countermeasures

Speech representation learning

Audio dataset collection, analysis and design of experiments

Professional experience

Assistant Professor, Institute for Advancing Intelligence, TCG CREST, India (June 2023 -- Present)

Independent Researcher & Consultant (September 2021 -- May 2023)

Researcher, Inria, France (September 2018 -- August 2021)

Post-doctoral Researcher, Inria, France (January 2018 -- August 2018)

Visiting Scientist, Inria, France (August 2017 -- November 2017)

Post-doctoral Researcher, University of Eastern Finland, Finland (April 2014 -- April 2017)

Programmer Analyst Trainee, Cognizant Technology Solutions Corporation, India (May 2006 -- January 2008)

Education

Ph.D. in Speech Processing from Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur (2015)
Thesis title: Enhancement of Speaker Recognition Performance Using Block Level, Relative and Temporal Information of Subband Energies

M.E. in Computer Science and Engineering from West Bengal University of Technology (2006)

B.E. (Hons.) in Electronics and Communication Engineering from Vidyasagar University (2004)

Publications

Refereed Journal papers

X. Liu, M. Sahidullah, T. Kinnunen; "Optimizing multi-taper features for deep speaker verification"; IEEE Signal Processing Letters, Vol 28, Page 2187-2191, 2021.

N. Sen, M. Sahidullah, H. Patil, S.K.D. Mandal, K.S. Rao, T.K. Basu; "Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework"; International Journal of Speech Technology, Vol 24, Issue December 2021, Page 1067-1088, 2021.

A. Nautsch, X. Wang, N. Evans, T.H. Kinnunen, V. Vestman, M. Todisco, H. Delgado, M. Sahidullah, J. Yamagishi, K.A. Lee; "ASVspoof 2019: spoofing countermeasures for the detection of synthesized, converted and replayed speech"; IEEE Transactions on Biometrics, Behavior, and Identity Science, Vol 3, Issue 2, Page 252-265, 2021.

A.K. Kumar, D. Paul, M. Pal, M. Sahidullah, G. Saha; "Speech frame selection for spoofing detection with an application to partially spoofed audio-data"; International Journal of Speech Technology, Vol 24, Issue March 2021, Page 193-203, 2021.

X. Wang, J. Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K.A. Lee, L. Juvela, P. Alku, Y.H. Peng, H.T. Hwang, Y. Tsao, H.M. Wang, S.L. Maguer, M. Becker, F. Henderson, R. Clark, Y. Zhang, Q. Wang, Y. Jia, K. Onuma, K. Mushika, T. Kaneda, Y. Jiang, L.J. Liu, Y.C. Wu, W.C. Huang, T. Toda, K. Tanaka, H. Kameoka, I. Steiner, D. Matrouf, J.F. Bonastre, A. Govender, S. Ronanki, J.X. Zhang, Z.H. Ling; "ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech"; Computer Speech & Language, Vol 65, Issue November 2020, 101114, 2020.

S. Sarangi, M. Sahidullah, G. Saha; "Optimization of data-driven filterbank for automatic speaker verification"; Digital Signal Processing, Vol 104, Issue September 2020, 102795, 2020.

T. Kinnunen, H. Delgado, N. Evans, K.A. Lee, V. Vestman, A. Nautsch, M. Todisco, X. Wang, M. Sahidullah, J. Yamagishi, D.A. Reynolds; "Tandem assessment of spoofing countermeasures and automatic speaker verification: Fundamentals"; IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol 28, Page 2195-2210, 2020.

V. Vestman, T. Kinnunen, R.G. Hautamäki, M. Sahidullah; "Voice mimicry attacks assisted by automatic speaker verification"; Computer Speech & Language, Vol 59, Issue January 2020, Page 36-54, 2020.

A. Poddar, M. Sahidullah, G. Saha; "Quality measures for speaker verification with short utterances"; Digital Signal Processing, Vol 88, Issue May 2019, Page 66-79, 2019.

A. Poddar, M. Sahidullah, G. Saha; "Improved i-vector extraction technique for speaker verification with short utterances"; International Journal of Speech Technology, Vol 21, Issue 3, Page 473-488, 2018.

V. Vestman, D. Gowda, M. Sahidullah, P. Alku, T. Kinnunen; "Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction"; Speech Communication, Vol 99, Issue May 2018, Page 62-79, 2018.

A. Poddar, M. Sahidullah, G. Saha; "Speaker verification with short utterances: a review of challenges, trends and opportunities"; IET Biometrics, Vol 7, Issue 2, Page 91-101, 2018.

A. Sholokhov, M. Sahidullah, T. Kinnunen; "Semi-supervised speech activity detection with an application to automatic speaker verification"; Computer Speech & Language, Vol 47, Issue January 2018, Page 132-156, 2018.

M. Sahidullah, D.A.L. Thomsen, R.G. Hautamäki, T. Kinnunen, Z.H. Tan, R. Parts, and M. Pitkänen; "Robust voice liveness detection and speaker verification using throat microphones"; IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol 26, No 1, Page 44-56, 2017.

R.G. Hautamäki, M. Sahidullah, V. Hautamäki, T. Kinnunen; "Acoustical and perceptual study of voice disguise by age modification in speaker verification"; Speech Communication, Vol 95, Issue December 2017, Page 1-15, 2017.

Z. Wu, J. Yamagishi, T. Kinnunen, C. Hanilci, M. Sahidullah, A. Sizov, N. Evans, M. Todisco, H. Delgado; "ASVspoof: The automatic speaker verification spoofing and countermeasures challenge"; IEEE Journal of Selected Topics in Signal Processing, Vol 11, Issue 4, Page 588-604, 2017.

C. Hanilci, T. Kinnunen, M. Sahidullah, and A. Sizov; "Spoofing detection goes noisy: An analysis of synthetic speech detection in the presence of additive noise"; Speech Communication, Vol 85, December 2016, Page 83-97.

N. Sengupta, M. Sahidullah and G. Saha; "Lung sound classification using cepstral-based statistical features"; Computers in Biology and Medicine, Vol 75, No 1, Page 118-129, 2016.

M. Sahidullah and T. Kinnunen; "Local spectral variability features for speaker verification"; Digital Signal Processing, Vol 50, Page 1-11, March 2016.

M. Sahidullah and G. Saha; "A novel windowing technique for efficient computation of MFCC for speaker recognition"; IEEE Signal Processing Letters, Vol 20, No 2, Page 149-152, 2013.

M. Sahidullah and G. Saha; "Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition"; Speech Communication, Vol 54, No 4, Page 543-565, 2012.

M. Sahidullah, S. Chakroborty and G. Saha; "On the use of perceptual line spectral pairs frequencies and higher-order residual moments for speaker identification"; International Journal of Biometrics, Vol 2, No 4, Page 358-378, 2010.

Refereed Conference papers

D. Paul, M. Sahidullah, G. Saha; "Generalization of Spoofing Countermeasures: A Case Study with ASVspoof 2015 and BTAS 2016 Corpora"; Proc. IEEE ICASSP 2017, pp. 2047-2051, New Orleans, USA, March 2017. (PDF)
T. Kinnunen, M. Sahidullah, M. Falcone, L. Costantini, R.G. Hautamäki, D. Thomsen, A. Sarkar, Z.-H. Tan, H. Delgado, M. Todisco, N. Evans, V. Hautamäki and K.A. Lee; "RedDots Replayed: A New Replay Spoofing Attack Corpus for Text-dependent Speaker Verification Research"; Proc. IEEE ICASSP 2017, pp. 5395-5399, New Orleans, USA, March 2017. (PDF)
A. Kanervisto, V. Vestman, M. Sahidullah, V. Hautamäki, T. Kinnunen; "Effects of Gender Information in Text-independent and Text-dependent Speaker Verification"; Proc. IEEE ICASSP 2017, pp. 5360-5364, New Orleans, USA, March 2017. (PDF)
H. Delgado, M. Todisco, M. Sahidullah, A. Sarkar, N. Evans, T. Kinnunen, and Z.-H. Tan; "Further optimisations of constant Q cepstral processing for integrated utterance verification and text-dependent speaker verification"; Proc. IEEE Workshop on Spoken Language Technology 2016 (SLT 2016), pp. 179-185, San Diego, USA, December 2016. (PDF)
M. Sahidullah, H. Delgado, M. Todisco, H. Yu, T. Kinnunen, N. Evans and Z.-H. Tan; "Integrated Spoofing Countermeasures and Automatic Speaker Verification: an Evaluation on ASVspoof 2015"; Interspeech 2016, San Francisco, USA. (PDF)
M. Sahidullah, R.G. Hautamäki, D.A.L. Thomsen, T. Kinnunen, Z.-H. Tan, V. Hautamäki, R. Parts and M. Pitkänen; "Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech"; Interspeech 2016, San Francisco, USA. (PDF)
T. Kinnunen, M. Sahidullah, I. Kukanov, H. Delgado, M. Todisco, A. Sarkar, N. Thomsen, V. Hautamäki, N. Evans and Z.-H. Tan; "Utterance Verification for Text-Dependent Speaker Recognition: a Comparative Assessment Using the RedDots Corpus"; Interspeech 2016, San Francisco, USA. (File Lists) (PDF)
T. Kinnunen, A. Sholokhov, E. Khoury, D.A.L. Thomsen, M. Sahidullah and Z.-H. Tan; "HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors"; Interspeech 2016, San Francisco, USA. (PDF)
P. Korshunov, S. Marcel, H. Muckenhirn, A.R. Gonçalves, A.G. Souza Mello, R.P. Velloso Violato, F.O. Simões, M.U. Neto, M. de Assis Angeloni, J.A. Stuchi, H. Dinkel, N. Chen, Y. Qian, D. Paul, G. Saha and M. Sahidullah; "Overview of BTAS 2016 Speaker Anti-spoofing Competition"; 2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS), Niagara Falls, Buffalo, New York (USA). (PDF)
R.G. Hautamäki, M. Sahidullah, T. Kinnunen and V. Hautamäki; "Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy"; Speaker Odyssey, Bilbao, Spain, 2016. (PDF)
A. Poddar, M. Sahidullah and G. Saha; "Performance Comparison of Speaker Recognition Systems in Presence of Duration Variability"; Proc. 2015 Annual IEEE India Conference (INDICON), New Delhi, India, December 2015. (PDF)
N. Sengupta, M. Sahidullah and G. Saha; "Optimization of Cepstral Features for Robust Lung Sound Classification"; Proc. 2015 Annual IEEE India Conference (INDICON), New Delhi, India, December 2015.
M. Sahidullah, T. Kinnunen and C. Hanilci; "A comparison of features for synthetic speech detection"; Proc. Interspeech 2015, pp. 2087-2091, Dresden, Germany, September 2015. (MATLAB Codes) (PDF)
C. Hanilci, T. Kinnunen, M. Sahidullah, and A. Sizov; "Classifiers for synthetic speech detection: a comparison"; Proc. Interspeech 2015, pp. 2057-2061, Dresden, Germany, September 2015. (PDF)
Z. Wu, T. Kinnunen, N. Evans, J. Yamagishi, C. Hanilci, M. Sahidullah, A. Sizov; "ASVspoof 2015: the first automatic speaker verification spoofing and countermeasures challenge"; Proc. Interspeech 2015, pp. 2037-2041, Dresden, Germany, September 2015. (PDF)
M. Sahidullah, G. Saha; "On the use of perceptual Line Spectral Pairs Frequencies for speaker identification"; Proc. National Conference on Communications (NCC2010) 2010, Chennai, India, January 2010. (PDF)
M. Sahidullah, G. Saha; "In Search of Auto Correlation Based Vocal Cord Cues for Speaker Identification"; Proc. International Conference on RF and Signal Processing Systems (RSPS2010), Vijayawada, India, January 2010.
M. Sahidullah, S. Chakroborty, G. Saha; "Improving Performance of Speaker Identification System Using Complementary Information Fusion"; Proc. 17th International Conference on Advanced Computing and Communications (ADCOM 2009), Bangalore, India, December 2009. (PDF)
M. Sahidullah, G. Saha; "On the Use of Distributed DCT in Speaker Identification"; Proc. 2009 Annual IEEE India Conference (INDICON), Ahmedabad, India, December 2009. (PDF)
J. Basu, M. Sahidullah, A. Sinha; "A New Generalized Reconfigurable Architecture for Digital Signal Processor"; Proc. 15th International Conference on Advanced Computing and Communications (ADCOM 2007), Guwahati, India, December 2007. (PDF)
J. Basu, M. Sahidullah; "A New Generalized Architecture for Digital Signal Processor"; Proc. Second International Conference on Embedded Systems, Mobile Communication and Computing (ICEMC2 2007), Bangalore, India, August 2007.

Others (Technical reports, system descriptions, and rejected manuscripts)

M. Sahidullah, G. Saha; "Comparison of speech activity detection techniques for speaker recognition"; arXiv preprint arXiv:1210.0297, 2012.

M. Pal, D. Paul, M. Sahidullah, G. Saha; "Robustness of voice conversion techniques under mismatched conditions"; arXiv preprint arXiv:1612.07523, 2016.

N. Sengupta, M. Sahidullah, G. Saha; "Lung sound classification using local binary pattern"; arXiv preprint arXiv:1710.01703, 2017.

K.A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P.M. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, J. Patino, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T.N. Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K.K. Teh, H.D. Tran, K.K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J.F. Bonastre, C. Xu, Z.H. Lim, E.S. Chng, S. Ranjan, J.H.L. Hansen, M. Todisco, N. Evans; "I4U submission to NIST SRE 2018: Leveraging from a decade of shared experiences"; arXiv preprint arXiv:1904.07386, 2019.

M. Sahidullah, J. Patino, S. Cornell, R. Yin, S. Sivasankaran, H. Bredin, P. Korshunov, A. Brutti, R. Serizel, E. Vincent, N. Evans, S. Marcel, S. Squartini, C. Barras; "The Speed submission to DIHARD II: Contributions & lessons learned"; arXiv preprint arXiv:1911.02388, 2019.

M. Sahidullah, R. Serizel, E. Vincent; "Inria-MULTISPEECH system description for VoxSRC 2019 challenge"; Proc. The first VoxCeleb Speaker Recognition Challenge, 2019.

M. Sahidullah, R. Serizel, E. Vincent; "Inria system description for NIST SRE19"; Proc. The NIST SRE19 Workshop, 2019.

A.K. Sarkar, M. Sahidullah, Z.H. Tan; "Data generation using pass-phrase-dependent deep auto-encoders for text-dependent speaker verification"; arXiv preprint arXiv:2102.02074, 2021.

A.K. Kumar, S. Waldekar, M. Sahidullah, G. Saha; "ABSP system for the third DIHARD challenge"; arXiv preprint arXiv:2102.09939, 2021.

A.K. Kumar, S. Waldekar, M. Sahidullah, G. Saha; "Domain-dependent speaker diarization for the third DIHARD challenge"; Proc. The Third DIHARD Speech Diarization Challenge Workshop, 2021.

K.A. Lee, T. Kinnunen, D. Colibro, C. Vair, A. Nautsch, H. Sun, L. He, T. Liang, Q. Wang, M. Rouvier, P.M. Bousquet, R.K. Das, I.V. Bailo, M. Liu, H. Delgado, X. Liu, M. Sahidullah, S. Cumani, B. Zhang, K. Okabe, H. Yamamoto, R. Tao, H. Li, A.O. Gimenez, L. Wang, L. Buera; "I4U system description for NIST SRE20 CTS challenge"; arXiv preprint arXiv:2211.01091, 2022.

Professional Activities

Reviewer (Selected Journals)

IEEE/ACM Transactions on Audio, Speech, and Language Processing
IEEE Transactions on Information Forensics and Security
IEEE Signal Processing Letters
Speech Communication (Elsevier)
Computer Speech & Language (Elsevier)
Digital Signal Processing (Elsevier)

Society Memberships

IEEE (Student Member: 2009-2014, Member: 2015-)
IEEE Signal Processing Society (2013-)
International Speech Communication Association (2011-)
American Mathematical Society (2014-)

Hobbies and Interests

Pages I Visit

The Writing Center at UNC-Chapel Hill
IEEE Signal Processing Society
ISCA Web
American Mathematical Society

Good Reads

You and Your Research by Richard Hamming
What's new by Terence Tao
Complete Works of Swami Vivekananda

Books I Am Reading

Invisible: The Dangerous Allure of the Unseen by Philip Ball
What If?: Serious Scientific Answers to Absurd Hypothetical Questions by Randall Munroe
Metamagical Themas: Questing For The Essence Of Mind And Pattern by Douglas Hofstadter

Copyright Notice: The PDFs of the papers are provided for academic purposes ONLY. All papers are copyrighted by their respective publishers.

Last updated: January 2023