Select Publications
Journal articles
, 2025, 'Selective State Space Model for Monaural Speech Enhancement', IEEE Transactions on Consumer Electronics, 71, pp. 5414 - 5424, http://dx.doi.org/10.1109/TCE.2024.3523297
, 2025, 'Mamba in Speech: Towards an Alternative to Self-Attention', IEEE Transactions on Audio, Speech and Language Processing, 33, pp. 1933 - 1948, http://dx.doi.org/10.1109/TASLPRO.2025.3566210
, 2023, 'Twin-S: a digital twin for skull base surgery', International Journal of Computer Assisted Radiology and Surgery, 18, pp. 1077 - 1084, http://dx.doi.org/10.1007/s11548-023-02863-9
Conference Papers
, 2025, 'Multi-Class Dementia Detection Using Acoustic Features - ICASSP-2025 PROCESS Challenge', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10889847
, 2025, 'Rethinking Mamba in Speech Processing by Self-Supervised Models', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10889111
, 2024, 'Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to fMRI Response in the Visual Cortex', in 2024 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, IEEE, Seoul, South Korea, pp. 1851 - 1855, presented at 49th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Seoul, South Korea, 14 April 2024 - 19 April 2024, http://dx.doi.org/10.1109/ICASSP48485.2024.10446366
, 2024, 'Binaural Selective Attention Model for Target Speaker Extraction', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4323 - 4327, http://dx.doi.org/10.21437/Interspeech.2024-683
, 2024, 'Enhancing Code-Switching Speech Recognition with Interactive Language Biases', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 10886 - 10890, http://dx.doi.org/10.1109/ICASSP48485.2024.10448335
, 2024, 'Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model', in EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 159 - 171, http://dx.doi.org/10.18653/v1/2024.emnlp-main.9
, 2024, 'Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy', in TeachNLP 2024 - 6th Workshop on Teaching NLP, Proceedings of the Workshop, pp. 23 - 32
, 2024, 'When LLMs Meet Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection', in EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 146 - 158, http://dx.doi.org/10.18653/v1/2024.emnlp-main.8
, 2023, 'MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization', in INTERSPEECH 2023, International Speech Communication Association (ISCA), Dublin, Ireland, pp. 4109 - 4113, presented at Interspeech Conference, Dublin, Ireland, 20 August 2023 - 24 August 2023, http://dx.doi.org/10.21437/Interspeech.2023-1446
, 2023, 'A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095885
, 2023, 'PQLM - Multilingual Decentralized Portable Quantum Language Model', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095215
Preprints
, 2025, Distinctive Feature Codec: Adaptive Segmentation for Efficient Speech Representation, http://arxiv.org/abs/2505.18516v1
, 2025, Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection, http://arxiv.org/abs/2503.06620v1
, 2025, SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information, http://arxiv.org/abs/2502.10950v2
, 2024, Auto-Landmark: Acoustic Landmark Dataset and Open-Source Toolkit for Landmark Extraction, http://arxiv.org/abs/2409.07969v2
, 2024, Rethinking Mamba in Speech Processing by Self-Supervised Models, http://arxiv.org/abs/2409.07273v1
, 2024, Mamba in Speech: Towards an Alternative to Self-Attention, http://arxiv.org/abs/2405.12609v6
, 2024, Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy, http://arxiv.org/abs/2405.09854v2
, 2024, When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection, http://arxiv.org/abs/2402.13276v2
, 2024, Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model, http://arxiv.org/abs/2402.10642v2
, 2022, PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection, http://arxiv.org/abs/2210.03221v5
, 2022, End-to-End Lyrics Recognition with Self-supervised Learning, http://arxiv.org/abs/2209.12702v4