ORCID as entered in ROS

Select Publications
2025, 'Selective State Space Model for Monaural Speech Enhancement', IEEE Transactions on Consumer Electronics, http://dx.doi.org/10.1109/TCE.2024.3523297
,2025, 'Mamba in Speech: Towards an Alternative to Self-Attention', IEEE Transactions on Audio, Speech and Language Processing, 33, pp. 1933 - 1948, http://dx.doi.org/10.1109/taslpro.2025.3566210
,2023, 'Twin-S: a digital twin for skull base surgery.', Int J Comput Assist Radiol Surg, 18, pp. 1077 - 1084, http://dx.doi.org/10.1007/s11548-023-02863-9
,2025, 'Rethinking Mamba in Speech Processing by Self-Supervised Models', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10889111
,2024, 'Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to FMRI Response in the Visual Cortex', in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1851 - 1855, presented at ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 14 April 2024 - 19 April 2024, http://dx.doi.org/10.1109/icassp48485.2024.10446366
,2024, 'Binaural Selective Attention Model for Target Speaker Extraction', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4323 - 4327, http://dx.doi.org/10.21437/Interspeech.2024-683
,2024, 'ENHANCING CODE-SWITCHING SPEECH RECOGNITION WITH INTERACTIVE LANGUAGE BIASES', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 10886 - 10890, http://dx.doi.org/10.1109/ICASSP48485.2024.10448335
,2024, 'Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model', in Emnlp 2024 2024 Conference on Empirical Methods in Natural Language Processing Proceedings of the Conference, pp. 159 - 171, http://dx.doi.org/10.18653/v1/2024.emnlp-main.9
,2024, 'Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy', in Teachnlp 2024 6th Workshop on Teaching Nlp Proceedings of the Workshop, pp. 23 - 32
,2024, 'When LLMs Meet Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection', in Emnlp 2024 2024 Conference on Empirical Methods in Natural Language Processing Proceedings of the Conference, pp. 146 - 158, http://dx.doi.org/10.18653/v1/2024.emnlp-main.8
,2023, 'MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization', in INTERSPEECH 2023, ISCA, pp. 4109 - 4113, presented at INTERSPEECH 2023, http://dx.doi.org/10.21437/interspeech.2023-1446
,2023, 'A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095885
,2023, 'PQLM - Multilingual Decentralized Portable Quantum Language Model', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095215
,2025, Distinctive Feature Codec: Adaptive Segmentation for Efficient Speech Representation, http://arxiv.org/abs/2505.18516v1
,2025, Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection, http://arxiv.org/abs/2503.06620v1
,2025, SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information, http://arxiv.org/abs/2502.10950v2
,2024, Auto-Landmark: Acoustic Landmark Dataset and Open-Source Toolkit for Landmark Extraction, http://arxiv.org/abs/2409.07969v2
,2024, Rethinking Mamba in Speech Processing by Self-Supervised Models, http://arxiv.org/abs/2409.07273v1
,2024, Mamba in Speech: Towards an Alternative to Self-Attention, http://arxiv.org/abs/2405.12609v6
,2024, Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy, http://arxiv.org/abs/2405.09854v2
,2024, When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection, http://arxiv.org/abs/2402.13276v2
,2024, Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model, http://arxiv.org/abs/2402.10642v2
,2022, PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection, http://arxiv.org/abs/2210.03221v5
,2022, End-to-End Lyrics Recognition with Self-supervised Learning, http://arxiv.org/abs/2209.12702v4
,