Select Publications

Journal Articles

Chen M; Zhang Q; Wang M; Zhang X; Liu H; Ambikairajah E; Chen D, 2025, 'Selective State Space Model for Monaural Speech Enhancement', IEEE Transactions on Consumer Electronics, http://dx.doi.org/10.1109/TCE.2024.3523297

Zhang X; Zhang Q; Liu H; Xiao T; Qian X; Ahmed B; Ambikairajah E; Li H; Epps J, 2025, 'Mamba in Speech: Towards an Alternative to Self-Attention', IEEE Transactions on Audio, Speech and Language Processing, 33, pp. 1933 - 1948, http://dx.doi.org/10.1109/taslpro.2025.3566210

Shu H; Liang R; Li Z; Goodridge A; Zhang X; Ding H; Nagururu N; Sahu M; Creighton FX; Taylor RH; Munawar A; Unberath M, 2023, 'Twin-S: a digital twin for skull base surgery', International Journal of Computer Assisted Radiology and Surgery, 18, pp. 1077 - 1084, http://dx.doi.org/10.1007/s11548-023-02863-9

Conference Papers

Zhang X; Ma J; Shahin M; Ahmed B; Epps J, 2025, 'Rethinking Mamba in Speech Processing by Self-Supervised Models', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10889111

Liang R; Zhang X; Li Q; Wei L; Liu H; Kumar A; Kempski Leadingham KM; Punnoose J; Garcia LP; Manbachi A, 2024, 'Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to fMRI Response in the Visual Cortex', in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1851 - 1855, presented at ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 14 April 2024 - 19 April 2024, http://dx.doi.org/10.1109/icassp48485.2024.10446366

Meng H; Zhang Q; Zhang X; Sethu V; Ambikairajah E, 2024, 'Binaural Selective Attention Model for Target Speaker Extraction', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4323 - 4327, http://dx.doi.org/10.21437/Interspeech.2024-683

Liu H; Garcia LP; Zhang X; Khong AWH; Khudanpur S, 2024, 'Enhancing Code-Switching Speech Recognition with Interactive Language Biases', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 10886 - 10890, http://dx.doi.org/10.1109/ICASSP48485.2024.10448335

Zhang X; Liu D; Liu H; Zhang Q; Meng H; Garcia LP; Chng ES; Yao L, 2024, 'Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model', in EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 159 - 171, http://dx.doi.org/10.18653/v1/2024.emnlp-main.9

Joshi A; Renzella J; Bhattacharyya P; Jha S; Zhang X, 2024, 'Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy', in TeachNLP 2024 - 6th Workshop on Teaching NLP, Proceedings of the Workshop, pp. 23 - 32

Zhang X; Liu H; Xu K; Zhang Q; Liu D; Ahmed B; Epps J, 2024, 'When LLMs Meet Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection', in EMNLP 2024 - 2024 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference, pp. 146 - 158, http://dx.doi.org/10.18653/v1/2024.emnlp-main.8

Chua VYH; Liu H; Garcia LP; Woon FT; Wong J; Zhang X; Khudanpur S; Khong AWH; Dauwels J; Styles SJ, 2023, 'MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization', in INTERSPEECH 2023, ISCA, pp. 4109 - 4113, presented at INTERSPEECH 2023, http://dx.doi.org/10.21437/interspeech.2023-1446

Xuan Y; Zhang X; Li SS; Shen Z; Xie X; Garcia LP; Togneri R, 2023, 'A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095885

Li SS; Zhang X; Zhou S; Shu H; Liang R; Liu H; Garcia LP, 2023, 'PQLM - Multilingual Decentralized Portable Quantum Language Model', in ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp. 1 - 5, presented at ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 04 June 2023 - 10 June 2023, http://dx.doi.org/10.1109/icassp49357.2023.10095215

Preprints

Zhang X; Fang F; Gao P; Qin B; Ahmed B; Epps J, 2025, Distinctive Feature Codec: Adaptive Segmentation for Efficient Speech Representation, http://arxiv.org/abs/2505.18516v1

Zhang X; Ahmed B; Epps J, 2025, Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection, http://arxiv.org/abs/2503.06620v1

Zhang X; Liu H; Zhang Q; Ahmed B; Epps J, 2025, SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information, http://arxiv.org/abs/2502.10950v2

Zhang X; Liu D; Xiao T; Xiao C; Szalay T; Shahin M; Ahmed B; Epps J, 2024, Auto-Landmark: Acoustic Landmark Dataset and Open-Source Toolkit for Landmark Extraction, http://arxiv.org/abs/2409.07969v2

Zhang X; Ma J; Shahin M; Ahmed B; Epps J, 2024, Rethinking Mamba in Speech Processing by Self-Supervised Models, http://arxiv.org/abs/2409.07273v1

Zhang X; Zhang Q; Liu H; Xiao T; Qian X; Ahmed B; Ambikairajah E; Li H; Epps J, 2024, Mamba in Speech: Towards an Alternative to Self-Attention, http://arxiv.org/abs/2405.12609v6

Joshi A; Renzella J; Bhattacharyya P; Jha S; Zhang X, 2024, Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy, http://arxiv.org/abs/2405.09854v2

Zhang X; Liu H; Xu K; Zhang Q; Liu D; Ahmed B; Epps J, 2024, When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection, http://arxiv.org/abs/2402.13276v2

Zhang X; Liu D; Liu H; Zhang Q; Meng H; Garcia LP; Chng ES; Yao L, 2024, Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model, http://arxiv.org/abs/2402.10642v2

Li SS; Zhang X; Zhou S; Shu H; Liang R; Liu H; Garcia LP, 2022, PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection, http://arxiv.org/abs/2210.03221v5

Zhang X; Li SS; He Z; Togneri R; Garcia LP, 2022, End-to-End Lyrics Recognition with Self-supervised Learning, http://arxiv.org/abs/2209.12702v4

