ORCID as entered in ROS

Select Publications
2025, Distinctive Feature Codec: Adaptive Segmentation for Efficient Speech Representation, http://arxiv.org/abs/2505.18516v1
,2025, Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection, http://arxiv.org/abs/2503.06620v1
,2025, SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information, http://arxiv.org/abs/2502.10950v2
,2024, Auto-Landmark: Acoustic Landmark Dataset and Open-Source Toolkit for Landmark Extraction, http://arxiv.org/abs/2409.07969v2
,2024, Rethinking Mamba in Speech Processing by Self-Supervised Models, http://arxiv.org/abs/2409.07273v1
,2024, Mamba in Speech: Towards an Alternative to Self-Attention, http://arxiv.org/abs/2405.12609v6
,2024, Striking a Balance between Classical and Deep Learning Approaches in Natural Language Processing Pedagogy, http://arxiv.org/abs/2405.09854v2
,2024, When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection, http://arxiv.org/abs/2402.13276v2
,2024, Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model, http://arxiv.org/abs/2402.10642v2
,2022, PQLM -- Multilingual Decentralized Portable Quantum Language Model for Privacy Protection, http://arxiv.org/abs/2210.03221v5
,2022, End-to-End Lyrics Recognition with Self-supervised Learning, http://arxiv.org/abs/2209.12702v4
,