Select Publications

By Associate Professor Vidhyasaharan Sethu

Conference Papers

Hong X; Gong Y; Sethu V; Dang T, 2025, 'AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10888198

Meng H; Breebaart J; Stoddard J; Sethu V; Ambikairajah E, 2025, 'Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10887842

Jing M; Sethu V; Ahmed B, 2025, 'Evidential Neural GPLDA: A Novel Approach to Quantify Prediction Uncertainty in Speaker Verification Systems', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10887887

Jing M; Sethu V; Ahmed B, 2025, 'Improved Out-of-domain Detection in VAE Latent Spaces with Boundary-driven Regularisation', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49660.2025.10890806

Hong X; Gong Y; Sethu V; Dang T, 2025, 'AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models', in ICASSP, IEEE, pp. 1 - 5, https://doi.org/10.1109/ICASSP49660.2025

Meng H; Breebaart J; Stoddard J; Sethu V; Ambikairajah E, 2025, 'Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features', in ICASSP, IEEE, pp. 1 - 5, https://doi.org/10.1109/ICASSP49660.2025

Jing M; Sethu V; Ahmed B, 2024, 'A Probability Gradient Based Approach for Sampling Boundaries of In-Domain Data', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 5340 - 5344, http://dx.doi.org/10.1109/ICASSP48485.2024.10445872

Ambikairajah E; Thiruvaran T; Sethu V; Mishra D; Sirojan T, 2024, 'A Tiered Learning Framework for Self-Guided Engineering Design Education', in IEEE Global Engineering Education Conference, EDUCON, http://dx.doi.org/10.1109/EDUCON60312.2024.10578840

Ambikairajah E; Sirojan T; Sethu V; Mishra D, 2024, 'Aligning Tiered Assessments With Course Learning Outcomes', in 2024 IEEE International Conference on Teaching, Assessment and Learning for Engineering, TALE 2024 Proceedings, http://dx.doi.org/10.1109/TALE62452.2024.10834314

Meng H; Zhang Q; Zhang X; Sethu V; Ambikairajah E, 2024, 'Binaural Selective Attention Model for Target Speaker Extraction', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4323 - 4327, http://dx.doi.org/10.21437/Interspeech.2024-683

Wu YT; Wu J; Sethu V; Lee CC, 2024, 'Can Modelling Inter-Rater Ambiguity Lead To Noise-Robust Continuous Emotion Predictions?', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 3714 - 3718, http://dx.doi.org/10.21437/Interspeech.2024-482

Ambikairajah E; Sirojan T; Thiruvaran T; Sethu V, 2024, 'ChatGPT in the Classroom: A Shift in Engineering Design Education', in IEEE Global Engineering Education Conference, EDUCON, http://dx.doi.org/10.1109/EDUCON60312.2024.10578884

Wu J; Dang T; Sethu V; Ambikairajah E, 2024, 'Emotion Recognition Systems Must Embrace Ambiguity', in Proceedings - 2024 12th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos, ACIIW 2024, pp. 166 - 170, http://dx.doi.org/10.1109/ACIIW63320.2024.00033

Nan Z; Dang T; Sethu V; Ahmed B, 2024, 'Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 6495 - 6499, http://dx.doi.org/10.1109/ICASSP48485.2024.10447530

Meng H; Zhang Q; Zhang X; Sethu V; Ambikairajah E, 2024, 'Binaural Selective Attention Model for Target Speaker Extraction', in Lapidot I; Gannot S (eds.), INTERSPEECH, ISCA, https://doi.org/10.21437/Interspeech.2024

Wu J; Dang T; Sethu V; Ambikairajah E, 2024, 'Dual-Constrained Dynamical Neural ODEs for Ambiguity-aware Continuous Emotion Prediction', in Lapidot I; Gannot S (eds.), INTERSPEECH, ISCA, https://doi.org/10.21437/Interspeech.2024

Nan Z; Dang T; Sethu V; Ahmed B, 2024, 'Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling', in ICASSP, IEEE, pp. 6495 - 6499, https://doi.org/10.1109/ICASSP48485.2024

Wu J; Dang T; Sethu V; Ambikairajah E, 2023, 'Belief Mismatch Coefficient (BMC): A Novel Interpretable Measure of Prediction Accuracy for Ambiguous Emotion States', in 2023 11th International Conference on Affective Computing and Intelligent Interaction, ACII 2023, http://dx.doi.org/10.1109/ACII59096.2023.10388210

Dang T; Dimitriadis A; Wu J; Sethu V; Ambikairajah E, 2023, 'Constrained Dynamical Neural ODE for Time Series Modelling: A Case Study on Continuous Emotion Prediction', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, http://dx.doi.org/10.1109/ICASSP49357.2023.10095778

Wu J; Dang T; Sethu V; Ambikairajah E, 2023, 'From Interval to Ordinal: A HMM based Approach for Emotion Label Conversion', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 1843 - 1847, http://dx.doi.org/10.21437/Interspeech.2023-2213

Shahin M; Nan Z; Sethu V; Ahmed B, 2023, 'Improving wav2vec2-based Spoken Language Identification by Learning Phonological Features', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4119 - 4123, http://dx.doi.org/10.21437/Interspeech.2023-2533

Meng H; Sethu V; Ambikairajah E, 2023, 'What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 2898 - 2902, http://dx.doi.org/10.21437/Interspeech.2023-1617

Meng H; Sethu V; Ambikairajah E, 2023, 'What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions', in Harte N; Carson-Berndsen J; Jones G (eds.), INTERSPEECH, ISCA, pp. 2898 - 2902, https://doi.org/10.21437/Interspeech.2023

Wu J; Dang T; Sethu V; Ambikairajah E, 2022, 'A Novel Sequential Monte Carlo Framework for Predicting Ambiguous Emotion States', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 8567 - 8571, http://dx.doi.org/10.1109/ICASSP43922.2022.9746350

Ahmed B; Ballard K; Burnham D; Sirojan T; Mehmood H; Estival D; Baker E; Cox F; Arciuli J; Benders T; Demuth K; Kelly B; Diskin-Holdaway C; Shahin M; Sethu V; Epps J; Lee CB; Ambikairajah E, 2021, 'AusKidTalk: An auditory-visual corpus of 3- to 12-year-old Australian children's speech', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 4351 - 4355, http://dx.doi.org/10.21437/Interspeech.2021-2000

Bose D; Sethu V; Ambikairajah E, 2021, 'Parametric Distributions to Model Numerical Emotion Labels', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 576 - 580, http://dx.doi.org/10.21437/Interspeech.2021-1000

Ahmed B; Ballard KJ; Burnham D; Sirojan T; Mehmood H; Estival D; Baker E; Cox F; Arciuli J; Benders T; Demuth K; Kelly B; Diskin-Holdaway C; Shahin MA; Sethu V; Epps J; Lee CB; Ambikairajah E, 2021, 'AusKidTalk: An Auditory-Visual Corpus of 3- to 12-Year-Old Australian Children's Speech', in Hermansky H; Cernocký H; Burget L; Lamel L; Scharenborg O; Motlícek P (eds.), Interspeech, ISCA, pp. 3680 - 3684, https://doi.org/10.21437/Interspeech.2021

Bose D; Sethu V; Ambikairajah E, 2021, 'Parametric Distributions to Model Numerical Emotion Labels', in Hermansky H; Cernocký H; Burget L; Lamel L; Scharenborg O; Motlícek P (eds.), Interspeech, ISCA, pp. 4498 - 4502, https://doi.org/10.21437/Interspeech.2021

Suthokumar G; Sethu V; Sriskandaraja K; Ambikairajah E, 2020, 'Adversarial Multi-Task Learning for Speaker Normalization in Replay Detection', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 6609 - 6613, http://dx.doi.org/10.1109/ICASSP40776.2020.9054322

Ambikairajah E; Sethu V, 2020, 'Cochlear Signal Processing: A Platform for Learning the Fundamentals of Digital Signal Processing', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 9229 - 9233, http://dx.doi.org/10.1109/ICASSP40776.2020.9054297

Ouyang A; Dang T; Sethu V; Ambikairajah E, 2019, 'Speech based emotion prediction: Can a linear model work?', in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, ISCA, Graz, Austria, pp. 2813 - 2817, presented at INTERSPEECH 2019, Graz, Austria, 15 September 2019 - 19 September 2019, http://dx.doi.org/10.21437/Interspeech.2019-3149

Bose D; Dang T; Sethu V; Ambikairajah E; Fernando S, 2019, 'A Novel Bag-of-Optimised-Clusters Front-End for Speech based Continuous Emotion Prediction', in 2019 8th International Conference on Affective Computing and Intelligent Interaction, ACII 2019, http://dx.doi.org/10.1109/ACII.2019.8925490

Atcheson M; Sethu V; Epps J, 2019, 'Using Gaussian Processes with LSTM Neural Networks to Predict Continuous-Time, Dimensional Emotion in Ambiguous Speech', in 2019 8th International Conference on Affective Computing and Intelligent Interaction, ACII 2019, http://dx.doi.org/10.1109/ACII.2019.8925450

Wickramasinghe B; Ambikairajah E; Epps J; Sethu V; Li H, 2019, 'Auditory Inspired Spatial Differentiation for Replay Spoofing Attack Detection', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 6011 - 6015, http://dx.doi.org/10.1109/ICASSP.2019.8683693

Suthokumar G; Sriskandaraja K; Sethu V; Wijenayake C; Ambikairajah E, 2019, 'Phoneme Specific Modelling and Scoring Techniques for Anti Spoofing System', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 6106 - 6110, http://dx.doi.org/10.1109/ICASSP.2019.8682411

Fernando S; Irtza S; Sethu V; Ambikairajah E, 2018, 'Advances in Feature Extraction and Modelling for Short Duration Language Identification', in 2018 IEEE 9th International Conference on Information and Automation for Sustainability, ICIAFS 2018, http://dx.doi.org/10.1109/ICIAFS.2018.8913386

Suthokumar G; Sriskandaraja K; Sethu V; Wijenayake C; Ambikairajah E, 2018, 'An Investigation about the Scalability of the Spoofing Detection System', in 2018 IEEE 9th International Conference on Information and Automation for Sustainability, ICIAFS 2018, http://dx.doi.org/10.1109/ICIAFS.2018.8913369

Gamage KW; Dang T; Sethu V; Epps J; Ambikairajah E, 2018, 'Speech-based Continuous Emotion Prediction by Learning Perception Responses related to Salient Events: A Study based on Vocal Affect Bursts and Cross-Cultural Affect in AVEC 2018', in AVEC 2018 - Proceedings of the 2018 Audio/Visual Emotion Challenge and Workshop, co-located with MM 2018, pp. 47 - 55, http://dx.doi.org/10.1145/3266302.3266314

Dang T; Sethu V; Ambikairajah E, 2018, 'Dynamic Multi-Rater Gaussian Mixture Regression Incorporating Temporal Dependencies of Emotion Uncertainty Using Kalman Filters', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 4929 - 4933, http://dx.doi.org/10.1109/ICASSP.2018.8461321

Irtza S; Sethu V; Ambikairajah E; Li H, 2018, 'End-to-End Hierarchical Language Identification System', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 5199 - 5203, http://dx.doi.org/10.1109/ICASSP.2018.8461419

Fernando S; Sethu V; Ambikairajah E, 2018, 'Factorized Hidden Variability Learning for Adaptation of Short Duration Language Identification Models', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 5204 - 5208, http://dx.doi.org/10.1109/ICASSP.2018.8462094

Ma J; Sethu V; Ambikairajah E; Lee KA, 2018, 'Speaker-Phonetic Vector Estimation for Short Duration Speaker Verification', in ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, pp. 5264 - 5268, http://dx.doi.org/10.1109/ICASSP.2018.8461978

Fernando S; Sethu V; Ambikairajah E; Li H, 2018, 'Second Order Factorized Model Adaptation for Short Duration Language Identification', in 2018 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 Proceedings, pp. 1440 - 1447, http://dx.doi.org/10.23919/APSIPA.2018.8659586

Suthokumar G; Sriskandaraja K; Sethu V; Wijenayake C; Ambikairajah E; Li H, 2018, 'Use of Claimed Speaker Models for Replay Detection', in 2018 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2018 Proceedings, pp. 1038 - 1046, http://dx.doi.org/10.23919/APSIPA.2018.8659510

Sriskandaraja K; Sethu V; Ambikairajah E, 2018, 'Deep Siamese architecture based replay detection for secure voice biometric', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 671 - 675, http://dx.doi.org/10.21437/Interspeech.2018-1819

Atcheson M; Sethu V; Epps J, 2018, 'Demonstrating and modelling systematic time-varying annotator disagreement in continuous emotion annotation', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 3668 - 3672, http://dx.doi.org/10.21437/Interspeech.2018-1933

Suthokumar G; Sethu V; Wijenayake C; Ambikairajah E, 2018, 'Modulation dynamic features for the detection of replay attacks', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 691 - 695, http://dx.doi.org/10.21437/Interspeech.2018-1846

Fernando S; Sethu V; Ambikairajah E, 2018, 'Sub-band envelope features using frequency domain linear prediction for short duration language identification', in Proceedings of the Annual Conference of the International Speech Communication Association Interspeech, pp. 1818 - 1822, http://dx.doi.org/10.21437/Interspeech.2018-1805

Cetin E; Abewardana Wijenayake C; Sethu V; Ambikairajah E, 2017, 'A Flipped Mode Approach to Teaching an Electronic System Design Course', in Proceedings of 2017 IEEE 6th International Conference on Teaching, Assessment, and Learning for Engineering (TALE), IEEE, Hong Kong, pp. 223 - 228, presented at IEEE International Conference on Teaching, Assessment, and Learning for Engineering, Hong Kong, 12 December 2017 - 14 December 2017, http://dx.doi.org/10.1109/TALE.2017.8252337

Dang T; Atcheson M; Stasak B; Hayat M; Goecke R; Huang Z; Le P; Epps J; Jayawardena S; Sethu V, 2017, 'Investigating word affect features and fusion of probabilistic predictions incorporating uncertainty in AVEC 2017', in Ringeval F; Schuller BW; Valstar MF; Gratch J; Cowie R; Pantic M (eds.), AVEC 2017 - Proceedings of the 7th Annual Workshop on Audio/Visual Emotion Challenge, co-located with MM 2017, Association for Computing Machinery (ACM), Mountain View, California, USA, pp. 27 - 35, presented at 7th Annual Workshop on Audio/Visual Emotion Challenge, Mountain View, California, USA, 23 October 2017 - 23 October 2017, http://dx.doi.org/10.1145/3133944.3133952


ORCID as entered in ROS

https://orcid.org/0000-0001-8492-1787