Publications
.
2025. Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey. ACM Computing Surveys.
2402.17467.pdf (1.01 MB)
.
2021. Underwater Acoustic Communication Receiver Using Deep Belief Network. IEEE Transactions on Communications.
2102.13397.pdf (12.87 MB)
.
2019. Doppler Invariant Demodulation for Shallow Water Acoustic Communications Using Deep Belief Networks. 16th IEEE Asia Pacific Wireless Communications Symposium (APWCS).
1909.02850.pdf (790.54 KB)
.
2019. A Hybrid Fuzzy Logic-Neural Network Approach For Multi-path Separation Of Underwater Acoustic Signals. 89th IEEE Vehicular Technology Conference.
fuzzy logic.pdf (1.66 MB)
.
2022. Downscaling using Deep Convolutional Autoencoders, a case study for South East Asia. EGUsphere preprint.
egusphere-2022-234.pdf (8.99 MB)
.
2018. Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy. Neural Computing and Applications.
main.pdf (2.59 MB)
.
2025. Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction.
.
2025. MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection. arXiv:2505.20979.
.
2020. Singing voice conversion with disentangled representations of singer and vocal technique using variational autoencoders. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
1912.02613.pdf (2.9 MB)
.
2025. BandCondiNet: Parallel Transformers-based Conditional Popular Music Generation with Multi-View Features. Expert Systems with Applications. 130059
2407.10462v2.pdf (2.6 MB)
.
2019. Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders. ISMIR.
jyun-ismir.pdf (5.62 MB)
.
2020. Unsupervised disentanglement of pitch and timbre for isolated musical instrument sounds. Proceedings of the International Society of Music Information Retrieval (ISMIR).
.
2022. Conditional Drums Generation using Compound Word Representations. EvoMUSART (EVO*) - Lecture Notes in Computer Science.
2202.04464.pdf (525.36 KB)
.
2021. Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
2104.13056.pdf (857.78 KB)
.
2024. DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech. Audio Imagination: NeurIPS 2024 Workshop.
.
2025. SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering. arXiv:2508.03448.
2508.03448v2.pdf (3.31 MB)
.
2024. Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training. Proc. of IEEE TENCON, Singapore.
.
2024. Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder. Proc. of IEEE TENCON, Singapore.
.
2024. MidiCaps — A large-scale MIDI dataset with text captions. ISMIR.
2406.02255v1.pdf (699.83 KB)
.
2023. Learning accent representation with multi-level VAE towards controllable speech synthesis. IEEE Spoken Language Technology (SLT) Workshop.
.
2025. Analysis and Synthesis of Audio with AI: from Neurological Disease to Accented Speech and Music.
thesis_Jan.pdf (26.4 MB)
.
2024. Mustango: Toward Controllable Text-to-Music Generation. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). pages 8293–8316.
2311.08355 (1).pdf (11.38 MB)
.
2020. A dataset and classification model for Malay, Hindi, Tamil and Chinese music. 13th Workshop on music and machine learning (MML) as part of ECML/PKDD.
2009.04459.pdf (234.8 KB)
.
2024. DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts. arXiv:2406.08742.
2406.08742v1.pdf (1.06 MB)
.
2024. Modern Portfolio Construction with Advanced Deep Learning Models. SUTD. PhD thesis.
Joel_Ong_Thesis.pdf (3.44 MB)
.
2023. Constructing Time-Series Momentum Portfolios with Deep Multi-Task Learning. Expert Systems with Applications. 230:120587.
2306.13661.pdf (707.95 KB)
.
2022. EmoMV: Affective Music-Video Correspondence Learning Datasets for Classification and Retrieval. Information Fusion.
SSRN-id4189323.pdf (2.01 MB)
.
2020. Data-driven 3D Scene Understanding. PhD thesis.
.
2021. AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies. Proceedings of the International Conference on Pattern Recognition (ICPR 2020).
2010.11188.pdf (7.07 MB)
.
2019. Multimodal Deep Models for Predicting Affective Responses Evoked by Movies. The 2nd International Workshop on Computer Vision for Physiological Measurement as part of ICCV. Seoul, South Korea.
1909.06957.pdf (836.3 KB)
.
2021. AttendAffectNet – Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-attention. Sensors. Special issue on Intelligent Sensors: Sensor Based Multi-Modal Emotion Recognition.
sensors-21-08356.pdf (1.03 MB)
.
2026. Text2midi-InferAlign: Improving Symbolic Music Generation with Inference-Time Alignment. ICASSP.
.
2025. JamendoMaxCaps: A Large Scale Music-caption Dataset with Imputed Metadata. Proceedings of IJCNN, Rome, Italy.
.
2025. Towards the future of education: cyber-physical learning. Discover Education. 4:1–16.
.
2022. A white paper on cyberphysical learning. White paper, Singapore University of Technology and Design.
LSL_WhitePaper_Cyber-physical-Campus-Higher-Education.pdf (6.98 MB)
.
2018. A Novel Interface for the Graphical Analysis of Music Practice Behaviours. Frontiers in Psychology - Human-Media Interaction. 9
practice_browser.pdf (4.9 MB)
.
2025. Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics. arXiv:2510.05137.
.
2025. LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions. arXiv:2508.18321.
.
2019. Machine Learning Research that Matters for Music Creation: A Case Study. Journal of New Music Research. 48(1):36-55.
concert_paper_preprint.pdf (1.6 MB)
.
2021. Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds. Sensors. 21(16):5555.
2106.12174.pdf (6.52 MB)
.
2020. Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling. ISMIR.
2007.15474.pdf (2.67 MB)
.
2020. Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance. Workshop on Machine Learning for Music Discovery (ML4MD) as part of ICML.
2006.09833.pdf (2.81 MB)
.
2025. End-to-End Text-to-SQL with Dataset Selection: Leveraging LLMs for Adaptive Query Generation. Proceedings of IJCNN, Rome, Italy.
.
2022. HEAR 2021: Holistic Evaluation of Audio Representations. Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track.
2203.03022.pdf (406.58 KB)
.
2021. Evaluating the Effectiveness of an Augmented Reality Game Promoting Environmental Action. Sustainability. 13(24):13912.
sustainability-13-13912.pdf (16.23 MB)
.
2024. DisfluencySpeech – Single-Speaker Conversational Speech Dataset with Paralanguage. Proc. of IEEE TENCON, Singapore.
.
2023. A Multimodal Model with Twitter FinBERT Embeddings for Extreme Price Movement Prediction of Bitcoin. Expert Systems with Applications.
2206.00648.pdf (3.26 MB)