Publications

L
Le D-V-T, Bigo L., Keller M., Herremans D.  2025.  Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey. ACM Computing Surveys.
Lee-Leon A., Yuen C., Herremans D.  2021.  Underwater Acoustic Communication Receiver Using Deep Belief Network. IEEE Transactions on Communications. :1-1.
Lee-Leon A., Yuen C., Herremans D.  2019.  Doppler Invariant Demodulation for Shallow Water Acoustic Communications Using Deep Belief Networks. 16th IEEE Asia Pacific Wireless Communications Symposium (APWCS).
Lee-Leon A., Yuen C., Herremans D.  2019.  A Hybrid Fuzzy Logic-Neural Network Approach For Multi-path Separation Of Underwater Acoustic Signals. 89th IEEE Vehicular Technology Conference.
Levers O.D, Herremans D., Dipankar A., Blessing L.  2022.  Downscaling using Deep Convolutional Autoencoders, a case study for South East Asia. EGUsphere preprint.
Lin K.W.E., BT B, Koh E., Lui S., Herremans D.  2018.  Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy. Neural Computing and Applications.
Liu R., Roy A., Herremans D.  2025.  Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction.
Lu T., Geist C-M, Melechovsky J., Roy A., Herremans D.  2025.  MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection. arXiv:2505.20979.
Luo Y.J., Hsu C.-C., Agres K., Herremans D.  2020.  Singing voice conversion with disentangled representations of singer and vocal technique using variational autoencoders. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
Luo J., Yang X., Herremans D.  2025.  BandCondiNet: Parallel Transformers-based Conditional Popular Music Generation with Multi-View Features. Expert Systems with Applications. 130059.
Luo Y.J., Agres K., Herremans D.  2019.  Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders. ISMIR.
Luo Y.J., Cheuk K.W., Nakano T., Goto M., Herremans D.  2020.  Unsupervised disentanglement of pitch and timbre for isolated musical instrument sounds. Proceedings of the International Society for Music Information Retrieval Conference (ISMIR).
M
Makris D., Guo Z, Kaliakatsos-Papakostas N., Herremans D.  2022.  Conditional Drums Generation using Compound Word Representations. EvoMUSART (EVO*) - Lecture Notes in Computer Science.
Makris D., Agres K., Herremans D.  2021.  Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
Melechovsky J., Mehrish A., Sisman B., Herremans D.  2024.  DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech. Audio Imagination: NeurIPS 2024 Workshop.
Melechovsky J., Mehrish A., Roy A., Herremans D.  2025.  SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering. arXiv:2508.03448.
Melechovsky J., Mehrish A., Sisman B., Herremans D.  2024.  Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training. Proc. of IEEE TENCON, Singapore.
Melechovsky J., Mehrish A., Sisman B., Herremans D.  2024.  Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder. Proc. of IEEE TENCON, Singapore.
Melechovsky J., Roy A., Herremans D.  2024.  MidiCaps — A large-scale MIDI dataset with text captions. ISMIR.
Melechovsky J., Mehrish A., Herremans D., Sisman B.  2023.  Learning accent representation with multi-level VAE towards controllable speech synthesis. IEEE Spoken Language Technology (SLT) Workshop.
Melechovsky J.  2025.  Analysis and Synthesis of Audio with AI: from Neurological Disease to Accented Speech and Music.
Melechovsky J, Guo Z, Ghosal D, Majumder N, Herremans D, Poria S.  2024.  Mustango: Toward Controllable Text-to-Music Generation. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). :8293–8316.
N
Nahar F., Agres K., BT B, Herremans D.  2020.  A dataset and classification model for Malay, Hindi, Tamil and Chinese music. 13th Workshop on music and machine learning (MML) as part of ECML/PKDD.
P
Pham Q-H, Herremans D., Roig G.  2022.  EmoMV: Affective Music-Video Correspondence Learning Datasets for Classification and Retrieval. Information Fusion.
Pham Q-H.  2020.  Data-driven 3D Scene Understanding. PhD thesis.
T. Phuong HThi, BT B, Herremans D., Roig G.  2021.  AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies. Proceedings of the International Conference on Pattern Recognition (ICPR 2020).
T. Phuong HThi, Herremans D., Roig G.  2019.  Multimodal Deep Models for Predicting Affective Responses Evoked by Movies. The 2nd International Workshop on Computer Vision for Physiological Measurement as part of ICCV. Seoul, South Korea.
T. Phuong HThi, BT B, Roig G., Herremans D.  2021.  AttendAffectNet – Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-attention. Sensors. Special issue on Intelligent Sensors: Sensor Based Multi-Modal Emotion Recognition.
S
Sockalingam N., Lo K., Teo J., Wei C.C., Chow D., Herremans D., Jun M.L.M., Kurniawan O., Wang Y., Leong P.K.  2025.  Towards the future of education: cyber-physical learning. Discover Education. 4:1–16.
Sockalingam N., Lo K., Kurniawan O., Herremans D., Raghunath N., Cancion H.GC, Kejun H., Leong H., Tan J., Nizharzharudin K. et al.  2022.  A white paper on cyberphysical learning. White paper, Singapore University of Technology and Design.
Sokolovskis J., Herremans D., Chew E.  2018.  A Novel Interface for the Graphical Analysis of Music Practice Behaviours. Frontiers in Psychology - Human-Media Interaction. 9.
Song M., Liu R., Wang X., Jiang Y., Xie P., Huang F., Zhou J., Herremans D., Poria S.  2025.  Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics. arXiv:2510.05137.
Song M., Pala T.D, Jin W., Zadeh A., Li C., Herremans D., Poria S.  2025.  LLMs Can't Handle Peer Pressure: Crumbling under Multi-Agent Social Interactions. arXiv:2508.18321.
Sturm B., Ben-Tal O., Monaghan U., Collins N., Herremans D., Chew E., Hadjeres G., Deruty E., Pachet F.  2019.  Machine Learning Research that Matters for Music Creation: A Case Study. Journal of New Music Research. 48(1):36-55.
