Publications

2024
Melechovsky J., Mehrish A., Sisman B., Herremans D..  2024.  Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training. Proc. of IEEE Tencon, Singapore.
Melechovsky J., Mehrish A., Sisman B., Herremans D..  2024.  Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder. Proc. of IEEE Tencon, Singapore.
Kang J., Herremans D..  2024.  Are we there yet? A brief survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges. arXiv:2406.08809. PDF icon 2406.08809v1.pdf (156.19 KB)
Luo J., Yang X., Herremans D..  2024.  BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features. arXiv:2407.10462. PDF icon 2407.10462v1.pdf (2.3 MB)
Ong J., Herremans D..  2024.  DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts. arXiv:2406.08742. PDF icon 2406.08742v1.pdf (1.06 MB)
Wang K., Herremans D..  2024.  DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage. Proc. of IEEE Tencon, Singapore.
Chow D., Herremans D..  2024.  Gamification and skills tree. Trends and Foresight Report on Cyber-Physical Learning.
Melechovsky J., Roy A., Herremans D..  2024.  MidiCaps — A large-scale MIDI dataset with text captions. arXiv:2406.02255. PDF icon 2406.02255v1.pdf (699.83 KB)
Ong J..  2024.  Modern Portfolio Construction with Advanced Deep Learning Models. SUTD. PhD. PDF icon Joel_Ong_Thesis.pdf (3.44 MB)
Melechovsky J, Guo Z, Ghosal D, Majumder N, Herremans D, Poria S.  2024.  Mustango: Toward Controllable Text-to-Music Generation. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). pages 8293–8316. PDF icon 2311.08355 (1).pdf (11.38 MB)
Le D-V-T, Bigo L., Keller M., Herremans D..  2024.  Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey. arXiv:2402.17467. PDF icon 2402.17467.pdf (1.01 MB)
Lam P., Zhang H., Chen N.F, Sisman B., Herremans D..  2024.  SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech. Proc. of IEEE Tencon, Singapore. PDF icon 2211.07283.pdf (435.22 KB)
Kang J, Poria S, Herremans D..  2024.  Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model. Expert Systems with Applications. PDF icon 2311.00968.pdf (5.51 MB)
2022
Clarke C.J., Chowdhury J., BT B, Priyadarshinee P., Lim C.M.Ying, I. Tan FXing, Herremans D., Chen J.M..  2022.  Computationally Efficient Physics Approximating Neural Networks for Highly Nonlinear Maps. 2022 International Conference on Research in Adaptive and Convergent Systems.
Makris D., Guo Z, Kaliakatsos-Papakostas N., Herremans D..  2022.  Conditional Drums Generation using Compound Word Representations. EvoMUSART (EVO*) - Lecture Notes in Computer Science. PDF icon 2202.04464.pdf (525.36 KB)
Levers O.D, Herremans D., Dipankar A., Blessing L..  2022.  Downscaling using Deep Convolutional Autoencoders, a case study for South East Asia. Egusphere preprint. PDF icon egusphere-2022-234.pdf (8.99 MB)
Pham Q-H, Herremans D., Roig G..  2022.  EmoMV: Affective Music-Video Correspondence Learning Datasets for Classification and Retrieval. Information Fusion. PDF icon SSRN-id4189323.pdf (2.01 MB)
Herremans D., Low K.W..  2022.  Forecasting Bitcoin Volatility Spikes from Whale Transactions and Cryptoquant Data Using Synthesizer Transformer Models. SSRN. PDF icon SSRN-id4247684.pdf (5.05 MB)
BT B, Hee H.I., Ming C., Lin Y., Priyadarshinee P., Clarke C.J., Herremans D., Chen J.M..  2022.  A Gaussian mixture classifier model to differentiate respiratory symptoms using phonated /ɑː/ sounds. The 18th Australasian International Conference on Speech Science and Technology (SST). PDF icon ahsounds.pdf (1018.01 KB)
Turian J, Shier J, Khan HRaj, Raj B, Schuller BW, Steinmetz CJ, Malloy C, Tzanetakis G, Velarde G, McNally K et al..  2022.  HEAR 2021: Holistic Evaluation of Audio Representations. Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track. PDF icon 2203.03022.pdf (406.58 KB)
Cheuk K.W., Choi K., Kong Q., Li B., Won M., Hung A., Wang J.-C., Herremans D..  2022.  Jointist: Joint Learning for Multi-instrument Transcription and Its Applications. PDF icon 2206.10805.pdf (427.51 KB)
Kaliakatsos-Papakostas N., Bastas G., Makris D., Herremans D., Katsouros V., Maragos P..  2022.  A Machine Learning Approach for MIDI to Guitar Tablature Conversion. Sound and Music Computing Conference (SMC). PDF icon 25.pdf (528.42 KB)
Guo R, Simpson I., Kiefer C., Magnusson T, Herremans D..  2022.  MusIAC: An extensible generative framework for Music Infilling Application with multi-level Control. EvoMUSART. PDF icon 2202.05528.pdf (893.23 KB)
Chua P., Makris D., Agres K., Roig G., Herremans D..  2022.  Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses. Arxiv preprint.
Huang J, Chia YKen, Yu S, Yee K, Küster D, Krumhuber EG, Herremans D, Roig G..  2022.  Single Image Video Prediction with Auto-Regressive GANs. Sensors. 22:3533.
Kwan Y.H., Cheuk K.W., Herremans D..  2022.  Understanding Audio Features via Trainable Basis Functions. Arxiv preprint. PDF icon 2204.11437.pdf (7.36 MB)
Sockalingam N., Lo K., n KO., Herremans D., Raghunath N., Cancion H.GC, Kejun H., Leong H., Tan J., Nizharzharudin K. et al..  2022.  A white paper on cyberphysical learning. White paper, Singapore University of Technology and Design. PDF icon LSL_WhitePaper_Cyber-physical-Campus-Higher-Education.pdf (6.98 MB)
2021
Herremans D.  2021.  aiSTROM - A roadmap for developing a successful AI strategy. IEEE Access.
T. Phuong HThi, BT B, Roig G., Herremans D..  2021.  AttendAffectNet – Emotion Prediction of Movie Viewers Using Multimodal Fusion with Self-attention. Sensors. Special issue on Intelligent Sensors: Sensor Based Multi-Modal Emotion Recognition. PDF icon sensors-21-08356.pdf (1.03 MB)
T. Phuong HThi, BT B, Herremans D., Roig G..  2021.  AttendAffectNet: Self-Attention based Networks for Predicting Affective Responses from Movies. Proceedings of the International Conference on Pattern Recognition (ICPR2020). PDF icon 2010.11188.pdf (7.07 MB)
T BB, Hee HIng, Kapoor S, Teoh OHoe, Teng SShin, Lee KPin, Herremans D, Chen JMing.  2021.  Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds. Sensors. 21(16):5555. PDF icon 2106.12174.pdf (6.52 MB)
Cheuk K.W., Luo Y.J., Benetos E., Herremans D..  2021.  The Effect of Spectrogram Reconstructions on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy. Proceedings of the International Conference on Pattern Recognition (ICPR2020). PDF icon 2010.09969.pdf (3.46 MB)
Wang K., Tekler Z., Cheah L., Herremans D., Blessing L..  2021.  Evaluating the Effectiveness of an Augmented Reality Game Promoting Environmental Action. Sustainability. 13(24):13912. PDF icon sustainability-13-13912.pdf (16.23 MB)
Makris D., Agres K., Herremans D..  2021.  Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework. Proceedings of the International Joint Conference on Neural Networks (IJCNN). PDF icon 2104.13056.pdf (857.78 KB)
Guo Z, Makris D., Herremans D..  2021.  Hierarchical Recurrent Neural Networks for Conditional Melody Generation with Long-term Structure. Proceedings of the International Joint Conference on Neural Networks (IJCNN). PDF icon 2102.09794.pdf (1015.73 KB)
Agres K., Schaefer R, Volk A, Van Hooren S, Holzapfel A, Bella SDalla, Müller M, de Witte M, Herremans D., Melendez RRamirez et al..  2021.  Music, Computing, and Health: A roadmap for the current and future roles of music technology for healthcare and well-being. Music & Science. PDF icon Preprint for OSF_Agres, Schaefer, Volk, et al. (2021)_Music & Science_watermark.pdf (4.07 MB)
Kroonenberg P., Herremans D..  2021.  Musical stylometry: Characterisation of music. Multivariate Humanities.
Cheuk K.W., Su L., Herremans D..  2021.  ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data. ACM Multimedia.
Cheuk K.W., Luo Y.J., Benetos E., Herremans D..  2021.  Revisiting the Onsets and Frames Model with Additive Attention. Proceedings of the International Joint Conference on Neural Networks (IJCNN). PDF icon 2104.06607.pdf (1.52 MB)
Lee-Leon A., Yuen C., Herremans D..  2021.  Underwater Acoustic Communication Receiver Using Deep Belief Network. IEEE Transactions on Communications. :1-1. PDF icon 2102.13397.pdf (12.87 MB)
2020
BT B, Aslim E.J, Ng YShu Lynn, Kuo TLi Chuen, Chen JShihang, Herremans D., Ng LGuat, Chen J.M..  2020.  Acoustic prediction of flowrate: varying liquid jet stream onto a free surface. IEEE International Conference on Signal Processing and Communications (SPCOM). PDF icon preprint flow.pdf (1.01 MB)
BT B, Hee H.I., Teoh O.H., Lee K.P., Kapoor S., Herremans D., Chen J.M..  2020.  Asthmatic versus healthy child classification based on cough and vocalised /a:/ sounds. The Journal of the Acoustical Society of America (JASA). 148, EL253
Pham Q-H.  2020.  Data-driven 3D Scene Understanding. PhD
Nahar F., Agres K., BT B, Herremans D..  2020.  A dataset and classification model for Malay, Hindi, Tamil and Chinese music. 13th Workshop on music and machine learning (MML) as part of ECML/PKDD. PDF icon 2009.04459.pdf (234.8 KB)
Tan HHao, Luo Y.J., Herremans D..  2020.  Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance. Workshop on Machine Learning for Music Discovery (ML4MD) as part of ICML. PDF icon 2006.09833.pdf (2.81 MB)
Cheuk K.W., Agres K., Herremans D..  2020.  The impact of Audio input representations on neural network based music transcription. Proceedings of the International Joint Conference on Neural Networks (IJCNN). PDF icon 2001.09989.pdf (1.87 MB)
Tan H.H., Herremans D..  2020.  Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling. ISMIR. PDF icon 2007.15474.pdf (2.67 MB)
Cheuk K.W., Anderson H., Agres K., Herremans D..  2020.  nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks. IEEE Access. PDF icon nnAudio.pdf (10.2 MB)
Garg K., Singh A., Herremans D., Lall B..  2020.  PerceptionGAN: Real-world image construction from provided text through perceptual understanding. 4th Int. Conf. on Imaging, Vision and Pattern Recognition (IVPR), and 9th Int. Conf. on Informatics, Electronics & Vision (ICIEV). PDF icon perceptionGAN-preprint.pdf (2.83 MB)
Cheuk K.W., Luo Y.J., BT B, Roig G., Herremans D..  2020.  Regression-based music emotion prediction using triplet neural networks. Proceedings of the International Joint Conference on Neural Networks (IJCNN). PDF icon 2001.09988.pdf (777.31 KB)
Luo Y.J., Hsu C.-C., Agres K., Herremans D..  2020.  Singing voice conversion with disentangled representations of singer and vocal technique using variational autoencoders. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). PDF icon 1912.02613.pdf (2.9 MB)
Luo Y.J., Cheuk K.W., Nakano T., Goto M., Herremans D..  2020.  Unsupervised disentanglement of pitch and timbre for isolated musical instrument sounds. Proceedings of the International Society of Music Information Retrieval (ISMIR).
Guo R, Simpson I, Magnusson T, Kiefer C., Herremans D..  2020.  A variational autoencoder for music generation controlled by tonal tension. Joint Conference on AI Music Creativity (CSMC + MuMe). PDF icon 2010.06230.pdf (622.82 KB)
2019
Hee H.I., BT B, Karunakaran A., Herremans D., Teoh O.H., Lee K.P., Teng S.S., Lui S., Chen J.M..  2019.  Development of Machine Learning for asthmatic and healthy voluntary cough - a proof of concept study. Applied Sciences. 9(14). PDF icon applsci-09-02833.pdf (2.06 MB)
Lee-Leon A., Yuen C., Herremans D..  2019.  Doppler Invariant Demodulation for Shallow Water Acoustic Communications Using Deep Belief Networks. 16th IEEE Asia Pacific Wireless Communications Symposium (APWCS). PDF icon 1909.02850.pdf (790.54 KB)
Herremans D., Chuan C.-H..  2019.  The emergence of deep learning: new opportunities for music and audio technologies. Neural Computing and Applications. PDF icon main_preprint.pdf (102.16 KB)
Lee-Leon A., Yuen C., Herremans D..  2019.  A Hybrid Fuzzy Logic-Neural Network Approach For Multi-path Separation Of Underwater Acoustic Signals. 89th IEEE Vehicular Technology Conference. PDF icon fuzzy logic.pdf (1.66 MB)
Agres K., Bigo L., Herremans D..  2019.  The impact of musical structure on enjoyment and absorptive listening states in trance music. Music and Consciousness 2 - Worlds, Practices, Modalities.
Cheuk K.W., BT B, Roig G., Herremans D..  2019.  Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019). PDF icon 1910.01463.pdf (934.76 KB)
Luo Y.J., Agres K., Herremans D..  2019.  Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders. ISMIR. PDF icon jyun-ismir.pdf (5.62 MB)
Sturm B., Ben-Tal O., Monaghan U., Collins N., Herremans D., Chew E., Hadjeres G., Deruty E., Pachet F..  2019.  Machine Learning Research that Matters for Music Creation: A Case Study. Journal of New Music Research. 48(1):36-55. PDF icon concert_paper_preprint.pdf (1.6 MB)
Guo R, Herremans D, Magnusson T.  2019.  Midi Miner – A Python library for tonal tension and track classification. ISMIR - Late Breaking Demo. PDF icon midi_miner.pdf (83.7 KB)
T. Phuong HThi, Herremans D., Roig G..  2019.  Multimodal Deep Models for Predicting Affective Responses Evoked by Movies. The 2nd International Workshop on Computer Vision for Physiological Measurement as part of ICCV. Seoul, South Korea. 2019. PDF icon 1909.06957.pdf (836.3 KB)
Cheuk K.W., Agres K., Herremans D..  2019.  nnAudio: A PyTorch Audio Processing Tool Using 1D Convolution neural networks. ISMIR - Late Breaking Demo. PDF icon nnAudio.pdf (399.08 KB)
Agres K., Lui S., Herremans D..  2019.  A novel music-based game with motion capture to support cognitive and motor function in the elderly. IEEE Conference on Games. PDF icon preprint.pdf (2.6 MB)
Herremans D., Chew E..  2019.  Towards emotion based music generation: A tonal tension model based on the spiral array. Proceedings of Cognitive Science (CogSci). PDF icon CogSci_tension (1).pdf (610.91 KB)
BT B, Lin K.W.E., Lui S., Chen J.M., Herremans D..  2019.  Towards robust audio spoofing detection: a detailed comparison of traditional and learned features. IEEE Access. 7:84229-84241. PDF icon ieee_access_herremans.pdf (14.31 MB)
2018
Cheuk K.W., BT B, Roig G., Herremans D..  2018.  Blacklisted speaker identification using triplet neural networks. MCE2018 competition. PDF icon SUTD_description.pdf (133.08 KB)
Chuan C.-H., Agres K., Herremans D..  2018.  From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec. Neural Computing and Applications. PDF icon paper.pdf (1.64 MB)
Agus N., Anderson H., Chen J.M., Lui S., Herremans D..  2018.  Minimally Simple Binaural Room Modelling Using a Single Feedback Delay Network. Journal of the Audio Engineering Society. 66(10):791-807. PDF icon angus_jaes_preprint.pdf (6.39 MB)
Chuan C.-H., Herremans D..  2018.  Modeling temporal tonal relations in polyphonic music through deep networks with a novel image-based representation. The Thirty-Second AAAI Conference on Artificial Intelligence. PDF icon preprint_lstm.pdf (741.28 KB)
Sokolovskis J., Herremans D., Chew E..  2018.  A Novel Interface for the Graphical Analysis of Music Practice Behaviours. Frontiers in Psychology - Human-Media Interaction. 9. PDF icon practice_browser.pdf (4.9 MB)
Herremans D., Chew E..  2018.  O.R. and music generation. OR/MS Today. 45(1). PDF icon O.R. and music generation - INFORMS.pdf (825.66 KB)
Agus N., Anderson H., Chen J.M., Lui S., Herremans D..  2018.  Perceptual evaluation of measures of spectral variance. Journal of the Acoustical Society of America. 143(6):3300–3311. PDF icon jasa_an_dh_preprint.pdf (2.46 MB)
Agus N..  2018.  Real-Time Binaural Auralization. ISTD. PhD. PDF icon NatalieAngus_PhD_Thesis_01Jul18.pdf (6.19 MB)
Lin K.W.E., BT B, Koh E., Lui S., Herremans D..  2018.  Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy. Neural Computing and Applications. PDF icon main.pdf (2.59 MB)
Agres K., Herremans D..  2018.  The Structure of Chord Progressions Influences Listeners’ Enjoyment and Absorptive States in EDM. 15th International Conference on Music Perception and Cognition. PDF icon Agres460_preprint_v2.pdf (387.15 KB)
2017
Herremans D., Chuan C.-H., Chew E..  2017.  A Functional Taxonomy of Music Generation Systems. ACM Computing Surveys. 50(5):30. PDF icon music_generation_survey_dh_preprint.pdf (349.15 KB)
Cunha N., A. S, Herremans D.  2017.  Generating guitar solos by integer programming. Journal of the Operational Research Society. :971-985. PDF icon preprint_guitar_solo_generation_dh.pdf (772.59 KB)
Agres K., Herremans D., Bigo L., Conklin D..  2017.  Harmonic Structure Predicts the Enjoyment of Uplifting Trance Music. Frontiers in Psychology, Cognitive Science. 7(1999). PDF icon agres16ut.pdf (1.15 MB)
Herremans D., Bergmans T..  2017.  Hit Song Prediction Based on Early Adopter Data and Audio Features. The 18th International Society for Music Information Retrieval Conference (ISMIR) - Late Breaking Demo. PDF icon paper_preprint_hit.pdf (221.73 KB)
Herremans D., Yang S., Chuan C.-H., Barthet M., Chew E..  2017.  IMMA-Emo: A Multimodal Interface for Visualising Score- and Audio-synchronised Emotion Annotations. Audio Mostly. PDF icon IMMA-emo_preprint.pdf (1.4 MB)
Herremans D., Chuan C.-H..  2017.  Modeling Musical Context with Word2vec. First International Workshop On Deep Learning and Music. 1:11-18. PDF icon herremans2017work2vec.pdf (745.8 KB)
Herremans D., Chew E..  2017.  MorpheuS: generating structured music with constrained patterns and tension. IEEE Transactions on Affective Computing. PP(99) (In Press). PDF icon herremans2017morpheusFullIEEE.pdf (5.71 MB)
Herremans D., Chuan C.-H..  2017.  A multi-modal platform for semantic music analysis: visualizing audio- and score-based tension. 11th International Conference on Semantic Computing IEEE ICSC 2017. PDF icon paper_preprint.pdf (1.63 MB)
Agres K., Herremans D..  2017.  Music and Motion-Detection: A Game Prototype for Rehabilitation and Strengthening in the Elderly. IEEE International Conference on Orange Technologies (ICOT). PDF icon agres_herr_music_rehab_preprint.pdf (1.77 MB)
Balliauw M., Herremans D., D. Cuervo P, Sörensen K..  2017.  A variable neighborhood search algorithm to generate piano fingerings for polyphonic sheet music. International Transactions in Operational Research, Special Issue on Variable Neighbourhood Search. 24(3):509–535. PDF icon ITOR_VNS_APF_preprint.pdf (840.28 KB)
Herremans D., Lauwers W..  2017.  Visualizing the evolution of alternative hit charts. The 18th International Society for Music Information Retrieval Conference (ISMIR) - Late Breaking Demo. PDF icon dh_visualiation_preprint.pdf (5.34 MB)
2016
Agres K., Bigo L., Herremans D., Conklin D..  2016.  The Effect of Repetitive Structure on Enjoyment in Uplifting Trance Music. 14th International Conference for Music Perception and Cognition (ICMPC). :280-282. PDF icon preprint_trance.pdf (139.27 KB)
Herremans D., Chew E..  2016.  MorpheuS: Automatic music generation with recurrent pattern constraints and tension profiles. IEEE TENCON. PDF icon paper_morpheus_dh_ieee.pdf (550.61 KB)
Herremans D., Chew E..  2016.  MorpheuS: constraining structure in automatic music generation. Dagstuhl seminar on Computational Music Structure Analysis. PDF icon abstract_dagstuhl_dh.pdf (88.49 KB)
Herremans D., Chew E..  2016.  Music generation with structural constraints: an operations research approach. 30th Annual Conference of the Belgian Operational Research (OR) Society (ORBEL30). :37-39. PDF icon orbel30_dh.pdf (117.78 KB)
Herremans D., Chew E..  2016.  Tension ribbons: Quantifying and visualising tonal tension. Second International Conference on Technologies for Music Notation and Representation (TENOR). 2:8-18. PDF icon paper_tenor_dh_preprint_small.pdf (1.67 MB)
Cunha N., A. S, Herremans D..  2016.  Uma abordagem baseada em programação linear inteira para a geração de solos de guitarra. XLVIII Simpósio Brasileiro de Pesquisa Operacional (SBPO). PDF icon sbpo_dh.pdf (346.61 KB)
