Highlights/Upcoming events
Proceedings of EAIM available in PMLR!
Posted by dorien on Wednesday, 11 February 2026
A few weeks ago, we had the pleasure of organizing the Workshop on Emerging AI Technologies for Music @AAAI in Singapore.
I’m excited to share that our workshop proceedings are now officially published in the Proceedings of Machine Learning Research!
Explore the full proceedings here: https://proceedings.mlr.press/v303/
Editors: Dorien Herremans, Keshav Bhandari, Abhinaba Roy, Ph.D., Simon Colton, Mathieu Barthet
Job opening: research assistant Large Multimodal Models
Posted by dorien on Tuesday, 3 February 2026
The Audio, Music, and AI Lab (AMAAI) at Singapore University of Technology and Design (SUTD) is seeking a talented Research Assistant (or potentially a Postdoc), essentially a coding wizard, to join us in pushing forward large-scale generative multimodal AI models, with a strong emphasis on music and audio.
Looking to hire: Founding Business & Product Lead – Music AI Startup
Posted by dorien on Wednesday, 21 January 2026
Top 2% of Scientists Worldwide in the 2024 Stanford/Elsevier ranking
Posted by dorien on Tuesday, 18 November 2025
Excited to share that my research impact placed me in the Top 2% of Scientists Worldwide in the 2024 Stanford/Elsevier ranking.
Grateful for the amazing collaborators and support that made this possible!
Can AI really compose band-quality music - with structure, harmony, and creative control?
Posted by dorien on Thursday, 30 October 2025
That’s the question we set out to explore in our latest work, BandCondiNet, now accepted in Expert Systems with Applications!
Conditional music generation promises more user control, but current systems often struggle with three things:
- low-fidelity input conditions,
- weak structural coherence, and
- poor harmony across instruments.
SonicMaster - all-in-one mastering model
Posted by dorien on Wednesday, 22 October 2025
Ever struggled with cleaning up home-recorded music? Fixing issues like weird echoes, distortion, or uneven sound, and mastering your music up to production-level quality, can be a huge pain, usually needing several different tools and lots of tweaking.
We just released SonicMaster, a model that aims to simplify this process by handling all those common problems in one place. The coolest part? You can control it with simple text instructions ('Make the audio smoother and less distorted.') or let it automatically restore your audio.
Congratulations Dr. Jan on graduating!
Posted by dorien on Thursday, 11 September 2025
Congratulations Dr. Jan Melechovsky on obtaining your PhD! I’ve had the pleasure of guiding Jan through his PhD journey at the AMAAI lab at Singapore University of Technology and Design. Over the last five years, Jan has explored a number of fascinating (yet connected ;) ) topics, ranging from dysarthric speech analysis to text-to-music.
Some highlights:
AMAAI is organizing EAIM workshop at AAAI
Posted by dorien on Tuesday, 9 September 2025
Together with the AIM at QMUL, the AMAAI Lab is organizing the First Workshop on Emerging AI Technologies for Music (EAIM 2026) as part of the AAAI conference 2026 in Singapore.
The workshop will bring together researchers, industry leaders, and practitioners working at the intersection of AI and music. We’ll explore how advances in generative models, multimodal learning, personalization, explainability, and human–AI collaboration are shaping the future of music creation, analysis, and interaction.
Royalties in the age of AI: paying artists for AI-generated songs
Posted by dorien on Friday, 25 April 2025
As we celebrate World IP Day tomorrow, it’s a fitting time to reflect on how generative AI is transforming music creation, producing impressively polished tracks in seconds, while also raising vital questions about fairly compensating artists whose work is used to train these models. I explore this topic in my article featured in the World Intellectual Property Organization (WIPO) Magazine’s special issue on Music and IP, 'Royalties in the age of AI: paying artists for AI-generated songs'.
PhD Fellowship Opportunities at SUTD
Posted by dorien on Wednesday, 19 March 2025
I am thrilled to announce two PhD fellowship opportunities at the Singapore University of Technology and Design (SUTD) for talented Thai students, Singaporean students, and Singapore Permanent Residents (PR). These fully-funded positions are available in the fields of AI for Finance and AI for Music, hosted by the AIFi Lab (AI for Finance) and AMAAI Lab (Audio, Music, and AI Lab), respectively. We are seeking exceptional candidates with a passion for AI and strong academic backgrounds to join our vibrant research community.
What should I work on next?
Posted by dorien on Friday, 7 February 2025
"What should I work on next?" is the question we are trying to answer in our latest paper.
The arrival of LLMs and foundation models has significantly changed the field of Music Information Retrieval (ISMIR Conference).
Many of the researchers in the field have had to pivot or adapt to the changing environment and the powerful tools that we now have available. The question many of us are asking is: what topics remain unexplored and are in need of solving?
Text2midi at AAAI
Posted by dorien on Tuesday, 14 January 2025
I’m thrilled to introduce text2midi, an end-to-end trained AI model designed to bridge the gap between textual descriptions and MIDI file generation! Our paper has been accepted in the Proceedings of the Association for the Advancement of Artificial Intelligence (AAAI), and will be presented in Philadelphia next month.
PhD internships at AMAAI Lab
Posted by dorien on Wednesday, 11 December 2024
I am happy to share that we have PhD internship positions available at the Audio, Music, and AI lab (AMAAI) at Singapore University of Technology and Design (https://dorienherremans.com). At our lab, we are building large multimodal models for music such as Mustango.
Presenting MidiCaps and MIRFLEX at ISMIR in San Francisco
Posted by dorien on Tuesday, 10 December 2024
Exciting news from the AMAAI Lab at this year’s ISMIR conference in San Francisco! We were thrilled to showcase some of our research:
MidiCaps
Presented by Jan Melechovsky and Abhinaba Roy, MidiCaps is the first large-scale open MIDI dataset with text captions. This resource will enable us to develop the very first text-to-MIDI models (stay tuned: our lab's model is coming soon!).
AI Output: To Protect or Not to Protect – That Is the IP Question
Posted by dorien on Wednesday, 6 November 2024
Visit from ZJU
Posted by dorien on Saturday, 2 November 2024
The AMAAI Lab was honoured to receive Prof. Kejun Zhang and Jiaxing Yu from Zhejiang University last week.
It was fascinating to hear about their latest research in multimodal AI, affective computing for music, latest large datasets, and cool image-based music generation interfaces. I am looking forward to future collaboration!
AMAAI: Abhinaba Roy, Ph.D., Renhang Liu, 路通宇, Geeta Puri, Sithumi Kavindya, Jan Melechovsky, Charlotta-Marlena Geist
Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction
Posted by dorien on Wednesday, 30 October 2024
Do you listen to music when you are down? Emotion and music are intrinsically connected. Yet we still struggle to model this. Why?
One of the reasons is that we only have a handful of small datasets, each using a different set of emotion labels. The AMAAI lab set out to overcome this by developing a zero-shot alignment method that is able to merge different datasets using LLM embeddings.
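To picture the general idea of aligning labels across datasets through embeddings, here is a minimal sketch. It assumes a simple nearest-neighbour match by cosine similarity, which is an illustrative simplification and not necessarily the method used in the paper. The 3-dimensional vectors and label names are made-up stand-ins for real LLM embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical label sets; the vectors are toy placeholders,
# not embeddings produced by any actual LLM.
dataset_a = {
    "happy": [0.9, 0.1, 0.0],
    "sad": [-0.8, 0.2, 0.1],
    "calm": [0.2, -0.7, 0.1],
}
dataset_b = {
    "joyful": [0.85, 0.2, 0.05],
    "melancholic": [-0.75, 0.1, 0.2],
    "relaxed": [0.3, -0.8, 0.0],
}

def align(source, target):
    """Map each source label to the most similar target label."""
    return {
        label: max(target, key=lambda t: cosine(vec, target[t]))
        for label, vec in source.items()
    }

mapping = align(dataset_b, dataset_a)
```

With these toy vectors, each label in `dataset_b` is matched to its closest counterpart in `dataset_a`, so the two label vocabularies can be treated as one.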
Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges
Posted by dorien on Monday, 28 October 2024
How come music and emotion are so intrinsically connected, yet music emotion prediction models still have sub-par performance? We talk about the current challenges in the field and provide a comprehensive list of music-emotion datasets as well as recent models.
We also put these lists in a community GitHub repository so they can be kept up to date: https://github.com/AMAAI-Lab/awesome-MER. If we forgot any model or dataset, just open a pull request to add yours!
20 years since my first generative music model!
Posted by dorien on Thursday, 10 October 2024
Can't believe it's been 20 years already (!!!) since I wrote my first work on generative music models for my master's thesis as a commercial engineer at the University of Antwerp, supervised by Kenneth Sörensen.
For those interested, it used a Tabu Search algorithm to optimize melodies in abc notation against a ruleset. It was coded in Pascal, with the MIDI export function written in pure hex.
Read thesis here (in Dutch with full source code in appendix).
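For readers unfamiliar with Tabu Search, here is a minimal sketch of the idea in Python (not the original Pascal, and not the thesis's code). The ruleset, scale, and all parameter values below are hypothetical toy choices made for illustration only.

```python
import random

# Hypothetical toy setup: melodies are lists of MIDI pitches from C major.
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]
ABC_NAMES = {60: "C", 62: "D", 64: "E", 65: "F", 67: "G", 69: "A", 71: "B", 72: "c"}

def cost(melody):
    """Penalty under an assumed ruleset: avoid big leaps and repeated
    notes, and end on the tonic. Lower is better."""
    penalty = 0
    for a, b in zip(melody, melody[1:]):
        leap = abs(a - b)
        if leap > 4:
            penalty += leap - 4
        if leap == 0:
            penalty += 1
    if melody[-1] not in (60, 72):
        penalty += 5
    return penalty

def tabu_search(length=8, iterations=200, tabu_size=10, seed=0):
    rng = random.Random(seed)
    current = [rng.choice(SCALE) for _ in range(length)]
    best, best_cost = current[:], cost(current)
    tabu = []  # recently undone (position, pitch) assignments
    for _ in range(iterations):
        candidates = []
        # Neighbourhood: change a single note to another scale pitch.
        for i in range(length):
            for pitch in SCALE:
                if pitch == current[i]:
                    continue
                neighbour = current[:i] + [pitch] + current[i + 1:]
                c = cost(neighbour)
                # Skip tabu moves unless they beat the best so far (aspiration).
                if (i, pitch) in tabu and c >= best_cost:
                    continue
                candidates.append((c, i, neighbour))
        if not candidates:
            break
        # Take the best non-tabu neighbour, even if it is worse than current.
        c, i, neighbour = min(candidates, key=lambda t: t[0])
        tabu.append((i, current[i]))  # forbid reverting this note for a while
        if len(tabu) > tabu_size:
            tabu.pop(0)
        current = neighbour
        if c < best_cost:
            best, best_cost = neighbour[:], c
    return best, best_cost

def to_abc(melody):
    """Render the pitches as abc note names."""
    return " ".join(ABC_NAMES[p] for p in melody)

melody, melody_cost = tabu_search()
abc_line = to_abc(melody)
```

The key difference from plain hill climbing is the tabu list: the search always moves to the best allowed neighbour, even an uphill one, and the list stops it from immediately undoing recent changes, which helps it escape local optima.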