Synthia’s Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio

Published in NeurIPS 2023 Workshop + ICASSP 2024, 2023

Chia-Hsin Lin, Charles Jones, Björn W. Schuller, Harry Coppock, Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio, NeurIPS 2023 Workshop Machine Learning for Audio, 2024. https://arxiv.org/pdf/2309.15024

Despite significant advancements in deep learning for vision and natural language, unsupervised domain adaptation in audio remains relatively unexplored. We, in part, attribute this to the lack of an appropriate benchmark dataset. To address this gap, we present Synthia’s melody, a novel audio data generation framework capable of simulating an infinite variety of 4-second melodies with user-specified confounding structures characterised by musical keys, timbre, and loudness. Unlike existing datasets collected under observational settings, Synthia’s melody is free of unobserved biases, ensuring the reproducibility and comparability of experiments. To showcase its utility, we generate two types of distribution shift (domain shift and sample selection bias) and evaluate the performance of acoustic deep learning models under these shifts. Our evaluations reveal that Synthia’s melody provides a robust testbed for examining the susceptibility of these models to varying levels of distribution shift.
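To make the idea of a user-specified confounding structure concrete, here is a minimal sketch in Python. It is not the framework's actual code or API. The function names, the 16 kHz sample rate, the sine/square timbres, the major/minor key labels, and the `p_confound` parameter are all assumptions chosen for illustration; only the 4-second duration and the three factors (key, timbre, loudness) come from the abstract.

```python
import numpy as np

SAMPLE_RATE = 16_000   # assumed; not stated in the abstract
DURATION_S = 4.0       # 4-second melodies, as described in the abstract

# Semitone offsets of the major and natural minor scales from the root.
SCALES = {
    "major": [0, 2, 4, 5, 7, 9, 11],
    "minor": [0, 2, 3, 5, 7, 8, 10],
}


def tone(freq_hz, n_samples, timbre):
    """Render one note as a sine or square wave (hypothetical timbres)."""
    t = np.arange(n_samples) / SAMPLE_RATE
    wave = np.sin(2 * np.pi * freq_hz * t)
    return wave if timbre == "sine" else np.sign(wave)


def make_melody(key, timbre, loudness, rng, n_notes=8, root_hz=261.63):
    """Concatenate random scale notes into a single 4-second waveform."""
    n_per_note = int(SAMPLE_RATE * DURATION_S) // n_notes
    degrees = rng.choice(SCALES[key], size=n_notes)
    notes = [tone(root_hz * 2 ** (d / 12), n_per_note, timbre) for d in degrees]
    return loudness * np.concatenate(notes)


def sample_dataset(n, p_confound, seed=0):
    """Draw n melodies where timbre agrees with key with probability p_confound.

    p_confound = 0.5 leaves timbre uninformative about the key;
    values near 1.0 make timbre a strong spurious shortcut.
    """
    rng = np.random.default_rng(seed)
    audio, keys, timbres = [], [], []
    for _ in range(n):
        key = str(rng.choice(["major", "minor"]))
        aligned = rng.random() < p_confound
        if aligned:
            timbre = "sine" if key == "major" else "square"
        else:
            timbre = "square" if key == "major" else "sine"
        audio.append(make_melody(key, timbre, loudness=0.5, rng=rng))
        keys.append(key)
        timbres.append(timbre)
    return np.stack(audio), keys, timbres
```

Under these assumptions, a domain shift could be simulated by drawing the training set with a high `p_confound` and the test set with a low one, so that a model relying on timbre instead of key degrades at test time.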

Bibtex Entry

@misc{lin2023synthiasmelodybenchmarkframework,
      title={Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio}, 
      author={Chia-Hsin Lin and Charles Jones and Björn W. Schuller and Harry Coppock},
      year={2023},
      eprint={2309.15024},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2309.15024}, 
}