The UK Choral AI Dataset (uk-choral-ai) is an audio dataset curated by the Serpentine Arts Technologies team for The Call exhibition by Holly Herndon and Mat Dryhurst.
The dataset was created to train and fine-tune generative audio models, like those produced by IRCAM with Herndon and Dryhurst for the exhibition. However, the data collection methods were designed to create a rich dataset which may be useful for a wide range of audio tasks.
The dataset contains:
483 recordings of performances of vocal exercises and compositions devised by Holly Herndon and Mat Dryhurst
Recordings are licensed for ML research and development and are GDPR compliant
The complete dataset of 48,000 Hz, 24-bit WAV files is approx. 240 GB
15 UK-based choirs are represented
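As a rough illustration of what the stated sample rate, bit depth, and total size imply, the sketch below estimates how many hours of uncompressed PCM audio would fit in the complete WAV set. It assumes decimal gigabytes, ignores WAV header overhead, and treats the channel count as an open parameter, since the split between stereo and other files is not stated here. The function name `pcm_hours` is hypothetical and not part of any dataset tooling.

```python
# Back-of-the-envelope estimate only; not an official figure for the dataset.
SAMPLE_RATE_HZ = 48_000        # stated sample rate
BYTES_PER_SAMPLE = 3           # 24-bit PCM = 3 bytes per sample
DATASET_BYTES = 240 * 10**9    # "approx. 240 GB", assuming decimal gigabytes


def pcm_hours(total_bytes, channels,
              sample_rate=SAMPLE_RATE_HZ,
              bytes_per_sample=BYTES_PER_SAMPLE):
    """Hours of uncompressed PCM audio that fit in total_bytes."""
    bytes_per_second = sample_rate * bytes_per_sample * channels
    return total_bytes / bytes_per_second / 3600


# Upper and lower bounds, depending on whether files are mono or stereo.
stereo_hours = pcm_hours(DATASET_BYTES, channels=2)
mono_hours = pcm_hours(DATASET_BYTES, channels=1)
print(f"all stereo: about {stereo_hours:.0f} hours of audio")
print(f"all mono:   about {mono_hours:.0f} hours of audio")
```

These figures count audio across every file, so the same performance is counted once for each mix or microphone file that contains it.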
Each choir was recorded according to the same schema, using a multi-microphone array that captured 8 close-miked soloists, 4 room mics, and a first-order ambisonic microphone.
The audio was post-processed and mixed down to stereo files.
The dataset includes the stereo mixes from the different microphone sources, a main mix and the isolated recordings from each microphone.
The dataset is available in 3 segments: Complete WAV, Complete Ogg, and Preview Ogg.
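After downloading files from the Complete WAV segment, it can be useful to confirm that a file matches the stated 48,000 Hz, 24-bit format. The following is a minimal sketch using only Python's standard-library `wave` module; the helper name `describe_wav` and any file path passed to it are hypothetical. It assumes the files are plain PCM WAVs. Files written with the WAVE_FORMAT_EXTENSIBLE header may not open with `wave` on older Python versions, in which case a third-party audio library would be needed.

```python
import wave

# Format stated for the complete WAV set.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 3  # 24-bit


def describe_wav(path):
    """Read a WAV header and report its basic format (hypothetical helper)."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bit_depth": width * 8,
            "duration_s": frames / rate,
            "matches_spec": (rate == EXPECTED_RATE_HZ
                             and width == EXPECTED_SAMPLE_WIDTH_BYTES),
        }
```

Calling `describe_wav` on a local file path returns a small dictionary; a `matches_spec` value of `False` suggests the file was converted or resampled somewhere along the way.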
Curated by: Serpentine Arts Technologies with Holly Herndon and Mat Dryhurst
Data Steward: Jennifer Ding
License: LINK TO LICENCE
Code of Conduct: LINK TO CODE OF CONDUCT