Nijmegen Corpus of Casual Spanish

The Nijmegen Corpus of Casual Spanish contains around 30 hours of high-quality recordings of 52 Spanish speakers from Madrid conversing among friends. The speech has been orthographically transcribed by professional transcribers. The transcriptions are stored as Transcriber XML files and Praat TextGrid files, as well as in ELAN .eaf format.

The corpus is available to academic researchers. If you would like to obtain a copy of the corpus, please contact the Radboud University Faculty of Arts data officer by e-mail at dataofficer@let.ru.nl.

A detailed description of the corpus is provided in:

  • F. Torreira & M. Ernestus (2012). Weakening of intervocalic /s/ in the Nijmegen Corpus of Casual Spanish. Phonetica 69, 124-148.

This project was funded by a European Young Investigator Award to Mirjam Ernestus. The corpus was recorded by Francisco Torreira at the Universidad Politécnica de Madrid in March 2008 as part of his dissertation work at the Radboud University Nijmegen. The orthographic transcription was carried out by Verbio Speech Technologies S.L. in Spain.