Nijmegen Corpus of Spanish English

The Nijmegen Corpus of Spanish English (NCSE) contains 38.5 hours of high-quality recordings of English speech produced by 34 native Spanish speakers in interaction with two native Dutch confederates. The NCSE contains a formal and an informal recording for each Spanish speaker. The speech has been orthographically transcribed into Praat TextGrid files.

The corpus is available to academic researchers. To obtain access, send a request by e-mail to the Radboud University Faculty of Arts data officer at dataofficer@let.ru.nl. The data officer will send you a data use agreement. Once the agreement has been signed, you will be granted access to the corpus.

A detailed description of the corpus is provided in:

Kouwenhoven, H., M. Ernestus & M. van Mulken (accepted). Register variation by Spanish users of English: The Nijmegen Corpus of Spanish English. Corpus Linguistics and Linguistic Theory.

The NCSE was recorded by Huib Kouwenhoven in the laboratory of the Grupo de Tecnología del Habla at the Escuela Técnica Superior de Ingenieros de Telecomunicación of the Universidad Politécnica de Madrid, as part of his dissertation work at Radboud University. The orthographic transcription was produced at the Centre for Language Studies of Radboud University under the direction of Huib Kouwenhoven and myself.