Nijmegen Corpus of Spanish English

The Nijmegen Corpus of Spanish English (NCSE) contains 38.5 hours of high-quality recordings of English speech produced by 34 native Spanish speakers in interaction with two native Dutch confederates. The NCSE contains a formal and an informal recording for each Spanish speaker. The speech has been orthographically transcribed into Praat TextGrid files.
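
As a rough illustration of how such transcriptions can be read programmatically, the sketch below pulls the labelled intervals out of a TextGrid stored in Praat's long text format. It assumes that format. The function name `read_intervals` and the sample content are hypothetical and are not taken from the NCSE. The actual tier names and file encoding of the corpus are not described here, so check them against the files themselves.

```python
import re

# Matches one interval in Praat's long text TextGrid format.
# Praat escapes a double quote inside a label by doubling it ("").
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([\d.eE+-]+)\s*'
    r'xmax = ([\d.eE+-]+)\s*'
    r'text = "((?:[^"]|"")*)"'
)

def read_intervals(textgrid_text):
    """Return (start, end, label) tuples for all non-empty intervals.

    Intervals from all tiers are returned in file order; this sketch
    does not keep track of which tier an interval belongs to.
    """
    intervals = []
    for start, end, label in INTERVAL_RE.findall(textgrid_text):
        label = label.replace('""', '"').strip()
        if label:  # skip unlabelled stretches such as silence
            intervals.append((float(start), float(end), label))
    return intervals

# Invented sample for illustration only; not NCSE data.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 2.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "speaker"
        xmin = 0
        xmax = 2.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 1.2
            text = "hello there"
        intervals [2]:
            xmin = 1.2
            xmax = 2.5
            text = ""
'''

if __name__ == "__main__":
    for start, end, label in read_intervals(SAMPLE):
        print(start, end, label)
```

For a real file, the text would first be read from disk with the appropriate encoding (Praat can write UTF-8 or UTF-16) and then passed to `read_intervals`.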

The corpus is available to academic researchers. If you would like to obtain a copy of the corpus, you can contact the Radboud University Faculty of Arts data officer by e-mail.

A detailed description of the corpus is provided in:
  • Kouwenhoven, H., M. Ernestus & M. van Mulken (accepted). Register variation by Spanish users of English. The Nijmegen Corpus of Spanish English. Corpus Linguistics and Linguistic Theory.

The NCSE was recorded by Huib Kouwenhoven in the laboratory of the Grupo de Tecnología del Habla at the Escuela Técnica Superior de Ingenieros de Telecomunicación of the Universidad Politécnica de Madrid, as part of his dissertation work at Radboud University. The orthographic transcription was produced at the Centre for Language Studies of Radboud University under the direction of Huib Kouwenhoven and me.