The DELNN corpus

DELNN: Dutch English Lombard Native and Non-native Speech

The DELNN corpus consists of recordings of 30 female native speakers of Dutch and 9 female native speakers of English. They all read 144 question-answer pairs in English. The 144 answers featured 36 target words:

- 12 words that started with "th" (e.g. "theater");

- 12 words that ended in a voiced obstruent (e.g. "club");

- 12 Dutch-English cognates containing a schwa that is spelled with a full vowel and is followed by a stressed syllable (e.g. "parade").

Each target word occurs in two sentences in which it carries sentence accent and in two sentences in which it does not. Half of the sentences were produced in Lombard speech (that is, were recorded with background noise).

The native speakers of Dutch also read 96 question-answer pairs in Dutch, with 24 target words, 12 of which were the Dutch translations of the English target words with schwa. These 96 question-answer pairs represented the same four conditions as the English question-answer pairs (accented versus non-accented target word crossed with Lombard versus plain speech).
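The counts above follow from crossing the target words with the four conditions. As a minimal sketch, assuming one question-answer pair per target word in each accent-by-speech-style cell (an assumption consistent with the totals, not a statement about how the corpus files are organised), the design can be enumerated as follows. The function name and condition labels are hypothetical and chosen for illustration only.

```python
from itertools import product

# Condition labels are illustrative; the corpus may use different names.
ACCENT = ("accented", "non-accented")
STYLE = ("Lombard", "plain")


def design(n_target_words):
    """Return one (word_index, accent, style) tuple per question-answer pair.

    Assumes each target word appears once in every accent x style cell.
    """
    return list(product(range(n_target_words), ACCENT, STYLE))


# English: 12 "th" words + 12 voiced-obstruent words + 12 schwa cognates.
english = design(12 + 12 + 12)
# Dutch: 24 target words.
dutch = design(24)

print(len(english))  # 144
print(len(dutch))    # 96

# Half of the pairs in each language fall in the Lombard condition.
lombard_english = [pair for pair in english if pair[2] == "Lombard"]
print(len(lombard_english))  # 72
```

Under this assumption, 36 target words times 4 conditions gives the 144 English pairs, and 24 target words times 4 conditions gives the 96 Dutch pairs.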

The recordings are aligned at the word and the phone level.

A detailed description of the corpus is provided in:

  • K. Marcoux & M. Ernestus (2024). Acoustic characteristics of non-native Lombard speech in the DELNN corpus. Journal of Phonetics, 102, 101281.

The corpus is freely available for research purposes.

This project was funded by the European Union’s Horizon 2020 research and innovation programme (Marie Skłodowska-Curie grant No. 675324).