The CHiME-5 distant-microphone dinner party speech corpus

The CHiME-5 dataset comprises over 50 hours of conversational speech recorded at twenty real dinner parties held in real homes. The recordings were made using multiple 4-channel microphone arrays and have been fully transcribed.

The dataset features:

  • simultaneous recordings from multiple microphone arrays;
  • real conversation, i.e. talkers speaking in a relaxed and unscripted fashion;
  • a range of room acoustics from 20 different homes, each with two or three separate recording areas;
  • real domestic noise backgrounds, e.g., kitchen appliances, air conditioning, and movement.

Fully-transcribed utterances are provided in continuous audio with ground truth speaker labels and start/end time annotations for segmentation.
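As an illustration of how start/end time annotations can drive segmentation of continuous audio, here is a minimal Python sketch. It is not the dataset's own tooling: the `Utterance` fields, the `H:MM:SS.ss` timestamp format, and the 16 kHz sample rate in the usage lines are assumptions made for this example, so check the dataset instructions for the actual annotation layout.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One annotated utterance (hypothetical field names)."""
    speaker: str   # ground truth speaker label
    start: float   # start time in seconds
    end: float     # end time in seconds
    words: str     # transcription


def to_seconds(timestamp: str) -> float:
    """Convert an assumed 'H:MM:SS.ss' timestamp to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def sample_range(utt: Utterance, sample_rate: int) -> tuple[int, int]:
    """Map an utterance's start/end times to sample indices."""
    return round(utt.start * sample_rate), round(utt.end * sample_rate)


def segment(audio, utt: Utterance, sample_rate: int):
    """Slice one utterance out of a continuous recording (any sliceable sequence)."""
    first, last = sample_range(utt, sample_rate)
    return audio[first:last]


# Usage with made-up values; 16000 Hz is an assumed sample rate.
utt = Utterance("P01", to_seconds("0:01:02.50"), to_seconds("0:01:04.00"), "hello")
first, last = sample_range(utt, 16000)
```

Because every array records the same session, the same time annotations can in principle be applied to each channel, subject to any clock drift between devices.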

The dataset was used for the 5th CHiME Speech Separation and Recognition Challenge. Further information and an open-source baseline speech recognition system are available online (http://spandh.dcs.shef.ac.uk/chime_challenge/chime2018).

  • Authors (1)
    Professor Jonathan Barker
  • References (1)
    1. Trmal, Vincent, Watanabe, Barker (2018), The Fifth "CHiME" Speech Separation and Recognition Challenge: Dataset, Task and Baselines, Proc. Interspeech 2018, 1561-1565
  • Downloads (1)
    * CHiME-5-Instructions-Non-Commercial download.pdf
    size: 35 KB, type: application/pdf
    Files marked with an asterisk (*) can only be downloaded by users who hold the appropriate product licence. The licence must be active and you must be logged in to your account.
Commercial - Data FXD licence - CHiME-5
Data licence - commercial

Term: perpetual

Price per 1 unit:
£2000.00 excl. VAT

Non-commercial/academic - Data FOC licence - CHiME-5
Data licence - non-commercial

Term: perpetual

Price per 1 unit:
£0.00 excl. VAT