CHIL Head Orientation


This database consists of a set of sentences played back by a loudspeaker at different positions and orientations. It was recorded to analyze and evaluate algorithms that estimate a talker's position and orientation.
The use of a loudspeaker guarantees a precise reference for both position and orientation.

Description of the recording audio setup

The database was recorded in the CHIL room available at ITC-irst with 7 T-shaped microphone arrays distributed along the walls. Each T-shaped array consists of 4 microphones. Data were recorded at 44.1 kHz with 16-bit precision. All the arrays were synchronized through a clock on a BNC connector. The signal format is raw little-endian. All the arrays except T0 are time-aligned; array T0 can be aligned using a periodic prearranged signal.
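As an illustration of the signal format described above, the following minimal Python sketch reads one channel file. It assumes each file is headerless, single-channel, and stores signed 16-bit samples; the description only states "RAW Little Endian" at 44.1 kHz and 16 bits, so signedness and the one-channel-per-file layout are assumptions. The function names are hypothetical.

```python
import sys
from array import array

SAMPLE_RATE_HZ = 44_100  # sampling rate stated in the database description


def load_raw_channel(path):
    """Read a headerless file as 16-bit little-endian samples.

    Assumption: samples are signed and the file holds a single channel.
    """
    samples = array("h")  # "h" = signed 16-bit integer on common platforms
    with open(path, "rb") as f:
        samples.frombytes(f.read())
    # The file is little-endian; swap bytes only on a big-endian host.
    if sys.byteorder == "big":
        samples.byteswap()
    return samples


def duration_seconds(samples):
    """Length of a sample sequence in seconds at 44.1 kHz."""
    return len(samples) / SAMPLE_RATE_HZ
```

Calling load_raw_channel on one of the channel files and passing the result to duration_seconds gives a quick check that the file length matches what is expected.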

Description of the recording procedure

The collected database consists of a single audio sequence reproduced several times by a loudspeaker placed at different positions and pointing in different directions. The loudspeaker is a Tannoy System 600A. The audio sequence includes different acoustic events (chirp, speech, fricative sounds, white and brown noise) for an overall length of 37.7 seconds. The loudspeaker, 157 cm high, reproduced the same sequence at 12 positions, 120 cm apart, in order to cover the whole room. At each position the audio sequence was reproduced 8 times, rotating the loudspeaker by 45 degrees each time. Then, at the same position, 7 more directions were recorded by aiming the loudspeaker at each T-shaped microphone array in turn.
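The orientation scheme can be sketched in Python as follows. The 8 fixed rotations follow directly from the 45-degree step; the 7 array-aimed directions depend on the room geometry, which is not given here, so any coordinates passed to the function are hypothetical. Measuring angles counter-clockwise from the x axis is also an assumption, not the database's documented convention.

```python
import math


def rotation_angles(step_deg=45):
    """Fixed loudspeaker rotations covering a full turn: 0, 45, ..., 315."""
    return [k * step_deg for k in range(360 // step_deg)]


def azimuth_towards(source_xy, target_xy):
    """Angle in degrees, in [0, 360), from source to target.

    Assumed convention: counter-clockwise from the positive x axis.
    """
    dx = target_xy[0] - source_xy[0]
    dy = target_xy[1] - source_xy[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def orientations_for_position(source_xy, array_centres):
    """All orientations at one position: fixed rotations plus one per array.

    array_centres maps an array name to hypothetical (x, y) coordinates.
    """
    fixed = [float(a) for a in rotation_angles()]
    aimed = [azimuth_towards(source_xy, c) for c in array_centres.values()]
    return fixed + aimed
```

With 7 array centres supplied, orientations_for_position returns 15 angles, matching the 8 + 7 orientations recorded at each position.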

Description of the database

The database consists of a set of files, each one representing a single audio channel from the arrays. Recordings were made at 12 positions with 15 loudspeaker orientations each (8 rotations plus 7 array-aimed directions). This yields 180 different recordings of the same audio sequence, stored in 28 audio files (7 arrays with 4 microphones each). The total length of the audio signals is about 110 minutes.
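The figures above can be cross-checked with a few lines of Python. All constants come from this description; the total is pure playback time, so it is only expected to be close to the quoted length of about 110 minutes.

```python
N_POSITIONS = 12       # loudspeaker positions
N_ROTATIONS = 8        # rotations in 45-degree steps
N_ARRAYS = 7           # T-shaped arrays, each also used as an aiming target
MICS_PER_ARRAY = 4     # microphones per array
SEQUENCE_S = 37.7      # length of the audio sequence in seconds

orientations = N_ROTATIONS + N_ARRAYS            # orientations per position
recordings = N_POSITIONS * orientations          # repetitions of the sequence
channels = N_ARRAYS * MICS_PER_ARRAY             # one audio file per channel
total_minutes = recordings * SEQUENCE_S / 60.0   # playback time only

print(orientations, recordings, channels)  # 15 180 28
print(f"{total_minutes:.1f} minutes")      # 113.1 minutes
```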