Distant-speech interaction

Speech interaction with distant microphones is a crucial step towards the deployment of flexible and non-invasive voice-enabled interfaces in novel application contexts, such as the smart home. In general, the distortion that the environment introduces into the signal, due to the distance between user and microphone, causes a significant reduction in speech recognition accuracy compared with the performance obtainable in an ideal close-talking condition, i.e., with the speaker a few centimeters from the microphone. This problem is well known in the scientific community and is being tackled by an increasing number of labs and researchers. To further highlight the complexity of this challenge, it is worth noting that no speech recognition technology currently available on the market provides acceptable performance when used in a distant-talking setting, with the speaker a few meters away from the microphone.

A description of our past activities on this issue can be found here.

Since January 2012 the unit has been coordinating a new EC project, DIRHA (see http:/, whose reference scenario is an automated home that is equipped with a microphone network and can be voice-controlled from any room. The challenges addressed by the project are manifold, but they mainly concern allowing the user to interact with the system through spoken dialogue sessions even while other speakers, noise, and interfering sources are active. The targeted system will also have to operate in five different languages. More details about the project can be found here.

The unit is also working on application-oriented activities, in particular under DOMHOS (Domotic and hospital speech interaction), a project that is funded at the local level (FESR) and conducted in cooperation with two local small enterprises (DomoticArea s.p.a. and UniHospital s.p.a.). The project aims to bring to market technologies for distant-speech interaction in the home and in the surgery room:
- in the former case, the strategy is to transfer established technological components, with this innovation action then supported in the longer term by the innovative results expected from the EC project DIRHA;
- in the latter case, the project deals with the application of similar technologies in a surgery room (the study involves the S. Chiara Hospital of Trento), which is a very complex environment from an acoustic point of view due to the large number of personnel and noise sources normally present in it.

Finally, we are working on the development of embedded solutions for distant-speech interaction. The study on adapting these technologies to the embedded context (i.e., a software implementation on small, low-cost, low-power computing platforms) started in 2012. In our case, the acoustic and speech signals are acquired by an array of miniaturized digital MEMS microphones. More details on our recent activities can be found here. Such solutions can be the starting point for further important technology transfer actions, given the very limited cost of realizing them and the small size of the targeted product.