Audiovisual data from 10 speakers were selected from CAVSR1.0, a Chinese auditory-visual bimodal speech database. Each speaker produced 14 syllables: /ba, bi, bian, biao, bin, de, di, dian, duo, dong, gai, gan, gen, gu/. The stimuli for the perception experiment fell into three classes: audio only, audio plus visual, and visual only. The audio-only and audio-plus-visual classes each covered five acoustic conditions: noise-free speech, and speech at signal-to-noise ratios (S/N) of 0, -8, -12, and -16 dB. Twenty observers performed the perceptual identification task. Analysis of the results shows that humans identify visual-only signals fairly well; that the manner and place of articulation of the initial, together with the final, produce distinct visual differences; and that in noisy conditions visual information compensates markedly for the degraded auditory information and substantially raises the correct identification rate.
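The abstract does not say how the noisy stimuli were produced or what kind of noise was used. As an illustration only, the following minimal sketch shows one common way to reach a target S/N: scale a noise signal so that the ratio of speech power to noise power matches the requested value in dB, then add it to the speech. The white Gaussian noise, the sine wave standing in for a recorded syllable, the 16 kHz sampling rate, and the function names `mix_at_snr` and `measured_snr_db` are all assumptions made for this example, not details from the paper.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db`, then add it.

    Returns the noisy mixture and the scaled noise (the latter is handy for checks).
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Target noise power is p_speech / 10^(snr_db / 10); gain is applied to amplitude.
    gain = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


def measured_snr_db(speech, scaled_noise):
    """Signal-to-noise ratio in dB, computed from mean signal powers."""
    return 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))


# Hypothetical stand-ins: a 1 s, 220 Hz tone instead of a recorded syllable,
# and white Gaussian noise (the noise type in the study is not stated).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech = np.sin(2 * np.pi * 220 * t)
noise = np.random.default_rng(0).standard_normal(sample_rate)

# The four noisy conditions listed in the abstract.
for snr_db in (0, -8, -12, -16):
    noisy, scaled = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, scaled), 2))
```

At -16 dB the noise power is roughly 40 times the speech power, which gives a sense of how heavily the auditory channel is degraded in the hardest condition, where the visual channel contributes most.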