Audio-Visual Underdetermined Blind Source Separation Algorithm Based on Gaussian Potential Function

Source: China Communications
Most existing algorithms for the underdetermined blind source separation (UBSS) problem are two-stage algorithms, i.e., mixing-parameter estimation followed by source estimation. In the mixing-parameter estimation stage, the traditional clustering algorithms proposed previously are sensitive to the initialization of the mixing parameters. To reduce this sensitivity, we propose a new algorithm for the UBSS problem with anechoic speech mixtures that employs visual information to initialize the mixing parameters, i.e., the interaural time difference (ITD) and the interaural level difference (ILD). In our algorithm, the video signals are used to estimate the distances between the microphones and the sources, from which estimates of the ITD and ILD are obtained. Under the sparsity assumption in the time-frequency domain, the Gaussian potential function algorithm then estimates the mixing parameters, using these ITDs and ILDs as initial values. Finally, time-frequency masking recovers the sources by evaluating the various ITDs and ILDs. Experimental results demonstrate that the proposed algorithm performs competitively with the baseline algorithms.
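As a rough illustration of how microphone-to-source distances could yield initial ITD and ILD values, the following minimal sketch assumes an anechoic, free-field model with inverse-distance amplitude decay and a fixed speed of sound. The abstract does not give the paper's actual formulas, so the function name `initial_mixing_parameters`, the constant `SPEED_OF_SOUND`, and the decay model are assumptions made here for illustration only.

```python
import math

# Assumed speed of sound in air (m/s); not a value taken from the paper.
SPEED_OF_SOUND = 343.0


def initial_mixing_parameters(d1, d2, c=SPEED_OF_SOUND):
    """Return (itd_seconds, ild_db) for one source.

    d1, d2: distances (in metres) from the source to microphone 1 and
    microphone 2, e.g. as estimated from video.
    Assumes free-field propagation with amplitude proportional to 1/distance.
    """
    if d1 <= 0 or d2 <= 0:
        raise ValueError("distances must be positive")
    # Extra travel time to microphone 2 relative to microphone 1.
    itd = (d2 - d1) / c
    # Level at microphone 2 relative to microphone 1, in decibels.
    ild_db = 20.0 * math.log10(d1 / d2)
    return itd, ild_db


if __name__ == "__main__":
    itd, ild = initial_mixing_parameters(1.0, 1.343)
    print(f"ITD = {itd * 1e3:.3f} ms, ILD = {ild:.2f} dB")
```

Under these assumptions, a source that is farther from microphone 2 gives a positive ITD and a negative ILD; such pairs would then serve only as starting points for the mixing-parameter estimation stage.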