As a unique form of biometric information, the voiceprint has been widely applied in daily life, for example in mobile payment and criminal investigation, and has broad market prospects. At the same time, increasingly sophisticated neural network architectures have been applied to speaker recognition, and recognition accuracy keeps improving, surpassing human performance in some scenarios. The purpose of an automatic speaker verification (ASV) system is to verify whether two test utterances belong to the same speaker, based on speaker characteristics extracted from the raw speech. Speech is a complex signal that conveys many types of information, such as linguistic content, speaker individuality, nationality, gender, and emotion. Some of this information is useful for ASV, while some is not; ASV performance can therefore be improved by enhancing meaningful information and suppressing useless information. However, the learning ability of a deep neural network (DNN) remains limited when only speaker labels are used in the training stage of the ASV system, because this practice ignores the interaction between different types of domain information.

Multitask learning (MTL) has recently been proposed to learn useful information for ASV by using speaker-irrelevant labels, while domain adversarial training (DAT) is designed to eliminate the effect of useless information by applying a gradient reversal layer (GRL) to the domain classification branch. Both methods have significantly improved ASV performance, and what they have in common is that they add more constraints during the DNN training stage. Many types of domain information have been shown to be useful for ASV, such as frame-level phonetic information, channel information, and signal-to-noise-ratio variability. Gender and nationality information is crucial in verifying a speaker's identity because it can serve as an additional check. In contrast, utterances produced by the same speaker in different emotional states vary significantly in their characteristics, which hinders the extraction of speaker-individual features and decreases the accuracy of ASV.

Subjectively, gender and nationality are speaker-invariant: for a given speaker in a training database they do not change, and they can provide additional evidence for authenticating speaker identity. These two types of information should therefore benefit ASV. By contrast, emotion can change across speaking scenarios, which lowers the cosine similarity score of a test pair even when the two utterances come from the same speaker; it therefore has to be suppressed. Based on these considerations, this thesis investigates the effects of gender, nationality, and emotion information on the performance of ASV systems. Four systems are proposed using MTL- and DAT-based methods. Specifically, the MTL-based systems, comprising multitask gender (MTG), multitask nationality (MTN), and multitask gender and nationality (MTGN), enhance the learning of gender and nationality information during network training, while the DAT-based system, emotion domain adversarial training (EDAT), suppresses the learning of emotion information. Experimental results indicate that encouraging gender and nationality information and suppressing emotion information improves the performance of ASV. The proposed systems achieve 16.4% and 22.9% relative improvements in equal error rate for the MTL- and DAT-based systems, respectively.
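The MTL objective described above can be sketched as a weighted sum of the speaker loss and the auxiliary task losses. This is a minimal illustration, not the thesis code; the function name and the weights `alpha` and `beta` are hypothetical placeholders for however the thesis balances its tasks.

```python
def mtl_loss(loss_speaker, loss_gender, loss_nationality,
             alpha=0.1, beta=0.1):
    """Multitask objective: speaker loss plus weighted auxiliary losses.

    With both weights nonzero this corresponds to an MTGN-style setup;
    setting beta=0 leaves only the gender task (MTG-style), and
    setting alpha=0 leaves only the nationality task (MTN-style).
    """
    return loss_speaker + alpha * loss_gender + beta * loss_nationality


total = mtl_loss(loss_speaker=1.0, loss_gender=2.0, loss_nationality=3.0)
```

Because the auxiliary labels are speaker-invariant, gradients from the gender and nationality branches push the shared layers toward representations that encode this extra evidence alongside speaker identity.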
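The gradient reversal layer at the heart of DAT acts as the identity in the forward pass and flips the sign of the gradient in the backward pass. The following is a minimal NumPy sketch of that mechanism (not the thesis implementation); the reversal strength `lambda_` is a hypothetical hyperparameter.

```python
import numpy as np

class GradientReversalLayer:
    """Identity forward; multiplies gradients by -lambda on the way back.

    Placed between a shared feature extractor and a domain classifier
    (here, an emotion classifier), it trains the extractor to *increase*
    the domain loss, so the learned speaker features carry as little
    emotion information as possible.
    """

    def __init__(self, lambda_=1.0):
        self.lambda_ = lambda_  # reversal strength hyperparameter

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        return -self.lambda_ * grad_output  # flip and scale the gradient


grl = GradientReversalLayer(lambda_=0.5)
features = np.array([1.0, -2.0, 3.0])
out = grl.forward(features)               # identical to the input
grad = grl.backward(np.ones_like(out))    # reversed, scaled gradient
```

In a full EDAT-style setup this layer would sit in front of the emotion classifier only, while the speaker classification branch receives ordinary gradients.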
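The cosine-similarity scoring that emotion variability degrades can be sketched as below. The embedding vectors are invented for illustration only; real speaker embeddings are high-dimensional outputs of the trained network.

```python
import numpy as np

def cosine_score(emb_a, emb_b):
    """Cosine similarity between two speaker embeddings, in [-1, 1]."""
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical embeddings: the same speaker in two emotional states,
# plus a different speaker. Emotion shifts the same-speaker embedding.
neutral = np.array([0.9, 0.1, 0.4])
angry   = np.array([0.7, 0.5, 0.1])
other   = np.array([-0.2, 0.9, 0.3])

same_pair = cosine_score(neutral, angry)   # same speaker, emotion mismatch
diff_pair = cosine_score(neutral, other)   # different speakers
```

A verification decision compares such a score against a threshold; emotion mismatch pulls `same_pair` down toward the impostor scores, which is why suppressing emotion information in the embedding helps.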