This paper proposes analyzing MPEG audio signals directly in the compressed domain, with the goal of real-time analysis and retrieval of television programs. The algorithm has three steps. First, compressed-domain features are used to segment the audio signal. Second, a hierarchical method coarsely classifies the resulting audio segments into three basic categories: music, speech, and other. Finally, because speaker identity is an important retrieval cue in speech, hidden Markov chains are used to perform text-independent speaker recognition, and the recognized speaker identities are used to annotate the speech signal and its corresponding video.
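The third step, text-independent speaker recognition with hidden Markov chains, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each speech segment has already been reduced to a sequence of discrete symbols (for example, vector-quantized compressed-domain features), that one discrete hidden Markov model has been trained per speaker, and that a segment is assigned to the speaker whose model gives the highest log-likelihood under the forward algorithm. The speaker names, model parameters, and function names below are hypothetical and chosen only for illustration.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (forward algorithm).

    start[i]   : probability of starting in state i
    trans[i][j]: probability of moving from state i to state j
    emit[i][k] : probability of emitting symbol k in state i
    """
    n = len(start)
    alpha = [_log(start[i]) + _log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + _log(trans[i][j]) for i in range(n)])
            + _log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def identify_speaker(obs, models):
    """Score obs against every speaker model and return (best_name, scores)."""
    scores = {
        name: forward_log_likelihood(obs, *params)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical, hand-picked models: 2 hidden states, 3 observation symbols.
MODELS = {
    "speaker_a": (
        [0.6, 0.4],
        [[0.7, 0.3], [0.4, 0.6]],
        [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]],
    ),
    "speaker_b": (
        [0.5, 0.5],
        [[0.6, 0.4], [0.3, 0.7]],
        [[0.1, 0.2, 0.7], [0.2, 0.3, 0.5]],
    ),
}

if __name__ == "__main__":
    segment = [0, 0, 1, 0, 0]  # assumed quantized feature symbols for one segment
    best, scores = identify_speaker(segment, MODELS)
    print(best, scores)
```

Because the score depends only on the statistics of the symbol sequence and not on what is said, the same models can be applied to arbitrary speech, which is what makes the recognition text-independent. The returned label could then be attached to the speech segment and to the video covering the same time span.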