In low-bit-rate vocoders, the way the excitation signal is described directly affects the quality of the reconstructed speech. To improve speech quality, a DCT-M model is introduced to describe the excitation spectral amplitude parameters: a two-dimensional discrete cosine transform converts the variable-length spectral amplitude vector to a fixed length, and the result is then coded with multi-stage vector quantization. Tests show that the method preserves the shape of the full-band excitation spectral amplitude vector and reduces the model error, which improves the accuracy with which the full-band excitation spectral amplitudes are described. When the method is applied in a sinusoidal excitation linear prediction (SELP) vocoder, the results show that it improves the naturalness of the reconstructed speech, with the subjective test result reaching 65%.
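The two processing steps named above, conversion to a fixed length followed by multi-stage vector quantization, can be illustrated with a minimal sketch. It is not the DCT-M model itself: as a simplifying assumption it uses a one-dimensional orthonormal DCT with truncation or zero-padding in place of the two-dimensional transform, and the function names, the fixed length `m`, and the toy codebooks are all hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (its transpose is its inverse)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def to_fixed_length(amps, m):
    """Map a variable-length amplitude vector to m DCT coefficients.

    Assumption: a 1-D DCT with truncation/zero-padding stands in for
    the 2-D transform described in the text.
    """
    amps = np.asarray(amps, dtype=float)
    coeffs = dct_matrix(len(amps)) @ amps
    fixed = np.zeros(m)
    keep = min(m, len(amps))
    fixed[:keep] = coeffs[:keep]
    return fixed

def from_fixed_length(fixed, n):
    """Rebuild an n-point amplitude vector from the fixed-length coefficients."""
    coeffs = np.zeros(n)
    keep = min(n, len(fixed))
    coeffs[:keep] = fixed[:keep]
    return dct_matrix(n).T @ coeffs

def msvq_encode(x, codebooks):
    """Multi-stage VQ: each stage quantizes the residual left by the previous one."""
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:
        idx = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices

def msvq_decode(indices, codebooks):
    """Sum the selected codeword from every stage."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

# Toy two-stage codebooks (hypothetical values, for illustration only).
toy_codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),
    np.array([[0.0, 0.0], [0.5, -0.5]]),
]
```

In this sketch, a frame with `n` harmonics would be passed through `to_fixed_length(amps, m)`, coded with `msvq_encode` against codebooks of dimension `m`, and rebuilt at the decoder with `msvq_decode` followed by `from_fixed_length(fixed, n)`. When `n <= m` the transform step is lossless, so the only error comes from quantization; when `n > m` the highest-order DCT coefficients are discarded.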