This paper proposes a speaker adaptation method that combines maximum a posteriori (MAP) estimation with weighted neighbors regression (WNR). In MAP adaptation, only the model parameters for which adaptation data has been observed can be adjusted. To address this shortcoming, the paper proposes WNR, a transform-based model interpolation/smoothing method. WNR uses the neighborhood information of each model together with the MAP adaptation results to build a distance-weighted regression model, which is then used to adjust the models that have no adaptation data. Experiments show that the method effectively increases the speed of MAP adaptation. With 10 adaptation sentences, the syllable error rate is reduced by nearly 15%; with 250 adaptation sentences, it is reduced by more than 50%. In addition, the paper proves that vector field smoothing (VFS) is a degenerate special case of the WNR method.
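The abstract does not give the WNR equations, so the following is only a minimal sketch of the general idea under stated assumptions: models are represented by mean vectors, neighbors are chosen by Euclidean distance between speaker-independent means, weights decay exponentially with distance, and the regression is fitted independently in each dimension. The function name `wnr_adjust` and the parameters `k`, `scale`, and `translation_only` are hypothetical and do not come from the paper.

```python
import math


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def wnr_adjust(target, si_means, map_means, k=3, scale=1.0,
               translation_only=False):
    """Estimate an adapted mean for `target`, a speaker-independent mean
    that received no adaptation data.

    si_means:  speaker-independent means of models that DID see data
    map_means: the MAP-adapted means of those same models
    All modelling choices here are illustrative assumptions.
    """
    # Pick the k nearest adapted neighbours in speaker-independent space.
    order = sorted(range(len(si_means)),
                   key=lambda i: _sq_dist(target, si_means[i]))[:k]
    # Assumed weighting: closer neighbours get exponentially larger weights.
    weights = [math.exp(-_sq_dist(target, si_means[i]) / scale) for i in order]
    total = sum(weights)

    adjusted = []
    for d in range(len(target)):
        xs = [si_means[i][d] for i in order]
        ys = [map_means[i][d] for i in order]
        x_bar = sum(w * x for w, x in zip(weights, xs)) / total
        y_bar = sum(w * y for w, y in zip(weights, ys)) / total
        var_x = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs)) / total
        if translation_only or var_x < 1e-12:
            # Degenerate case: slope fixed to 1, so the result is the target
            # plus the weighted average shift of its neighbours.
            slope = 1.0
        else:
            cov = sum(w * (x - x_bar) * (y - y_bar)
                      for w, x, y in zip(weights, xs, ys)) / total
            slope = cov / var_x  # weighted least-squares slope
        adjusted.append(y_bar + slope * (target[d] - x_bar))
    return adjusted


if __name__ == "__main__":
    si = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
    adapted = [[1.0, 1.0], [3.0, 5.0], [5.0, 9.0]]
    print(wnr_adjust([1.5, 3.0], si, adapted))
    print(wnr_adjust([1.5, 3.0], si, adapted, translation_only=True))
```

With `translation_only=True` the regression collapses to adding a distance-weighted average of the neighbors' shift vectors, which is one way to read the abstract's statement that VFS is a degenerate special case of WNR. The sketch does not guard against all weights underflowing to zero when `scale` is very small relative to the distances.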