Putonghua prosody exhibits a hierarchical structure under the influence of its linguistic environment. Based on this observation, a neural network with specially weighted factors and optimized outputs is described and applied to construct the Putonghua prosodic model in a Text-to-Speech (TTS) system. Extensive tests show that the structure of the neural network characterizes Putonghua prosody more exactly than traditional models. Learning is sped up and computational precision is improved, which makes the whole prosodic model more efficient. Furthermore, the paper stylizes Putonghua syllable pitch contours with SPiS parameters (Syllable Pitch Stylized Parameters) and analyzes their use in adjusting the syllable pitch. The results show that the SPiS parameters effectively characterize Putonghua syllable pitch contours and facilitate both the establishment of the network model and prosodic control.