Semantic textual similarity (STS) is a common task in natural language processing (NLP). STS measures the degree of semantic equivalence between two text snippets. Recently, machine learning methods have been applied to this task, including methods based on support vector regression (SVR). However, the learning process involves a large number of features, some of which are noisy and irrelevant to the result. Furthermore, the choice of parameters significantly influences the prediction performance of the SVR model. In this paper, we propose using a genetic algorithm (GA) to select the effective features and optimize the parameters simultaneously during learning. To evaluate the proposed approach, we run experiments on the STS-2012 dataset. Compared with grid search, the proposed GA-based approach achieves better regression performance.
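The idea of selecting features and tuning SVR parameters in one GA run can be illustrated with a minimal sketch. It is not the authors' implementation: this excerpt does not give the encoding, the genetic operators, or the fitness function, so everything below is an assumption. Each chromosome is a bit list whose first `n_features` bits form a feature mask, followed by 8 bits for each of three hypothetical SVR parameters (`C`, `gamma`, `epsilon`) decoded on a log2 scale. The ranges in `PARAM_RANGES`, the tournament selection, the one-point crossover, and `toy_fitness` are illustrative choices. In a real setup the fitness would presumably be a cross-validated score of an SVR trained on the masked features.

```python
import math
import random

# Hypothetical log2 search ranges for the SVR parameters (not from the paper).
PARAM_RANGES = {"C": (-5.0, 15.0), "gamma": (-15.0, 3.0), "epsilon": (-8.0, 0.0)}
BITS_PER_PARAM = 8


def decode(chrom, n_features):
    """Split a bit list into a feature mask and a dict of SVR parameters."""
    mask = chrom[:n_features]
    params = {}
    pos = n_features
    for name, (lo, hi) in PARAM_RANGES.items():
        bits = chrom[pos:pos + BITS_PER_PARAM]
        pos += BITS_PER_PARAM
        # Map the bit group to [0, 1], then to the log2 range of the parameter.
        frac = int("".join(map(str, bits)), 2) / (2 ** BITS_PER_PARAM - 1)
        params[name] = 2.0 ** (lo + frac * (hi - lo))
    return mask, params


def run_ga(fitness, n_features, pop_size=30, generations=40,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Maximize fitness(mask, params) with a simple generational GA."""
    rng = random.Random(seed)
    length = n_features + BITS_PER_PARAM * len(PARAM_RANGES)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]

    def score(chrom):
        return fitness(*decode(chrom, n_features))

    best = max(pop, key=score)
    for _ in range(generations):
        scored = [(score(c), c) for c in pop]

        def tournament():
            # Binary tournament: pick two individuals, keep the fitter one.
            a, b = rng.sample(scored, 2)
            return (a if a[0] >= b[0] else b)[1]

        children = [best[:]]  # elitism: the best chromosome survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, length)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to every gene.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=score)
    return decode(best, n_features)


def toy_fitness(mask, params):
    """Stand-in for a cross-validated SVR score (purely illustrative).

    Rewards the first three features, penalizes the rest as "noisy",
    and prefers C close to 2**5.
    """
    useful, noisy = sum(mask[:3]), sum(mask[3:])
    return useful - 0.5 * noisy - 0.1 * abs(math.log2(params["C"]) - 5.0)


if __name__ == "__main__":
    mask, params = run_ga(toy_fitness, n_features=8)
    print("selected features:", mask)
    print("parameters:", params)
```

Because the mask and the parameters sit on the same chromosome, every fitness evaluation scores one feature subset together with one parameter setting, so the search can exploit interactions between the two. A grid search over parameters alone keeps the feature set fixed.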