论文部分内容阅读
Background: βαβ motif is an important super secondary structure in proteins.In strandloop-helix-loop-strand structures, if two parallel β-strands are connected by a α-helix, and there are one or more hydrogen bonds between two adjacent strands, then the structure is called as βαβ motifs, otherwise it is considered as non-βαβ motifs.The function of proteins is closely related with their structure, thus study of protein structure is very important to understand its function.In the high-throughput era, by using the theoretical method to predict protein structure has become one of the important ways in biology research.It is very difficult to directly predict the tertiary structure from sequence.Super secondary structure, especially βαβ motif, is a bridge between secondary structure and tertiary structure.βαβ motifs often appear in the bacillus subtilis protease, and many function sites occur in the βαβ motifs, therefore, prediction of βαβ motif has very important meaning.It provides theoretical direction for drug molecules design.Methods: We constructed a new dataset, which contains 4277 βαβ motifs and 3366 nonβαβ motifs with sequence identity <25% and resolution < 3.0 (A).Support Vector Machine (SVM) algorithm is used to predict βαβ motif by using increment of diversity(ID) values, matrix scoring(MS) values and amino acids component as parameters.Results: In this paper, by using the ID values, MS values and amino acids component to express the sequence information, then we combined these parameters as SVM input.The predictive performance is obtained.The overall accuracy and Matthews correlation coefficient of 5-fold cross-validation achieve 77.7% and 0.527, the sensitivity of βαβ motifs and non-βαβ motifs achieve respectively 83.6% and 68.4%, the specificity of βαβ motifs and non-βαβ motifs achieve respectively 79.1% and 74.4 %.Conclusions: The MS values can reflect the conservatism of the amino acids sequence, and ID value is the sequence informations secondary refining.The SVM algorithm is a convex quadratic optimization problem, so it is the effective method to predict small sample.SVM algorithm can effectively syncretize useful parameters.In general, SVM algorithm by using the ID values and MS values to predict βαβ motifs is a helpful method .