Identification of blood-secretory human proteins using Support Vector Machines

Source: The 5th National Conference on Bioinformatics and Systems Biology (第五届全国生物信息学与系统生物学学术大会)
  Background: In recent years, secreted proteins have been identified as markers for disease typing and staging and as targets for drug development. Computational identification of blood-secretory human proteins, especially proteins whose genes are highly and abnormally expressed in diseased human tissues such as cancers, can provide useful information to proteomic studies aimed at targeted discovery of disease biomarkers in serum.

  Methods: In this study, we used Support Vector Machines (SVMs) to predict blood-secretory human proteins. A set of 305 known blood-secretory human proteins served as the positive dataset. For the negative data, we randomly selected two datasets from Pfam protein families that contain none of the positive proteins. Each negative dataset contains 400 protein sequences.

  Results: Using amino acid composition as the only input vector, we achieved 89.8% accuracy with 89.0% sensitivity in the jackknife test. After further incorporating dipeptide composition and the hydropathy distribution into the input vector, the prediction results improved to 93.0% accuracy with 92.4% sensitivity in the jackknife test.

  Conclusions: We hope that these promising results, obtained with novel descriptors, will improve the identification of blood-secretory human proteins. The high accuracy should be helpful for further experimental study.
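  The two composition descriptors and the jackknife (leave-one-out) evaluation named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the hydropathy descriptor is omitted because the abstract does not say which scale or binning was used, and `nearest_centroid` is only a dependency-free stand-in for the SVM so that the example runs on its own.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def aa_composition(seq):
    """Return the 20-dimensional amino acid frequency vector of a sequence."""
    residues = [c for c in seq.upper() if c in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(a) / len(residues) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400-dimensional frequency vector of adjacent residue pairs.

    Pairs containing a non-standard residue are skipped (an assumption).
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def jackknife(features, labels, fit_predict):
    """Leave-one-out test: hold out each sample, train on the rest, predict it.

    `fit_predict(train_x, train_y, x)` is a hypothetical callback that trains
    a classifier and returns the predicted label (1 = positive) for `x`.
    Returns (accuracy, sensitivity).
    """
    tp = tn = fn = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        pred = fit_predict(train_x, train_y, features[i])
        if labels[i] == 1:
            if pred == 1:
                tp += 1
            else:
                fn += 1
        elif pred == 0:
            tn += 1
    accuracy = (tp + tn) / len(labels)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


def nearest_centroid(train_x, train_y, x):
    """Stand-in classifier (NOT an SVM): pick the class with the closer mean.

    Assumes both classes are still present in the training split.
    """
    def centroid(label):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min((0, 1), key=lambda lab: sq_dist(x, centroid(lab)))


# Toy sequences, invented purely for illustration.
toy_sequences = ["AAAAAA", "AAAAAC", "WWWWWW", "WWWWWY"]
toy_labels = [1, 1, 0, 0]
toy_features = [aa_composition(s) for s in toy_sequences]
toy_accuracy, toy_sensitivity = jackknife(toy_features, toy_labels, nearest_centroid)
```

  To reproduce the combined input vector described in the Results, the two feature lists could be concatenated per sequence (for example `aa_composition(s) + dipeptide_composition(s)`) and passed to `jackknife` with a callback that wraps an SVM from a library of the reader's choice.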