Post-translational modifications (PTMs), such as phosphorylation, acetylation and ubiquitination, play essential roles in the regulation of almost all biological processes and cellular signaling pathways. Recent progress in phosphoproteomics has identified nearly 500,000 phosphorylation sites. However, efficiently retrieving useful information from this flood of data remains a great challenge. To integrate both large-scale and small-scale experimental data, we developed a number of databases, including dbPPT, which contains 82,175 phosphorylation sites (p-sites) in 31,012 proteins for 20 plant species (http://dbppt.biocuckoo.org/); dbPSP, which contains 7,391 p-sites in 3,750 prokaryotic proteins (http://dbpsp.biocuckoo.org/); and dbPAF, which contains 483,001 p-sites of 54,148 proteins for human, animals and fungi (http://dbpaf.biocuckoo.org/). To predict potential site-specific kinase-substrate relations (ssKSRs), or kinase-specific p-sites, we further improved our Group-based Prediction System (GPS) algorithm by using over 6,000 known kinase-specific p-sites as the training data set. To generate an accurate classification map of protein kinases, we also collected 1,855 known kinases and 347 known phosphatases from the literature, further identified 50,433 kinases and 11,296 phosphatases in 84 eukaryotic species, and constructed the EKPD database (http://ekpd.biocuckoo.org/). Using the updated GPS algorithm and EKPD information, GPS 3.0 beta provides both an online service and local packages at http://gps.biocuckoo.org/, and can predict ssKSRs for protein kinases of 84 eukaryotes, including 464 human kinases. In addition, to greatly reduce false positive predictions, we integrated protein-protein interactions between kinases and substrates and further developed the in vivo GPS (iGPS) tool for the prediction of ssKSRs.
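The idea of using protein-protein interactions to reduce false positives can be illustrated with a minimal sketch. This is not the iGPS implementation. It assumes only that sequence-based predictions are available as (kinase, substrate, site, score) records and that interactions are given as unordered protein pairs. The names `Prediction` and `filter_by_ppi`, the protein identifiers, and the scores are all hypothetical and chosen for illustration.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Prediction(NamedTuple):
    """One predicted ssKSR (all field values below are made-up examples)."""
    kinase: str
    substrate: str
    site: str     # residue and position, e.g. "S15"
    score: float  # hypothetical sequence-based prediction score


def filter_by_ppi(
    predictions: Iterable[Prediction],
    interactions: Iterable[Tuple[str, str]],
) -> List[Prediction]:
    """Keep only predictions whose kinase-substrate pair has a known interaction.

    Interactions are treated as undirected, so ("A", "B") also supports ("B", "A").
    """
    known = {frozenset(pair) for pair in interactions}
    return [
        p for p in predictions
        if frozenset((p.kinase, p.substrate)) in known
    ]


# Hypothetical input: three sequence-based predictions and two known interactions.
predictions = [
    Prediction("KIN_A", "SUB_1", "S15", 8.2),
    Prediction("KIN_A", "SUB_2", "T40", 6.1),  # no supporting interaction
    Prediction("KIN_B", "SUB_1", "Y7", 5.5),
]
interactions = [("KIN_A", "SUB_1"), ("SUB_1", "KIN_B")]

supported = filter_by_ppi(predictions, interactions)
for p in supported:
    print(p.kinase, p.substrate, p.site, p.score)
```

In this sketch the interaction set acts as a hard filter: a prediction without interaction support is dropped regardless of its score. A real tool might instead combine the two sources of evidence in a more graded way.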