Alternative splicing is a major mechanism for increasing genomic complexity and proteomic diversity, yet no genome-wide analysis of alternative splicing in rice NBS-LRR sequences has been reported. A hidden Markov model search retrieved 855 sequences encoding NBS-LRR motifs from the TIGR database. These sequences were used as queries for homology searches against three databases (KOME, the TIGR Gene Indices, and UniProt) to obtain homologous full-length cDNA sequences, tentative consensus sequences, and protein sequences. The full-length cDNA and tentative consensus sequences were then aligned to the corresponding BAC sequences with Spidey and SIM4 to predict alternative splicing; protein sequences were aligned to genomic sequences with tBLASTn. Among the 875 NBS-LRR genes, 119 showed alternative splicing, comprising 71 intron retention, 20 exon skipping, 25 alternative initiation, 16 alternative termination, 12 alternative 5′ splicing, and 16 alternative 3′ splicing events. Most alternative splicing events were supported by two or more transcripts. The data can be queried at http://www.bioinfor.org. Bioinformatic analysis of the splice junctions showed that the 'GT…AG' rule is less conserved in exon skipping and intron retention events than in constitutive splicing, which suggests that different regulatory mechanisms guide the formation of these splice variants. Analysis of the effect of intron retention on the encoded proteins showed that alternative splicing tends to alter the C-terminal amino acid sequence. Finally, the tissue distribution and protein localization of the alternatively spliced forms were analyzed. Root and callus were the tissues with the largest classes of alternative splicing, and more than one third of the splice variant proteins localized to the plasma membrane and cytoplasm. These alternatively spliced proteins may play an important role in disease-resistance signal transduction.
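The 'GT…AG' splice-junction check mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0-based half-open coordinate convention, and the toy sequence are all assumptions made for illustration. It only shows how one might measure the fraction of introns that start with GT and end with AG, which could then be compared between constitutive introns and those involved in intron retention or exon skipping.

```python
from typing import List, Tuple


def is_canonical_intron(genome: str, start: int, end: int) -> bool:
    """Return True if the intron genome[start:end] starts with GT and ends with AG.

    Coordinates are assumed to be 0-based and half-open (a hypothetical convention).
    """
    intron = genome[start:end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def canonical_fraction(genome: str, introns: List[Tuple[int, int]]) -> float:
    """Fraction of the given introns that follow the GT...AG rule."""
    if not introns:
        return 0.0
    hits = sum(is_canonical_intron(genome, s, e) for s, e in introns)
    return hits / len(introns)


# Toy sequence (invented): exon, canonical intron, exon, non-canonical intron, exon.
genome = "ATGGCC" + "GTAAGTTTTCAG" + "GGATCC" + "GCAAGTTTTCAG" + "TAA"
introns = [(6, 18), (24, 36)]

print(canonical_fraction(genome, introns))
```

In practice the intron coordinates would come from the cDNA-to-genome alignments, and the fraction would be computed separately for each class of splicing event.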