This study took 50,360 Unigenes screened from the transcriptome of wax apple (Syzygium samarangense) as its data source and analyzed codon usage bias, together with multivariate statistics, using custom Perl scripts, CUSP, and SPSS. The 17,723 high-confidence protein-coding CDS sequences in the transcriptome data had an average total GC content of 49.47%. G or C occurred at the third position of synonymous codons with a frequency of 50.91%, higher than the frequency of A or T. The effective number of codons (ENC) ranged from 23.7 to 61.0, with a mean of 58.4. Sixteen optimal codons were identified for wax apple, of which 15 end in A/T and only one ends in G, indicating that the optimal codons of wax apple preferentially end in A/T. A comparison of codon usage frequencies with those of 17 plant species showed that the RSCU value is a better parameter than GC3s content for assessing evolutionary relationships among plants. The sequence patterns of codon pairs for the 20 amino acids were also determined. These results offer a reference for genetic engineering work on wax apple, including gene modification, genetic transformation, discovery of new genes, studies of functional gene expression regulation, prediction of protein structure and function, comparative genomics with other species, and molecular marker breeding.
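The quantities reported above (total GC content, GC3s, and RSCU) can be illustrated with a short sketch. The study itself used Perl scripts and CUSP; the Python below is not the authors' code, and the function names `gc_content`, `gc3s`, and `rscu` are hypothetical names chosen for illustration. It assumes the standard genetic code and a CDS that starts in frame. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for the sequences analysed).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group codons by amino acid; keep only families with synonymous choices
# (this drops Met, Trp, and the stop codons).
_FAMILIES = {}
for _codon, _aa in CODON_TABLE.items():
    _FAMILIES.setdefault(_aa, []).append(_codon)
SYNONYMOUS = {aa: fam for aa, fam in _FAMILIES.items()
              if aa != "*" and len(fam) > 1}
SYN_CODONS = {c for fam in SYNONYMOUS.values() for c in fam}


def split_codons(cds):
    """Split an in-frame CDS into complete codons (trailing bases ignored)."""
    seq = cds.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def gc_content(cds):
    """Fraction of G or C over all bases of the sequence."""
    seq = cds.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def gc3s(cds):
    """Fraction of G or C at the third position of synonymous codons."""
    thirds = [c[2] for c in split_codons(cds) if c in SYN_CODONS]
    return sum(base in "GC" for base in thirds) / len(thirds)


def rscu(cds):
    """RSCU per codon: observed count * family size / family total."""
    counts = Counter(c for c in split_codons(cds) if c in SYN_CODONS)
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values
```

An RSCU value above 1 means a codon is used more often than expected under equal usage, and a value below 1 means it is used less often. Optimal codons are typically chosen from codons with high RSCU in highly expressed genes, which this sketch does not attempt.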