A total of 34,752 Casuarina (木麻黄) EST sequences were obtained from the NCBI EST database. Assembly yielded 12,062 non-redundant sequences (Unigenes) with a total length of 7,278.578 kb. A search of these Unigenes identified 367 SSR loci distributed across 353 Unigenes, giving an SSR detection rate of 2.93% and an average distance of 19.83 kb between loci; 39 types of repeat motif were represented. Dinucleotide and trinucleotide repeats were the main types, accounting for 57.77% and 34.60% of all SSRs, respectively. Among the dinucleotide repeat motifs, AG/CT was the most frequent (93.87%); among the trinucleotide repeat motifs, AAG/CTT was the most frequent (44.09%). Ninety-seven primer pairs were designed for the EST-SSR loci, of which 32 pairs amplified effectively. Blastx analysis showed that 77.3% of the SSR-containing EST sequences were homologous to functional sequences in the non-redundant protein sequence database, and among the sequences of known function, those originating from grape made up the largest share (10.4%). GO functional classification showed that 47.3% of the SSR-containing EST sequences had at least one GO annotation. The largest number of sequences fell under the cellular component category, within which the cytoplasm and nucleus terms accounted for relatively large proportions.
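The two derived figures in the abstract (detection rate and average distance) can be checked against the reported counts. The sketch below assumes the detection rate is the share of Unigenes that contain at least one SSR and that the average distance is the total Unigene length divided by the number of SSR loci. The abstract does not state either definition, so both are assumptions, and the function and variable names are illustrative only.

```python
# Counts quoted in the abstract.
TOTAL_LENGTH_KB = 7278.578  # total length of the assembled Unigenes
N_UNIGENES = 12062          # non-redundant sequences after assembly
N_SSR_UNIGENES = 353        # Unigenes containing at least one SSR
N_SSR_LOCI = 367            # SSR loci identified


def ssr_summary(total_length_kb, n_unigenes, n_ssr_unigenes, n_ssr_loci):
    """Return (detection rate in %, mean distance between SSRs in kb).

    Assumed definitions (not stated in the abstract):
      detection rate = SSR-containing Unigenes / all Unigenes
      mean distance  = total Unigene length / number of SSR loci
    """
    detection_rate_pct = 100 * n_ssr_unigenes / n_unigenes
    mean_distance_kb = total_length_kb / n_ssr_loci
    return round(detection_rate_pct, 2), round(mean_distance_kb, 2)


rate, distance = ssr_summary(
    TOTAL_LENGTH_KB, N_UNIGENES, N_SSR_UNIGENES, N_SSR_LOCI
)
print(f"SSR detection rate: {rate}%")
print(f"Average distance between SSRs: {distance} kb")
```

Under these assumed definitions the results agree with the 2.93% and 19.83 kb reported in the abstract.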