[Objective] To screen the SSR repeat sequences in the kiwifruit EST database, laying a theoretical foundation for the development of new kiwifruit EST-SSR markers and for molecular biology research. [Method] A total of 56 400 sequences were randomly selected from the most recently released kiwifruit expressed sequence tags (ESTs) in the NCBI public database, and SSRHunter software was used to search them for microsatellite (SSR) repeat sequences. The search criteria were a minimum of 7, 5, 4, 4 and 3 repeats for dinucleotide, trinucleotide, tetranucleotide, pentanucleotide and hexanucleotide motifs, respectively, with compound microsatellites included. The more times a microsatellite repeats, the more alleles it can be expected to reveal. [Result] A total of 7 939 SSRs were obtained from the kiwifruit EST sequences, comprising 5 131 dinucleotide repeats (64.63%), 1 237 trinucleotide repeats (15.58%), 284 tetranucleotide repeats (3.58%), 397 pentanucleotide repeats (5.00%) and 890 hexanucleotide repeats (11.21%). One SSR occurred in roughly every 2.48 kb of unigene sequence, that is, on average one SSR per seven unigenes. Among the dinucleotide repeats, AG/CT accounted for 4 654 SSRs (90.70%). Among the trinucleotide repeats, the three most abundant motifs were ACC/GGT, AAG/CTT and CGC/GCG, which together accounted for 817 SSRs, or 66.04% of all trinucleotide repeats. The remaining seven motifs, AGG/CCT, AGC/GCT, AAC/GTT, ATC/GAT, AGT/ACT, CGA/TCG and AAT/ATT, together accounted for 420 SSRs (33.96%). No AGT or CGT repeat motifs were found. [Conclusion] In the kiwifruit EST sequences, dinucleotide repeats were the most abundant repeat type, followed by trinucleotide and hexanucleotide repeats. Among the SSR repeat units obtained, AG/CT was the predominant motif.
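The search criteria given in the method can be illustrated with a minimal Python sketch. It is not SSRHunter's actual algorithm, which the abstract does not describe. The function names are hypothetical, only perfect (uninterrupted) repeats are detected, and compound microsatellites are not handled. Grouping a motif with its rotations and reverse complement (so that AG, GA, CT and TC are all counted as AG/CT) is an assumption about how the motif classes in the results were formed.

```python
import re

# Minimum repeat counts per motif length, taken from the stated search criteria:
# di-, tri-, tetra-, penta- and hexanucleotide motifs.
MIN_REPEATS = {2: 7, 3: 5, 4: 4, 5: 4, 6: 3}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_primitive(motif):
    """Return True if the motif is not itself a repeat of a shorter unit (e.g. 'AGAG')."""
    n = len(motif)
    return not any(n % d == 0 and motif[:d] * (n // d) == motif for d in range(1, n))


def canonical_motif(motif):
    """Return the alphabetically smallest rotation of the motif or its reverse complement."""
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(rotations)


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs meeting MIN_REPEATS."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip units such as 'AA' or 'AGAG', which belong to a shorter motif class.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (not real kiwifruit data): (AG)8 and (AAG)5 separated by a single base.
example = "TT" + "AG" * 8 + "C" + "AAG" * 5 + "T"
for start, motif, count in find_ssrs(example):
    print(start, canonical_motif(motif), count)
```

Tallying `canonical_motif` over all hits from a set of EST sequences would give per-motif counts comparable in form to those reported in the results, although a dedicated tool may resolve overlapping or compound repeats differently.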