Objective To examine the factors that affect DNA sequence analysis and to establish a method for ruling out contamination and controlling sequence quality. Methods DNA sequence results from 950 HIV-1 samples were analyzed to identify the factors that affect sequence reading and sequence analysis, and the possible causes of contamination were analyzed and explained. Results In analyses with various software packages, the following indicators all suggested possible contamination: a genetic distance of 0 between two samples; nucleotide or amino acid sequences of the sequenced region that were identical or nearly identical between two samples; an unusually small genetic distance between a sample and a clone constructed in the laboratory, with homology of 99% or more; and intermixing of individual samples between two independently transmitted populations. Conclusion Constructing phylogenetic trees, and building consensus sequences after translating the samples' nucleotide sequences into amino acid sequences, is an effective way to detect sequence quality problems and to control sequence quality.
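The first and third indicators in the Results can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and uses a simple p-distance (the proportion of differing sites) in place of the distance models offered by the phylogenetic software used in the study, which the abstract does not name. The function names, the sample and clone names, and the toy sequences are hypothetical; only the 0-distance and 99% criteria come from the abstract.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    This is a simplification, not the distance model used in the study.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def flag_contamination(samples, lab_clones, clone_identity=0.99):
    """Return (name, name, reason) tuples for pairs that deserve a manual check.

    samples and lab_clones are hypothetical dicts mapping a name to an
    aligned sequence. A flag is a warning sign, not proof of contamination.
    """
    flags = []
    # Indicator: genetic distance of 0 between two samples.
    for (n1, s1), (n2, s2) in combinations(sorted(samples.items()), 2):
        if p_distance(s1, s2) == 0.0:
            flags.append((n1, n2, "identical samples"))
    # Indicator: a sample is 99% or more identical to a laboratory clone.
    for name, seq in sorted(samples.items()):
        for clone, clone_seq in sorted(lab_clones.items()):
            if 1.0 - p_distance(seq, clone_seq) >= clone_identity:
                flags.append((name, clone, "close to lab clone"))
    return flags


if __name__ == "__main__":
    # Toy data for illustration only, not taken from the study.
    samples = {
        "S1": "ACGTACGTAC",
        "S2": "ACGTACGTAC",
        "S3": "ACGTTCGAAC",
        "S4": "TTGTACGTAC",
    }
    lab_clones = {"cloneA": "TTGTACGTAC"}
    for flag in flag_contamination(samples, lab_clones):
        print(flag)
```

Flagged pairs would still need review against sampling records and epidemiological links, since closely related infections can also yield near-identical sequences.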