Introduction: Human genomes are diploid, with one homologous chromosome of each pair derived from each parent. The process of resolving this diploid nature, which assigns each allele to one of the two homologous chromosomes, is called haplotyping. For many biological and medical studies, chromosome-long haplotype information is highly valuable. At present, several dilution-based methods have been reported to resolve haplotypes, including single-molecule dilution (1) and clone pooling (2-3). The ideal scenario requires a dilution so extensive that the target DNA must be distributed across an impractically large number of wells. However, even under extensive dilution, clones covering the same sites from both homologous chromosomes can still appear within the same pool. Hence, current clone-based haplotyping methods remove either all the overlapping clones or just the overlapping parts (3).

Because a clone can be picked repeatedly, clones can be mixed combinatorially into different pools and sequenced. It then becomes possible to construct more accurate haplotypes that retain the overlapping parts, by assigning the variants in those parts to their original clones on the basis of each clone's unique pooling pattern. Since haplotyping can be achieved by determining all the alleles on all the clones and then assembling the clones into haplotype contigs, the critical step is to assign the allelic variants called from the sequencing data to their original clones. Treating the clones that contain a target allele as positive samples, overlapping pool sequencing can assign every allele to its clones through combinatorial pooling design and decoding. Once all the alleles are correctly assigned, the clones can be reconstructed by linking the alleles. Clones belonging to the same chromosome can then be assembled into haplotype contigs by chaining together heterozygous variants.
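The idea of assigning an allele to its clone through a unique pooling pattern can be illustrated with a small Python sketch. This is not the design used in the paper, which the text does not specify. It assumes the simplest possible scheme, in which each clone is placed in a distinct set of `k` out of `n_pools` pools. The function names `design_pools`, `observe`, and `decode_allele`, the clone labels, and the alleles are all hypothetical.

```python
from itertools import combinations


def design_pools(clone_ids, n_pools, k):
    """Assign each clone a distinct set of k pools (its pooling pattern).

    Assumption: a plain k-subset design, chosen only for illustration.
    """
    patterns = combinations(range(n_pools), k)
    design = {}
    for clone in clone_ids:
        try:
            design[clone] = frozenset(next(patterns))
        except StopIteration:
            raise ValueError("not enough distinct pooling patterns") from None
    return design


def observe(design, clone_alleles):
    """Simulate sequencing: for each allele, collect the pools where it is seen."""
    seen = {}
    for clone, allele in clone_alleles.items():
        seen.setdefault(allele, set()).update(design[clone])
    return seen


def decode_allele(observed_pools, design):
    """Return the clones whose whole pooling pattern lies in the observed pools."""
    observed = frozenset(observed_pools)
    return sorted(c for c, pattern in design.items() if pattern <= observed)


# Toy case: c1 and c4 overlap at one site but come from different homologs,
# so they carry different alleles ("A" and "G") at that site.
design = design_pools(["c1", "c2", "c3", "c4"], n_pools=4, k=2)
seen = observe(design, {"c1": "A", "c4": "G"})
assignment = {allele: decode_allele(pools, design) for allele, pools in seen.items()}
print(assignment)
```

In this toy case each allele is traced back to a single clone, so the overlapping parts need not be discarded. The subset test can become ambiguous when several clones carry the same allele, because the union of their pools may also cover the pattern of an unrelated clone; a practical design would have to bound or resolve such cases.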