Average mutual information of nucleosome and linker DNA regions

Source: The 5th National Conference on Bioinformatics and Systems Biology
  Background: Knowledge of the detailed organization of nucleosomes across genomes, and of the mechanisms of nucleosome positioning, is critical for understanding DNA transcription, replication, repair, recombination, and disease development. Recently, many algorithms for predicting nucleosome positioning in eukaryotic model organisms have been presented. Most of them were constructed using sequence patterns such as k-mers, G+C content, and poly A. Why do algorithms that combine sequence patterns achieve higher precision in identifying nucleosome positions? Here, we tried to address this question by analyzing the average mutual information of nucleosome and linker regions in the S. cerevisiae genome.

  Methods: The whole S. cerevisiae genome sequence was downloaded from the 2006 assembly of the Yeast Genome Database (http://www.yeastgenome.org/). The experimental maps of nucleosome locations in the S. cerevisiae genome were obtained from the Penn State Genome Cartography Project (http://atlas.bx.psu.edu/). A total of 54,750 fragments of 147 bp, each having at least three sequencing reads, were selected as the nucleosome region dataset. The 20 bp flanking each nucleosome DNA were regarded as linker DNA, and 70,488 fragments of 20 bp were selected as the linker region dataset. All nucleosome DNAs and all linker DNAs were then separately joined into one sequence each, and each joined sequence was split again into 2048-bp fragments. The average mutual information (AMI) function was used to measure the information contained in nucleosome and linker regions. AMI is defined as

  $$\mathrm{AMI}(k) = \sum_{i,j} p_{ij}(k) \log_2 \frac{p_{ij}(k)}{p_i \, p_j}, \qquad k = 0, 1, 2, 3, \ldots$$

  where $p_i$ denotes the probability of finding the nucleotide $n_i \in \{A, G, C, T\}$ and $p_{ij}(k)$ denotes the probability of finding the pair of nucleotides $n_i$ and $n_j$ separated by a gap of length $k$.

  Results: Our analysis showed that the value of AMI (k<=2) in nucleosome and linker DNA regions is clearly larger than in random sequences. Furthermore, the value of AMI (k<=2) in nucleosome DNA regions is larger than in linker DNA regions.

  Conclusions: The results indicated that (i) nucleosome and linker DNA regions contain sequence information related to gene regulation and expression; (ii) sequence motifs directing nucleosome positioning are enriched in nucleosome DNA regions; and (iii) short-range correlation (k<=2) is the most important characteristic of nucleosome and linker DNA regions. This is consistent with the frequent use of sequence patterns such as k-mers (k=1,2,3,4) and poly A in constructing algorithms for predicting nucleosome positioning.
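
  The AMI definition given in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `split_fragments` and `average_mutual_information` are hypothetical, and the sketch assumes that a gap of length k means k bases lie between the two nucleotides (so k = 0 corresponds to adjacent bases), that the single-nucleotide probabilities are estimated from the same fragment, and that a trailing piece shorter than 2048 bp is discarded.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def split_fragments(sequence, size=2048):
    """Split a joined sequence into non-overlapping fragments of `size` bp.

    Assumption: a trailing piece shorter than `size` is dropped.
    """
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]


def average_mutual_information(sequence, k):
    """Return AMI(k) in bits for one sequence fragment.

    Assumption: a gap of length k means k bases between the paired
    nucleotides, so the two positions are k + 1 apart.
    """
    seq = sequence.upper()
    distance = k + 1

    # Single-nucleotide probabilities p_i, estimated over the fragment.
    singles = Counter(c for c in seq if c in ALPHABET)
    total = sum(singles.values())
    if total == 0:
        return 0.0
    p = {c: singles[c] / total for c in ALPHABET}

    # Pair probabilities p_ij(k); pairs with ambiguous bases are skipped.
    pairs = Counter(
        (a, b)
        for a, b in zip(seq, seq[distance:])
        if a in ALPHABET and b in ALPHABET
    )
    n_pairs = sum(pairs.values())
    if n_pairs == 0:
        return 0.0

    ami = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n_pairs
        ami += p_ab * math.log2(p_ab / (p[a] * p[b]))
    return ami
```

  Under these assumptions, averaging `average_mutual_information(fragment, k)` over the fragments returned by `split_fragments` for k = 0, 1, 2, ... gives one AMI profile per dataset, which could then be compared between nucleosome, linker, and shuffled sequences.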