A lncRNA prediction tool based on a deep learning algorithm

Source: The 7th National Conference on Bioinformatics and Systems Biology
  Long non-coding RNAs (lncRNAs, longer than 200 bp) play important biological roles in dosage compensation, genomic imprinting, and cell differentiation, and have been implicated in human diseases such as cancer. Facilitated by high-throughput sequencing technology, RNA sequencing has revealed numerous novel non-coding RNAs, the majority of which are long non-coding RNAs. To comprehensively annotate newly discovered transcripts, the first step is to distinguish lncRNAs from protein-coding transcripts [1]. Herein we present a novel tool named PLADL for lncRNA identification, which has several advantages over existing tools. First, PLADL incorporates intrinsic features of transcript sequences, such as ORF length, ORF ratio, and entropy density profiles [2], to build a mathematical model of lncRNAs. Second, PLADL employs deep learning to construct the classification algorithm. Compared with the existing tools CPC, CPAT, CNCI, PLEK, lncRNA-MFDL, and lncRScan-SVM, PLADL achieves the best overall accuracy both on the human training data under 10-fold cross-validation (98.2%) and on the test set (Table 1). Moreover, PLADL markedly outperforms the other tools in cross-species prediction of lncRNAs for mouse and zebrafish (Table 2).
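  As an illustration of two of the sequence-intrinsic features named above, the following minimal Python sketch computes ORF length and ORF ratio for a single transcript. It is not PLADL's implementation. It assumes that an ORF runs from the first ATG to the next in-frame stop codon on the forward strand, and that ORF ratio means the longest ORF length divided by the transcript length; the abstract does not state either definition. The function names are hypothetical, and entropy density profiles [2] are left out.

```python
# Illustrative sketch only: the ORF definition and function names below are
# assumptions, not taken from PLADL.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Return the length (nt, stop codon included) of the longest
    ATG...stop ORF found in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # close this ORF and keep scanning the frame
    return best


def orf_features(seq: str) -> dict:
    """Return ORF length and ORF ratio (ORF length / transcript length)."""
    n = len(seq)
    orf_len = longest_orf_length(seq)
    return {"orf_length": orf_len, "orf_ratio": orf_len / n if n else 0.0}


if __name__ == "__main__":
    # Toy transcript: one 9-nt ORF (ATG AAA TAG) inside 13 nt of sequence.
    print(orf_features("CCATGAAATAGCC"))
```

  In a pipeline of the kind the abstract describes, such per-transcript values would form part of the feature vector passed to the classifier. Coding transcripts would generally be expected to show longer ORFs and higher ORF ratios than lncRNAs.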