Identification of MicroRNA Precursors with Support Vector Machine and String Kernel

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报, English edition)
MicroRNAs (miRNAs) are a family of short (21-23 nt) regulatory non-coding RNAs processed from long (70-110 nt) miRNA precursors (pre-miRNAs). Distinguishing true precursors from false ones plays an important role in the computational identification of miRNAs. Previous approaches extract numerical features from precursor sequences and their secondary structures to suit particular classification methods; however, such features may lose useful discriminative information hidden in the sequences and structures. In this study, pre-miRNA sequences and their secondary structures are used directly to construct an exponential kernel based on the weighted Levenshtein distance between two sequences. This string kernel is then combined with a support vector machine (SVM) to discriminate true from false pre-miRNAs. Based on 331 training samples of true and false human pre-miRNAs, two key SVM parameters are selected by 5-fold cross validation and grid search, and 5 realizations with different 5-fold partitions are executed. Among 16 independent test sets from 3 human, 8 animal, 2 plant, 1 virus, and 2 artificially false human pre-miRNAs, our method statistically outperforms the previous SVM-based technique on 11 sets, including 3 human, 7 animal, and 1 false human pre-miRNAs. In particular, pre-miRNAs with multiple loops, which were usually excluded in the previous work, are correctly identified in this study with an accuracy of 92.66%.
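To make the kernel construction concrete, the following is a minimal Python sketch of an exponential kernel built on a weighted Levenshtein distance, K(x, y) = exp(-gamma * d(x, y)). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the actual edit weights, the way sequence and secondary structure are encoded into a string, or the parameter values, so the function names (`weighted_levenshtein`, `exp_kernel`, `gram_matrix`), the uniform costs `sub_cost` and `indel_cost`, and the default `gamma` are all hypothetical.

```python
import math


def weighted_levenshtein(a, b, sub_cost=1.0, indel_cost=1.0):
    """Edit distance between strings a and b.

    sub_cost and indel_cost are assumed uniform weights; the weighting
    scheme used in the paper is not stated in the abstract.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            cost = 0.0 if ca == cb else sub_cost
            curr.append(min(
                prev[j] + indel_cost,      # delete ca
                curr[j - 1] + indel_cost,  # insert cb
                prev[j - 1] + cost,        # match or substitute
            ))
        prev = curr
    return prev[-1]


def exp_kernel(a, b, gamma=0.1, **weights):
    """Exponential kernel exp(-gamma * d(a, b)); gamma is an assumed value."""
    return math.exp(-gamma * weighted_levenshtein(a, b, **weights))


def gram_matrix(seqs, gamma=0.1, **weights):
    """Pairwise kernel values for a list of strings."""
    return [[exp_kernel(s, t, gamma, **weights) for t in seqs] for s in seqs]


if __name__ == "__main__":
    # Toy RNA-like strings for illustration only, not real pre-miRNAs.
    toy = ["ACGUACGU", "ACGUUCGU", "GGGGCCCC"]
    for row in gram_matrix(toy, gamma=0.2):
        print([round(v, 3) for v in row])
```

A Gram matrix of this kind could then be handed to any SVM implementation that accepts precomputed kernels, with `gamma` and the SVM regularization parameter being one plausible reading of the "two key parameters" tuned by grid search and 5-fold cross validation.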