A pipelined Pre-training algorithm for DBNs

Source: 16th China National Conference on Computational Linguistics and 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data

Authors: Zhiqiang Ma, Tuya Li, Shuangtao Yang, Li Zhang

Affiliation: College of Information Engineering, Inner Mongolia University of Technology, Hohhot, China

Publication date: 2017, Issue 7

Keywords: component; deep networks; pre-training; greedy layer-wise; RBM; pipelined

Abstract

Deep networks have been widely used in many domains in recent years. However, pre-training deep networks with the greedy layer-wise algorithm is time-consuming, and the scalability of this algorithm is greatly restricted by its inherently sequential nature, in which only one hidden layer can be trained at a time. To speed up the training of deep networks, this paper focuses on the pre-training phase and proposes a pipelined pre-training algorithm that runs on a distributed cluster and significantly reduces pre-training time with no loss of recognition accuracy. By exploiting the computational cluster, it is more efficient than the greedy layer-wise pre-training algorithm. Comparative experiments between the greedy layer-wise and pipelined pre-training algorithms on the TIMIT corpus show that the pipelined pre-training algorithm makes efficient use of a distributed GPU cluster. We achieve speed-ups of 2.84 and 5.9 with no loss of recognition accuracy when using 4 and 8 slaves, respectively. Parallelization efficiency is close to 0.73.
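The relation between the reported speed-ups and the efficiency figure can be checked with a few lines of arithmetic. The sketch below assumes the conventional definition of parallel efficiency (speed-up divided by the number of worker nodes); the abstract does not state which definition the authors use, and the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Conventional parallel efficiency: speed-up divided by worker count.

    Assumption: this is the textbook definition, not necessarily the one
    used in the paper.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Speed-ups reported in the abstract, keyed by the number of slave nodes.
reported_speedups = {4: 2.84, 8: 5.9}

efficiencies = {
    workers: parallel_efficiency(speedup, workers)
    for workers, speedup in reported_speedups.items()
}

for workers, efficiency in efficiencies.items():
    print(f"{workers} slaves: efficiency = {efficiency:.2f}")
```

Under this assumed definition the two configurations give roughly 0.71 and 0.74, which is in line with the "close to 0.73" figure quoted in the abstract.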

Other papers

Harvest Uyghur-Chinese Aligned-Sentences Bitexts from Multilingual Sites Based on Word Embedding

Obtaining bilingual parallel data from multilingual websites is a long-standing research problem, and such data is very beneficial for resource-scarce languages. In this paper, we present an approach for obtain

Conference

bilingual parallel data; word embedding; resource-scarce languages

Closed-Set Chinese Word Segmentation Based on Convolutional Neural Network Model

This paper proposes a neural model for closed-set Chinese word segmentation. The model follows the character-based approach, which assigns a class label to each character, indicating its relative positi

Conference

Chinese word segmentation; Deep learning; Convolutional neural networks

Improving Event Detection via Information Sharing among Related Event Types

Event detection suffers from data sparseness and label imbalance problems due to the high cost of manually annotating events. To address this problem, we propose a novel approach that allows for

Conference

Hierarchical Gated Recurrent Neural Tensor Network for Answer Triggering

In this paper, we focus on the problem of answer triggering addressed by Yang et al. (2015), which is a critical component for a real-world question answering system. We employ a hierarchical gated recur

Conference

Answer Triggering; Question Answering; Hierarchical gated recurrent neural tensor

Joint Extraction of Multiple Relations and Entities by using a Hybrid Neural Network

This paper proposes a novel end-to-end neural model to jointly extract entities and relations in a sentence. Unlike most existing approaches, the proposed model uses a hybrid neural network to automati

Conference

Information Extraction; Neural Networks

Language Model for Mongolian Polyphone Proofreading

Mongolian text proofreading is a particularly difficult task because of the language's unique polyphonic alphabet, morphological ambiguity, and agglutinative features, and coding errors are currently pervasive in

Conference

Mongolian; Polyphone; Automatic Proofreading System; Morphological Ambiguity

Collective Entity Linking on Relational Graph Model with Mentions

Given a source document with extracted mentions, entity linking calls for mapping each mention to an entity in a reference knowledge base. Previous entity linking approaches mainly focus on generic statis

Conference

Collective Entity Linking; Entity Disambiguation; Relational Graph

Cost-aware Learning Rate for Neural Machine Translation

Neural Machine Translation (NMT) has drawn much attention due to its promising translation performance in recent years. The conventional optimization algorithm for NMT sets a unified learning rate for e

Conference

Neural Machine Translation; Cost-aware Learning Rate

Improving Word Embeddings for Low Frequency Words by Pseudo Contexts

This paper investigates relations between word semantic density and word frequency. A word average similarity based on distributed representations is defined as the measure of word semantic density. We f

Conference

Word Embedding; Low Frequency Word

Question Answering with Character-Level LSTM Encoders and Model-Based Data Augmentation

This paper presents a character-level encoder-decoder modeling method for question answering (QA) from large-scale knowledge bases (KB). This method improves the existing approach [9] from three aspects.

Conference

Question Answering; Knowledge Base; Long Short-Term Memory; Encoder-Decoder
