Preface
Contents
1 Introduction to Deep Density Models with Latent Variables
1.1 Introduction
1.1.1 Density Model with Latent Variables
1.1.2 Deep Architectures via Greedy Layer-Wise Learning Algorithm
1.1.3 Unsupervised Learning
1.2 Shallow Architectures of Latent Variable Models
1.2.1 Notation
1.2.2 Mixtures of Factor Analyzers
1.2.2.1 Maximum Likelihood
1.2.2.2 Maximum A Posteriori
1.2.3 Mixtures of Factor Analyzers with Common Factor Loadings
1.2.3.1 Maximum Likelihood
1.2.3.2 Maximum A Posteriori
1.2.4 Unsupervised Learning
1.2.4.1 Empirical Results
1.2.4.2 Clustering
1.3 Deep Architectures with Latent Variables
1.3.1 Deep Mixtures of Factor Analyzers
1.3.1.1 Inference
1.3.1.2 Collapse Model
1.3.2 Deep Mixtures of Factor Analyzers with Common Factor Loadings
1.3.2.1 Inference
1.3.2.2 Collapse Model
1.3.3 Unsupervised Learning
1.4 Expectation-Maximization Algorithm
1.5 Conclusion
References
2 Deep RNN Architecture: Design and Evaluation
2.1 Introduction
2.2 Related Works
2.2.1 Segmentation-Free Handwriting Recognition
2.2.2 Variants of RNN Neuron
2.3 Datasets
2.4 Proposed Deep Neural Network
2.4.1 Architecture
2.4.2 Learning
2.4.3 Decoding
2.4.4 Experimental Setup
2.4.5 Results
2.4.6 Error Analysis
2.5 Proposed RNN Neuron
2.5.1 Architecture
2.5.2 Forward Propagation
2.5.3 Backward Propagation
2.5.4 Experimental Setup
2.5.5 Experimental Results
2.6 Conclusions
References
3 Deep Learning Based Handwritten Chinese Character and Text Recognition
3.1 Introduction
3.2 Handwritten Chinese Character Recognition (HCCR)
3.2.1 Direction Decomposed Feature Map
3.2.1.1 Offline DirectMap
3.2.1.2 Online DirectMap
3.2.1.3 Analysis
3.2.2 Convolutional Neural Network
3.2.2.1 Architecture
3.2.2.2 Regularization
3.2.2.3 Activation
3.2.2.4 Training
3.2.3 Adaptation of ConvNet
3.2.4 Experiments
3.2.4.1 Database
3.2.4.2 Offline HCCR Results
3.2.4.3 Online HCCR Results
3.2.4.4 Adaptation Results
3.3 Handwritten Chinese Text Recognition (HCTR)
3.3.1 System Overview
3.3.2 Neural Network Language Models
3.3.2.1 Feedforward Neural Network Language Models
3.3.2.2 Recurrent Neural Network Language Models
3.3.2.3 Hybrid Language Models
3.3.2.4 Acceleration
3.3.3 Convolutional Neural Network Shape Models
3.3.3.1 Character Classifier
3.3.3.2 Over-Segmentation
3.3.3.3 Geometric Context Models
3.3.4 Experiments
3.3.4.1 Settings
3.3.4.2 Effects of Language Models
3.3.4.3 Effects of CNN Shape Models
3.3.4.4 Results with LMs on Large Corpus
3.3.4.5 Performance Analysis
3.4 Conclusion
References
4 Deep Learning and Its Applications to Natural Language Processing
4.1 Introduction
4.2 Learning Word Representations
4.3 Learning Models
4.3.1 Recurrent Neural Networks (RNNs)
4.3.2 Convolutional Neural Networks (CNNs)
4.4 Applications
4.4.1 Part-of-Speech (POS) Tagging
4.4.2 Named Entity Recognition (NER)
4.4.3 Neural Machine Translation
4.4.4 Automatic English Grammatical Error Correction
4.4.5 Image Description
4.5 Datasets for Natural Language Processing
4.5.1 Word Embedding
4.5.2 N-Gram
4.5.3 Text Classification
4.5.4 Part-of-Speech (POS) Tagging
4.5.5 Machine Translation
4.5.6 Automatic Grammatical Error Correction
4.5.7 Image Description
4.6 Conclusions and Discussions
References
5 Deep Learning for Natural Language Processing
5.1 Deep Learning for Named Entity Recognition
5.1.1 Task Definition
5.1.2 NER Using Deep Learning
5.1.2.1 BLSTM
5.1.2.2 BLSTM-CRF Model
5.2 Deep Learning for Supertagging
5.2.1 Task Definition
5.2.2 Deep Neural Networks with Skip Connection for CCG Supertagging
5.2.2.1 Exploring Skip Connections
5.2.2.2 Neural Architecture for CCG Supertagging
5.2.2.3 Network Inputs
5.2.2.4 Network Outputs
5.3 Deep Learning for Machine Translation
5.3.1 Task Definition
5.3.2 Statistical Machine Translation
5.3.3 Neural Machine Translation
5.3.4 Recent Progress on Neural Machine Translation
5.4 Deep Learning for Text Summarization
5.4.1 Task Definition
5.4.2 Extractive Summarization Methods
5.4.3 Abstractive Summarization with Deep Learning
5.5 Discussion
References
6 Oceanic Data Analysis with Deep Learning Models
6.1 Introduction
6.2 Background
6.2.1 Representation Learning
6.2.1.1 Shallow Feature Learning
6.2.1.2 Deep Learning
6.2.2 Oceanic Data Analysis
6.3 Oceanic Data Analysis with Deep Learning Models
6.3.1 Ocean Front Recognition with Convolutional Neural Networks
6.3.1.1 Network Architecture
6.3.1.2 Experimental Results
6.3.2 Sea Surface Temperature Prediction with Long Short-Term Memory Networks
6.3.2.1 Network Architectures
6.3.2.2 Experimental Results
6.4 Conclusion
References
Index