Preface
Contents
1 Introduction to Deep Density Models with Latent Variables
1.1 Introduction
1.1.1 Density Model with Latent Variables
1.1.2 Deep Architectures via Greedy Layer-Wise Learning Algorithm
1.1.3 Unsupervised Learning
1.2 Shallow Architectures of Latent Variable Models
1.2.1 Notation
1.2.2 Mixtures of Factor Analyzers
1.2.2.1 Maximum Likelihood
1.2.2.2 Maximum A Posteriori
1.2.3 Mixtures of Factor Analyzers with Common Factor Loadings
1.2.3.1 Maximum Likelihood
1.2.3.2 Maximum A Posteriori
1.2.4 Unsupervised Learning
1.2.4.1 Empirical Results
1.2.4.2 Clustering
1.3 Deep Architectures with Latent Variables
1.3.1 Deep Mixtures of Factor Analyzers
1.3.1.1 Inference
1.3.1.2 Collapse Model
1.3.2 Deep Mixtures of Factor Analyzers with Common Factor Loadings
1.3.2.1 Inference
1.3.2.2 Collapse Model
1.3.3 Unsupervised Learning
1.4 Expectation-Maximization Algorithm
1.5 Conclusion
References
2 Deep RNN Architecture: Design and Evaluation
2.1 Introduction
2.2 Related Works
2.2.1 Segmentation-Free Handwriting Recognition
2.2.2 Variants of RNN Neuron
2.3 Datasets
2.4 Proposed Deep Neural Network
2.4.1 Architecture
2.4.2 Learning
2.4.3 Decoding
2.4.4 Experimental Setup
2.4.5 Results
2.4.6 Error Analysis
2.5 Proposed RNN Neuron
2.5.1 Architecture
2.5.2 Forward Propagation
2.5.3 Backward Propagation
2.5.4 Experimental Setup
2.5.5 Experimental Results
2.6 Conclusions
References
3 Deep Learning Based Handwritten Chinese Character and Text Recognition
3.1 Introduction
3.2 Handwritten Chinese Character Recognition (HCCR)
3.2.1 Direction Decomposed Feature Map
3.2.1.1 Offline DirectMap
3.2.1.2 Online DirectMap
3.2.1.3 Analysis
3.2.2 Convolutional Neural Network
3.2.2.1 Architecture
3.2.2.2 Regularization
3.2.2.3 Activation
3.2.2.4 Training
3.2.3 Adaptation of ConvNet
3.2.4 Experiments
3.2.4.1 Database
3.2.4.2 Offline HCCR Results
3.2.4.3 Online HCCR Results
3.2.4.4 Adaptation Results
3.3 Handwritten Chinese Text Recognition (HCTR)
3.3.1 System Overview
3.3.2 Neural Network Language Models
3.3.2.1 Feedforward Neural Network Language Models
3.3.2.2 Recurrent Neural Network Language Models
3.3.2.3 Hybrid Language Models
3.3.2.4 Acceleration
3.3.3 Convolutional Neural Network Shape Models
3.3.3.1 Character Classifier
3.3.3.2 Over-Segmentation
3.3.3.3 Geometric Context Models
3.3.4 Experiments
3.3.4.1 Settings
3.3.4.2 Effects of Language Models
3.3.4.3 Effects of CNN Shape Models
3.3.4.4 Results with LMs on Large Corpus
3.3.4.5 Performance Analysis
3.4 Conclusion
References
4 Deep Learning and Its Applications to Natural Language Processing
4.1 Introduction
4.2 Learning Word Representations
4.3 Learning Models
4.3.1 Recurrent Neural Networks (RNNs)
4.3.2 Convolutional Neural Networks (CNNs)
4.4 Applications
4.4.1 Part-of-Speech (POS) Tagging
4.4.2 Named Entity Recognition (NER)
4.4.3 Neural Machine Translation
4.4.4 Automatic English Grammatical Error Correction
4.4.5 Image Description
4.5 Datasets for Natural Language Processing
4.5.1 Word Embedding
4.5.2 N-Gram
4.5.3 Text Classification
4.5.4 Part-Of-Speech (POS) Tagging
4.5.5 Machine Translation
4.5.6 Automatic Grammatical Error Correction
4.5.7 Image Description
4.6 Conclusions and Discussions
References
5 Deep Learning for Natural Language Processing
5.1 Deep Learning for Named Entity Recognition
5.1.1 Task Definition
5.1.2 NER Using Deep Learning
5.1.2.1 BLSTM
5.1.2.2 BLSTM-CRF Model
5.2 Deep Learning for Supertagging
5.2.1 Task Definition
5.2.2 Deep Neural Networks with Skip Connection for CCG Supertagging
5.2.2.1 Exploring Skip Connections
5.2.2.2 Neural Architecture for CCG Supertagging
5.2.2.3 Network Inputs
5.2.2.4 Network Outputs
5.3 Deep Learning for Machine Translation
5.3.1 Task Definition
5.3.2 Statistical Machine Translation
5.3.3 Neural Machine Translation
5.3.4 Recent Progress on Neural Machine Translation
5.4 Deep Learning for Text Summarization
5.4.1 Task Definition
5.4.2 Extractive Summarization Methods
5.4.3 Abstractive Summarization with Deep Learning
5.5 Discussion
References
6 Oceanic Data Analysis with Deep Learning Models
6.1 Introduction
6.2 Background
6.2.1 Representation Learning
6.2.1.1 Shallow Feature Learning
6.2.1.2 Deep Learning
6.2.2 Oceanic Data Analysis
6.3 Oceanic Data Analysis with Deep Learning Models
6.3.1 Ocean Front Recognition with Convolutional Neural Networks
6.3.1.1 Network Architecture
6.3.1.2 Experimental Results
6.3.2 Sea Surface Temperature Prediction with Long Short-Term Memory Networks
6.3.2.1 Network Architectures
6.3.2.2 Experimental Results
6.4 Conclusion
References
Index
Cognitive Computation Trends, Volume 2
Series Editor: Amir Hussain

Kaizhu Huang · Amir Hussain · Qiu-Feng Wang · Rui Zhang, Editors

Deep Learning: Fundamentals, Theory and Applications
Cognitive Computation Trends, Volume 2
Series Editor: Amir Hussain, School of Computing, Edinburgh Napier University, Edinburgh, UK
Cognitive Computation Trends is an exciting new book series covering cutting-edge research, practical applications, and future trends across the whole spectrum of multi-disciplinary fields encompassed by the emerging discipline of Cognitive Computation. The series aims to bridge the existing gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities. The broad scope of Cognitive Computation Trends covers basic and applied work involving bio-inspired computational, theoretical, experimental, and integrative accounts of all aspects of natural and artificial cognitive systems, including: perception, action, attention, learning and memory, decision making, language processing, communication, reasoning, problem solving, and consciousness. More information about this series at http://www.springer.com/series/15648
Kaizhu Huang • Amir Hussain • Qiu-Feng Wang • Rui Zhang, Editors
Deep Learning: Fundamentals, Theory and Applications
Editors
Kaizhu Huang, Xi’an Jiaotong-Liverpool University, Suzhou, China
Amir Hussain, School of Computing, Edinburgh Napier University, Edinburgh, UK
Qiu-Feng Wang, Xi’an Jiaotong-Liverpool University, Suzhou, China
Rui Zhang, Xi’an Jiaotong-Liverpool University, Suzhou, China

Cognitive Computation Trends
ISSN 2524-5341; ISSN 2524-535X (electronic)
ISBN 978-3-030-06072-5; ISBN 978-3-030-06073-2 (eBook)
https://doi.org/10.1007/978-3-030-06073-2
Library of Congress Control Number: 2019930405

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Preface

Over the past 10 years, deep learning has attracted a lot of attention, and many exciting results have been achieved in various areas, such as speech recognition, computer vision, handwriting recognition, machine translation, and natural language understanding. Rather surprisingly, the performance of machines has even surpassed humans’ in some specific areas. The fast development of deep learning has already started impacting people’s lives; however, challenges still exist. In particular, the theory of successful deep learning has yet to be clearly explained, and realization of state-of-the-art performance with deep learning models requires tremendous amounts of labelled data. Further, optimizing deep learning models can take substantial time in real-world applications. Hence, much effort is still needed to investigate deep learning theory and apply it in various challenging areas.

This book looks at some of the problems involved and describes, in depth, the fundamental theories, some possible solutions, and the latest techniques achieved by researchers in the areas of machine learning, computer vision, and natural language processing. The book comprises six chapters, each preceded by an introduction and followed by a comprehensive list of references for further reading and research. The chapters are summarized below.

Density models provide a framework to estimate distributions of the data, which is a major task in machine learning. Chapter 1 introduces deep density models with latent variables, which are based on a greedy layer-wise unsupervised learning algorithm. Each layer of these deep models employs a model that has only one layer of latent variables, such as the Mixtures of Factor Analyzers (MFAs) and the Mixtures of Factor Analyzers with Common Factor Loadings (MCFAs).

Recurrent Neural Network (RNN)-based deep learning models have been widely investigated for sequence pattern recognition, especially the Long Short-Term Memory (LSTM).
Chapter 2 introduces a deep LSTM architecture together with a Connectionist Temporal Classification (CTC) beam search algorithm, and evaluates this design on online handwriting recognition.

Following the above deep learning-related theories, Chapters 3, 4, 5, and 6 introduce recent advances in applications of deep learning methods in several areas. Chapter 3 overviews the state-of-the-art performance of deep learning-based Chinese handwriting recognition, including both isolated character recognition and text recognition.

Chapters 4 and 5 describe applications of deep learning methods in natural language processing (NLP), which is a key research area in artificial intelligence (AI). NLP aims at designing computer algorithms to understand and process natural language in the same way as humans do. Specifically, Chapter 4 focuses on NLP fundamentals, such as word embedding or representation methods via deep learning, and describes two powerful learning models in NLP: Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). Chapter 5 addresses deep learning technologies in a number of benchmark NLP tasks, including named entity recognition, supertagging, machine translation, and text summarization.

Finally, Chapter 6 introduces oceanic data analysis with deep learning models, focusing on how CNNs are used for ocean front recognition and LSTMs for sea surface temperature prediction, respectively.

In summary, we believe this book will serve as a useful reference for senior (undergraduate or graduate) students in computer science, statistics, and electrical engineering, as well as others interested in studying or exploring the potential of exploiting deep learning algorithms. It will also be of special interest to researchers in the areas of AI, pattern recognition, machine learning, and related areas, alongside engineers interested in applying deep learning models in existing or new practical applications. In terms of prerequisites, readers are assumed to be familiar with basic machine learning concepts, including multivariate calculus, probability, and linear algebra, as well as computer programming skills.

Suzhou, China          Kaizhu Huang
Edinburgh, UK          Amir Hussain
Suzhou, China          Qiu-Feng Wang
Suzhou, China          Rui Zhang
March 2018
Contents

1 Introduction to Deep Density Models with Latent Variables ........... 1
  Xi Yang, Kaizhu Huang, Rui Zhang, and Amir Hussain
2 Deep RNN Architecture: Design and Evaluation ........................ 31
  Tonghua Su, Li Sun, Qiu-Feng Wang, and Da-Han Wang
3 Deep Learning Based Handwritten Chinese Character and Text
  Recognition ........................................................ 57
  Xu-Yao Zhang, Yi-Chao Wu, Fei Yin, and Cheng-Lin Liu
4 Deep Learning and Its Applications to Natural Language
  Processing ......................................................... 89
  Haiqin Yang, Linkai Luo, Lap Pong Chueng, David Ling, and Francis Chin
5 Deep Learning for Natural Language Processing ....................... 111
  Jiajun Zhang and Chengqing Zong
6 Oceanic Data Analysis with Deep Learning Models ..................... 139
  Guoqiang Zhong, Li-Na Wang, Qin Zhang, Estanislau Lima, Xin Sun,
  Junyu Dong, Hui Wang, and Biao Shen
Index ................................................................ 161