Cover
Copyright
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Introduction to Natural Language Processing
What is Natural Language Processing?
Tasks of Natural Language Processing
The traditional approach to Natural Language Processing
Understanding the traditional approach
Example – generating football game summaries
Drawbacks of the traditional approach
The deep learning approach to Natural Language Processing
History of deep learning
The current state of deep learning and NLP
Understanding a simple deep model – a Fully Connected Neural Network
The roadmap – beyond this chapter
Introduction to the technical tools
Description of the tools
Installing Python and scikit-learn
Installing Jupyter Notebook
Installing TensorFlow
Summary
Chapter 2: Understanding TensorFlow
What is TensorFlow?
Getting started with TensorFlow
TensorFlow client in detail
TensorFlow architecture – what happens when you execute the client?
Cafe Le TensorFlow – understanding TensorFlow with an analogy
Inputs, variables, outputs, and operations
Defining inputs in TensorFlow
Feeding data with Python code
Preloading and storing data as tensors
Building an input pipeline
Defining variables in TensorFlow
Defining TensorFlow outputs
Defining TensorFlow operations
Comparison operations
Mathematical operations
Scatter and gather operations
Neural network-related operations
Reusing variables with scoping
Implementing our first neural network
Preparing the data
Defining the TensorFlow graph
Running the neural network
Summary
Chapter 3: Word2vec – Learning Word Embeddings
What is a word representation or meaning?
Classical approaches to learning word representation
WordNet – using an external lexical knowledge base for learning word representations
Tour of WordNet
Problems with WordNet
One-hot encoded representation
The TF-IDF method
Co-occurrence matrix
Word2vec – a neural network-based approach to learning word representation
Exercise: is queen = king – he + she?
Designing a loss function for learning word embeddings
The skip-gram algorithm
From raw text to structured data
Learning the word embeddings with a neural network
Formulating a practical loss function
Efficiently approximating the loss function
Implementing skip-gram with TensorFlow
The Continuous Bag-of-Words algorithm
Implementing CBOW in TensorFlow
Summary
Chapter 4: Advanced Word2vec
The original skip-gram algorithm
Implementing the original skip-gram algorithm
Comparing the original skip-gram with the improved skip-gram
Comparing skip-gram with CBOW
Performance comparison
Which is the winner, skip-gram or CBOW?
Extensions to the word embeddings algorithms
Using the unigram distribution for negative sampling
Implementing unigram-based negative sampling
Subsampling – probabilistically ignoring the common words
Implementing subsampling
Comparing the CBOW and its extensions
More recent algorithms extending skip-gram and CBOW
A limitation of the skip-gram algorithm
The structured skip-gram algorithm
The loss function
The continuous window model
GloVe – Global Vectors representation
Understanding GloVe
Implementing GloVe
Document classification with Word2vec
Dataset
Classifying documents with word embeddings
Implementation – learning word embeddings
Implementation – word embeddings to document embeddings
Document clustering and t-SNE visualization of embedded documents
Inspecting several outliers
Implementation – clustering/classification of documents with K-means
Summary
Chapter 5: Sentence Classification with Convolutional Neural Networks
Introducing Convolutional Neural Networks
CNN fundamentals
The power of Convolutional Neural Networks
Understanding Convolutional Neural Networks
Convolution operation
Standard convolution operation
Convolving with stride
Convolving with padding
Transposed convolution
Pooling operation
Max pooling
Max pooling with stride
Average pooling
Fully connected layers
Putting everything together
Exercise – image classification on MNIST with CNN
About the data
Implementing the CNN
Analyzing the predictions produced with a CNN
Using CNNs for sentence classification
CNN structure
Data transformation
The convolution operation
Pooling over time
Implementation – sentence classification with CNNs
Summary
Chapter 6: Recurrent Neural Networks
Understanding Recurrent Neural Networks
The problem with feed-forward neural networks
Modeling with Recurrent Neural Networks
Technical description of a Recurrent Neural Network
Backpropagation Through Time
How backpropagation works
Why we cannot use BP directly for RNNs
Backpropagation Through Time – training RNNs
Truncated BPTT – training RNNs efficiently
Limitations of BPTT – vanishing and exploding gradients
Applications of RNNs
One-to-one RNNs
One-to-many RNNs
Many-to-one RNNs
Many-to-many RNNs
Generating text with RNNs
Defining hyperparameters
Unrolling the inputs over time for Truncated BPTT
Defining the validation dataset
Defining weights and biases
Defining state persisting variables
Calculating the hidden states and outputs with unrolled inputs
Calculating the loss
Resetting state at the beginning of a new segment of text
Calculating validation output
Calculating gradients and optimizing
Outputting a freshly generated chunk of text
Evaluating text results output from the RNN
Perplexity – measuring the quality of the text result
Recurrent Neural Networks with Context Features – RNNs with longer memory
Technical description of the RNN-CF
Implementing the RNN-CF
Defining the RNN-CF hyperparameters
Defining input and output placeholders
Defining weights of the RNN-CF
Variables and operations for maintaining hidden and context states
Calculating output
Calculating the loss
Calculating validation output
Computing test output
Computing the gradients and optimizing
Text generated with the RNN-CF
Summary
Chapter 7: Long Short-Term Memory Networks
Understanding Long Short-Term Memory Networks
What is an LSTM?
LSTMs in more detail
How LSTMs differ from standard RNNs
How LSTMs solve the vanishing gradient problem
Improving LSTMs
Greedy sampling
Beam search
Using word vectors
Bidirectional LSTMs (BiLSTM)
Other variants of LSTMs
Peephole connections
Gated Recurrent Units
Summary
Chapter 8: Applications of LSTM – Generating Text
Our data
About the dataset
Preprocessing data
Implementing an LSTM
Defining hyperparameters
Defining parameters
Defining an LSTM cell and its operations
Defining inputs and labels
Defining sequential calculations required to process sequential data
Defining the optimizer
Decaying learning rate over time
Making predictions
Calculating perplexity (loss)
Resetting states
Greedy sampling to break unimodality
Generating new text
Example generated text
Comparing LSTMs to LSTMs with peephole connections and GRUs
Standard LSTM
Review
Example generated text
Gated Recurrent Units (GRUs)
Review
The code
Example generated text
LSTMs with peepholes
Review
The code
Example generated text
Training and validation perplexities over time
Improving LSTMs – beam search
Implementing beam search
Examples generated with beam search
Improving LSTMs – generating text with words instead of n-grams
The curse of dimensionality
Word2vec to the rescue
Generating text with Word2vec
Examples generated with LSTM-Word2vec and beam search
Perplexity over time
Using the TensorFlow RNN API
Summary
Chapter 9: Applications of LSTM – Image Caption Generation
Getting to know the data
ILSVRC ImageNet dataset
The MS-COCO dataset
The machine learning pipeline for image caption generation
Extracting image features with CNNs
Implementation – loading weights and inferencing with VGG-16
Building and updating variables
Preprocessing inputs
Inferring VGG-16
Extracting vectorized representations of images
Predicting class probabilities with VGG-16
Learning word embeddings
Preparing captions for feeding into LSTMs
Generating data for LSTMs
Defining the LSTM
Evaluating the results quantitatively
BLEU
ROUGE
METEOR
CIDEr
BLEU-4 over time for our model
Captions generated for test images
Using TensorFlow RNN API with pretrained GloVe word vectors
Loading GloVe word vectors
Cleaning data
Using pretrained embeddings with TensorFlow RNN API
Defining the pretrained embedding layer and the adaptation layer
Defining the LSTM cell and softmax layer
Defining inputs and outputs
Processing images and text differently
Defining the LSTM output calculation
Defining the logits and predictions
Defining the sequence loss
Defining the optimizer
Summary
Chapter 10: Sequence-to-Sequence Learning – Neural Machine Translation
Machine translation
A brief historical tour of machine translation
Rule-based translation
Statistical Machine Translation (SMT)
Neural Machine Translation (NMT)
Understanding Neural Machine Translation
Intuition behind NMT
NMT architecture
The embedding layer
The encoder
The context vector
The decoder
Preparing data for the NMT system
At training time
Reversing the source sentence
At testing time
Training the NMT
Inference with NMT
The BLEU score – evaluating the machine translation systems
Modified precision
Brevity penalty
The final BLEU score
Implementing an NMT from scratch – a German to English translator
Introduction to data
Preprocessing data
Learning word embeddings
Defining the encoder and the decoder
Defining the end-to-end output calculation
Some translation results
Training an NMT jointly with word embeddings
Maximizing matchings between the dataset vocabulary and the pretrained embeddings
Defining the embeddings layer as a TensorFlow variable
Improving NMTs
Teacher forcing
Deep LSTMs
Attention
Breaking the context vector bottleneck
The attention mechanism in detail
Implementing the attention mechanism
Defining weights
Computing attention
Some translation results – NMT with attention
Visualizing attention for source and target sentences
Other applications of Seq2Seq models – chatbots
Training a chatbot
Evaluating chatbots – Turing test
Summary
Chapter 11: Current Trends and the Future of Natural Language Processing
Current trends in NLP
Word embeddings
Region embedding
Probabilistic word embedding
Ensemble embedding
Topic embedding
Neural Machine Translation (NMT)
Improving the attention mechanism
Hybrid MT models
Penetration into other research fields
Combining NLP with computer vision
Visual Question Answering (VQA)
Caption generation for images with attention
Reinforcement learning
Teaching agents to communicate using their own language
Dialogue agents with reinforcement learning
Generative Adversarial Networks for NLP
Towards Artificial General Intelligence
One Model to Learn Them All
A joint many-task model – growing a neural network for multiple NLP tasks
First level – word-based tasks
Second level – syntactic tasks
Third level – semantic-level tasks
NLP for social media
Detecting rumors in social media
Detecting emotions in social media
Analyzing political framing in tweets
New tasks emerging
Detecting sarcasm
Language grounding
Skimming text with LSTMs
Newer machine learning models
Phased LSTM
Dilated Recurrent Neural Networks (DRNNs)
Summary
References
Appendix: Mathematical Foundations and Advanced TensorFlow
Basic data structures
Scalar
Vectors
Matrices
Indexing of a matrix
Special types of matrices
Identity matrix
Diagonal matrix
Tensors
Tensor/matrix operations
Transpose
Multiplication
Element-wise multiplication
Inverse
Finding the matrix inverse – Singular Value Decomposition (SVD)
Norms
Determinant
Probability
Random variables
Discrete random variables
Continuous random variables
The probability mass/density function
Conditional probability
Joint probability
Marginal probability
Bayes' rule
Introduction to Keras
Introduction to the TensorFlow seq2seq library
Defining embeddings for the encoder and decoder
Defining the encoder
Defining the decoder
Visualizing word embeddings with TensorBoard
Starting TensorBoard
Saving word embeddings and visualizing via TensorBoard
Summary
Other Books You May Enjoy
Index
Natural Language Processing with TensorFlow
Teach language to machines using Python's deep learning library
Thushan Ganegedara
BIRMINGHAM - MUMBAI
Natural Language Processing with TensorFlow
Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Acquisition Editor: Frank Pohlmann
Project Editor: Radhika Atitkar
Content Development Editor: Chris Nelson
Technical Editor: Bhagyashree Rai
Copy Editor: Tom Jacob
Proofreader: Safis Editing
Indexer: Rekha Nair
Graphics: Tom Scaria
Production Coordinator: Nilesh Mohite

First published: May 2018
Production reference: 2310518

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78847-831-1

www.packtpub.com
mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?
• Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
• Learn better with Skill Plans built especially for you
• Get a free eBook or video every month
• Mapt is fully searchable
• Copy and paste, print, and bookmark content

PacktPub.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Contributors

About the author

Thushan Ganegedara is currently a third-year Ph.D. student at the University of Sydney, Australia. He specializes in machine learning and has a liking for deep learning. He lives dangerously and runs algorithms on untested data. He also works as the chief data scientist for AssessThreat, an Australian start-up. He received his BSc. (Hons) from the University of Moratuwa, Sri Lanka. He frequently writes technical articles and tutorials about machine learning. Additionally, he strives for a healthy lifestyle by including swimming in his daily schedule.

I would like to thank my parents, my siblings, and my wife for the faith they had in me and the support they have given, as well as all my teachers and my Ph.D. advisor for the guidance they provided.
About the reviewers

Motaz Saad holds a Ph.D. in computer science from the University of Lorraine. He loves data and he likes to play with it. He has over 10 years of professional experience in NLP, computational linguistics, data science, and machine learning. He currently works as an assistant professor at the faculty of information technology, IUG.

Dr Joseph O'Connor is a data scientist with a deep passion for deep learning. His company, Deep Learn Analytics, a UK-based data science consultancy, works with businesses to develop machine learning applications and infrastructure from concept to deployment. He was awarded a Ph.D. from University College London for his work analyzing data on the MINOS high-energy physics experiment. Since then, he has developed ML products for a number of companies in the private sector, specializing in NLP and time series forecasting. You can find him at http://deeplearnanalytics.com/.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.