Cover
Copyright
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Introduction to Natural Language Processing
What is Natural Language Processing?
Tasks of Natural Language Processing
The traditional approach to Natural Language Processing
Understanding the traditional approach
Example – generating football game summaries
Drawbacks of the traditional approach
The deep learning approach to Natural Language Processing
History of deep learning
The current state of deep learning and NLP
Understanding a simple deep model – a Fully Connected Neural Network
The roadmap – beyond this chapter
Introduction to the technical tools
Description of the tools
Installing Python and scikit-learn
Installing Jupyter Notebook
Installing TensorFlow
Summary
Chapter 2: Understanding TensorFlow
What is TensorFlow?
Getting started with TensorFlow
TensorFlow client in detail
TensorFlow architecture – what happens when you execute the client?
Cafe Le TensorFlow – understanding TensorFlow with an analogy
Inputs, variables, outputs, and operations
Defining inputs in TensorFlow
Feeding data with Python code
Preloading and storing data as tensors
Building an input pipeline
Defining variables in TensorFlow
Defining TensorFlow outputs
Defining TensorFlow operations
Comparison operations
Mathematical operations
Scatter and gather operations
Neural network-related operations
Reusing variables with scoping
Implementing our first neural network
Preparing the data
Defining the TensorFlow graph
Running the neural network
Summary
Chapter 3: Word2vec – Learning Word Embeddings
What is a word representation or meaning?
Classical approaches to learning word representation
WordNet – using an external lexical knowledge base for learning word representations
Tour of WordNet
Problems with WordNet
One-hot encoded representation
The TF-IDF method
Co-occurrence matrix
Word2vec – a neural network-based approach to learning word representation
Exercise: is queen = king – he + she?
Designing a loss function for learning word embeddings
The skip-gram algorithm
From raw text to structured data
Learning the word embeddings with a neural network
Formulating a practical loss function
Efficiently approximating the loss function
Implementing skip-gram with TensorFlow
The Continuous Bag-of-Words algorithm
Implementing CBOW in TensorFlow
Summary
Chapter 4: Advanced Word2vec
The original skip-gram algorithm
Implementing the original skip-gram algorithm
Comparing the original skip-gram with the improved skip-gram
Comparing skip-gram with CBOW
Performance comparison
Which is the winner, skip-gram or CBOW?
Extensions to the word embeddings algorithms
Using the unigram distribution for negative sampling
Implementing unigram-based negative sampling
Subsampling – probabilistically ignoring the common words
Implementing subsampling
Comparing the CBOW and its extensions
More recent algorithms extending skip-gram and CBOW
A limitation of the skip-gram algorithm
The structured skip-gram algorithm
The loss function
The continuous window model
GloVe – Global Vectors representation
Understanding GloVe
Implementing GloVe
Document classification with Word2vec
Dataset
Classifying documents with word embeddings
Implementation – learning word embeddings
Implementation – word embeddings to document embeddings
Document clustering and t-SNE visualization of embedded documents
Inspecting several outliers
Implementation – clustering/classification of documents with K-means
Summary
Chapter 5: Sentence Classification with Convolutional Neural Networks
Introducing Convolutional Neural Networks
CNN fundamentals
The power of Convolutional Neural Networks
Understanding Convolutional Neural Networks
Convolution operation
Standard convolution operation
Convolving with stride
Convolving with padding
Transposed convolution
Pooling operation
Max pooling
Max pooling with stride
Average pooling
Fully connected layers
Putting everything together
Exercise – image classification on MNIST with CNN
About the data
Implementing the CNN
Analyzing the predictions produced with a CNN
Using CNNs for sentence classification
CNN structure
Data transformation
The convolution operation
Pooling over time
Implementation – sentence classification with CNNs
Summary
Chapter 6: Recurrent Neural Networks
Understanding Recurrent Neural Networks
The problem with feed-forward neural networks
Modeling with Recurrent Neural Networks
Technical description of a Recurrent Neural Network
Backpropagation Through Time
How backpropagation works
Why we cannot use BP directly for RNNs
Backpropagation Through Time – training RNNs
Truncated BPTT – training RNNs efficiently
Limitations of BPTT – vanishing and exploding gradients
Applications of RNNs
One-to-one RNNs
One-to-many RNNs
Many-to-one RNNs
Many-to-many RNNs
Generating text with RNNs
Defining hyperparameters
Unrolling the inputs over time for Truncated BPTT
Defining the validation dataset
Defining weights and biases
Defining state persisting variables
Calculating the hidden states and outputs with unrolled inputs
Calculating the loss
Resetting state at the beginning of a new segment of text
Calculating validation output
Calculating gradients and optimizing
Outputting a freshly generated chunk of text
Evaluating text results output from the RNN
Perplexity – measuring the quality of the text result
Recurrent Neural Networks with Context Features – RNNs with longer memory
Technical description of the RNN-CF
Implementing the RNN-CF
Defining the RNN-CF hyperparameters
Defining input and output placeholders
Defining weights of the RNN-CF
Variables and operations for maintaining hidden and context states
Calculating output
Calculating the loss
Calculating validation output
Computing test output
Computing the gradients and optimizing
Text generated with the RNN-CF
Summary
Chapter 7: Long Short-Term Memory Networks
Understanding Long Short-Term Memory Networks
What is an LSTM?
LSTMs in more detail
How LSTMs differ from standard RNNs
How LSTMs solve the vanishing gradient problem
Improving LSTMs
Greedy sampling
Beam search
Using word vectors
Bidirectional LSTMs (BiLSTM)
Other variants of LSTMs
Peephole connections
Gated Recurrent Units
Summary
Chapter 8: Applications of LSTM – Generating Text
Our data
About the dataset
Preprocessing data
Implementing an LSTM
Defining hyperparameters
Defining parameters
Defining an LSTM cell and its operations
Defining inputs and labels
Defining sequential calculations required to process sequential data
Defining the optimizer
Decaying learning rate over time
Making predictions
Calculating perplexity (loss)
Resetting states
Greedy sampling to break unimodality
Generating new text
Example generated text
Comparing LSTMs to LSTMs with peephole connections and GRUs
Standard LSTM
Review
Example generated text
Gated Recurrent Units (GRUs)
Review
The code
Example generated text
LSTMs with peepholes
Review
The code
Example generated text
Training and validation perplexities over time
Improving LSTMs – beam search
Implementing beam search
Examples generated with beam search
Improving LSTMs – generating text with words instead of n-grams
The curse of dimensionality
Word2vec to the rescue
Generating text with Word2vec
Examples generated with LSTM-Word2vec and beam search
Perplexity over time
Using the TensorFlow RNN API
Summary
Chapter 9: Applications of LSTM – Image Caption Generation
Getting to know the data
ILSVRC ImageNet dataset
The MS-COCO dataset
The machine learning pipeline for image caption generation
Extracting image features with CNNs
Implementation – loading weights and inferencing with VGG-16
Building and updating variables
Preprocessing inputs
Inferring VGG-16
Extracting vectorized representations of images
Predicting class probabilities with VGG-16
Learning word embeddings
Preparing captions for feeding into LSTMs
Generating data for LSTMs
Defining the LSTM
Evaluating the results quantitatively
BLEU
ROUGE
METEOR
CIDEr
BLEU-4 over time for our model
Captions generated for test images
Using TensorFlow RNN API with pretrained GloVe word vectors
Loading GloVe word vectors
Cleaning data
Using pretrained embeddings with TensorFlow RNN API
Defining the pretrained embedding layer and the adaptation layer
Defining the LSTM cell and softmax layer
Defining inputs and outputs
Processing images and text differently
Defining the LSTM output calculation
Defining the logits and predictions
Defining the sequence loss
Defining the optimizer
Summary
Chapter 10: Sequence-to-Sequence Learning – Neural Machine Translation
Machine translation
A brief historical tour of machine translation
Rule-based translation
Statistical Machine Translation (SMT)
Neural Machine Translation (NMT)
Understanding Neural Machine Translation
Intuition behind NMT
NMT architecture
The embedding layer
The encoder
The context vector
The decoder
Preparing data for the NMT system
At training time
Reversing the source sentence
At testing time
Training the NMT
Inference with NMT
The BLEU score – evaluating the machine translation systems
Modified precision
Brevity penalty
The final BLEU score
Implementing an NMT from scratch – a German to English translator
Introduction to data
Preprocessing data
Learning word embeddings
Defining the encoder and the decoder
Defining the end-to-end output calculation
Some translation results
Training an NMT jointly with word embeddings
Maximizing matchings between the dataset vocabulary and the pretrained embeddings
Defining the embeddings layer as a TensorFlow variable
Improving NMTs
Teacher forcing
Deep LSTMs
Attention
Breaking the context vector bottleneck
The attention mechanism in detail
Implementing the attention mechanism
Defining weights
Computing attention
Some translation results – NMT with attention
Visualizing attention for source and target sentences
Other applications of Seq2Seq models – chatbots
Training a chatbot
Evaluating chatbots – Turing test
Summary
Chapter 11: Current Trends and the Future of Natural Language Processing
Current trends in NLP
Word embeddings
Region embedding
Probabilistic word embedding
Ensemble embedding
Topic embedding
Neural Machine Translation (NMT)
Improving the attention mechanism
Hybrid MT models
Penetration into other research fields
Combining NLP with computer vision
Visual Question Answering (VQA)
Caption generation for images with attention
Reinforcement learning
Teaching agents to communicate using their own language
Dialogue agents with reinforcement learning
Generative Adversarial Networks for NLP
Towards Artificial General Intelligence
One Model to Learn Them All
A joint many-task model – growing a neural network for multiple NLP tasks
First level – word-based tasks
Second level – syntactic tasks
Third level – semantic-level tasks
NLP for social media
Detecting rumors in social media
Detecting emotions in social media
Analyzing political framing in tweets
New tasks emerging
Detecting sarcasm
Language grounding
Skimming text with LSTMs
Newer machine learning models
Phased LSTM
Dilated Recurrent Neural Networks (DRNNs)
Summary
References
Appendix: Mathematical Foundations and Advanced TensorFlow
Basic data structures
Scalar
Vectors
Matrices
Indexing of a matrix
Special types of matrices
Identity matrix
Diagonal matrix
Tensors
Tensor/matrix operations
Transpose
Multiplication
Element-wise multiplication
Inverse
Finding the matrix inverse – Singular Value Decomposition (SVD)
Norms
Determinant
Probability
Random variables
Discrete random variables
Continuous random variables
The probability mass/density function
Conditional probability
Joint probability
Marginal probability
Bayes' rule
Introduction to Keras
Introduction to the TensorFlow seq2seq library
Defining embeddings for the encoder and decoder
Defining the encoder
Defining the decoder
Visualizing word embeddings with TensorBoard
Starting TensorBoard
Saving word embeddings and visualizing via TensorBoard
Summary
Other Books You May Enjoy
Index
Natural Language Processing with TensorFlow
Teach language to machines using Python's deep learning library
Thushan Ganegedara
BIRMINGHAM - MUMBAI
Natural Language Processing with TensorFlow
Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Acquisition Editor: Frank Pohlmann
Project Editor: Radhika Atitkar
Content Development Editor: Chris Nelson
Technical Editor: Bhagyashree Rai
Copy Editor: Tom Jacob
Proofreader: Safis Editing
Indexer: Rekha Nair
Graphics: Tom Scaria
Production Coordinator: Nilesh Mohite

First published: May 2018
Production reference: 2310518

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78847-831-1

www.packtpub.com
mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?
• Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
• Learn better with Skill Plans built especially for you
• Get a free eBook or video every month
• Mapt is fully searchable
• Copy and paste, print, and bookmark content

PacktPub.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Contributors

About the author

Thushan Ganegedara is currently a third-year Ph.D. student at the University of Sydney, Australia. He specializes in machine learning and has a liking for deep learning. He lives dangerously and runs algorithms on untested data. He also works as the chief data scientist for AssessThreat, an Australian start-up. He received his BSc. (Hons) from the University of Moratuwa, Sri Lanka. He frequently writes technical articles and tutorials about machine learning. Additionally, he strives for a healthy lifestyle by including swimming in his daily schedule.

I would like to thank my parents, my siblings, and my wife for the faith they had in me and the support they have given, as well as all my teachers and my Ph.D. advisor for the guidance they provided.
About the reviewers

Motaz Saad holds a Ph.D. in computer science from the University of Lorraine. He loves data and he likes to play with it. He has over 10 years of professional experience in NLP, computational linguistics, data science, and machine learning. He currently works as an assistant professor at the faculty of information technology, IUG.

Dr Joseph O'Connor is a data scientist with a deep passion for deep learning. His company, Deep Learn Analytics, a UK-based data science consultancy, works with businesses to develop machine learning applications and infrastructure from concept to deployment. He was awarded a Ph.D. from University College London for his work analyzing data on the MINOS high-energy physics experiment. Since then, he has developed ML products for a number of companies in the private sector, specializing in NLP and time series forecasting. You can find him at http://deeplearnanalytics.com/.

Packt is searching for authors like you

If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.