Neural Networks for Applied Sciences and Engineering
Dedication
Contents
Preface
Acknowledgments
About the Author
Chapter 1: From Data to Models: Complexity and Challenges in Understanding Biological, Ecological, and Natural Systems
1.1 Introduction
1.2 Layout of the Book
References
Chapter 2: Fundamentals of Neural Networks and Models for Linear Data Analysis
2.1 Introduction and Overview
2.2 Neural Networks and Their Capabilities
2.3 Inspirations from Biology
2.4 Modeling Information Processing in Neurons
2.5 Neuron Models and Learning Strategies
2.5.1 Threshold Neuron as a Simple Classifier
2.5.2 Learning Models for Neurons and Neural Assemblies
2.5.2.1 Hebbian Learning
2.5.2.2 Unsupervised or Competitive Learning
2.5.2.3 Supervised Learning
2.5.3 Perceptron with Supervised Learning as a Classifier
2.5.3.1 Perceptron Learning Algorithm
2.5.3.2 A Practical Example of Perceptron on a Larger Realistic Data Set: Identifying the Origin of Fish from the Growth-Ring Diameter of Scales
2.5.3.3 Comparison of Perceptron with Linear Discriminant Function Analysis in Statistics
2.5.3.4 Multi-Output Perceptron for Multicategory Classification
2.5.3.5 Higher-Dimensional Classification Using Perceptron
2.5.3.6 Perceptron Summary
2.5.4 Linear Neuron for Linear Classification and Prediction
2.5.4.1 Learning with the Delta Rule
2.5.4.2 Linear Neuron as a Classifier
2.5.4.3 Classification Properties of a Linear Neuron as a Subset of Predictive Capabilities
2.5.4.4 Example: Linear Neuron as a Predictor
2.5.4.5 A Practical Example of Linear Prediction: Predicting the Heat Influx in a Home
2.5.4.6 Comparison of Linear Neuron Model with Linear Regression
2.5.4.7 Example: Multiple Input Linear Neuron Model—Improving the Prediction Accuracy of Heat Influx in a Home
2.5.4.8 Comparison of a Multiple-Input Linear Neuron with Multiple Linear Regression
2.5.4.9 Multiple Linear Neuron Models
2.5.4.10 Comparison of a Multiple Linear Neuron Network with Canonical Correlation Analysis
2.5.4.11 Linear Neuron and Linear Network Summary
2.6 Summary
Problems
References
Chapter 3: Neural Networks for Nonlinear Pattern Recognition
3.1 Overview and Introduction
3.1.1 Multilayer Perceptron
3.2 Nonlinear Neurons
3.2.1 Neuron Activation Functions
3.2.1.1 Sigmoid Functions
3.2.1.2 Gaussian Functions
3.2.2 Example: Population Growth Modeling Using a Nonlinear Neuron
3.2.3 Comparison of Nonlinear Neuron with Nonlinear Regression Analysis
3.3 One-Input Multilayer Nonlinear Networks
3.3.1 Processing with a Single Nonlinear Hidden Neuron
3.3.2 Examples: Modeling Cyclical Phenomena with Multiple Nonlinear Neurons
3.3.2.1 Example 1: Approximating a Square Wave
3.3.2.2 Example 2: Modeling Seasonal Species Migration
3.4 Two-Input Multilayer Perceptron Network
3.4.1 Processing of Two-Dimensional Inputs by Nonlinear Neurons
3.4.2 Network Output
3.4.3 Examples: Two-Dimensional Prediction and Classification
3.4.3.1 Example 1: Two-Dimensional Nonlinear Function Approximation
3.4.3.2 Example 2: Two-Dimensional Nonlinear Classification Model
3.5 Multidimensional Data Modeling with Nonlinear Multilayer Perceptron Networks
3.6 Summary
Problems
References
Chapter 4: Learning of Nonlinear Patterns by Neural Networks
4.1 Introduction and Overview
4.2 Supervised Training of Networks for Nonlinear Pattern Recognition
4.3 Gradient Descent and Error Minimization
4.4 Backpropagation Learning
4.4.1 Example: Backpropagation Training—A Hand Computation
4.4.1.1 Error Gradient with Respect to Output Neuron Weights
4.4.1.2 The Error Gradient with Respect to the Hidden-Neuron Weights
4.4.1.3 Application of Gradient Descent in Backpropagation Learning
4.4.1.4 Batch Learning
4.4.1.5 Learning Rate and Weight Update
4.4.1.6 Example-by-Example (Online) Learning
4.4.1.7 Momentum
4.4.2 Example: Backpropagation Learning Computer Experiment
4.4.3 Single-Input Single-Output Network with Multiple Hidden Neurons
4.4.4 Multiple-Input, Multiple-Hidden Neuron, and Single-Output Network
4.4.5 Multiple-Input, Multiple-Hidden Neuron, Multiple-Output Network
4.4.6 Example: Backpropagation Learning Case Study—Solving a Complex Classification Problem
4.5 Delta-Bar-Delta Learning (Adaptive Learning Rate) Method
4.5.1 Example: Network Training with Delta-Bar-Delta—A Hand Computation
4.5.2 Example: Delta-Bar-Delta with Momentum—A Hand Computation
4.5.3 Network Training with Delta-Bar-Delta—A Computer Experiment
4.5.4 Comparison of Delta-Bar-Delta Method with Backpropagation
4.5.5 Example: Network Training with Delta-Bar-Delta—A Case Study
4.6 Steepest Descent Method
4.6.1 Example: Network Training with Steepest Descent—Hand Computation
4.6.2 Example: Network Training with Steepest Descent—A Computer Experiment
4.7 Second-Order Methods of Error Minimization and Weight Optimization
4.7.1 QuickProp
4.7.1.1 Example: Network Training with QuickProp—A Hand Computation
4.7.1.2 Example: Network Training with QuickProp—A Computer Experiment
4.7.1.3 Comparison of QuickProp with Steepest Descent, Delta-Bar-Delta, and Backpropagation
4.7.2 General Concept of Second-Order Methods of Error Minimization
4.7.3 Gauss–Newton Method
4.7.3.1 Network Training with the Gauss–Newton Method—A Hand Computation
4.7.3.2 Example: Network Training with Gauss–Newton Method—A Computer Experiment
4.7.4 The Levenberg–Marquardt Method
4.7.4.1 Example: Network Training with LM Method—A Hand Computation
4.7.4.2 Network Training with the LM Method—A Computer Experiment
4.7.5 Comparison of the Efficiency of the First-Order and Second-Order Methods in Minimizing Error
4.7.6 Comparison of the Convergence Characteristics of First-Order and Second-Order Learning Methods
4.7.6.1 Backpropagation
4.7.6.2 Steepest Descent Method
4.7.6.3 Gauss–Newton Method
4.7.6.4 Levenberg–Marquardt Method
4.8 Summary
Problems
References
Chapter 5: Implementation of Neural Network Models for Extracting Reliable Patterns from Data
5.1 Introduction and Overview
5.2 Bias–Variance Tradeoff
5.3 Improving Generalization of Neural Networks
5.3.1 Illustration of Early Stopping
5.3.1.1 Effect of Initial Random Weights
5.3.1.2 Weight Structure of the Trained Networks
5.3.1.3 Effect of Random Sampling
5.3.1.4 Effect of Model Complexity: Number of Hidden Neurons
5.3.1.5 Summary on Early Stopping
5.3.2 Regularization
5.4 Reducing Structural Complexity of Networks by Pruning
5.4.1 Optimal Brain Damage
5.4.1.1 Example of Network Pruning with Optimal Brain Damage
5.4.2 Network Pruning Based on Variance of Network Sensitivity
5.4.2.1 Illustration of Application of Variance Nullity in Pruning Weights
5.4.2.2 Pruning Hidden Neurons Based on Variance Nullity of Sensitivity
5.5 Robustness of a Network to Perturbation of Weights
5.5.1 Confidence Intervals for Weights
5.6 Summary
Problems
References
Chapter 6: Data Exploration, Dimensionality Reduction, and Feature Extraction
6.1 Introduction and Overview
6.1.1 Example: Thermal Conductivity of Wood in Relation to Correlated Input Data
6.2 Data Visualization
6.2.1 Correlation Scatter Plots and Histograms
6.2.2 Parallel Visualization
6.2.3 Projecting Multidimensional Data onto Two-Dimensional Plane
6.3 Correlation and Covariance between Variables
6.4 Normalization of Data
6.4.1 Standardization
6.4.2 Simple Range Scaling
6.4.3 Whitening—Normalization of Correlated Multivariate Data
6.5 Selecting Relevant Inputs
6.5.1 Statistical Tools for Variable Selection
6.5.1.1 Partial Correlation
6.5.1.2 Multiple Regression and Best-Subsets Regression
6.6 Dimensionality Reduction and Feature Extraction
6.6.1 Multicollinearity
6.6.2 Principal Component Analysis (PCA)
6.6.3 Partial Least-Squares Regression
6.7 Outlier Detection
6.8 Noise
6.9 Case Study: Illustrating Input Selection and Dimensionality Reduction for a Practical Problem
6.9.1 Data Preprocessing and Preliminary Modeling
6.9.2 PCA-Based Neural Network Modeling
6.9.3 Effect of Hidden Neurons for Non-PCA- and PCA-Based Approaches
6.9.4 Case Study Summary
6.10 Summary
Problems
References
Chapter 7: Assessment of Uncertainty of Neural Network Models Using Bayesian Statistics
7.1 Introduction and Overview
7.2 Estimating Weight Uncertainty Using Bayesian Statistics
7.2.1 Quality Criterion
7.2.2 Incorporating Bayesian Statistics to Estimate Weight Uncertainty
7.2.2.1 Square Error
7.2.3 Intrinsic Uncertainty of Targets for Multivariate Output
7.2.4 Probability Density Function of Weights
7.2.5 Example Illustrating Generation of Probability Distribution of Weights
7.2.5.1 Estimation of Geophysical Parameters from Remote Sensing: A Case Study
7.3 Assessing Uncertainty of Neural Network Outputs Using Bayesian Statistics
7.3.1 Example Illustrating Uncertainty Assessment of Output Errors
7.3.1.1 Total Network Output Errors
7.3.1.2 Error Correlation and Covariance Matrices
7.3.1.3 Statistical Analysis of Error Covariance
7.3.1.4 Decomposition of Total Output Error into Model Error and Intrinsic Noise
7.4 Assessing the Sensitivity of Network Outputs to Inputs
7.4.1 Approaches to Determine the Influence of Inputs on Outputs in Feedforward Networks
7.4.1.1 Methods Based on Magnitude of Weights
7.4.1.2 Sensitivity Analysis
7.4.2 Example: Comparison of Methods to Assess the Influence of Inputs on Outputs
7.4.3 Uncertainty of Sensitivities
7.4.4 Example Illustrating Uncertainty Assessment of Network Sensitivity to Inputs
7.4.4.1 PCA Decomposition of Inputs and Outputs
7.4.4.2 PCA-Based Neural Network Regression
7.4.4.3 Neural Network Sensitivities
7.4.4.4 Uncertainty of Input Sensitivity
7.4.4.5 PCA-Regularized Jacobians
7.4.4.6 Case Study Summary
7.5 Summary
Problems
References
Chapter 8: Discovering Unknown Clusters in Data with Self-Organizing Maps
8.1 Introduction and Overview
8.2 Structure of Unsupervised Networks
8.3 Learning in Unsupervised Networks
8.4 Implementation of Competitive Learning
8.4.1 Winner Selection Based on Neuron Activation
8.4.2 Winner Selection Based on Distance to Input Vector
8.4.2.1 Other Distance Measures
8.4.3 Competitive Learning Example
8.4.3.1 Recursive Versus Batch Learning
8.4.3.2 Illustration of the Calculations Involved in Winner Selection
8.4.3.3 Network Training
8.5 Self-Organizing Feature Maps
8.5.1 Learning in Self-Organizing Map Networks
8.5.1.1 Selection of Neighborhood Geometry
8.5.1.2 Training of Self-Organizing Maps
8.5.1.3 Neighbor Strength
8.5.1.4 Example: Training Self-Organizing Networks with a Neighbor Feature
8.5.1.5 Neighbor Matrix and Distance to Neighbors from the Winner
8.5.1.6 Shrinking Neighborhood Size with Iterations
8.5.1.7 Learning Rate Decay
8.5.1.8 Weight Update Incorporating Learning Rate and Neighborhood Decay
8.5.1.9 Recursive and Batch Training and Relation to K-Means Clustering
8.5.1.10 Two Phases of Self-Organizing Map Training
8.5.1.11 Example: Illustrating Self-Organizing Map Learning with a Hand Calculation
8.5.1.12 SOM Case Study: Determination of Mastitis Health Status of Dairy Herd from Combined Milk Traits
8.5.2 Example of Two-Dimensional Self-Organizing Maps: Clustering Canadian and Alaskan Salmon Based on the Diameter of Growth Rings of the Scales
8.5.2.1 Map Structure and Initialization
8.5.2.2 Map Training
8.5.2.3 U-Matrix
8.5.3 Map Initialization
8.5.4 Example: Training Two-Dimensional Maps on Multidimensional Data
8.5.4.1 Data Visualization
8.5.4.2 Map Structure and Training
8.5.4.3 U-Matrix
8.5.4.4 Point Estimates of Probability Density of Inputs Captured by the Map
8.5.4.5 Quantization Error
8.5.4.6 Accuracy of Retrieval of Input Data from the Map
8.5.5 Forming Clusters on the Map
8.5.5.1 Approaches to Clustering
8.5.5.2 Example Illustrating Clustering on a Trained Map
8.5.5.3 Finding Optimum Clusters on the Map with the Ward Method
8.5.5.4 Finding Optimum Clusters by K-Means Clustering
8.5.6 Validation of a Trained Map
8.5.6.1 n-Fold Cross Validation
8.6 Evolving Self-Organizing Maps
8.6.1 Growing Cell Structure of Map
8.6.1.1 Centroid Method for Mapping Input Data onto Positions between Neurons on the Map
8.6.2 Dynamic Self-Organizing Maps with Controlled Growth (GSOM)
8.6.2.1 Example: Application of Dynamic Self-Organizing Maps
8.6.3 Evolving Tree
8.7 Summary
Problems
References
Chapter 9: Neural Networks for Time-Series Forecasting
9.1 Introduction and Overview
9.2 Linear Forecasting of Time-Series with Statistical and Neural Network Models
9.2.1 Example Case Study: Regulating Temperature of a Furnace
9.2.1.1 Multistep-Ahead Linear Forecasting
9.3 Neural Networks for Nonlinear Time-Series Forecasting
9.3.1 Focused Time-Lagged and Dynamically Driven Recurrent Networks
9.3.1.1 Focused Time-Lagged Feedforward Networks
9.3.1.2 Spatio-Temporal Time-Lagged Networks
9.3.2 Example: Spatio-Temporal Time-Lagged Network—Regulating Temperature in a Furnace
9.3.2.1 Single-Step Forecasting with Neural NARx Model
9.3.2.2 Multistep Forecasting with Neural NARx Model
9.3.3 Case Study: River Flow Forecasting
9.3.3.1 Linear Model for River Flow Forecasting
9.3.3.2 Nonlinear Neural (NARx) Model for River Flow Forecasting
9.3.3.3 Input Sensitivity
9.4 Hybrid Linear (ARIMA) and Nonlinear Neural Network Models
9.4.1 Case Study: Forecasting the Annual Number of Sunspots
9.5 Automatic Generation of Network Structure Using Simplest Structure Concept
9.5.1 Case Study: Forecasting Air Pollution with Automatic Neural Network Model Generation
9.6 Generalized Neuron Network
9.6.1 Case Study: Short-Term Load Forecasting with a Generalized Neuron Network
9.7 Dynamically Driven Recurrent Networks
9.7.1 Recurrent Networks with Hidden Neuron Feedback
9.7.1.1 Encapsulating Long-Term Memory
9.7.1.2 Structure and Operation of the Elman Network
9.7.1.3 Training Recurrent Networks
9.7.1.4 Network Training Example: Hand Calculation
9.7.1.5 Recurrent Learning Network Application Case Study: Rainfall Runoff Modeling
9.7.1.6 Two-Step-Ahead Forecasting with Recurrent Networks
9.7.1.7 Real-Time Recurrent Learning Case Study: Two-Step-Ahead Stream Flow Forecasting
9.7.2 Recurrent Networks with Output Feedback
9.7.2.1 Encapsulating Long-Term Memory in Recurrent Networks with Output Feedback
9.7.2.2 Application of a Recurrent Net with Output and Error Feedback and Exogenous Inputs (NARIMAx) Case Study: Short-Term Temperature Forecasting
9.7.2.3 Training of Recurrent Nets with Output Feedback
9.7.3 Fully Recurrent Network
9.7.3.1 Fully Recurrent Network Practical Application Case Study: Short-Term Electricity Load Forecasting
9.8 Bias and Variance in Time-Series Forecasting
9.8.1 Decomposition of Total Error into Bias and Variance Components
9.8.2 Example Illustrating Bias–Variance Decomposition
9.9 Long-Term Forecasting
9.9.1 Case Study: Long-Term Forecasting with Multiple Neural Networks (MNNs)
9.10 Input Selection for Time-Series Forecasting
9.10.1 Input Selection from Nonlinearly Dependent Variables
9.10.1.1 Partial Mutual Information Method
9.10.1.2 Generalized Regression Neural Network
9.10.1.3 Self-Organizing Maps for Input Selection
9.10.1.4 Genetic Algorithms for Input Selection
9.10.2 Practical Application of Input Selection Methods for Time-Series Forecasting
9.10.3 Input Selection Case Study: Selecting Inputs for Forecasting River Salinity
9.11 Summary
Problems
References
Appendix
A.1 Linear Algebra, Vectors, and Matrices
A.1.1 Addition of Vectors
A.1.2 Multiplication of a Vector by a Scalar
A.1.3 The Norm of a Vector
A.1.4 Vector Multiplication: Dot Products
A.2 Matrices
A.2.1 Matrix Addition
A.2.2 Matrix Multiplication
A.2.3 Multiplication of a Matrix by a Vector
A.2.4 Matrix Transpose
References