Matrix Algebra: Theory, Computations and Applications in Statistics, Second Edition

Preface to the Second Edition
Preface to the First Edition
Acknowledgments
Contents
Author Biography
Part I Linear Algebra
1 Basic Vector/Matrix Structure and Notation
1.1 Vectors
1.2 Arrays
1.3 Matrices
1.3.1 Subvectors and Submatrices
1.4 Representation of Data
2 Vectors and Vector Spaces
2.1 Operations on Vectors
2.1.1 Linear Combinations and Linear Independence
2.1.2 Vector Spaces and Spaces of Vectors
2.1.2.1 Generating Sets
2.1.2.2 The Order and the Dimension of a Vector Space
2.1.2.3 Vector Spaces with an Infinite Number of Dimensions
2.1.2.4 Essentially Disjoint Vector Spaces
2.1.2.5 Some Special Vectors: Notation
2.1.2.6 Ordinal Relations Among Vectors
2.1.2.7 Set Operations on Vector Spaces
2.1.2.8 Subspaces
2.1.2.9 Intersections of Vector Spaces
2.1.2.10 Unions and Direct Sums of Vector Spaces
2.1.2.11 Direct Sum Decomposition of a Vector Space
2.1.2.12 Direct Products of Vector Spaces and Dimension Reduction
2.1.3 Basis Sets for Vector Spaces
2.1.3.1 Properties of Basis Sets of Vector Subspaces
2.1.4 Inner Products
2.1.5 Norms
2.1.5.1 Convexity
2.1.5.2 Norms Induced by Inner Products
2.1.5.3 Lₚ Norms
2.1.5.4 Basis Norms
2.1.5.5 Equivalence of Norms
2.1.6 Normalized Vectors
2.1.6.1 "Inverse" of a Vector
2.1.7 Metrics and Distances
2.1.7.1 Metrics Induced by Norms
2.1.7.2 Convergence of Sequences of Vectors
2.1.8 Orthogonal Vectors and Orthogonal Vector Spaces
2.1.9 The "One Vector"
2.1.9.1 The Mean and the Mean Vector
2.2 Cartesian Coordinates and Geometrical Properties of Vectors
2.2.1 Cartesian Geometry
2.2.2 Projections
2.2.3 Angles Between Vectors
2.2.4 Orthogonalization Transformations: Gram-Schmidt
2.2.5 Orthonormal Basis Sets
2.2.6 Approximation of Vectors
2.2.6.1 Optimality of the Fourier Coefficients
2.2.6.2 Choice of the Best Basis Subset
2.2.7 Flats, Affine Spaces, and Hyperplanes
2.2.8 Cones
2.2.8.1 Convex Cones
2.2.8.2 Dual Cones
2.2.8.3 Polar Cones
2.2.8.4 Additional Properties
2.2.9 Cross Products in ℝ³
2.3 Centered Vectors and Variances and Covariances of Vectors
2.3.1 The Mean and Centered Vectors
2.3.2 The Standard Deviation, the Variance, and Scaled Vectors
2.3.3 Covariances and Correlations Between Vectors
Exercises
3 Basic Properties of Matrices
3.1 Basic Definitions and Notation
3.1.1 Multiplication of a Matrix by a Scalar
3.1.2 Diagonal Elements: diag(·) and vecdiag(·)
3.1.3 Diagonal, Hollow, and Diagonally Dominant Matrices
3.1.4 Matrices with Special Patterns of Zeroes
3.1.5 Matrix Shaping Operators
3.1.5.1 Transpose
3.1.5.2 Diagonal Matrices and Diagonal Vectors: diag(·) (Again)
3.1.5.3 Forming a Vector from the Elements of a Matrix: vec(·) and vech(·)
3.1.6 Partitioned Matrices
3.1.6.1 Transposes of Partitioned Matrices
3.1.7 Matrix Addition
3.1.7.1 The Transpose of the Sum of Matrices
3.1.7.2 Rank Ordering Matrices
3.1.7.3 Vector Spaces of Matrices
3.1.8 Scalar-Valued Operators on Square Matrices: The Trace
3.1.8.1 The Trace: tr(·)
3.1.8.2 The Trace of the Transpose of Square Matrices
3.1.8.3 The Trace of Scalar Products of Square Matrices
3.1.8.4 The Trace of Partitioned Square Matrices
3.1.8.5 The Trace of the Sum of Square Matrices
3.1.9 Scalar-Valued Operators on Square Matrices: The Determinant
3.1.9.1 The Determinant: det(·)
3.1.9.2 Notation and Simple Properties of the Determinant
3.1.9.3 Minors, Cofactors, and Adjugate Matrices
3.1.9.4 The Determinant of the Transpose of Square Matrices
3.1.9.5 The Determinant of Scalar Products of Square Matrices
3.1.9.6 The Determinant of an Upper (or Lower) Triangular Matrix
3.1.9.7 The Determinant of Certain Partitioned Square Matrices
3.1.9.8 The Determinant of the Sum of Square Matrices
3.1.9.9 A Diagonal Expansion of the Determinant
3.1.9.10 Computing the Determinant
3.1.9.11 A Geometrical Perspective of the Determinant
3.2 Multiplication of Matrices and Multiplication of Vectors and Matrices
3.2.1 Matrix Multiplication (Cayley)
3.2.1.1 Powers of Square Matrices
3.2.1.2 Nilpotent Matrices
3.2.1.3 Matrix Polynomials
3.2.2 Multiplication of Matrices with Special Patterns
3.2.2.1 Multiplication of Partitioned Matrices
3.2.3 Elementary Operations on Matrices
3.2.3.1 Interchange of Rows or Columns: Permutation Matrices
3.2.3.2 The Vec-Permutation Matrix
3.2.3.3 Scalar Row or Column Multiplication
3.2.3.4 Axpy Row or Column Transformations
3.2.3.5 Elementary Operator Matrices: Summary of Notation and Properties
3.2.3.6 Determinants of Elementary Operator Matrices
3.2.4 The Trace of a Cayley Product That Is Square
3.2.5 The Determinant of a Cayley Product of Square Matrices
3.2.6 Multiplication of Matrices and Vectors
3.2.6.1 The Matrix/Vector Product as a Linear Combination
3.2.6.2 The Matrix as a Mapping on Vector Spaces
3.2.7 Outer Products
3.2.8 Bilinear and Quadratic Forms: Definiteness
3.2.8.1 Nonnegative Definite and Positive Definite Matrices
3.2.8.2 Ordinal Relations among Symmetric Matrices
3.2.8.3 The Trace of Inner and Outer Products
3.2.9 Anisometric Spaces
3.2.9.1 Conjugacy
3.2.10 Other Kinds of Matrix Multiplication
3.2.10.1 The Hadamard Product
3.2.10.2 The Kronecker Product
3.2.10.3 The Inner Product of Matrices
3.2.10.4 Orthonormal Matrices
3.2.10.5 Orthonormal Basis: Fourier Expansion
3.3 Matrix Rank and the Inverse of a Matrix
3.3.1 Row Rank and Column Rank
3.3.2 Full Rank Matrices
3.3.3 Rank of Elementary Operator Matrices and Matrix Products Involving Them
3.3.4 The Rank of Partitioned Matrices, Products of Matrices, and Sums of Matrices
3.3.4.1 Rank of Partitioned Matrices and Submatrices
3.3.4.2 An Upper Bound on the Rank of Products of Matrices
3.3.4.3 An Upper and a Lower Bound on the Rank of Sums of Matrices
3.3.5 Full Rank Partitioning
3.3.6 Full Rank Matrices and Matrix Inverses
3.3.6.1 Solutions of Linear Equations
3.3.6.2 Consistent Systems
3.3.6.3 Matrix Inverses
3.3.6.4 Nonsquare Full Rank Matrices: Right and Left Inverses
3.3.7 Full Rank Factorization
3.3.8 Equivalent Matrices
3.3.8.1 Equivalent Canonical Forms
3.3.8.2 Products with a Nonsingular Matrix
3.3.8.3 A Factorization Based on an Equivalent Canonical Form
3.3.8.4 Equivalent Forms of Symmetric Matrices
3.3.9 Multiplication by Full Rank Matrices
3.3.9.1 Products with a General Full Rank Matrix
3.3.9.2 Preservation of Positive Definiteness
3.3.9.3 The General Linear Group
3.3.10 Gramian Matrices: Products of the Form AᵀA
3.3.10.1 General Properties of Gramian Matrices
3.3.10.2 Rank of AᵀA
3.3.10.3 Zero Matrices and Equations Involving Gramians
3.3.11 A Lower Bound on the Rank of a Matrix Product
3.3.12 Determinants of Inverses
3.3.13 Inverses of Products and Sums of Nonsingular Matrices
3.3.13.1 Inverses of Cayley Products of Matrices
3.3.13.2 Inverses of Kronecker Products of Matrices
3.3.13.3 Inverses of Sums of Matrices and Their Inverses
3.3.13.4 An Expansion of a Matrix Inverse
3.3.14 Inverses of Matrices with Special Forms
3.3.15 Determining the Rank of a Matrix
3.4 More on Partitioned Square Matrices: The Schur Complement
3.4.1 Inverses of Partitioned Matrices
3.4.2 Determinants of Partitioned Matrices
3.5 Linear Systems of Equations
3.5.1 Solutions of Linear Systems
3.5.1.1 Underdetermined Systems
3.5.1.2 Overdetermined Systems
3.5.1.3 Generalized Inverses
3.5.1.4 Multiple Solutions in Consistent Systems
3.5.2 Null Space: The Orthogonal Complement
3.6 Generalized Inverses
3.6.1 Immediate Properties of Generalized Inverses
3.6.2 Special Generalized Inverses: The Moore-Penrose Inverse
3.6.2.1 Definitions and Terminology
3.6.2.2 Existence
3.6.2.3 Uniqueness
3.6.2.4 Other Properties
3.6.2.5 Drazin Inverses
3.6.3 Generalized Inverses of Products and Sums of Matrices
3.6.4 Generalized Inverses of Partitioned Matrices
3.7 Orthogonality
3.7.1 Orthogonal Matrices: Definition and Simple Properties
3.7.2 Orthogonal and Orthonormal Columns
3.7.3 The Orthogonal Group
3.7.4 Conjugacy
3.8 Eigenanalysis: Canonical Factorizations
3.8.1 Eigenvalues and Eigenvectors Are Remarkable
3.8.2 Left Eigenvectors
3.8.3 Basic Properties of Eigenvalues and Eigenvectors
3.8.3.1 Eigenvalues of Elementary Operator Matrices
3.8.4 The Characteristic Polynomial
3.8.4.1 Additional Properties of Eigenvalues and Eigenvectors
3.8.4.2 Eigenvalues and the Trace and Determinant
3.8.5 The Spectrum
3.8.5.1 Notation
3.8.5.2 The Spectral Radius
3.8.5.3 Linear Independence of Eigenvectors Associated with Distinct Eigenvalues
3.8.5.4 The Eigenspace and Geometric Multiplicity
3.8.5.5 Algebraic Multiplicity
3.8.5.6 Gershgorin Disks
3.8.6 Similarity Transformations
3.8.6.1 Orthogonally and Unitarily Similar Transformations
3.8.6.2 Uses of Similarity Transformations
3.8.7 Schur Factorization
3.8.8 Similar Canonical Factorization: Diagonalizable Matrices
3.8.8.1 Symmetric Matrices
3.8.8.2 A Defective Matrix
3.8.8.3 The Jordan Decomposition
3.8.9 Properties of Diagonalizable Matrices
3.8.9.1 Matrix Functions
3.8.10 Eigenanalysis of Symmetric Matrices
3.8.10.1 Orthogonality of Eigenvectors: Orthogonal Diagonalization
3.8.10.2 Spectral Decomposition
3.8.10.3 Kronecker Products of Symmetric Matrices: Orthogonal Diagonalization
3.8.10.4 Quadratic Forms and the Rayleigh Quotient
3.8.10.5 The Fourier Expansion
3.8.10.6 Powers of a Symmetric Matrix
3.8.10.7 The Trace and Sums of Eigenvalues
3.8.11 Positive Definite and Nonnegative Definite Matrices
3.8.11.1 Eigenvalues of Positive and Nonnegative Definite Matrices
3.8.11.2 Inverse of Positive Definite Matrices
3.8.11.3 Diagonalization of Positive Definite Matrices
3.8.11.4 Square Roots of Positive and Nonnegative Definite Matrices
3.8.12 Generalized Eigenvalues and Eigenvectors
3.8.12.1 Matrix Pencils
3.8.13 Singular Values and the Singular Value Decomposition (SVD)
3.8.13.1 The Fourier Expansion in Terms of the Singular Value Decomposition
3.9 Matrix Norms
3.9.1 Matrix Norms Induced from Vector Norms
3.9.1.1 Lₚ Matrix Norms
3.9.1.2 L₁, L₂, and L∞ Norms of Symmetric Matrices
3.9.2 The Frobenius Norm—The "Usual" Norm
3.9.2.1 The Frobenius Norm and the Singular Values
3.9.3 Other Matrix Norms
3.9.4 Matrix Norm Inequalities
3.9.5 The Spectral Radius
3.9.6 Convergence of a Matrix Power Series
3.9.6.1 Conditions for Convergence of a Sequence of Powers to 0
3.9.6.2 Another Perspective on the Spectral Radius: Relation to Norms
3.9.6.3 Convergence of a Power Series: Inverse of I − A
3.9.6.4 Nilpotent Matrices
3.10 Approximation of Matrices
3.10.1 Measures of the Difference Between Two Matrices
3.10.2 Best Approximation with a Matrix of Given Rank
Exercises
4 Vector/Matrix Derivatives and Integrals
4.1 Functions of Vectors and Matrices
4.2 Basics of Differentiation
4.2.1 Continuity
4.2.2 Notation and Properties
4.2.3 Differentials
4.3 Types of Differentiation
4.3.1 Differentiation with Respect to a Scalar
4.3.1.1 Derivatives of Vectors with Respect to Scalars
4.3.1.2 Derivatives of Matrices with Respect to Scalars
4.3.1.3 Derivatives of Functions with Respect to Scalars
4.3.1.4 Higher-Order Derivatives with Respect to Scalars
4.3.2 Differentiation with Respect to a Vector
4.3.2.1 Derivatives of Scalars with Respect to Vectors: The Gradient
4.3.2.2 Derivatives of Vectors with Respect to Vectors: The Jacobian
4.3.2.3 Derivatives of Vectors with Respect to Vectors in ℝ³: The Divergence and the Curl
4.3.2.4 Derivatives of Matrices with Respect to Vectors
4.3.2.5 Higher-Order Derivatives with Respect to Vectors: The Hessian
4.3.2.6 Summary of Derivatives with Respect to Vectors
4.3.3 Differentiation with Respect to a Matrix
4.4 Optimization of Scalar-Valued Functions
4.4.1 Stationary Points of Functions
4.4.2 Newton's Method
4.4.2.1 Quasi-Newton Methods
4.4.3 Least Squares
4.4.3.1 Linear Least Squares
4.4.3.2 Nonlinear Least Squares: The Gauss-Newton Method
4.4.3.3 Levenberg-Marquardt Method
4.4.4 Maximum Likelihood
4.4.5 Optimization of Functions with Constraints
4.4.5.1 Equality-Constrained Linear Least Squares Problems
4.4.5.2 The Reduced Gradient and Reduced Hessian
4.4.5.3 Lagrange Multipliers
4.4.5.4 The Lagrangian
4.4.5.5 Another Example: The Rayleigh Quotient
4.4.5.6 Optimization of Functions with Inequality Constraints
4.4.5.7 Inequality-Constrained Linear Least Squares Problems
4.4.5.8 Nonlinear Least Squares as an Inequality-Constrained Problem
4.4.6 Optimization Without Differentiation
4.5 Integration and Expectation: Applications to Probability Distributions
4.5.1 Multidimensional Integrals and Integrals Involving Vectors and Matrices
4.5.1.1 Change of Variables: Jacobians
4.5.2 Integration Combined with Other Operations
4.5.3 Random Variables and Probability Distributions
4.5.3.1 Vector Random Variables
4.5.3.2 The Multivariate Normal Distribution
4.5.3.3 Matrix Random Variables
Exercises
5 Matrix Transformations and Factorizations
5.1 Factorizations
5.2 Computational Methods: Direct and Iterative
5.3 Linear Geometric Transformations
5.3.1 Invariance Properties of Linear Transformations
5.3.2 Transformations by Orthogonal Matrices
5.3.3 Rotations
5.3.4 Reflections
5.3.5 Translations: Homogeneous Coordinates
5.4 Householder Transformations (Reflections)
5.4.1 Zeroing All Elements But One in a Vector
5.4.2 Computational Considerations
5.5 Givens Transformations (Rotations)
5.5.1 Zeroing One Element in a Vector
5.5.2 Givens Rotations That Preserve Symmetry
5.5.3 Givens Rotations to Transform to Other Values
5.5.4 Fast Givens Rotations
5.6 Factorization of Matrices
5.7 LU and LDU Factorizations
5.7.1 Properties: Existence
5.7.2 Pivoting
5.7.3 Use of Inner Products
5.7.4 Properties: Uniqueness
5.7.5 Properties of the LDU Factorization of a Square Matrix
5.8 QR Factorization
5.8.1 Related Matrix Factorizations
5.8.2 Matrices of Full Column Rank
5.8.3 Relation to the Moore-Penrose Inverse for Matrices of Full Column Rank
5.8.4 Nonfull Rank Matrices
5.8.5 Relation to the Moore-Penrose Inverse
5.8.6 Determining the Rank of a Matrix
5.8.7 Formation of the QR Factorization
5.8.8 Householder Reflections to Form the QR Factorization
5.8.9 Givens Rotations to Form the QR Factorization
5.8.10 Gram-Schmidt Transformations to Form the QR Factorization
5.9 Factorizations of Nonnegative Definite Matrices
5.9.1 Square Roots
5.9.2 Cholesky Factorization
5.9.2.1 Cholesky Decomposition of Singular Nonnegative Definite Matrices
5.9.2.2 Relations to Other Factorizations
5.9.3 Factorizations of a Gramian Matrix
5.10 Approximate Matrix Factorization
5.10.1 Nonnegative Matrix Factorization
5.10.2 Incomplete Factorizations
Exercises
6 Solution of Linear Systems
6.1 Condition of Matrices
6.1.1 Condition Number
6.1.2 Improving the Condition Number
6.1.2.1 Ridge Regression and the Condition Number
6.1.3 Numerical Accuracy
6.2 Direct Methods for Consistent Systems
6.2.1 Gaussian Elimination and Matrix Factorizations
6.2.1.1 Pivoting
6.2.1.2 Nonfull Rank and Nonsquare Systems
6.2.2 Choice of Direct Method
6.3 Iterative Methods for Consistent Systems
6.3.1 The Gauss-Seidel Method with Successive Overrelaxation
6.3.1.1 Convergence of the Gauss-Seidel Method
6.3.1.2 Successive Overrelaxation
6.3.2 Conjugate Gradient Methods for Symmetric Positive Definite Systems
6.3.2.1 The Conjugate Gradient Method
6.3.2.2 Krylov Methods
6.3.2.3 GMRES Methods
6.3.2.4 Preconditioning
6.3.3 Multigrid Methods
6.4 Iterative Refinement
6.5 Updating a Solution to a Consistent System
6.6 Overdetermined Systems: Least Squares
6.6.0.1 Accounting for an Intercept
6.6.1 Least Squares Solution of an Overdetermined System
6.6.1.1 Orthogonality of Least Squares Residuals to span(X)
6.6.1.2 Numerical Accuracy in Overdetermined Systems
6.6.2 Least Squares with a Full Rank Coefficient Matrix
6.6.3 Least Squares with a Coefficient Matrix Not of Full Rank
6.6.3.1 An Optimal Property of the Solution Using the Moore-Penrose Inverse
6.6.4 Weighted Least Squares
6.6.5 Updating a Least Squares Solution of an Overdetermined System
6.7 Other Solutions of Overdetermined Systems
6.7.1 Solutions that Minimize Other Norms of the Residuals
6.7.1.1 Minimum L₁ Norm Fitting: Least Absolute Values
6.7.1.2 Minimum L∞ Norm Fitting: Minimax
6.7.1.3 Lₚ Norms and Iteratively Reweighted Least Squares
6.7.2 Regularized Solutions
6.7.3 Minimizing Orthogonal Distances
Exercises
7 Evaluation of Eigenvalues and Eigenvectors
7.1 General Computational Methods
7.1.1 Numerical Condition of an Eigenvalue Problem
7.1.2 Eigenvalues from Eigenvectors and Vice Versa
7.1.3 Deflation
7.1.3.1 Deflation of Symmetric Matrices
7.1.4 Preconditioning
7.1.5 Shifting
7.2 Power Method
7.2.1 Inverse Power Method
7.3 Jacobi Method
7.4 QR Method
7.5 Krylov Methods
7.6 Generalized Eigenvalues
7.7 Singular Value Decomposition
Exercises
Part II Applications in Data Analysis
8 Special Matrices and Operations Useful in Modeling and Data Analysis
8.1 Data Matrices and Association Matrices
8.1.1 Flat Files
8.1.2 Graphs and Other Data Structures
8.1.2.1 Adjacency Matrix: Connectivity Matrix
8.1.2.2 Digraphs
8.1.2.3 Connectivity of Digraphs
8.1.2.4 Irreducible Matrices
8.1.2.5 Strong Connectivity of Digraphs and Irreducibility of Matrices
8.1.3 Term-by-Document Matrices
8.1.4 Probability Distribution Models
8.1.5 Derived Association Matrices
8.2 Symmetric Matrices and Other Unitarily Diagonalizable Matrices
8.2.1 Some Important Properties of Symmetric Matrices
8.2.2 Approximation of Symmetric Matrices and an Important Inequality
8.2.3 Normal Matrices
8.3 Nonnegative Definite Matrices: Cholesky Factorization
8.3.1 Eigenvalues of Nonnegative Definite Matrices
8.3.2 The Square Root and the Cholesky Factorization
8.3.3 The Convex Cone of Nonnegative Definite Matrices
8.4 Positive Definite Matrices
8.4.1 Leading Principal Submatrices of Positive Definite Matrices
8.4.2 The Convex Cone of Positive Definite Matrices
8.4.3 Inequalities Involving Positive Definite Matrices
8.5 Idempotent and Projection Matrices
8.5.1 Idempotent Matrices
8.5.1.1 Symmetric Idempotent Matrices
8.5.1.2 Cochran's Theorem
8.5.2 Projection Matrices: Symmetric Idempotent Matrices
8.5.2.1 Projections onto Linear Combinations of Vectors
8.6 Special Matrices Occurring in Data Analysis
8.6.1 Gramian Matrices
8.6.1.1 Sums of Squares and Cross Products
8.6.1.2 Some Immediate Properties of Gramian Matrices
8.6.1.3 Generalized Inverses of Gramian Matrices
8.6.1.4 Eigenvalues of Gramian Matrices
8.6.2 Projection and Smoothing Matrices
8.6.2.1 A Projection Matrix Formed from a Gramian Matrix
8.6.2.2 Smoothing Matrices
8.6.2.3 Effective Degrees of Freedom
8.6.2.4 Residuals from Smoothed Data
8.6.3 Centered Matrices and Variance-Covariance Matrices
8.6.3.1 Centering and Scaling of Data Matrices
8.6.3.2 Gramian Matrices Formed from Centered Matrices: Covariance Matrices
8.6.3.3 Gramian Matrices Formed from Scaled Centered Matrices: Correlation Matrices
8.6.4 The Generalized Variance
8.6.4.1 Comparing Variance-Covariance Matrices
8.6.5 Similarity Matrices
8.6.6 Dissimilarity Matrices
8.7 Nonnegative and Positive Matrices
8.7.1 The Convex Cones of Nonnegative and Positive Matrices
8.7.2 Properties of Square Positive Matrices
8.7.3 Irreducible Square Nonnegative Matrices
8.7.3.1 Properties of Square Irreducible Nonnegative Matrices: The Perron-Frobenius Theorem
8.7.3.2 Primitive Matrices
8.7.3.3 Limiting Behavior of Primitive Matrices
8.7.4 Stochastic Matrices
8.7.5 Leslie Matrices
8.8 Other Matrices with Special Structures
8.8.1 Helmert Matrices
8.8.2 Vandermonde Matrices
8.8.3 Hadamard Matrices and Orthogonal Arrays
8.8.4 Toeplitz Matrices
8.8.4.1 Inverses of Certain Toeplitz Matrices and Other Banded Matrices
8.8.5 Circulant Matrices
8.8.6 Fourier Matrices and the Discrete Fourier Transform
8.8.6.1 Fourier Matrices and Elementary Circulant Matrices
8.8.6.2 The Discrete Fourier Transform
8.8.7 Hankel Matrices
8.8.8 Cauchy Matrices
8.8.9 Matrices Useful in Graph Theory
8.8.9.1 Adjacency Matrix: Connectivity Matrix
8.8.9.2 Digraphs
8.8.9.3 Use of the Connectivity Matrix
8.8.9.4 The Laplacian Matrix of a Graph
8.8.10 Z-Matrices and M-Matrices
Exercises
9 Selected Applications in Statistics
9.1 Structure in Data and Statistical Data Analysis
9.2 Multivariate Probability Distributions
9.2.1 Basic Definitions and Properties
9.2.2 The Multivariate Normal Distribution
9.2.3 Derived Distributions and Cochran's Theorem
9.3 Linear Models
9.3.1 Fitting the Model
9.3.1.1 Statistical Estimation
9.3.1.2 Ordinary Least Squares
9.3.1.3 Weighted Least Squares
9.3.1.4 Variations on the Criteria for Fitting
9.3.1.5 Regularized Fits
9.3.1.6 Orthogonal Distances
9.3.1.7 Collinearity
9.3.2 Linear Models and Least Squares
9.3.2.1 Degrees of Freedom
9.3.2.2 The Hat Matrix and Leverage
9.3.3 Statistical Inference
9.3.3.1 Estimability
9.3.3.2 Testability
9.3.3.3 The Gauss-Markov Theorem
9.3.4 The Normal Equations and the Sweep Operator
9.3.5 Linear Least Squares Subject to Linear Equality Constraints
9.3.6 Weighted Least Squares
9.3.7 Updating Linear Regression Statistics
9.3.7.1 Adding More Variables
9.3.7.2 Adding More Observations
9.3.7.3 Adding More Observations Using Weights
9.3.8 Linear Smoothing
9.3.9 Multivariate Linear Models
9.3.9.1 Fitting the Model
9.3.9.2 Partitioning the Sum of Squares
9.3.9.3 Statistical Inference
9.4 Principal Components
9.4.1 Principal Components of a Random Vector
9.4.2 Principal Components of Data
9.4.2.1 Principal Components Directly from the Data Matrix
9.4.2.2 Dimension Reduction
9.5 Condition of Models and Data
9.5.1 Ill-Conditioning in Statistical Applications
9.5.2 Variable Selection
9.5.3 Principal Components Regression
9.5.4 Shrinkage Estimation
9.5.4.1 Ridge Regression
9.5.4.2 Lasso Regression
9.5.5 Statistical Inference about the Rank of a Matrix
9.5.5.1 Numerical Approximation and Statistical Inference
9.5.5.2 Statistical Tests of the Rank of a Class of Matrices
9.5.5.3 Statistical Tests of the Rank Based on an LDU Factorization
9.5.6 Incomplete Data
9.6 Optimal Design
9.6.1 D-Optimal Designs
9.7 Multivariate Random Number Generation
9.7.1 The Multivariate Normal Distribution
9.7.2 Random Correlation Matrices
9.8 Stochastic Processes
9.8.1 Markov Chains
9.8.1.1 Properties of Markov Chains
9.8.1.2 Limiting Behavior of Markov Chains
9.8.2 Markovian Population Models
9.8.3 Autoregressive Processes
9.8.3.1 Relation of the Autocorrelations to the Autoregressive Coefficients
Exercises
Part III Numerical Methods and Software
10 Numerical Methods
10.1 Digital Representation of Numeric Data
10.1.1 The Fixed-Point Number System
10.1.1.1 Software Representation and Big Integers
10.1.2 The Floating-Point Model for Real Numbers
10.1.2.1 The Parameters of the Floating-Point Representation
10.1.2.2 Standardization of Floating-Point Representation
10.1.2.3 Special Floating-Point Numbers
10.1.3 Language Constructs for Representing Numeric Data
10.1.3.1 C
10.1.3.2 Fortran
10.1.3.3 Python
10.1.3.4 Determining the Numerical Characteristics of a Particular Computer
10.1.4 Other Variations in the Representation of Data; Portability of Data
10.2 Computer Operations on Numeric Data
10.2.1 Fixed-Point Operations
10.2.2 Floating-Point Operations
10.2.2.1 Errors
10.2.2.2 Guard Digits and Chained Operations
10.2.2.3 Addition of Several Numbers
10.2.2.4 Compensated Summation
10.2.2.5 Catastrophic Cancellation
10.2.2.6 Standards for Floating-Point Operations
10.2.2.7 Operations Involving Special Floating-Point Numbers
10.2.2.8 Comparison of Real Numbers and Floating-Point Numbers
10.2.3 Language Constructs for Operations on Numeric Data
10.2.4 Software Methods for Extending the Precision
10.2.4.1 Multiple Precision
10.2.4.2 Rational Fractions
10.2.4.3 Interval Arithmetic
10.2.5 Exact Computations
10.2.5.1 Exact Dot Product (EDP)
10.3 Numerical Algorithms and Analysis
10.3.1 Algorithms and Programs
10.3.2 Error in Numerical Computations
10.3.2.1 Measures of Error and Bounds for Errors
10.3.2.2 Error of Approximation
10.3.2.3 Algorithms and Data
10.3.2.4 Condition of Data
10.3.2.5 Robustness of Algorithms
10.3.2.6 Stability of Algorithms
10.3.2.7 Reducing the Error in Numerical Computations
10.3.3 Efficiency
10.3.3.1 Measuring Efficiency
10.3.3.2 Improving Efficiency
10.3.3.3 Scalability
10.3.3.4 Bottlenecks and Limits
10.3.3.5 High-Performance Computing
10.3.3.6 Computations in Parallel
10.3.4 Iterations and Convergence
10.3.4.1 Extrapolation
10.3.5 Other Computational Techniques
10.3.5.1 Recursion
10.3.5.2 Computations Without Storing Data
10.3.5.3 MapReduce
Exercises
11 Numerical Linear Algebra
11.1 Computer Storage of Vectors and Matrices
11.1.1 Storage Modes
11.1.2 Strides
11.1.3 Sparsity
11.2 General Computational Considerations for Vectors and Matrices
11.2.1 Relative Magnitudes of Operands
11.2.1.1 Condition
11.2.1.2 Pivoting
11.2.1.3 "Modified" and "Classical" Gram-Schmidt Transformations
11.2.2 Iterative Methods
11.2.2.1 Preconditioning
11.2.2.2 Restarting and Rescaling
11.2.2.3 Preservation of Sparsity
11.2.2.4 Iterative Refinement
11.2.3 Assessing Computational Errors
11.2.3.1 Errors in Vectors and Matrices
11.2.3.2 Assessing Errors in Given Computations
11.3 Multiplication of Vectors and Matrices
11.3.1 Strassen's Algorithm
11.3.2 Matrix Multiplication Using MapReduce
11.4 Other Matrix Computations
11.4.1 Rank Determination
11.4.2 Computing the Determinant
11.4.3 Computing the Condition Number
Exercises
12 Software for Numerical Linear Algebra
12.1 General Considerations
12.1.1 Software Development and Open Source Software
12.1.2 Collaborative Research and Version Control
12.1.3 Finding Software
12.1.4 Software Design
12.1.4.1 Interoperability
12.1.4.2 Efficiency
12.1.4.3 Writing Mathematics and Writing Programs
12.1.4.4 Numerical Mathematical Objects and Computer Objects
12.1.4.5 Other Mathematical Objects and Computer Objects
12.1.4.6 Software for Statistical Applications
12.1.4.7 Robustness
12.1.4.8 Computing Paradigms
12.1.4.9 Array Structures and Indexes
12.1.4.10 Matrix Storage Modes
12.1.5 Software Development, Maintenance, and Testing
12.1.5.1 Test Data
12.1.5.2 Assessing the Accuracy of a Computed Result
12.1.5.3 Software Reviews
12.1.6 Reproducible Research
12.2 Software Libraries
12.2.1 BLAS
12.2.2 Level 2 and Level 3 BLAS, LAPACK, and Related Libraries
12.2.3 Libraries for High Performance Computing
12.2.3.1 Libraries for Parallel Processing
12.2.3.2 Clusters of Computers
12.2.3.3 Graphical Processing Units
12.2.4 The IMSL Libraries
12.2.4.1 Examples of Use of the IMSL Libraries
12.3 General Purpose Languages
12.3.1 Programming Considerations
12.3.1.1 Reverse Communication in Iterative Algorithms
12.3.1.2 Computational Efficiency
12.3.2 Modern Fortran
12.3.3 C and C++
12.3.4 Python
12.4 Interactive Systems for Array Manipulation
12.4.1 R
12.4.1.1 General Properties
12.4.1.2 Documentation
12.4.1.3 Basic Operations with Vectors and Matrices and for Subarrays
12.4.1.4 R Functions for Numerical Linear Algebra
12.4.1.5 Other Mathematical Objects and Operations
12.4.1.6 Extensibility
12.4.1.7 Packages
12.4.1.8 Sharable Libraries and Rcpp
12.4.1.9 Other Versions of R and Other Distributions
12.4.2 MATLAB and Octave
Exercises
Appendices
A Notation and Definitions
A.1 General Notation
A.2 Computer Number Systems
A.3 General Mathematical Functions and Operators
A.3.1 Special Functions
A.4 Linear Spaces and Matrices
A.4.1 Norms and Inner Products
A.4.2 Matrix Shaping Notation
A.4.3 Notation for Rows or Columns of Matrices
A.4.4 Notation Relating to Matrix Determinants
A.4.5 Matrix-Vector Differentiation
A.4.6 Special Vectors and Matrices
A.4.7 Elementary Operator Matrices
A.5 Models and Data
B Solutions and Hints for Selected Exercises
Bibliography
Index
Springer Texts in Statistics

James E. Gentle

Matrix Algebra
Theory, Computations and Applications in Statistics

Second Edition
Springer Texts in Statistics

Series Editors:
Richard DeVeaux
Stephen E. Fienberg
Ingram Olkin
More information about this series at http://www.springer.com/series/417
James E. Gentle
Fairfax, VA, USA

Springer Texts in Statistics
ISSN 1431-875X          ISSN 2197-4136 (electronic)
ISBN 978-3-319-64866-8          ISBN 978-3-319-64867-5 (eBook)
DOI 10.1007/978-3-319-64867-5
Library of Congress Control Number: 2017952371

1st edition: © Springer Science+Business Media, LLC 2007
2nd edition: © Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To María
Preface to the Second Edition

In this second edition, I have corrected all known typos and other errors; I have (it is hoped) clarified certain passages; I have added some additional material; and I have enhanced the Index.

I have added a few more comments about vectors and matrices with complex elements, although, as before, unless stated otherwise, all vectors and matrices in this book are assumed to have real elements. I have begun to use "det(A)" rather than "|A|" to represent the determinant of A, except in a few cases. I have also expressed some derivatives as the transposes of the expressions I used formerly.

I have put more conscious emphasis on "user-friendliness" in this edition. In a book, user-friendliness is primarily a function of references, both internal and external, and of the index. As an old software designer, I've always thought that user-friendliness is very important. To the extent that internal references were present in the first edition, the positive feedback I received from users of that edition about the friendliness of those internal references ("I liked the fact that you said 'equation (x.xx) on page yy,' instead of just 'equation (x.xx)'") encouraged me to try to make the internal references even more useful. It's only when you're "eating your own dog food" that you become aware of where details matter, and in using the first edition, I realized that the choice of entries in the Index was suboptimal. I have spent significant time in organizing it, and I hope that the user will find the Index to this edition to be very useful. I think that it has been vastly improved over the Index in the first edition.

The overall organization of chapters has been preserved, but some sections have been changed. The two chapters that have been changed most are Chaps. 3 and 12. Chapter 3, on the basics of matrices, got about 30 pages longer. It is by far the longest chapter in the book, but I just didn't see any reasonable way to break it up. In Chap. 12 of the first edition, "Software for Numerical Linear Algebra," I discussed four software systems or languages, C/C++, Fortran, Matlab, and R, and did not express any preference for one over another. In this edition, although I occasionally mention various languages and systems, I now limit most of my discussion to Fortran and R. There are many reasons for my preference for these two systems. R is oriented toward statistical applications. It is open source and freely distributed. As for Fortran versus C/C++, Python, or other programming languages, I agree with the statement by Hanson and Hopkins (2013, page ix), "... Fortran is currently the best computer language for numerical software." Many people, however, still think of Fortran as the language their elders (or they themselves) used in the 1970s. (On a personal note, Richard Hanson, who passed away recently, was a member of my team that designed the IMSL C Libraries in the mid-1980s. Not only was C much cooler than Fortran at the time, but the ANSI committee working on updating the Fortran language was so fractured by competing interests that approval of the revision was repeatedly delayed. Many numerical analysts who were not concerned with coolness turned to C because it provided dynamic storage allocation and it allowed flexible argument lists, and the Fortran constructs could not be agreed upon.) Language preferences are personal, of course, and there is a strong "coolness factor" in choice of a language. Python is currently one of the coolest languages, but I personally don't like the language for most of the stuff I do.

Although this book has separate parts on applications in statistics and computational issues as before, statistical applications have informed the choices I made throughout the book, and computational considerations have given direction to most discussions.

I thank the readers of the first edition who informed me of errors. Two people in particular made several meaningful comments and suggestions. Clark Fitzgerald not only identified several typos, he made several broad suggestions about organization and coverage that resulted in an improved text (I think). Andreas Eckner found, in addition to typos, some gaps in my logic and also suggested better lines of reasoning at some places. (Although I don't follow an itemized "theorem-proof" format, I try to give reasons for any nonobvious statements I make.) I thank Clark and Andreas especially for their comments. Any remaining typos, omissions, gaps in logic, and so on are entirely my responsibility.

Again, I thank my wife, María, to whom this book is dedicated, for everything.

I used TeX via LaTeX2ε to write the book. I did all of the typing, programming, etc., myself, so all misteaks (mistakes!) are mine. I would appreciate receiving suggestions for improvement and notification of errors. Notes on this book, including errata, are available at

http://mason.gmu.edu/~jgentle/books/matbk/

Fairfax County, VA, USA
July 14, 2017

James E. Gentle