Matrix Algebra From a Statistician’s Perspective
David A. Harville
IBM T. J. Watson Research Center
Mathematical Sciences Department
Yorktown Heights, NY 10598-0218
USA
harville@us.ibm.com
ISBN 978-0-387-78356-7
e-ISBN 978-0-387-22677-4
Library of Congress Control Number: 2008927514
© 2008 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York,
NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use
in connection with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not they are
subject to proprietary rights.
Preface
Matrix algebra plays a very important role in statistics and in many other disci-
plines. In many areas of statistics, it has become routine to use matrix algebra in
the presentation and the derivation or verification of results. One such area is linear
statistical models; another is multivariate analysis. In these areas, a knowledge of
matrix algebra is needed in applying important concepts, as well as in studying the
underlying theory, and is even needed to use various software packages (if they
are to be used with confidence and competence).
On many occasions, I have taught graduate-level courses in linear statistical
models. Typically, the prerequisites for such courses include an introductory (un-
dergraduate) course in matrix (or linear) algebra. Also typically, the preparation
provided by this prerequisite course is not fully adequate. There are several rea-
sons for this. The level of abstraction or generality in the matrix (or linear) algebra
course may have been so high that it did not lead to a “working knowledge” of the
subject, or, at the other extreme, the course may have emphasized computations at
the expense of fundamental concepts. Further, the content of introductory courses
on matrix (or linear) algebra varies widely from institution to institution and from
instructor to instructor. Topics such as quadratic forms, partitioned matrices, and
generalized inverses that play an important role in the study of linear statistical
models may be covered inadequately if at all. An additional difficulty is that sev-
eral years may have elapsed between the completion of the prerequisite course
on matrix (or linear) algebra and the beginning of the course on linear statistical
models.
This book is about matrix algebra. A distinguishing feature is that the content,
the ordering of topics, and the level of generality are ones that I consider appro-
priate for someone with an interest in linear statistical models and perhaps also
for someone with an interest in another area of statistics or in a related discipline.
(The content of the paperback version is essentially the same as that of the earlier,
hardcover version. The paperback version differs from the earlier version in that a
number of mostly minor corrections and alterations have been incorporated. In addition,
the typography has been improved; as a side effect, the content and the numbering of the
individual pages differ somewhat from those in the earlier version.)
I have tried to keep the presentation at a level that is suitable for anyone who
has had an introductory course in matrix (or linear) algebra. In fact, the book is
essentially self-contained, and it is hoped that much, if not all, of the material may
be comprehensible to a determined reader with relatively little previous exposure
to matrix algebra. To make the material readable for as broad an audience as pos-
sible, I have avoided the use of abbreviations and acronyms and have sometimes
adopted terminology and notation that may seem more meaningful and familiar to
the non-mathematician than those favored by mathematicians. Proofs are provided
for essentially all of the results in the book. The book includes a number of results
and proofs that are not readily available from standard sources and many others
that can be found only in relatively high-level books or in journal articles.
The book can be used as a companion to the textbook in a course on linear
statistical models or on a related topic—it can be used to supplement whatever
results on matrices may be included in the textbook and as a source of proofs.
And, it can be used as a primary or supplementary text in a second course on
matrices, including a course designed to enhance the preparation of the students
for a course or courses on linear statistical models and/or related topics. Above all,
it can serve as a convenient reference book for statisticians and for various other
professionals.
While the motivation for the writing of the book came from the statistical ap-
plications of matrix algebra, the book itself does not include any appreciable dis-
cussion of statistical applications. It is assumed that the book is being read because
the reader is aware of the applications (or at least of the potential for applications)
or because the material is of intrinsic interest—this assumption is consistent with
the uses discussed in the previous paragraph. (In any case, I have found that the
discussions of applications that are sometimes interjected into treatises on matrix
algebra tend to be meaningful only to those who are already knowledgeable about
the applications and can be more of a distraction than a help.)
The book has a number of features that combine to set it apart from the more tra-
ditional books on matrix algebra—it also differs in significant respects from those
matrix-algebra books that share its (statistical) orientation, such as the books of
Searle (1982), Graybill (1983), and Basilevsky (1983). The coverage is restricted to
real matrices (i.e., matrices whose elements are real numbers)—complex matrices
(i.e., matrices whose elements are complex numbers) are typically not encountered
in statistical applications, and their exclusion leads to simplifications in terminol-
ogy, notation, and results. The coverage includes linear spaces, but only linear
spaces whose members are (real) matrices—the inclusion of linear spaces facili-
tates a deeper understanding of various matrix concepts (e.g., rank) that are very
relevant in applications to linear statistical models, while the restriction to linear
spaces whose members are matrices makes the presentation more appropriate for
the intended audience.
The book features an extensive discussion of generalized inverses and makes
heavy use of generalized inverses in the discussion of such standard topics as the
solution of linear systems and the rank of a matrix. The discussion of eigenvalues
and eigenvectors is deferred until the next-to-last chapter of the book—I have found
it unnecessary to use results on eigenvalues and eigenvectors in teaching a first
course on linear statistical models and, in any case, find it aesthetically displeasing
to use results on eigenvalues and eigenvectors to prove more elementary matrix
results. And the discussion of linear transformations is deferred until the very last
chapter—in more advanced presentations, matrices are regarded as subservient to
linear transformations.
The book provides rather extensive coverage of some nonstandard topics that
have important applications in statistics and in many other disciplines. These in-
clude matrix differentiation (Chapter 15), the vec and vech operators (Chapter 16),
the minimization of a second-degree polynomial (in n variables) subject to linear
constraints (Chapter 19), and the ranks, determinants, and ordinary and general-
ized inverses of partitioned matrices and of sums of matrices (Chapter 18 and parts
of Chapters 8, 9, 13, 16, 17, and 19). An attempt has been made to write the book
in such a way that the presentation is coherent and non-redundant but, at the same
time, is conducive to using the various parts of the book selectively.
With the obvious exception of certain of their parts, Chapters 12 through 22
(which comprise approximately three-quarters of the book’s pages) can be read in
arbitrary order. The ordering of Chapters 1 through 11 (both relative to each other
and relative to Chapters 12 through 22) is much more critical. Nevertheless, even
Chapters 1 through 11 include sections or subsections that are prerequisites for
only a small part of the subsequent material. More often than not, the less essential
sections or subsections are deferred until the end of the chapter or section.
The book does not address the computational aspects of matrix algebra in
any systematic way; however, it does include descriptions and discussions of certain
computational strategies and covers a number of results that can be useful in dealing
with computational issues. Matrix norms are discussed, but only to a limited extent.
In particular, the coverage of matrix norms is restricted to those norms that are
defined in terms of inner products.
In writing the book, I was influenced to a considerable extent by Halmos’s
(1958) book on finite-dimensional vector spaces, by Marsaglia and Styan’s (1974)
paper on ranks, by Henderson and Searle’s (1979, 1981b) papers on the vec and
vech operators, by Magnus and Neudecker’s (1988) book on matrix differential
calculus, and by Rao and Mitra’s (1971) book on generalized inverses. And I ben-
efited from conversations with Oscar Kempthorne and from reading some notes
(on linear systems, determinants, matrices, and quadratic forms) that he had pre-
pared for a course (on linear statistical models) at Iowa State University. I also
benefited from reading the first two chapters (pertaining to linear algebra) of notes
prepared by Justus F. Seely for a course (on linear statistical models) at Oregon
State University.
The book contains many numbered exercises. The exercises are located at (or
near) the ends of the chapters and are grouped by section—some exercises may
require the use of results covered in previous sections, chapters, or exercises. Many
of the exercises consist of verifying results supplementary to those included in the
body of the chapter. By breaking some of the more difficult exercises into parts