Aspects of Multivariate
Statistical Theory
ROBB J. MUIRHEAD
Senior Statistical Scientist
Pfizer Global Research and Development
New London, Connecticut
WILEY-INTERSCIENCE
A JOHN WILEY & SONS, INC., PUBLICATION
Copyright © 1982, 2005 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee
to the Copyright Clearance Center, Inc., 222 [...]
Disclaimer of Warranty: While the publisher and author have used their best
efforts in preparing this book, they make no representations or warranties with respect to the
accuracy or completeness of the contents of this book and specifically disclaim any implied
warranties of merchantability or fitness for a particular purpose. No warranty may be created or
extended by sales representatives or written sales materials. The advice and strategies contained
herein may not be suitable for your situation. You should consult with a professional where
appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other
commercial damages, including but not limited to special, incidental, consequential, or other
damages.
For general information on our other products and services or for technical support, please contact
our Customer Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993
or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print
may not be available in electronic format. For information about Wiley products, visit our web site at
www.wiley.com.
Library of Congress Cataloging-in-Publication is available.
ISBN-13 978-0-471-76985-9
ISBN-10 0-471-76985-1
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
To
Nan and Bob
and
Maria and Mick
Preface
This book has grown out of lectures given in first- and second-year graduate
courses at Yale University and the University of Michigan. It is designed as
a text for graduate level courses in multivariate statistical analysis, and I
hope that it may also prove to be useful as a reference book for research
workers interested in this area.
Any person writing a book in multivariate analysis owes a great debt to
T. W. Anderson for his 1958 text, An Introduction to Multivariate Statistical
Analysis, which has become a classic in the field. This book synthesized
various subareas for the first time in a broad overview of the subject and has
influenced the direction of recent and current research in theoretical multi-
variate analysis. It is also largely responsible for the popularity of many of
the multivariate techniques and procedures in common use today.
The current work builds on the foundation laid by Anderson in 1958 and
in large part is intended to describe some of the developments that have
taken place since then. One of the major developments has been the
introduction of zonal polynomials and hypergeometric functions of matrix
argument by A. T. James and A. G. Constantine. To a very large extent
these have made possible a unified study of the noncentral distributions that
arise in multivariate analysis under the standard assumptions of normal
sampling. This work is intended to provide an introduction to some of this
theory.
Most books of this nature reflect the author’s tastes and interests, and
this is no exception. The main focus of this work is on distribution theory,
both exact and asymptotic. Multivariate techniques depend heavily on
latent roots of random matrices; all of the important latent root distribu-
tions are introduced and approximations to them are discussed. In testing
problems the primary emphasis here is on likelihood ratio tests and the
distributions of likelihood ratio test statistics. The noncentral distributions
are needed to evaluate power functions. Of course, in the absence of “best”
tests simply computing power functions is of little interest; what is needed is
a comparison of powers of competing tests over a wide range of alternatives.
Wherever possible the results of such power studies in the literature are
discussed. It should be mentioned, however, that although the emphasis is
on likelihood ratio statistics, many of the techniques introduced here for
studying and approximating their distributions can be applied to other test
statistics as well.
A few words should be said about the material covered in the text.
Matrix theory is used extensively, and matrix factorizations are extremely
important. Most of the relevant material is reviewed in the Appendix, but
some results also appear in the text and as exercises. Chapter 1 introduces
the multivariate normal distribution and studies its properties, and also
provides an introduction to spherical and elliptical distributions. These form
an important class of non-normal distributions which have found increasing
use in robustness studies where the aim is to determine how sensitive
existing multivariate techniques are to multivariate normality assumptions.
In Chapter 2 many of the Jacobians of transformations used in the text are
derived, and a brief introduction to invariant measures via exterior differential
forms is given. A review of matrix Kronecker or direct products is also
included here. The reason this is given at this point rather than in the
Appendix is that very few of the students that I have had in multivariate
analysis courses have been familiar with this product, which is widely used
in later work. Chapter 3 deals with the Wishart and multivariate beta
distributions and their properties. Chapter 4, on decision-theoretic estima-
tion of the parameters of a multivariate normal distribution, is rather an
anomaly. I would have preferred to incorporate this topic in one of the
other chapters, but there seemed to be no natural place for it. The material
here is intended only as an introduction and certainly not as a review of the
current state of the art. Indeed, only admissibility (or rather, inadmissibility)
results are presented, and no mention is even made of Bayes procedures.
Chapter 5 deals with ordinary, multiple, and partial correlation coefficients.
An introduction to invariance theory and invariant tests is given in Chapter
6. It may be wondered why this topic is included here in view of the
coverage of the relevant basic material in the books by E. L. Lehmann,
Testing Statistical Hypotheses, and T. S. Ferguson, Mathematical Statistics:
A Decision Theoretic Approach. The answer is that most of the students that
have taken my multivariate analysis courses have been unfamiliar with
invariance arguments, although they usually meet them in subsequent
courses. For this reason I have long felt that an introduction to invariant
tests in a multivariate text would certainly not be out of place.
Chapter 7 is where this book departs most significantly from others on
multivariate statistical theory. Here the groundwork is laid for studying the
noncentral distribution theory needed in subsequent chapters, where the
emphasis is on testing problems in standard multivariate procedures. Zonal
polynomials and hypergeometric functions of matrix argument are intro-
duced, and many of their properties needed in later work are derived.
Chapter 8 examines properties, and central and noncentral distributions, of
likelihood ratio statistics used for testing standard hypotheses about covari-
ance matrices and mean vectors. An attempt is also made here to explain
what happens if these tests are used and the underlying distribution is
non-normal. Chapter 9 deals with the procedure known as principal compo-
nents, where much attention is focused on the latent roots of the sample
covariance matrix. Asymptotic distributions of these roots are obtained and
are used in various inference problems. Chapter 10 studies the multivariate
general linear model and the distribution of latent roots and functions of
them used for testing the general linear hypothesis. An introduction to
discriminant analysis is also included here, although the coverage is rather
brief. Finally, Chapter 11 deals with the problem of testing independence
between a number of sets of variables and also with canonical correlation
analysis.
The choice of the material covered is, of course, extremely subjective and
limited by space requirements. There are areas that have not been men-
tioned and not everyone will agree with my choices; I do believe, however,
that the topics included form the core of a reasonable course in classical
multivariate analysis. Areas which are not covered in the text include factor
analysis, multiple time series, multidimensional scaling, clustering, and
discrete multivariate analysis. These topics have grown so large that there
are now separate books devoted to each. The coverage of classification and
discriminant analysis also is not very extensive, and no mention is made of
Bayesian approaches; these topics have been treated in depth by Anderson
and by Kshirsagar, Multivariate Analysis, and Srivastava and Khatri, An
Introduction to Multivariate Statistics, and a person using the current work
as a text may wish to supplement it with material from these references.
This book has been planned as a text for a two-semester course in
multivariate statistical analysis. By an appropriate choice of topics it can
also be used in a one-semester course. One possibility is to cover Chapters 1,
2, 3, 5, and possibly 6, and those sections of Chapters 8, 9, 10 and 11 which
do not involve noncentral distributions and consequently do not utilize the
theory developed in Chapter 7. The book is designed so that for the most
part these sections can be easily identified and omitted if desired. Exercises
are provided at the end of each chapter. Many of these deal with points