PARAMETER ESTIMATION
AND INVERSE PROBLEMS
Second Edition
RICHARD C. ASTER
BRIAN BORCHERS
CLIFFORD H. THURBER
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Academic Press is an imprint of Elsevier
225 Wyman Street, Waltham, MA 02451, USA
The Boulevard, Langford Lane, Kidlington, Oxford, OX5 1GB, UK
© 2013 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical,
including photocopying, recording, or any information storage and retrieval system, without permission in writing from
the Publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our
arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be
found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as
may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
understanding, changes in research methods, professional practices, or medical treatment may become
necessary.
Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any
information, methods, compounds, or experiments described herein. In using such information or methods they should
be mindful of their own safety and the safety of others, including parties for whom they have a professional
responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for
any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any
use or operation of any methods, products, instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
Aster, Richard C.
Parameter estimation and inverse problems. – 2nd ed. / Richard C. Aster, Clifford H. Thurber.
p. cm.
Includes bibliographical references and index.
ISBN 978-0-12-385048-5 (hardback)
1. Parameter estimation. 2. Inverse problems (Differential equations) 3. Inversion (Geophysics)
4. Mathematical models. I. Thurber, Clifford H. II. Title.
QA276.8.A88 2012
515'.357–dc23
2011032004
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
For information on all Academic Press publications
visit our website at www.elsevierdirect.com
Typeset by: diacriTech, India
Printed in the United States of America
12 13 14 15
10 9 8 7 6 5 4 3 2 1
PREFACE
This textbook evolved from a course in geophysical inverse methods taught during the
past two decades at New Mexico Tech, first by Rick Aster and, subsequently, jointly
between Rick Aster and Brian Borchers. The audience for the course has included a
broad range of first- or second-year graduate students (and occasionally advanced under-
graduates) from geophysics, hydrology, mathematics, astrophysics, and other disciplines.
Cliff Thurber joined this collaboration during the production of the first edition and
has taught a similar course at the University of Wisconsin-Madison.
Our principal goal for this text is to promote fundamental understanding of param-
eter estimation and inverse problem philosophy and methodology, specifically regarding
such key issues as uncertainty, ill-posedness, regularization, bias, and resolution. We
emphasize theoretical points with illustrative examples, and MATLAB codes that imple-
ment these examples are provided on a companion website. Throughout the examples
and exercises, a web icon indicates that there is additional material on the website.
Exercises include a mix of applied and theoretical problems.
This book has necessarily had to distill a tremendous body of mathematics and
science going back to (at least) Newton and Gauss. We hope that it will continue to
find a broad audience of students and professionals interested in the general problem of
estimating physical models from data. Because this is an introductory text surveying a
very broad field, we have not been able to go into great depth. However, each chapter
has a “notes and further reading” section to help guide the reader to further explo-
ration of specific topics. Where appropriate, we have also directly referenced research
contributions to the field.
Some advanced topics have been deliberately left out of this book because of space
limitations and/or because we expect that many readers would not be sufficiently famil-
iar with the required mathematics. For example, readers with a strong mathematical
background may be surprised that we primarily consider inverse problems with discrete
data and discretized models. By doing this we avoid much of the technical complexity of
functional analysis. Some advanced applications and topics that we have omitted include
inverse scattering problems, seismic diffraction tomography, wavelets, data assimilation,
simulated annealing, and expectation maximization methods.
We expect that readers of this book will have prior familiarity with calculus, dif-
ferential equations, linear algebra, probability, and statistics at the undergraduate level.
In our experience, many students can benefit from at least a review of these topics, and
we commonly spend the first two to three weeks of the course reviewing material from
Appendices A, B, and C.
Chapters 1 through 4 form the heart of the book, and should be covered in sequence.
Chapters 5 through 8 are mostly independent of each other, but draw heavily on the
material in Chapters 1 through 4. As such, they may be covered in any order. Chapters 9
and 10 are independent of Chapters 5 through 8, but are most appropriately covered in
sequence. Chapter 11 is independent of the material in Chapters 5 through 10, and
provides an introduction to the Bayesian perspective on inverse problems and Bayesian
solution methods.
If significant time is allotted for review of linear algebra, vector calculus, probability,
and statistics in the appendices, there will probably not be time to cover the entire book
in one semester. However, it should be possible for teachers to cover substantial material
following Chapter 4.
We especially wish to acknowledge our own professors and mentors in this field,
including Kei Aki, Robert Parker, and Peter Shearer. We thank our many colleagues,
including many students in our courses, who provided sustained encouragement and
feedback during the initial drafting and subsequent revision of the book, particularly
Kent Anderson, James Beck, Aaron Masters, Elena Resmerita, Charlotte Rowe, Tyson
Strand, and Suzan van der Lee. Stuart Anderson, Greg Beroza, Ken Creager, Don
Clewett, Ken Dueker, Eliza Michalopoulou, Paul Segall, Anne Sheehan, and Kristy
Tiampo deserve special mention for their classroom testing of early and subsequent ver-
sions of this text and their helpful suggestions, and Jason Mattax deserves special mention
for his thorough proofreading of the second edition text. Robert Nowack, Gary Pavlis,
Randall Richardson, and Steve Roecker provided thorough and very helpful reviews
during the initial scoping. We offer special thanks to Per Christian Hansen of the Tech-
nical University of Denmark for his Regularization Tools, which we highly recommend
as an adjunct to this text, and which were an inspiration in writing the first edition. Valu-
able feedback that improved the second edition included that provided by Ken Dueker,
Anne Sheehan, Pamela Moyer, John Townend, Frederik Tilmann, and Kurt Feigl. Oleg
Makhnin cotaught this course with Rick Aster at New Mexico Tech in 2010 and pro-
vided significant contributions, particularly regarding material in Chapter 11, that have
been incorporated into this second edition. We also thank the editorial staff at Elsevier
over the years, especially Frank Cynar, Kyle Sarofeen, Jennifer Helé, and John Fedor
for essential advice and direction. Suzanne Borchers and Susan Delap provided valuable
proofreading and graphics expertise. Brian Borchers was a visiting fellow at the Institute
for Pure and Applied Mathematics (IPAM) at University of California-Los Angeles, and
Rick Aster was partially supported by the New Mexico Tech Geophysical Research
Center during preparation of this book. Finally, we express thanks for the boundless
support of our families during the many years that it has taken to complete this effort.
Rick Aster, Brian Borchers, and Cliff Thurber
June 2011
CHAPTER ONE
Introduction
Synopsis
General issues associated with parameter estimation and inverse problems are introduced
through the concepts of the forward problem and its inverse solution. Scaling and super-
position properties that characterize linear systems are given, and common situations
leading to linear and nonlinear mathematical models are discussed. Examples of discrete
and continuous linear and nonlinear parameter estimation problems to be revisited in
later chapters are shown. Mathematical demonstrations highlighting the key issues of
solution existence, uniqueness, and instability are presented and discussed.
1.1. CLASSIFICATION OF PARAMETER ESTIMATION AND INVERSE PROBLEMS
Scientists and engineers frequently wish to relate physical parameters characterizing a
model, m, to collected observations making up some set of data, d. We will commonly
assume that the fundamental physics are adequately understood, so a function, G, may
be specified relating m and d such that

G(m) = d.    (1.1)

In practice, d may be a function of time and/or space, or may be a collection of discrete
observations. An important issue is that actual observations always contain some
amount of noise. Two common ways that noise may arise are unmodeled influences
on instrument readings and numerical round-off. We can thus envision data as generally
consisting of noiseless observations from a "perfect" experiment, dtrue, plus a noise
component η,

d = G(mtrue) + η    (1.2)
  = dtrue + η,    (1.3)
where dtrue exactly satisfies (1.1) for m equal to the true model, mtrue, and we assume
that the forward modeling is exact. We will see that it is commonly mathematically
possible, although practically undesirable, to also fit all or part of η by (1.1). It may seem
remarkable, but it is often the case that a solution for m that is influenced by even a small
noise amplitude η can have little or no correspondence to mtrue. Another key issue that
may seem astounding at first is that commonly there are an infinite number of models
aside from mtrue which fit the perfect data, dtrue.
When m and d are functions, we typically refer to G as an operator. G will be called
a function when m and d are vectors. The operator G can take on many forms. In some
cases G is an ordinary differential equation (ODE) or partial differential equation (PDE).
In other cases, G is a linear or nonlinear system of algebraic equations.
Note that there is some inconsistency between mathematicians and other scientists in
modeling terminology. Applied mathematicians usually refer to G(m) = d as the “math-
ematical model” and to m as the “parameters.” On the other hand, scientists often refer
to G as the “forward operator” and to m as the “model.” We will adopt the scientific
parlance and refer to m as the "model" while referring to the equation G(m) = d as
the “mathematical model.”
The forward problem is to find d given m. Computing G(m) might involve solving
an ODE or PDE, evaluating an integral, or applying an algorithm for which there is no
explicit analytical formula for G(m). Our focus in this text is on the inverse problem
of finding m given d. A third problem, not addressed here, is the model identification
problem of determining G given examples of m and d.
In many cases, we will want to determine a finite number of parameters, n, to define
a model. The parameters may define a physical entity directly (e.g., density, voltage,
seismic velocity), or may be coefficients or other constants in a functional relationship
that describes a physical process. In this case, we can express the model parameters as an
n element vector m. Similarly, if there are a finite number of data points then we can
express the data as an m element vector d. (Note that the use of the integer m here for
the number of data points is easily distinguishable from the model m by its context.) Such
problems are called discrete inverse problems or parameter estimation problems.
A general parameter estimation problem can be written as a system of equations
G(m) = d.    (1.4)
In other cases, where the model and data are functions of continuous variables, such
as time or space, the associated task of estimating m from d is called a continuous
inverse problem. A central theme of this book is that continuous inverse problems can
often be well-approximated by discrete inverse problems.
We will generally refer to problems with small numbers of parameters as “parameter
estimation problems.” Problems with a larger number of parameters, and which will
often require the application of stabilizing constraints, will be referred to as “inverse
problems.” A key aspect of many inverse problems is that they are ill-conditioned in a
sense that will be discussed later in this chapter. In both parameter estimation and inverse
problems we solve for a set of parameters that characterize a model, and a key point of
this text is that the treatment of all such problems can be sufficiently generalized so
that the distinction is largely irrelevant. In practice, what is important is the distinction
between ill-conditioned and well-conditioned parameter estimation problems.
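A small numerical sketch (in Python rather than the book's MATLAB; the matrix and noise values here are illustrative, not taken from the text) shows what ill-conditioning means in practice: a tiny data perturbation produces a wildly different estimated model.

```python
import numpy as np

# A nearly singular forward operator: its rows are almost linearly
# dependent, so the condition number of G is large.
G = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
m_true = np.array([1.0, 1.0])
d_true = G @ m_true

# Perturb the data by an amount four orders of magnitude smaller than d.
d_noisy = d_true + np.array([0.0, 0.0001])

m_est = np.linalg.solve(G, d_noisy)
print(np.linalg.cond(G))  # large condition number, on the order of 1e4
print(m_est)              # nowhere near m_true despite the tiny perturbation
```

Here the estimate comes out as [0, 2] rather than [1, 1]: the noise-free problem and the perturbed problem have very different exact solutions, which is the hallmark of an ill-conditioned system.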
A type of mathematical model for which many useful results exist is the class of
linear systems. Linear systems obey superposition
G(m1 + m2) = G(m1) + G(m2)    (1.5)

and scaling

G(αm) = αG(m).    (1.6)
In the case of a discrete linear inverse problem, (1.4) can always be written in the form
of a linear system of algebraic equations (see Exercise 1.1).
G(m) = Gm = d.    (1.7)
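The superposition and scaling properties (1.5) and (1.6) can be checked numerically for any matrix operator; this short Python sketch (the random matrix and vectors are arbitrary, chosen only for illustration) confirms both hold up to floating-point rounding.

```python
import numpy as np

# Any matrix G defines a linear operator, so superposition (1.5) and
# scaling (1.6) must hold up to floating-point rounding error.
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 3))
m1 = rng.standard_normal(3)
m2 = rng.standard_normal(3)
alpha = 2.5

superposition_ok = np.allclose(G @ (m1 + m2), G @ m1 + G @ m2)
scaling_ok = np.allclose(G @ (alpha * m1), alpha * (G @ m1))
print(superposition_ok, scaling_ok)
```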
In a continuous linear inverse problem, G can often be expressed as a linear integral
operator, where (1.1) has the form
∫_a^b g(x, ξ) m(ξ) dξ = d(x)    (1.8)

and the function g(x, ξ) is called the kernel. The linearity of (1.8) is easily seen because

∫_a^b g(x, ξ)(m1(ξ) + m2(ξ)) dξ = ∫_a^b g(x, ξ) m1(ξ) dξ + ∫_a^b g(x, ξ) m2(ξ) dξ    (1.9)

and

∫_a^b g(x, ξ) α m(ξ) dξ = α ∫_a^b g(x, ξ) m(ξ) dξ.    (1.10)
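A continuous operator such as (1.8) can be approximated by a discrete system of the form (1.7) with a simple quadrature rule. The Python sketch below uses the midpoint rule; the kernel g(x, ξ) = exp(−|x − ξ|) and the trial model are made-up examples, not ones from the text.

```python
import numpy as np

# Midpoint-rule discretization of the IFK (1.8): the integral operator
# becomes an n-by-n matrix, giving a discrete system G m = d as in (1.7).
a, b, n = 0.0, 1.0, 100
xi = a + (np.arange(n) + 0.5) * (b - a) / n  # quadrature midpoints
x = xi.copy()                                # sample d(x) at the same points
dxi = (b - a) / n

# Kernel g(x, xi) = exp(-|x - xi|), a hypothetical example.
G = np.exp(-np.abs(x[:, None] - xi[None, :])) * dxi

m = np.sin(np.pi * xi)  # a trial model m(xi)
d = G @ m               # forward problem: approximate d(x)
```

Each row of G approximates the integral over ξ for one observation point x; refining the grid (larger n) makes the discrete forward problem approach the continuous one.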
Equations in the form of (1.8), where m(x) is the unknown, are called Fredholm inte-
gral equations of the first kind (IFK). IFKs arise in a surprisingly large number of
inverse problems. A key characteristic of these equations is that they have mathematical
properties which make it difficult to obtain useful solutions by straightforward methods.
In many cases the kernel in (1.8) can be written to depend explicitly on x− ξ,
producing a convolution equation,
∫_{−∞}^{∞} g(x − ξ) m(ξ) dξ = d(x).    (1.11)
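On a uniform grid, a convolution equation of the form (1.11) reduces to a discrete convolution. This Python sketch (the exponential impulse response and the spike model are hypothetical choices for illustration) verifies the defining property: convolving with a discrete delta shifts the kernel to the spike location.

```python
import numpy as np

# Discretizing the convolution equation (1.11) on a uniform grid turns
# the integral into a discrete convolution.
dt = 0.01
t = np.arange(0, 1, dt)
g = np.exp(-t)            # a hypothetical causal impulse response
m = np.zeros_like(t)
m[10] = 1.0 / dt          # a unit-area spike approximating a delta function

d = np.convolve(g, m)[: len(t)] * dt
# Convolving with the (discrete) delta reproduces g, shifted to the
# spike's position; d is zero before the spike arrives.
print(np.allclose(d[10:], g[:-10]))
```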