Chris Bishop’s PRML
Ch. 3: Linear Models for Regression
Mathieu Guillaumin & Radu Horaud
October 25, 2007
Chapter content
- An example, polynomial curve fitting, was considered in Ch. 1.
- Regression models here are linear combinations of a fixed set of
  nonlinear functions, the basis functions.
- Supervised learning: N observations $\{x_n\}$ with corresponding
  target values $\{t_n\}$ are provided. The goal is to predict the
  value of t for a new value of x.
- Construct a function y(x) such that y(x) is a prediction of t.
- Probabilistic perspective: model the predictive distribution
  p(t|x).
Figure 1.16, page 29
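[Figure: t plotted against x, showing the curve y(x, w), the point y(x0, w) at x0, and the Gaussian p(t|x0, w, β) with a 2σ width marker.]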
The chapter section by section
3.1 Linear basis function models
  - Maximum likelihood and least squares
  - Geometry of least squares
  - Sequential learning
  - Regularized least squares
3.2 The bias-variance decomposition
3.3 Bayesian linear regression
  - Parameter distribution
  - Predictive distribution
  - Equivalent kernel
3.4 Bayesian model comparison
3.5 The evidence approximation
3.6 Limitations of fixed basis functions
Linear Basis Function Models
$$y(\mathbf{x}, \mathbf{w}) = \sum_{j=0}^{M-1} w_j \phi_j(\mathbf{x}) = \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x})$$
where:
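- $\mathbf{w} = (w_0, \dots, w_{M-1})^\top$ and $\boldsymbol{\phi} = (\phi_0, \dots, \phi_{M-1})^\top$, with
  $\phi_0(\mathbf{x}) = 1$ so that $w_0$ acts as the bias parameter.
- In general $\mathbf{x} \in \mathbb{R}^D$, but it will be convenient to treat the case
  $x \in \mathbb{R}$.
- We observe the set $X = \{\mathbf{x}_1, \dots, \mathbf{x}_n, \dots, \mathbf{x}_N\}$ with
  corresponding target variables $\mathbf{t} = \{t_n\}$.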
Basis function choices
- Polynomial: $\phi_j(x) = x^j$
- Gaussian: $\phi_j(x) = \exp\left( -\frac{(x - \mu_j)^2}{2 s^2} \right)$
- Sigmoidal: $\phi_j(x) = \sigma\left( \frac{x - \mu_j}{s} \right)$ with $\sigma(a) = \frac{1}{1 + e^{-a}}$
- Splines, Fourier, wavelets, etc.
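The three parametric families listed above, and the model y(x, w) as a weighted sum of basis functions, can be sketched in a few lines of Python. This is only an illustrative sketch for scalar x: the function names, the centres `mu`, the scale `s`, and the weights are assumptions chosen for the example, not values from the slides.

```python
import math

def polynomial_basis(x, j):
    """phi_j(x) = x**j; j = 0 gives the constant term phi_0(x) = 1."""
    return x ** j

def gaussian_basis(x, mu, s):
    """phi_j(x) = exp(-(x - mu_j)^2 / (2 s^2)), centred at mu with width s."""
    return math.exp(-((x - mu) ** 2) / (2.0 * s ** 2))

def logistic_sigmoid(a):
    """sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + math.exp(-a))

def sigmoidal_basis(x, mu, s):
    """phi_j(x) = sigma((x - mu_j) / s)."""
    return logistic_sigmoid((x - mu) / s)

def predict(x, w, basis_functions):
    """y(x, w) = sum_j w_j * phi_j(x) for a list of one-argument callables."""
    return sum(w_j * phi(x) for w_j, phi in zip(w, basis_functions))

# Hypothetical setup: a bias term plus two Gaussian bumps (made-up values).
basis = [
    lambda x: 1.0,                                # phi_0(x) = 1
    lambda x: gaussian_basis(x, mu=-0.5, s=0.2),
    lambda x: gaussian_basis(x, mu=0.5, s=0.2),
]
weights = [0.1, 1.0, -1.0]
y_at_centre = predict(0.5, weights, basis)
```

The model stays linear in `weights` whichever family is plugged into `basis`; only the dependence on x is nonlinear.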
Examples of basis functions
[Figure: three panels of example basis functions; horizontal axes run from −1 to 1.]
Maximum likelihood and least squares
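$$t = \underbrace{y(\mathbf{x}, \mathbf{w})}_{\text{deterministic}} + \underbrace{\epsilon}_{\text{Gaussian noise}}$$
For an i.i.d. data set we have the likelihood function:
$$p(\mathbf{t} \mid X, \mathbf{w}, \beta) = \prod_{n=1}^{N} \mathcal{N}\bigl(t_n \mid \underbrace{\mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n)}_{\text{mean}},\ \underbrace{\beta^{-1}}_{\text{var}}\bigr)$$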
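Taking the logarithm of this likelihood makes the link to least squares in the slide title explicit. The symbol $E_D(\mathbf{w})$ for the sum-of-squares error is notation introduced here for convenience; it does not appear on the slide.

```latex
\ln p(\mathbf{t} \mid X, \mathbf{w}, \beta)
  = \sum_{n=1}^{N} \ln \mathcal{N}\bigl(t_n \mid \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n), \beta^{-1}\bigr)
  = \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi) - \beta E_D(\mathbf{w}),
\qquad
E_D(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \bigl(t_n - \mathbf{w}^\top \boldsymbol{\phi}(\mathbf{x}_n)\bigr)^2
```

Only the last term depends on $\mathbf{w}$, so maximizing the likelihood with respect to $\mathbf{w}$ is the same as minimizing the sum-of-squares error.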
We can use the machinery of MLE to estimate the parameters w
and the precision β:
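$$\mathbf{w}_{ML} = (\boldsymbol{\Phi}^\top \boldsymbol{\Phi})^{-1} \boldsymbol{\Phi}^\top \mathbf{t} \quad \text{with } \boldsymbol{\Phi} \in \mathbb{R}^{N \times M},\ \Phi_{nj} = \phi_j(\mathbf{x}_n)$$
and:
$$\beta_{ML}^{-1} = \frac{1}{N} \sum_{n=1}^{N} \bigl(t_n - \mathbf{w}_{ML}^\top \boldsymbol{\phi}(\mathbf{x}_n)\bigr)^2$$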
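As a rough illustration of the two maximum likelihood estimates, the NumPy sketch below fits a polynomial-basis model to synthetic data. The polynomial basis, the data, the "true" weights, and the helper names `design_matrix` and `fit_ml` are all assumptions made for the example. It calls `np.linalg.lstsq` rather than forming the inverse of the M×M matrix explicitly; for a full-rank design matrix this yields the same solution as the normal equations and is usually better behaved numerically.

```python
import numpy as np

def design_matrix(x, M):
    """N x M matrix with entries Phi[n, j] = x_n**j (polynomial basis, an assumption)."""
    return np.vander(x, M, increasing=True)

def fit_ml(Phi, t):
    """Return (w_ml, beta_inv_ml) for a linear-in-parameters Gaussian noise model."""
    # Least-squares solution; equals (Phi^T Phi)^{-1} Phi^T t when Phi has full column rank.
    w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    residuals = t - Phi @ w_ml
    # ML noise variance: mean squared residual (divides by N, not N - M).
    beta_inv_ml = np.mean(residuals ** 2)
    return w_ml, beta_inv_ml

# Synthetic data with made-up parameters, for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
true_w = np.array([0.5, -1.0, 2.0])
noise_std = 0.1
Phi = design_matrix(x, 3)
t = Phi @ true_w + rng.normal(0.0, noise_std, size=x.shape)

w_ml, beta_inv_ml = fit_ml(Phi, t)
```

With enough data, `w_ml` should land near `true_w` and `beta_inv_ml` near `noise_std ** 2`, though the exact numbers depend on the random draw.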