# Linear least squares

Linear least squares refers to the problem of minimizing $\chi^2$ with a model that can be written as a linear combination of basis functions $f_j(x)$,

$$m(x) = \sum_{j=1}^M a_j f_j(x),$$

where $a_j$ are the model parameters. For example, if we wanted to fit a polynomial, we would choose $f_j(x) = x^{j-1}$ and minimize $\chi^2$ to find the polynomial coefficients $\{a_j\}$.
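For concreteness, here is a minimal sketch of evaluating such a model in Python (the helper name `model` and the quadratic example are our own choices for illustration):

```python
import numpy as np

def model(x, a):
    """Evaluate m(x) = sum_j a_j f_j(x) for the polynomial basis f_j(x) = x**(j-1)."""
    return sum(a_j * x**j for j, a_j in enumerate(a))

# A quadratic model m(x) = 1 + 2x + 3x^2 evaluated at a few points
print(model(np.array([0.0, 0.5, 1.0]), [1.0, 2.0, 3.0]))  # [1.   2.75 6.  ]
```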

We’ll follow the analysis of Numerical Recipes with some small changes in notation. Write out $\chi^2$ and minimize it:

$$\chi^2 = \sum_{i=1}^N \left[\frac{y_i - \sum_{j=1}^M a_j f_j(x_i)}{\sigma_i}\right]^2$$

$$\frac{\partial \chi^2}{\partial a_k} = 0 \quad\Longrightarrow\quad \sum_{i=1}^N \frac{1}{\sigma_i^2}\left[y_i - \sum_{j=1}^M a_j f_j(x_i)\right] f_k(x_i) = 0$$

$$\sum_{i=1}^N \frac{y_i}{\sigma_i}\,\frac{f_k(x_i)}{\sigma_i} - \sum_{i=1}^N \sum_{j=1}^M a_j\,\frac{f_j(x_i)}{\sigma_i}\,\frac{f_k(x_i)}{\sigma_i} = 0$$

At this point it is useful to define the design matrix

$$A_{ij} = \frac{f_j(x_i)}{\sigma_i}$$

and also the vector $d$, which is the data weighted by the inverse of the error at each point:

$$d_i = \frac{y_i}{\sigma_i}.$$
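As a sketch of how these objects look in code, here is one way to build $A$ and $d$ in NumPy for a quadratic fit (the dataset, seed, and variable names `F`, `A`, and `d` are invented for illustration):

```python
import numpy as np

# Hypothetical dataset: N noisy measurements of a quadratic with known errors
rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 20)
sigma = 0.1 * np.ones_like(x)                  # per-point measurement errors
y = 1.0 + 2.0 * x + 3.0 * x**2 + sigma * rng.normal(size=x.size)

M = 3  # fit a quadratic, i.e. basis functions f_j(x) = x**(j-1), j = 1..M
F = np.column_stack([x**j for j in range(M)])  # F_ij = f_j(x_i)
A = F / sigma[:, None]                         # design matrix A_ij = f_j(x_i)/sigma_i
d = y / sigma                                  # weighted data vector d_i = y_i/sigma_i
```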

Then we have

$$\sum_{i=1}^N A_{ik}\, d_i - \sum_{i=1}^N \sum_{j=1}^M a_j\, A_{ij} A_{ik} = 0$$

$$\sum_{i=1}^N (A^T)_{ki}\, d_i - \sum_{j=1}^M \left[\sum_{i=1}^N (A^T)_{ki} A_{ij}\right] a_j = 0$$

So in matrix form, we have

$$A^T d = A^T A\, a.$$

These are known as the normal equations for the linear least squares problem.
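Continuing with the arrays `A` and `d` from the sketch above, one way to solve the normal equations numerically is:

```python
# Solve (A^T A) a = A^T d for the parameter vector a
a = np.linalg.solve(A.T @ A, A.T @ d)
print(a)   # close to the true coefficients (1, 2, 3) used to generate the data

# Cross-check: NumPy's least-squares solver works on A and d directly,
# avoiding the explicit (and less well-conditioned) product A^T A
a_lstsq, *rest = np.linalg.lstsq(A, d, rcond=None)
print(np.allclose(a, a_lstsq))  # True
```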

We’ve assumed here that the noise is uncorrelated between measurements (the noise on one measurement does not depend on the noise on a previous measurement). The normal equations can be generalized to include correlated noise, but we’ll leave this for later and for now focus on how we can solve these equations.

Exercise:

Show that $\chi^2$ can be written in our matrix notation as

$$\chi^2 = (d - Aa)^T (d - Aa).$$
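The derivation itself is the exercise, but continuing the sketches above you can at least check the identity numerically:

```python
# chi^2 from the definition: a weighted sum of squared residuals
chi2_direct = np.sum(((y - F @ a) / sigma)**2)

# chi^2 in matrix form: (d - A a)^T (d - A a)
r = d - A @ a
chi2_matrix = r @ r

print(np.allclose(chi2_direct, chi2_matrix))  # True
```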