Polynomial Regression


Introduction
This article describes polynomial curve fitting by the least squares method.

Two approaches are presented. Method 1 provides the best-fitting polynomial of degree 0..7 for a set of points.
The graphics program Graphics-Explorer uses this method.

Method 2 is limited to straight lines, that is, polynomials of degree one (linear regression).

Method 1: Polynomial Regression
The picture shows five points $(x_1,y_1), \dots, (x_5,y_5)$.
Requested is the best-fitting polynomial of degree 2 through these points.

"Best fitting" means that the sum of the squares of the deviations is minimal.
The dotted lines in the picture show the deviation for each point.

Given are the points $(x_1,y_1), (x_2,y_2), \dots, (x_n,y_n)$.

Requested:
a polynomial of degree m,
$$ y = c_0 + c_1 x + c_2 x^2 + \dots + c_m x^m $$
through these points with minimal deviations.

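If $m + 1 = n$ (and all $x_i$ are different), the polynomial passes exactly through all points. Then, for $i = 1, \dots, n$:
$$ y_i = c_0 + c_1 x_i + c_2 x_i^2 + \dots + c_m x_i^m $$

Written in matrix form, $y = M c$:
$$
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^m \\
1 & x_2 & x_2^2 & \cdots & x_2^m \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^m
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_m \end{pmatrix}
$$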

If the polynomial does not pass exactly through the points, there is a difference vector:
$$ d = y - M c $$

The squared norm of this difference vector is the sum of all squared deviations:
$$ \| y - M c \|^2 = \sum_{i=1}^{n} d_i^2 $$

So we look for the values of c that make $\| y - M c \|$ minimal.
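The quantity being minimized can be computed directly. The following is a minimal sketch in plain Python, not code from Graphics-Explorer; the function names `vandermonde` and `sum_squared_deviations` are made up for illustration, and the sample points are the ones used in the example later in this article.

```python
def vandermonde(xs, m):
    """Build the matrix M: one row [1, x, x^2, ..., x^m] per data point."""
    return [[x ** k for k in range(m + 1)] for x in xs]


def sum_squared_deviations(xs, ys, c):
    """Return the squared norm of y - M c for coefficients c = [c0, ..., cm]."""
    M = vandermonde(xs, len(c) - 1)
    total = 0.0
    for row, y in zip(M, ys):
        fitted = sum(coef * term for coef, term in zip(c, row))
        total += (y - fitted) ** 2
    return total


xs = [0, 1, 2, 3]
ys = [1, 3, 4, 4]
print(sum_squared_deviations(xs, ys, [1.5, 1.0]))  # line y = 1.5 + x, prints 1.0
print(sum_squared_deviations(xs, ys, [1.0, 1.0]))  # line y = 1 + x,   prints 2.0
```

The line y = 1.5 + x gives the smaller sum of the two candidates tried here.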

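This will be the case if the difference vector is perpendicular to the column space of M.
The inner product of the difference vector with every vector $M v$ in that column space then equals zero. Using remarks 2 and 3 below:
$$ (M v)^t (y - M c) = v^t M^t (y - M c) = 0 \quad \text{for every } v $$
so
$$ M^t (y - M c) = 0 $$
$$ M^t M c = M^t y $$
If the points have at least m+1 different x values, $M^t M$ can be inverted and
$$ c = (M^t M)^{-1} M^t y $$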

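Remarks
1.
$M^t$ means the matrix M reflected in its main diagonal (the transpose of M).

If
$$ M = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & m_{22} \\ m_{31} & m_{32} \end{pmatrix} $$
then
$$ M^t = \begin{pmatrix} m_{11} & m_{21} & m_{31} \\ m_{12} & m_{22} & m_{32} \end{pmatrix} $$

2.
Rule: $(A B)^t = B^t A^t$

3.
The inner product of two vectors a and b can be written as $a^t b$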
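The perpendicularity condition leads to the equations $M^t M c = M^t y$, which can be solved numerically. Below is a minimal sketch in plain Python under the assumption that a direct solve by Gaussian elimination is acceptable; it is not the code used by Graphics-Explorer, and the names `transpose`, `matmul`, `solve` and `polyfit` are hypothetical helpers written for this illustration.

```python
def transpose(A):
    """Reflect a matrix (list of rows) in its main diagonal."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least m+1 different x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def polyfit(xs, ys, m):
    """Return [c0, ..., cm] solving Mt M c = Mt y for a degree-m polynomial."""
    M = [[x ** k for k in range(m + 1)] for x in xs]
    Mt = transpose(M)
    MtM = matmul(Mt, M)
    Mty = [sum(a * y for a, y in zip(row, ys)) for row in Mt]
    return solve(MtM, Mty)


# Illustrative data lying exactly on y = 1 + x^2 (made up for this sketch).
print(polyfit([0, 1, 2, 3, 4], [1, 2, 5, 10, 17], 2))  # approximately [1, 0, 1]
```

For higher degrees or widely spread x values the matrix $M^t M$ becomes badly conditioned, so a production implementation would more likely use a QR or SVD based solver.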

Example
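Find the least-squares line through
the points (0,1), (1,3), (2,4) and (3,4).

Here m = 1, so $c = [c_0, c_1]$ and
$$
M = \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}, \qquad
y = \begin{pmatrix} 1 \\ 3 \\ 4 \\ 4 \end{pmatrix}
$$
$$
M^t M = \begin{pmatrix} 4 & 6 \\ 6 & 14 \end{pmatrix}, \qquad
M^t y = \begin{pmatrix} 12 \\ 23 \end{pmatrix}
$$
Solving $M^t M c = M^t y$:
$$ 4 c_0 + 6 c_1 = 12 $$
$$ 6 c_0 + 14 c_1 = 23 $$
gives $c_0 = 1.5$ and $c_1 = 1$.

The line therefore is y = 1.5 + x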
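The example can be checked with a few lines of plain Python. This is only a sketch: it builds $M^t M$ and $M^t y$ for the four points and solves the resulting 2 x 2 system with Cramer's rule, which is a choice made here for brevity and not necessarily how the original calculation was done.

```python
points = [(0, 1), (1, 3), (2, 4), (3, 4)]

M = [[1, x] for x, _ in points]      # rows [1, x_i] for a degree-1 polynomial
y = [yi for _, yi in points]
Mt = [list(col) for col in zip(*M)]  # transpose of M

# (Mt M)[i][j] is the inner product of rows i and j of Mt.
MtM = [[sum(p * q for p, q in zip(r, c)) for c in Mt] for r in Mt]
Mty = [sum(p * q for p, q in zip(r, y)) for r in Mt]
print(MtM)  # prints [[4, 6], [6, 14]]
print(Mty)  # prints [12, 23]

# Cramer's rule for the 2 x 2 system MtM * [c0, c1] = Mty.
det = MtM[0][0] * MtM[1][1] - MtM[0][1] * MtM[1][0]
c0 = (Mty[0] * MtM[1][1] - MtM[0][1] * Mty[1]) / det
c1 = (MtM[0][0] * Mty[1] - MtM[1][0] * Mty[0]) / det
print(c0, c1)  # prints 1.5 1.0
```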

Method 2: Linear Regression
Given are the points $(x_i, y_i)$, where i = 1,2,...,n, as in the picture.
Requested is the line y = b + ax with minimal deviation from these points.

Again, the result is measured as the sum of the squares
of the differences $d_i$ between point i and the line.
In the case of n points:
$$ \sum_{i=1}^{n} d_i^2 $$

For a point i we observe
$$ d_i = y_i - (b + a x_i) $$

Before continuing, we give some definitions and calculus rules.

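Definition
$$ \sum_{i=1}^{n} x_i = x_1 + x_2 + \dots + x_n $$

Rules
$$ \sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i, \qquad \sum_{i=1}^{n} k\,x_i = k \sum_{i=1}^{n} x_i, \qquad \sum_{i=1}^{n} k = n k $$

Example:
$$ \sum_{i=1}^{n} (y_i - b - a x_i) = \sum_{i=1}^{n} y_i - n b - a \sum_{i=1}^{n} x_i $$

The formulas for the regression line y = b + ax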
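Let f(a,b) be the sum of the squared deviations of points 1..n:
$$ f(a,b) = \sum_{i=1}^{n} (y_i - b - a x_i)^2 $$

First, the partial derivative of f(a,b) with respect to a is calculated, where b is held constant:
$$ \frac{\partial f}{\partial a} = -2 \sum_{i=1}^{n} x_i (y_i - b - a x_i) $$

Second, the partial derivative of f(a,b) with respect to b is calculated, where a is held constant:
$$ \frac{\partial f}{\partial b} = -2 \sum_{i=1}^{n} (y_i - b - a x_i) $$

For the minimum of f(a,b) both derivatives must be 0.
This yields the following set of equations:
$$ \sum x_i y_i - b \sum x_i - a \sum x_i^2 = 0 \qquad (1) $$
$$ \sum y_i - n b - a \sum x_i = 0 \qquad (2) $$

Continuing with (2):
$$ b = \frac{\sum y_i - a \sum x_i}{n} \qquad (3) $$

and using this result in (1):
$$ \sum x_i y_i - \frac{\sum x_i \sum y_i}{n} + a \frac{\left(\sum x_i\right)^2}{n} - a \sum x_i^2 = 0 $$
$$ a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} $$

By now, the formulas for a and b are found.
By substituting the value of a into (3), the value of b can be calculated.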
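Setting both partial derivatives to zero gives a = (n Σxy - Σx Σy) / (n Σx² - (Σx)²) and b = (Σy - a Σx) / n. The following is a minimal sketch of these two formulas in plain Python; the function name `linear_regression` and the sample data are assumptions made for illustration only.

```python
def linear_regression(xs, ys):
    """Return (a, b) for the least-squares line y = b + a*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two points with matching x and y lists")
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sxx - sx * sx
    if denominator == 0:
        raise ValueError("all x values are equal; the line is not determined")
    a = (n * sxy - sx * sy) / denominator  # slope
    b = (sy - a * sx) / n                  # intercept, from the b equation
    return a, b


# Illustrative data, made up for this sketch.
a, b = linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b)  # approximately 0.6 and 2.2
```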
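However, the formula for a may be written more elegantly.
Numerator and denominator are worked on separately, using the means
$$ \bar{x} = \frac{1}{n} \sum x_i, \qquad \bar{y} = \frac{1}{n} \sum y_i $$

1. numerator
$$ n \sum x_i y_i - \sum x_i \sum y_i = n \left( \sum x_i y_i - n \bar{x} \bar{y} \right) = n \sum (x_i - \bar{x})(y_i - \bar{y}) $$

2. denominator
$$ n \sum x_i^2 - \left( \sum x_i \right)^2 = n \left( \sum x_i^2 - n \bar{x}^2 \right) = n \sum (x_i - \bar{x})^2 $$

summary:
$$ a = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b = \bar{y} - a \bar{x} $$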
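Assuming the more elegant form meant here is the usual one written in terms of deviations from the means, a = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b = ȳ - a x̄, it can be sketched in plain Python as follows. The function name `linear_regression_centered` is hypothetical; the data are the four points from the example in Method 1.

```python
def linear_regression_centered(xs, ys):
    """Return (a, b) for y = b + a*x using deviations from the means."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    if denominator == 0:
        raise ValueError("all x values are equal; the line is not determined")
    a = numerator / denominator
    b = y_mean - a * x_mean
    return a, b


print(linear_regression_centered([0, 1, 2, 3], [1, 3, 4, 4]))  # prints (1.0, 1.5)
```

Subtracting the means first tends to lose less precision than the sums-of-products form when the x values are large compared with their spread.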
Note
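The above formulas for a and b may also be derived using linear algebra instead of differential calculus.
In the case of linear regression, m = 1 and c = [b, a] (the line has equation y = b + ax).
We start with
$$ M^t M c = M^t y $$
remembering
$$
M = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}, \qquad
c = \begin{pmatrix} b \\ a \end{pmatrix}, \qquad
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
$$
The next set of equations has to be solved:
$$
\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix}
\begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix}
\begin{pmatrix} b \\ a \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
$$
or
$$
\begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix}
\begin{pmatrix} b \\ a \end{pmatrix}
=
\begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix}
$$
or
$$ n b + a \sum x_i = \sum y_i $$
$$ b \sum x_i + a \sum x_i^2 = \sum x_i y_i $$
which matches equations (1) and (2) found before.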