How to fit

10.1 Creating noisy data

In this section, we will create some noisy data that follow a linear curve.
We start by creating some random x-values distributed uniformly in the interval [-3, 12]:
import numpy as np
np.set_printoptions(precision=3)

N = 20
xs = -3 + 15 * np.array(sorted(np.random.rand(N)))
xs
array([-2.918, -0.36 , -0.343, -0.236,  1.362,  1.81 ,  2.952,  3.587,
        4.134,  4.609,  4.905,  5.319,  5.706,  7.547,  7.691,  7.975,
        8.127,  8.221,  9.267, 10.004])
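Since the data are random, your numbers will differ from run to run. If you want the exact same numbers every time, you can seed NumPy's random number generator before generating the data (not done in this section; the seed value 42 is arbitrary):

np.random.seed(42)   # fix the seed so the random numbers are reproducible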
Next, we create some corresponding y-values so that we end up with noisy data points that follow a linear curve with slope a = -0.7 and intercept b = 4.2:
sigma0 = 0.8
noise = sigma0 * np.random.randn(N)

a = -0.7
b = 4.2
ys = a * xs + b + noise 

ys
array([ 5.455,  2.908,  3.089,  4.039,  3.59 ,  3.779,  2.367,  2.381,
        2.506,  2.037, -0.058,  0.745, -0.852, -0.831, -0.397, -0.539,
       -2.101, -2.099, -3.948, -1.512])
When plotted
import matplotlib.pyplot as plt
plt.rc('font', size=16)

fig, ax = plt.subplots(1,1)
ax.grid()
ax.plot(xs,ys,'.',markersize=15,markeredgecolor='k',color='c')
it looks like this:

Figure 1: The noisy data points.



10.2 Least Square Fit I

We will now be making a fit of a function to the data points. The obvious choice here is a linear function,

f(x) = a·x + b,

where a and b must be chosen so that the function approximates the data points "as well as possible". To quantify "as well as possible", we introduce the sum of squared errors that the function makes at the data points' x-values when compared to their y-values. That is, we will consider the sum

S(a, b) = Σᵢ (f(xᵢ) − yᵢ)²

and choose a and b so that the sum is minimized.
In Python it may look like this:
from scipy.optimize import fmin
f = lambda x, a, b: a * x + b

q_opt = fmin(lambda q: np.sum((f(xs,q[0],q[1]) - ys)**2),[0,0])
q_opt
where the lambda function inside fmin is introduced because fmin expects a function of a single variable. We need two variables determined, a and b, but luckily fmin accepts a vector as its argument, and hence we provide the vector q, whose first element is interpreted as a and whose second element as b. We get the following output:
Optimization terminated successfully.
         Current function value: 16.144634
         Iterations: 67
         Function evaluations: 132
array([-0.636,  3.871])
and see that the optimum values of a and b become a ≈ -0.636 and b ≈ 3.871, which are close to the values a = -0.7 and b = 4.2 that were chosen when the data were created. The second argument given to fmin, [0,0], is the initial guess for a and b.
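If the nested lambda feels cryptic, the same minimization can be written with an ordinary named function; this sketch is equivalent to the call above:

def sse(q):
    # sum of squared errors for the parameter vector q = [a, b]
    return np.sum((f(xs, q[0], q[1]) - ys)**2)

q_opt = fmin(sse, [0, 0])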
Finally, we may plot the fitted function:
a_opt = q_opt[0]
b_opt = q_opt[1]
ax.plot(xs,f(xs,a_opt,b_opt),'k--',linewidth=3)
fig

Figure 2: The data points and the fitted line.
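As a sanity check, the minimized sum of squared errors can be recomputed directly from the optimum parameters; it should reproduce the "Current function value" reported by fmin above:

np.sum((f(xs, a_opt, b_opt) - ys)**2)   # approx. 16.14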



10.3 Least Square Fit II

Here is an example where a least-squares fit with a Gaussian-type function is made to data that resemble a normal distribution.
First we create some noisy data:
import numpy as np
import matplotlib.pyplot as plt
plt.rc('font', size=16)

N = 1000
bins = np.linspace(-5,10,31)    # bin edges
bins_c = (bins[1:]+bins[:-1])/2 # center values of the bins
x0 = 3.2
sigma0 = 1.2
data = sigma0 * np.random.randn(N) + x0

obs, _ = np.histogram(data,bins)
bins_c,obs
(array([-4.75, -4.25, -3.75, -3.25, -2.75, -2.25, -1.75, -1.25, -0.75,
        -0.25,  0.25,  0.75,  1.25,  1.75,  2.25,  2.75,  3.25,  3.75,
         4.25,  4.75,  5.25,  5.75,  6.25,  6.75,  7.25,  7.75,  8.25,
         8.75,  9.25,  9.75]),
 array([  0,   0,   0,   0,   0,   0,   0,   0,   1,   2,   8,  27,  37,
         86, 121, 162, 154, 159, 115,  62,  29,  26,   8,   3,   0,   0,
          0,   0,   0,   0]))
The np.histogram function simply distributes the data into the bins and counts how many data points fall in each bin. We can confirm that we have a total of N=1000 data points:
np.sum(obs)
1000
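To see what np.histogram does in isolation, here is a minimal example with three data points and bin edges 0, 1 and 2; the value 0.5 falls in the first bin, while 1.5 and 1.7 both fall in the second:

np.histogram([0.5, 1.5, 1.7], bins=[0, 1, 2])
(array([1, 2]), array([0, 1, 2]))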
The data may be plotted like this:
fig, ax = plt.subplots(1,1)
ax.grid()
ax.bar(bins_c,obs,width=0.4,edgecolor='k',color='c')

Figure 3: Histogram of the data.

Now, a least-squares fit of the Gaussian-type function

f(x) = A/(σ·√(2π)) · exp(−((x − x₀)/σ)²/2)

is made by introducing a vector q with A, x₀, and σ as its three components and having fmin find the optimum q as the one that minimizes the sum of squared deviations between the y-values of the data points and the function values at the corresponding x-values:

S(q) = Σᵢ (f(xᵢ; A, x₀, σ) − yᵢ)²
The Python code may look like this:
from scipy.optimize import fmin
f = lambda x, A, x0, sigma: \
    A/sigma/np.sqrt(2*np.pi)*np.exp(-1/2*((x-x0)/sigma)**2)

q_opt = fmin(lambda q: np.sum((f(bins_c,q[0],q[1],q[2]) - obs)**2),
             [100,2.5,1])
q_opt
We get this output:
Optimization terminated successfully.
         Current function value: 659.506315
         Iterations: 116
         Function evaluations: 206
array([497.03785365,   3.17629749,   1.18678323])
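The fitted amplitude of roughly 497 is no accident: the Gaussian above integrates to A, while the histogram counts sum to N with bin width 0.5, so we should expect A ≈ N·Δx = 1000 · 0.5 = 500:

N * (bins[1] - bins[0])   # expected amplitude
500.0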
and may plot the fit:
A_opt = q_opt[0]
x0_opt = q_opt[1]
sigma_opt = q_opt[2]
xs = np.linspace(-5,10,100)
ax.plot(xs,f(xs,A_opt,x0_opt,sigma_opt),'k--',linewidth=3)
fig

Figure 4: The histogram and the fitted Gaussian.
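For completeness, SciPy also provides scipy.optimize.curve_fit, which performs the same least-squares minimization without the hand-written sum; a minimal sketch, assuming the same f, bins_c, and obs as above:

from scipy.optimize import curve_fit

# curve_fit expects a model of the form f(x, *params) and returns the
# optimum parameters together with their covariance matrix
popt, pcov = curve_fit(f, bins_c, obs, p0=[100, 2.5, 1])
popt

The resulting popt should be close to the q_opt found with fmin.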


