The data for this problem is available at www.eg.bucknell.edu/~phys310/hw/fitting_4/fit_4.dat Each line in the data file contains a value of $x_i$, $y_i$, and $u_i$, where $u_i$ is the uncertainty in $y$.
fit_4.dat
import numpy as np
from scipy import optimize
import urllib # for importing from a URL
import matplotlib as mpl
import matplotlib.pyplot as plt
# Following is an Ipython magic command that puts figures in notebook.
# make pdf-file with matplotlib inline and for jupyter notebook use matplotlib notebook
%matplotlib notebook
# M.L. modification of matplotlib defaults
# Changes can also be put in matplotlibrc file,
# or effected using mpl.rcParams[]
plt.style.use('classic')
plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures
plt.rc('axes', labelsize=16, titlesize=14)
plt.rc('figure', autolayout = True) # Adjusts supblot params for new size
NOTE: This function is not linear in the fit parameters $a$, $b$, $c$, and $d$. Therefore
initial estimates for these parameters must be provided to the curvefit
function.
def s(x, a, b, c, d):
return a*c**2/((x-b)**2 + c**2) + d
def chi2(x, y, u, a, b, c, d):
yr = (y - s(x,a,b,c,d))/u # normalized residuals
return np.sum(yr**2)
g = urllib.request.urlopen('https://www.eg.bucknell.edu/~phys310/hw/fitting_4/fit_4.dat')
data = np.loadtxt(g)
x, y, u = data.T
plt.figure()
plt.errorbar(x, y, u, fmt='o')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title("Result for part (a)");
a = 6 # Height of peak above background
b = 45 # Position of peak
c = 0.5 # Half width half maximum
d = 1.8 # Background
p0 = a, b, c, d
plt.figure()
xc = np.linspace(40,50,201)
yc = s(xc, *p0)
plt.plot(xc, yc)
plt.errorbar(x,y,u,fmt='o')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title("Preliminary graph with estimated parameters");
popt, pcov = optimize.curve_fit(s, x, y, p0, sigma = u, absolute_sigma=True)
a, b, c, d = popt
plt.figure()
xc = np.linspace(40,50,201)
yc = s(xc, *popt)
plt.plot(xc, yc)
plt.errorbar(x,y,u,fmt='o')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title("graph with best-fit parameters");
print('a =', a,'+/-', np.sqrt(pcov[0,0]))
print('b =', b,'+/-', np.sqrt(pcov[1,1]))
print('c =', c,'+/-', np.sqrt(pcov[2,2]))
print('d =', d,'+/-', np.sqrt(pcov[3,3]))
The fit looks good, but we should really look at the residuals and calculate $\chi^2_\text{min}$.
plt.figure()
yr = (y - s(x, *popt))/u
#plt.scatter(x, yr)
plt.errorbar(x,yr,1,fmt='o')
plt.axhline(0)
plt.axhline(1, ls='--')
plt.axhline(-1, ls='--')
plt.xlabel('$x$')
plt.ylabel('residuals')
plt.title("normalized residuals");
print(len(x), chi2(x, y, u, *popt))
For $a$ and $b$, I expect a correlation coefficient near zero. The determination of the horizontal position of the peak has no bearing on the height of the peak.
AA = np.linspace(5.6, 6.8, 201)
bb = np.linspace(44.93, 45.07, 201)
Z = np.zeros((len(AA),len(bb)))
for i in range(len(AA)):
for j in range(len(bb)):
Z[j,i] = chi2(x, y, u, AA[i], bb[j], c, d) - chi2(x, y, u, *popt)
plt.figure()
AAgrid, bbgrid = np.meshgrid(AA, bb)
CS = plt.contour(AAgrid, bbgrid, Z, levels=[1,2,5])
plt.xlabel('$a$')
plt.ylabel('$b$')
plt.title('$\chi^2 - \chi^2_{min}$ as function of peak height($a$) and x-position($b$)')
plt.grid()
plt.axhline(b)
plt.axvline(a)
plt.clabel(CS, inline=1, fontsize=10);
And the fact that the contours are not tilted with respect to the (x,y) axes confirms that there is no correlation between these parameters. Using the $\chi^2-\chi^2_{min} = 1$ contour, I see that the range in acceptable $a$ values runs from 5.9 to 6.5, so an uncertainty of 0.3 seems reasonable. Likewise, the range in acceptable $b$ values runs from 44.98 to 45.04, so an uncertainty of 0.03 in this parameter seems reasonable.
For $c$ and $d$ I expect there to be a nonzero correlation. A high estimation of the background $d$) will clearly be negatively correlated with a low estimation of the signal amplitude ($a$), but it will also be negatively correlated with a low estimation of the width at half maximum (determined by $c$), because the maximum is low.
cc = np.linspace(0.37, 0.55, 201)
dd = np.linspace(1.7, 2.0, 201)
Z = np.zeros((len(cc),len(dd)))
for i in range(len(cc)):
for j in range(len(dd)):
Z[j,i] = chi2(x, y, u, a, b, cc[i], dd[j]) - chi2(x, y, u, *popt)
plt.figure()
CCgrid, DDgrid = np.meshgrid(cc, dd)
CS = plt.contour(CCgrid, DDgrid, Z, levels=[1,2,5])
plt.xlabel('$c$')
plt.ylabel('$d$')
plt.title('$\chi^2 - \chi^2_{min}$ as function of peak width($c$) and background($d$)')
plt.grid()
plt.axhline(d)
plt.axvline(c)
plt.clabel(CS, inline=1, fontsize=10)
And the fact that the contours are tilted with respect to the (x,y) axes confirms that there is a correlation between these parameters. Using the $\chi^2-\chi^2_{min} = 1$ contour, I see that the range in acceptable $c$ values runs from 0.43 to 0.52, so an uncertainty of 0.05 seems reasonable. Likewise, the range in acceptable $d$ values runs from 1.72 to 1.94, so an uncertainty of 0.09 in this parameter seems reasonable.
version_information
is from J.R. Johansson (jrjohansson at gmail.com); see Introduction to scientific computing with Python for more information and instructions for package installation.
version_information
is installed on the linux network at Bucknell
%load_ext version_information
version_information numpy, scipy, matplotlib