In Part A of the Prelude to Chapter 8, we looked at the expected distribution of values of $\chi^2$ for the fit of a straight line to 5 data points. In Part B we repeat exactly the same process, but the model is an exponential decay, and we fit the model to a larger number of data points.
Marty Ligare, August 2020
import numpy as np
from scipy import stats
from scipy import optimize
import matplotlib as mpl
import matplotlib.pyplot as plt
# Following is an Ipython magic command that puts figures in the notebook.
%matplotlib notebook
# M.L. modifications of matplotlib defaults
# Changes can also be put in matplotlibrc file,
# or effected using mpl.rcParams[]
mpl.style.use('classic')
plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures
plt.rc('axes', labelsize=16, titlesize=14)
plt.rc('figure', autolayout = True) # Adjusts supblot parameters for new size
Remember: $$ \chi^2 = \sum_i\frac{\left(y(x_i) - y_i\right)^2}{\alpha_i^2} $$
def f(x,q,r,s):
'''Simple exponential decay function (with offset); nonlinear
in the fit parameteres q, r, and s.'''
def chi2(x,y,u,q,r,s):
'''Chisquare as a function of data (x, y, and yerr=u), and model
parameters q, r, and s.'''
data = np.array([[0. , 7.46129812, 0.4 ],
[0.2 , 5.85452157, 0.4 ],
[0.4 , 6.07717576, 0.4 ],
[0.6 , 5.4736062 , 0.4 ],
[0.8 , 4.5904504 , 0.4 ],
[1. , 4.79926158, 0.4 ],
[1.2 , 4.55971739, 0.4 ],
[1.4 , 3.60873722, 0.4 ],
[1.6 , 2.86732949, 0.4 ],
[1.8 , 3.54260122, 0.4 ],
[2. , 2.13946228, 0.4 ],
[2.2 , 2.22060587, 0.4 ],
[2.4 , 1.63133617, 0.4 ],
[2.6 , 2.46693153, 0.4 ],
[2.8 , 2.32689348, 0.4 ],
[3. , 1.5128004 , 0.4 ],
[3.2 , 1.20809552, 0.4 ],
[3.4 , 1.43849374, 0.4 ],
[3.6 , 1.36242689, 0.4 ],
[3.8 , 1.73797785, 0.4 ],
[4. , 1.72819865, 0.4 ]])
Normalized residuals look ok according to qualitative assessments mentioned, so precede to more quantitative measures.
Again, the histogram gives a probability distribution for values $\chi^2$ that minimize data sets consistent with the model and uncertainties. With more data points, the probability distribution is more symmetric than that in Part A.
Again, the value of $\chi^2$ obtained by fitting the original data is in the range of values we expect from the model and uncertainties. To be more quantitative, there is a 25% chance of getting a value of $\chi^2$ that is 21.5 or bigger, so it's very reasonable to expect a value of about 21.5.
There no reason to reject the exponential decay model based on the value of $\chi^2 = 21.5$.
On the other hand, what if our fit had returned a value of $\chi^2 = 40$? (This is more than 3 standard deviations about the mean of the data in the historgram.) Zooming in on the histogram shows that there only 17 out of the 10,000 simulated data sets (0.0017%), that have a value of $\chi^2$ greater than 40. It's pretty unlikely that we would get a value of $\chi^2=40$ from our model and random flucuations.
What if our fit had returned a value of $\chi^2 = 4$? Inspection of the histrogram data shows that only 44 of the 10,000 simulated data sets returned a value less than 4, or equivalently 99.56% of the trials returned a value of $\chi^2$ greater than 4. This would be a good indication that our assumed error bars were too big.
version_information
is from J.R. Johansson (jrjohansson at gmail.com).
See Introduction to scientific computing with Python
for more information and instructions for package installation.
version_information
has been installed system-wide on Bucknell linux networked computers.
%load_ext version_information
version_information numpy, scipy, matplotlib