The data for this problem is available in the file fit_1.dat
which can
be downloaded from
http://www.eg.bucknell.edu/~phys310/hw/hw4.html. One option is to download the data file to
a local directory and then import it using np.loadtxt(). Another option is to read it
directly from the URL into the Jupyter notebook. I will use the second option here.
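For reference, the first option might look like the following sketch. The two-line stand-in file and its numbers are made up here so the example is self-contained; with the real download you would simply call np.loadtxt('fit_1.dat').

```python
import numpy as np

# Sketch of the first option: np.loadtxt on a local file.
# A tiny stand-in file (made-up numbers) is written so the example runs on its own.
with open('fit_1_example.dat', 'w') as fh:
    fh.write('4.9 0.4\n5.1 0.3\n')   # each row: measured value, uncertainty

data = np.loadtxt('fit_1_example.dat')
y, u = data.T                        # values in y, uncertainties in u
print(y, u)
```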
import numpy as np
import urllib.request # for importing data from a URL
import matplotlib as mpl
import matplotlib.pyplot as plt
# Following is an Ipython magic command that puts figures in notebook. (html-version compatible)
%matplotlib notebook
# to get figures to show up in pdf use the next line instead:
#%matplotlib inline
# M.L. modification of matplotlib defaults
# Changes can also be put in matplotlibrc file,
# or effected using mpl.rcParams[]
plt.style.use('classic')
plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures
plt.rc('axes', labelsize=16, titlesize=14)
plt.rc('figure', autolayout = True) # Adjusts subplot params for new size
The np.loadtxt
function imports the contents of the data file into a NumPy
array.
The unpack=True
option transposes the array so that each column of the data
file is in a separate array.
f = urllib.request.urlopen('https://www.eg.bucknell.edu/~phys310/hw/assignments/fitting_1/fit_1.dat')
data = np.loadtxt(f)
print(data)
These data come from twenty (20) experiments
nominally measuring the same physical quantity. The data for each measurement (point)
is on a single line in the file. The first number on each line is the value of the
measured quantity, and the second is the uncertainty in the measurement. When using np.loadtxt, the unpack=True
option takes the transpose of the file, putting all the measured values in
the first element of the array data
and all the uncertainties in the second
element. Here the data were read directly from the URL and stored in the array data
; in the next cell the transpose of the data
array is taken instead.
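The equivalence of unpack=True and taking the transpose can be checked directly. The three-row sample below is made up for illustration and stands in for fit_1.dat:

```python
import io
import numpy as np

# Check that unpack=True and .T give the same result on a small made-up sample.
sample = io.StringIO('4.9 0.4\n5.1 0.3\n5.0 0.2\n')
vals, uncs = np.loadtxt(sample, unpack=True)   # columns come back as separate arrays

sample.seek(0)                                 # rewind to reread the same text
data = np.loadtxt(sample)                      # shape (3, 2): one row per line
print(np.array_equal(np.array([vals, uncs]), data.T))  # → True
```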
The best value is the weighted mean of the data: \begin{equation} x_{\rm CE}=\frac{\sum \limits_{i=1}^{N} w_i x_i}{\sum \limits_{i=1}^{N} w_i}, \qquad \mbox{where} \qquad w_i=\frac{1}{\alpha_i^2}, \qquad \mbox{with uncertainty} \qquad \alpha_{\rm CE}=\frac{1}{\sqrt{\sum \limits_i w_i}}. \end{equation}
y, u = data.T
w = 1/u**2 # Calculate weights
mean = np.sum(y*w)/np.sum(w) # Calculate weighted mean
wunc = 1./np.sqrt(np.sum(w)) # Calculate uncertainty in the weighted mean
# Round the uncertainty to one significant figure and the mean to the same decimal place using round().
print(mean, wunc)
print(round(mean,2),round(wunc,2))
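The rounding above was done by eye (two decimal places happened to be right). A small helper can automate the rule; round_sig below is a hypothetical function, not part of the original notebook:

```python
import math

def round_sig(mean, unc):
    """Round unc to one significant figure and mean to the same decimal place.
    Hypothetical helper, not from the original notebook."""
    digits = -int(math.floor(math.log10(abs(unc))))  # decimal place of leading digit
    return round(mean, digits), round(unc, digits)

print(round_sig(5.1187, 0.0234))   # → (5.12, 0.02)
```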
plt.figure()
x = np.linspace(1,len(y),len(y))
plt.errorbar(x,y,u, fmt = 'ko')
plt.axhline(mean, c='r', label='weighted mean')
#plt.axhline(mean + wunc)
#plt.axhline(mean - wunc)
plt.xlabel('experiment number')
plt.legend(loc='upper left')
plt.xlim(0,21)
plt.ylabel('result');
Visual inspection of the graph, along with use of the zooming feature of the interactive graphics, reveals that 15 of the 20 points fall within one standard deviation of the weighted mean.
It's probably easier to have the computer do the counting. To find the number of points within one standard deviation of the weighted mean, we can calculate the normalized residuals (r_norm
)
and count those with magnitude < 1.
# METHOD 1
r_norm = (y - mean)/u # Calculate normalized residuals
cnt = 0
for i in range(len(r_norm)):
    if np.abs(r_norm[i]) < 1:
        cnt += 1
print(cnt, 'of the', len(r_norm), 'values are within 1 alpha_i of the weighted mean')
# METHOD 2
cnt = np.sum(np.abs(r_norm) < 1)
print(cnt, 'of the', len(r_norm), 'values are within 1 alpha_i of the weighted mean')
# to see how Method 2 works uncomment the following line
#print(np.abs(r_norm) < 1)
# in the sum command every TRUE is interpreted as 1 and every FALSE as zero
The goodness of fit $\chi^2$ is simply the sum of the squares of the normalized residuals.
chisq = np.sum(r_norm**2) # chi-squared: sum of squared normalized residuals
print(chisq, len(r_norm))
The goodness-of-fit parameter is $\chi^2 = 14.7$, which is close to the number of data points (20). This is consistent with the idea that the measurements are all of the same quantity, so it makes sense to average them.
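A common refinement is to compare $\chi^2$ to the number of degrees of freedom, here $N-1 = 19$ since one parameter (the weighted mean) was estimated; a reduced $\chi^2$ near 1 supports the consistency claim. A minimal sketch, using made-up standardized residuals in place of the notebook's r_norm:

```python
import numpy as np

# Illustrative standardized residuals; in the notebook r_norm comes from the data.
rng = np.random.default_rng(0)
r_norm = rng.normal(size=20)      # 20 made-up residuals drawn from a unit normal

chisq = np.sum(r_norm**2)
dof = len(r_norm) - 1             # one fitted parameter: the weighted mean
print('reduced chi-squared:', chisq / dof)
```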
The version_information
extension is from J.R. Johansson (jrjohansson at gmail.com); see Introduction to scientific computing with Python for more information and instructions for package installation.
version_information
is installed on the Linux network at Bucknell.
%load_ext version_information
version_information numpy, matplotlib