# PHYS 310 Class 4 (Data Analysis class number 3)
Skeleton notebook for in-class exercises

Tom Solomon, Feb 2023

In [None]:
# First, start with the standard stuff at the top
import numpy as np
from scipy import optimize
import urllib

import matplotlib as mpl
import matplotlib.pyplot as plt

In [None]:
# The following is an IPython magic command that puts figures in the notebook.
%matplotlib notebook 
        
# M.L. modifications of matplotlib defaults
# Changes can also be put in matplotlibrc file, 
# or effected using mpl.rcParams[]
mpl.style.use('classic') 
plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures
plt.rc('axes', labelsize=16, titlesize=14)
plt.rc('figure', autolayout = True) # Adjusts subplot params for new size

# Recall from last week:  
### Average $\bar{x} = \frac{1}{N}\sum_i x_i$
### Weighted average $\bar{x}_\text{weighted} = \frac{\sum_i w_i x_i}{\sum_i w_i}$
### where weights $w_i = \frac{1}{\alpha_i^2}$ and $\alpha_i$ is the uncertainty in $x_i$
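As a minimal sketch of the two formulas above, the function below computes a weighted average with NumPy. The name `weighted_mean` and the sample numbers are illustrative assumptions, not part of the course material.

```python
import numpy as np

def weighted_mean(x, alpha):
    """Weighted average of x using weights w_i = 1 / alpha_i**2."""
    x = np.asarray(x, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    w = 1.0 / alpha**2                 # more precise points get larger weights
    return np.sum(w * x) / np.sum(w)

# Made-up numbers, for illustration only
x_demo = np.array([10.0, 20.0])
alpha_demo = np.array([1.0, 2.0])

plain = np.mean(x_demo)                        # ordinary average
weighted = weighted_mean(x_demo, alpha_demo)   # pulled toward the more precise point
print(plain, weighted)
```

When all the $\alpha_i$ are equal, the weights cancel and the weighted average reduces to the ordinary average.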

<hr>
Question:  when is it appropriate to average different results (with or without using weights)?

Let's say, e.g., two different groups measure Planck's constant _h_ from the cut-off wavelength of x-rays emitted by x-ray anodes. One group uses a molybdenum anode and finds $6.8 \pm 0.2$, $6.9 \pm 0.2$, $6.7 \pm 0.3$ and $7.0 \pm 0.2$ (all $\times 10^{-34}$ J s), and another group uses a copper anode and finds $4.8 \pm 0.3$, $5.1 \pm 0.2$, $4.7 \pm 0.4$ and $4.9 \pm 0.2$ (also all $\times 10^{-34}$ J s).

In [None]:
xvalues = np.array([1,2,3,4,5,6,7,8])
hvalues = np.array([6.8,6.9,6.7,7.0,4.8,5.1,4.7,4.9])
hmean = np.mean(hvalues)

In [None]:
plt.figure()
plt.scatter(xvalues,hvalues)
plt.axhline(hmean, c='r', label='mean')  # horizontal line at the unweighted mean
plt.legend();  # show the label attached to the mean line

### Question:  Does taking the "mean" of the data above produce a better estimate of the true value of _h_?
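One way to approach this question is to look at each group on its own before combining them. The sketch below uses the molybdenum and copper values quoted above; the variable names are illustrative assumptions.

```python
import numpy as np

# Values quoted above, in units of 1e-34 J s
h_mo = np.array([6.8, 6.9, 6.7, 7.0])       # molybdenum anode
alpha_mo = np.array([0.2, 0.2, 0.3, 0.2])
h_cu = np.array([4.8, 5.1, 4.7, 4.9])       # copper anode
alpha_cu = np.array([0.3, 0.2, 0.4, 0.2])

mean_mo = np.mean(h_mo)
mean_cu = np.mean(h_cu)
gap = mean_mo - mean_cu                      # separation between the two groups
largest_alpha = max(alpha_mo.max(), alpha_cu.max())

print("Mo mean:", mean_mo)
print("Cu mean:", mean_cu)
print("gap / largest uncertainty:", gap / largest_alpha)
```

If the gap between the group means is several times larger than any quoted uncertainty, the two sets are not scattering about a common value, which bears on whether a combined mean is meaningful.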

## What if one group measured _h_, always with the same anode, and got the following values?
$6.82 \pm 0.16$, $6.9 \pm 0.2$, $4.8 \pm 0.2$, $5.13 \pm 0.16$, $6.72 \pm 0.13$, $4.7 \pm 0.2$, $4.9 \pm 0.2$ and $7.0 \pm 0.4$ (all $\times 10^{-34}$ J s).

In [None]:
xvalues = np.array([1,2,3,4,5,6,7,8])
hvalues = np.array([6.82,6.9,4.8,5.13,6.72,4.7,4.9,7.0])  # values as quoted above
hmean = np.mean(hvalues)

In [None]:
plt.figure()
plt.scatter(xvalues,hvalues)
plt.axhline(hmean, c='r', label='mean')  # horizontal line at the unweighted mean
plt.legend();  # show the label attached to the mean line

### Question:  Does taking the "mean" of the data above produce a better estimate of the true value of _h_?

#### What if we add the error bars to the plot?

In [None]:
xvalues = np.array([1,2,3,4,5,6,7,8])
hvalues = np.array([6.82,6.9,4.8,5.13,6.72,4.7,4.9,7.0])
alphavalues = np.array([0.16, 0.2, 0.2, 0.16, 0.13, 0.2, 0.2, 0.4])
# In class, add a calculation of weighted mean, call it wmean

In [None]:
plt.figure()
plt.errorbar(xvalues,hvalues,alphavalues, fmt = 'ko')
#plt.axhline(wmean, c='r', label='weighted mean')  #Uncomment this when you have determined weighted mean
plt.xlabel('experiment number')
plt.xlim(0,9)
plt.ylabel(r'h ($10^{-34}$ J s)');

Go back two cells and add a line or two to calculate the weighted mean. (Hint: a dot product makes the numerator easy. You can take the dot product of two arrays a and b either with numpy.dot, i.e., np.dot(a,b), or with the @ operator, i.e., a @ b.) Then plot the weighted mean on the plot above.
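One possible way to follow the hint is sketched below; it is not the only solution. The array `weights` is a name introduced here for illustration, while `hvalues`, `alphavalues` and `wmean` follow the cells above.

```python
import numpy as np

hvalues = np.array([6.82, 6.9, 4.8, 5.13, 6.72, 4.7, 4.9, 7.0])
alphavalues = np.array([0.16, 0.2, 0.2, 0.16, 0.13, 0.2, 0.2, 0.4])

weights = 1.0 / alphavalues**2            # w_i = 1 / alpha_i^2

# Numerator as a dot product: sum_i w_i * x_i
wmean = np.dot(weights, hvalues) / np.sum(weights)

# The @ operator gives the same dot product for 1-D arrays
wmean_at = (weights @ hvalues) / np.sum(weights)

print(wmean, wmean_at)
```

With `wmean` defined, the commented `plt.axhline(wmean, ...)` line in the plotting cell can be uncommented.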

Next, plot the residuals $x_i - \bar{x}$

And then plot the normalized residuals ($(x_i - \bar{x})/\alpha_i$)

Once both plots are done, square the normalized residuals and add them up. The sum is chi-squared: $\chi^2 = \sum_i \left(\frac{x_i - \bar{x}}{\alpha_i}\right)^2$.
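The quantities described above can be computed as in the following sketch, which assumes that $\bar{x}$ means the weighted mean of the data. The names `residuals`, `norm_residuals` and `chi2` are illustrative choices; plotting is left as a comment.

```python
import numpy as np

xvalues = np.array([1, 2, 3, 4, 5, 6, 7, 8])
hvalues = np.array([6.82, 6.9, 4.8, 5.13, 6.72, 4.7, 4.9, 7.0])
alphavalues = np.array([0.16, 0.2, 0.2, 0.16, 0.13, 0.2, 0.2, 0.4])

# Weighted mean (assumed here to be the x-bar in the residuals)
weights = 1.0 / alphavalues**2
wmean = (weights @ hvalues) / np.sum(weights)

residuals = hvalues - wmean                    # x_i - x-bar
norm_residuals = residuals / alphavalues       # (x_i - x-bar) / alpha_i
chi2 = np.sum(norm_residuals**2)               # sum of squared normalized residuals

# To plot, e.g.: plt.errorbar(xvalues, residuals, alphavalues, fmt='ko')
#                plt.scatter(xvalues, norm_residuals)
print("chi-squared:", chi2)
```

A normalized residual expresses each deviation in units of its own uncertainty, so values much larger than about 1 in magnitude flag points that sit far from the mean relative to their error bars.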