{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Prelude to Hughes & Hase Chapter 8, Part B\n", "\n", "In Part A of the Prelude to Chapter 8, we looked at the expected distribution \n", "of values of $\\chi^2$ for the fit of a straight line to 5 data points. In Part B we\n", "repeat exactly the same process, but the model is an exponential decay, and \n", "we fit the model to a larger number of data points.\n", "\n", "Marty Ligare, August 2020" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy import stats\n", "from scipy import optimize\n", "\n", "import matplotlib as mpl \n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Following is an Ipython magic command that puts figures in the notebook.\n", "%matplotlib notebook\n", " \n", "# M.L. modifications of matplotlib defaults\n", "# Changes can also be put in matplotlibrc file, \n", "# or effected using mpl.rcParams[]\n", "mpl.style.use('classic') \n", "plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures\n", "plt.rc('axes', labelsize=16, titlesize=14)\n", "plt.rc('figure', autolayout = True) # Adjusts supblot parameters for new size" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Functions\n", "Remember:\n", "$$\n", "\\chi^2 = \\sum_i\\frac{\\left(y(x_i) - y_i\\right)^2}{\\alpha_i^2}\n", "$$" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def f(x,q,r,s):\n", " '''Simple exponential decay function (with offset); nonlinear\n", " in the fit parameteres q, r, and s.'''\n", " \n", "def chi2(x,y,u,q,r,s):\n", " '''Chisquare as a function of data (x, y, and yerr=u), and model \n", " parameters q, r, and s.'''\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Set II" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "data = np.array([[0. , 7.46129812, 0.4 ],\n", " [0.2 , 5.85452157, 0.4 ],\n", " [0.4 , 6.07717576, 0.4 ],\n", " [0.6 , 5.4736062 , 0.4 ],\n", " [0.8 , 4.5904504 , 0.4 ],\n", " [1. , 4.79926158, 0.4 ],\n", " [1.2 , 4.55971739, 0.4 ],\n", " [1.4 , 3.60873722, 0.4 ],\n", " [1.6 , 2.86732949, 0.4 ],\n", " [1.8 , 3.54260122, 0.4 ],\n", " [2. , 2.13946228, 0.4 ],\n", " [2.2 , 2.22060587, 0.4 ],\n", " [2.4 , 1.63133617, 0.4 ],\n", " [2.6 , 2.46693153, 0.4 ],\n", " [2.8 , 2.32689348, 0.4 ],\n", " [3. , 1.5128004 , 0.4 ],\n", " [3.2 , 1.20809552, 0.4 ],\n", " [3.4 , 1.43849374, 0.4 ],\n", " [3.6 , 1.36242689, 0.4 ],\n", " [3.8 , 1.73797785, 0.4 ],\n", " [4. , 1.72819865, 0.4 ]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make a graph of the data (with errorbars)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make initial estimates for fit parameters and fit function to data (\"old business\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Graph best-fit function along with data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate and plot normalized residuals" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What confidence can we have in the model? (\"new business\")\n", "\n", "Normalized residuals look ok according to qualitative assessments \n", "mentioned, so precede to more quantitative measures.\n", "\n", "1. Assume the model is correct, and that any deviations \n", "from the model are due to the fact that data is the result of\n", "random sampling from normal probability distributions characterized by the \n", "known standard errors of the data points. This assumption is called\n", "the null hypothesis.\n", "2. Generate many simulated data sets based on the model, best-fit parameters, and uncertainties. (Use same values for $x_i$).\n", "3. Fit each of the simulated data sets and store the values of \n", "$\\chi^2$ in an array.\n", "4. Examine the histogram of simulated values of $\\chi^2$.\n", "Where does the value of $\\chi^2_{\\text{min}}$ from the fit to the \n", "orignal data fall within the histogram? What's the probability of \n", "getting values of $\\chi^2$ from the simulated data sets \n", "that are greater than the value given by the data that was actually measured. \n", "In other words, is our value of $\\chi^2$ a reasonable result from the model\n", "and the fluctuations we expect in our data?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Perform simulated experiments and print results" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make histogram of results from simulation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Discussion\n", "\n", "Again, the histogram gives \n", "a probability distribution for values $\\chi^2$ that \n", "minimize data sets consistent with the model and uncertainties.\n", "With more data points, the probability distribution is \n", "more symmetric than that in Part A.\n", "\n", "Again, the value of $\\chi^2$ obtained\n", "by fitting the original data is in the range of values we expect from \n", "the model and uncertainties. To be more quantitative, \n", "there is a 25% chance of getting a value of $\\chi^2$ that \n", "is 21.5 or bigger, so it's very reasonable to expect a value \n", "of about 21.5.\n", "\n", "There no reason to reject the exponential decay model based on the value of \n", " $\\chi^2 = 21.5$.\n", " \n", "On the other hand, what if our fit had returned a value of $\\chi^2 = 40$? (This \n", "is more than 3 standard deviations about the mean of the data in the historgram.) Zooming\n", "in on the histogram shows that there only 17 out of the 10,000 simulated data \n", "sets (0.0017%), that have a value of $\\chi^2$ greater than 40. It's pretty unlikely that we would get a value of $\\chi^2=40$ from our model and random flucuations.\n", "\n", "What if our fit had returned a value of $\\chi^2 = 4$? Inspection of the histrogram \n", "data shows that only 44 of the 10,000 simulated data sets returned a value less than \n", "4, or equivalently 99.56% of the trials returned a value of $\\chi^2$ greater than 4\\. This would be a good indication that our assumed error bars were too big." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Version Information \n", "`version_information` is from J.R. Johansson (jrjohansson at gmail.com).\n", "See Introduction to scientific computing with Python\n", "for more information and instructions for package installation.
\n", "\n", "`version_information` has been installed system-wide on Bucknell linux networked computers. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "%load_ext version_information" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/json": { "Software versions": [ { "module": "Python", "version": "3.7.8 64bit [GCC 7.5.0]" }, { "module": "IPython", "version": "7.17.0" }, { "module": "OS", "version": "Linux 3.10.0 1062.9.1.el7.x86_64 x86_64 with centos 7.8.2003 Core" }, { "module": "numpy", "version": "1.19.1" }, { "module": "scipy", "version": "1.5.0" }, { "module": "matplotlib", "version": "3.3.0" } ] }, "text/html": [ "
SoftwareVersion
Python3.7.8 64bit [GCC 7.5.0]
IPython7.17.0
OSLinux 3.10.0 1062.9.1.el7.x86_64 x86_64 with centos 7.8.2003 Core
numpy1.19.1
scipy1.5.0
matplotlib3.3.0
Thu Aug 20 15:47:28 2020 EDT
" ], "text/latex": [ "\\begin{tabular}{|l|l|}\\hline\n", "{\\bf Software} & {\\bf Version} \\\\ \\hline\\hline\n", "Python & 3.7.8 64bit [GCC 7.5.0] \\\\ \\hline\n", "IPython & 7.17.0 \\\\ \\hline\n", "OS & Linux 3.10.0 1062.9.1.el7.x86\\_64 x86\\_64 with centos 7.8.2003 Core \\\\ \\hline\n", "numpy & 1.19.1 \\\\ \\hline\n", "scipy & 1.5.0 \\\\ \\hline\n", "matplotlib & 3.3.0 \\\\ \\hline\n", "\\hline \\multicolumn{2}{|l|}{Thu Aug 20 15:47:28 2020 EDT} \\\\ \\hline\n", "\\end{tabular}\n" ], "text/plain": [ "Software versions\n", "Python 3.7.8 64bit [GCC 7.5.0]\n", "IPython 7.17.0\n", "OS Linux 3.10.0 1062.9.1.el7.x86_64 x86_64 with centos 7.8.2003 Core\n", "numpy 1.19.1\n", "scipy 1.5.0\n", "matplotlib 3.3.0\n", "Thu Aug 20 15:47:28 2020 EDT" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "version_information numpy, scipy, matplotlib" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.8" } }, "nbformat": 4, "nbformat_minor": 1 }