NOTE: In this notebook I use the `stats` sub-module of `scipy` for all statistics functions, including generation of random numbers. There are other modules with some overlapping functionality, e.g., the regular Python `random` module and the `scipy.random` module, but I do not use them here. The `stats` sub-module includes tools for a large number of distributions and a large and growing set of statistical functions, and it has a unified class structure. (And namespace issues are minimized.) See https://docs.scipy.org/doc/scipy/reference/stats.html.
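As a small illustration of the unified class structure mentioned in the note, the sketch below calls the same methods (`rvs`, `mean`, `std`, `pmf`) on two different discrete distributions. The variable names `die` and `browns` and the seed value are my own choices for illustration, not part of the notebook.

```python
from scipy import stats

# "Frozen" distributions: parameters are fixed when the object is created.
die = stats.randint(0, 6)        # uniform integers 0, 1, ..., 5
browns = stats.binom(60, 1 / 6)  # successes in 60 trials with p = 1/6

# The same method names work for both distributions.
for name, dist in [("randint", die), ("binom", browns)]:
    sample = dist.rvs(size=5, random_state=0)  # seed chosen arbitrarily
    print(name, sample, dist.mean(), dist.std(), dist.pmf(3))
```

Because every distribution exposes the same interface, swapping one model for another usually means changing only the line that creates the distribution object.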
import numpy as np
from scipy import stats
import matplotlib as mpl
import matplotlib.pyplot as plt
# The following IPython magic command puts figures in the notebook.
%matplotlib notebook
# M.L. modification of matplotlib defaults
# Changes can also be put in matplotlibrc file,
# or effected using mpl.rcParams[]
plt.style.use('classic')
plt.rc('figure', figsize = (6, 4.5)) # Reduces overall size of figures
plt.rc('axes', labelsize=16, titlesize=14)
plt.rc('figure', autolayout = True) # Adjusts subplot params for new size
To get started, sample one bag of M&Ms, and count the number of brown M&Ms.
Do this by generating 60 random integers from the set 0, 1, 2, 3, 4, 5, and let's say that "brown" = 0.
bag = stats.randint.rvs(0, 6, size=60)  # or np.random.randint(0, 6, 60)
print(bag)
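Each run of the cell above produces a different bag. If a repeatable bag is wanted (for example, when comparing results between students), the `random_state` argument of `rvs` can be given a seed. The sketch below assumes an arbitrary seed value of 12345; the names `bag_a` and `bag_b` are illustrative only.

```python
import numpy as np
from scipy import stats

seed = 12345  # arbitrary value, chosen only for illustration

# Two draws with the same seed give the same bag of 60 integers from 0 to 5.
bag_a = stats.randint.rvs(0, 6, size=60, random_state=seed)
bag_b = stats.randint.rvs(0, 6, size=60, random_state=seed)

print(np.array_equal(bag_a, bag_b))
```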
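Count the occurrences of each integer with `np.bincount(bag)`. The first element in the array is the number of occurrences of 0 in `bag`, the second element is the number of occurrences of 1, etc.
np.bincount(bag)
The number of brown M&Ms is the first element of the `bincount` array, i.e., `np.bincount(bag)[0]`.
np.bincount(bag)[0]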
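One detail worth keeping in mind: `np.bincount` returns an array only as long as the largest value present, so a bag that happens to contain no 5s yields five elements rather than six. The sketch below uses a small hand-made bag (not real data) to show this, along with the `minlength` argument and a direct count of zeros as an alternative.

```python
import numpy as np

# Hand-made example bag containing no 5s (for illustration only).
bag = np.array([0, 3, 3, 1, 0, 2, 4, 0])

counts = np.bincount(bag)                    # length 5: no entry for color 5
counts_full = np.bincount(bag, minlength=6)  # always one entry per color
n_brown = np.count_nonzero(bag == 0)         # counts the zeros directly

print(counts)       # [3 1 1 2 1]
print(counts_full)  # [3 1 1 2 1 0]
print(n_brown)      # 3
```

Indexing element 0 is safe either way, but `minlength=6` matters if the counts for all six colors are compared across bags.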
# Long version of sampling many bags
nb = 24 # number of bags
data_section = np.zeros(nb) # array to hold the data for a lab section
for i in range(nb):
    bag = stats.randint.rvs(0,6,size=60)    # one bag of 60 M&Ms
    data_section[i] = np.bincount(bag)[0]   # number of brown M&Ms in this bag
data_section
# Concise version of sampling many bags
nb = 24 # number of bags
data_section = np.array([np.bincount(stats.randint.rvs(0,6,size=60))[0] for i in range(nb)])
data_section
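A third option, sketched here as an alternative rather than as the notebook's method, draws all the bags at once as a two-dimensional array and counts the zeros in each row. The seed is an arbitrary choice for illustration.

```python
import numpy as np
from scipy import stats

nb = 24  # number of bags

# One row per bag, 60 M&Ms per row; seed chosen arbitrarily.
bags = stats.randint.rvs(0, 6, size=(nb, 60), random_state=1)

# Number of brown (0) M&Ms in each bag.
data_section = (bags == 0).sum(axis=1)
print(data_section)
```

This avoids the Python-level loop entirely, which matters little for 24 bags but helps when simulating many lab sections.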
np.mean(data_section), np.std(data_section), np.std(data_section)/np.sqrt(len(data_section)-1)
$\overline N = 9.8 \pm 0.6$
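The third quantity computed above, `np.std(data_section)/np.sqrt(len(data_section)-1)`, is the standard error of the mean. Since `np.std` divides by $N$ by default, dividing by $\sqrt{N-1}$ gives the same number as the more familiar sample standard deviation (with `ddof=1`) divided by $\sqrt{N}$, which is also what `stats.sem` returns. The sketch below checks this on made-up counts, not on the notebook's data.

```python
import numpy as np
from scipy import stats

# Made-up brown-M&M counts, for illustration only.
data = np.array([9, 12, 10, 8, 11, 7, 13, 10])
n = len(data)

sem_notebook = np.std(data) / np.sqrt(n - 1)    # form used in this notebook
sem_sample = np.std(data, ddof=1) / np.sqrt(n)  # sample std / sqrt(N)
sem_scipy = stats.sem(data)                     # scipy's helper (ddof=1 default)

print(sem_notebook, sem_sample, sem_scipy)
```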
plt.figure()
nbins = 20
low = 0
high = 20
plt.hist(data_section, bins=nbins, range=(low, high))
plt.xlim(0,20)
plt.title("Histogram of brown M&Ms per bag - Single Section",fontsize=14)
plt.xlabel("Number of brown M&Ms")
plt.ylabel("Occurrences");
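For comparison with the histogram, note that the number of brown M&Ms in a bag of 60, with each color equally likely, follows a binomial distribution with $n = 60$ and $p = 1/6$. The sketch below, which is my addition and assumes exactly that model, computes the expected number of bags at each count for a section of 24 bags, along with the theoretical mean, standard deviation, and standard error of the mean.

```python
import numpy as np
from scipy import stats

nb = 24                       # bags per section
n_per_bag, p_brown = 60, 1 / 6

# Expected number of bags (out of nb) containing k brown M&Ms.
k = np.arange(0, 21)
expected = nb * stats.binom.pmf(k, n_per_bag, p_brown)

mean_theory = n_per_bag * p_brown                          # 10
std_theory = np.sqrt(n_per_bag * p_brown * (1 - p_brown))  # about 2.89
sem_theory = std_theory / np.sqrt(nb)                      # about 0.59

print(mean_theory, std_theory, sem_theory)
```

The values in `expected` can be drawn on top of the histogram (for example with `plt.plot(k + 0.5, expected, 'o')`, since the bins have unit width) to see how closely a single section follows the binomial prediction.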
`version_information` is from J.R. Johansson (jrjohansson at gmail.com); see Introduction to scientific computing with Python for more information and instructions for package installation.
`version_information` is installed on the Linux network at Bucknell.
%load_ext version_information
%version_information numpy, scipy, matplotlib