## Import relevant packages
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
Often, we will find that we need to generate a regularly spaced array of numbers, such as when creating an x-axis or a set of independent variables.
The np.linspace function is really valuable in this regard. What does it do? Look up its syntax by running the following command:
?np.linspace
Run the following command
np.linspace(0,30)
and note what each argument does.
From the help file, identify required and optional arguments. Note that the number of samples returned is an optional argument. What is its default value?
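As a small illustration of the optional arguments, the sketch below passes the number of samples explicitly and also tries the `endpoint` keyword. The specific values (7 and 6 samples) are chosen only for illustration.

```python
import numpy as np

# Seven evenly spaced values from 0 to 30; both endpoints are included.
x = np.linspace(0, 30, num=7)
print(x)

# With endpoint=False the stop value (30) is excluded from the result.
# Six samples over the same interval then keep the spacing at 5.
x_open = np.linspace(0, 30, num=6, endpoint=False)
print(x_open)
```

Comparing the two printed arrays shows how `num` and `endpoint` together determine the spacing.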
Below, we will demonstrate a few methods for generating random numbers. Importantly, the random numbers are drawn from a given statistical distribution. We consider the Uniform, Normal (Gaussian), Binomial, and Poisson distributions.
# Sample n random numbers in interval [0.0,1.0]:
n = 10
stats.uniform.rvs(size=n) # "uniform" in the command indicates all values are equally likely
array([0.96428521, 0.22810387, 0.46150645, 0.02136174, 0.60182643, 0.80169691, 0.70136798, 0.04686054, 0.81178822, 0.08349542])
#?np.random.random_sample
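The call above samples from the default unit interval. A minimal sketch of sampling from a different interval is shown below, using the `loc` and `scale` keywords of `stats.uniform`. The endpoints `a` and `b` are hypothetical values chosen for illustration, and `random_state=0` is added only so the result is reproducible.

```python
from scipy import stats

# Hypothetical interval endpoints, chosen only for illustration.
a, b = 2.0, 5.0

# For stats.uniform, loc is the lower bound and scale is the interval width.
samples = stats.uniform.rvs(loc=a, scale=b - a, size=1000, random_state=0)

print(samples.min(), samples.max())  # both should lie between a and b
print(samples.mean())                # should be close to (a + b) / 2
```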
The probability density function of the normal (Gaussian) distribution is given by
$p(x) = \frac{1}{\sigma \sqrt{2\pi}}\exp\left[-\frac{(x-\bar{x})^2}{2\sigma^2}\right]$
What do the symbols $\sigma$ and $\bar{x}$ represent? The snippet of code below generates 10 samples from a normal distribution.
# Sampling from normal distribution
n = 10
mean = 10.
sigma = 2.
stats.norm.rvs(mean, sigma, size=n)
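One way to connect the density formula above to scipy is to evaluate it by hand and compare the result with `stats.norm.pdf`. The sketch below assumes the same `mean` and `sigma` values used in the sampling snippet. The names `p_manual` and `p_scipy` are introduced here only for illustration.

```python
import numpy as np
from scipy import stats

# Same parameter values as in the sampling snippet.
mean = 10.0
sigma = 2.0

# Points at which to evaluate the density.
x = np.linspace(0, 20, 200)

# Evaluate the density formula term by term.
p_manual = np.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Evaluate the same density with scipy.
p_scipy = stats.norm.pdf(x, mean, sigma)

# True if the two agree to within floating-point tolerance.
print(np.allclose(p_manual, p_scipy))
```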
# Sampling from normal distribution
n = 1000
mean = 10.
sigma = 2.
ynorm = stats.norm.rvs(mean, sigma, size=n)
x = np.linspace(0,20,200) # make an array of 200 evenly spaced values from 0 to 20
y = stats.norm.pdf(x, mean, sigma) # determine the value of the pdf at each of the points in 'x'
plt.title("pdf of normal distribution")
plt.xlabel("$x$")
plt.ylabel("$p(x)$")
plt.grid()
plt.plot(x, y);
Consider the following snippet of code
n = 1000
mean = 10.
sigma = 2.
ynorm = stats.norm.rvs(mean, sigma, size=n)
Before running it, describe what it will do.
Now generate a histogram of the samples using:
plt.hist(ynorm)
plt.xlabel("value")
plt.ylabel("occurrences")
plt.title("Histogram; equal sized bins")
The plt.hist function has a number of optional arguments. For example, you can normalize the histogram using density = True, and you can change the relative width of the bars using rwidth = ...
Try this snippet of code, and think about what each line and each optional argument does.
plt.hist(ynorm, density=True, rwidth=0.8, label='samples')
plt.plot(x, y, label = 'pdf');
plt.legend()
Label the x-axis "value" and the y-axis "occurrences".
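To see what normalization does to the bar heights, the sketch below uses `np.histogram`, which returns the bin counts without drawing anything. It assumes the same default of 10 equal-width bins as the histogram calls above, and `random_state=0` is added only for reproducibility.

```python
import numpy as np
from scipy import stats

# 1000 samples from a normal distribution with mean 10 and sigma 2.
ynorm = stats.norm.rvs(10.0, 2.0, size=1000, random_state=0)

# Raw occurrences per bin, and the bin edges.
counts, edges = np.histogram(ynorm)

# Normalized bar heights over the same bins.
density, _ = np.histogram(ynorm, density=True)

widths = np.diff(edges)
print(counts.sum())              # total number of samples
print((density * widths).sum())  # total area under the normalized histogram
```

The raw counts add up to the number of samples, while the normalized heights multiplied by the bin widths add up to one. This is why the normalized histogram can be compared directly with the pdf.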
The stats module implements dozens of other distributions. For example, to generate random values from a binomial distribution, you may use
stats.binom.rvs(n,p,size = 100)
where n (the number of trials) and p (the probability of success in each trial) are the two required parameters of the binomial distribution.
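A minimal sketch of the binomial call with concrete numbers follows. The names `n_trials` and `p_success` and their values are hypothetical, chosen only for illustration, and `random_state=0` is added for reproducibility.

```python
from scipy import stats

# Hypothetical parameter values, chosen only for illustration.
n_trials = 20    # number of trials (the n argument)
p_success = 0.3  # probability of success per trial (the p argument)

samples = stats.binom.rvs(n_trials, p_success, size=100, random_state=0)

# Every sample is a count of successes, so it lies between 0 and n_trials.
print(samples.min(), samples.max())

# The sample mean should be close to the expected value n * p.
print(samples.mean(), n_trials * p_success)
```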
stats.poisson.rvs(mean, size=100)
generates a sample of 100 random values from the Poisson distribution. Note that the Poisson distribution depends on a single parameter, namely the mean number of counts.
Experiment with generating random numbers and histograms of these distributions.
$\mu = 1$
$\mu = 4$
$\mu = 12$
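As one possible starting point for this experiment, the sketch below draws Poisson samples for the three values of $\mu$ listed above and compares the observed frequencies with `stats.poisson.pmf`. The sample size of 1000, the `results` dictionary, and `random_state=0` are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

results = {}
for mu in (1, 4, 12):
    samples = stats.poisson.rvs(mu, size=1000, random_state=0)

    # Fraction of samples equal to each integer count 0, 1, 2, ...
    freq = np.bincount(samples) / samples.size
    k = np.arange(freq.size)

    # Largest gap between the observed frequencies and the Poisson pmf.
    max_gap = np.abs(freq - stats.poisson.pmf(k, mu)).max()

    results[mu] = (samples.mean(), samples.var())
    print(mu, samples.mean(), samples.var(), max_gap)
```

For a Poisson distribution the mean and the variance are both equal to $\mu$, so the two printed sample statistics should be similar for each value of $\mu$.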