Statistical Inference¶
Statistics looks at some data and asks the following questions:
Hypothesis Testing: how well does the data match some assumed (null) distribution?
Point Estimation: if not well, what instance of some distributional class does it match well?
Uncertainty Estimation: are there reasonable competitors to that best guess distribution?
Sensitivity Analysis: do our results rely heavily on our distributional assumptions?
These questions are addressed through various statistical/computational methodologies, e.g.,
- Numerical Optimization
  - Maximum Likelihood
  - Least Squares
  - Expectation Maximization (EM)
- Simulation of Null Distributions
  - Bootstrapping
  - Permutation Testing
  - Monte Carlo Methods
- Estimation of Posterior Distributions
  - Markov Chain Monte Carlo (MCMC)
  - Variational Methods
- Nonparametric Estimation
  - Bayesian Nonparametrics
Is my coin fair?¶
Hypothesis testing¶
Pose your question (“Is this coin fair?”)
Find the relevant population (“‘Flip results’ from this coin”)
Specify a null hypothesis \(H_0\) (“The chance of heads is 50%”)
Choose a test statistic that is informative about \(H_0\) (“The number of heads observed”)
Collect data (“Flip the coin \(n\) times”)
Calculate the test statistic (“Count the total heads flipped”)
Reject the null hypothesis if the test statistic looks “strange” compared to its sampling distribution under the null hypothesis; otherwise, fail to reject the null hypothesis
Simulating Data
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st
n = 100
pcoin = 0.62 # actual value of p for coin
results = st.bernoulli(pcoin).rvs(n)
h = sum(results)
print("We observed %s heads out of %s"%(h,n))
We observed 67 heads out of 100
Null Hypothesis
p = 0.5
rv = st.binom(n,p)
mu = rv.mean()
sd = rv.std()
print("The expected distribution for a fair coin is mu=%s, sd=%s"%(mu,sd))
The expected distribution for a fair coin is mu=50.0, sd=5.0
Binomial Test
print("binomial test p-value: %s"%st.binom_test(h, n, p))
binomial test p-value: 0.000873719836912
Z-Test (with continuity correction)
z = (h - 0.5 - mu)/sd  # subtracting 0.5 is the continuity correction for the discrete binomial
print("normal approximation p-value: %s" % (2*(1 - st.norm.cdf(z))))
normal approximation p-value: 0.000966848284768
Permutation (Simulation) Test
nsamples = 100000
xs = np.random.binomial(n, p, nsamples)  # draw the test statistic under the null
print("simulation p-value - %s" % (2*np.sum(xs >= h)/xs.size))  # two-sided via doubling
simulation p-value - 0.00062
Note
EXERCISE
Does anyone know what a p-value is?
Hint: it is not the probability that the null hypothesis is false. (Why?)
Hint: it is not the probability that the test wrongly rejected the null hypothesis. (Why?)
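One way to probe these hints is a minimal sketch (reusing `n`, `p`, and the imports from above; the seed and number of experiments are arbitrary choices): simulate many experiments in which \(H_0\) is true and look at how often it gets rejected anyway.
rng = np.random.default_rng(0)
n_experiments = 10000
# simulate many experiments in which H0 ("the coin is fair") is true
heads = rng.binomial(n, p, size=n_experiments)
pvals = np.array([st.binomtest(int(k), n, p).pvalue for k in heads])
# at most alpha of these true-null experiments are rejected (a bit fewer
# here, since the discrete binomial test is conservative), so a small
# p-value cannot be read as the probability that H0 is false
print("fraction rejected at alpha=0.05: %.3f" % np.mean(pvals < 0.05))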
Maximum Likelihood Estimation (MLE)¶
# resample the observed flips with replacement (the nonparametric bootstrap)
bs_samples = np.random.choice(results, (nsamples, len(results)), replace=True)
bs_ps = np.mean(bs_samples, axis=1)  # MLE of p within each bootstrap sample
bs_ps.sort()
print("Maximum Likelihood Estimate: %s" % np.mean(results))
print("Bootstrap CI: (%.4f, %.4f)" % (bs_ps[int(0.025*nsamples)], bs_ps[int(0.975*nsamples)]))
Maximum Likelihood Estimate: 0.67
Bootstrap CI: (0.5800, 0.7600)
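Here the MLE has a closed form (the sample proportion \(h/n\)), but the “Numerical Optimization” route from the opening list applies to models that do not. A minimal sketch, assuming the `results` array from above (the bounds just keep \(p\) strictly inside \((0, 1)\)), recovers the same estimate by minimizing the negative log-likelihood:
from scipy.optimize import minimize_scalar

def neg_log_lik(p_):
    # Bernoulli negative log-likelihood for the observed flips
    return -np.sum(st.bernoulli(p_).logpmf(results))

opt = minimize_scalar(neg_log_lik, bounds=(1e-6, 1 - 1e-6), method='bounded')
print("numerical MLE: %.4f" % opt.x)  # matches h/n up to optimizer tolerance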
Bayesian Estimation¶
The Bayesian approach estimates the posterior distribution (i.e., the updated belief about the parameters given the prior belief and the observed data) and uses it to make point and interval estimates about the parameters. The calculations we demonstrate here have analytic solutions (https://en.wikipedia.org/wiki/Closed-form_expression). For most real-life problems the necessary statistical models are more complex, and estimation makes use of advanced numerical simulation methods.
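Concretely, with a \(\mathrm{Beta}(a, b)\) prior on the heads probability and \(h\) heads in \(n\) flips, conjugacy gives the posterior \(\mathrm{Beta}(a + h,\, b + n - h)\) in closed form; its mode, the maximum a posteriori (MAP) estimate, is \((h + a - 1)/(n + a + b - 2)\), which is what the code below computes.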
fig = plt.figure()
ax = fig.add_subplot(111)
a, b = 10, 10  # Beta(10, 10) prior: weakly centered on a fair coin
prior = st.beta(a, b)
post = st.beta(h+a, n-h+b)  # conjugate posterior: Beta(h+a, n-h+b)
ci = post.interval(0.95)    # central 95% credible interval
map_ = (h+a-1.0)/(n+a+b-2.0)  # posterior mode (MAP estimate)
xs = np.linspace(0, 1, 100)
ax.plot(xs, prior.pdf(xs), label='Prior')
ax.plot(xs, post.pdf(xs), label='Posterior')
ax.axvline(p, c='red', linestyle='dashed', alpha=0.4)  # null value p = 0.5
ax.set_xlim([0, 1])
ax.axhline(0.3, ci[0], ci[1], c='black', linewidth=2, label='95% CI');
ax.axvline(map_, c='blue', linestyle='dashed', alpha=0.4)  # MAP estimate
ax.legend()
plt.savefig("coin-toss.png")
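As noted above, more complex models rarely admit conjugate closed forms, and the posterior is instead estimated by simulation, e.g., MCMC. Here is a minimal random-walk Metropolis sketch for this same coin posterior (illustrative only: the proposal scale, seed, and burn-in length are arbitrary choices, and real analyses would typically use a dedicated library such as PyMC or Stan):
def log_post(theta):
    # unnormalized log posterior: Beta(a, b) prior + binomial likelihood
    if not 0 < theta < 1:
        return -np.inf
    return st.beta(a, b).logpdf(theta) + st.binom(n, theta).logpmf(h)

rng = np.random.default_rng(42)
theta, samples = 0.5, []
for _ in range(20000):
    proposal = theta + rng.normal(0, 0.05)   # random-walk proposal
    # Metropolis accept/reject step
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    samples.append(theta)

samples = np.array(samples[5000:])           # discard burn-in
print("MCMC posterior mean: %.3f" % samples.mean())
print("MCMC 95%% CI: (%.3f, %.3f)" % tuple(np.percentile(samples, [2.5, 97.5])))
The posterior mean and interval from such a sampler should land close to the conjugate results plotted above.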
Note
EXERCISE
Use the Python code above to play around with the prior specification a little bit. Does it seem to influence the results of the analysis (i.e., the resulting posterior distribution)? If so, how? How do you feel about that?
Note
CLASS DISCUSSION
Describe the role of Bernoulli and binomial distributions in our above example.
Describe the underlying hypothesis and the character of the tests assessing it.
Describe how the p-values are used to assess \(H_0\). (Hint: assume a significance level of \(\alpha=0.05\)).
Compare and contrast the Bayesian and Frequentist estimation paradigms.
Does anyone have other examples (not coin-flipping) where these tools might be applied?