.. stats-shortcourse

Statistical Inference
===============================

Statistics looks at some data and asks the following questions:

* **Hypothesis Testing**: how well does the data match some assumed (null) distribution?
* **Point Estimation**: if not well, what instance of some distributional class does it match well?
* **Uncertainty Estimation**: are there reasonable competitors to that best-guess distribution?
* **Sensitivity Analysis**: do our results rely heavily on our distributional assumptions?

These questions are addressed through various statistical/computational methodologies, e.g.,

* Numerical Optimization

  * Maximum Likelihood
  * Least Squares
  * Expectation Maximization (EM)

* Simulation of Null Distributions

  * Bootstrapping
  * Permutation Testing
  * Monte Carlo Methods

* Estimation of Posterior Distributions

  * Markov Chain Monte Carlo (MCMC)
  * Variational Methods

* Nonparametric Estimation

  * Bayesian Nonparametrics

Is my coin fair?
----------------

Hypothesis testing
^^^^^^^^^^^^^^^^^^

0. Pose your question ("Is this coin fair?")
1. Find the relevant *population* ("'Flip results' from this coin")
2. Specify a *null hypothesis* :math:`H_0` ("The chance of heads is 50%")
3. Choose a *test statistic* informing :math:`H_0` ("The number of heads observed")
4. Collect data ("Flip the coin :math:`n` times")
5. Calculate the test statistic ("Count the total heads flipped")
6. *Reject the null hypothesis* if the test statistic looks "strange" compared to its
   *sampling distribution* under the null hypothesis; otherwise, *fail to reject the
   null hypothesis*

**Simulating Data**

.. code-block:: python

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.stats as st

    n = 100
    pcoin = 0.62  # actual (unknown to us) value of p for the coin
    results = st.bernoulli(pcoin).rvs(n)  # simulate n flips
    h = sum(results)
    print("We observed %s heads out of %s"%(h,n))

We observed 67 heads out of 100

**Null Hypothesis**

.. code-block:: python

    p = 0.5             # hypothesized probability of heads
    rv = st.binom(n,p)  # sampling distribution of the head count under H_0
    mu = rv.mean()
    sd = rv.std()
    print("The expected distribution for a fair coin is mu=%s, sd=%s"%(mu,sd))

The expected distribution for a fair coin is mu=50.0, sd=5.0

**Binomial Test**

.. code-block:: python

    # note: newer SciPy releases replace binom_test with st.binomtest(h, n, p).pvalue
    print("binomial test p-value: %s"%st.binom_test(h, n, p))

binomial test p-value: 0.000873719836912

**Z-Test** (with continuity correction)

.. code-block:: python

    z = (h-0.5-mu)/sd  # z-score of the observed head count, continuity corrected
    print("normal approximation p-value: %s"%(2*(1 - st.norm.cdf(z))))

normal approximation p-value: 0.000966848284768

**Permutation (Simulation) Test**

.. code-block:: python

    nsamples = 100000
    xs = np.random.binomial(n, p, nsamples)  # draw head counts under H_0
    print("simulation p-value - %s"%(2*np.sum(xs >= h)/(xs.size + 0.0)))

simulation p-value - 0.00062

.. note::

   **EXERCISE**

   Does anyone know what a **p-value** is?

   * Hint: it is *not* the probability that the null hypothesis is false. (Why?)
   * Hint: it is *not* the probability that the test wrongly rejected the null hypothesis. (Why?)

Maximum Likelihood Estimation (MLE)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

For Bernoulli flips the maximum likelihood estimate of the heads probability
:math:`p` is simply the observed proportion of heads, :math:`\hat{p} = h/n`.

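As a quick sanity check, the same estimate can be recovered by numerically maximizing
the binomial log-likelihood. This is only a minimal sketch: ``n = 100`` and ``h = 67``
are hard-coded to mirror the flips observed above, and ``scipy.optimize.minimize_scalar``
is just one convenient choice of optimizer.

.. code-block:: python

    import scipy.stats as st
    from scipy.optimize import minimize_scalar

    n, h = 100, 67  # flips and observed heads, matching the example above

    def neg_log_like(p):
        """Negative binomial log-likelihood of observing h heads in n flips."""
        return -st.binom.logpmf(h, n, p)

    # numerically maximize the likelihood over p in (0, 1)
    opt = minimize_scalar(neg_log_like, bounds=(1e-6, 1 - 1e-6), method="bounded")
    print("numerical MLE: %.4f (closed form h/n = %.4f)" % (opt.x, h / float(n)))
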
A bootstrap over the observed flips attaches a confidence interval to this point
estimate:

.. code-block:: python

    # resample the observed flips with replacement and recompute p-hat each time
    bs_samples = np.random.choice(results, (nsamples, len(results)), replace=True)
    bs_ps = np.mean(bs_samples, axis=1)
    bs_ps.sort()
    print("Maximum Likelihood Estimate: %s"%(np.sum(results)/float(len(results))))
    print("Bootstrap CI: (%.4f, %.4f)" % (bs_ps[int(0.025*nsamples)], bs_ps[int(0.975*nsamples)]))

Maximum Likelihood Estimate: 0.67

Bootstrap CI: (0.5800, 0.7600)

Bayesian Estimation
^^^^^^^^^^^^^^^^^^^

The Bayesian approach estimates the posterior distribution (i.e., the updated belief
about the parameters given the prior belief and the observed data) and uses it to make
point and interval estimates about the parameters. The calculations we demonstrate here
have analytic solutions, because the Beta prior is conjugate to the binomial likelihood.
For most real-life problems the necessary statistical models are more complex and
estimation makes use of advanced numerical simulation methods (see the MCMC sketch at
the end of this section).

.. code-block:: python

    fig = plt.figure()
    ax = fig.add_subplot(111)

    a, b = 10, 10                   # Beta(10, 10) prior on p
    prior = st.beta(a, b)
    post = st.beta(h+a, n-h+b)      # conjugate Beta posterior
    ci = post.interval(0.95)        # 95% credible interval
    map_ = (h+a-1.0)/(n+a+b-2.0)    # posterior mode (MAP estimate)

    # the curves are plotted against sample index 0..99, so the x-axis reads roughly as p*100
    xs = np.linspace(0, 1, 100)
    ax.plot(prior.pdf(xs), label='Prior')
    ax.plot(post.pdf(xs), label='Posterior')
    ax.axvline(mu, c='red', linestyle='dashed', alpha=0.4)  # mean head count under H_0
    ax.set_xlim([0, 100])
    # axhline's xmin/xmax are axis fractions; with xlim [0, 100] they line up with ci*100
    ax.axhline(0.3, ci[0], ci[1], c='black', linewidth=2, label='95% CI')
    ax.axvline(n*map_, c='blue', linestyle='dashed', alpha=0.4)  # MAP estimate
    ax.legend()
    plt.savefig("coin-toss.png")

.. figure:: ./images/coin-toss.png
   :scale: 75%
   :align: center
   :alt: coin-toss
   :figclass: align-center

.. note::

   **EXERCISE**

   Use the Python code above to play around with the *prior specification* a little
   bit. Does it seem to influence the results of the analysis (i.e., the resulting
   posterior distribution)? If so, how? How do you feel about that?

.. note::

   **CLASS DISCUSSION**

   1. Describe the role of the Bernoulli and binomial distributions in our example above.
   2. Describe the underlying hypothesis and the character of the tests assessing it.
   3. Describe how the p-values are used to assess :math:`H_0`. (Hint: assume a
      **significance level** of :math:`\alpha=0.05`.)
   4. Compare and contrast the Bayesian and Frequentist estimation paradigms.
   5. Does anyone have other examples (not coin-flipping) where these tools might be applied?
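
The Bayesian code above works because the Beta posterior is available in closed form.
As a minimal sketch of the numerical simulation methods mentioned in that section, a
random-walk Metropolis sampler recovers essentially the same posterior for the coin
example; the proposal scale (0.05), the 20,000 iterations, and the 5,000-draw burn-in
below are arbitrary illustrative choices.

.. code-block:: python

    import numpy as np
    import scipy.stats as st

    np.random.seed(42)
    n, h = 100, 67   # flips and observed heads, matching the example above
    a, b = 10, 10    # same Beta(10, 10) prior as above

    def log_post(p):
        """Unnormalized log posterior: log prior + log likelihood."""
        if p <= 0 or p >= 1:
            return -np.inf
        return st.beta(a, b).logpdf(p) + st.binom(n, p).logpmf(h)

    samples, p_curr = [], 0.5
    for _ in range(20000):
        p_prop = p_curr + np.random.normal(0, 0.05)  # symmetric random-walk proposal
        if np.log(np.random.rand()) < log_post(p_prop) - log_post(p_curr):
            p_curr = p_prop                          # accept the proposal
        samples.append(p_curr)

    samples = np.array(samples[5000:])               # discard burn-in draws
    print("posterior mean ~ %.3f, 95%% interval ~ (%.3f, %.3f)"
          % (samples.mean(), np.percentile(samples, 2.5), np.percentile(samples, 97.5)))

For this conjugate example the sampler simply reproduces the Beta(h+a, n-h+b) posterior
summarized above, but the same loop applies when no closed-form posterior exists.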