Misconceptions and useful counterexamples in statistics

This page contains an assortment of useful or informative examples/counterexamples in probability and statistics. I plan to continue updating this page as I come across new examples.

A Bayes factor which contradicts the posterior

The following is an example of what can happen when Bayes factors are used to test point nulls (their only use in psychology).

Courtesy of Stone (1997), consider a binomial experiment involving n = 527,135 trials with k = 106,298 successes, where we want to test the null hypothesis that p = 0.2. Assuming a uniform prior on p under the alternative, the posterior distribution is \mathrm{Beta}(106299, 420838), yielding a Bayes factor of 8.11 in favor of the null, which Jeffreys calls “substantial” evidence for the null hypothesis. But what does the posterior actually look like? It is concentrated around 0.2017 with a standard deviation of roughly 0.00055, so 0.2 sits about three standard deviations below the posterior mean.
This would seem to definitively exclude 0.2 (in fact, p > 0.2 with 99.9\% posterior probability). A classical hypothesis test likewise rejects 0.2 with a p-value of 0.0028.
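The three numbers in this example can be roughly reproduced with the standard library alone. The sketch below is not Stone's own calculation: it assumes the alternative hypothesis puts a uniform prior on p (so the marginal likelihood of any k is 1/(n + 1)), it uses a normal approximation to the Beta posterior rather than the exact Beta CDF, and it uses a two-sided z-test for the classical p-value. All variable names are illustrative.

```python
import math

n, k, p0 = 527_135, 106_298, 0.2

def log_binom_pmf(k, n, p):
    """Log of the binomial probability mass function, computed via log-gamma."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Bayes factor for H0: p = p0 against H1: p ~ Uniform(0, 1) (assumed alternative).
# Under the uniform prior, the marginal likelihood of any k is 1 / (n + 1).
bf01 = math.exp(log_binom_pmf(k, n, p0) + math.log(n + 1))

# Normal approximation to the Beta(k + 1, n - k + 1) posterior (an approximation,
# not the exact Beta CDF).
a, b = k + 1, n - k + 1
post_mean = a / (a + b)
post_sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
z_post = (post_mean - p0) / post_sd
prob_p_above_p0 = 0.5 * math.erfc(-z_post / math.sqrt(2))

# Classical two-sided z-test of H0: p = p0.
z = (k - n * p0) / math.sqrt(n * p0 * (1 - p0))
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"BF01 ~ {bf01:.2f}")
print(f"P(p > {p0} | data) ~ {prob_p_above_p0:.4f}")
print(f"two-sided p-value ~ {p_value:.4f}")
```

The Bayes factor comes out at about 8 in favor of the null, while the same data put nearly all posterior mass above 0.2 and give a small classical p-value.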

A p-value which doesn’t approximate the probability of a type-1 error

The p-value is the probability of obtaining an effect at least as large as the observed effect if the null hypothesis were true. Unfortunately, this is not very well understood in psychology, and p-values are widely misinterpreted as the probability that the null hypothesis is true (some research suggests that close to 80% of psychology students and researchers hold some variant of this misconception).

Consider screening for breast cancer in a population of 1,000,000 people, where only 100 members of the population actually have cancer. Suppose further that our test is 99% accurate in the sense that, if a person has breast cancer, then we will detect it 99% of the time, and if a person does not, then we will give them a clean bill of health 99% of the time (so the sensitivity and specificity of the test are both 99%). Define the null hypothesis H_0 to be that a person is cancer-free. Suppose that we pick a person at random and obtain a positive test result. What is the p-value associated with this outcome? If H_0 were true, such a result would only be obtained 1% of the time, so our p-value is .01, which is highly significant. What is the probability that H_0 is actually true? Bayes' theorem gives us

    \begin{align*}
      p(\text{cancer}|\text{test positive}) &= \frac{ p(\text{test positive}|\text{cancer}) \cdot p(\text{cancer}) }{p(\text{test positive})} \\
      &= \frac{0.99 \cdot 0.0001}{0.99 \cdot 0.0001 + 0.01 \cdot 0.9999} \\
      &\approx 0.01
    \end{align*}

So we obtain a significant result with p = .01, and yet the probability that H_0 is true given the positive result (that is, the probability that rejecting it is a type-1 error) is about 0.99.
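The arithmetic behind this example is easy to check directly. The following sketch uses only the figures given above (100 cases in 1,000,000 people, sensitivity and specificity of 99%), so the false-positive rate is 1 - 0.99 = 0.01. The variable names are illustrative.

```python
population = 1_000_000
cases = 100
sensitivity = 0.99   # p(test positive | cancer)
specificity = 0.99   # p(test negative | no cancer)

prevalence = cases / population          # p(cancer) = 0.0001
false_positive_rate = 1 - specificity    # p(test positive | no cancer) = 0.01

# Total probability of a positive test, over both groups.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# Bayes' theorem.
p_cancer_given_positive = sensitivity * prevalence / p_positive
p_h0_given_positive = 1 - p_cancer_given_positive

print(f"p(cancer | positive) = {p_cancer_given_positive:.4f}")
print(f"p(H0 | positive)     = {p_h0_given_positive:.4f}")
```

The "p-value" of .01 here is just `false_positive_rate`. It stays the same whatever the prevalence is, whereas `p_h0_given_positive` depends heavily on it.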

A confidence interval which can’t contain the true value

Confidence intervals, much like p-values, are widely misinterpreted in a Bayesian way. In particular, it is common to interpret a 95% confidence interval for a parameter as having a 95% chance of containing the true value of that parameter, but this is incorrect, mostly because parameters are not generally considered to be random variables in frequentist statistics.

A confidence interval for a parameter \mu is, simply, a random interval that will contain the true value of \mu some specified percentage of the time. Consider the standard 95% confidence interval for a normal mean: if we continue drawing random samples and computing confidence intervals, then 95% of the confidence intervals will contain the true value of the mean. This is not the same as saying that some specific interval has a 95% chance of containing the true value (the probability is technically 0 or 1 by frequentist reasoning, since it either contains the true value or it doesn’t). The difference is hard to appreciate when it comes to the standard confidence interval for a mean, so consider the following example:

Construct a confidence interval \delta for the mean of a normally distributed random variable as follows: Take a biased coin (with p = .95 of returning heads) and flip it. If the coin returns heads, define \delta = (-\infty,\infty); otherwise, define \delta = \emptyset (the empty set). If we draw random samples and flip our coin over and over again, then 95% of our confidence intervals will contain the true value (since they contain everything), so this is a 95% confidence interval. What happens if the coin lands on tails? It’s still a 95% confidence interval, but it cannot possibly contain the true value, since it contains nothing!
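A small simulation makes the point concrete: both the standard interval and the coin-flip interval cover the true mean in about 95% of repetitions. This is only a sketch under assumptions not stated above: the true mean is 0, the standard deviation is known and equal to 1, each sample has 10 observations, and the standard interval is the known-variance z-interval. The function names are hypothetical.

```python
import math
import random

def coverage(make_interval, true_mu=0.0, sigma=1.0, n=10, reps=5000, seed=0):
    """Fraction of repetitions in which the interval contains true_mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        interval = make_interval(sample, sigma, rng)
        # None stands for the empty set, which contains nothing.
        if interval is not None and interval[0] < true_mu < interval[1]:
            hits += 1
    return hits / reps

def standard_interval(sample, sigma, rng):
    """Usual 95% z-interval for a normal mean with known sigma (assumed)."""
    mean = sum(sample) / len(sample)
    half_width = 1.96 * sigma / math.sqrt(len(sample))
    return (mean - half_width, mean + half_width)

def coin_interval(sample, sigma, rng):
    """Ignore the data: the whole real line on heads (p = .95), else empty."""
    if rng.random() < 0.95:
        return (-math.inf, math.inf)
    return None

print("standard interval coverage: ", coverage(standard_interval))
print("coin-flip interval coverage:", coverage(coin_interval))
```

Both procedures have the same long-run coverage, but any single coin-flip interval contains the true value with certainty or not at all.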

Uncorrelated is not independent

Independence implies uncorrelatedness, but not vice versa. For example, let X be standard normal, and let Y = X^2. Then X and Y are dependent, but uncorrelated, since

    \begin{align*}
      \text{cov}(X,Y) &= \mathrm{E}(XY) - \mathrm{E}(X)\mathrm{E}(Y) \\
      &= \mathrm{E}(X^3) - 0 \\
      &= 0
    \end{align*}
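The same conclusion can be checked numerically. The sketch below draws standard normal values for X, sets Y = X^2, and compares the sample covariance (which should be near zero) with a simple sign of dependence: the chance that Y > 1 overall versus the chance that Y > 1 given |X| > 1. The sample size, seed, and choice of dependence check are illustrative assumptions.

```python
import random

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
ys = [x * x for x in xs]

def mean(values):
    return sum(values) / len(values)

def covariance(u, v):
    """Sample covariance (dividing by the sample size)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

cov_xy = covariance(xs, ys)

# Dependence: Y is a function of X, so conditioning on X changes Y's distribution.
p_y_gt_1 = mean([y > 1 for y in ys])
p_y_gt_1_given_large_x = mean([y > 1 for x, y in zip(xs, ys) if abs(x) > 1])

print(f"sample cov(X, Y)      = {cov_xy:.4f}")
print(f"P(Y > 1)              = {p_y_gt_1:.4f}")
print(f"P(Y > 1 | |X| > 1)    = {p_y_gt_1_given_large_x:.4f}")
```

The covariance is close to zero even though knowing X determines Y exactly.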

One of the few situations in which the reverse holds is when the random variables \{X_1,X_2,\dots,X_n\} are jointly normally distributed, in which case they are independent if they are uncorrelated.

Correlation is not transitive

If X is correlated with Y, and Y is correlated with Z, it does not follow that X is correlated with Z. For example, let X and Z be independent (and thus uncorrelated) normal random variables, and define Y = X + Z. Then both X and Z are positively correlated with Y, but the correlation between X and Z is zero.
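As a quick numerical check of this example, the sketch below simulates X and Z as independent standard normals and sets Y = X + Z. The unit variances are an assumption added here for concreteness; with equal variances the correlation of X (or Z) with Y is 1/\sqrt{2}, roughly 0.71. The sample size and seed are arbitrary.

```python
import math
import random

rng = random.Random(0)
N = 50_000
xs = [rng.gauss(0.0, 1.0) for _ in range(N)]   # X, assumed standard normal
zs = [rng.gauss(0.0, 1.0) for _ in range(N)]   # Z, independent of X
ys = [x + z for x, z in zip(xs, zs)]           # Y = X + Z

def correlation(u, v):
    """Pearson sample correlation coefficient."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

corr_xy = correlation(xs, ys)
corr_zy = correlation(zs, ys)
corr_xz = correlation(xs, zs)

print(f"corr(X, Y) = {corr_xy:.3f}")
print(f"corr(Z, Y) = {corr_zy:.3f}")
print(f"corr(X, Z) = {corr_xz:.3f}")
```

X and Z are each clearly correlated with Y, while their correlation with each other stays near zero.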