Hypothesis testing for population proportion using one sample.

Here we are concerned with making an inference about the proportion of a population that falls into one of two categories (like Success or Failure, Pro-Villar or Anti-Villar), using data obtained from a sample of size n.

The null hypothesis is H_0: p_0 = \pi, where p_0 is the population proportion and \pi is its hypothesized value.

Mean and variance of sample proportion

From the binomial distribution, we know that under H_0 the mean of the sample proportion is \pi and its variance is \pi (1 - \pi) / n.

In terms of counts X, the mean is n\pi and the variance is n\pi (1-\pi).

An estimate obtained from a sample of size n is denoted by p (computed as X/n, where X is the number of successes).

Standard error of estimated p

The standard error (that is, the standard deviation) of the estimated proportion is given by \sqrt{\pi (1-\pi) /n}.
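The formulas above can be checked numerically with a minimal Python sketch. The function name `proportion_moments` and the values \pi = 0.5, n = 100 are made up for illustration only.

```python
import math


def proportion_moments(pi, n):
    """Return (mean, variance, standard error) of the sample proportion X/n
    when X is binomial with parameters n and pi."""
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    variance = pi * (1.0 - pi) / n
    return pi, variance, math.sqrt(variance)


# Hypothetical inputs, not taken from any real data set.
mean, variance, se = proportion_moments(0.5, 100)
print(mean, variance, se)
```

Note that the standard error shrinks like 1/\sqrt{n}, so quadrupling the sample size only halves it.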

The normalized test statistic.

Assuming n is large enough, the normal approximation can be used and the test statistic would be given by

Z_{test} = \frac{p - \pi}{\sqrt{\frac{\pi (1-\pi)}{n}}}.

In terms of counts, it is Z_{test} = \frac{X - n\pi}{\sqrt{n\pi (1 - \pi)}}
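As a quick check that the two forms of the statistic agree, here is a small Python sketch. The helper names and the numbers (60 successes in 100 trials, hypothesized \pi = 0.5) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def z_from_proportion(p, pi, n):
    """Z statistic from the sample proportion p = X/n."""
    return (p - pi) / math.sqrt(pi * (1.0 - pi) / n)


def z_from_count(x, pi, n):
    """Z statistic from the raw count of successes X."""
    return (x - n * pi) / math.sqrt(n * pi * (1.0 - pi))


# Hypothetical sample: 60 successes out of n = 100, H_0: pi = 0.5.
x, n, pi = 60, 100, 0.5
print(z_from_proportion(x / n, pi, n))
print(z_from_count(x, pi, n))
```

Both calls give the same value (up to floating-point rounding), since the count form is just the proportion form with numerator and denominator multiplied by n.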

Critical values and acceptance regions.

Let z_{crit} be the critical value(s) corresponding to the level of significance \alpha. For example, the lower and upper critical values for a two-sided test at the 5 percent level of significance are -1.96 and 1.96, while the critical value for a right-sided test is 1.64.

Then for a two-sided test, we accept (more precisely, fail to reject) H_0 if Z_{test} \in [-z_{crit}, z_{crit}], and reject it otherwise.

For convenience, here are the acceptance regions or intervals for various common values of \alpha.

\alpha | left-sided test, H_1: p_0 \le \pi | two-sided test, H_1: p_0 \ne \pi | right-sided test, H_1: p_0 \ge \pi
0.01 | [-2.33, \infty) | [-2.58, 2.58] | (-\infty, 2.33]
0.05 | [-1.64, \infty) | [-1.96, 1.96] | (-\infty, 1.64]
0.10 | [-1.28, \infty) | [-1.64, 1.64] | (-\infty, 1.28]
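For values of \alpha not listed, the acceptance region can be computed directly from the standard normal quantile function. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `acceptance_region` and its `tail` argument are my own labels, not part of any library.

```python
from statistics import NormalDist


def acceptance_region(alpha, tail="two"):
    """Return (lower, upper) bounds of the acceptance region for Z_test.

    tail is "left", "two" or "right", naming the side(s) of the
    alternative hypothesis.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return (-z_crit, z_crit)
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    if tail == "left":
        return (-z_crit, float("inf"))
    if tail == "right":
        return (float("-inf"), z_crit)
    raise ValueError("tail must be 'left', 'two' or 'right'")


for alpha in (0.01, 0.05, 0.10):
    print(alpha,
          acceptance_region(alpha, "left"),
          acceptance_region(alpha, "two"),
          acceptance_region(alpha, "right"))
```

The printed bounds carry more decimals than a hand table, so expect small differences from two-decimal rounded values.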

The p-value of the test.

Suppose that we obtained the sample proportion p. Convert this to a test statistic Z_{test}. Then the p-value would be given as

Condition | Formula
Z_{test} \ge 0 | 1 - pnorm(Z_{test})
Z_{test} \le 0 | pnorm(Z_{test})

Here pnorm(Z_{test}) is the area under the standard normal curve over (-\infty, Z_{test}). The two-sided test has a p-value twice the one-sided p-value given above.
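The same rule can be written in Python, with `NormalDist().cdf` from the standard library playing the role of pnorm. This is only a sketch: the function name `p_value` and the input Z_{test} = 2.0 are made up for illustration.

```python
from statistics import NormalDist


def p_value(z_test, two_sided=False):
    """Tail-area p-value for a standard normal test statistic."""
    area_below = NormalDist().cdf(z_test)  # plays the role of pnorm(z_test)
    # Use the upper tail for non-negative Z and the lower tail otherwise.
    tail = 1.0 - area_below if z_test >= 0 else area_below
    return 2.0 * tail if two_sided else tail


z = 2.0  # hypothetical test statistic
print(p_value(z))                  # one-sided p-value
print(p_value(z, two_sided=True))  # twice the one-sided value
```

For Z_{test} = 2.0 the one-sided p-value is about 0.023 and the two-sided one about 0.046, so the two-sided test rejects at the 5 percent level but not at the 1 percent level.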

The margin of error

The confidence interval corresponding to a (1 - \alpha) confidence level is centered on p and has lower and upper confidence limits p \pm z_{crit} (se), where z_{crit} is the two-sided critical value and se is the standard error of the sample proportion (estimated with p in place of the unknown \pi). The margin of error is z_{crit} (se).
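A minimal Python sketch of the interval follows. It assumes the standard error is estimated by plugging the sample proportion p in for \pi, and the sample (60 successes out of 100) is hypothetical. The function name is my own.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(x, n, alpha=0.05):
    """Return (lower, upper, margin_of_error) for a (1 - alpha) interval."""
    p = x / n
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    # Assumption: pi is unknown here, so p is used in the standard error.
    se = math.sqrt(p * (1.0 - p) / n)
    margin = z_crit * se
    return p - margin, p + margin, margin


# Hypothetical sample: 60 successes out of 100 at the 95 percent level.
lower, upper, margin = proportion_confidence_interval(60, 100)
print(round(lower, 3), round(upper, 3), round(margin, 3))
```

With these numbers the margin of error is roughly 0.096, so the interval runs from about 0.504 to 0.696.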

Examples here!! TBD.

Had a tough time sorting out the LaTeX problems! They basically involve the '<' sign inside LaTeX formulas. I changed it to \le so the HTML display would be all right, so the \le and \ge in the alternative hypotheses above should be read as strict inequalities.
