
Comparing Two Means

Research often involves gathering information about two populations in order to compare them. In statistical inference, two widely used tools for comparing two population parameters are tests of significance and confidence intervals. To compare two groups, we perform a hypothesis test, which is built on two hypotheses. When comparing two independent groups, the null hypothesis is that the two group means are the same, and the alternative hypothesis is that the two means are different.


Definition

A mean is the ratio of the sum of all observations to the total number of observations. Researchers compare two sample means, or two population means, to gather comparative information for a deeper study.

Comparing Two Population Means

When we compare two means, one important thing must be checked first: whether the samples whose means are being compared are dependent or independent. The choice of test used to compare the two means depends on which of these applies.

Statistically Significant Difference Between Two Means

Suppose two samples of sizes $n_1$ and $n_2$ are drawn from populations whose means $\mu_1$ and $\mu_2$ are unknown and whose standard deviations $\sigma_1$ and $\sigma_2$ are known. The test statistic used to compare the means is the two-sample z statistic, given by the formula

$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

When the populations are normal or the samples are large, this statistic follows the standard normal distribution, $N(0, 1)$.

In this case, the null hypothesis states that the means are equal ($\mu_1 = \mu_2$), so the term $\mu_1 - \mu_2$ in the numerator is zero under the null hypothesis. The alternative hypothesis may be either one-sided or two-sided.
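The z statistic above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The function name `two_sample_z`, the parameter `delta0` (the hypothesized difference $\mu_1 - \mu_2$, zero under the null hypothesis) and the sample numbers are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return (z, two-sided p-value) for H0: mu1 - mu2 = delta0.

    sigma1 and sigma2 are the KNOWN population standard deviations.
    """
    # Standard error of the difference between the two sample means.
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((xbar1 - xbar2) - delta0) / se
    # Two-sided p-value from the standard normal distribution N(0, 1).
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical summary statistics, for illustration only.
z, p = two_sample_z(xbar1=52.0, xbar2=50.0, sigma1=4.0, sigma2=3.0, n1=40, n2=50)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

For a one-sided alternative, the p-value would instead be the single tail area beyond $z$ in the direction of the alternative.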

Student’s T-test for the Comparison of Two Means

This is one of the most commonly used tests for small data sets. For example, it can be applied to results obtained by analysing two samples X and Y with the same method, in order to check whether the percentage of analyte is the same in both X and Y.

The test has two possible outcomes: the null hypothesis is either rejected or retained. Generally, the null hypothesis states that any discrepancy or difference between the results is due only to random, non-systematic errors. The alternative hypothesis is the opposite: the difference is not due to random error alone.

The result of a significance test is stated at a confidence level chosen in advance. The most commonly used confidence levels are 90%, 95% and 99%, of which 95% is the most common. A 95% confidence level means that, when the null hypothesis is actually true, the test wrongly rejects it at most 5% of the time.

The t-test makes two assumptions.

1. The random errors come from a normally distributed population.

2. There is no significant difference between the standard deviations of the two samples.

We use the following equations to calculate the two means and the corresponding standard deviations.

$\bar{x}_A = \sum_{j=1}^{n_A} \frac{x_j}{n_A}$

$\bar{x}_B = \sum_{j=1}^{n_B} \frac{x_j}{n_B}$

$s_A = \sqrt{\sum_{j=1}^{n_A} \frac{(\bar{x}_A - x_j)^2}{n_A - 1}}$

$s_B = \sqrt{\sum_{j=1}^{n_B} \frac{(\bar{x}_B - x_j)^2}{n_B - 1}}$

Now we calculate the pooled estimate of the standard deviation, $s_{AB}$, using the formula below.

$s_{AB} = \sqrt{\frac{(n_A - 1) s_A^2 + (n_B - 1) s_B^2}{n_A + n_B - 2}}$

Finally, we calculate the experimental t value, $t_{\exp}$, using the formula below.

$t_{\exp} = \frac{|\bar{x}_A - \bar{x}_B|}{s_{AB} \sqrt{\frac{1}{n_A} + \frac{1}{n_B}}}$

This $t_{\exp}$ value is then compared with the theoretical (critical) value $t_{th}$, which corresponds to the chosen level of confidence and to $N = n_A + n_B - 2$ degrees of freedom. If the experimental value of t is greater than the theoretical value of t, the null hypothesis is rejected; otherwise it is retained.
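The steps above (means, standard deviations, pooled standard deviation, experimental t value, comparison with the critical value) can be sketched as follows. This is an illustrative sketch, not a reference implementation. The function name `pooled_t` and the two data sets are made up for the example, and the critical value 2.306 is the standard tabulated two-sided 95% value for 8 degrees of freedom, entered by hand rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def pooled_t(sample_a, sample_b):
    """Return (t_exp, degrees_of_freedom) for the equal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # statistics.stdev uses the (n - 1) denominator, matching s_A and s_B.
    s_a, s_b = stdev(sample_a), stdev(sample_b)
    # Pooled estimate of the standard deviation, s_AB.
    s_ab = sqrt(((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / (n_a + n_b - 2))
    t_exp = abs(mean(sample_a) - mean(sample_b)) / (s_ab * sqrt(1 / n_a + 1 / n_b))
    return t_exp, n_a + n_b - 2


# Hypothetical analyte percentages measured on samples X and Y.
x = [10.1, 10.3, 9.9, 10.2, 10.0]
y = [10.4, 10.6, 10.5, 10.3, 10.7]

t_exp, dof = pooled_t(x, y)
t_crit = 2.306  # tabulated two-sided 95% critical value for 8 degrees of freedom
reject = t_exp > t_crit
print(f"t_exp = {t_exp:.3f}, dof = {dof}, reject H0: {reject}")
```

With these made-up numbers the experimental value exceeds the critical value, so the null hypothesis of equal means would be rejected at the 95% level.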

Confidence Interval for the Difference Between Two Means

There are two ways to calculate a confidence interval for the difference between two means. Which formula applies depends on whether the population standard deviations are assumed to be equal or unequal.

CASE 1:

When the standard deviations are assumed to be equal

This implies that $\sigma_1 = \sigma_2 = \sigma$ (say). If $\sigma$ is unknown, the appropriate two-sided $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$\bar{X}_1 - \bar{X}_2 \pm t_{(1 - \frac{\alpha}{2},\ n_1 + n_2 - 2)}\ s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

where $s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$ is the pooled standard deviation.

One can obtain the one-sided upper and lower confidence limits by replacing $\frac{\alpha}{2}$ with $\alpha$.
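As a rough illustration of Case 1, the sketch below computes the two-sided interval from summary statistics. The function name `pooled_ci` and the numbers are assumptions for the example. The t critical value is passed in by the caller, here 2.145, the tabulated two-sided 95% value for 14 degrees of freedom.

```python
from math import sqrt


def pooled_ci(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """Two-sided confidence interval for mu1 - mu2, equal variances assumed.

    t_crit is the t quantile for n1 + n2 - 2 degrees of freedom at the
    chosen confidence level, taken from a t table.
    """
    # Pooled standard deviation s_p.
    s_p = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    half_width = t_crit * s_p * sqrt(1 / n1 + 1 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


# Hypothetical summary statistics; 8 + 8 - 2 = 14 degrees of freedom.
lower, upper = pooled_ci(xbar1=20.0, xbar2=18.0, s1=2.0, s2=2.0,
                         n1=8, n2=8, t_crit=2.145)
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

An interval that contains zero, as this one does, is consistent with the two population means being equal at the chosen confidence level.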

CASE 2:

When the standard deviations are assumed to be unequal

That is, $\sigma_1 \neq \sigma_2$ and both are unknown. In this case the appropriate two-sided $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$\bar{X}_1 - \bar{X}_2 \pm t_{(1 - \frac{\alpha}{2},\ v)} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

Here $v$ is the approximate number of degrees of freedom, given by

$v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2 (n_1 - 1)} + \frac{s_2^4}{n_2^2 (n_2 - 1)}}$
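The degrees-of-freedom formula for Case 2 is easy to get wrong by hand, so a short sketch may help. The function names `welch_df` and `unequal_var_ci` and the numbers are assumptions for the example. Because $v$ is usually not an integer, the sketch rounds it down before looking up the critical value, which is a common conservative choice and not something the formula itself requires. The value 2.160 is the tabulated two-sided 95% t value for 13 degrees of freedom.

```python
from math import floor, sqrt


def welch_df(s1, s2, n1, n2):
    """Approximate degrees of freedom v for unequal standard deviations."""
    a, b = s1**2 / n1, s2**2 / n2
    # a**2 / (n1 - 1) equals s1**4 / (n1**2 * (n1 - 1)), as in the formula.
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


def unequal_var_ci(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """Two-sided confidence interval for mu1 - mu2, unequal variances."""
    half_width = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - half_width, diff + half_width


# Hypothetical summary statistics, for illustration only.
v = welch_df(s1=2.0, s2=4.0, n1=10, n2=10)
print(f"v = {v:.3f}, rounded down to {floor(v)}")

# 2.160 is the tabulated two-sided 95% value for 13 degrees of freedom.
lower, upper = unequal_var_ci(xbar1=25.0, xbar2=21.0, s1=2.0, s2=4.0,
                              n1=10, n2=10, t_crit=2.160)
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

When the two standard deviations and sample sizes are equal, `welch_df` returns $n_1 + n_2 - 2$, which matches the degrees of freedom used in Case 1.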