The empirical rule is of great importance in statistics. It states that, for a normal distribution, nearly all of the data falls within 3 standard deviations of the mean. More specifically, about 68% of the data falls within one standard deviation of the mean, about 95% falls within two standard deviations, and about 99.7% falls within three standard deviations.
The rule is therefore also known as the 68-95-99.7 rule or the three-sigma rule.
In other words, according to the empirical rule:
1) 68.27% of the values lie within 1 standard deviation of the mean, i.e. between 1 standard deviation below and 1 standard deviation above the mean.
2) 95.45% of the values lie within 2 standard deviations of the mean, i.e. between 2 standard deviations below and 2 standard deviations above the mean.
3) Almost all (99.73%) of the values lie within 3 standard deviations of the mean, i.e. between 3 standard deviations below and 3 standard deviations above the mean.
For a better understanding, have a look at the following diagram explaining the empirical rule:
This rule provides a quick, rough estimate of the spread of data from a normal distribution when the mean and standard deviation are given. The empirical rule is also used as an indication of whether a distribution is normal: when a number of data points fall outside the three-standard-deviation range, the distribution may be non-normal.
The empirical rule can be put mathematically in the following form:
Pr($\mu - \sigma \leq x \leq \mu + \sigma$) ≈ 68.27 %
Pr($\mu - 2 \sigma \leq x \leq \mu + 2 \sigma$) ≈ 95.45 %
Pr($\mu - 3 \sigma \leq x \leq \mu + 3\sigma$) ≈ 99.73 %
where $\mu$ and $\sigma$ represent the mean and the standard deviation.
We may also write it in the following way:
|Distances from mean |Percentage of values
|$\mu \pm \sigma$ |68.27 %
|$\mu \pm 2\sigma$ |95.45 %
|$\mu \pm 3\sigma$ |99.73 %
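The three percentages can be checked numerically. The following is a minimal sketch, assuming an exactly normal distribution, for which the probability of lying within k standard deviations of the mean equals erf(k/√2). The function name `coverage_within` is only an illustrative choice.

```python
import math


def coverage_within(k: float) -> float:
    """Probability that a normal variable lies within k standard
    deviations of its mean: P(mu - k*sigma <= x <= mu + k*sigma)."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints 68.27 %, 95.45 % and 99.73 % for k = 1, 2, 3
    print(f"within {k} standard deviation(s): {100 * coverage_within(k):.2f} %")
```

The result does not depend on the particular values of $\mu$ and $\sigma$, which is why the same three percentages apply to every normal distribution.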
The empirical rule indicates what percentage of the data falls within a particular distance of the mean. Although these are approximate results that apply only to normal distributions, the rule plays a vital role in statistics. Consider the following graph:
This diagram illustrates the three components of the empirical rule.
Most of the data values (68%) fall within 1 standard deviation of the mean because of the bell shape of the distribution: the majority of the data are clustered in the middle. The figure above illustrates that about 34.1% of the values lie on each side of the mean within one standard deviation.
Extending the range by another standard deviation on either side of the mean increases the percentage from 68 to 95. This is a big jump, which gives a better idea of where most of the values are located. Most researchers rely on this 95% range rather than the 99.7% range when reporting results.
The empirical rule may also be used as a rough normality test for a distribution. To do so, one computes the size of each deviation from the mean in units of standard deviations and compares the observed frequencies with the expected ones. If points in a data set fall more than 3 standard deviations from the mean, these points are said to be outliers. If many points lie outside the 3-standard-deviation range, the distribution is likely to be non-normal.
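The check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `share_within` and `outliers` are hypothetical, the sample standard deviation is used as the estimate of $\sigma$, and the data are simulated rather than taken from any real study.

```python
import random
import statistics


def share_within(data, k):
    """Fraction of the values lying within k sample standard
    deviations of the sample mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return sum(1 for x in data if abs(x - mean) <= k * sd) / len(data)


def outliers(data, k=3):
    """Values lying more than k sample standard deviations from the mean."""
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > k * sd]


# Simulated, roughly normal data (illustrative only)
random.seed(0)
sample = [random.gauss(100, 15) for _ in range(10_000)]

for k, expected in ((1, 0.6827), (2, 0.9545), (3, 0.9973)):
    print(f"k={k}: observed {share_within(sample, k):.4f}, expected {expected}")
print("points beyond 3 standard deviations:", len(outliers(sample)))
```

Observed shares close to the expected ones are consistent with normality, while large gaps or many points beyond 3 standard deviations suggest a non-normal distribution. This is a quick screen, not a formal test.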
A few examples that use the empirical rule are given below:
Example 1:
A study was performed on the IQ scores of the employees of a private firm. The scores are found to be normally distributed, with a mean of 100 and a standard deviation of 15. Estimate the percentage of the scores that fall between 70 and 130.
Solution:
According to the empirical rule, we need to express the given bounds as $\mu \pm \sigma$, $\mu \pm 2\sigma$ or $\mu \pm 3\sigma$.
Here, $\mu$ = 100
and $\sigma$ = 15
130 = 100 + 30 = 100 + 2(15)
70 = 100 - 30 = 100 - 2(15)
Thus, 130 and 70 are 2 standard deviations to the right and to the left of the mean. Therefore, by the empirical rule, about 95% of the IQ scores fall between 70 and 130.
Example 2:
A recent study in a school found that the heights of the students of class 6 were normally distributed. If the mean height is 1.4 and the standard deviation is 0.08, classify the data in accordance with the empirical rule.
Solution:
The empirical rule states that:
Approximately 68% of the heights fall within 1 standard deviation of the mean.
$\mu \pm \sigma$ = 1.4 $\pm$ 0.08
= (1.4 - 0.08, 1.4 + 0.08)
= (1.32, 1.48)
Approximately 95% of the heights fall within 2 standard deviations of the mean.
$\mu \pm 2\sigma$ = 1.4 $\pm$ 0.16
= (1.4 - 0.16, 1.4 + 0.16)
= (1.24, 1.56)
And approximately 99.7% of the heights fall within 3 standard deviations of the mean.
$\mu \pm 3\sigma$ = 1.4 $\pm$ 0.24
= (1.4 - 0.24, 1.4 + 0.24)
= (1.16, 1.64)
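The same three intervals can be generated for any mean and standard deviation. A minimal sketch follows, using the values 1.4 and 0.08 that the solution above works with. The function name `empirical_intervals` is an illustrative assumption.

```python
def empirical_intervals(mean, sd):
    """Return {k: (mean - k*sd, mean + k*sd)} for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


shares = {1: "68%", 2: "95%", 3: "99.7%"}
for k, (low, high) in empirical_intervals(1.4, 0.08).items():
    # Prints the intervals 1.32 to 1.48, 1.24 to 1.56 and 1.16 to 1.64
    print(f"about {shares[k]} of heights lie between {low:.2f} and {high:.2f}")
```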
Example 3:
The scores of an entrance test for high school graduates in a particular year were bell shaped, with a mean of 490 and a standard deviation of 100.
a) What percentage of students scored between 390 and 590 on this test?
b) The score of one student was 795. What can you say about this performance compared to the rest of the scores?
Solution:
a) Since 590 = 490 + 100 = $\mu$ + $\sigma$
and 390 = 490 - 100 = $\mu$ - $\sigma$
Hence, we can say that approximately 68% of the students scored between 590 and 390 on this test.
b) Since 490 + 3 x 100 = 790 = $\mu$ + 3$\sigma$
490 - 3 x 100 = 190 = $\mu$ - 3$\sigma$
We can say that about 99.7% of the test scores lie between 190 and 790. A score of 795 lies just above this range, so it is one of the highest scores.
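Both parts can be cross-checked with Python's standard library. This is a sketch that assumes the scores are exactly normal with mean 490 and standard deviation 100, which is stronger than the "bell shaped" description in the problem. The variable names are illustrative.

```python
from statistics import NormalDist

# Assumed model for the test scores: normal with mean 490, sd 100
scores = NormalDist(mu=490, sigma=100)

# a) Share of scores between 390 and 590 (within 1 standard deviation)
share_a = scores.cdf(590) - scores.cdf(390)

# b) How many standard deviations 795 lies above the mean,
#    and the share of scores expected to be even higher
z_b = (795 - scores.mean) / scores.stdev
share_above_b = 1 - scores.cdf(795)

print(f"a) share between 390 and 590: {100 * share_a:.1f} %")
print(f"b) 795 is {z_b:.2f} standard deviations above the mean")
print(f"   share of scores above 795: {100 * share_above_b:.2f} %")
```

Under this assumption, a score of 795 sits slightly more than 3 standard deviations above the mean, so only a very small fraction of scores would be expected to exceed it.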