Contingency Table

Statistics is an essential branch of mathematics that grows into a vast subject of its own in higher studies. It is the study of the collection, organization, presentation, analysis and interpretation of numerical data. Statistical data may be broadly classified into two types: qualitative and quantitative. Qualitative data is usually categorical: it is expressed as descriptive labels rather than as numbers.

Data that records which category each observation belongs to is called categorical data. Examples of such categories are gender, sport, religion, caste, etc. The categories may also have a structure to them: categorical variables that represent order or size (small, medium, large, etc., or first, second, third, etc.) are known as ordinal variables.

When we deal with a large amount of categorical data, we need to represent it in the form of a table. This tabular representation is termed a contingency table. It serves as a record or display of the given data and supports the analysis and interpretation of the categorical data. In this article, we shall learn about the concept of a contingency table, its formation and its applications.

Definition

Contingency tables were first introduced by the famous statistician Karl Pearson in 1904. A contingency table is defined as the tabular representation of categorical data. Generally, the table displays the frequencies of particular combinations of values of two discrete random variables X and Y. Every cell of the table corresponds to one mutually exclusive combination of X-Y values. A contingency table summarizes the results when two or more groups are compared.
For example: pass versus fail, open artery versus obstructed artery, symptoms versus disease.

A contingency table may display data for the following five types of studies:
1) Cross sectional study
2) Prospective study
3) Retrospective study
4) Study of variables
5) Accuracy of result
A contingency table is a display format used to record, analyze and interpret the relationship between two or more categorical variables. It may be defined as a table for analyzing the relationship between variables. Usually, a contingency table has as many rows as one variable has categories and as many columns as the other variable has categories, plus a row and a column for the totals.

For example, a simple contingency table is given below.
In a survey in an office, the following information was noted.

| Categories | American | African | Caucasian | Asian | Total |
|---|---|---|---|---|---|
| Male | 30 | 15 | 27 | 54 | 126 |
| Female | 43 | 16 | 22 | 35 | 116 |
| Total | 73 | 31 | 49 | 89 | 242 |

How to Make Contingency Table?

In order to construct a contingency table, follow these steps.
1) Decide the number of rows and the number of columns on the basis of the categories.
2) Write the categories of one variable as row labels (usually) and the categories of the other variable as column headings.
3) Write the corresponding number given in the data in the cell for each combination of categories.
4) Add an additional row and an additional column for the totals.
5) Find the total of each row and each column and write it at the end of that row or column.
6) Finally, write the grand total in the bottom-right cell.

For example: During a survey in a hospital, it was found that there are 5 males and 8 females with cancer and 9 males and 10 females with lung disease. The contingency table for this data is as follows:

| | Cancer | Lung Disease | Total |
|---|---|---|---|
| Males | 5 | 9 | 14 |
| Females | 8 | 10 | 18 |
| Total | 13 | 19 | 32 |
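
The construction steps can also be sketched in Python. This is only a minimal illustration, assuming the raw data arrives as (row category, column category) pairs. The function name `build_contingency_table` and the record layout are hypothetical, chosen for this example.

```python
from collections import Counter


def build_contingency_table(records, row_labels, col_labels):
    """Count (row, column) pairs and append marginal totals.

    Returns a list of rows. The last row and the last entry of every
    row hold the totals, so the bottom-right entry is the grand total.
    """
    counts = Counter(records)
    table = []
    for r in row_labels:
        row = [counts[(r, c)] for c in col_labels]
        row.append(sum(row))  # row total
        table.append(row)
    # Column totals; the last one is the grand total.
    table.append([sum(col) for col in zip(*table)])
    return table


# Hospital survey from the example (assumed here to be one record per patient).
records = (
    [("Males", "Cancer")] * 5
    + [("Males", "Lung Disease")] * 9
    + [("Females", "Cancer")] * 8
    + [("Females", "Lung Disease")] * 10
)
table = build_contingency_table(
    records, ["Males", "Females"], ["Cancer", "Lung Disease"]
)
for label, row in zip(["Males", "Females", "Total"], table):
    print(label, row)
# Males [5, 9, 14]
# Females [8, 10, 18]
# Total [13, 19, 32]
```

Keeping the totals in the last row and column mirrors steps 4 to 6, so the grand total is always `table[-1][-1]`.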

Chi Square Test

Pearson's chi square test was introduced by Karl Pearson. This test is used for two types of comparison:
1) The goodness of fit, which establishes whether the observed frequency distribution differs from the theoretical frequency distribution.
2) The test of independence, which evaluates whether the observations of two given variables are independent of each other.

The procedure of the chi square test involves the steps explained below:
1) Find the value of the chi square test statistic, represented by $\chi^{2}$, using the formula.
2) Evaluate the degrees of freedom for that statistic, denoted by df.
3) Compare the chi square test statistic $\chi^{2}$ with the critical value from the chi square distribution with df degrees of freedom. If $\chi^{2}$ exceeds the critical value, the difference between the observed and the expected frequencies is considered significant. The chi square distribution usually provides a good approximation to the distribution of the test statistic.
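
Step 3 can be sketched as a simple lookup and comparison. The sketch below assumes a 5% significance level, which the article does not specify, and stores only the standard critical values for 1 to 4 degrees of freedom. The names `CRITICAL_VALUES_5_PERCENT` and `is_significant` are hypothetical.

```python
# Upper-tail critical values of the chi square distribution at the
# 5% significance level (standard table values) for df = 1..4.
CRITICAL_VALUES_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def is_significant(chi_square, df, critical_values=CRITICAL_VALUES_5_PERCENT):
    """Return True when the statistic exceeds the critical value for df."""
    if df not in critical_values:
        raise ValueError(f"no critical value stored for df={df}")
    return chi_square > critical_values[df]


# Illustrative statistics, both with df = 1.
print(is_significant(5.5, 1))    # True
print(is_significant(2.522, 1))  # False
```

For other significance levels or larger tables, the dictionary would need the corresponding values from a chi square distribution table.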

The chi square test statistic is calculated using the following formula:
$\chi^{2}=\sum_{i=1}^{r} \sum_{j=1}^{c}\frac{(O_{ij}-E_{ij})^{2}}{E_{ij}}$
Where,
$O_{ij}$ = Observed frequency in row i and column j
$E_{ij}$ = Expected or theoretical frequency in row i and column j
$\chi^{2}$ = Chi square test statistic
r = Total number of rows
c = Total number of columns.

The degrees of freedom are calculated using the following relation:
df = (r - 1)(c - 1)
r = number of rows (excluding the totals row)
c = number of columns (excluding the totals column).
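
The formula and the degrees of freedom can be computed together for any r x c table. The following is a minimal sketch, not a reference implementation. It assumes each expected frequency is (row total x column total) / grand total, the rule used in this article's worked examples, and the function name `chi_square_statistic` is hypothetical.

```python
def chi_square_statistic(observed):
    """Return (chi_square, df) for an r x c table of observed counts.

    `observed` is a list of rows WITHOUT the totals row or column.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            # Expected frequency under independence (assumed rule):
            # row total * column total / grand total.
            e_ij = row_totals[i] * col_totals[j] / grand_total
            chi_square += (o_ij - e_ij) ** 2 / e_ij

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_square, df


# Hospital table from the earlier example: rows Males, Females.
chi2, df = chi_square_statistic([[5, 9], [8, 10]])
print(round(chi2, 3), df)
```

The function works on unrounded expected frequencies, so its result can differ in the second or third decimal place from a hand calculation that rounds each $E_{ij}$ first.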

2x2 Contingency Table

A 2 x 2 contingency table is one which has 2 rows and 2 columns. Examples of 2 x 2 contingency tables are illustrated below:
Example 1: Calculate the chi square test statistic for the following contingency table.

| | A+ grader | A+ grader | Total |
|---|---|---|---|
| 10$^{th}$ standard | 7 | 20 | 27 |
| 12$^{th}$ standard | 18 | 22 | 40 |
| Total | 25 | 42 | 67 |

Solution: From the table given above, the theoretical frequency of each cell is calculated as (row total $\times$ column total) / grand total:
$E_{11}$ for observed frequency 7 = $\frac{27 \times 25}{67}$ = 10.1

$E_{12}$ for observed frequency 20 = $\frac{27 \times 42}{67}$ = 16.93

$E_{21}$ for observed frequency 18 = $\frac{40 \times 25}{67}$ = 14.93

$E_{22}$ for observed frequency 22 = $\frac{40 \times 42}{67}$ = 25.1

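Thus, we get another table showing the observed frequencies with the theoretical frequencies in parentheses.

| | A+ grader | A+ grader | Total |
|---|---|---|---|
| 10$^{th}$ standard | 7 (10.1) | 20 (16.93) | 27 |
| 12$^{th}$ standard | 18 (14.93) | 22 (25.1) | 40 |
| Total | 25 | 42 | 67 |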

$\chi^{2}=\sum_{i=1}^{r} \sum_{j=1}^{c}\frac{(O_{ij}-E_{ij})^{2}}{E_{ij}}$

= $\frac{(7-10.1)^{2}}{10.1}$+$\frac{(20-16.93)^{2}}{16.93}$+$\frac{(18-14.93)^{2}}{14.93}$+$\frac{(22-25.1)^{2}}{25.1}$

= 0.951 + 0.557 + 0.631 + 0.383
= 2.522
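
As a cross-check, a 2 x 2 table also admits a closed-form shortcut, $\chi^{2}=\frac{N(ad-bc)^{2}}{(a+b)(c+d)(a+c)(b+d)}$, which is algebraically equal to the cell-by-cell sum. The sketch below applies it to Example 1. The function name `chi_square_2x2` and the cell names a, b, c, d are introduced here for illustration and do not appear in the example itself.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square statistic for the 2 x 2 table [[a, b], [c, d]].

    Uses N * (ad - bc)^2 divided by the product of the four marginal
    totals, so no expected frequencies need to be rounded.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Example 1: first row 7, 20 and second row 18, 22.
print(round(chi_square_2x2(7, 20, 18, 22), 3))  # 2.507
```

The shortcut gives about 2.507, slightly below the 2.522 obtained by hand, because the hand calculation uses expected frequencies rounded to one or two decimal places.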

Example 2: During elections in a particular area in America, a record of voters was made, which is noted in the following contingency table. Evaluate the chi square test statistic for it.

| | Vote yes | Vote no | Total |
|---|---|---|---|
| Men | 35 | 9 | 44 |
| Women | 60 | 41 | 101 |
| Total | 95 | 50 | 145 |

Solution: The theoretical frequency for each observed frequency is calculated in the following manner.
$E_{11}$ for observed frequency 35 = $\frac{44 \times 95}{145}$ = 28.83

$E_{12}$ for observed frequency 9 = $\frac{44 \times 50}{145}$ = 15.17

$E_{21}$ for observed frequency 60 = $\frac{101 \times 95}{145}$ = 66.17

$E_{22}$ for observed frequency 41 = $\frac{101 \times 50}{145}$ = 34.83

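Thus, we get another table showing the observed frequencies with the theoretical frequencies in parentheses.

| | Vote yes | Vote no | Total |
|---|---|---|---|
| Men | 35 (28.83) | 9 (15.17) | 44 |
| Women | 60 (66.17) | 41 (34.83) | 101 |
| Total | 95 | 50 | 145 |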

The formula for the chi square test statistic is given below:
$\chi^{2}=\sum_{i=1}^{r} \sum_{j=1}^{c}\frac{(O_{ij}-E_{ij})^{2}}{E_{ij}}$

= $\frac{(35-28.83)^{2}}{28.83}$+$\frac{(9-15.17)^{2}}{15.17}$+$\frac{(60-66.17)^{2}}{66.17}$+$\frac{(41-34.83)^{2}}{34.83}$

= 1.32 + 2.51 + 0.58 + 1.09
= 5.5
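
To carry this example through the comparison step of the procedure, the statistic can be set against a critical value. The 5% significance level used here is an assumption for illustration only; the example does not state one.

df = (2 - 1)(2 - 1) = 1
$\chi^{2}_{0.05,\,1}$ = 3.841 (standard table value)
$\chi^{2}$ = 5.5 > 3.841

Under that assumed level, the statistic exceeds the critical value, so the data would not support the hypothesis that the vote is independent of gender.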