The Chi-square distribution has many uses in hypothesis testing. It helps to test whether a population has a given variance, to test the 'goodness of fit' of a theoretical distribution to an observed distribution, and to test the independence of attributes in a contingency table.


Any statistical test that uses the chi-square distribution can be called a chi-square test. The chi-square test is a statistical test used to investigate differences, and it is denoted by $\chi^2$. It measures the difference between a statistically generated expected result and an actual result to see whether there is a statistically significant difference between them; in other words, it measures the goodness of fit between an expected and an actual result.

The test statistic is

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Where,

$\chi^2$ - Chi-square statistic

O - Observed frequency in each category.

E - Expected frequency in the corresponding category.

Each type of two-way table has its own chi-square distribution, depending on the number of rows and columns, and each chi-square distribution is identified by its degrees of freedom. A two-way table with r rows and c columns uses a chi-square distribution with (r - 1)(c - 1) degrees of freedom.

### Chi Square Test of Independence Example

For a given population, we may consider two attributes and ask whether they are dependent. Suppose we have a set of workers in a factory and classify them as smokers and non-smokers; the same workers are classified again as men and women. If we find that smokers are more numerous among the men than among the women, we say that the attributes 'smoking' and 'sex' are dependent (associated). This test is applicable when the observations are independent (random) and the total frequency is large. It is used to test the association of variables in two-way tables, where the assumed model of independence is evaluated against the observed data. An advantage of the chi-square goodness-of-fit test is that it can be applied to any univariate distribution for which the cumulative distribution function can be calculated, including discrete distributions such as the binomial and the Poisson.

The chi-square test statistic is of the form

$\chi^2 = \sum \frac{(\text{Observed value} - \text{Expected value})^2}{\text{Expected value}}$

**Degree of Freedom for the Chi-Square Test for Goodness of Fit**

The number of degrees of freedom for the Chi-square goodness-of-fit test equals the number of categories being compared minus one:

Degrees of freedom (df) = c - 1
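As a minimal sketch, the goodness-of-fit statistic and its df = c - 1 can be computed in a few lines of pure Python (the category counts below are made up for illustration):

```python
# Minimal sketch of the chi-square goodness-of-fit statistic.
# The observed/expected counts below are hypothetical.

def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for c categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # df = c - 1
    return chi2, df

stat, df = chi_square_gof([30, 14, 34], [26, 26, 26])
print(round(stat, 4), df)  # → 8.6154 2
```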

The chi-square difference test is very useful both for making simpler models more complex and for making complex models simpler. A more accurate test can be obtained by performing a chi-square difference test.

The degrees of freedom for the chi-square difference test equal the difference between the degrees of freedom associated with the two models. When the chi-square difference is significant, the model with the smaller chi-square is considered to fit the data better than the model with the larger chi-square.

The chi-square test of homogeneity is used to test for differences between two populations that are homogeneous with respect to some characteristic. In this test, the categories are assumed to be mutually exclusive and exhaustive. The test statistic for the chi-square test of homogeneity is the same as that for the chi-square test of association:

$\chi^2 = \sum_{i=1}^{m}\sum_{j=1}^{n} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

Where, df = (m - 1)(n - 1).
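The same statistic for an m × n table can be sketched in pure Python, with expected counts taken from the independence model $E_{ij} = (\text{row}_i \times \text{col}_j)/\text{total}$ (the 2 × 2 smokers-by-sex counts below are hypothetical):

```python
# Sketch: chi-square statistic for an m x n contingency table.
# Expected counts come from the independence model E_ij = row_i * col_j / total.

def chi_square_table(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    m, n = len(table), len(table[0])
    row_totals = [sum(r) for r in table]
    col_totals = [sum(table[i][j] for i in range(m)) for j in range(n)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(m):
        for j in range(n):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    df = (m - 1) * (n - 1)  # df = (m - 1)(n - 1)
    return chi2, df

# Rows: men, women; columns: smoker, non-smoker (hypothetical counts).
stat, df = chi_square_table([[40, 10], [25, 25]])
print(round(stat, 4), df)  # → 9.8901 1
```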

The Chi-square test of association is closely related to the Chi-square tests of independence and of homogeneity. It is used to determine whether there is an association between two or more categorical variables. Unlike the goodness-of-fit test, where the expected proportions are known a priori, in the Chi-square test of association the expected proportions are not known a priori but must be estimated from the sample data.

The chi-square test for trend tests for a linear trend between the rows and columns of the table. It only makes sense when the rows are arranged in a natural order (such as by age or time) and are equally spaced. A large chi-square statistic indicates that the observed frequencies in the table differ markedly from the expected frequencies; when the chi-square is high, examine the table to determine which cells are responsible. In the chi-square test for trend, we not only use the order of the categories but also attach a numerical value to each. The chi-square for trend statistic is always less than the chi-square for association statistic, and the difference between the two statistics follows a chi-square distribution if the null hypothesis is true, with degrees of freedom equal to the difference between the two degrees of freedom.

The one-sample Chi-square test compares the distribution of cases across the categories of a variable with a hypothesized distribution. The Chi-square test used with one sample is described as a "goodness of fit" test. It can help you decide whether a distribution of frequencies for a variable in a sample is representative of, or "fits", a specified population distribution. The one-sample Chi-square test is used to test a hypothesis such as 'the suicide rate varies significantly from month to month'. If the hypothesis is false, the suicide rate will be the same for all twelve months. The one-sample Chi-square test can then be used to compare the observed suicide rates per month with what would be expected if the rate were equal for all months.

The chi-square test measures the difference between a statistically generated expected result and an actual result to see whether there is a statistically significant difference between them. Once the Chi-square value and the degrees of freedom are known, a standard table of Chi-square values can be consulted to determine the corresponding p-value. The p-value indicates the probability that a Chi-square value that large would have resulted from chance alone.
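For small degrees of freedom, the p-value can also be computed directly instead of consulting a table. This sketch uses the closed forms for df = 1 and df = 2 only (general df needs the incomplete gamma function):

```python
# Sketch: chi-square upper-tail p-values without a table, for df = 1 or 2.
import math

def chi2_sf(x, df):
    """Survival function P(X >= x) of the chi-square distribution (df in {1, 2})."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))  # closed form for df = 1
    if df == 2:
        return math.exp(-x / 2)             # closed form for df = 2
    raise ValueError("only df = 1 or 2 in this sketch")

p = chi2_sf(3.841, 1)
print(round(p, 3))  # ≈ 0.05, the familiar 5% critical value for df = 1
```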




The shape of the chi-square distribution depends on its degrees of freedom:

- For one degree of freedom, the distribution looks like a hyperbola.
- For more than one degree of freedom, it looks like a mound with a long right tail.


A chi-square difference test involves:

- Estimating the original model.
- Estimating the revised model, in which a new path has been added.
- Calculating the difference between the two resulting chi-square values.


The chi-square test has some important assumptions:

- For the chi-square test to be meaningful, each person, item, or entity must contribute to only one cell of the contingency table.
- Both the independent and dependent variables are categorical, with two or more levels.
- The data consist of frequencies, not scores.
- Each randomly selected observation can be classified into only one category of the independent variable and only one category of the dependent variable.


### Solved Examples

Given below are some examples of the chi-square test.

**Question 1:** Find the chi-square value for the following data:

| Color | Blue | Black | Brown | Yellow |
| --- | --- | --- | --- | --- |
| Observed frequency | 5 | 15 | 10 | 20 |
| Expected frequency | 10 | 20 | 5 | 30 |

**Solution:**

For blue, Observed frequency - Expected frequency = 5 - 10 = -5

For black, Observed frequency - Expected frequency = 15 - 20 = -5

For brown, Observed frequency - Expected frequency = 10 - 5 = 5

For yellow, Observed frequency - Expected frequency = 20 - 30 = -10

For blue, `(O-E)^2/E = (-5)^2/10 = 25/10 = 2.5`

For black, `(O-E)^2/E = (-5)^2/20 = 25/20 = 1.25`

For brown, `(O-E)^2/E = (5)^2/5 = 25/5 = 5`

For yellow, `(O-E)^2/E = (-10)^2/30 = 100/30 = 3.3333`

`chi^2 = sum (O-E)^2/E`

= 2.5 + 1.25 + 5 + 3.3333

= 12.0833

The above calculation can also be done by the tabulation method, as shown below:

| Color | Observed frequency | Expected frequency | O - E | `(O-E)^2` | `(O-E)^2/E` |
| --- | --- | --- | --- | --- | --- |
| Blue | 5 | 10 | -5 | 25 | 2.5 |
| Black | 15 | 20 | -5 | 25 | 1.25 |
| Brown | 10 | 5 | 5 | 25 | 5 |
| Yellow | 20 | 30 | -10 | 100 | 3.3333 |

The formula for the Chi-square test then gives,

`chi^2 = sum (O-E)^2/E`

= 2.5 + 1.25 + 5 + 3.3333

= 12.0833

**Question 2:** Find the chi-square value for the following data:

| Color | Blue | Black | Brown | Yellow |
| --- | --- | --- | --- | --- |
| Observed frequency | 10 | 5 | 25 | 35 |
| Expected frequency | 15 | 30 | 30 | 25 |

**Solution:**

| Color | Observed frequency | Expected frequency | O - E | `(O-E)^2` | `(O-E)^2/E` |
| --- | --- | --- | --- | --- | --- |
| Blue | 10 | 15 | -5 | 25 | 1.6666 |
| Black | 5 | 30 | -25 | 625 | 20.8333 |
| Brown | 25 | 30 | -5 | 25 | 0.8333 |
| Yellow | 35 | 25 | 10 | 100 | 4 |

The formula for Chi Square gives,

`chi^2 = sum (O-E)^2/E`

= 1.6666 + 20.8333 + 0.8333 + 4.0000

= 27.3332

**Question 3:** Find the chi-square value for the following data:

| Color | Blue | Black | Brown | Yellow |
| --- | --- | --- | --- | --- |
| Observed frequency | 23 | 24 | 32 | 23 |
| Expected frequency | 12 | 32 | 25 | 21 |

**Solution:**

| Color | Observed frequency | Expected frequency | O - E | `(O-E)^2` | `(O-E)^2/E` |
| --- | --- | --- | --- | --- | --- |
| Blue | 23 | 12 | 11 | 121 | 10.0833 |
| Black | 24 | 32 | -8 | 64 | 2.0000 |
| Brown | 32 | 25 | 7 | 49 | 1.9600 |
| Yellow | 23 | 21 | 2 | 4 | 0.1905 |

The formula for Chi Square gives,

`chi^2 = sum (O-E)^2/E`

= 10.0833 + 2.0000 + 1.9600 + 0.1905

= 14.2338
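As a quick check, all three solved examples can be reproduced with a short pure-Python sketch (the last decimal place may differ slightly from the hand-truncated figures above):

```python
# Recompute the chi-square statistics for the three worked examples.
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

results = {
    "Q1": chi_square([5, 15, 10, 20], [10, 20, 5, 30]),
    "Q2": chi_square([10, 5, 25, 35], [15, 30, 30, 25]),
    "Q3": chi_square([23, 24, 32, 23], [12, 32, 25, 21]),
}
for name, value in results.items():
    print(name, round(value, 4))
```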
