Regression is a statistical analysis that assesses the association between two variables and is used to find the relationship between them. In linear regression, this relationship is measured using lines of regression. Regression measures the average mathematical relationship between two variables in terms of the original units of the data, whereas correlation measures the nature of the relationship between them, i.e., whether they are positively correlated, negatively correlated, or uncorrelated.

Regression is used to estimate the value of one variable when the value of the other is known. One of the variables is the independent variable and the other is the dependent variable.

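Let $(X_i, Y_i)$, $i = 1, 2, \dots, n$, be $n$ pairs of observations on the variables X and Y. We know that the coefficient of correlation is

$r = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$

where $Cov(X, Y) = \frac{1}{n} \sum X_i Y_i - \bar X \bar Y$

and $\sigma_X^2 = \frac{1}{n} \sum (X_i - \bar X)^2$, with $\sigma_Y^2$ defined analogously.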
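The definitions above can be turned into a few lines of code. The following is a minimal sketch in Python, assuming the population (divide by n) forms of covariance and variance used in this article; the function names and the sample data are hypothetical and chosen only for illustration.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: (1/n) * sum(x_i * y_i) - mean(x) * mean(y)."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - mean(xs) * mean(ys)

def variance(xs):
    """Population variance: (1/n) * sum((x_i - mean(x))^2)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def correlation(xs, ys):
    """Coefficient of correlation r = Cov(X, Y) / (sigma_X * sigma_Y)."""
    return covariance(xs, ys) / (sqrt(variance(xs)) * sqrt(variance(ys)))

# Hypothetical sample data, not taken from the article.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(f"Cov(X, Y) = {covariance(xs, ys):.3f}, r = {r:.3f}")
```

For this sample, Cov(X, Y) = 1.2 and r is about 0.775, indicating a positive correlation.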

Now, we want to obtain the regression equation of Y on X. Taking the line (1), the corresponding normal equations are (2) and (3):

$Y = a + bX \qquad (1)$

$\sum Y_i = na + b \sum X_i \qquad (2)$

$\sum X_i Y_i = a \sum X_i + b \sum X_i^2 \qquad (3)$

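Divide equations (2) and (3) by n.

From (2), $\frac{1}{n} \sum Y_i = a + b \cdot \frac{1}{n} \sum X_i$, that is,

$\bar Y = a + b \bar X \qquad (4)$

From (3), $\frac{1}{n} \sum X_i Y_i = a \bar X + b \cdot \frac{1}{n} \sum X_i^2$, that is,

$Cov(X, Y) + \bar X \bar Y = a \bar X + b (\sigma_X^2 + \bar X^2) \qquad (5)$

since $\sigma_X^2 = \frac{1}{n} \sum X_i^2 - \bar X^2$.

Multiplying equation (4) by $\bar X$ and subtracting it from (5) gives $Cov(X, Y) = b \sigma_X^2$, so

$b = \frac{Cov(X, Y)}{\sigma_X^2}$

Substituting this value of b in (4) gives $a = \bar Y - b \bar X$, so that (1) becomes $Y = \bar Y + b (X - \bar X)$. Write $b_{YX}$ for this slope, the regression coefficient of Y on X.

Therefore, $Y - \bar Y = b_{YX} (X - \bar X)$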
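The result of this derivation can be checked numerically. The sketch below is one possible Python rendering, assuming the closed forms b = Cov(X, Y) / σ_X² and a = Ȳ − b·X̄ obtained above; `fit_line` and the sample data are hypothetical names and values introduced for illustration. It also confirms that the fitted a and b satisfy normal equations (2) and (3).

```python
def fit_line(xs, ys):
    """Return (a, b) for the regression line Y = a + bX of Y on X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Cov(X, Y) = (1/n) * sum(x_i * y_i) - x_bar * y_bar
    cov = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    # sigma_X^2 = (1/n) * sum((x_i - x_bar)^2)
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    b = cov / var_x          # slope
    a = y_bar - b * x_bar    # intercept, from equation (4)
    return a, b

# Hypothetical sample data, not taken from the article.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
n = len(xs)

# Residuals of the two normal equations; both should be (numerically) zero.
residual_2 = sum(ys) - (n * a + b * sum(xs))
residual_3 = sum(x * y for x, y in zip(xs, ys)) - (a * sum(xs) + b * sum(x * x for x in xs))

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"normal equation residuals: {residual_2:.2e}, {residual_3:.2e}")
```

For this sample the fitted line is Y = 2.2 + 0.6X.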

Similarly, we can prove that the regression equation of X on Y is

$X - \bar X = b_{XY} (Y - \bar Y)$

where $b_{XY} = \frac{Cov(X, Y)}{\sigma_Y^2}$ is the regression coefficient of X on Y.
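The two regression lines generally differ, because each minimizes errors in a different variable. The Python sketch below computes both regression coefficients for a small hypothetical data set. It also checks the standard identity that their product equals r², which follows from the definitions of the coefficients and of r but is not stated in the text above. All names are illustrative.

```python
def population_stats(xs, ys):
    """Return (x_bar, y_bar, cov, var_x, var_y) using divide-by-n formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    return x_bar, y_bar, cov, var_x, var_y

# Hypothetical sample data, not taken from the article.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar, cov, var_x, var_y = population_stats(xs, ys)

b_yx = cov / var_x   # regression coefficient of Y on X
b_xy = cov / var_y   # regression coefficient of X on Y
r_squared = cov ** 2 / (var_x * var_y)

print(f"Y on X: Y - {y_bar} = {b_yx:.3f} (X - {x_bar})")
print(f"X on Y: X - {x_bar} = {b_xy:.3f} (Y - {y_bar})")
print(f"b_yx * b_xy = {b_yx * b_xy:.3f}, r^2 = {r_squared:.3f}")
```

Both lines pass through the point of means (X̄, Ȳ), which is visible in the printed equations.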

When the regression line is linear, the regression coefficient is the constant (a) that represents the rate of change of one variable (Y) as a function of changes in the other (X); it is the slope of the regression line. If the two variables are linearly related, then every time there is an increase of a given size in the value of the X variable, there is a corresponding change of fixed size in the value of Y.

When the regression equation is linear, the relation between the variables is

Y' = aX + b

where Y' is the predicted value of the dependent variable,

a = regression coefficient (the slope of the line),

b = intercept of the line.

Note that the roles of a and b here are the reverse of those in the form Y = a + bX used in the derivation above.

Linear regression is a statistical procedure for predicting the value of a dependent variable from an independent variable when the relationship between the variables can be described with a linear model. A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable, 'b' is the slope of the line, and 'a' is the intercept. The dependent and independent variables may be scalars or vectors. If the independent variable is a vector, one speaks of multiple linear regression. In simple regression analysis, there is no partialling out of other variables because no other variables are included in the regression.

The equation of the probabilistic simple regression model is

$y = \beta_0 + \beta_1 x_1 + \varepsilon$

where y is the value of the dependent variable,

$x_1$ is the value of the explanatory variable,

$\beta_0$ is the population y-intercept,

$\beta_1$ is the population slope, and

$\varepsilon$ is the error of prediction.

Simple linear regression is a way of analyzing the relationship between variable x and variable y. More generally, linear regression is an approach to modelling the relationship between a scalar dependent variable y and one or more explanatory variables denoted x, using a straight line. For a simple linear regression, assign one variable as the dependent variable and one as the explanatory variable. Both must be numerical.
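To see what the error term means in practice, one can simulate data from the probabilistic model and fit a line to it. The sketch below is a rough Python illustration under assumed values: the "population" parameters β₀ = 2.0 and β₁ = 0.5, the normally distributed error with standard deviation 1, and the helper `fit_line` are all hypothetical choices, not taken from the article.

```python
import random

def fit_line(xs, ys):
    """Least-squares (intercept, slope) for a single explanatory variable."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    slope = cov / var_x
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Assumed population parameters, chosen only for this illustration.
beta0, beta1, noise_sd = 2.0, 0.5, 1.0

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(0, 10) for _ in range(2000)]
# y = beta0 + beta1 * x + epsilon, with epsilon drawn from N(0, noise_sd^2)
ys = [beta0 + beta1 * x + rng.gauss(0, noise_sd) for x in xs]

b0_hat, b1_hat = fit_line(xs, ys)
print(f"estimated intercept = {b0_hat:.3f} (true {beta0})")
print(f"estimated slope     = {b1_hat:.3f} (true {beta1})")
```

The estimates should land close to, but not exactly on, the assumed population values; the gap reflects the random error ε in the sample.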

Simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. Regression analysis more broadly is a statistical technique for exploring and modelling the relationship between two or more variables.

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. This generalization, sometimes called multi-linear regression, seeks the linear mapping that best approximates the underlying relationship, which may itself be nonlinear. The population regression line for p explanatory variables $x_1, x_2, \dots, x_p$ is defined to be

$\mu_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi}$
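In practice the coefficients of this line are unknown and must be estimated from data. One common approach, sketched below in Python, is least squares via the matrix form of the normal equations, (XᵀX)β = Xᵀy, where X is the data matrix with a leading column of ones. This is offered as an illustration, not as the method the article prescribes; `solve_linear_system`, `fit_multiple`, `mean_response` and the noise-free sample data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_multiple(rows, ys):
    """Least-squares [beta_0, ..., beta_p] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def mean_response(betas, row):
    """Evaluate mu = beta_0 + beta_1 * x_1 + ... + beta_p * x_p."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], row))

# Hypothetical noise-free data generated from mu = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

betas = fit_multiple(rows, ys)
print([round(b, 6) for b in betas])
print(round(mean_response(betas, (3, 4)), 6))
```

Because the sample data contain no error term, the fit recovers the generating coefficients (1, 2, 3) up to rounding. With real data the estimates would only approximate the population values.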

The model for multiple linear regression, for n observations, is

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi} + \varepsilon_i$, for $i = 1, 2, \dots, n$.

Stepwise regression refers to regression models in which the choice of predictive variables is carried out by an automatic procedure. When a model involves a large number of candidate variables, stepwise regression examines them and selects those that fit well. It is therefore a convenient procedure for selecting variables, especially when many variables are to be considered.

Let us see, with the help of an example, how to calculate a linear regression.

The line of regression of Y on X is

$Y = a + bX$

The normal equations are

$\sum Y = na + b \sum X$

$\sum XY = a \sum X + b \sum X^2$

Substituting $n = 10$, $\sum X = 130$, $\sum Y = 220$, $\sum X^2 = 2288$ and $\sum XY = 3467$ gives

$10a + 130b = 220 \qquad (i)$

$130a + 2288b = 3467 \qquad (ii)$

Solving equations (i) and (ii), we get $a \approx 8.80$ and $b \approx 1.015$

=> $Y = 8.80 + 1.015X$
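The two simultaneous equations can also be solved programmatically. The sketch below uses Cramer's rule in Python; the sums n = 10, ΣX = 130, ΣY = 220, ΣX² = 2288 and ΣXY = 3467 are read off equations (i) and (ii), while the helper name `solve_2x2` is a hypothetical choice for this illustration.

```python
def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y

# Sums taken from equations (i) and (ii) of the worked example.
n, sum_x, sum_y, sum_x2, sum_xy = 10, 130, 220, 2288, 3467

# (i)  n*a     + sum_x*b  = sum_y
# (ii) sum_x*a + sum_x2*b = sum_xy
a, b = solve_2x2(n, sum_x, sum_y, sum_x, sum_x2, sum_xy)

print(f"a = {a:.4f}, b = {b:.4f}")  # a = 8.8043, b = 1.0151
```

To more decimal places the slope is about 1.015, so the fitted line is approximately Y = 8.80 + 1.015X.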
