# Linear Regression

Regression is a statistical analysis that assesses the association between two variables and is used to find the relationship between them. Linear regression measures this relationship using lines of regression. Regression measures the average mathematical relationship between two variables in terms of the original units of the data, whereas correlation measures the nature of the relationship, i.e., whether it is positive, negative, or absent (uncorrelated).

## Linear Regression Definition

Regression is used to estimate the value of one variable when the value of the other variable is known. One of the variables is the independent variable and the other is the dependent variable.

Let $(X_i, Y_i)$, $i = 1, 2, 3, \dots, n$, be $n$ pairs of observations. Plotting these points in the XY-plane gives a scatter diagram. If most of the points lie on or close to a straight line, we call it linear regression; if instead they follow a curve, we call it curvilinear regression. Linear regression is measured using the lines of regression, i.e., Y-on-X and X-on-Y, while curvilinear regression is measured using the correlation ratio.

## Linear Regression Equation

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable, Y is the dependent variable, 'b' is the slope of the line, and 'a' is the intercept. The linear regression formula is derived as follows. Let $(X_i, Y_i)$, $i = 1, 2, 3, \dots, n$, be $n$ pairs of observations that follow a linear regression.

We know that the coefficient of correlation is

$r = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$

where $Cov(X, Y) = \frac{1}{n} \sum X_i Y_i - \bar X \bar Y$

and $\sigma_X^2 = \frac{1}{n} \sum (X_i - \bar X)^2$, with $\sigma_Y^2$ defined analogously.
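These definitions can be checked numerically. The following is a minimal sketch in Python using only the standard library. The function names and the sample data are illustrative assumptions, not part of the original text, and the population (divide by n) forms given above are used throughout.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance: (1/n) * sum(x_i * y_i) - mean(x) * mean(y)."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - mean(xs) * mean(ys)


def variance(xs):
    """Population variance: (1/n) * sum((x_i - mean(x))^2)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def correlation(xs, ys):
    """Coefficient of correlation r = Cov(X, Y) / (sigma_X * sigma_Y)."""
    return covariance(xs, ys) / sqrt(variance(xs) * variance(ys))


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov_xy = covariance(xs, ys)
r = correlation(xs, ys)
print(f"Cov(X, Y) = {cov_xy:.4f}, r = {r:.4f}")
```

A positive value of `r` here indicates that the two hypothetical variables tend to increase together.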

To obtain the regression equation of Y-on-X, we take the line

$Y = a + bX$ ..... (1)

whose least-squares fit gives the normal equations

$\sum Y_i = na + b \sum X_i$ ..... (2)

$\sum X_i Y_i = a \sum X_i + b \sum X_i^2$ ..... (3)

Divide equations (2) and (3) by n.

From (2), $\frac{1}{n} \sum Y_i = a + b \cdot \frac{1}{n} \sum X_i$, that is,

$\bar Y = a + b \bar X$ ..... (4)

From (3), using $\frac{1}{n} \sum X_i Y_i = Cov(X, Y) + \bar X \bar Y$ and $\frac{1}{n} \sum X_i^2 = \sigma_X^2 + \bar X^2$,

$Cov(X, Y) + \bar X \bar Y = a \bar X + b (\sigma_X^2 + \bar X^2)$ ..... (5)

Multiplying equation (4) by $\bar X$ and subtracting the result from (5) gives

$b = \frac{Cov(X, Y)}{\sigma_X^2}$
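
Written out, the elimination step looks as follows (a short worked derivation, with $\sigma_X^2$ denoting the variance of X):

$\bar X \bar Y = a \bar X + b \bar X^2$ (equation (4) multiplied by $\bar X$)

$Cov(X, Y) = b \sigma_X^2$ (equation (5) minus the line above)

Dividing by $\sigma_X^2$ gives the slope stated above, and equation (4) then gives the intercept $a = \bar Y - b \bar X$.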

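Substituting this value of b into (4) gives $a = \bar Y - b \bar X$, so line (1) becomes

$Y - \bar Y = b_{YX} (X - \bar X)$

where $b_{YX} = \frac{Cov(X, Y)}{\sigma_X^2}$ is the regression coefficient of Y on X.

Similarly, the regression equation of X-on-Y is

$X - \bar X = b_{XY} (Y - \bar Y)$

where $b_{XY} = \frac{Cov(X, Y)}{\sigma_Y^2}$.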
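
As a rough illustration of the two regression lines, the sketch below computes both slopes from data and uses them for prediction. It is written in plain Python. The helper names (`regression_lines`, `predict_y`, `predict_x`) and the sample data are assumptions made for this example only.

```python
def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def regression_lines(xs, ys):
    """Return (x_bar, y_bar, b_yx, b_xy) using population moments."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    b_yx = cov / var_x  # slope of the Y-on-X line
    b_xy = cov / var_y  # slope of the X-on-Y line
    return x_bar, y_bar, b_yx, b_xy


def predict_y(x, x_bar, y_bar, b_yx):
    """Y-on-X line: Y - y_bar = b_yx * (X - x_bar)."""
    return y_bar + b_yx * (x - x_bar)


def predict_x(y, x_bar, y_bar, b_xy):
    """X-on-Y line: X - x_bar = b_xy * (Y - y_bar)."""
    return x_bar + b_xy * (y - y_bar)


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar, b_yx, b_xy = regression_lines(xs, ys)
print("b_yx =", b_yx, "b_xy =", b_xy)
print("predicted Y at X = 6:", predict_y(6, x_bar, y_bar, b_yx))
```

Both lines pass through the point of means $(\bar X, \bar Y)$, which follows directly from the two equations: setting $X = \bar X$ in the first gives $Y = \bar Y$, and setting $Y = \bar Y$ in the second gives $X = \bar X$.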

## Linear Regression Coefficient

When the regression line is linear, the regression coefficient is the constant that represents the rate of change of one variable (Y) as a function of changes in the other (X); it is the slope of the regression line. If the two variables are linearly related, then every increase of a given size in X is accompanied, on average, by a change of a fixed size in Y.

When the regression equation is linear, the relation between the variables is

Y' = aX + b

where Y' is the predicted value of the variable,
a = regression coefficient (slope),
b = intercept of the line.

Note that in this form the roles of a and b are swapped relative to the form Y = a + bX used in the other sections.

## Linear Regression Model

Linear regression is a statistical procedure for predicting the value of a dependent variable from an independent variable when the relationship between the variables can be described with a linear model. A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable, 'b' is the slope of the line, and 'a' is the intercept. The dependent and independent variables may be scalars or vectors. If the independent variable is a vector, one speaks of multiple linear regression. In simple regression analysis, there is no partialling out of other variables because no other variables are included in the regression.

The equation of the probabilistic simple regression is

y = $\beta_0 +\beta_1 x_1 + \varepsilon$

where y is the value of the dependent variable,

$x_1$ is the value of the independent variable,

$\beta_0$ is the population y-intercept,

$\beta_1$ is the population slope,

$\varepsilon$ is the error of prediction.
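
To make the role of the error term concrete, the following sketch simulates data from the probabilistic model and then estimates the intercept and slope by least squares. It assumes normally distributed errors and uniformly spread x values, which the text above does not specify. The function names, parameter values, and seed are all illustrative.

```python
import random


def simulate(beta0, beta1, n, noise_sd, seed=0):
    """Draw n points from y = beta0 + beta1 * x + epsilon.

    Assumption for this sketch: x is uniform on [0, 10] and
    epsilon is normal with mean 0 and standard deviation noise_sd.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


def fit_simple(xs, ys):
    """Least-squares estimates (b0, b1) of the intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical population parameters.
true_beta0, true_beta1 = 2.0, 0.5
xs, ys = simulate(true_beta0, true_beta1, n=1000, noise_sd=1.0)
b0_hat, b1_hat = fit_simple(xs, ys)
print(f"estimated intercept = {b0_hat:.3f}, estimated slope = {b1_hat:.3f}")
```

Because of the error term, the estimates differ slightly from the population values, and they generally get closer as the number of observations grows.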

## Simple Linear Regression

Simple linear regression is a way of analyzing the relationship between a variable x and a variable y. More generally, linear regression is an approach to modelling the relationship between a scalar dependent variable y and one or more explanatory variables denoted x, using a straight line. For a simple linear regression, designate one variable as the dependent variable and the other as the explanatory variable. Both variables must be numerical.

### Simple Linear Regression Analysis

Simple linear regression is the least squares estimator of a linear regression model with a single explanatory variable. Regression analysis is a statistical technique that attempts to explore and model the relationship between two or more variables using a straight line.

## Multiple Linear Regression

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. This generalization of linear regression, also called multi-linear regression, aims to find the linear mapping that is as close as possible to the true, possibly nonlinear, relationship. The population regression line for p explanatory variables $x_1, x_2, \dots, x_p$ is defined to be

$\mu_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi}$

The model for multiple linear regression, for n observations $i = 1, 2, \dots, n$, is

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi} + \varepsilon_i$.
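
One common way to fit such a model is least squares via the normal equations $(X^T X)\beta = X^T y$, where X is the data matrix with a leading column of ones. The sketch below does this in plain Python with a small Gaussian-elimination solver. This approach, the function names, and the sample data are assumptions for illustration; the text above does not prescribe a fitting method.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Return [b0, b1, ..., bp] minimizing the sum of squared residuals."""
    design = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coeffs = fit_multiple(rows, ys)
print([round(c, 6) for c in coeffs])
```

Since the sample data contain no error term, the fit recovers the generating coefficients up to floating-point rounding. For larger or ill-conditioned problems a dedicated numerical library would be a safer choice than this hand-written solver.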

## Stepwise Linear Regression

Stepwise regression covers regression models in which the choice of predictor variables is carried out by an automatic procedure. When a model involves a large number of candidate variables, stepwise regression analyzes them and selects those that fit well. It is therefore a convenient procedure for selecting variables, especially when many variables are to be considered.
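
The idea of automatic selection can be sketched with a simple forward procedure: start with no predictors and repeatedly add the one that reduces the residual sum of squares (RSS) the most. Note the simplification: practical stepwise procedures usually decide with significance tests or information criteria, whereas this sketch uses a fixed RSS-improvement threshold as a stand-in. All names and data below are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(columns, ys):
    """RSS of a least-squares fit of ys on an intercept plus the given columns."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(ys))]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return sum((y - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, y in zip(design, ys))


def forward_select(candidates, ys, min_improvement):
    """Greedily add predictors while the RSS drops by at least min_improvement."""
    selected = []
    remaining = dict(candidates)
    current = rss([], ys)  # intercept-only model
    while remaining:
        chosen = [candidates[name] for name in selected]
        scores = {name: rss(chosen + [col], ys) for name, col in remaining.items()}
        best = min(scores, key=scores.get)
        if current - scores[best] < min_improvement:
            break
        selected.append(best)
        current = scores[best]
        del remaining[best]
    return selected


# Hypothetical data: y depends on x1 and x3 only; x2 is irrelevant.
candidates = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [5, 3, 6, 2, 7, 1, 8, 4],
    "x3": [1, 0, 1, 0, 1, 0, 1, 0],
}
ys = [2 + 3 * a + 10 * c for a, c in zip(candidates["x1"], candidates["x3"])]

selected = forward_select(candidates, ys, min_improvement=1e-6)
print(selected)
```

On this toy data the procedure stops once the two relevant predictors are in the model, because adding the third no longer improves the fit.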

## Linear Regression Example

Let us see with the help of an example, how to calculate linear regression.

### Solved Example

Question: For 10 observations on price (X) and supply (Y), the following data were obtained: $\sum X = 130$, $\sum Y = 220$, $\sum X^2 = 2288$, $\sum Y^2 = 5506$, $\sum XY = 3467$. Find the line of regression of Y on X.

Solution:

The line of regression of Y on X is

Y = a + bX

The normal equations are

$\sum Y = na + b \sum X$

$\sum XY = a \sum X + b \sum X^2$

Substituting the given values, with n = 10:

$10a + 130b = 220$ ..... (i)

$130a + 2288b = 3467$ ..... (ii)

Solving equations (i) and (ii), we get $a \approx 8.80$ and $b \approx 1.015$.

$\Rightarrow Y = 8.80 + 1.015X$
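
The arithmetic of this example can be double-checked with a few lines of Python. The sketch solves the two normal equations in closed form from the given sums. The variable names are illustrative, and the final prediction at X = 15 is a hypothetical usage that is not part of the original question.

```python
# Summary statistics given in the question.
n = 10
sum_x, sum_y = 130, 220
sum_x2, sum_xy = 2288, 3467

# Closed-form solution of the two normal equations:
#   sum_y  = n * a     + b * sum_x
#   sum_xy = a * sum_x + b * sum_x2
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(f"a = {a:.3f}, b = {b:.3f}")

# Hypothetical usage: predicted supply at a price of 15.
predicted = a + b * 15
print(f"predicted Y at X = 15: {predicted:.2f}")
```

Substituting the computed `a` and `b` back into both normal equations reproduces the right-hand sides 220 and 3467 up to rounding, which confirms the solution.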
