A scatter plot, or scatter plot graph, is a graph drawn in Cartesian coordinates to visually represent the values of two variables for a set of data. It is a graphical representation that shows how one variable changes with the other. The data are presented as a collection of points. Each point has the value of one variable, called the explanatory variable, positioned on the horizontal or x-axis, and the value of the other variable, called the response variable, positioned on the vertical or y-axis. A scatter plot can therefore be defined as a way of showing the relationship between two variables by using data points on a two-dimensional graph.
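As a rough illustration of how each data point is positioned by its x value and its y value, here is a minimal Python sketch that draws a scatter plot as a grid of text characters. The function name `ascii_scatter`, the grid size and the sample values are all made up for this example; a real plot would normally be drawn with a plotting library.

```python
def ascii_scatter(points, width=20, height=10):
    """Render (x, y) pairs as a text grid: x runs left to right, y bottom to top."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when all x values (or all y values) are equal.
    x_span = (x_max - x_min) or 1
    y_span = (y_max - y_min) or 1

    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # The first text line is the top of the plot, so flip the row index.
        grid[height - 1 - row][col] = "o"
    return "\n".join("".join(line) for line in grid)


# Illustrative values only (not taken from the text): y tends to grow with x.
sample = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 8)]
plot = ascii_scatter(sample)
print(plot)
```

The smallest x and y values land in the bottom-left corner and the largest in the top-right corner, which is the same layout as the x-axis and y-axis of a drawn scatter plot.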
The importance of a scatter plot is that it helps to show how well two comparable data sets agree with each other. In that case, the identity line (the line y = x) is used as a reference against which the points are plotted. The more closely the two data sets agree, the nearer the points fall to the identity line; if the two data sets are numerically identical, the points lie exactly on the identity line.
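The idea of comparing two data sets against the identity line can be sketched in Python as follows. This is only one possible way to put a number on "closeness to the line": the helper `agreement_with_identity` and the paired readings are hypothetical and do not come from the text.

```python
def agreement_with_identity(a, b):
    """Summarize how far the paired points (a[i], b[i]) lie from the line y = x."""
    if len(a) != len(b):
        raise ValueError("both data sets must have the same length")
    # Vertical distance of each point from the identity line y = x.
    gaps = [abs(y - x) for x, y in zip(a, b)]
    return {"mean_gap": sum(gaps) / len(gaps), "max_gap": max(gaps)}


# Hypothetical paired readings of the same quantity from two sources.
first = [10.0, 12.5, 15.0, 17.5, 20.0]
second = [10.5, 12.0, 15.0, 18.0, 19.5]

identical = agreement_with_identity(first, first)  # every point is on y = x
close = agreement_with_identity(first, second)     # points lie near y = x
print(identical, close)
```

A mean gap of zero corresponds to the case where the two data sets are numerically the same and every point sits on the identity line.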
Each data point is denoted by a small circle in the graph. Suppose the values on the y-axis increase as the values on the x-axis increase. The points then cluster around an imaginary line running from the lower left of the graph to the upper right, and a positive relationship can be seen between the two variables.
Scatter Plot Data Sets
When there is a large amount of data, a scatter plot comes into use. It emphasizes the following factors:
- Strength of the relationship between the variables
- Shape formed by the points
One of the main highlights of a scatter plot is its ability to show nonlinear relationships between variables. Whenever a scatter plot shows an association between two variables, it does not necessarily follow that there is a cause-and-effect relationship, as both variables may be connected to a third variable that explains the association.
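To see why a plot can reveal a nonlinear relationship that a single number hides, here is a minimal Python sketch. It assumes the linear correlation coefficient (Pearson's r, which the text does not name) as the summary measure; the helper `pearson_r` and the data are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Illustrative data: y is completely determined by x, but along a curve.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)  # essentially zero, although the points form a clear U shape
```

Here the coefficient reports no linear association at all, while a scatter plot of the same points would show the U-shaped pattern immediately.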
Scatter Plot Graph
Let's consider the following scatter plots and see how correlation works:
Positive Scatter Plot
When the slope of the line is positive, there is a positive correlation between the two variables. When there is perfect positive correlation, every point lies exactly on the rising line, and the graph looks like this:
When there is a fairly strong positive correlation, the points lie close to the line, and the graph looks like this:
For low positive correlation, the data points are not very close to each other, but they still rise from left to right.
For no correlation, the data show no pattern and follow no particular direction.
For low negative correlation, the data are more spread out but follow a downward pattern.
For high negative correlation, the data points lie much closer together along a downward-sloping line.
Negative Scatter Plot
When the slope of the line is negative, there is a negative correlation between the two variables. When there is perfect negative correlation, every point lies exactly on the falling line, and the graph looks like this:
So a scatter plot is very helpful for judging the relationship between variables without actually calculating it.
Scatter plots may be "smoothed" by fitting a line to the data. This line attempts to show the non-random component of the association between the variables.
Smoothing may be accomplished using:
- A straight line
- A quadratic or polynomial line
- Smoothing splines, which allow greater flexibility for nonlinear associations
The curve is fitted in a way that provides the best fit, often defined as the fit that results in the smallest sum of squared errors (the least squares criterion).
Using smoothing to separate the non-random from the random variation allows one to make predictions of the response based on the value of the explanatory variable.
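The simplest of these smoothers, a straight line fitted by least squares, can be sketched in a few lines of Python. This is a minimal illustration rather than a full smoothing routine; the function names `fit_line` and `sum_squared_errors` and the data values are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared vertical distances between the points and a line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Illustrative data: roughly y = 2x with some random-looking scatter.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
# Prediction of the response for a new value of the explanatory variable.
predicted = slope * 6 + intercept
print(slope, intercept, predicted)
```

Any other straight line, for example y = 2x, gives a larger sum of squared errors on these points than the fitted line does, which is what "best fit" means under the least squares criterion.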
In short, a scatter plot can be smoothed by fitting a line to the data, which brings out the non-random part of the association between the variables. The fitted line may be a straight line, a polynomial line or a smoothing spline, chosen to give the best fit under the least squares criterion, and it helps to predict the response from the value of the explanatory variable.
A scatter plot is also a graphical way of showing the degree of linear relationship between two variables. By a linear relationship we mean that the points tend to cluster around a straight line.
The scatter plot provides a graphical display of the relationship between two variables. It is useful in the early stages of a study, when exploring data before actually computing a correlation coefficient or fitting a regression curve. For example, a scatter plot can help one to establish whether a linear regression model is suitable.
Types of Correlation In Scatter Plot
Correlation measures the strength of the linear association between the two variables shown in a scatter plot.
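The text does not say which measure it has in mind; assuming it is the usual Pearson correlation coefficient r, the value for n data points (x_i, y_i) with means x̄ and ȳ would be computed as:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The numerator is positive when x and y tend to be above or below their means together and negative when one tends to be above while the other is below; the denominator scales the result so that it always lies between -1 and 1.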
Types of Correlation in Scatter Plot Association:
- Positive correlation.
- Negative correlation.
- No correlation.
Negative Correlation in Scatter Plot Association:
If the pattern of plotted points runs from upper left to lower right in the 2D plot, then the association between the two variables is a negative correlation.
A negatively correlated scatter plot looks like the following plot.
Positive Correlation in Scatter Plot Association:
If the pattern of plotted points runs from lower left to upper right in the 2D plot, then the association between the two variables is a positive correlation.
A positively correlated scatter plot looks like the following plot.
No Correlation in Scatter Plot Association:
If the plotted points are scattered over the 2D plot without any definite form, then the association between the variables is known as no correlation.
A scatter plot with no correlation looks like the following plot.
There are four more categories of correlation in scatter plot association. They are the following:
- High Positive Correlation in scatter plot association.
- High Negative Correlation in scatter plot association.
- Low Positive Correlation in scatter plot association.
- Low Negative Correlation in scatter plot association.
In a scatter plot graph, the relationship between the variables is called correlation or, more precisely, scatter plot correlation. A correlation is well defined when the points cluster along a line. A scatter plot can show different types of correlation between the variables.
The correlation can be positive, meaning the pattern rises. If the pattern in the graph slopes from lower left to upper right, that is, along an upward-sloping line, there is a positive correlation between the variables. In simple terms, if the data form a straight line running from the lower values of x and y to the higher values of x and y, the variables have a positive correlation.
The correlation can be negative, meaning the pattern falls. If the pattern in the graph slopes from upper left to lower right, that is, along a downward-sloping line, there is a negative correlation between the variables. In simple terms, if the data form a straight line running from the higher values of y down to the higher values of x, the variables have a negative correlation.
The correlation can also be null, meaning the variables are uncorrelated, as we would not be able to find any straight line that passes close to most of the data.
The correlation is given a value according to the direction of the pattern and how closely the points follow a line.
- Perfect positive correlation is given the value 1.
- Perfect negative correlation is given the value -1.
- No correlation is given the value 0.
- As the value gets closer to 1, the positive correlation is stronger.
- As the value gets closer to -1, the negative correlation is stronger.
- As the value gets closer to 0, the correlation is weaker.
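The values listed above can be tried out with a short Python sketch. It assumes the value in question is the Pearson correlation coefficient; the helper names `pearson_r` and `describe` are made up for the example, and the 0.7 cut-off between "high" and "low" is an arbitrary illustrative choice, not something the text specifies.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, tol=1e-9):
    """Turn a correlation value into a label (the 0.7 threshold is an assumption)."""
    if abs(r) <= tol:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    if abs(r) >= 1 - tol:
        return f"perfect {direction} correlation"
    strength = "high" if abs(r) >= strong else "low"
    return f"{strength} {direction} correlation"


# Points exactly on a rising line and exactly on a falling line.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(describe(rising))   # perfect positive correlation
print(describe(falling))  # perfect negative correlation
print(describe(0.85))     # high positive correlation
print(describe(-0.3))     # low negative correlation
```

Note that the value depends on how tightly the points follow a line, not on how steep the line is: doubling every y value leaves the coefficient unchanged.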
Given below are some scatter plot word problems.
Question 1: Draw the scatter plot for the following set of points and identify the type of correlation in the scatter plot association.
| Months | Rainfall (in centimeters) |
To draw the scatter plot, the months are taken along the horizontal x-axis and the rainfall along the vertical y-axis.
Plotting the points gives a scatter plot that looks like the following plot.
Looking at the plot, the plotted points run from upper left to lower right; this kind of correlation is known as negative correlation. Therefore the association between the two variables is a negative correlation.
Question 2: Let's consider the table that gives the number of hours studied and the score obtained in a test for a particular subject:
| Time (in hours) |
Using the scatter plot data, we can plot the graph.