Correlation and Regression


Correlation and Regression AGB 111


Correlation
• Correlation is the statistical tool used to study the relationship between two variables.

Dr. R. Jayashree, Asst. Prof. (AGB), Veterinary College, Bangalore


• Uni-variate analysis: Analysis of data when only one variable is involved.
• E.g. dispersion, central tendency, skewness, kurtosis.
• Bi-variate analysis: Analysis involving two variables that are related to each other.
• In a biological experiment, it is used to measure the strength of the relationship, or to predict one variable from another related variable.
• It helps in measuring the independence of, or the relationship between, bi-variate data.


Correlation and causation
• A high degree of correlation may arise from any one or a combination of the following reasons.
1. By chance: With a small number of observations, a correlation may appear in a sample even though it does not exist in the population.
2. Influence of external factors on both variables: A high degree of correlation may be due to the same causes affecting each variable.


3. Influence of the two variables on each other (mutual influence).
4. Influence of one variable upon the other: one variable is truly independent, acting free from any external forces, and it influences the other variable, which is truly dependent because it reacts in response to the independent variable.



Mutual relationship could depend on
• Mutual dependence: supply and demand.
• Both variables being influenced by the same external factors: the effect of weather on rice and potato yield.
• Pure chance: size of shoe and degree of intelligence. This is known as spurious or nonsense correlation.



Types of correlation
1. Positive or negative correlation.
2. Simple, partial or multiple correlation.
3. Linear or non-linear correlation.



Positive or negative correlation
• It depends on the direction in which the variables move.
• When both variables move in the same direction, the correlation is positive.
• When they move in opposite directions, the correlation is negative.



[Figure: Negative correlation]


Simple, partial and multiple correlation
• Simple: Only two variables are involved.
• Partial or multiple: The relationship among more than two variables is studied.
• Multiple correlation: The relationship between one dependent variable and two or more independent variables is studied. E.g. feed intake - body weight, milk yield.
• Partial correlation: The relationship between two variables is studied while the effect of the other variables is excluded.


Linear correlation

| X | 30 | 60 | 90 | 120 | 150 | 180 |
|---|---|---|---|---|---|---|
| Y | 10 | 20 | 30 | 40 | 50 | 60 |

[Figure: Y plotted against X; X axis from 0 to 200, Y axis from 0 to 70]


Non Linear correlation

| X | 30 | 60 | 90 | 120 | 150 | 180 |
|---|---|---|---|---|---|---|
| Y | 10 | 50 | 60 | 20 | 50 | 60 |

[Figure: Y plotted against X; X axis from 0 to 200, Y axis from 0 to 70]




Methods of studying correlation
• Scatter diagram method
• Graphical method
Both of these visualize the relationship.
• Coefficient of correlation: measures the relationship.
• Scatter diagram: Plotting the two variables on a graph sheet shows the relationship between them.


Scatter Plots



• If the points are widely scattered, it indicates little or no relationship.
• If the points are closely clustered, it indicates some relationship between the two variables.



Correlation Coefficient - Karl Pearson

$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y}) / (n - 1)}{\sigma_X \, \sigma_Y}$$
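As a minimal sketch of the Karl Pearson formula, the Python snippet below computes r directly from its definition, using the sample covariance and standard deviations with n - 1 in the denominator. The function name `pearson_r` is illustrative and not part of the lecture; the data are the X and Y values from the linear and non-linear correlation examples.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: r = Cov(X, Y) / (sigma_X * sigma_Y)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample covariance and standard deviations, all with n - 1.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


x = [30, 60, 90, 120, 150, 180]
y_linear = [10, 20, 30, 40, 50, 60]
y_nonlinear = [10, 50, 60, 20, 50, 60]

r_linear = pearson_r(x, y_linear)
r_nonlinear = pearson_r(x, y_nonlinear)
print(round(r_linear, 4), round(r_nonlinear, 4))
```

For the linear data the points lie exactly on a line, so r comes out as 1 up to floating-point rounding; for the non-linear data r is positive but clearly below 1.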


Distinction between linear and nonlinear correlation
It is based on the ratio of change between the variables under study.

| X | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Y | 5 | 7 | 9 | 11 | 13 |

• For a unit change in X there is a constant change of 2 in Y: Y = 2X + 3.
• The two variables X and Y are linearly related if there exists a relationship of the form Y = a + bX.


Non linear or curvilinear
If there is no constant change in ‘Y’ for every unit change in ‘X’, the relationship is termed non linear or curvilinear.

Linear (Y = a + bX)

| X | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Y | 5 | 7 | 9 | 11 | 13 |

Non Linear

| X | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Y | 5 | 7 | 12 | 15 | 13 |
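The constant-change criterion can be checked mechanically. The sketch below is only an illustration (the helper name `has_constant_change` is hypothetical): it computes the change in Y per unit change in X between consecutive points and reports whether that rate is the same throughout. It assumes at least two points with distinct X values.

```python
def has_constant_change(xs, ys, tol=1e-9):
    """Return True if Y changes by the same amount per unit change in X."""
    points = list(zip(xs, ys))
    # Rate of change between each pair of consecutive points.
    rates = [(y2 - y1) / (x2 - x1)
             for (x1, y1), (x2, y2) in zip(points, points[1:])]
    return all(abs(rate - rates[0]) <= tol for rate in rates)


x = [1, 2, 3, 4, 5]
linear_y = [5, 7, 9, 11, 13]      # Y = 2X + 3
nonlinear_y = [5, 7, 12, 15, 13]  # changes of 2, 5, 3, -2

is_linear = has_constant_change(x, linear_y)
is_nonlinear_linear = has_constant_change(x, nonlinear_y)
print(is_linear, is_nonlinear_linear)
```

With real, noisy data an exact check like this is too strict; it only illustrates the definition used on the slide.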


Depending upon the distribution in the scatter plot
• High degree of positive correlation
• High degree of negative correlation
• Low degree of negative correlation
• Low degree of positive correlation


Standard error
It tests the reliability of the observed ‘r’.

$$SE(r) = \frac{1 - r^2}{\sqrt{n}}$$


Probable error
• Probable error (P.E.) = 0.6745 × S.E.(r)
• 0.6745: in a normal distribution, 50% of the observations lie in the range µ ± 0.6745σ.
• r ± P.E. indicates the limits within which the population correlation coefficient may be expected to lie.
• If r < P.E., then r is not significant.
• If r > 6 × P.E., then r is definitely significant.
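A short sketch of the standard error and probable error calculations follows. The values r = 0.8 and n = 25 are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are illustrative.

```python
import math


def standard_error(r, n):
    """Standard error of r: (1 - r^2) / sqrt(n)."""
    return (1 - r ** 2) / math.sqrt(n)


def probable_error(r, n):
    """Probable error of r: 0.6745 times the standard error."""
    return 0.6745 * standard_error(r, n)


r, n = 0.8, 25  # hypothetical sample correlation and sample size
se = standard_error(r, n)   # (1 - 0.64) / 5
pe = probable_error(r, n)   # 0.6745 * se
lower, upper = r - pe, r + pe  # limits r ± P.E.
print(round(se, 4), round(pe, 4), round(lower, 4), round(upper, 4))
```

Here the probable error is small relative to r, so the limits r ± P.E. form a narrow band around 0.8.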


• Sometimes P.E. may give a wrong conclusion, especially when ‘r’ is small. In such cases the significance can be tested by Student's ‘t’ test.

$$t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$$

• Tested for n-2 degrees of freedom.
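The t statistic for r can be sketched in a few lines. The inputs r = 0.8 and n = 25 are hypothetical, and the critical value 2.069 is the tabulated two-tailed 5% value for 23 degrees of freedom from a standard t-table; the function name is illustrative.

```python
import math


def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), tested on n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and +1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


r, n = 0.8, 25          # hypothetical sample correlation and sample size
t = t_statistic(r, n)
df = n - 2              # degrees of freedom
t_critical = 2.069      # t-table value: two-tailed, 5% level, 23 df
significant = abs(t) > t_critical
print(round(t, 3), df, significant)
```

For other sample sizes the critical value has to be looked up again for the corresponding degrees of freedom.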


Rank correlation
• When a statistical series is not quantitative, ‘r’ cannot be calculated by Karl Pearson's method.
• E.g. qualitative traits: honesty, beauty, intelligence, morality.
• Charles Edward Spearman introduced rank correlation: individuals are ranked on each character, and the rank correlation coefficient, rho ‘ρ’, is calculated from the ranks.


Rank correlation

$$\rho = 1 - \frac{6 \sum d^2}{n (n^2 - 1)}$$


| X | Y | d | d² |
|---|---|---|---|
| 1 | 3 | -2 | 4 |
| 2 | 4 | -2 | 4 |
| 3 | 5 | -2 | 4 |
| 4 | 2 | 2 | 4 |
| 5 | 1 | 4 | 16 |
| | | ∑d = 0 | ∑d² = 32 |

$$\rho = 1 - \frac{6 \sum d^2}{n (n^2 - 1)} = 1 - \frac{6 \times 32}{5 \times 24} = -0.6$$
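The worked example can be reproduced with a few lines of Python. This is a sketch of the rank-difference formula only; it assumes the inputs are already ranks with no ties, and the function name `spearman_rho` is illustrative.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(rank_x)
    if n != len(rank_y) or n < 2:
        raise ValueError("need two equal-length rankings with at least 2 items")
    # d is the difference between the two ranks of each individual.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


rank_x = [1, 2, 3, 4, 5]
rank_y = [3, 4, 5, 2, 1]
sum_d = sum(a - b for a, b in zip(rank_x, rank_y))          # should be 0
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))  # should be 32
rho = spearman_rho(rank_x, rank_y)
print(sum_d, sum_d2, round(rho, 2))
```

The sum of the rank differences is always zero, which is a quick check that the ranks were entered correctly.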


Coefficient of determination
• It gives the percentage of variation in the dependent variable that is accounted for by the independent variable.
• It is the ratio of explained variance to total variance.

$$r^2 = \frac{\text{explained variance}}{\text{total variance}}$$

If r = 0.8 then r² = 0.64, which indicates that 64% of the variation in the dependent variable is accounted for by the independent variable.
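To illustrate that r² is the ratio of explained to total variation, the sketch below fits a least-squares line to a small hypothetical data set and compares the two quantities. The data and the function name are assumptions made for illustration; they do not come from the lecture.

```python
import math


def explained_ratio_and_r_squared(xs, ys):
    """Return (explained variation / total variation, r squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in Y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # least-squares slope of Y on X
    a = mean_y - b * mean_x    # intercept
    # Variation of the fitted values around the mean of Y.
    explained = sum((a + b * x - mean_y) ** 2 for x in xs)
    r = sxy / math.sqrt(sxx * syy)
    return explained / syy, r ** 2


x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
ratio, r_squared = explained_ratio_and_r_squared(x, y)
print(round(ratio, 4), round(r_squared, 4))
```

Both numbers agree (0.6 for this data set), so 60% of the variation in Y is accounted for by the fitted line.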


Regression
• The regression coefficient gives the amount of change in the dependent variable for every unit change in the independent variable.
• It ranges from -∞ to +∞.
• It is expressed in units of the dependent variable per unit of the independent variable.



Regression equation

b_xy (X is dependent and Y is independent):

$$b_{xy} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{(n - 1)\, \sigma_y^2}$$

b_yx (Y is dependent and X is independent):

$$b_{yx} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{(n - 1)\, \sigma_x^2}$$


Regression equation

$$a = \bar{Y} - b\bar{X}$$

If a = 2.5 and b = 0.46, then

Y = 2.5 + 0.46X

The regression equation may then be used to estimate the value of Y when a value of X is known.
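The following sketch shows one way to obtain a and b for the regression of Y on X and then use the equation for prediction. It uses the usual computational form of the least-squares slope, b = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n], which is an assumption about the form intended here; the function names are illustrative. The fitted data are the X and Y values from the linear example (Y = 2X + 3).

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = a + b*X; returns (a, b)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n  # a = mean(Y) - b * mean(X)
    return a, b


def predict(a, b, x):
    """Estimate Y for a known X from the regression equation."""
    return a + b * x


a, b = fit_line([1, 2, 3, 4, 5], [5, 7, 9, 11, 13])
print(a, b)  # recovers the intercept 3 and slope 2 of Y = 2X + 3

# Using the coefficients quoted on the slide, for a hypothetical X = 10:
y_hat = predict(2.5, 0.46, 10)
print(round(y_hat, 2))
```

With a = 2.5 and b = 0.46, a known X of 10 gives an estimated Y of 2.5 + 4.6 = 7.1.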


| Correlation | Regression |
|---|---|
| Means the relationship between two variables. | The regression coefficient gives the amount of change in the dependent variable for every unit change in the independent variable. |
| Measures the direction and degree of the relationship. | Establishes a functional relationship and uses it to predict the dependent variable for any given value of the independent variable. |
| Need not imply cause and effect. | Implies a cause and effect relationship. |
| The correlation coefficient is a relative measure and ranges between -1 and +1. | Regression coefficients are absolute measures and range from -∞ to +∞. |
| There can be nonsense correlation. | There is no nonsense regression. |
| Limited application. | Wider application. |



$$r = b_{yx} \frac{S_x}{S_y} \qquad r = b_{xy} \frac{S_y}{S_x}$$

$$b_{yx} = r \frac{S_y}{S_x} \qquad b_{xy} = r \frac{S_x}{S_y}$$
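These relations between r and the two regression coefficients can be verified numerically. The sketch below uses a small hypothetical data set and illustrative names; it also checks the consequence r² = b_yx × b_xy, which follows from multiplying the two expressions for r.

```python
import math


def regression_and_correlation(xs, ys):
    """Return (b_yx, b_xy, r, s_x, s_y) for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_x = math.sqrt(sxx / (n - 1))  # sample standard deviation of X
    s_y = math.sqrt(syy / (n - 1))  # sample standard deviation of Y
    b_yx = sxy / sxx                # regression of Y on X
    b_xy = sxy / syy                # regression of X on Y
    r = sxy / math.sqrt(sxx * syy)
    return b_yx, b_xy, r, s_x, s_y


x = [1, 2, 3, 4, 5]  # hypothetical data
y = [2, 4, 5, 4, 5]
b_yx, b_xy, r, s_x, s_y = regression_and_correlation(x, y)
print(round(b_yx, 4), round(b_xy, 4), round(r, 4))
print(round(b_yx * s_x / s_y, 4), round(b_xy * s_y / s_x, 4))
```

Both expressions on the second line reproduce r, and the product of the two regression coefficients equals r².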


