Contents
• Tests of significance
• Stages in performing test of significance
• Types of error
• Test of significance for large samples
• Test of significance for small samples
• Chi square test
• ANOVA
• Bibliography
www.indiandentalacademy.com

Tests of significance

Tests of significance
• Whatever the sampling procedure or the care taken while selecting a sample, the sample statistics will differ from the population parameters
• Variations between two samples drawn from the same population may also occur
• i.e. two research workers may observe different results for the same investigation

Tests of significance
• Thus it becomes important to find out the significance of this observed variation
• i.e. whether it is due to
  – chance or biological variation (statistically not significant), or
  – the influence of some external factor (statistically significant)

Tests of significance
• To test whether the observed variation is significant, tests of significance are carried out. Tests of significance can be broadly classified as
1. Parametric tests
2. Non parametric tests

Parametric tests
• Parametric tests are those in which certain assumptions are made about the population
  – The population from which the sample is drawn has a normal distribution
  – The variances of the samples do not differ significantly
  – The observations are truly numerical, so arithmetic procedures such as addition, division, and multiplication can be used

Parametric tests
• Since these tests make assumptions about the population parameters, they are called parametric tests
• They are usually used to test the difference between means
• They are:
  – Student t test (paired or unpaired)
  – ANOVA
  – Test of significance between two means

Non parametric tests
• In many biological investigations the research worker may not know the nature of the distribution or other required values of the population
• Also, some biological measurements may not be true numerical values, so arithmetic procedures are not possible in such cases

Non parametric tests
• In such cases distribution free or non parametric tests are used, in which no assumptions are made about the population parameters, e.g.
  – Mann Whitney test
  – Chi square test
  – Phi coefficient test
  – Fisher's Exact test
  – Sign test
  – Friedman's test

One tailed and two tailed test
• Tests of significance can also be divided into one tailed and two tailed tests

One tailed and two tailed test
• Two tailed test
• This test determines whether there is a difference between the two groups without specifying whether the difference is higher or lower
• It includes both ends or tails of the normal distribution
• Such a test is called a two tailed test
• E.g. when one wants to know whether the mean IQ of malnourished children differs from that of well nourished children, without specifying whether it is more or less

One tailed and two tailed test
• One tailed test
• Used when one wants to know specifically whether the difference between the two groups is higher or lower, i.e. the direction (plus or minus side) is specified
• One end or tail of the distribution is then excluded
• E.g. if one wants to know whether malnourished children have a lower mean IQ than well nourished children, the higher side of the distribution is excluded
• Such a test of significance is called a one tailed test

Stages in performing test of significance

Stages in performing test of significance
• State the null hypothesis
• State the alternative hypothesis
• Accept or reject the null hypothesis
• Finally determine the p value

Stages in performing test of significance
State the null hypothesis
• Null hypothesis
• It is a hypothesis of no difference between the statistic of a sample and the parameter of the population, or between the statistics of two samples
• It nullifies the claim that the experimental result is different from or better than the one observed already

Stages in performing test of significance
State the alternative hypothesis
• Alternative hypothesis
• It is a hypothesis stating that the sample result is different, i.e. larger or smaller than the population value, or that the statistic of one sample is different from that of the other

Stages in performing test of significance
Accept or reject the null hypothesis
• The null hypothesis is accepted or rejected depending on whether the result falls in the zone of acceptance or the zone of rejection
• If the result of a sample falls within the area of mean ± 2 SE, the null hypothesis is accepted
• This area of the normal curve is called the zone of acceptance for the null hypothesis

Stages in performing test of significance
• If the result of a sample falls beyond the area of mean ± 2 SE, the null hypothesis of no difference is rejected and the alternative hypothesis is accepted
• This area of the normal curve is called the zone of rejection for the null hypothesis

Stages in performing test of significance
Finally determine the p value
• The p value is determined using the appropriate test of significance
• If p > 0.05 the difference is attributed to chance and is not statistically significant
• If p < 0.05 the difference is statistically significant and is attributed to some external factor

Types of error

Types of error • While drawing conclusions in a study we are likely to commit two types of error. – Type I error – Type II error

Types of error
• Type I error
• This type of error occurs when we conclude that the difference is significant when in fact there is no real difference in the population, i.e. we reject the null hypothesis when it is true
• Its probability is denoted by α

Types of error
• Type II error
• This type of error occurs when we say that the difference is not significant when in fact there is a real difference between the populations, i.e. the null hypothesis is not rejected when it is actually false
• Its probability is denoted by β

Tests of significance for large samples

Test of significance for large samples
• These tests are used for sample sizes greater than 30
• The test used is the Z test
• Z is the standard normal deviate and has been discussed under normal distribution
$Z = \frac{\text{observation} - \text{mean}}{SD}$

Test of significance for large samples
• However, in the Z test the standard deviation is replaced by the standard error
$Z = \frac{\text{observed difference}}{\text{standard error}}$
• The standard deviation measures the variation within a sample
• The standard error measures the difference in values occurring
  – between a sample and the population
  – between two samples of the same population

Test of significance for large samples • Standard error used in Z test can be – Standard error of mean – Standard error of proportion – Standard error of difference between 2 means – Standard error of difference between 2 proportions

Test of significance for large samples
• If in the Z test Z > 2, i.e. the observed difference between the 2 means or proportions is greater than 2 times the standard error of the difference, then p < 0.05 according to the given table

Z    1.6    2.0     2.3     2.6
p    0.1    0.05    0.02    0.01

• Thus the difference is unlikely to be due to chance and may be due to the influence of some external factor, i.e. the difference is statistically significant
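The Z to p correspondence in the table can be checked numerically. The sketch below assumes a two tailed test on the standard normal distribution and uses only Python's standard library; the function name `two_tailed_p` is illustrative and does not come from the slides.

```python
import math

def two_tailed_p(z):
    """Two tailed probability of a standard normal deviate at least as extreme as z."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal
    return math.erfc(abs(z) / math.sqrt(2))

# Z values taken from the table on the slide
for z in (1.6, 2.0, 2.3, 2.6):
    print(f"Z = {z}: p is about {two_tailed_p(z):.3f}")
```

The computed values come out close to the rounded figures in the table, which is why Z > 2 is used as a convenient cut-off for p < 0.05.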

Standard error of mean
• Used for quantitative data
• The standard error of the mean measures the difference between the sample mean and the population mean, and is given by
$SE_{\bar{x}} = \frac{SD \text{ of sample}}{\sqrt{n}}$
• The population mean will lie within sample mean ± 2 standard errors of the mean

Standard error of mean
• This enables us to know whether the sample mean is within the limits of the population mean
$Z = \frac{\text{sample mean} - \text{population mean}}{SE \text{ of mean}}$

Standard error of mean
• In a random sample of 100 the mean blood sugar is 80 mg% with SD 6 mg%. Within what limits will the population mean lie? What can be said about another sample whose mean is 82 mg%?
$SE = \frac{6}{\sqrt{100}} = \frac{6}{10} = 0.6$
• Thus the population mean will be 80 ± 2 × 0.6 = 78.8 to 81.2
• A sample with a mean of 82 mg% is not within the limits of the population mean, thus it does not seem to be drawn from the same population
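The blood sugar example can be reproduced with a few lines of Python. This is a minimal sketch using the figures from the slide; the helper name `standard_error_of_mean` is an assumption made for illustration.

```python
import math

def standard_error_of_mean(sd, n):
    """Standard error of the mean: SD of the sample divided by sqrt(n)."""
    return sd / math.sqrt(n)

sample_mean, sd, n = 80, 6, 100            # values from the slide (mg%)
se = standard_error_of_mean(sd, n)

# Limits for the population mean: sample mean +/- 2 SE
lower, upper = sample_mean - 2 * se, sample_mean + 2 * se

other_sample_mean = 82
within_limits = lower <= other_sample_mean <= upper

print(f"SE = {se:.1f}, limits = {lower:.1f} to {upper:.1f}")
print("Second sample within limits:", within_limits)
```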

Standard error of difference between 2 means
• Used for quantitative data
• It measures the difference between the means of two samples drawn from the same population
• It helps to assess the significance of the difference obtained by 2 research workers for the same investigation
$SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}$

Standard error of difference between 2 means
• E.g. find the significance of the difference in mean heights of 50 girls and 50 boys with the following values

         Mean     SD
Girls    147.4    6.6
Boys     151.6    6.3

Standard error of difference between 2 means
$SE = \sqrt{\frac{(6.6)^2}{50} + \frac{(6.3)^2}{50}} = 1.29$
$Z = \frac{\text{observed difference}}{SE} = \frac{151.6 - 147.4}{1.29} = 3.26$

Standard error of difference between 2 means
• Since the Z value is more than 2, p will be less than 0.05
• Thus the difference is statistically significant and it can be concluded that boys are taller than girls
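The height comparison can be sketched in Python as follows. The function name `se_diff_means` is hypothetical; the numbers are those given on the slides.

```python
import math

def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference between two means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Heights of 50 girls and 50 boys (mean, SD) from the slide
girls_mean, girls_sd, girls_n = 147.4, 6.6, 50
boys_mean, boys_sd, boys_n = 151.6, 6.3, 50

se = se_diff_means(girls_sd, girls_n, boys_sd, boys_n)
z = (boys_mean - girls_mean) / se

print(f"SE = {se:.2f}, Z = {z:.2f}")
print("Significant at the 5% level (Z > 2):", z > 2)
```

Without intermediate rounding Z comes out marginally below the hand-computed 3.26, which does not change the conclusion.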

Standard error of proportion
• In case of qualitative data, where the character remains the same but its frequency varies, we express it as a proportion instead of a mean
• p is the proportion of individuals having the special character
• q is the proportion of individuals not having the character
• p + q = 1, or 100 if expressed as a percentage

Standard error of proportion
• The standard error of proportion is the unit which measures variation in the proportion of a character from sample to population
$SE \text{ of proportion} = \sqrt{\frac{p \times q}{n}}$
p = proportion of positive character
q = proportion of negative character
n = sample size
• Also, proportion of population = proportion of sample ± 2 SEP
• Thus one can determine whether the proportion of the sample is within the limits of the population proportion

Standard error of proportion
• The proportion of blood group B among Indians is 30%. If in a sample of 100 individuals it is 25%, what is your conclusion about the group?
$SEP = \sqrt{\frac{p \times q}{n}} = \sqrt{\frac{25 \times 75}{100}} = 4.33$
$Z = \frac{\text{observed difference}}{SE} = \frac{30 - 25}{4.33} = 1.15$
• Since Z is < 2, p will be more than 0.05, thus the difference is not significant
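A short Python sketch of the blood group example, working in percentages as the slide does. The function name `se_proportion` is an illustrative assumption.

```python
import math

def se_proportion(p, q, n):
    """Standard error of a proportion; p and q in percent, n the sample size."""
    return math.sqrt(p * q / n)

population_p = 30        # percent with blood group B in the population
sample_p, n = 25, 100    # percent observed in the sample, and sample size

sep = se_proportion(sample_p, 100 - sample_p, n)
z = (population_p - sample_p) / sep

print(f"SEP = {sep:.2f}, Z = {z:.2f}")
print("Significant at the 5% level (Z > 2):", z > 2)
```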

Standard error of difference between 2 proportions
• Measures the difference in proportion of a character from sample to sample
$SE(p_1 - p_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$

Standard error of difference between 2 proportions
• If typhoid mortality in one sample of 100 is 20% and in another sample of 100 is 30%, is this difference in mortality rate significant?
• p1 = 20 : q1 = 80 : n1 = 100
• p2 = 30 : q2 = 70 : n2 = 100
• $SE(p_1 - p_2) = \sqrt{\frac{20 \times 80}{100} + \frac{30 \times 70}{100}} = 6.08$
• $Z = \frac{30 - 20}{6.08} = 1.64$
• Z < 2, so p > 0.05; thus the difference observed is not significant
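The typhoid mortality comparison can be checked in Python. This sketch assumes a two tailed normal approximation for the p value; `se_diff_proportions` is a hypothetical helper name.

```python
import math

def se_diff_proportions(p1, n1, p2, n2):
    """Standard error of the difference between two proportions (in percent)."""
    return math.sqrt(p1 * (100 - p1) / n1 + p2 * (100 - p2) / n2)

p1, n1 = 20, 100   # mortality (%) and size of the first sample
p2, n2 = 30, 100   # mortality (%) and size of the second sample

se = se_diff_proportions(p1, n1, p2, n2)
z = (p2 - p1) / se
p_value = math.erfc(abs(z) / math.sqrt(2))   # two tailed normal probability

print(f"SE = {se:.2f}, Z = {z:.2f}, p is about {p_value:.2f}")
```

Because Z is below 2, the p value is above 0.05 and the difference is not significant.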

Test of significance for small samples

Test of significance for small samples
• In case of samples of less than 30 the Z value will not follow the normal distribution
• Hence the Z test will not give the correct level of significance
• In such cases Student's t test is used
• It was given by W. S. Gosset, whose pen name was Student

• There are two types of Student t test
1. Unpaired t test
2. Paired t test

Unpaired t test
• Applied to unpaired data of observations made on individuals of 2 separate groups, to find the significance of the difference between 2 means
• Sample size is less than 30
• E.g. difference in accuracy of an impression using two different impression materials

Unpaired t test
Steps in unpaired t test are
• Calculate the means of the two samples
• Calculate the combined standard deviation
• Calculate the standard error of the difference between the means, which is given by
  $SE = SD \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
• Calculate the observed difference between means, $\bar{X}_1 - \bar{X}_2$
• Calculate the t value
  $t = \frac{\text{observed difference}}{SE}$

Unpaired t test
• Determine the degrees of freedom, which for one sample is one less than the number of observations (n – 1)
• Here the combined degrees of freedom will be (n1 – 1) + (n2 – 1)
• Refer to the t table and find the probability of the t value corresponding to the degrees of freedom
• P < 0.05 states the difference is significant
• P > 0.05 states the difference is not significant

Unpaired t test
• In a nutritional study 13 children in group A are given the usual diet along with vitamin A and vitamin D, while 12 children in group B take the usual diet
• The gain in weight in pounds for both groups after 12 months is shown in the table
• Are vitamins A and D responsible for the gain in weight?

Gain in weight (pounds)

Group A    Group B
5          1
3          3
4          2
3          4
2          2
6          1
3          3
2          4
3          3
6          2
7          2
5          3
3

Unpaired t test
• Mean of group A = 4
• Mean of group B = 2.5
• Combined SD = 1.37
• SE = 0.548
• $t = \frac{\text{observed difference}}{SE} = \frac{4 - 2.5}{0.548} = 2.74$

Unpaired t test
• Combined degrees of freedom = n1 + n2 – 2 = 13 + 12 – 2 = 23
• The p value corresponding to the t value at 23 d.f. is checked from the t table
• It is < 0.02
• Thus the difference is statistically significant
• and is attributed to the role of vitamins A and D
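The unpaired t test steps can be sketched in Python with the standard library. The assignment of values to groups assumes the weight gain table is read row by row, alternating Group A and Group B, which reproduces the stated means of 4 and 2.5; the function name `pooled_sd` is illustrative.

```python
import math
from statistics import mean

# Weight gains in pounds (assumed row-wise reading of the table)
group_a = [5, 3, 4, 3, 2, 6, 3, 2, 3, 6, 7, 5, 3]   # usual diet + vitamins A and D
group_b = [1, 3, 2, 4, 2, 1, 3, 4, 3, 2, 2, 3]      # usual diet only

def pooled_sd(x, y):
    """Combined standard deviation of two samples."""
    mx, my = mean(x), mean(y)
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    return math.sqrt(ss / (len(x) + len(y) - 2))

sd = pooled_sd(group_a, group_b)
se = sd * math.sqrt(1 / len(group_a) + 1 / len(group_b))
t = (mean(group_a) - mean(group_b)) / se
df = len(group_a) + len(group_b) - 2

print(f"Combined SD = {sd:.2f}, SE = {se:.3f}, t = {t:.2f}, d.f. = {df}")
```

The p value is then read from a t table at the computed degrees of freedom, as described in the steps.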

Paired t test
• It is applied to paired data of observations from one sample only
• Used for samples of less than 30
• Each individual gives a pair of observations, e.g. an observation before and after taking a drug
• The steps involved are

Paired t test
• Calculate the difference in each pair of observations, i.e. before and after: x1 – x2 = y
• Calculate the mean of these differences, $\bar{y}$
• Calculate the SD of the differences
• Calculate $SE = \frac{SD}{\sqrt{n}}$
• Determine $t = \frac{\bar{y}}{SE}$

Paired t test
• Determine the degrees of freedom
• Since there is one sample, d.f. = n – 1
• Refer to the t table and find the probability of the t value corresponding to the degrees of freedom
• P < 0.05 states the difference is significant
• P > 0.05 states the difference is not significant

Paired t test
• E.g. systolic BP of normal individuals before and after injection of a hypotensive drug is given in the table
• Does the drug lower the BP?

BP before giving drug (X1)    BP after giving drug (X2)    Difference X1 – X2 = y
122                           120                          2
121                           118                          3
120                           115                          5
115                           110                          5
126                           122                          4
130                           130                          0
120                           116                          4
125                           124                          1
128                           125                          3

Paired t test
• Mean of differences: $\bar{y} = \frac{\sum y}{n} = \frac{27}{9} = 3$
• $SD = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}} = 1.73$
• $SE = \frac{SD}{\sqrt{n}} = \frac{1.73}{\sqrt{9}} = 0.58$
• $t = \frac{\bar{y}}{SE} = \frac{3}{0.58} = 5.17$
• Degrees of freedom = n – 1 = 9 – 1 = 8

Paired t test • p value corresponding to t = 5.17 and d.f. 8 is < 0.001 • Thus highly significant • Thus decrease in BP is due to the Drug
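The paired t test on the blood pressure data can be sketched in Python using the standard library. The variable names are illustrative; the readings are those in the table.

```python
import math
from statistics import mean, stdev

before = [122, 121, 120, 115, 126, 130, 120, 125, 128]   # systolic BP before drug
after = [120, 118, 115, 110, 122, 130, 116, 124, 125]    # systolic BP after drug

# Difference y = X1 - X2 for each pair
diffs = [b - a for b, a in zip(before, after)]

mean_diff = mean(diffs)
sd = stdev(diffs)                   # sample SD, divisor n - 1
se = sd / math.sqrt(len(diffs))
t = mean_diff / se
df = len(diffs) - 1

print(f"mean difference = {mean_diff}, SD = {sd:.2f}, SE = {se:.2f}, t = {t:.2f}, d.f. = {df}")
```

Carrying full precision gives t of about 5.20 rather than the hand-computed 5.17, because the slide rounds SE to 0.58 before dividing; the conclusion is unchanged.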

Chi square test

Chi square test
• The Chi square test, unlike the Z and t tests, is a non parametric test
• The test involves calculation of a quantity called Chi square
• Chi square is denoted by $\chi^2$
• It was developed by Karl Pearson

Chi square test
• The most important applications of the Chi square test in medical statistics are
  – Test of proportion
  – Test of association
  – Test of goodness of fit

Chi square test
• Test of proportion
  – Used as an alternative test to find the significance of the difference between 2 or more proportions
• Test of association
  – To measure the probability of association between 2 discrete attributes, e.g. smoking and cancer
• Test of goodness of fit
  – Tests whether the observed values of a character differ from the expected values by chance or due to the play of some external factor

Chi square test
$\chi^2 = \sum \frac{(O - E)^2}{E}$
• $\chi^2$ denotes Chi square
• O = Observed value
• E = Expected value

Chi square test
Steps in Chi square test
• State the null hypothesis
• Determine the Chi square value
• Find the degrees of freedom
• Refer to the Chi square table to find the probability value corresponding to the degrees of freedom

Chi square test
• Let us consider the following example
• We are making a field trial of 2 vaccines
• The results of the field trial are

Vaccine    Attacked    Not attacked    Total    Attack rate
A          22          68              90       24.4%
B          14          72              86       16.3%
Total      36          140             176

Chi square test
• Vaccine B seems to be superior to vaccine A
• We perform the Chi square test to verify whether vaccine B is superior to vaccine A or whether the difference is merely due to chance
• State the null hypothesis
• It states that the vaccines have equal efficacy

Chi square test
Determining the Chi square value
• Find the total attack and non attack rates
• Total attack rate = 36 / 176 = 0.204
• Total non attack rate = 140 / 176 = 0.795

Chi square test

Vaccine     Attacked                    Not attacked
A (n=90)    O = 22                      O = 68
            E = 0.204 × 90 = 18.36      E = 0.795 × 90 = 71.55
            O – E = +3.64               O – E = –3.55
B (n=86)    O = 14                      O = 72
            E = 0.204 × 86 = 17.54      E = 0.795 × 86 = 68.37
            O – E = –3.54               O – E = +3.63

Chi square test
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(3.64)^2}{18.36} + \frac{(3.55)^2}{71.55} + \frac{(3.54)^2}{17.54} + \frac{(3.63)^2}{68.37}$
= 0.72 + 0.18 + 0.71 + 0.19 = 1.80

Chi square test
• Find the degrees of freedom = (c – 1)(r – 1)
• c = number of columns
• r = number of rows
• d.f. = (2 – 1)(2 – 1) = 1

Chi square test • Find the p value • On referring to Chi square table with one degree of freedom the p value was more than 0.05. • Hence the difference is not statistically significant and the null hypothesis of no difference between vaccines is accepted.
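The vaccine example can be worked through in Python as a minimal sketch. Expected values are computed from the exact row and column totals rather than the rounded rates, so the result differs slightly in the second decimal from a hand calculation. The p value shortcut via `math.erfc` is an assumption that holds only for one degree of freedom.

```python
import math

# Observed counts: rows are vaccines A and B, columns are attacked / not attacked
observed = [[22, 68],
            [14, 72]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected count
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For 1 degree of freedom only, chi square is the square of a standard normal deviate
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"Chi square = {chi_square:.2f}, d.f. = {df}, p is about {p_value:.2f}")
```

With p above 0.05, the null hypothesis of equal efficacy is not rejected.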

ANOVA

ANOVA
Analysis of variance
• Investigations may not always be confined to comparison of 2 samples only
• E.g. we might like to compare the difference in vertical dimension obtained using 3 or more methods such as phonetics, swallowing, and Niswonger's method
• In such cases, where more than 2 samples are used, ANOVA can be used

ANOVA
• Also, when measurements are influenced by several factors playing their role, e.g. factors affecting retention of a denture, ANOVA can be used
• ANOVA helps to decide which factors are more important
• Requirements
  – Data for each group are assumed to be independent and normally distributed
  – Sampling should be at random

ANOVA
• One way ANOVA
  – Where only one factor affects the result between groups
• Two way ANOVA
  – Where 2 factors affect the result or outcome
• Multi way ANOVA
  – Where three or more factors affect the result or outcome between groups

ANOVA
F test
$F = \frac{\text{Mean square between samples}}{\text{Mean square within samples}}$
• F = variance ratio
• The mean squares are entered in the analysis of variance table once the sums of squares and degrees of freedom have been calculated (mean square = sum of squares / degrees of freedom)

ANOVA
• Mean square between samples
  – It measures the variation of the sample means of all groups involved in the study (A, B, C etc.) about the overall mean
• Mean square within samples
  – It measures the variation of the individual observations within each sample about their own sample mean
• The greater the mean square between samples relative to the mean square within samples, the greater the difference between the samples

ANOVA
• The F value observed in the study is compared with the theoretical F values obtained from the F table at the 5% and 1% levels of significance
• The results are then interpreted
• If the observed value is more than the theoretical value at 1%, the difference is highly significant
• If the observed value is less than the theoretical value at 5%, it is not significant
• If the observed value lies between the theoretical values at 5% and 1%, it is statistically significant

Bibliography
• Mahajan BK: Methods in Biostatistics. 6th edition
• Park: Textbook of Social and Preventive Medicine. 18th edition
• Smith F Gao, Smith J.E.: Clinical Research
• Rao Sunder, Richard: An Introduction to Biostatistics. 3rd edition
• Baride, Kularni, Muzumdar: Manual of Biostatistics. 1st edition
• Soben Peter: Textbook of Preventive and Community Dentistry. 2nd edition

Thank You

ANOVA
Steps in ANOVA
• State the null hypothesis
• Calculate sum of squares for individuals
• Calculate correction term
• Calculate total sum of squares
• Calculate sum of squares between groups
• Calculate the sum of squares within groups
• Calculate the degrees of freedom
• Obtain the mean squares for the ANOVA table
• Calculate F value
• Comparison with theoretical value

ANOVA
• In a study, children were divided into 3 groups and fed different diets
• The hemoglobin concentration was measured after a month and recorded
• By applying ANOVA we can find whether the groups differ significantly

Group 1          Group 2          Group 3
11.6             11.2             9.8
10.3             8.9              9.7
10.0             9.2              11.5
11.5             8.8              11.6
11.8             8.4              10.8
11.8             9.1              9.1
12.1             6.3              10.5
10.8             9.3              10.0
11.9             7.8              12.4
10.7             8.8              10.7
11.5             10.0
                 9.7
n = 11           n = 12           n = 10
Total = 124.0    Total = 107.5    Total = 106.1
Mean = 11.27     Mean = 8.96      Mean = 10.61

ANOVA
• State the null hypothesis
  – The three groups are from the same population and do not show any difference
• Calculate sum of squares for individuals
  $\sum x^2 = 11.6^2 + 10.3^2 + \dots + 10.7^2 = 3516.32$
• Calculate correction term
  $\frac{(\sum x)^2}{n} = \frac{(337.6)^2}{33} = 3453.75$

ANOVA
• Calculate total sum of squares
  = Sum of squares for individuals – Correction term = 3516.32 – 3453.75 = 62.57
• Calculate sum of squares between groups, where $t_i$ is the total and $n_i$ the size of group i
  $= \frac{t_1^2}{n_1} + \frac{t_2^2}{n_2} + \frac{t_3^2}{n_3} - \text{Correction term}$

ANOVA
  $= \frac{124.0^2}{11} + \frac{107.5^2}{12} + \frac{106.1^2}{10} - 3453.75 = 32.81$
• Calculate the sum of squares within groups
  = Total sum of squares – Sum of squares between groups = 62.57 – 32.81 = 29.76
• Calculate the degrees of freedom
  – It is one less than the number of items

ANOVA
  – Degrees of freedom for total sum of squares = 33 – 1 = 32
  – Degrees of freedom for sum of squares between groups = 3 – 1 = 2
  – Degrees of freedom for sum of squares within groups = 32 – 2 = 30
• Obtain the mean squares for the ANOVA table from the given sums of squares and degrees of freedom

ANOVA
From the table we get
  – Mean square between groups = 16.405
  – Mean square within groups = 0.992
• Calculate F value, i.e. variance ratio
  $F = \frac{\text{Mean square between samples}}{\text{Mean square within samples}} = \frac{16.405}{0.992} = 16.54$

ANOVA
• Comparison with theoretical value
  – The theoretical value of F at 1% and 5% for the given degrees of freedom is obtained from the F table and compared with the F value obtained
  – In the example discussed the degrees of freedom are n1 = 2 and n2 = 30
  – The F value at 5% is 3.32 and at 1% is 5.39
  – The observed value 16.54 is greater than the 1% value. Therefore we can say that the difference between groups is significant at the 1% level
  – Thus the probability of observing a variance ratio this large by chance is less than 0.01
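The one way ANOVA steps for the hemoglobin example can be sketched in Python. The grouping of values assumes the data table is read row by row across Group 1, Group 2 and Group 3, which reproduces the stated totals of 124.0, 107.5 and 106.1. The critical F values are copied from the slide rather than computed; variable names are illustrative.

```python
groups = {
    "Group 1": [11.6, 10.3, 10.0, 11.5, 11.8, 11.8, 12.1, 10.8, 11.9, 10.7, 11.5],
    "Group 2": [11.2, 8.9, 9.2, 8.8, 8.4, 9.1, 6.3, 9.3, 7.8, 8.8, 10.0, 9.7],
    "Group 3": [9.8, 9.7, 11.5, 11.6, 10.8, 9.1, 10.5, 10.0, 12.4, 10.7],
}

all_values = [x for values in groups.values() for x in values]
n_total = len(all_values)

# Correction term: (sum of all observations)^2 / n
correction = sum(all_values) ** 2 / n_total

ss_total = sum(x * x for x in all_values) - correction
ss_between = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
ss_within = ss_total - ss_between

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within

# Theoretical F values for (2, 30) d.f., as quoted on the slide
f_5_percent, f_1_percent = 3.32, 5.39

print(f"SS total = {ss_total:.2f}, between = {ss_between:.2f}, within = {ss_within:.2f}")
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) d.f.")
print("Significant at the 1% level:", f_value > f_1_percent)
```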