Statistical Procedures

Because of the variety of situations in which we wish to investigate differences, many statistical procedures have been developed. One of these is the z-test, or critical ratio. It is based on the z-score and the normal curve and can be used when you have a relatively large sample. A similar test, the t-test, is used when the sample is smaller and probably differs slightly from a normal distribution. In addition to these, there are numerous other tests in common usage. Although many of these statistical procedures appear frequently in physical education journals and research literature, a detailed explanation of the computational procedures and theory behind each is beyond the scope of this text because some of these procedures can bring about severe intellectual brain damage…mental dwarfism even. You don't want to try and run like a Porsche if God gave you a VW for a body…do you? On that same line of thinking, you don't want to try and function like a central processing unit if God gave you a hand calculator for a brain. Now don't get me wrong, you may be one of those central processing units; in general though, most of us are just hand calculators. This text is for us hand calculator people. If you are a central processing unit person, you might want to check out John Cotton and Benton Underwood's Advanced Statistics book. Since the t-test is one of the most commonly used of these techniques, we will use it as an illustrative example. Each of the other techniques is appropriate for specific situations, but the basic theory and interpretation are similar in all cases. It should be noted here that the following explanation of the t-test has been somewhat simplified for easier understanding. You know, for us hand calculator people. Computationally, it is correct, but some of the finer points of the theory have been glossed over.

Statistical Decisions

Statistical decisions, based upon evidence observed in samples, always involve the possibility of error. Statisticians do not deal with decisions based upon certainty. They merely estimate the probability or improbability of occurrence of events. Rejection of a null hypothesis when it is really true is known as a Type I error. The level of significance selected determines the probability of a Type I error. For example, when the researcher rejects a null hypothesis at the .05 level, he or she is running a 5 percent risk of rejecting a sampling error explanation that is actually true. Accepting a null hypothesis when it is really false is known as a Type II error. This decision errs in accepting a sampling error explanation when it is actually false. Setting the level of significance as strictly as the .01 level minimizes the risk of a Type I error, but this level is so cautious that it increases the risk of a Type II error. Students have complained that the statement of a null hypothesis sounds like double talk. They are understandably puzzled about the reasons for the negative statement that the researcher attempts to reject. The explanation is somewhat involved, but the logic is sound. Stick with me here. Verification of one consequence of a positive hypothesis does not prove it to be true. As just mentioned, when the null hypothesis is true you will still get a false positive about 5% of the time if you test at the .05 level of significance. Observed consequences that may be consistent with a positive hypothesis may also be compatible with equally plausible but competing hypotheses. Verifying a positive hypothesis provides a rather inconclusive test. Rejecting a null or negative hypothesis provides a stronger test of logic. Evidence that is inconsistent with a particular hypothesis provides a strong basis for its rejection. Before a court of law a defendant is assumed to be not guilty until the not-guilty assumption is discredited or rejected. In a way, the not-guilty assumption is comparable to the null hypothesis…there is no significant difference until proven otherwise. Do you see the similarity?
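To make that 5 percent risk a little more concrete, here is a minimal Python sketch that is not part of the original text. It assumes a simplified situation: both samples are drawn from the very same normal population (so the null hypothesis is true by construction), the population standard deviation is known, and a z-test is used. The population mean of 50, standard deviation of 10, sample size of 30, and all function names are made up for illustration; 1.96 is the standard two-tailed normal-curve cutoff for the .05 level.

```python
import math
import random


def z_for_difference(sample1, sample2, sigma):
    """z statistic for the difference between two sample means, known sigma."""
    n1, n2 = len(sample1), len(sample2)
    mean_difference = sum(sample1) / n1 - sum(sample2) / n2
    standard_error = sigma * math.sqrt(1 / n1 + 1 / n2)
    return mean_difference / standard_error


def type_one_error_rate(trials=2000, n=30, sigma=10.0, cutoff=1.96, seed=1):
    """Share of trials in which a TRUE null hypothesis gets rejected."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rejections = 0
    for _ in range(trials):
        # Both groups come from the same population: any difference is sampling error.
        group_a = [rng.gauss(50.0, sigma) for _ in range(n)]
        group_b = [rng.gauss(50.0, sigma) for _ in range(n)]
        if abs(z_for_difference(group_a, group_b, sigma)) > cutoff:
            rejections += 1  # a Type I error
    return rejections / trials


rate = type_one_error_rate()
print(f"Share of true null hypotheses rejected: {rate:.3f}")
```

If the reasoning above holds, the printed share should land somewhere near .05. Tightening the cutoff to the .01 level (2.576 for a two-tailed z-test) would push it down, at the price of a greater Type II risk.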

Computing a t-test for the Difference between Two Means for Independent Samples

This is probably the most extensively used statistical test of all time, and certainly the most widely known. It is simple, straightforward, easy to use, and adaptable to a broad range of situations. No statistical toolbox should ever be without it. Its utility is occasioned by the fact that scientific research very often examines the phenomena of the natural world two variables at a time, with an eye toward answering the basic question: Are these two variables related? If we alter the level of one, will we thereby alter the level of the other? Or alternatively: If we examine two different levels of one variable, will we find them to be associated with different levels of the other? The formula for the t-test for the difference between two means for independent samples takes on various forms. The formula we are going to utilize is commonly used and is shown below in all its glory, where x1 = X1 - M1 and x2 = X2 - M2 are the deviations of the raw scores from their group means.

$$
t = \frac{M_1 - M_2}{\sqrt{\left(\dfrac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 + N_2}{N_1 N_2}\right)}}
$$

Look at that bad boy. If the formula for the Pearson Product-Moment Correlation Coefficient frightened you, this baby probably terrifies you. Well, don't let it, because a lot of things that look scary aren't scary. This is absolutely the case here. This thing may look terrifying but "It ain't Jack," as they say down South. Let me break this thing down for you step by step and you can see for yourself it is more fluff than fury. Again, think of it like a football play. Just like a football play is made up of nothing more than a series of Xs and Os that have been mixed together and then worked to accomplish a process (blocking, hitting a designated hole and scoring a touchdown), so it is with this formula. Yea! I love the football analogy, because if you can decipher and follow offensive and defensive football schemes you sure as heck can figure out basic statistical analyses. Look for the elements involved in this fundamental statistical problem. They are listed below:

Σ X1  This simply tells you to add up all the X scores in the X1 column.
Σ X2  This simply tells you to add up all the X scores in the X2 column.
x1 = (X1 - M1)  This tells you to subtract the mean score from each raw score in the X1 column.
x2 = (X2 - M2)  This tells you to subtract the mean score from each raw score in the X2 column.
x1²  This tells you to square each value in the x1 column.
x2²  This tells you to square each value in the x2 column.
N1  This refers to the number of subjects you have in the X1 column.
N2  This refers to the number of subjects you have in the X2 column.

Table 4-2 Extra base hits by samples of infielders and outfielders

X1 Infielders    X2 Outfielders
      7                11
      8                13
      6                 8
      5                10
     10                13
      6                11
      7
Σ X1 = 49        Σ X2 = 66
N1 = 7           N2 = 6
M1 = 7           M2 = 11

These are the essentials you need in order to score another touchdown. The rest is simply a matter of adding them, subtracting them, dividing them, multiplying them, and finally taking a square root. All of this is a walk in the park with your calculator. Let's work through an example. Okay, let's do it this way. Often we wish to investigate differences between two groups that are independent. "Independent" means that the scores made by one group do not affect those made by the other group. When the two groups are independent, a t-test for the difference between two means of independent samples may be appropriate. The assumptions necessary for this test will be discussed after some specific examples. Suppose two baseball fans were arguing whether outfielders are better power hitters than infielders. One fan thinks outfielders are better power hitters, while the other thinks infielders hit as well as outfielders. Infielders as a group are independent from outfielders. For example, knowing the second baseman got a triple gives you no basis for predicting what the right fielder will do. Each athlete's performance is independent from the others. In other words, one athlete's performance is not dependent on the other athlete's performance. Actually, in baseball there could be a relationship, but just make believe there is no relationship here, okay? Anywho, the two fans decided to settle the argument by using a t-test for the difference between two means for independent samples to compare the number of extra-base hits by a representative group of infielders versus a representative group of outfielders. Here we are again, two "freaken" baseball fans trying to be researchers. This is just what I was talking about a few seconds ago…Maryland Fried Chicken went out of business a few months ago and now these guys are researchers.
Well, we will give them the benefit of the doubt and call them researchers from here on out. I am such a nice guy. Anyhow, note that the researchers (the fans) will draw conclusions about the populations of infielders and outfielders from the actual experience of these representative groups or samples. They assume that, if infielders and outfielders hit for power about equally well, the mean number of extra base hits for all infielders (M1) should be about equal to the mean number of extra base hits for all outfielders (M2). Thus, their null hypothesis (HO) can be stated: HO: M1 = M2. In scientific terms the null hypothesis would be stated something like this: There will be no significant difference between the number of extra base hits by infielders and outfielders. In layman's terms, infielders will hit just as well as outfielders and vice versa. Since one fan is convinced that outfielders are better power hitters, the two fans will choose as their alternate hypothesis (HA) that the mean number of extra base hits by the infielder population (M1) is less than the mean for the outfielder population (M2). This is denoted by HA: M1 < M2. The symbol "<" is read as "is less than." Scientifically the alternate hypothesis is stated something like this: There will be a significant difference between the extra base hits of infielders and outfielders, with outfielders hitting more. Table 4-2 gives the number of hits and the means of the two samples. The t-test will enable us to determine the probability that a difference as large as the 4 found between the two sample means (M1 = 7, M2 = 11…the difference = 4) would occur by chance if the null hypothesis, stating that infielders and outfielders hit equally well for power, is true. The researchers agree to use the .05 level of significance. By using the procedure given in Table 4-3, they compute the value of t for the difference between these sample means and find it to be -4.08. The t-statistic forms a distribution similar to that of the normal curve. We can determine areas under the curve corresponding to specific values of t in a similar manner as we did for z-scores on the normal curve.
One additional bit of information is needed to use the t table. Each t value is associated with something called degrees of freedom (df). For the t-test for the difference between means for independent samples, the number of degrees of freedom is the sum of the number of people in each of the two samples minus 2 (N1 + N2 - 2). For the data in Table 4-2, the df = 7 + 6 - 2 = 11. Since the fans chose the .05 level of significance, we are interested in the portion of the curve which corresponds to 5% of the area. Since the alternate hypothesis (M1 < M2) predicts a negative t-value, and the computed t-value in this example is indeed negative, we must find the t-value with 5% of the area of the curve below it. This portion of the curve is called the Region of Rejection. If the null hypothesis is true, a t-value will fall in this area less than 5% of the time. Now refer to the table in Appendix A. You will notice that all the t-values given are positive. Luckily, the t-curve is symmetrical. With 11 df, the t value with 5% of the area above it is 1.796. As a result, the t-value with 5% of the area below it is -1.796. This is called the critical value of t. Figure 4-1 gives a sketch of the t curve for 11 df, showing the critical value of t, the Region of Rejection (shaded area), and the computed value of t for this example. As Figure 4-1 shows, the computed value of t falls in the Region of Rejection. Hence, a difference this large between the two sample means would occur due to chance less than 5% of the time if the null hypothesis were true.
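The degrees-of-freedom count and the Region of Rejection check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up, and the critical value is typed in by hand from the text (1.796 for 11 df at the .05 level) rather than looked up by the program.

```python
def degrees_of_freedom(n1, n2):
    """df for the independent-samples t-test: N1 + N2 - 2."""
    return n1 + n2 - 2


def falls_in_rejection_region(computed_t, critical_t):
    """One-tailed check; the sign of critical_t says which tail is shaded."""
    if critical_t < 0:
        return computed_t < critical_t  # shaded area lies to the left
    return computed_t > critical_t      # shaded area lies to the right


df = degrees_of_freedom(7, 6)   # 7 infielders, 6 outfielders
table_value = 1.796             # Appendix A value quoted in the text for 11 df
critical_t = -table_value       # HA: M1 < M2, so the lower tail is the Region of Rejection
computed_t = -4.08              # value reported in the text

reject_null = falls_in_rejection_region(computed_t, critical_t)
print(f"df = {df}, critical t = {critical_t}, reject HO: {reject_null}")
```

Writing the critical value with its sign keeps the tail choice explicit, which mirrors the "shade the area to its left" instruction for a negative critical value.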

Since the computed t-value is significant at the .05 level, the two researchers must reject the null hypothesis that infielders and outfielders are equally good power hitters. Instead, they have to accept the alternate hypothesis and conclude that outfielders are superior power hitters.

Figure 4-1 t-distribution for 11 df showing critical and computed values of t

The computed t value falls in the Region of Rejection, thus we reject the HO and accept the HA: M1 < M2. Since the M of the outfielders is significantly higher, we conclude that they hit with more power.

Table 4-3 and the following steps illustrate the procedure for computing a t-value for the data presented in Table 4-2. Note that stating the null and alternate hypotheses and interpreting the conclusion in terms of the situation are included as part of the procedure.

1. State the null hypothesis: HO: M1 = M2
2. State the alternate hypothesis: HA: M1 < M2
3. Form the X1 and X2 columns. List the scores for each group of subjects
4. Compute Σ X1, N1, and M1
5. Compute Σ X2, N2, and M2
6. Form the x1 column. Find the difference X1 - M1 for each value in the X1 column
7. Form the x2 column. Find the difference X2 - M2 for each value in the X2 column
8. As a check, compute Σ x1 and Σ x2; both should be zero unless you rounded your means
9. Form the x1² column. Square each value in the x1 column
10. Compute Σ x1². Sum the x1² column
11. Form the x2² column. Square each value in the x2 column
12. Compute Σ x2². Sum the x2² column
13. Use the following formula to compute t. In this case, t = -4.08.

$$
t = \frac{M_1 - M_2}{\sqrt{\left(\dfrac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 + N_2}{N_1 N_2}\right)}}
$$

14. Determine the degrees of freedom. In this case, N1 + N2 - 2 = 7 + 6 - 2 = 11 df
15. Make a rough sketch of the t curve
16. Use the table in Appendix A to determine the critical value of t for the .05 level of significance. Since the computed t value is negative, we want the value of t with 5% of the area below it. This means that the probability of getting a t value less than the critical value due to chance factors is less than 5%. In this case, for the .05 level with 11 df, the critical value is -1.796
17. Shade the Region of Rejection. Since the critical value of t is negative, shade the area to its left.
18. See if the computed t value falls in the Region of Rejection. In this case, t = -4.08 falls within the Region of Rejection, so we reject HO and accept HA
19. Write the conclusion, including the statistical information and its interpretation for this situation

The above formula and procedure can be used regardless of the number of people in each sample. The data you collected and the procedure you used to analyze it are illustrated in Table 4-3. I already did the work for you. Don't tell anyone. It will be our little secret. Now don't let intimidation get you. Only instant karma can do that. Just think about your first day on the kindergarten bus. All those big first-graders making fun of you because you didn't know where to sit. Taking your hat and gloves and passing them around the bus and giving you wedgies when the bus driver wasn't looking. Fun stuff! I won't even mention the Donald Duck lunch box. You remember! That was stressful, even a little scary, but you managed to get through it, didn't you? You are going to get through this too. I promise, by the second time you go over this stuff, you'll know where everything is, how everything breaks down, and you'll even start to do some of this freaken work on your own just like the big kids. It'll all work out fine…just leave the lunch box home. Now look at what the heck you supposedly did.

Table 4-3 t-test for the difference between 2 means for independent samples

HO: M1 = M2
HA: M1 < M2

Infielders
X1      x1 = (X1 - M1)      x1²
 7            0              0
 8            1              1
 6           -1              1
 5           -2              4
10            3              9
 6           -1              1
 7            0              0
Σ X1 = 49    Σ x1 = 0       Σ x1² = 16
N1 = 7       M1 = 7

Outfielders
X2      x2 = (X2 - M2)      x2²
11            0              0
13            2              4
 8           -3              9
10           -1              1
13            2              4
11            0              0
Σ X2 = 66    Σ x2 = 0       Σ x2² = 18
N2 = 6       M2 = 11

$$
t = \frac{M_1 - M_2}{\sqrt{\left(\dfrac{\sum x_1^2 + \sum x_2^2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 + N_2}{N_1 N_2}\right)}}
  = \frac{7 - 11}{\sqrt{\left(\dfrac{16 + 18}{7 + 6 - 2}\right)\left(\dfrac{7 + 6}{7 \cdot 6}\right)}}
$$

$$
t = \frac{-4}{\sqrt{\left(\dfrac{34}{11}\right)\left(\dfrac{13}{42}\right)}}
  = \frac{-4}{\sqrt{(3.1)(.31)}}
  = \frac{-4}{\sqrt{.96}}
  = \frac{-4}{.98}
  = -4.08
$$

df = N1 + N2 - 2
df = 7 + 6 - 2 = 11

The computed t value falls in the Region of Rejection, thus we reject the HO and accept the HA: M1 < M2. Since the M of the outfielders is significantly higher, we conclude that they hit with more power.
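For readers who would rather let a computer play hand calculator, the whole procedure for the infielder and outfielder data can be sketched in Python. This is a minimal illustration, not the text's own method of working: the function and variable names are invented here, and the code simply follows the column-by-column steps and the formula given above.

```python
import math

# Extra base hits from Table 4-2
infielders = [7, 8, 6, 5, 10, 6, 7]
outfielders = [11, 13, 8, 10, 13, 11]


def summarize(scores):
    """Return N, M, the deviation column x = X - M, and the sum of x squared."""
    n = len(scores)
    mean = sum(scores) / n
    deviations = [score - mean for score in scores]
    sum_of_squares = sum(d ** 2 for d in deviations)
    return n, mean, deviations, sum_of_squares


def independent_t(scores1, scores2):
    """t for the difference between two means of independent samples, plus df."""
    n1, m1, _, ss1 = summarize(scores1)
    n2, m2, _, ss2 = summarize(scores2)
    df = n1 + n2 - 2
    pooled = (ss1 + ss2) / df                      # (sum x1^2 + sum x2^2) / (N1 + N2 - 2)
    standard_error = math.sqrt(pooled * (n1 + n2) / (n1 * n2))
    return (m1 - m2) / standard_error, df


n1, m1, x1, ss1 = summarize(infielders)
n2, m2, x2, ss2 = summarize(outfielders)
t, df = independent_t(infielders, outfielders)

print(f"M1 = {m1}, M2 = {m2}, sum x1^2 = {ss1}, sum x2^2 = {ss2}")
print(f"t = {t:.2f} with {df} df")
```

Carrying full precision gives a t of about -4.09; the hand computation in Table 4-3 rounds the intermediate values and lands on -4.08. Either way the value is far beyond the critical value of -1.796, so the decision is the same.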

As mentioned, the above formula and procedure can be used regardless of the number of people in each sample. If, however, N1 and N2 are equal, you might prefer to use a slightly simpler formula. Let's say you want to know if practicing with a weighted baseball would increase the maximum throwing distance of baseball players. The first thing you might want to do is randomly divide the team into two groups of 11 players and have Group A practice with the weighted ball and Group B practice with regular baseballs. After 8 weeks you would then test both groups to determine if they differed at the .05 level of significance. The data you collected and the procedure you used to analyze it are illustrated in Table 4-4. I already did the work for you…AGAIN! I know I am the GREATEST! The steps for computing a t-test for the difference between two means for independent samples when N1 = N2 are as follows:

1. State the null hypothesis: HO: M1 = M2
2. State the alternate hypothesis: HA: M1 > M2
3. Form the X1 and X2 columns. List the scores for each group of subjects
4. Compute Σ X1, N1, and M1
5. Compute Σ X2, N2, and M2
6. Form the x1 column. Find the difference X1 - M1 for each value in the X1 column
7. Form the x2 column. Find the difference X2 - M2 for each value in the X2 column
8. As a check, compute Σ x1 and Σ x2; both should be zero unless you rounded your means
9. Form the x1² column. Square each value in the x1 column
10. Compute Σ x1². Sum the x1² column
11. Form the x2² column. Square each value in the x2 column
12. Compute Σ x2². Sum the x2² column
13. Use the formula:

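$$
t = \frac{M_1 - M_2}{\sqrt{\dfrac{\sum x_1^2 + \sum x_2^2}{N(N - 1)}}}
$$

where N is the number of subjects in each group, x1 = X1 - M1, and x2 = X2 - M2.

df = N1 + N2 - 2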

14. Determine the degrees of freedom. In this case, N1 + N2 - 2 = 11 + 11 - 2 = 20 df
15. Make a rough sketch of the t curve
16. Use the table in Appendix A to determine the critical value of t for the .05 level of significance. Since the computed t value is positive, we want the value of t with 5% of the area above it. In this case, for the .05 level with 20 df, the critical value is 1.725
17. Shade the Region of Rejection. Since the critical value of t is positive, shade the area to its right.
18. See if the computed t value falls in the Region of Rejection. In this case, t = 0.46 does not fall within the Region of Rejection, so we accept HO
19. Write the conclusion, including the statistical information and its interpretation for this situation

Table 4-4 t-test for the difference between 2 means for independent samples when N1 = N2

HO: M1 = M2
HA: M1 > M2

Weighted Balls                          Regular Balls
X1     x1 = (X1 - M1)    x1²            X2     x2 = (X2 - M2)    x2²
200          0             0            200         10            100
210         10           100            100        -90           8100
280         80          6400            150        -40           1600
120        -80          6400            180        -10            100
180        -20           400            240         50           2500
200          0             0            250         60           3600
220         20           400            300        110          12100
190        -10           100            100        -90           8100
200          0             0            220         30            900
190        -10           100            150        -40           1600
210         10           100            200         10            100
Σ X1 = 2200  Σ x1 = 0   Σ x1² = 14000   Σ X2 = 2090  Σ x2 = 0   Σ x2² = 38800
N1 = 11      M1 = 200                   N2 = 11      M2 = 190

$$
t = \frac{M_1 - M_2}{\sqrt{\dfrac{\sum x_1^2 + \sum x_2^2}{N(N - 1)}}}
  = \frac{200 - 190}{\sqrt{\dfrac{14000 + 38800}{110}}}
  = \frac{10}{\sqrt{480}}
  = \frac{10}{21.9}
  = 0.46
$$

df = N1 + N2 - 2
df = 11 + 11 - 2 = 20

Note that your computed t value does not fall within the region of rejection; thus, you have to accept the null hypothesis that M1=M2 and conclude that throwing with a weighted ball is no more effective than throwing with a regular ball.
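As a quick cross-check of the weighted-ball example, here is a hedged Python sketch (the function names are invented for illustration). It computes t with the simpler equal-N formula and again with the general formula from earlier in the section, to show that the two agree when N1 = N2.

```python
import math

# Throwing distances from Table 4-4
weighted = [200, 210, 280, 120, 180, 200, 220, 190, 200, 190, 210]
regular = [200, 100, 150, 180, 240, 250, 300, 100, 220, 150, 200]


def mean(scores):
    return sum(scores) / len(scores)


def sum_of_squares(scores):
    """Sum of squared deviations from the group mean (the sum of the x squared column)."""
    m = mean(scores)
    return sum((score - m) ** 2 for score in scores)


def equal_n_t(scores1, scores2):
    """Simpler formula; only valid when both groups have the same N."""
    n = len(scores1)
    if len(scores2) != n:
        raise ValueError("equal_n_t requires N1 == N2")
    ss = sum_of_squares(scores1) + sum_of_squares(scores2)
    return (mean(scores1) - mean(scores2)) / math.sqrt(ss / (n * (n - 1)))


def general_t(scores1, scores2):
    """General formula that works for any N1 and N2."""
    n1, n2 = len(scores1), len(scores2)
    ss = sum_of_squares(scores1) + sum_of_squares(scores2)
    pooled = ss / (n1 + n2 - 2)
    return (mean(scores1) - mean(scores2)) / math.sqrt(pooled * (n1 + n2) / (n1 * n2))


t_equal = equal_n_t(weighted, regular)
t_general = general_t(weighted, regular)
print(f"equal-N formula: t = {t_equal:.2f}; general formula: t = {t_general:.2f}")
```

The agreement is no accident: with N1 = N2 = N, the general denominator (N1 + N2 - 2) times N1 N2 / (N1 + N2) reduces to (2N - 2)(N / 2) = N(N - 1), which is exactly the denominator of the simpler formula.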
