

SOLUTIONS MANUAL
Methods in Behavioral Research 15th Edition by
Paul Cozby and Scott Bates
TABLE OF CONTENTS
Chapter 1: Scientific Understanding of Behavior
Chapter 2: Where to Start
Chapter 3: Ethics in Behavioral Research
Chapter 4: Fundamental Research Issues
Chapter 5: Measurement Concepts
Chapter 6: Observational Methods
Chapter 7: Asking People About Themselves: Survey Research
Chapter 8: Experimental Design
Chapter 9: Conducting Experiments
Chapter 10: Complex Experimental Designs
Chapter 11: Single-Case, Quasi-Experimental, and Developmental Research
Chapter 12: Understanding Research Results: Description and Correlation
Chapter 13: Understanding Research Results: Statistical Inference
Chapter 14: Generalization
Chapter 1: Scientific Understanding of Behavior
Learning Objectives
Describe why it is essential to understand research methods.
Explain the scientific approach to learning about behavior and be able to compare and contrast it with other ways of knowing.
Identify and explain key features of the scientific approach to understanding behavior, and be able to compare and contrast it with a pseudoscientific approach.
Describe and give examples of the four goals of scientific research: description, prediction, determination of cause, and explanation of behavior.
Summarize the three elements for inferring causation: temporal order, covariation of cause and effect, and elimination of alternative explanations. Be able to generate an example.
Determine whether a study is basic or applied research.
Brief Chapter Outline
I. Consuming Research
A. Why Learn About Research Methods?
II. Ways of Knowing
A. Intuition
B. Authority
C. Empiricism
D. The Scientific Approach
E. Skepticism
III. Being a Skilled Consumer of Research
IV. Goals of Behavioral Science
A. Description
B. Prediction
C. Determining Causes
D. Explaining Behavior
V. Basic and Applied Research
A. Basic Research
B. Applied Research
C. Comparing Basic and Applied Research
Extended Chapter Outline
Please note that much of this information is quoted from the text.
I. Consuming Research
We are continuously bombarded with research results. Articles, books, websites, and social media posts make claims about the beneficial or harmful effects of particular diets or vitamins on one’s sex life, personality, or health. There are frequent reports of survey results that draw conclusions about our views on a variety of topics ranging from politics to the economy, health, education, and the environment.
A. Why Learn About Research Methods?
Learning about research methods is essential for many reasons. First, many occupations require the use of research findings. It is also important to recognize that scientific research has become increasingly prominent in public policy decisions. Research is also important when developing and assessing the effectiveness of programs designed to achieve certain goals: for example, improving high school graduation rates in a community, influencing people to be vaccinated, teaching employees how to reduce the effects of stress, or making a workplace welcoming and productive for everybody. Finally, research methods can be the way to satisfy our native curiosity about ourselves, our world, and those around us.
II. Ways of Knowing
People have always observed the world around them and sought explanations for what they see and experience. So, the question must be asked: How does the scientific approach differ from other ways of learning about behavior?
A. Intuition
When people rely on intuition, they accept unquestioningly what their own personal judgment or a single story (anecdote) about one person's experience tells them. The intuitive approach takes many forms. Often it involves finding an explanation for our own behaviors or the behaviors of others. For example, you might develop an explanation for why you keep having conflicts with your roommate, such as "My roommate hates me" or "Having to share a bathroom creates conflict." Other times, intuition is used to explain events that you observe, as in the case of concluding that adoption increases the chances of conception for people who are having difficulty conceiving a child. A problem with intuition is that numerous cognitive and motivational biases affect one's perceptions, and so one may draw erroneous conclusions about cause and effect.
B. Authority
Humans are often persuaded by those in authority. Many people are all too ready to accept anything they learn from the Internet, news media, books, government officials, celebrities, religious figures, or even a professor because they believe that statements made by such authorities must be true. The problem is that the statements may not be true. The scientific approach rejects the notion that one can accept on faith the statements of any authority; more evidence is needed before people can draw scientific conclusions.
C. Empiricism
The fundamental characteristic of the scientific approach is empiricism: the idea that knowledge comes from observations. Data are collected and analyzed, and the data form the basis of conclusions about the nature of the world.
D. The Scientific Approach
Data Play a Central Role
For scientists, knowledge is primarily based on observations. Scientists enthusiastically search for observations that will verify or reject their ideas about the world. They develop theories, argue that existing data support their theories, and conduct research that can increase their confidence that the theories are correct.
Scientists Are Not Alone
Scientists make observations that are accurately reported to other scientists and the public. Many other scientists will follow up on the findings by conducting research that replicates and extends these observations.
Science Is Adversarial
Science is a way of thinking in which ideas do battle with other ideas in order to move ever closer to truth. Research can be conducted to test any idea; supporters of the idea and those who disagree with the idea can report their research findings, and these can be evaluated by others. Some ideas, even some very good ideas, may prove to be wrong if research fails to provide support for them. Good scientific ideas can be supported or they can be falsified by data; the latter concept is called falsifiability.
Scientific Evidence Is Peer Reviewed
Before a study is published in a top-quality scientific journal, other scientists who have the expertise to carefully evaluate the research review it. This process is called peer review.
E. Skepticism
There is nothing wrong with having opinions or beliefs as long as they are presented simply as opinions or beliefs. However, people should always ask whether the opinion can be tested scientifically or whether scientific evidence exists that relates to the opinion; in essence, they should be skeptical of such claims. People should also be skeptical of pseudoscientific research. Pseudoscience is the use of seemingly scientific terms and demonstrations to substantiate claims that have no basis in scientific research.
III. Being a Skilled Consumer of Research
Sometimes study authors overreach, coming to conclusions that are not justified. Here are eight key questions you can ask of any research study that will reveal a lot about how much the study should be trusted:
1. "What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?"
2. "What did these researchers do? What was the method?"
3. "What was measured?"
4. "To what or whom can we generalize the results?"
Generalization involves making broad or general inferences based on the procedures and findings in a specific study.
5. "What did they find? What were the results?"
6. "Have other researchers found similar results?"
7. "What are the limitations of this study?"
8. "What are the ethical issues present in this study?"
IV. Goals of Behavioral Science
Scientific research on behavior has four general goals:
description
prediction
determining causes
understanding or explanation
A. Description
Description is the first goal of science. Psychologists and other behavioral scientists can describe behavior, which can often be directly observed (such as the amount of food consumed, running speed, eye gaze, or loudness of laughter), or mental states (such as happiness, sadness, or boredom), which are often less observable. Researchers are often interested in describing the ways in which events are systematically related to one another.
B. Prediction
The second goal of behavioral science is to make accurate predictions. Once events have been shown to be related to one another (for example, that early-childhood poverty is related to lower school achievement), predictions can be made.
C. Determining Causes
There are three types of evidence used to identify the cause of a behavior. In the simplest case, where a change in one variable causes a change in another, three things must hold:
There is a temporal order of events in which the cause precedes the effect. This is called temporal precedence.
When the cause is present, the effect occurs; when the cause is not present, the effect does not occur. This is called covariation of cause and effect.
Nothing other than a causal variable could be responsible for the observed effect. This is called elimination of alternative explanations.
D. Explaining Behavior
The final goal of science is explanation: to understand why a behavior occurs.
V. Basic and Applied Research
A. Basic Research
Basic research tries to answer fundamental questions about the nature of behavior. Studies are often designed to address theoretical issues concerning phenomena such as cognition, emotion, motivation, learning, personality, development, and social behavior.
B. Applied Research
Applied research is conducted to address issues in which there are practical problems and potential solutions. A major area of applied research is called program evaluation, which assesses the social reforms and innovations that occur in government, education, the criminal justice system, industry, health care, and mental health institutions.
C. Comparing Basic and Applied Research
Both basic and applied research are important, and neither can be considered superior to the other. In fact, progress in science is dependent on an interconnection between basic and applied research. Much applied research is guided by the theories and findings of basic research investigations.
Engaging With Research: Introduction
After reading the article, answer the following questions (which will be familiar to you from earlier in this chapter!). NOTE: Student answers will vary.
1. What is the primary goal of this study? Description, Prediction, Determining Cause, or Explanation? Do the authors achieve their goals?
a. Did these researchers discuss cause and effect? Did they make claims about how the things they measured cause changes in one another? Did they claim that being Latinx caused differences in gratitude?
b. Would you describe this study as applied research or basic research? Why?
2. What did these researchers do? What was the method?
3. What was measured?
4. To what or whom can we generalize the results?
a. Corona et al. (2021) collected data from a group of Latino, East Asian, and European men in their first study, and Latino and East Asian men in their second. In both cases, the research participants were from American universities. Do you think the results would be different if the researchers collected data from other populations? What about men from universities in other countries? Or women? Or people who were not university students? Do you think their results would be different?
5. What did they find? What were the results?
6. Have other researchers found similar results?
7. What are the limitations of this study?
a. What did the authors say about the limitations of the study in the discussion section?
8. What are the ethical issues present in this study?
a. Review the methods section: What ethical issues or procedures were addressed in the study?
Sample Answers for Review Questions
1. Why is it important for anyone in our society to have knowledge of research methods?
A background in research methods will help people read research reports critically, evaluate the methods employed, and decide whether the conclusions are reasonable. Learning about research methods will help people think critically. Many occupations require the use of research findings. It is also important to recognize that scientific research has become increasingly prominent in public policy decisions. Research is also important when developing and assessing the effectiveness of programs designed to achieve certain goals. Research methods can also be the way for people to satisfy their native curiosity about themselves, their world, and those around them.
2. How does the scientific approach differ from other ways of gaining knowledge about behavior?
In the scientific approach, knowledge is based on data that are collected and shared with peers. Because science is adversarial, conclusions drawn from the data are tested against competing ideas, and those conclusions are also shared with and reviewed by peers.
3. Why is scientific skepticism useful in furthering our knowledge of behavior?
Scientific skepticism means that ideas must be evaluated on the basis of careful logic and results from scientific investigations. The fundamental characteristic of the scientific method is empiricism: the idea that knowledge comes from observations. Data are collected that form the basis of conclusions about the nature of the world.
4. Provide (a) definitions and (b) examples of description, prediction, determining cause, and explaining behavior as goals of scientific research.
Description of behavior is based on careful observation and can be something directly observable, such as running speed, or something less observable, such as self-perception. Researchers often try to describe the ways in which events are systematically related to one another. Prediction of behavior involves anticipating events based on observations and descriptions, such as predicting that a physically attractive defendant in a criminal trial will receive a more lenient sentence than an unattractive defendant guilty of the same offense. Determination of cause involves correctly identifying the underlying reason for a behavior, such as determining whether the correlation between the level of a child’s violent behavior and the amount of violent television programming the child has been exposed to is actually caused by exposure to violent programming or is caused by some other element. Explanation is very closely related to determining cause, and it seeks to explain reasons for observed behaviors. The previous example about violent television programming would also be applicable to explanation; however, the explanation may require modification if another cause or causes of the behavior are identified.
5. Describe the three types of evidence necessary in order to infer causation (Cook and Campbell, 1979).
The three types of evidence for inferring causation include temporal precedence, which is an
order of events in which the cause precedes the effect; covariation of cause and effect, in which an effect occurs if the cause is present and does not occur if the cause is absent; and elimination of alternative explanations, in which nothing other than a causal variable can be responsible for an observed effect.
6. Describe the characteristics of scientific inquiry, according to Goodstein (2000).
Goodstein’s (2000) characteristics for scientific inquiry are that data play a central role, scientists are not alone, science is adversarial, and scientific evidence is peer reviewed.
7. How does basic research differ from applied research?
Basic research differs from applied research because basic research tries to answer fundamental questions about the nature of behavior, and applied research tries to address issues in which there are practical problems and potential solutions.
Sample Answers for Being a Skilled Consumer of Research
1. Imagine a debate on the following statement: "Knowledge of research methods is unnecessary for students who intend to pursue careers in clinical and counseling psychology." Develop "pro" and "con" arguments that support or oppose the assertion.
Students’ answers will vary
2. Read several editorials in your college's newspaper or in The New York Times, the Wall Street Journal, USA Today, the Washington Post, or another major metropolitan news publication, and identify the sources used to support the assertions and conclusions. Did the writer use intuition, appeals to authority, scientific evidence, or a combination of these? Give specific examples.
Students’ answers will vary based on the examples they choose.
3. Imagine a debate on the following statement: "Behavioral scientists should only conduct research that has immediate practical applications." Develop "pro" and "con" arguments that support or oppose the assertion.
Students’ answers will vary.
4. You read a social media post that says, "Childhood poverty has a stronger impact on academic achievement among Black children as compared to White children." The study itself collected data from a sample of White children in the Northwest and Black children in Texas. How strongly should researchers make a case for this effect? Why?
Students’ answers will vary.
Laboratory Demonstration: The False Consensus Effect
People often believe that others are more like them than they really are. Thus, one's predictions about others' beliefs or behaviors, based on casual observation, are very likely to err in the direction of one's own beliefs or behavior. For example, college students who preferred brown bread estimated that over 50% of all other college students preferred brown bread, while white-bread eaters estimated more accurately that 37% showed a brown bread preference (Ross et al., 1977). This is known as the false consensus effect (Mullen et al., 1985; Ross et al., 1977). The false consensus effect provides the basis for the following demonstration, which emphasizes the need for systematic rather than casual observation.
Before describing the false consensus effect, have students answer the questions listed below. Next, have students predict the class mean for each question. Collect the data sheets. According to the false consensus effect, students’ predictions about the class mean should be influenced by their own positions. Consequently, a student whose position is below the class mean is likely to make a prediction that will be below the class mean as well.
To demonstrate the effect statistically, compute the class mean for each question using the students' personal data. To involve the students in this process, divide the class into six groups and assign one question to each. Have them tabulate the answers for that question and calculate the mean. (Be sure each group has access to all the data sheets, rotating six batches of data sheets from one group to another until all groups have recorded data from all batches.) Put the means on the board. Next, have students compute a score for each participant in the following way: For each question, score a +1 if the participant's personal answer and predicted class mean are either both below or both above the actual class mean; score a −1 if the participant's personal score and predicted class mean are on opposite sides of the actual class mean. Sum across all six questions so that each participant now has a single score that ranges between −6 and +6. If people err randomly, the average score for all students should be zero. In contrast, if people err in the direction of their own beliefs, the average should be greater than zero. A simple, one-group t test can be calculated using μ = 0 for the null hypothesis.
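For instructors who want to automate the scoring and the one-group t test, here is a rough Python sketch. The function names are invented for illustration, and ties with the actual class mean are scored 0, a case the handout leaves unspecified.

```python
import statistics
from math import sqrt

def consensus_scores(personal, predicted, actual_means):
    """Score each participant: +1 per question where the personal answer and
    the predicted class mean fall on the same side of the actual class mean,
    -1 where they fall on opposite sides, 0 if either equals the mean."""
    scores = []
    for p_answers, p_preds in zip(personal, predicted):
        total = 0
        for ans, pred, mean in zip(p_answers, p_preds, actual_means):
            side_ans = (ans > mean) - (ans < mean)    # +1 above, -1 below, 0 tie
            side_pred = (pred > mean) - (pred < mean)
            total += side_ans * side_pred             # +1 same side, -1 opposite
        scores.append(total)
    return scores

def one_sample_t(scores, mu=0.0):
    """One-group t statistic testing whether the mean score differs from mu."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    return (mean - mu) / (sd / sqrt(n))
```

A significantly positive t supports the false consensus effect, since errors then lean toward each participant's own answer rather than canceling out.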
Behavior Questions
(For each question, record your personal answer and your prediction of the class mean.)
1. How many loads of laundry do you wash per week?
2. How many times per year do you attend services at a place of worship?
3. How many times per week do you eat a meal from a fast-food restaurant?
4. How many times per year do you wash your car?
5. How many times per year do you see a movie at a theater?
6. How many times per week do you consume alcohol?
Mullen, B., Atkins, J. L., Champion, D. S., Edwards, C., Hardy, D., Story, J. E., & Vanderklok, M. (1985). The false consensus effect: A meta-analysis of 115 hypothesis tests. Journal of Experimental Social Psychology, 21, 262–283.
Ross, L., Greene, D., & House, P. (1977). The false consensus phenomenon: An attributional bias in self-perception and social perception processes. Journal of Experimental Social Psychology, 13, 279–301.
Laboratory Demonstration: Single Versus Multiple Observations
The systematic observation employed by scientists generally relies on many independent instances, while casual observation is often based on only a few instances. The following demonstration is designed to show how misleading a small sample of observations may be. Divide the class into groups of three or four students each. Fill a bowl or basket with a "population" of poker chips or simple slips of paper. On each chip or piece of paper there should be written a single score. (An approximate normal distribution of 200 numbers is provided below.) Have each group draw five samples from the population and compute the mean for each sample. Each group, however, should draw samples of a different size from the other groups. For instance, group one draws five samples of Size 1, group two draws five samples of Size 3, group three draws five samples of Size 5, and so on. The rate of progression from small to large samples depends on the number of groups. It is a good idea to have the last group draw fairly large samples (e.g., N = 20 or 25). Once the means for each sample are computed, have each group plot the means on a graph on the board. It should be obvious that with small samples we can easily get a distorted picture of the population mean. Note how the variability from sample mean to sample mean decreases dramatically as we increase the sample size. Discuss how many of our casual observations are based on relatively few observations.
The following population of scores yields a population mean of 17 and a standard deviation of 4.66.
Note: This population of scores can be used for demonstrations suggested in Chapters 8 and 12.
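The demonstration can also be run in Python. Since the manual's 200-score population is not reproduced here, the sketch below generates a stand-in population with the same stated mean (17) and standard deviation (4.66); the function name and group sizes are illustrative assumptions.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples=5, seed=None):
    """Draw n_samples random samples (without replacement) of the given
    size and return the mean of each, mimicking one group's task."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, sample_size))
            for _ in range(n_samples)]

# Stand-in population: roughly normal, mean 17, SD near 4.66.
rng = random.Random(1)
population = [round(rng.gauss(17, 4.66)) for _ in range(200)]

for size in (1, 3, 5, 10, 25):
    means = sample_means(population, size, n_samples=5, seed=size)
    spread = max(means) - min(means)
    print(f"n={size:2d}  means={[round(m, 1) for m in means]}  range={spread:.2f}")
```

The printed range of the five sample means generally shrinks as sample size grows, which is the point of the classroom plot.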
Activity: Observing Behavior
It is often useful to have students immediately begin making observations of behaviors. In class, students might generate a list of possible behaviors to observe on campus. For example, observe the age, ethnicity, and dress of students in various campus locations, such as different eating/gathering places, the library, and the computer center. How many students are alone, in groups of two, or groups of three or more; are these same- or mixed-gender groups? Check door cards on faculty offices to see whether the occupant is an assistant, an associate, or a full professor, and note whether the office has a window. Categorize restroom graffiti; how much is aggressive, sexual, humorous, or political? A discussion based on these observations in class can introduce students to many topics and procedures of research methods.
Activity: Setting up a Research News Group
Research-related stories often appear on a variety of web-related sources. A news group may be set up containing research-related stories from the American Psychological Association and American Psychological Society Press releases, and regional psychological association press releases. Students could sign up for the newsgroup and receive emails with stories relevant to topics dealing with research methods.
Additional Discussion Topics
Discussion: The Gambler’s Fallacy
Another way to illustrate the limitations of intuition is to discuss the gambler's fallacy. Ask students the following: If they were in Vegas and they pulled a slot machine arm 25 times with no payout, would there be a greater probability that the next pull would pay out? Or if one flips a coin 20 times and gets heads each time, is one more likely to get tails on the next trial? Even though students may understand probability intellectually, a part of their brain says, "Yes, it is more likely!" That would imply that each trial is not independent, but rather it is dependent on prior trials.
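A short simulation can make the point concrete: for a fair coin, the flip after a run of heads is still heads about half the time. This is a hedged sketch; the function name and parameters are invented for the demonstration.

```python
import random

def heads_after_streak(n_flips=100_000, streak=5, seed=42):
    """Estimate P(heads) on the flip immediately following a run of
    `streak` consecutive heads, using simulated fair-coin flips."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5 for _ in range(n_flips)]
    # Collect the flip after every window of `streak` straight heads.
    following = [flips[i] for i in range(streak, n_flips)
                 if all(flips[i - streak:i])]
    return sum(following) / len(following)

print(heads_after_streak())  # stays close to 0.5: trials are independent
```

Students can vary `streak` and see that no run length moves the estimate away from one half, which is exactly what the gambler's fallacy denies.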
Discussion: Operational Definitions
Most students have not discussed operational definitions since Introduction to Psychology. Explain that research hinges on an operational definition that specifically includes AND excludes things from the definition. For example, ask students to define aggression. One can expect the usual examples of hitting, pushing, punching, kicking, and so on; now ask about indirect forms, such as spreading rumors, keying someone’s car, and so on. Now what about sports? Are hockey players aggressive? What about football? What about consensual sex between adults that involves harm to one of the participants? Remind students that the role of definitions is to both include things, such as hitting and spreading rumors, while excluding other things, such as sports and other consensual adult behaviors.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Brems, C. (1994). Taking the fear out of research: A gentle approach to teaching an appreciation for research. Teaching of Psychology, 21, 241–243.
Johnson, D. E. (1996). A "handy" way to introduce research methods. Teaching of Psychology, 23, 168–170.
Also recommended:
Lakin, J. L., Giesler, R. B., Morris, K. A., & Vosmik, J. R. (2007). HOMER as an acronym for the scientific method. Teaching of Psychology, 34, 94–96.
Marek, P., Christopher, A. N., & Walker, B. J. (2004). Learning by doing: Research methods with a theme. Teaching of Psychology, 31, 128–131.
Chapter 2: Where to Start
Learning Objectives
Explain how research questions, hypotheses, and predictions are related.
Describe the different sources of ideas for research that are outlined in this chapter, including common sense, practical problems, observation of the world around us, theories, and past research.
Explain the two functions of a theory.
Compare and contrast the three kinds of journal articles: literature reviews, theory articles, and empirical research.
Summarize what is commonly included in the major sections of an empirical research article: the abstract, introduction, method, results, and discussion sections.
Demonstrate an ability to conduct searches of past research using APA PsycINFO, Web of Science, Scopus, and Google Scholar.
Brief Chapter Outline
I. Research Questions, Hypotheses, and Predictions
II. Sources of Ideas
A. Common Sense
B. Practical Problems
C. Observations of the World Around Us
D. Theories
E. Past Research
III. Types of Journal Articles
A. Literature Reviews
B. Theory Articles
C. Empirical Research Articles
1. Abstract
2. Introduction
3. Method
4. Results
5. Discussion
IV. Exploring Past Research
A. Journals
B. Online Scholarly Research Databases
1. APA PsycINFO
C. Conducting an APA PsycINFO Search
D. Web of Science & Scopus
E. Other Electronic Search Resources
1. Internet Searches
2. Google Scholar
Extended Chapter Outline
Please note that much of this information is quoted from the text.
I. Research Questions, Hypotheses, and Predictions
A research question is the first and most general step in designing and conducting a research investigation. A good research question must be specific so that it can be answered with a research project. A hypothesis is a tentative answer to a research question. Once the hypothesis is proposed, data must be gathered and evaluated in terms of whether the evidence is consistent or inconsistent with the hypothesis.
Once the study is designed, the researcher can make a specific prediction about the outcome of the study. A prediction follows directly from a hypothesis, is directly testable, and includes specific variables and methodologies. If a prediction is confirmed by the results of the study, the hypothesis is supported. If the prediction is not confirmed, the researcher will either reject the hypothesis or conduct further research using different methods to study the hypothesis. It is important to note that when the results of a study confirm a prediction, the hypothesis is only supported, not proven.
II. Sources of Ideas
A. Common Sense
One source of ideas that can be tested is the body of knowledge called common sense: the things we all believe to be true. Do "opposites attract," or do "birds of a feather flock together"? Asking questions such as these can lead to research programs studying attraction, the effects of punishment, and the role of visual images in learning and memory.
B. Practical Problems
Research is also stimulated by practical problems that can have immediate applications.
C. Observations of the World Around Us
Observations can provide many ideas for research. The curiosity sparked by your observations and experiences can lead you to ask questions about all sorts of phenomena.
Serendipity can also play a role; sometimes the most interesting discoveries are the result of accident or sheer luck.
D. Theories
A theory consists of a systematic body of ideas about a particular topic or phenomenon. Theories organize and explain a variety of specific facts or descriptions of behavior. Theories generate new knowledge by focusing people's thinking so that they notice new aspects of behavior; theories guide people's observations of the world. Theories are usually modified as new research defines the scope of the theory.
E. Past Research
Becoming familiar with a body of research on a topic is perhaps the best way to generate ideas for new research. Because the results of research are published, researchers can use the body of past literature on a topic to continually refine and expand people's knowledge. As you become familiar with the research literature on a topic, you may also see inconsistencies in research results that need to be investigated.
III. Types of Journal Articles
Most journal articles fall into three categories: literature reviews that summarize research, theory articles that describe theories, and empirical articles that describe specific research projects.
A. Literature Reviews
Literature reviews summarize previous research on a particular topic. Because literature reviews summarize research across many studies, they are an important part of the research landscape. A more quantitative type of review method for comparing a large number of studies in a specific research area is called meta-analysis. In meta-analysis, researchers analyze the results of a number of studies using statistical procedures.
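As one concrete illustration of those statistical procedures, the core of a fixed-effect meta-analysis is an inverse-variance weighted mean of the studies' effect sizes. The sketch below is a minimal, hypothetical example (function name and numbers invented), not the text's own method.

```python
def fixed_effect_mean(effects, variances):
    """Inverse-variance weighted mean effect size: each study's effect
    estimate is weighted by the reciprocal of its sampling variance,
    so more precise studies count for more."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Three hypothetical studies: standardized effect sizes and their variances.
print(fixed_effect_mean([0.30, 0.55, 0.10], [0.04, 0.01, 0.09]))
```

With equal variances the result reduces to the ordinary mean; unequal variances pull the summary toward the more precisely estimated effects.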
B. Theory Articles
Some published research reports are the culmination of work that describes a theory. While literature reviews summarize, theory articles generally summarize and integrate research to provide a new framework for understanding a phenomenon.
C. Empirical Research Articles
The empirical research article is a report of a study in which data were gathered to help answer a research question. Empirical research articles usually have five sections:
1. Abstract
The abstract is a summary of the research report and typically runs between 150 and 250 words in length. It includes information about the hypothesis, the procedure, and the broad pattern of results.
2. Introduction
In the Introduction section, the researcher outlines the problem that has been investigated. Past research and theories relevant to the problem are described in detail. The specific expectations of the researcher are noted, often as formal hypotheses.
3. Method
The method section is divided into subsections, with the number of subsections determined by the author and dependent on the complexity of the research design.
4. Results
In the results section, the researcher presents the findings, usually in three ways. First, there is a description in narrative form. Second, the results are described in statistical language. Third, the material is often depicted in tables and graphs.
5. Discussion
In the discussion section, the researcher reviews the research described in the Method section and the Results section from various perspectives.
Exploring Past Research
Before conducting any research project, an investigator must have a thorough knowledge of previous research findings. Even if the researcher formulates the basic idea, a review of past studies will help the researcher clarify the idea and design the study.
A. Journals
There is an enormous number of scholarly journals in which researchers publish the results of their investigations. After a research project has been completed, the study is written as a report, which then may be submitted to the editor of an appropriate journal. The editor solicits reviews from other scientists in the same field and then decides whether the report is to be accepted for publication. This is the process of peer review, in which experts in the field assess the quality of the research. Most journals specialize in one or two topic areas.
B. Online Scholarly Research Databases
APA PsycINFO
The American Psychological Association began the monthly publication of Psychological Abstracts, or PsycABSTRACTS, in 1927. The abstracts are brief summaries of articles in psychology and related disciplines indexed by topic area. Today, the abstracts are maintained in a digital database called APA PsycINFO, which is accessed via the Internet and is updated weekly.
C. Conducting an APA PsycINFO Search
The exact look and feel of the system users will use to search APA PsycINFO will depend on the library website. Users’ most important task is to specify the search terms that they want the database to use.
Most APA PsycINFO systems have advanced search screens that enable you to use the Boolean operators AND, OR, and NOT. Another helpful search tool is the "wildcard" asterisk (*), which matches any string of characters; searching for child*, for example, retrieves child, children, and childhood.
D. Web of Science & Scopus
Web of Science and Scopus are similar databases that allow searches of authors, titles, and psychological terms; in addition, they let users perform searches for citations in a specific article. The most important feature of both resources is the ability to use the "key article" method.
E. Other Electronic Search Resources
The American Psychological Association maintains several databases in addition to APA PsycINFO. These include APA PsycArticles, consisting of full-text scholarly articles, and APA PsycBooks, a database of full-text books and book chapters published by APA. Other major databases include Sociological Abstracts, PubMed, and ERIC (Educational Resources Information Center).
1. Internet Searches
The most widely available information resource is the wealth of material that is available on the Internet and located using search services such as Google, Bing, and DuckDuckGo.
2. Google Scholar
Google Scholar is a specialized scholarly search engine that can be accessed via any web browser at http://scholar.google.com. When users do a search using Google Scholar, they find articles, theses, books, abstracts, and court opinions from a wide range of sources, including academic publishers, professional societies, online repositories, universities, and other websites.
Engaging With Research: Laptops in Class
After you have a full copy of the article, evaluate the study by answering these questions. NOTE: Student answers will vary.
1. What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
a. Summarize the authors’ purpose for conducting the study; note that they will have supported their purpose by citing or discussing other studies.
b. Identify the research question or questions that are being explored by this study. In some studies, research questions are clearly labeled as such.
c. Identify the hypotheses that are being tested by this study. As with research questions, in some studies hypotheses are clearly labeled as such.
2. What did these researchers do? What was the method?
3. What was measured?
a. How did they record the students’ Internet use?
4. To what or whom can we generalize the results?
a. To whom can the results not be easily generalized? How many students did they invite to participate? How many participants did they collect data from? What was the breakdown of participant demographics?
5. What did they find? What were the results?
a. How much time, on average, did their participants spend on non-class-related purposes during class?
b. How did the authors measure class-related use versus non-class-related use of the Internet?
6. Have other researchers found similar results?
a. Are their findings similar to or different from the findings of others?
7. What are the limitations of this study?
8. What are the ethical issues present in this study?
Sample Answers for Review Questions
1. What is a research question? What is a hypothesis? What is a prediction? What are the distinctions between a research question, a hypothesis, and a prediction?
A research question is a specific question to be addressed by a research project. A hypothesis is a tentative answer to a research question. A hypothesis differs from a prediction because a
prediction follows directly from a hypothesis, is directly testable, and includes specific variables and methodology. When a study confirms a prediction, it supports the hypothesis, but does not necessarily prove it.
2. What are the two primary functions of a theory?
The two primary functions of a theory are to organize and explain a variety of specific facts or descriptions of behavior and to generate new knowledge by focusing people’s thinking so that they notice new aspects of behavior.
3. Distinguish between literature reviews, meta-analyses, theory articles, and empirical research articles.
Literature reviews are articles that summarize research in a particular area. Meta-analyses utilize statistical methods to summarize research in a specific research area to draw statistical conclusions. Theory articles summarize and integrate research to provide a new framework for understanding a phenomenon. Empirical research articles are reports of studies in which data were gathered to help answer research questions.
4. What are the components of a research article? What information does the researcher communicate in each of the sections of a research article?
A research article has five main sections: (1) the Abstract; (2) the Introduction or literature review; (3) the Method; (4) the Results; and (5) the Discussion. The Abstract is a summary of the research report. The Introduction describes the problem being examined, reviews previous research on the topic, and lays out the specific hypotheses being tested. The Method may have subsections to describe who participated in the research, what activities were conducted, and any other information needed to replicate the research project. The Results present what the research found in narrative, statistical, and graphical formats. The Discussion summarizes the research project, concludes whether the hypotheses were supported, and may offer ideas for future research.
5. How do you conduct an effective APA PsycINFO or Google Scholar search?
The exact look and feel of the system you will use to search APA PsycINFO will depend on your library website. Your most important task is to specify the search terms that you want the database to use. These are typed into a search box. How do you know what words to type in the search box? Most commonly, you will want to use standard psychological terms. The Thesaurus of Psychological Index Terms lists all the standard terms that are used to index the
abstracts, and it can be accessed directly with most PsycINFO systems.
Google Scholar operates like any other Google search. Access Google Scholar at https://scholar.google.com and type in a keyword as you would in a basic PsycINFO search. The key difference is that whereas the universe of content for PsycINFO comes from the published works in psychology and related sciences, the universe for Google Scholar includes the entire internet. This can be both a strength and a weakness. If your topic is broad (for example, if you are interested in doing a search for depression, ADHD, or color perception), Google Scholar will generate many more hits than PsycINFO would, and many of those hits will not be from the scientific literature. On the other hand, if you have a narrow search (e.g., adult ADHD treatment; color perception and reading speed), then Google Scholar will generate a set of results more closely aligned with your intentions.
6. Describe the differences in the ways past research is found when you use PsycINFO versus the "key article" method of the Web of Science or Scopus.
In a PsycINFO search, a user specifies search terms and searches the database for abstracts related to those terms. In a Web of Science or Scopus "key article" search, the user identifies a key article, and the search generates a list of relevant articles that have cited it. Web of Science and Scopus are similar databases that allow searches of authors, titles, and psychological terms; in addition, they let users perform searches for citations in a specific article. The most important feature of both resources is the ability to use the "key article" method. Here you need to first identify a key article on your topic that is particularly relevant to your interests. Choose an article that was published sufficiently long ago to allow time for subsequent related research to be conducted and reported. You can then search for the subsequent articles that cited the key article. This search will give you a bibliography of articles relevant to your topic.
Sample Answers for Being a Skilled Consumer of Research
Below are four examples of recent articles related to behavioral science in the popular press. For each article: (1) find the original published article (sometimes the articles are linked directly in the story, sometimes they are not; in those cases, use APA PsycINFO or Google Scholar), (2) read the article, and (3) compare and contrast what the published article says with what the popular media article says. What does the popular press get right about the original article? What does it miss?
NOTE: Student answers will vary.
1. Your dog may know if you've done something on purpose, or just screwed up: https://www.npr.org/sections/health-shots/2021/09/01/1032841893/dog-human-mistake-study
2. Study unlocks the secrets to developing a regular workout habit: https://www.nbcnews.com/health/health-news/study-unlocks-secrets-developing-workout-habit-rcna8075
3. Social media is making you spend money: https://www.washingtonpost.com/us-policy/2019/02/19/your-friends-social-media-posts-are-making-you-spend-more-money-researchers-say/?noredirect=on
4. Study shows teaching teens about social, personality changes helps cope with stress: https://thedailytexan.com/2016/07/03/study-shows-teaching-teens-about-social-personality-changes-helps-cope-with-stress/
Activity: Psychological Abstracts
Have students choose a topic and then search for past literature using Psychological Abstracts or APA PsycINFO. They should write down or print information on the author, title, date of publication, and so forth on each article. Finally, they should try to track down one of the articles. This is a good time to point out how important it is to follow your library’s procedures for photocopying articles and returning journals to the shelves; otherwise, it can be frustrating to search for articles that are missing. A possible handout for this exercise is included as Handout 1 in Part II of the instructor’s manual.
Activity: Library Databases: Getting Information
For this activity, modify the following handout to fit a specific university library system. Instructors may also want to adapt the introduction to include detailed instructions for that system, and can modify the questions to pertain to their own subject area as well.
Library Activity
To complete this assignment, you must use the ____ (this could be ERIC, APA PsycINFO, LUIS, APA PsycArticles, etc.) through _____ university system.
1. How many database entries are there with the keyword (key concept) attachment?
2. How many peer-reviewed journal articles have been published by someone with exactly the same last name as yours?
3. How many journal articles by Philip Zimbardo appear in the database?
4. How many journal articles by David Buss appear in the database?
5. How many database entries are there that have the word persuasion in the title?
6. How many database entries are there with the subject schizophrenia that were published in 2004?
7. Since March 2009, how many journal articles are there with the keyword depression?
8. How many database entries have both schizophrenia AND depression as keywords?
9. How many database entries have depression as a keyword but NOT schizophrenia as a keyword?
10. How many database entries have discrimination as a keyword AND the word social in the journal title?
Adapted from Janowsky, A. (2009). University of Central Florida.
Additional Discussion Topics
Discussion: RefWorks
Check with the university’s library system and find out about its subscriptions to a system like RefWorks (or similar programs) that allow students to place the reference to an article they are selecting in a file that will self-generate APA format references for them. This is a handy tool that will allow students to keep track of articles they find relevant to their search and also allow them to generate a reference page that follows the APA format.
Discussion: Search Topics
If the classroom has a computer, then open the university library catalogue and have students generate search terms. Use common psychological terms like "attachment" or "cheating" to generate thousands of hits; instructors can then help students narrow results by refining the search terms.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Ault, R. L. (1991). What goes where? An activity to teach the organization of journal articles. Teaching of Psychology, 18, 45–46.
Cameron, L., & Hart, J. (1992). Assessment of PsycLIT competence, attitudes, and instructional methods. Teaching of Psychology, 19, 239–242.
Joswick, K. E. (1994). Getting the most from PsycLIT: Recommendations for searching. Teaching of Psychology, 21, 49–53.
Marmie, W. R. (1994). Using an everyday memory task to introduce the method and results sections of a scientific paper. Teaching of Psychology, 21, 164–166.
Merriam, J., LaBaugh, R. T., & Butterfield, N. E. (1992). Library instruction for psychology majors: Minimum training guidelines. Teaching of Psychology, 19, 34–36.
Poe, R. E. (1990). A strategy for improving literature reviews in psychology courses. Teaching of Psychology, 17, 54–55.
Also recommended:
Connor-Greene, P. A., & Greene, D. J. (2002). Science or snake oil? Teaching critical evaluation of "research" reports on the internet. Teaching of Psychology, 29, 321–324.
Chapter 3: Ethics in Behavioral Research
Learning Objectives
Summarize the ethical principles in the APA Ethics Code concerning research with human research participants.
Provide examples of what is analyzed in a risk-benefit analysis.
Describe the concept of informed consent and how to create a document to establish
informed consent.
Describe the function of an Institutional Review Board and understand the distinctions among exempt, expedited, limited, and full review.
Analyze a study in terms of its risk and classify it as minimal risk or greater than minimal risk.
Summarize the ethical issues concerning research with nonhuman animals.
Define and understand the concept of research fraud and its connection to ethics and the ethical code.
Define and understand plagiarism (including word-for-word and paraphrasing) and describe how to avoid it.
Brief Chapter Outline
I. Milgram’s Obedience Experiments
II. Historical Context of Current Ethical Standards
A. The Nuremberg Code, the Declaration of Helsinki, and the Belmont Report
III. APA Ethics Code
A. APA Ethics Code: Five Principles
1. Principle A: Beneficence and Nonmaleficence
2. Principle B: Fidelity and Responsibility
3. Principle C: Integrity
4. Principle D: Justice
5. Principle E: Respect for People’s Rights and Dignity
IV. Assessment of Risks and Benefits
A. Risks in Behavioral Research
1. Physical Harm
2. Stress and Distress
3. Confidentiality and Privacy
V. Informed Consent
A. Informed Consent Form
B. Autonomy Issues
C. Withholding Information and Deception
VI. The Importance of Debriefing
VII. Institutional Review Boards
A. Determining Type of IRB Review
1. Exempt Review of Minimal Risk Research
2. Expedited Review of Minimal Risk Research
3. Limited Review of Minimal Risk Research
4. Full Review of Greater Than Minimal Risk Research
VIII. Research With Nonhuman Animal Subjects
IX. Being an Ethical Researcher: The Issue of Misrepresentation
A. Fraud
B. Plagiarism
1. Word-for-Word Plagiarism
2. Paraphrasing Plagiarism
X. Conclusion: Risks and Benefits Revisited
Extended Chapter Outline
Please note that much of this information is quoted from the text.
I. Milgram’s Obedience Experiments
Stanley Milgram conducted a series of studies (1963, 1964, 1965) to study obedience to authority. The study purportedly was an experiment on memory and learning, but Milgram really was interested in learning whether participants would continue to obey the experimenter by administering ever higher levels of shock to the learner. Milgram’s study is important, and the results have implications for understanding obedience in real-life situations, such as the Holocaust in Nazi Germany and the Jonestown mass suicide. But it is also an important example for discussing the problem of ethics in behavioral research.
II. Historical Context of Current Ethical Standards
Modern codes of ethics in behavioral and medical research have their origins in three important documents.
A. The Nuremberg Code, the Declaration of Helsinki, and the Belmont Report
Following World War II, the Nuremberg Trials were held to hear evidence against the Nazi doctors and scientists who had committed atrocities while forcing concentration camp inmates to be research subjects. The legal document that resulted from the trials contained what became
known as the Nuremberg Code: a set of 10 rules of research conduct that would help prevent future research atrocities. The Nuremberg Code was a set of principles without any enforcement structure or endorsement by professional organizations.
Consequently, the World Medical Association developed a code that is known as the Declaration of Helsinki. This 1964 document is a broader application of the Nuremberg Code, produced by the medical community, and it included a requirement that journal editors ensure that published research conform to the principles of the Declaration.
By the early 1970s, news about numerous ethically questionable studies forced the scientific community to search for a better approach to protect human research subjects. As a result of new public demand for action, a committee was formed that eventually produced the Belmont Report. This report defined the principles and applications that have guided the more detailed regulations developed by the American Psychological Association and other professional societies, as well as the U.S. federal regulations that apply to both medical and behavioral research investigations. The three basic ethical principles of the Belmont Report are beneficence, respect for persons (autonomy), and justice.
III. APA Ethics Code
The American Psychological Association (APA) has provided leadership in formulating ethical principles and standards. The Ethical Principles of Psychologists and Code of Conduct, known as the APA Ethics Code, is amended periodically.
The preamble to the APA Ethics Code states: "Psychologists are committed to increasing scientific and professional knowledge of behavior and people’s understanding of themselves and others and to the use of such knowledge to improve the condition of individuals, organizations and society."
A. APA Ethics Code: Five Principles
The APA Ethics Code includes five general ethical principles: beneficence and nonmaleficence, fidelity and responsibility, integrity, justice, and respect for people’s rights and dignity.
1. Principle A: Beneficence and Nonmaleficence
The principle of beneficence refers to the need for research to maximize benefits and minimize any possible harmful effects of participation.
2. Principle B: Fidelity and Responsibility
The principle of fidelity and responsibility states that "psychologists establish relationships of trust with those with whom they work. They are aware of their professional and scientific responsibilities to society and to the specific communities in which they work."
3. Principle C: Integrity
The principle of integrity states that "Psychologists seek to promote accuracy, honesty, and truthfulness in the science, teaching, and practice of psychology. In these activities, psychologists do not steal, cheat, or engage in fraud, subterfuge, or intentional misrepresentation of fact."
4. Principle D: Justice
As in the Belmont Report, the principle of justice refers to fairness and equity. Principle D states: "Psychologists recognize that fairness and justice entitle all persons to access to and benefit from the contributions of psychology and to equal quality in the processes, procedures, and services being conducted by psychologists."
5. Principle E: Respect for People’s Rights and Dignity
The last of the five APA ethical principles builds upon the Belmont Report’s principle of respect for persons. It states: "Psychologists respect the dignity and worth of all people, and the rights of individuals to privacy, confidentiality, and self-determination."
IV. Assessment of Risks and Benefits
The principle of beneficence leads one to examine the potential risks and benefits that are likely to result from the research; this is called a risk-benefit analysis.
Potential risks to participants include factors like psychological or physical harm and loss of confidentiality. The benefits of a study include direct benefits to the participants, such as an educational benefit, acquisition of a new skill, or treatment for a psychological or medical problem. There may also be material benefits.
A. Risks in Behavioral Research
In Milgram’s research, the risk of experiencing stress and psychological harm is obvious. It is not difficult to imagine the effect of delivering intense shocks to an obviously unwilling learner. There are several common risks involved in behavioral research.
1. Physical Harm
Procedures that could conceivably cause some physical harm to participants are rare but possible. Many medical procedures fall into this category, for example, administering a drug such as alcohol or caffeine.
2. Stress and Distress
More common than physical stress is psychological stress. For example, participants might be told that they will receive some extremely intense electric shocks. They never actually receive the shocks; it is the fear or anxiety during the waiting period that is the variable of interest.
3. Confidentiality and Privacy
The loss of expected privacy and confidentiality is another important risk to consider. Confidentiality is an issue when the researcher has assured subjects that the collected data are only accessible to people with permission, generally only the researcher. This becomes particularly important when studying sensitive topics. For example, asking participants about sexual behavior, personal questions about family history, or illegal activity may leave them vulnerable if their answers became known to others. Or consider a study that obtained information about employees’ managers. It is extremely important that responses to such questions be confidential; revealing their responses could result in real harm to the individual.
Privacy becomes an issue when, without the subject’s permission, the researcher collects information under circumstances that the subject ordinarily believes are private, that is, free from unwanted observation by others.
The Internet has posed other issues of privacy: people post a great deal of personal information voluntarily, and questions of ethics arise when researchers use this information.
V. Informed Consent
Principle E of the APA Ethics Code (Respect for People’s Rights and Dignity) states that research participants are to be treated as autonomous; they are capable of making deliberate decisions about whether to participate in research. The key idea here is informed consent: potential participants in a research project should be provided with all information that might influence their decision of whether or not to participate in a study.
A. Informed Consent Form
Participants are usually provided with some type of informed consent form that contains the information that participants need to make their decision. The content will typically cover:
The purpose of the research
Procedures that will be used, including time involved (remember that one does not need to tell participants exactly what is being studied)
Risks and benefits
Any compensation
Confidentiality
Assurance of voluntary participation and permission to withdraw
Contact information for questions
B. Autonomy Issues
What happens when the participants may lack the ability to make a free and informed decision to voluntarily participate? This is a threat to autonomy. Special populations such as minors, patients in psychiatric hospitals, or adults with cognitive impairments require special precautions. Coercion is another threat to autonomy. Any procedure that limits an individual’s freedom to consent is potentially coercive.
C. Withholding Information and Deception
Withholding information is sometimes described as a type of deception termed passive deception. It is generally acceptable to withhold information when the research is considered minimal risk, when the information would not affect the decision to participate, and when the information will later be provided, usually in a debriefing session when the study is completed. Most people who volunteer for psychology research do not expect full disclosure about the study prior to participation. However, they do expect a thorough debriefing after they have completed the study.
Active deception is actively providing misinformation about the nature of a study. The Milgram experiment illustrates two types of deception. First, participants were deceived about the purpose of the study. Second, the Milgram study also illustrates a type of deception in which participants become part of a series of events staged for the purposes of the study. A confederate of the experimenter played the part of another participant in the study; Milgram created a reality for the participant in which obedience to authority could be observed. Such deception has been most common in social psychology research; it is much less frequent in areas of experimental psychology such as human perception, learning, memory, and motor performance.
In the decades since the Milgram experiments, researchers have become more sensitive to ethical issues when planning their studies. Ethics review committees at universities and colleges now carefully review proposed research; elaborate deception is likely to be approved only when the research is important and there are no alternative procedures available.
VI. The Importance of Debriefing
Debriefing occurs after the completion of the study and includes an explanation of the purposes of the research that is given to participants following their participation. It is an opportunity for the researcher to deal with issues of withholding information, deception, and potential harmful effects of participation. Research on the effectiveness of debriefing indicates that debriefing is an effective way of dealing with deception and other ethical issues that arise in research investigations.
Debriefing is part of a researcher’s obligation to treat participants with dignity and respect.
VII. Institutional Review Boards
The Belmont Report provided an outline for issues of research ethics and the APA Ethics Code provides guidelines as well; the actual rules and regulations for the protection of human research participants were issued by the U.S. Department of Health and Human Services (HHS). Under these regulations (U.S. Department of Health and Human Services, 2001), every institution that receives federal funds must have an Institutional Review Board (IRB) that is responsible for the review of research conducted within the institution. The HHS regulations also categorized research according to the amount of risk involved in the research.
For a project to be considered research by federal agencies, it must be (1) systematic, meaning purposefully and intentionally designed, and (2) result in generalizable knowledge, meaning the intent must be to create new knowledge with the results.
A. Determining Type of IRB Review
For purposes of IRB review, research with human subjects is classified as either minimal risk or greater than minimal risk. Minimal risk means that the risks of harm to participants are no greater than risks encountered in daily life or in routine physical or psychological tests. If the research procedures are judged by the IRB as no greater than minimal risk, the study qualifies for one of three levels of review: exempt review, expedited review, or limited review. If a project is judged to be greater than minimal risk, then a full board review is required.
1. Exempt Review of Minimal Risk Research
Exempt review applies to research that is exempt from the more rigorous review requirements of the federal regulations. Such research must fall into one of several exempt research categories.
2. Expedited Review of Minimal Risk Research
Expedited review applies to research that is minimal risk but does not match the exempt research categories. Much expedited review research is biological/medical (e.g., collection of blood, hair, or saliva samples); it also includes research procedures frequently used by behavioral researchers.
3. Limited Review of Minimal Risk Research
The third category of IRB review for minimal risk research is called limited review. Studies that can be subject to limited review are those that include benign behavioral interventions in which sensitive data are collected from adult participants under circumstances in which the participants could be identified.
4. Full Review of Greater Than Minimal Risk Research
Any research procedure that places participants at greater than minimal risk is subject to thorough full review by the IRB. Such review is more extensive and time-consuming than the exempt and expedited levels of review. The IRB will carefully consider the proposed informed consent procedure, the nature of the sample and plans for recruiting participants, potential risks, and whether there are alternative procedures available to the researcher.
VIII. Research With Nonhuman Animal Subjects
Animals are used in behavioral research for a variety of reasons. Researchers can carefully control the environmental conditions of the animals, study the same animals over a long period, and monitor their behavior 24 hours a day if necessary. It is crucial to recognize that strict laws and ethical guidelines govern both research with animals and teaching procedures in which animals are used. Institutions in which animal research is carried out must have an Institutional Animal Care and Use Committee (IACUC) composed of at least one scientist, one veterinarian, and a community member. The IACUC is charged with reviewing animal research procedures and ensuring that all regulations are adhered to.
The APA Ethics Code addresses the ethical responsibilities of researchers when studying nonhuman animals.
IX. Being an Ethical Researcher: The Issue of Misrepresentation
Principle C of the APA Ethics Code focuses on integrity: the promotion of accuracy, honesty, and truthfulness. The ethical researcher acts with integrity and, in so doing, does not engage in misrepresentation.
A. Fraud
The fabrication of data is fraud. Instances of fraud in the field of psychology are considered to be very serious, but fortunately, they are very rare. In most cases, fraud is detected when other scientists cannot replicate the results of a study. Fraud is not a major problem in science in part because researchers know that others will read their reports and conduct further studies, including replications. Allegations of fraud should not be made lightly.
B. Plagiarism
Plagiarism refers to misrepresenting another’s work as one’s own. Writers must properly cite their sources. Plagiarism can take the form of submitting an entire paper written by someone else.
Word-For-Word Plagiarism
A writer commits word-for-word plagiarism when they copy a section of another person’s work word for word without placing those words within quotation marks to indicate that the segment was written by somebody else and without citing the source of the information.
R. Paraphrasing Plagiarism
In paraphrasing plagiarism, instead of the words being directly copied without attribution, the ideas are copied without attribution.
X. Conclusion: Risks and Benefits Revisited
When one makes decisions about research ethics, one needs to weigh the factors associated with risk to the participants. One also needs to weigh the direct benefits of the research to the participants, as well as the scientific importance of the research and the educational benefits to the students who may be conducting the research for a class or degree requirement.
Ethical guidelines and regulations evolve over time. The APA Ethics Code and federal, state, and local regulations may be revised periodically.
Engaging With Research: Replication of Milgram
After reading the article, consider the following. NOTE: Student answers will vary.
F. What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
a. How does Burger know that one thing caused another?
G. What did these researchers do? What was the method?
a. How did Burger screen participants in the study? What was the purpose of the screening procedure?
H. What was measured?
I. To what or whom can we generalize the results?
J. What did they find? What were the results?
K. Have other researchers found similar results?
L. What are the limitations of this study?
M. What are the ethical issues present in this study?
a. Conduct an informal risk-benefit analysis. What are the risks and benefits inherent in this study? Do you think that the study is ethically justifiable, given your analysis? Why or why not?
b. Burger paid participants $50 for two 45-minute sessions. Could this be considered coercive? Why or why not?
c. Burger uses deception in this study. Is it acceptable? Do you believe that the debriefing session described in the report adequately addresses the issues of deception?
Sample Answers for Review Questions
Discuss the major ethical issues in behavioral research, including risks, benefits, deception,
debriefing, informed consent, and justice. How can researchers weigh the need to conduct research against the need for ethical procedures?
Student responses will vary but should correctly define and briefly discuss risks, benefits, deception, debriefing, informed consent, justice, and any other relevant terms the student may wish to discuss. Researchers can weigh the need to conduct research against the need for ethical procedures by closely following the ideas listed in the Belmont Code and the APA Ethics Code while they are closely weighing the potential risks and rewards of the research.
N. Why is informed consent an ethical principle? What are the potential problems with obtaining fully informed consent?
Informed consent is an ethical principle because to have autonomy, research participants should be informed about the purposes of the study, the risks and benefits of participation, and their rights to refuse or terminate participation in the study. They can then freely consent or refuse to participate in the research. The potential problems with obtaining fully informed consent include invalid results if subjects know too much about what is going on and their behavior or responses are influenced, or cases when it is impossible to get consent, or consent may not be necessary.
O. What alternatives to deception are described in the text?
Researchers engage in less deception than in the past by studying cognitive variables rather than emotions, by engaging in studies designed to create less deception, and by having to meet the requirements of ethics review boards at their institutions. Another very important way to avoid deception is to provide a thorough debriefing process to make sure subjects have all of the information and do not feel deceived.
P. Summarize the principles concerning research with human participants in the APA Ethics Code.
The APA Ethics Code lists five general ethical principles to use when dealing with human subjects. The first is beneficence and nonmaleficence, which means maximizing benefit while minimizing harm from participation in the research. The second is fidelity and responsibility. This means researchers have professional and scientific responsibilities to societies and to specific communities in which they work. The third ethical principle is integrity, which means being truthful, accurate, and honest. The fourth is justice, which refers to fairness and equality. The final ethical principle is respect for people’s rights and dignity.
Q. What is the difference between “no-risk” and “minimal-risk” research activities?
“No-risk” research activities are exempt from review by an Institutional Review Board and do not require informed consent. Examples would be public observation and anonymous questionnaires in which there is no risk of the subjects being identified. “Minimal-risk” research activities present a risk of harm to participants no greater than risks encountered in daily life or in routine physical or psychological tests. When minimal-risk research is being conducted, elaborate safeguards are less of a concern, and approval by the IRB is routine.
R. What is an Institutional Review Board?
An Institutional Review Board (IRB) is a local review agency composed of at least five individuals, and at least one member of the IRB must be from outside the institution. IRBs are mandated by the U.S. Department of Health and Human Services at every institution that receives federal funds. Each of these institutions must have an IRB that is responsible for the review of research conducted within the institution. All research conducted by faculty, students, and staff associated with the institution is reviewed in some way by the IRB. This includes research that may be conducted at another location such as a school, community agency, hospital, or via the Internet.
S. Summarize the ethical procedures for research with animals.
Ethical procedures for research with animals comply with strict laws and ethical guidelines. Such regulations deal with the need for proper housing, feeding, cleanliness, and health care. They specify that the research must avoid any cruelty in the form of unnecessary pain to the animal. In addition, institutions in which animal research is carried out must have an Institutional Animal Care and Use Committee (IACUC) composed of at least one scientist, one veterinarian, and a community member. The IACUC is charged with reviewing animal research procedures and ensuring that all regulations are adhered to.
T. What constitutes fraud, what are some reasons for its occurrence, and why does it not occur more frequently?
Fraud occurs when data are fabricated. Fraud often occurs when researchers are under high pressure to produce results, or when researchers have an exaggerated fear of failure or need for success. Fraud doesn’t occur more frequently because most research results will be reviewed by peers and closely scrutinized, making it very likely that the fraud will be detected.
U. Describe how you would proceed to identify plagiarism in a writing assignment.
Plagiarism can be identified in a writing assignment by determining whether you actually wrote the words and by determining if the ideas being used are actually your own ideas. Copying exact words would indicate word-for-word plagiarism and copying ideas would indicate paraphrasing plagiarism.
Sample Answers for Being a Skilled Consumer of Research
What do you think about the ethical aspects of the Milgram obedience study? Do you think that the study should have been allowed? Were the potential risks to Milgram’s participants worth the knowledge gained by the outcomes? If you were a participant in the
study, would you feel okay with having been deceived into thinking that you had harmed someone? What if the person you were “shocking” was a younger sibling? Or a grandparent? Would that make a difference? Why or why not?
Student answers will vary.
V. A recent study showed that participants often don’t read beyond the second paragraph of an informed consent document before signing (Douglas et al., 2021). Is this something researchers should be concerned about? Why? What might you do to increase the likelihood of reading more of the informed consent document?
Student answers will vary.
W. Consider the following experiment, similar to the one that was conducted by Smith et al. (1978). Each participant interacted for an hour with another person who was actually an accomplice of the researcher. After this interaction, both persons agreed to return 1 week later for another session with each other. When the real participants returned, they were informed that the person they had met the week before had died. The researchers then measured reactions to the death of the person.
a. Discuss the ethical issues raised by the experiment.
b. Would the experiment violate the guidelines articulated in the APA Ethics Code (see Appendix B)? In what ways?
c. What alternative methods for studying this problem (reactions to death) might you suggest?
d. Would your reactions to this study be different if the participants had played with an infant and then later been told that the infant had died? Would your analysis of the ethics of the study change? Why or why not?
A good answer should include discussion of ethical issues regarding psychological stress and the use of deception. Informing the participant that the other person had died may result in an unpleasant emotional experience with unintended consequences. In addition, the use of deception in order to examine the participant’s emotions may be negatively perceived by the participant. The answer should also include a discussion of an alternative method, such as role-playing, to examine the topic. Role-playing would involve the experimenter describing the situation and then asking the participant how they would respond to it.
Dr. Rodríguez conducted a study to examine various aspects of college students’ drug use. The students filled out a questionnaire in a classroom on the campus; about 50 students
were tested at a time. The questionnaire asked about prior experience with various illegal substances. If a student had experience with a drug, a number of other detailed questions were asked. However, if the student did not have any prior experience with a drug, they skipped the detailed questions and simply went on to answer another general question about drug use. What ethical issues arise when conducting research such as this? What problems might arise because of the “skip” procedure used in this study?
A good answer will include a mention of the possible loss of anonymity. Requiring students with prior experience to answer detailed questions may result in them taking longer to finish the survey than those with no experience. Thus, those individuals taking the longest to complete the survey may be identified as those with the most drug experience.
X. Find your college’s code of student conduct online and review the section on plagiarism. How would you improve this section? What would you tell your professors to do to help students avoid plagiarism?
Students’ answers will vary. Some students may suggest that professors should emphasize the importance of citing sources by giving examples of how it strengthens student papers.
Activity: Films to Illustrate Ethical Issues
A variety of films can be used to stimulate discussion of research ethics: for example, the Stanford Prison Experiment (video; available from Philip Zimbardo at Stanford University). Also, all colleges that have contracts for federal grants have copies of videotapes that deal with the Department of Health and Human Services ethical regulations.
The American Psychological Association produces a series of educational videos that address a variety of topics in research methods. Although the videos are produced primarily as a resource for secondary school teachers, there may be relevant information for your course. A website for the videos, especially one addressing animal research and ethics, may be found at the following address: http://www.apa.org/monitor/dec03/lab.html
Activity: IRB Decisions
Have students act as members of an IRB for a real or fictional study. For the study, have the students discuss the issues that would need to be addressed in order for the study to be approved. Instructors could also have them write an informed consent form for the studies addressed.
Activity: Search for News Articles on Scientific Ethics
If students have access to computer databases of news articles, such as LexisNexis and Factiva, it can be instructive to search for terms such as “fraud in scientific research” and “research ethics.”
Students can describe the most interesting things they found in a one-page summary that can be shared with other students.
Activity: Plagiarism and Research Fraud
The chapter only touched upon the topic of plagiarism in the context of honesty in reporting results. Have students give their own definition of plagiarism and an example of plagiarism. Compare the students’ definitions with your definition or the statement of academic honesty that is used at the instructor’s college.
There are several online interactive tutorials and self-tests that cover the topic of plagiarism. Instructors might have students work through one of the tutorials and/or tests. Instructors could also base a class lecture hour on a tutorial and test.
Additional Discussion Topics
Discussion: Research on Children
An interesting discussion involves research on children. Some of the same arguments that are used against animal research come into play here. Ask students how many of them think research on children is OK. Then ask how many would allow their own child to participate in research. Out of those, how many would only allow participation if their child was ill and needed an experimental drug or therapy? Based on their responses, you can have a lively discussion on both when experimentation on children is appropriate and on sampling issues.
Discussion: Research on Animals
An interesting discussion can be had if instructors have students all stand up and then ask those against animal testing to go to one side of the room, those for it to go to the other, and those who are undecided to stay in the middle. Ask students who are against and for animal testing to state their opinions on the matter. Ask students, by show of hands or clickers, how many of them feel bad about research on chimps. What about cockroaches and sea slugs as subjects? Students often have knee-jerk responses to animal research, but only with some species and some types of research. Discussion can be used to illustrate this.
Discussion: Zimbardo’s Prison Study
The book briefly describes the Stanford Prison study. You can show a YouTube clip of the experiment (Milgram’s study is also on there) and discuss the risk–benefit of such research. Remind students that 100 psychologists were polled prior to Milgram beginning his research and none thought that subjects would shock someone to death. Likewise, Zimbardo’s colleagues, and Zimbardo himself, did not expect the power of the situation to be so considerable. In follow-up studies, both groups of subjects reported no lasting ill effects, and both studies have contributed greatly to our understanding of the power of the situation on behavior.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Beins, B. C. (1993). Using the Barnum effect to teach about ethics and deception in research. Teaching of Psychology, 20, 33–35.
Hubbard, R. W., & Ritchie, K. L. (1995). The human subjects review procedure: An exercise in critical thinking for undergraduate experimental psychology students. Teaching of Psychology, 22, 64–65.
Kallgren, C. A., & Tauber, R. T. (1996). Undergraduate research and the institutional review board: A mismatch or happy marriage? Teaching of Psychology, 23, 20–25.
Korn, J. H. (1988). Students’ roles, rights, and responsibilities as research participants. Teaching of Psychology, 15, 74–78.
Korn, J. H., & Hogan, K. (1992). Effect of incentives and aversiveness of treatment on willingness to participate in research. Teaching of Psychology, 19, 21–24.
McMinn, M. R. (1988). Ethics case-study simulation: A generic tool for psychology teachers. Teaching of Psychology, 15, 100–101.
Nimmer, J. G., & Handelsman, M. M. (1992). Effects of subject pool policy on student attitudes toward psychology and psychological research. Teaching of Psychology, 19, 141–144.
Rosnow, R. L. (1990). Teaching research ethics through roleplay and discussion. Teaching of Psychology, 17, 179–181.
Also recommended:
Bragger, J. D., & Freeman, M. A. (1999). Using a cost-benefit analysis to teach ethics and statistics. Teaching of Psychology, 26, 34–36.
Brinthaupt, T. M. (2002). Teaching research ethics: Illustrating the nature of the researcher-IRB relationship. Teaching of Psychology, 29, 243–245.
Keith-Spiegel, P. C., Tabachnick, B. G., & Allen, M. (1993). Ethics in academia: Students’ views of professors’ actions. Ethics and Behavior, 3(2), 149–162.
Landau, J. D., Druen, P. B., & Arcuri, J. A. (2002). Methods for helping students avoid plagiarism. Teaching of Psychology, 29, 112–115.
Schuetze, P. (2004). Evaluation of a brief homework assignment designed to reduce citation problems. Teaching of Psychology, 31, 257–259.
The American Psychological Association has a website which contains the most recent code of ethics for psychologists and researchers at: https://www.apa.org/ethics/code/index
Chapter 4: Fundamental Research Issues
Learning Objectives
Define construct validity, internal validity, external validity, and conclusion validity. Compare and contrast the four validities.
Define what a variable is and be able to develop an operational definition of a variable.
Identify the different relationships between variables: positive, negative, curvilinear, and no relationship.
Compare and contrast nonexperimental and experimental research methods.
Distinguish between an independent variable and a dependent variable.
Summarize the strengths and limitations of laboratory experiments and the advantage of using multiple methods of research.
Brief Chapter Outline
XV. Validity: An Introduction
XVI. Variables
XVII. Operational Definitions of Variables
A. Construct Validity
XVIII. Relationships Between Variables
A. Positive Linear Relationship
B. Negative Linear Relationship
C. Curvilinear Relationship
D. No Relationship
E. Relationships and Reduction of Uncertainty
XIX. Nonexperimental Versus Experimental Methods
A. Nonexperimental Method
1. Direction of Cause and Effect
2. The Third-Variable Problem
B. Experimental Method
1. Experimental Control
2. Randomization
C. Internal Validity and the Experimental Method
D. Independent and Dependent Variables
XX. Experimental Methods: Additional Considerations
A. Experiments Are Tightly Controlled, Unlike the Real World
B. Some Variables Cannot Be or Should Not Be Manipulated
C. Some Questions Are Not Answerable by an Experiment
D. Advantages of Multiple Methods
XXI. Evaluating Research: Summary of the Four Validities
Extended Chapter Outline
Please note that much of this information is quoted from the text.
XI. Validity: An Introduction
There are four key types of validity:
Construct validity is the extent to which the measurement or manipulation of a variable accurately represents the theoretical variable being studied.
Internal validity refers to the accuracy of conclusions drawn about cause and effect.
External validity is the extent to which a study’s findings can be accurately generalized to other populations and settings.
Statistical validity is the accuracy of the statistical conclusions drawn from the results of a research investigation.
XII. Variables
A variable is something that changes. A variable can be a behavior, thought, feeling, situation, characteristic, or an event. Anything that varies and can be measured is a variable. Any variable must have two or more levels or values.
XIII.Operational Definitions of Variables
The operational definition of a variable is the set of procedures used to measure or manipulate it.
C. Construct Validity
Construct validity refers to the accuracy of our operational definitions: Does the operational definition of a variable actually reflect the true theoretical meaning of the variable?
XIV. Relationships Between Variables
When both variables have values along a numeric scale, many different “shapes” can describe their relationship. The four most common relationships found in research are the positive linear relationship, the negative linear relationship, the curvilinear relationship, and the situation in which there is no relationship between the variables.
Positive Linear Relationship
In a positive linear relationship, increases in the values of one variable are accompanied by increases in the values of the second variable. For example, speech rate in a persuasive speech and attitude change in the audience; faster rates of speech are associated with more attitude change.
D. Negative Linear Relationship
In a negative linear relationship, increases in the values of one variable are accompanied by decreases in the values of the other variable. For example, Latané et al. (1979) were intrigued by reports that increasing the number of people working on a task may actually reduce group effort and productivity.
E. Curvilinear Relationship
In a curvilinear relationship, increases in the values of one variable are accompanied by systematic increases and decreases in the values of the other variable. In other words, the direction of the relationship changes at least once. This type of relationship is sometimes referred to as a nonmonotonic function.
F. No Relationship
When there is no relationship between the two variables, the graph is simply a flat line. A numerical index of the strength of relationship between variables is called a correlation coefficient. Correlation coefficients are very important because one needs to know how strongly variables are related to one another.
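Instructors who want a concrete illustration of how a correlation coefficient summarizes these relationship shapes can compute Pearson’s r in a few lines. The sketch below is illustrative only: it uses plain Python with invented data sets, and the function name `pearson_r` is our own, not from the text.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
print(round(pearson_r(x, [2, 4, 6, 8, 10]), 6))  # positive linear: prints 1.0
print(round(pearson_r(x, [10, 8, 6, 4, 2]), 6))  # negative linear: prints -1.0
print(round(pearson_r(x, [1, 4, 5, 4, 1]), 6))   # curvilinear: prints 0.0
```

Note that the curvilinear data set yields r = 0 even though the variables are clearly related; this can launch a useful class discussion of why a linear index misses nonmonotonic relationships.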
G. Relationship and Reduction of Uncertainty
When one detects a relationship between variables, one reduces uncertainty by increasing one’s understanding of the variables that one is examining. The term uncertainty implies that there is randomness in events; scientists refer to this as random variability in events that occur. Research can reduce random variability by identifying systematic relationships between variables.
XV. Nonexperimental Versus Experimental Methods
With the nonexperimental method, relationships are studied by observing variables of interest. This may be done by asking people to describe their behavior, directly observing behavior, recording physiological responses, or even examining various public records such as census data.
In all these cases, variables are observed as they occur naturally.
The second approach to the study of relationships, the experimental method, involves direct manipulation and control of variables. The researcher manipulates the first variable of interest and then observes the response.
Nonexperimental Method
Because the nonexperimental method allows one to observe covariation between variables, another term that is frequently used to describe this procedure is the correlational method. This method is not ideal when asking questions about cause and effect for two primary reasons:
5. Direction of Cause and Effect
The first problem involves direction of cause and effect. With the nonexperimental method, it is difficult to determine which variable causes the other.
6. The Third-Variable Problem
A third variable is any variable that is extraneous to the two variables being studied; this is why these variables are sometimes referred to as extraneous variables. Any number of other third variables may be responsible for an observed relationship between two variables.
When we know that an uncontrolled third variable is operating, we can call the third variable a confounding variable. If two variables are confounded, they are intertwined so that one cannot determine which of the variables is operating in a given situation.
H. Experimental Method
The experimental method reduces ambiguity, and thus uncertainty, in the interpretation of results. With the experimental method, one variable is manipulated and the other is then measured. The manipulated variable is called the independent variable, and the variable that is measured is termed the dependent variable.
Experimental Control
In an experiment, all extraneous variables are controlled by being held constant. This is called experimental control. If a variable is held constant, it cannot be responsible for the results of the experiment. In other words, any variable that is held constant cannot
be a confounding variable.
7. Randomization
The experimental method eliminates the influence of such variables by randomization. Randomization ensures that an extraneous variable is just as likely to affect one experimental group as it is to affect the other group. To eliminate the influence of individual characteristics, the researcher assigns participants to the two groups in a random fashion.
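As a concrete illustration of random assignment, the minimal sketch below shuffles a participant list and splits it into two equal groups, so that any individual characteristic is equally likely to land in either condition. The participant IDs are hypothetical, invented for illustration.

```python
import random

participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]
random.shuffle(participants)               # put participants in a random order
half = len(participants) // 2
experimental_group = participants[:half]   # first half -> experimental condition
control_group = participants[half:]        # second half -> control condition
print(experimental_group, control_group)
```

In practice, researchers often use a software tool or random number table for this step; the point is simply that assignment is determined by chance, not by any characteristic of the participant.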
I. Internal Validity and the Experimental Method
Internal validity is the ability to draw conclusions about causal relationships from the results of a study. A study has high internal validity when strong inferences can be made that one variable caused changes in the other variable. One can see that strong causal inferences can be made more easily when the experimental method is used.
Strong internal validity requires an analysis of three elements:
There must be temporal precedence.
There must be covariation between the two variables.
There is a need to eliminate plausible alternative explanations for the observed relationship.
J. Independent and Dependent Variables
Researchers use the terms independent variable and dependent variable when referring to the variables being studied. In an experiment, the manipulated variable is the independent variable. After manipulating the independent variable, the researchers measure a second variable, called the dependent variable. The basic idea is that the researchers make changes in the independent variable and then see whether the dependent variable changes in response.
Note that some research focuses primarily on the independent variable, with the researcher studying the effect of a single independent variable on numerous behaviors.
XVI. Experimental Methods: Additional Considerations
Experiments Are Tightly Controlled, Unlike the Real World
The external validity of a study is the extent to which the results can be generalized to other populations and settings. Another alternative is to try to conduct an experiment in a field setting.
In a field experiment, the independent variable is manipulated in a natural setting.
K. Some Variables Cannot Be or Should Not Be Manipulated
Sometimes the experimental method is not a feasible alternative because experimentation would be either unethical or impractical; for example, child-rearing practices would be impractical to manipulate with the experimental method.
Participant variables (also called subject variables and personal attributes) are characteristics of individuals, such as age, gender, ethnic group, nationality, birth order, personality, or marital status. These variables are by definition nonexperimental. They cannot be manipulated; they must only be measured.
L. Some Questions Are Not Answerable by an Experiment
A major goal of science is to provide an accurate description of events. Thus, the goal of much research is to describe behavior; in those cases, causal inferences are not relevant to the primary goals of the research. A classic example of descriptive research in psychology comes from the work of Jean Piaget, who carefully observed the behavior of his own children as they matured.
In many real-life situations, a major concern is to make a successful prediction about a person’s future behavior: for example, success in school, ability to learn a new job, or probable interest in various major fields in college. In such circumstances, there may be no need to be concerned about issues of cause and effect. It is possible to design measures that increase the accuracy of predicting future behavior.
XVII. Advantages of Multiple Methods
Perhaps most important, complete understanding of any phenomenon requires study using multiple methods, both experimental and nonexperimental. No method is perfect, and no single study is definitive.
XVIII. Evaluating Research: Summary of the Four Validities
Validity refers to the idea that, given everything that is known, a conclusion is reasonably accurate. Research can be described and evaluated in terms of four types of validity:
Construct validity refers to the extent to which the measurement or manipulation of a variable accurately represents the theoretical variable being studied.
Internal validity refers to the accuracy of conclusions drawn about cause and effect
External validity is the extent to which findings of a study can accurately be generalized
to other populations and settings.
Statistical validity is the accuracy of the statistical conclusions drawn from the results of a research investigation.
Engaging With Research: Studying Discrimination
After reading the article, consider the following. NOTE: Student answers will vary.
VI. What is the primary goal of this study: Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
d. On what basis did the authors conclude that they found “evidence of ethnic discrimination for our heterosexual senders” (p. 357)?
e. How do they know that one thing caused another?
VII. How do they know that one thing caused another?
VIII. What was measured? How did Agerstrom et al. (2021) operationally define “discrimination”? How did they operationally define “ethnic minority”?
IX. To what or whom can we generalize the results? Do you think that the researchers can generalize to all employers? All employers in Sweden?
X. What did they find? What were the results?
XI. Have other researchers found similar results?
XII. What are the limitations of this study?
XIII. What are the ethical issues present in this study?
Sample Answers for Review Questions
What is a variable?
A variable is any event, situation, behavior, or individual characteristic that changes or varies. Any variable must have two or more levels or values.
XIV. Define ―operational definition‖ of a variable.
The operational definition of a variable is the set of procedures used to measure or
manipulate it.
3. Distinguish among positive linear, negative linear, and curvilinear relationships.
In a positive linear relationship, increases in the value of one variable are accompanied by increases in the value of the second variable. In a negative linear relationship, increases in the value of one variable are accompanied by decreases in the value of the other variable. In curvilinear relationships, increases in the values of one variable are accompanied by systematic increases and decreases in the values of the other variable.
4. What is the difference between the nonexperimental method and the experimental method?
In the nonexperimental method, relationships are studied by observing variables of interest, but in the experimental method, variables are directly manipulated and controlled.
5. What is the difference between an independent variable and a dependent variable?
The manipulated variable is the independent variable, and the variable that is measured is the dependent variable.
6. Distinguish between laboratory experiments and field experiments.
Student answers should demonstrate an understanding of how laboratory and field experiments differ and explain that field experiments take place in a natural setting, whereas laboratory experiments take place in a more controlled, artificial environment.
7. What is meant by the problem of direction of cause and effect and the third-variable problem?
Student answers should demonstrate an understanding of how a third variable extraneous to the two variables being studied can influence the observed relationship between them. They should also explain how direction of cause and effect deals with trying to determine which variable causes the other.
8. How do direct experimental control and randomization influence the possible effects of extraneous variables?
In experimental control, the extraneous variables are held constant so that they cannot be responsible for the results of the experiment. In randomization, extraneous variables are just as likely to affect one group as the other group, instead of affecting one group disproportionately.
9. What are some reasons for using the nonexperimental method to study relationships between variables?
Student answers should demonstrate an understanding of why it is sometimes favorable to use the nonexperimental method to observe variables without manipulating them.
Sample Answers for Being a Skilled Consumer of Research
1. The dictionary definition of shy is "being reserved or having or showing nervousness or timidity in the company of other people." Create three different operational definitions of shy and provide a critique of each one. Example: An operational definition of shy could be the number of new people that a person reports meeting in a given day. Critique: What if an outgoing person has a job that requires them to meet very few people? They may be considered (incorrectly) shy by this operational definition.
A good answer should include definitions and critiques from three different scenarios, each of which holds up independently.
2. Consider the hypothesis that stress at work causes family conflict at home.
a. What type of relationship is proposed (e.g., positive linear, negative linear)?
b. Graph the proposed relationship.
c. Identify the independent variable and the dependent variable in the statement of the hypothesis. How might you investigate the hypothesis using the experimental method?
d. How might you investigate the hypothesis using the nonexperimental method (recognizing the problems of determining cause and effect)?
e. What factors might you consider in deciding whether to use the experimental or nonexperimental method to study the relationship between work stress and family conflict?
A good answer should include a detailed discussion about the type of relationship proposed. This should be followed by an investigation of the hypothesis using the experimental and nonexperimental method.
3. You observe that classmates who get good grades tend to sit toward the front of the classroom, while those who receive poor grades tend to sit toward the back. What are three possible cause-and-effect relationships for this nonexperimental observation?
A good answer should include discussion of direction of cause and effect and the third variable problem. The three possible cause-and-effect relationships are: good grades may cause students to sit toward the front of the classroom, sitting at the front of the classroom may cause good grades, or a third variable may cause both good grades and students to sit at the front of the classroom.
4. Identify the independent and dependent variables in the following descriptions of experiments:
a. Students watched a cartoon either alone or with others and then rated how funny they found the cartoon to be.
Independent Variable: Watching the cartoon (alone vs. with others)
Dependent Variable: Rating of how funny they found the cartoon to be
b. A comprehension test was given to students after they had studied textbook material either in silence or with the television turned on.
Independent Variable: Environment in which studying occurred (silence vs. television on)
Dependent Variable: Scores on the comprehension test
c. Some elementary schoolteachers were told that a child's parents were college graduates, and other teachers were told that the child's parents had not finished high school; they then rated the child's academic potential.
Independent Variable: Child’s parents’ education (college graduate vs. not finishing high school)
Dependent Variable: The child’s academic potential
d. Workers at a company were assigned to one of two conditions: one group completed a stress management training program; another group of workers did not participate in the training. The number of sick days taken by these workers was examined for the two subsequent months.
Independent Variable: Group training (completion of a stress management training vs. no training)
Dependent Variable: Number of sick days
5. A few years ago, newspapers reported a finding that Americans who have a glass of wine a day are healthier than those who have no wine (or who have a lot of wine or other alcohol). What are some plausible alternative explanations for this finding; that is, what variables other than wine could explain the finding? (Hint: What sorts of people in the United States are most likely to have a glass of wine with dinner?)
A good answer might include a discussion of an individual’s income and access to medical care as possible influences on one’s health.
Laboratory Demonstration: Relationships Among Variables
To clarify the concepts of correlation and relationships among variables, create scatterplots from data generated from the students in the class. At the class session prior to the demonstration,
distribute data sheets that ask for data on pairs of variables likely to produce positive, negative, and zero correlations. (Some ideas are listed below.) Collect the completed sheets. Before the next class meeting, tabulate the data for each pair of variables, using a different page for each pair. On the day of the demonstration, divide the class into groups; give each group the tabulated data from a pair of variables with instructions to create a scatterplot for that data. (This could be done on the board or on a transparency prepared ahead of time.) Have the class identify what kind of relationship exists between each pair of variables.
Probable positive correlations:
Height and weight
Age of student’s car and mileage on the odometer
Number of units and amount spent on textbooks
Distance a student lives from school and time it takes to drive to school
Probable negative correlations:
Number of units taken last term and average number of hours worked per week at an outside job
Number of miles from home to campus and number of trips to campus per week
Political position (1 = extremely liberal, 10 = extremely conservative) and attitude toward feminism (1 = antifeminism, 10 = profeminism)
Probable zero correlations:
Number of siblings and number of books read last year
Age of mother and age of car
Number of units completed and height
Shoe size and amount of time spent studying during the week
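Instructors who want to preview the expected patterns before class can simulate the three kinds of variable pairs. The sketch below (Python, using hypothetical simulated data; the variable names and parameters are illustrative choices, not from the manual) generates a positive, a negative, and a near-zero relationship and computes Pearson's r for each:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 30  # hypothetical class size

# Probable positive correlation: height (inches) and weight (pounds)
height = rng.normal(67, 3, n)
weight = 4 * height + rng.normal(0, 15, n)

# Probable negative correlation: units taken and hours worked per week
units = rng.integers(3, 18, n).astype(float)
hours = 40 - 2 * units + rng.normal(0, 5, n)

# Probable zero correlation: shoe size and weekly study hours
shoe = rng.normal(9, 1.5, n)
study = rng.normal(12, 4, n)

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

print(f"height/weight r = {pearson_r(height, weight):+.2f}")  # expect positive
print(f"units/hours   r = {pearson_r(units, hours):+.2f}")    # expect negative
print(f"shoe/study    r = {pearson_r(shoe, study):+.2f}")     # expect near zero
```

Plotting each pair with a scatterplot (e.g., `matplotlib.pyplot.scatter`) reproduces the visual patterns the class will draw by hand.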
Laboratory Demonstration: Experimental and Nonexperimental Methods
This demonstration emphasizes the difference between the nonexperimental and experimental methods for studying relationships among variables. However, the composition of the class will affect the success of the demonstration, so choose the version that is most suitable to the instructor’s particular class.
Version A: Use this version if most of the students in your class are traditional students in the sense that they have entered college from high school and enroll as full-time students.
Propose to the class that the instructor wants to investigate whether or not students’ place of
residence affects the progress they make in school. Divide the class into two groups: those who live with their parents or in dorms and those who live in their own apartments or houses. Then find the mean number of total academic units completed for each group. Most likely, the group who lives on their own will have a higher average number of completed units. Have the class discuss the conclusion that living on your own facilitates progress in school. (Instructors can play devil’s advocate and propose such explanations as fewer distractions, fewer relationship power struggles, fewer outside demands on time, and so on.) Eventually, the discussion should lead to the recognition of the third-variable problem. Those students who live with their parents or in dorms are more likely to be younger compared with students living on their own and consequently will have completed fewer school terms. Compute the mean age for each group to see if this is, in fact, an important factor.
Version B: Use this version if there are many part-time students and/or older students who have returned to school. (This will be particularly likely if you are teaching a night class.)
Use the same experimental question as above but substitute "number of units completed last term," instead of total units completed, for the dependent measure. Most likely, those who live at home or in a dorm will have completed more units in the previous term than those who live on their own. Have the class discuss the conclusion that living on one's own inhibits one's progress in school. The discussion should lead to the recognition of the third-variable problem. Those students who live on their own are more likely to be part-time students who either work greater numbers of hours or have more off-campus commitments (e.g., students with parenting commitments or who work full time and go to school part time). Compute the group means for such variables as hours worked per week or number of children.
Have the class discuss how they could examine this question experimentally. Discuss how the procedures would differ. Randomly assign the students to two groups. Compare the groups for differences in age or number of hours worked per week. This should show how random assignment takes care of such extraneous variables.
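To make the final point concrete, a quick simulation can show how random assignment equalizes an extraneous variable such as age across the two groups. This is an illustrative sketch with made-up numbers, not part of the manual's procedure:

```python
import random
import statistics

random.seed(1)

# Hypothetical class: each student has an age (the extraneous variable)
ages = [random.gauss(24, 6) for _ in range(200)]

# Random assignment: shuffle the students, then split into two groups
random.shuffle(ages)
group_a, group_b = ages[:100], ages[100:]

mean_a = statistics.mean(group_a)
mean_b = statistics.mean(group_b)
print(f"Group A mean age: {mean_a:.1f}")
print(f"Group B mean age: {mean_b:.1f}")
# With random assignment the group means come out close,
# so age cannot systematically favor one group over the other.
```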
Activity: Developing Operational Definitions of Variables
Have students generate a number of different variables as the instructor writes them on the board. Go through the list checking to make sure that each is in fact a variable. Then have students indicate how each variable could be operationally defined using the experimental or nonexperimental method. Discuss the conditions under which each would be preferred, the problems of each method, and the ethics and practicality of manipulation of variables.
Activity: Research Programs in the Department
In most departments, there are faculty and students with active research programs. If these individuals are willing to cooperate, an interview assignment can be useful. Students or groups of students are sent to interview a researcher and observe the research, if possible. Students should have a standard set of questions that might include:
a. Researcher's name and educational background
b. Area of major interest
c. Major questions/emphasis of the research program
d. How the researcher became interested in the topic
e. Theoretical or practical implications of the research
f. Research procedures being used
g. Whether the student can observe the research
h. Recent findings of the research program
This activity introduces students to faculty in the department, shows that research is important and relevant, and may motivate students to begin doing research with a faculty member.
Activity: Is It a Correlation or Is It an Experiment?
Use Handout 8 in the back of this instructor’s manual to give students the opportunity to differentiate between correlations and experiments and between independent variables and dependent variables.
Activity: Reading a Journal Article
An important initial experience is success in reading and understanding the content of a research report. This assignment might be used early in the course to guide students through an article. Or it might be used as a framework for reading several references that relate to a class or individual project. Handout 1 in Part II of this manual may be useful.
Recommended:
Varnhagen, C. K., & Digdon, N. (2002). Helping students read reports of empirical research. Teaching of Psychology, 29, 160–164.
Activity: Identifying and Operationally Defining
Variables
Students benefit from in-class or homework exercises that require them to select the appropriate method to study a hypothesis, identify independent and dependent variables, and propose a precise operational definition for each variable. A handout for such an exercise is printed in Part II (Handout 2).
Additional Discussion Topics
Discussion: Correlations and Causality
It is often difficult for students to remember that correlation does not imply causality. Correlations between smoking and lung cancer may make a causal inference seem intuitively obvious but remind students that something else may cause both, like a gene. The correlation between shark attacks and ice cream sales is a great example; ask students what kinds of things could cause both to increase?
Other correlations that instructors can use include the number of books in the family home and IQ score, the number of electrical appliances in the family home and teenager use of birth control, and shaving less than once a day and having a stroke.
Discussion: Naturalistic Versus Structured Observation
Remind students that the decision to use field study (naturalistic observation) instead of a laboratory or structured observation is in part based on design and resources available. The book describes several studies looking at structured or laboratory studies such as the social loafing study described. Naturalistic or field studies are often used but bring a different set of issues. First, there is less control over the environment than one has in a lab setting. Second, sometimes it may be difficult for an experimenter to be both a good person and a good experimenter.
Example: Jane Goodall, who studied chimps, actually rescued a chimp that had lost a dominance hierarchy battle and would have died without her intervention. Or let’s say one observes aggressive behavior on playgrounds and they see a group of boys beating up another boy. What should one do?
Chapter 5: Measurement Concepts
Learning Objectives
Define reliability of a measure of behavior and describe the difference between test-retest, internal consistency, and interrater reliability.
Define construct validity and discuss ways to establish construct validity.
Compare face validity, content validity, predictive validity, concurrent validity, convergent validity, and discriminant validity.
Describe the problem of reactivity of a measure of behavior and discuss ways to minimize reactivity.
Describe the properties of the four scales of measurement: nominal, ordinal, interval, and ratio.
Brief Chapter Outline
I. Reliability of Measures
A. Test-Retest Reliability
B. Internal Consistency Reliability
C. Interrater Reliability
D. Reliability and Accuracy of Measures
II. Construct Validity of Measures
A. Indicators of Construct Validity
1. Face Validity
2. Content Validity
3. Predictive Validity
4. Concurrent Validity
5. Convergent Validity
6. Discriminant Validity
B. Measurement Validity: For Whom
III. Reactivity of Measures
IV. Variables and Measurement Scales
A. Nominal Scales
B. Ordinal Scales
C. Interval and Ratio Scales
D. The Importance of the Measurement Scales
Extended Chapter Outline
Please note that much of this information is quoted from the text.
I. Reliability of Measures
Reliability refers to the consistency or stability of a measure. Your everyday definition of reliability is quite close to the scientific definition. For example, you might say that Professor Fuentes is "reliable" because she begins class exactly at 10 a.m. each day.
Another way to think about the reliability of a measure is to think about the concepts of true score and measurement error. A true score is someone's real, "true" value on a given variable. The difference between a true score and a measured score is measurement error. An unreliable measure of intelligence contains considerable measurement error. Thus, it cannot reflect an individual's true intelligence.
Reliability is most likely to be achieved when researchers use careful measurement procedures.
A correlation coefficient is a number that tells us how strongly two variables are related to each other. The correlation coefficient most commonly referred to when discussing reliability is the Pearson product-moment correlation coefficient. The Pearson correlation coefficient (symbolized as r) can range from 0.00 to +1.00 and from 0.00 to −1.00. A correlation of 0.00 tells us that the two variables are not related at all. The closer a correlation is to 1.00, either +1.00 or −1.00, the stronger is the relationship.
The positive and negative signs provide information about the direction of the relationship. When the correlation coefficient is positive (a plus sign), there is a positive linear relationship: high scores on one variable are associated with high scores on the second variable. A negative linear relationship is indicated by a minus sign: high scores on one variable are associated with low scores on the second variable.
A. Test-Retest Reliability
Test-retest reliability is assessed by measuring the same individuals at two points in time. For example, the reliability of a test of intelligence could be assessed by giving the measure to a group of people on one day and again a week later. One would then have two scores for each person, and a correlation coefficient could be calculated to determine the relationship between the first test score and the retest score. High reliability is indicated by a high correlation coefficient showing that the two scores are very similar.
Given that test-retest reliability requires administering the same test twice, the correlation might be artificially high because the individuals remember how they responded the first time. Alternate forms reliability is sometimes used to avoid this problem; it requires administering two different forms of the same test to the same individuals at two points in time.
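The true-score logic of test-retest reliability can be illustrated with a short simulation. In this sketch (hypothetical values; the standard deviations are chosen purely for illustration), two measurements share a common true score plus independent measurement error, and their correlation approaches the theoretical reliability of 225/(225 + 25) = 0.90:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50  # hypothetical sample size

# Simulate true scores plus independent measurement error at two times
true_score = rng.normal(100, 15, n)           # true variance = 15^2 = 225
time1 = true_score + rng.normal(0, 5, n)      # test (error variance = 25)
time2 = true_score + rng.normal(0, 5, n)      # retest one week later

r = float(np.corrcoef(time1, time2)[0, 1])
print(f"test-retest reliability r = {r:.2f}")
# Theoretical reliability = true variance / total variance = 225/250 = 0.90
```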
B. Internal Consistency Reliability
Internal consistency reliability is the assessment of reliability using responses at only one point in time. Because all items measure the same variable, they should yield similar or consistent results.
One indicator of internal consistency is split-half reliability; this is the correlation of the total score on one half of the test with the total score on the other half. The two halves are created by randomly dividing the items into two parts.
One drawback of split-half reliability is that it is based on only one of many possible ways of dividing the measure into halves. Perhaps the most commonly used indicator of reliability based on internal consistency, called Cronbach's alpha, provides us with the average of all possible split-half reliability coefficients.
It is also possible to examine the correlation of each item score with the total score based on all items. Such item-total correlations are very informative because they provide information about each individual item.
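Cronbach's alpha is straightforward to compute from an item-by-respondent score matrix using the standard formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total scores). The sketch below uses simulated, hypothetical item scores that share a common factor, so the items should be internally consistent:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 100, 5  # hypothetical: 100 respondents, 5 items

# Each item = a shared common factor plus item-specific noise
factor = rng.normal(0, 1, n)
items = np.column_stack([factor + rng.normal(0, 1, n) for _ in range(k)])

def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

print(f"alpha = {cronbach_alpha(items):.2f}")
```

With an average inter-item correlation of about .50 and five items, the Spearman-Brown logic predicts an alpha of roughly .83, which the simulation approximates.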
C. Interrater Reliability
Interrater reliability is the extent to which raters agree in their observations. A commonly used indicator of interrater reliability is called Cohen’s kappa.
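Cohen's kappa corrects raw percent agreement for the agreement expected by chance: kappa = (p_observed − p_chance) / (1 − p_chance). A minimal sketch with two hypothetical raters classifying ten observations:

```python
from collections import Counter

# Hypothetical: two raters classify 10 observations as "aggressive" (A) or "not" (N)
rater1 = ["A", "A", "N", "N", "A", "N", "N", "A", "N", "N"]
rater2 = ["A", "A", "N", "A", "A", "N", "N", "N", "N", "N"]

def cohens_kappa(r1, r2):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(r1)
    p_observed = sum(a == b for a, b in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    categories = set(r1) | set(r2)
    p_chance = sum((c1[c] / n) * (c2[c] / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

print(f"kappa = {cohens_kappa(rater1, rater2):.2f}")  # prints "kappa = 0.58"
```

Here the raters agree on 8 of 10 observations (p_observed = .80), but chance agreement is .52, so kappa = .28/.48 ≈ .58 — a much less flattering number than the raw 80% agreement.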
D. Reliability and Accuracy of Measures
Reliability is clearly important when researchers develop measures of behavior. But reliability is not the only characteristic of a measure or the only thing that researchers worry about. Reliability tells us about measurement error, but it does not tell us whether we have a good measure of the variable of interest.
II. Construct Validity of Measures
Construct validity refers to the adequacy of the operational definition of variables. To what extent does the operational definition of a variable actually reflect the true theoretical meaning of the variable? In terms of measurement, construct validity is a question of whether the measure that is employed actually measures the construct it is intended to measure.
A. Indicators of Construct Validity
1. Face Validity
When the measure appears to accurately assess the intended variable, this is called face validity.
2. Content Validity
Content validity is based on comparing the content of the measure with the universe of content that defines the construct.
3. Predictive Validity
A measure has predictive validity if research shows that scores on the measure do in fact predict the behavior or outcome it is intended to predict. Thus, with predictive validity, the criterion measure is a future behavior or outcome.
4. Concurrent Validity
Concurrent validity is demonstrated by research that examines the relationship between the measure and a criterion behavior at the same time (concurrently).
5. Convergent Validity
Convergent validity is the extent to which scores on the measure in question are related to scores on other measures of the same construct or similar constructs.
6. Discriminant Validity
When the measure is not related to variables with which it should not be related, discriminant validity is demonstrated.
B. Measurement Validity: For Whom
Construct validity is of critical importance to behavioral research. When conducting research to assess construct validity, a sample of individuals with particular characteristics will be recruited. Many measures have been validated with samples consisting primarily of White, college-educated people; that has begun to change as researchers become more sensitive to this issue, though perhaps not fast enough.
III. Reactivity of Measures
A potential problem when measuring behavior is reactivity. A measure is said to be reactive if awareness of being measured changes an individual's behavior. Measures of behavior vary in terms of their potential reactivity. There are also ways to minimize reactivity.
A book by Webb et al. (1981) has drawn attention to a number of measures that are called nonreactive or unobtrusive. Many such measures involve clever ways of indirectly recording a variable.
IV. Variables and Measurement Scales
A. Nominal Scales
Nominal scales have no numerical or quantitative properties. Instead, categories or groups simply differ from one another (sometimes nominal variables are called "categorical" variables). An obvious example is the variable of handedness.
In an experiment, the independent variable is often a nominal or categorical variable.
B. Ordinal Scales
Ordinal scales allow us to rank order the levels of the variable being studied. Instead of having categories that are simply different, as in a nominal scale, the categories can be ordered from first to last. Letter grades are a good example of an ordinal scale.
C. Interval and Ratio Scales
In an interval scale, the difference between the numbers on the scale is meaningful. Specifically, the intervals between the numbers are equal in size.
Ratio scales do have an absolute zero point that indicates the absence of the variable being measured. Examples include many physical measures such as length, weight, or time. Ratio scales are used in the behavioral sciences when variables that involve physical measures are being studied, particularly time measures such as reaction time, rate of responding, and duration of response.
D. The Importance of the Measurement Scales
When you read about the operational definitions of variables, you will recognize the levels of the variable in terms of these types of scales. The conclusions one draws about the meaning of a particular score on a variable depend on which type of scale was used.
Engaging With Research: Measurement Concepts
After reading the article, consider the following. NOTE: Student answers will vary.
1. What is the primary goal of this study: Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals? Can you think of any other explanation of their results?
2. What did these researchers do? What was the method?
3. What was measured?
a. The authors don't report any evidence for reliability or validity. How would you test their measure for reliability? How would you test it for validity?
b. Do you think that SET measures teaching quality at all? That is, even if we are able to reduce bias, do you think that we are still measuring good teaching with instruments like the one that Peterson et al. (2019) used?
Student answers will vary. They should accurately reflect knowledge of the reliability measures and validity tests discussed in the text and apply them correctly to the material.
4. To what or whom can we generalize the results?
a. Do you think that their findings would be the same if they studied SET ratings for courses in psychology? Or mathematics? Why or why not?
b. Do you think that their findings would be generalizable to all college courses, including courses that are offered online? Why or why not?
c. What do you think would happen if this intervention were used widely at your college? Do you think that bias would be decreased? Why or why not?
Student answers will vary. They should reflect viable explanations of the results.
5. What did they find? What were the results?
6. Have other researchers found similar results? Are the findings of this study similar to the findings of other studies on the same topic?
7. What are the limitations of this study?
8. What are the ethical issues present in this study?
Sample Answers for Review Questions
1. What is meant by the reliability of a measure? Distinguish between true score and measurement error.
Reliability refers to the consistency or stability of a measure. A true score is someone's real, "true" value on a given variable. The difference between a true score and a measured score is measurement error.
2. Describe the methods of determining the reliability of a measure.
The correlation coefficient most commonly referred to when discussing reliability is the Pearson product-moment correlation coefficient. The Pearson correlation coefficient (symbolized as r) can range from 0.00 to +1.00 and from 0.00 to −1.00. A correlation of 0.00 tells us that the two variables are not related at all. The closer a correlation is to 1.00, either +1.00 or −1.00, the stronger is the relationship. The positive and negative signs provide information about the direction of the relationship. When the correlation coefficient is positive (a plus sign), there is a positive linear relationship: high scores on one variable are associated with high scores on the second variable. A negative linear relationship is indicated by a minus sign: high scores on one variable are associated with low scores on the second variable. Test-retest reliability is assessed by measuring the same individuals at two points in time. Alternate forms reliability requires administering two different forms of the same test to the same individuals at two points in time. Internal consistency reliability is the assessment of reliability using responses at only one point in time. Split-half reliability is the correlation of the total score on one half of the test with the total score on the other half. Item-total correlations are the correlations between scores on individual items and the total score on all items of a measure.
3. Discuss the concept of construct validity. Distinguish among the indicators of construct validity.
Construct validity refers to the adequacy of the operational definition of variables. To what extent does the operational definition of a variable actually reflect the true theoretical meaning of the variable? When the measure appears to accurately assess the intended variable, this is called face validity. Content validity is based on comparing the content of the measure with the universe of content that defines the construct. Research that uses a measure to predict some future behavior is using predictive validity. Concurrent validity is demonstrated by research that examines the relationship between the measure and a criterion behavior at the same time (concurrently). Convergent validity is the extent to which scores on the measure in question are related to scores on other measures of the same construct or similar constructs. When the measure is not related to variables with which it should not be related, discriminant validity is demonstrated.
4. Why isn’t face validity sufficient to establish the validity of a measure?
Face validity is not very sophisticated; it involves only a judgment of whether, given the theoretical definition of the variable, the content of the measure appears to actually measure the variable. Face validity is not sufficient to conclude that a measure is in fact valid. Appearance is not a very good indicator of accuracy. In addition, many good measures of variables do not have obvious face validity.
5. What is a reactive measure?
A measure is said to be reactive if awareness of being measured changes an individual’s behavior.
6. Distinguish between nominal, ordinal, interval, and ratio scales.
Nominal scales have no numerical or quantitative properties. Instead, categories or groups simply differ from one another (sometimes nominal variables are called "categorical" variables). Ordinal scales allow us to rank order the levels of the variable being studied. Instead of having categories that are simply different, as in a nominal scale, the categories can be ordered from first to last. In an interval scale, the difference between the numbers on the scale is meaningful. Specifically, the intervals between the numbers are equal in size. Ratio scales do have an absolute zero point that indicates the absence of the variable being measured.
Sample Answers for Being a Skilled Consumer of Research
1. Take a personality test on the Internet (you can find such tests using Internet search engines). Based on the information provided, what can you conclude about the test’s reliability, construct validity, and reactivity?
A good answer should include the students’ experience of taking the test. After concluding the test, students should analyze their own behavior and performance on the test against the three parameters of reliability, construct validity, and reactivity.
2. The "five-factor" or "Big 5" model of personality describes five fundamental personality traits: Extraversion, Agreeableness, Conscientiousness, Emotional stability (Neuroticism), and Openness to experience. You can assess yourself on the Big 5 using this link: https://projects.fivethirtyeight.com/personality-quiz/. For this exercise, choose one of the big five factors and then (a) provide a definition, (b) describe how you might measure the personality trait, and (c) describe a method that you might use to assess construct validity.
A good answer should include not only a specific process for measuring the chosen personality trait but also an understanding of how to assess that measurement accurately. Methods of analyzing the validity of a construct should include a definition of the chosen method for measuring content validity as well as examples of validity indicators related to the personality trait.
3. Few concepts are more important to behavioral science (or to any science!) than measurement. Consider the following three broad categories of behavioral research and the ways in which measurement impacts what we understand. For each, consider: (1) How reliable would each of your strategies be? (2) How might the measurement strategy be biased (i.e., how valid would your strategy be)?
Emotional experience. Few things are more important to humans than our emotions. Of course, emotions are difficult to measure. For this exercise, think of two ways you could measure a specific emotion. For example, you could measure grief by interviewing people at a funeral (consider the ethical implications of this strategy before you implement it!).
Student answers will vary.
Parenting practice. Parenting practices can be difficult to directly measure because when you are a parent you are always parenting! (Likewise, when you are a child, you are always being parented.) Think of two ways you could measure a specific parenting practice. For example, you could ask teenagers about their parents' parenting practices related to curfew.
Student answers will vary.
Racism. A microaggression is an act of casually degrading any marginalized group (Sue et al., 2007). For example, asking a coworker who appears to be Asian or Latinx where they are from is a microaggression because it implies that they are "not from here." For this exercise, think of two ways you could measure occurrences of microaggressions in a workplace (e.g., you could conduct face-to-face interviews with employees and ask them if they have observed microaggressions).
Student answers will vary.
Laboratory Demonstration: Reliability
The concepts of interrater, test-retest, and split-half reliability can be clarified by having students participate in a concrete demonstration of the processes. For interrater reliability, begin by randomly dividing the students into two equal groups. Present the groups with some situation in which each student is to assign a rating. A correlation between the students' ratings in the two groups can then be examined. Some examples of situations in which ratings can be assigned are given below.
1. Show a short (5–10 minutes) cartoon and have the students count the number of aggressive acts that occur. (This example may also lead to a discussion of the importance of operational definitions in measurement.)
2. Have students in the groups estimate the length of a string, table, or blackboard in the classroom. Estimates can be given in inches, feet, or centimeters. Instructors may find greater variability of estimates depending on the unit of measurement selected.
3. Students’ responses to questions from an introductory psychology exam can be used to illustrate test-retest and split-half reliability. For the test-retest illustration, it is recommended that the retest be given at the next class meeting.
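The correlational checks in these demonstrations can be sketched in a few lines of Python. All ratings and item scores below are hypothetical, and `pearson_r` is a helper defined here for the demonstration, not part of any course materials.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Interrater reliability: counts of aggressive acts in the same cartoon
# as tallied by two groups of student raters (hypothetical data).
group1 = [12, 9, 15, 11, 8, 14]
group2 = [11, 10, 14, 12, 9, 13]
print(round(pearson_r(group1, group2), 2))  # a high r indicates good agreement

# Split-half reliability: correlate each student's total on the odd-numbered
# exam items with their total on the even-numbered items (hypothetical scores;
# 1 = correct, 0 = incorrect, six items per student).
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
]
odd = [sum(row[0::2]) for row in scores]
even = [sum(row[1::2]) for row in scores]
print(round(pearson_r(odd, even), 2))
```

The same `pearson_r` helper works for the test-retest illustration as well: correlate each student's score at the first administration with the score at the retest.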
Activity: Reliability and Validity
Have groups of students think of examples of measures that seem to have reliability and validity. Also, have the groups think of examples of measures that may have reliability but not validity. Exchange the examples between the groups and have them identify and critique the different examples.
Assign groups of students the task of developing a measure to assess other students' attitudes toward various issues (e.g., food services on campus, change in graduation requirements, a smoking ban on campus). Have the students indicate which scale of measurement they would employ and how they would assess the validity and reliability of the measure.
Activity: Identifying Measurement Scales
Have students generate a number of variables as you write them on the board. Have groups of students classify the variables into their appropriate measurement scale, indicating the reasons why they chose that particular scale. (If classifications differ among the groups, this exercise may also lead to a discussion of interrater reliability.)
Additional Discussion Topics
Discussion: Illustrating the Difference Between Reliability and Validity
A good example of the distinction between reliability and validity is a bathroom scale: if the scale is always off by 10 lbs, it is reliable (it gives the same reading consistently) but not valid (the reading is wrong). Now ask students whether they would throw out the scale or keep it. This illustrates the importance of both reliability and validity.
Discussion: Interrater Reliability
Instructors can begin the discussion by showing a YouTube clip of the Loftus car accident video and ask students to write down what they see in each clip. Then ask students to compare their answers and discuss why there are so many versions of seeing the same things. This should help illustrate for students how difficult it is to get a high interrater reliability.
Suggested Readings
Article in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Strube, M. J. (1991). Demonstrating the influence of sample size and reliability on study outcome. Teaching of Psychology, 18, 113–115.
Also recommended:
Buck, J. L. (1991). A demonstration of measurement error and reliability. Teaching of Psychology, 18, 46–47.
Camac, C. R., & Camac, M. K. (1993). A laboratory project in scale design: Teaching reliability and validity. Teaching of Psychology, 20, 102–104.
Chapter 6: Observational Methods
Learning Objectives
Compare and contrast quantitative and qualitative methods of studying behavior.
Describe naturalistic observation and discuss methodological issues, such as participation and concealment, and the limitations of the approach.
Describe systematic observation and discuss methodological issues, such as the use of equipment, reactivity, reliability, and sampling.
Summarize the features of a case study.
Define archival research and describe the different sources of archival data (statistical records, survey archives, and written records) and how they can be used to answer research questions.
Brief Chapter Outline
I. Quantitative and Qualitative Approaches
II. Naturalistic Observation
A. Description and Interpretation of Data
B. Participation and Concealment
C. Limits of Naturalistic Observation
III. Systematic Observation
A. Coding Systems
B. Methodological Issues
1. Equipment
2. Reactivity
3. Reliability
C. Sampling Behaviors and Experiences
IV. Case Studies
V. Archival Research
A. Statistical Records
B. Survey Archives
C. Written, Audio, and Video Records
Extended Chapter Outline
Please note that much of this information is quoted from the text.
I. Quantitative and Qualitative Approaches
Approaches to research based on observational methods can be broadly classified as primarily quantitative or qualitative. Quantitative research focuses on variables that can be quantified (e.g., counted).
Just as there are many types of quantitative approaches, there are many approaches to qualitative inquiry. Creswell and Poth (2018) described five broad approaches: Narrative Research, Phenomenological Research, Grounded Theory Research, Ethnographic Research, and Case Study Research. While a detailed description of these strategies falls outside of the scope of this book, it is essential to understand that qualitative research approaches can be used to give different and highly valuable perspectives to research questions. It is also possible to conduct mixed-methods research that uses both quantitative and qualitative methodologies to collect data.
II. Naturalistic Observation
In a study using naturalistic observation, the researcher makes observations of individuals in their natural environments (the field). This research approach has roots in anthropology and the study of animal behavior and is currently widely used in the social sciences to study many phenomena in all types of social and organizational settings.
A. Description and Interpretation of Data
The goal of naturalistic observation is to provide a complete and accurate picture of what occurred in the setting, rather than to test hypotheses formed prior to the study. To achieve this goal, the researcher must keep detailed field notes; that is, write or dictate on a regular basis (at least once each day) everything that has happened.
B. Participation and Concealment
A nonparticipant observer is an outsider who does not become an active part of the setting. In contrast, a participant observer assumes an active, insider role. By using participant observation, the researcher may be able to experience events in the same way as natural participants.
Concealed observation is less reactive than nonconcealed observation because people are not aware that their behaviors are being observed and recorded, but nonconcealed observation may be preferable from an ethical viewpoint.
C. Limits of Naturalistic Observation
Naturalistic observation is most useful when investigating complex social settings both to understand the settings and to develop theories based on the observations. It is less useful for
studying well-defined hypotheses under precisely specified conditions.
III. Systematic Observation
Systematic observation refers to the careful observation of one or more specific behaviors in a particular setting. This research approach is much less global than naturalistic observation research.
A. Coding Systems
Numerous behaviors can be studied using systematic observation. The researcher must decide which behaviors are of interest, choose a setting in which the behaviors can be observed, and most important, develop a coding system to measure the behaviors.
B. Methodological Issues
1. Equipment
It is becoming more common to use video and audio recording equipment to make direct observations because they provide a permanent record of the behavior observed that can be coded later.
2. Reactivity
One methodological issue is reactivity: the possibility that the presence of the observer will affect people's behaviors. Reactivity can be reduced by concealed observation.
3. Reliability
Reliability refers to the degree to which a measurement reflects a true score rather than measurement error. Reliable measures are stable, consistent, and precise. When conducting systematic observation, two or more raters are usually used to code behavior. Reliability is indicated by a high agreement among the raters.
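Agreement among raters who assign category codes is often quantified with Cohen's kappa, which corrects raw percent agreement for the agreement expected by chance. The observer codes below are hypothetical, and this helper is a minimal sketch rather than a course-provided tool.

```python
def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa: chance-corrected agreement between two raters' codes."""
    n = len(codes_a)
    # Observed proportion of trials on which the two raters agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    categories = set(codes_a) | set(codes_b)
    expected = sum(
        (codes_a.count(c) / n) * (codes_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical behavior codes ("T" = talking, "S" = silent) from two observers
rater1 = ["T", "T", "S", "T", "S", "S", "T", "S"]
rater2 = ["T", "T", "S", "S", "S", "S", "T", "T"]
print(round(cohens_kappa(rater1, rater2), 2))
```

Here the raters agree on 6 of 8 observations (75%), but because chance agreement is 50%, kappa is only 0.50, illustrating why chance-corrected indices are preferred over raw agreement.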
C. Sampling Behaviors and Experiences
For many research questions, samples of behavior taken over an extended period provide more accurate and useful data than single, short observations.
Technology now enables researchers to sample individuals' behaviors as people are living their lives in real time in everyday situations. The experience sampling method (ESM) is used to alert participants to complete a data collection procedure at that moment in time. The day reconstruction method (DRM) is another method of obtaining self-reports of daily activities, moods, and emotions.
IV. Case Studies
A case study is an observational method that provides a detailed description of an individual; or a group of people who constitute a family, a workgroup, a school, or a neighborhood; or even a situation, such as a business that failed or a school that succeeded.
A psychobiography is a type of case study in which a researcher applies psychological theory to explain the life of an individual, usually an important historical figure.
V. Archival Research
Archival research involves using previously compiled information to answer research questions. In an archival research project, the researcher does not actually collect the original data. Instead, he or she analyzes existing data such as statistics that are part of publicly accessible records.
A. Statistical Records
Statistical records are collected by many public and private organizations. The U.S. Census Bureau maintains the most extensive set of statistical records available, but state and local agencies also maintain such records.
B. Survey Archives
Survey archives consist of data from surveys that are stored digitally and available to researchers who wish to analyze them. Major polling organizations make many of their surveys available.
C. Written, Audio, and Video Records
Archival research can also be conducted using previously written, audio, and video records, including diaries and letters that historical societies have preserved, books, ethnographies of other cultures written by anthropologists, speeches by politicians, tweets, Instagram or Facebook posts, magazine articles, movies, podcasts, television programs, newspapers, and blog posts.
Content analysis is the systematic analysis of existing documents. Like systematic observation, content analysis requires researchers to devise coding systems that raters can use to quantify the information in the documents.
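A simple keyword-based coding system for content analysis can be sketched as follows. The categories, keyword lists, and sample text are hypothetical illustrations; real coding systems are developed and validated against the research question.

```python
# A minimal content-analysis sketch: count category-related keywords
# in an archival text record (categories and keywords are hypothetical).
coding_system = {
    "achievement": {"succeed", "win", "accomplish"},
    "affiliation": {"friend", "together", "family"},
}

def code_document(text, coding_system):
    """Return the number of keyword hits per category for one document."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return {cat: sum(w in keywords for w in words)
            for cat, keywords in coding_system.items()}

speech = "We win together, as one family, and we will succeed."
print(code_document(speech, coding_system))  # {'achievement': 2, 'affiliation': 2}
```

In practice, two or more raters would apply the same coding rules independently so that interrater reliability can be assessed, just as in systematic observation.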
Engaging With Research: Observational Methods
After reading the article, consider the following. NOTE: Student answers will vary.
1. What is the primary goal of this study: Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
a. The introduction to this study begins by stating a problem: the rates of childhood obesity. Why do the authors conduct a study of meal service and feeding practices to address this problem?
b. Do you think that this study included any confounding variables that may have impacted this study? Provide examples.
c. Does this study suffer from the problem involving the direction of causation? How so?
2. What did these researchers do? What was the method?
a. Is the basic approach in this study qualitative or quantitative?
b. Is this study an example of concealed or nonconcealed observation?
3. What was measured?
a. How did the researchers operationally define Mealtime Environment, Child Eating Behavior, and Parental Feeding Practices? What do you think about the quality of these operational definitions?
Student answers will vary. They should include each of the four terms and its definition, then give evidence from the article about how the researchers created their operational definitions. They should also include a personal opinion about the quality of each of the operational definitions.
b. Do you think that participants would be reactive to this data collection method?
Student answers will vary. They should express their own ideas, based on their understanding of the issue of reactivity, about the ways in which participants might be affected by data collection.
c. How reliable were the coders? How did the authors assess their reliability?
Student answers will vary. They should show how their conclusions about coder reliability are supported in the article. They should also include a description of the way the authors measured the coders’ reliability, citing evidence from the article.
4. To what or whom can we generalize the results? Do you think this study would generalize across cultures, age groups, or other demographic variables? Why or why not?
5. What did they find? What were the results?
6. Have other researchers found similar results? Do the results of this study line up with other studies on the same topic?
7. What are the limitations of this study?
8. What are the ethical issues present in this study?
Sample Answers for Review Questions
8. What are the differences between qualitative and quantitative approaches to studying behavior?
Qualitative research focuses on people behaving in natural settings and describing their world in their own words. Quantitative investigations often have large samples, and results are expressed in numerical terms using statistical descriptions. Quantitative researchers typically investigate research questions using experiments, surveys, structured interviews, and systematic observations.
9. What is naturalistic observation? How does a researcher collect data when conducting naturalistic observation research?
Naturalistic observation is a descriptive method in which observations are made in a natural social setting. In a study using naturalistic observation, the researcher makes observations of individuals in their natural environments (the field). The observations are made in natural settings and the researcher does not attempt to influence what occurs in the settings.
10. Why are the data in naturalistic observation research primarily qualitative?
The data are the descriptions of the observations themselves rather than quantitative statistical summaries.
11. Distinguish between participant and nonparticipant observation; between concealed and nonconcealed observation.
Participant observation allows the researcher to observe the setting from the inside, while a nonparticipant observer is an outsider who does not become an active part of the setting. Concealed observation is less reactive than nonconcealed observation because people are not aware that their behaviors are being observed and recorded.
12. What is systematic observation? Why are the data from systematic observation primarily quantitative?
Systematic observation is an observation of one or more specific variables, usually made in a precisely defined setting. Because a coding system is used to record specific behaviors, the data take the form of counts or ratings that can be summarized statistically; hence the data are primarily quantitative.
13. What is a coding system? What are some important considerations when developing a coding system?
A coding system is a set of rules used to categorize observations. The researcher must decide which behaviors are of interest, choose a setting in which the behaviors can be observed, and most important, develop a coding system to measure the behaviors.
14. What is a case study? When are case studies used? What is a psychobiography?
A case study is a descriptive account of the behavior, past history, and other relevant factors concerning a specific individual. Typically, a case study is done when an individual possesses a particularly rare, unusual, or noteworthy condition. A psychobiography is a type of case study in which the life of an individual is analyzed using psychological theory.
15. What is archival research? What are the major sources of archival data?
Archival research is the use of existing sources of information for research. Sources include statistical records, survey archives, and written records.
16. What is content analysis?
Content analysis is the systematic analysis of recorded communications.
Sample Answers for Being a Skilled Consumer of Research
Briefly describe your ideas for four studies on the following topics, using each of the four observational research strategies described in this chapter: (1) naturalistic observation, (2) systematic observation, (3) case study, and (4) archival research.
a. Taking an exam in college: exam speed and performance
b. Shopping for groceries: factors that influence healthy food choices
c. Discrimination in housing: finding a place to rent or buying a home
Students’ answers will vary.
17. Design a simple coding system that would be used in a systematic observation study that included video recordings:
a. Taking an exam in college: the class is recorded taking an exam
b. Shopping for groceries: video recording equipment is set up in the produce section of a grocery store
c. Discrimination in housing: video recording equipment is set up in the lobby of an apartment complex
Students’ answers will vary.
18. Describe how data would be collected using an experience sampling strategy in a study for each of the following topics:
a. Stress among college students over a semester
b. Alcohol use among college students over a semester
c. Roommate conflicts among college students over a semester
Students’ answers will vary.
19. The NORC General Social Survey website has a Data Exploration feature that allows you to examine GSS "Key Trends" over time. Go to https://gssdataexplorer.norc.org/trends. Select a topic area from the categories shown (Gender & Marriage, Current Affairs, Civil Liberties, Politics, Religion & Spirituality), for example, Life Satisfaction. Explore the data over time (by default, you are shown the percentage of respondents saying they are "very happy"). Describe any observed trend and how you might explain any change over time; you can also look at breakdowns by other variables such as health or marital status.
Students’ answers will vary.
Laboratory Demonstration: Systematic Observation
Give students experience with systematic observation using a simple coding system for easily identifiable behaviors. This can be done in the lab or by having students explore the campus in pairs at predetermined locations (e.g., library, science building, and the student union). Allow 30–45 minutes for data collection. Determine how many observations each pair needs to collect. When students return, analyze the data and discuss relevant and plausible interpretations. Here are some ideas for the kinds of observations students could make.
Eyeblink Rate
Is eyeblink rate associated with anxiety or level of attention? Instructors can test whether blinking increases in rate as the level of anxiety increases or whether blinking rate decreases as the level of attention increases. Using a time-sample approach, have students record the number of eyeblinks for a particular participant during a 30-second period. In lab, this could be done while students are reading a story about car and purse thefts on campus versus reading about a new campus art exhibit or some other nonthreatening campus activity. To examine attention, students can read material for details or general meaning. Other alternatives are possible if instructors want students to observe behavior in natural settings. If the campus has an eating establishment, students could record a participant’s blink rate when fifth in line to order and again when second in line to order. Presumably, anxiety increases as the time to order approaches. Another idea would be to observe
eyeblinks of males and females studying in the library as well as sitting in the student union. Or students could observe the blink rate of a female in a female–female dyad and then the blink rate of a female in a male–female dyad. Do the same with male participants.
Restroom Graffiti
Have students look for graffiti in various restrooms on campus (if instructors are quite dedicated, they could scout them out ahead of time to identify those restrooms that in fact display graffiti). First have students decide on a coding system. This could be very simple, such as sexual versus nonsexual, or it could be complex, such as political, heterosexual, gay or lesbian, references to the college or education, and "other." Students will then "sample" restrooms to visit. Finally, students should go in pairs to categorize the graffiti found in restrooms (this procedure can be explained in terms of reliability). Compare male and female restrooms to see if the type of graffiti is related to the gender of the graffiti author. Discuss whether the coding system was adequate.
Activity: Observational Methods in a Bar
If there is a drinking establishment on campus or near campus, students could replicate or expand upon the procedures used in the Geller, Russ, and Altomari (1986) study that were described in Chapter 2. This activity is useful for constructing coding systems and discussing descriptive statistics.
Activity: Survey Archives
The campus may have survey archives such as the General Social Survey on the mainframe computer. Also, more survey databases are available for analysis on microcomputers. Students can be shown how to access information in these databases and even test simple hypotheses. This is especially interesting if you are operating the computer and the students are selecting the variables to examine.
Additional Discussion Topics
Discussion: Deception of Subjects
As discussed in prior chapters, the deception of subjects is often necessary. Ask students how many of them were deceived when they participated in subject pool experiments on campus. Do they think deception was necessary?
Discussion: Coding Qualitative Data
This is a great time to discuss the issues involved in making qualitative data quantitative. Whereas Social Psych data may be largely based on surveys using Likert-type scales (meaning it is largely quantitative), areas such as Developmental Psychology tend to be more heavily qualitative. Describe to students the following: a group of developmental researchers is interested in when mental representations form so they give a toy phone to a group of 12-month-olds, 18-month-olds, and 24-month-olds. The 12-month-olds bang the phone and drool on it. The 18-month-olds bang the phone but occasionally put it to their heads. The 24-month-olds hold it to their ear and talk into
it. Ask students: Is this qualitative or quantitative? How could they make this quantitative?
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Goldstein, M. D., Hopkins, J. R., & Strube, M. J. (1994). "The eye of the beholder": A classroom demonstration of observer bias. Teaching of Psychology, 21, 154–157.
Herzog, H. A. (1988). Naturalistic observation of behavior: A model system using mice in a colony. Teaching of Psychology, 15, 200–202.
Krehbiel, D., & Lewis, P. T. An observational emphasis in undergraduate psychology laboratories. Teaching of Psychology, 21, 45–48.
Chapter 7: Asking People About Themselves:
Survey Research
Learning Objectives
Discuss the reasons for conducting survey research.
Identify factors to consider when writing questions for interviews and questionnaires: simplicity, double-barreled questions, loaded questions, and negative wording.
Describe different ways to construct questionnaire responses, including closed-ended questions, open-ended questions, and rating scales.
Compare the two ways to administer surveys: written questionnaires and interviews.
Distinguish between probability and nonprobability sampling techniques, including simple random sampling, stratified random sampling, and cluster sampling; convenience (or haphazard) sampling, purposive sampling, and quota sampling.
Describe how samples are evaluated for potential bias, including sampling frame and response rate.
Brief Chapter Outline
I. Why Conduct Surveys?
II. Constructing Questions to Ask
A. Defining the Research Objectives
1. Facts and Demographics
2. Behaviors
3. Attitudes and Beliefs
B. Question-Wording
1. Simplicity
2. Double-Barreled Questions
3. Loaded Questions
4. Negative Wording
5. “Yea-Saying” and “Nay-Saying”
III. Responses to Questions
A. Closed-Ended Versus Open-Ended Questions
B. Number of Response Alternatives
C. Rating Scales
1. Graphic Rating Scale
2. Semantic Differential Scale
3. Pictorial Scales
D. Labeling Response Alternatives
IV. Finalizing the Survey Instrument
A. Formatting
B. Sequence of Questions
C. Refining Questions: Pilot Testing the Survey
V. Administering Surveys
A. Questionnaires
1. Administration to Groups or Individuals
2. Mail Surveys
3. Online Surveys
B. Interviews
1. Face-to-Face Interviews
2. Telephone Interviews
3. Focus Group Interviews
VI. Survey Designs to Study Changes Over Time
VII. Sampling From a Population
A. Confidence Intervals
B. Sample Size
VIII. Sampling Techniques
A. Probability Sampling
1. Simple Random Sampling
2. Stratified Random Sampling
3. Cluster Sampling
B. Nonprobability Sampling
1. Convenience Sampling
2. Purposive and Snowball Sampling
3. Quota Sampling
IX. Evaluating Samples
A. Sampling Frame
B. Response Rate
X. Reasons for Using Convenience Samples
Extended Chapter Outline
Please note that much of this information is quoted from the text.
Survey research involves using questionnaires and interviews to ask people to provide information about themselves: their attitudes and beliefs, demographics (age, gender, income, marital status, etc.), and past or intended future behaviors.
I. Why Conduct Surveys?
Surveys are a way to collect data directly from research participants by asking them questions. Surveys have become extremely important as society demands data about people’s behavior and what people think about issues.
In research, many important variables, including attitudes, current emotional states, and self-reports of behaviors, are most easily studied using questionnaires or interviews. Survey research is often important as a complement to experimental research findings. An assumption that underlies the use of questionnaires and interviews is that people are willing and able to provide truthful and accurate answers. Researchers have addressed this issue by studying possible biases in the way people respond. A response set is a tendency to respond to all questions from a particular perspective rather than to provide answers that are directly related to the questions. Thus, response sets can affect the usefulness of data obtained from self-reports. The most common response set is called social desirability, or "faking good." A social desirability response set might lead a person to underreport undesirable behaviors (e.g., alcohol or drug use) and overreport positive behaviors (e.g., amount of exercise). However, it should not be assumed that people consistently misrepresent themselves.
II. Constructing Questions to Ask
A. Defining the Research Objectives
When constructing questions for a survey, the first thing the researcher must do is explicitly determine the research objectives: What is it that they want to know? The survey questions must be tied to the research questions that are being addressed. Generally, survey questions look for information in three major areas: (1) facts and demographics, (2) behaviors, and (3) attitudes and beliefs.
1. Facts and Demographics
Factual questions ask people to indicate things they know about themselves and their situation.
2. Behaviors
Other survey questions can focus on past behaviors or intended future behaviors.
3. Attitudes and Beliefs
Questions about attitudes and beliefs focus on the ways people evaluate and think about issues.
B. Question-Wording
A great deal of care is necessary to write the very best questions for a survey. Problematic aspects of how questions are phrased include (a) unfamiliar technical terms, (b) vague or imprecise terms, (c) ungrammatical sentence structure, (d) phrasing that overloads working memory, and (e) embedding the question with misleading information. The following items will help to avoid such problems.
1. Simplicity
The questions asked in a survey should be relatively simple and straightforward. People should be able to easily understand and respond to the questions. Avoid jargon and technical terms that people won’t understand.
2. Double-Barreled Questions
Avoid double-barreled questions that ask two things at once.
3. Loaded Questions
A loaded question is written to lead people to respond in one way. Questions that include emotionally charged words such as rape, waste, immoral, ungodly, or dangerous influence the way that people respond and thus lead to biased responses.
4. Negative Wording
Avoid phrasing questions with negatives.
5. “Yea-Saying” and “Nay-Saying”
When you ask several questions about a topic, a respondent may employ a response set to agree or disagree with all the questions. The tendency to agree consistently is referred to as yea-saying (also called an acquiescence response set). The tendency to disagree consistently is termed nay-saying. The problem here is that the respondent may in fact be expressing true agreement but, alternatively, may simply be agreeing with anything you say.
III. Responses to Questions
A. Closed-Ended Versus Open-Ended Questions
Questions may be either closed- or open-ended. With closed-ended questions, a limited number of response alternatives are given; with open-ended questions, respondents are free to answer in any way they like.
B. Number of Response Alternatives
With closed-ended questions, there is a fixed number of response alternatives.
G. Rating Scales
Rating scales ask people to provide "how much" judgments on any number of dimensions: amount of agreement, liking, or confidence, for example.
1. Graphic Rating Scale
A graphic rating scale requires a mark along a continuous 100-mm line that is anchored with descriptions at each end.
2. Semantic Differential Scale
The semantic differential scale is a measure of the meaning of concepts that was developed by Osgood and his associates. Respondents are asked to rate any concept (persons, objects, behaviors, ideas) on a series of bipolar adjectives using 7-point scales.
3. Pictorial Scales
When studying young children, adults with problems understanding verbal instructions, or even adults who are amused by emojis, researchers will often use response scales that use pictorial representations.
In addition to measures of attitudes and emotional states, pictorial measures of personality traits have been developed for children.
H. Labeling Response Alternatives
The examples thus far have labeled only the endpoints on the rating scale. Respondents decide the meaning of the response alternatives that are not labeled. But sometimes researchers need to provide labels to more clearly define the meaning of each alternative.
Labeling alternatives is particularly interesting when asking about the frequency of a behavior.
BB. Finalizing the Survey Instrument
Formatting
The questionnaire should appear attractive and professional. It should be neatly designed and free of spelling errors.
I. Sequence of Questions
It is best to ask the most interesting and important questions first to capture the attention of your respondents and motivate them to complete the survey. In addition, it is a good idea to group questions together when they address a similar theme or topic.
J. Refining Questions: Pilot Testing the Survey
Before actually administering the survey, it is a good idea to give the questions to a small group of people and have them think aloud while answering them.
CC. Administering Surveys
Questionnaires
With questionnaires, the questions are presented in written format and the respondents write their answers.
1. Administration to Groups or Individuals
Often researchers are able to distribute questionnaires to groups of individuals. This might be a college class, parents attending a school meeting, people attending a new employee orientation, or students waiting for an appointment with an advisor.
2. Mail Surveys
A mail survey can be mailed to individuals at a home or business address. This is a very inexpensive way of contacting the people who were selected for the sample. However, a drawback of the mail format is potentially low response rates.
3. Online Surveys
It is very easy to design a questionnaire for online administration using one of several online survey software services. Both open- and closed-ended questions can be included.
As researchers increasingly use online research strategies, it is important to consider ethical implications.
K. Interviews
One potential problem in interviews is called interviewer bias. This term describes all of the biases that can arise from the fact that the interviewer is a unique human being interacting with another human.
1. Face-to-Face Interviews
Face-to-face interviews require that the interviewer and the respondent meet to conduct the interview.
2. Telephone Interviews
Telephone interviews are less expensive than face-to-face interviews, and they allow efficient data collection because many respondents can be contacted quickly with no need for travel. With a computer-assisted telephone interview (CATI) system, the interviewer’s questions are prompted on the computer screen, and the data are entered directly into the computer for analysis.
3. Focus Group Interviews
A focus group is an interview with a group of about 6–10 individuals brought together
for a period of usually 2–3 hours. Virtually any topic can be explored in a focus group.
DD. Survey Designs to Study Changes Over Time
Surveys most frequently study people at one point in time. Often, however, researchers wish to make comparisons over time.
One way to study changes over time is to conduct a panel study in which the same people are surveyed at two or more points in time. In a two-wave panel study, people are surveyed at two points in time; in a three-wave panel study, three surveys are conducted; and so on.
EE. Sampling From a Population
Most research projects involve sampling participants from a population of interest. The population is composed of all individuals of interest to the researcher.
Confidence Intervals
Suppose you asked a sample of students whether they prefer to study at home or at school, and the survey results indicate that 61% prefer to study at home. A confidence interval tells you how close this sample result is likely to be to the true population value: for example, you might learn that the actual population value is probably between 58% and 64%. That is, you can have 95% confidence that the true population value lies within this interval around the obtained sample result.
The confidence interval gives you information about the likely amount of the error. The formal term for this error is sampling error, although people are probably more familiar with the term margin of error.
L. Sample Size
It is important to note that a larger sample size will reduce the size of the confidence interval. Although the size of the interval is determined by several factors, the most important is sample size. Larger samples are more likely to yield data that accurately reflect the true population value.
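The link between sample size and interval width can be sketched with the normal approximation for a proportion. This is a minimal illustration, not the text's own formula; the function name and sample sizes are hypothetical (with a sample of about 1,000, a 61% result yields roughly the 58%–64% interval discussed above):

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Approximate confidence interval for a sample proportion
    (normal approximation; z = 1.96 gives 95% confidence)."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 61% of respondents prefer studying at home:
low, high = proportion_ci(0.61, 1000)             # larger sample
low_small, high_small = proportion_ci(0.61, 100)  # smaller sample
# The smaller sample produces a wider interval (a larger margin of error).
```

Running both calls shows the interval for n = 100 is roughly three times wider than for n = 1,000, which is why larger samples yield estimates closer to the true population value.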
FF. Sampling Techniques
There are two broad categories of techniques for sampling individuals from a population:
probability sampling and nonprobability sampling. With probability sampling, each member of the population has a specifiable probability of being chosen. With nonprobability sampling, the probability of any particular member of the population being chosen is unknown.
Probability Sampling
1. Simple Random Sampling
With simple random sampling, every member of the population has an equal probability of being selected for the sample. When conducting telephone interviews, researchers commonly have a computer generate phone numbers used in the area of the sample. This will produce a random sample of the population because most people have phones.
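A simple random sample can be demonstrated in a few lines; the population list here is hypothetical:

```python
import random

# Hypothetical population: every member has an equal chance of selection.
population = [f"student_{i}" for i in range(500)]

# Draw 50 members without replacement; each has probability 50/500 = 0.10.
sample = random.sample(population, 50)
```

Because `random.sample` draws without replacement, no individual can appear in the sample twice.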
2. Stratified Random Sampling
A somewhat more complicated procedure is stratified random sampling. The population is divided into subgroups (also known as strata), and random sampling techniques are then used to select sample members from each stratum.
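A stratified random sample might be sketched as follows; the strata and sampling fraction are hypothetical:

```python
import random

# Hypothetical strata: random-sample the same fraction from each subgroup
# so the sample mirrors the population's composition.
strata = {
    "first_year": [f"fy_{i}" for i in range(300)],
    "senior": [f"sr_{i}" for i in range(100)],
}
fraction = 0.10  # take 10% of each stratum

sample = []
for name, members in strata.items():
    k = round(len(members) * fraction)
    sample.extend(random.sample(members, k))
# Result: 30 first-year students and 10 seniors, matching the 3:1
# composition of the population.
```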
3. Cluster Sampling
What if you want to study a population that has no list of members, such as people who work in county health care agencies? In such situations, a technique called cluster sampling can be used to create a probability sample. Rather than randomly sampling from a list of individuals, the researcher can identify "clusters" of individuals and then sample from these clusters.
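A cluster sample can be sketched like this; the agency names and members are hypothetical:

```python
import random

# Hypothetical clusters (e.g., county health agencies). No master list of
# individuals exists, so we randomly sample whole clusters, then include
# everyone in the chosen clusters.
clusters = {
    "agency_A": ["a1", "a2", "a3"],
    "agency_B": ["b1", "b2"],
    "agency_C": ["c1", "c2", "c3", "c4"],
    "agency_D": ["d1", "d2"],
}

chosen = random.sample(list(clusters), 2)  # randomly pick 2 of the 4 clusters
sample = [person for c in chosen for person in clusters[c]]
```

Note that randomness enters at the cluster level, not the individual level, which is what makes this a probability sample even without a list of individuals.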
M. Nonprobability Sampling
In contrast to probability sampling, where the probability of selecting each member of the population is knowable, in nonprobability sampling the probability of any particular member being selected is not known. Nonprobability sampling techniques can be quite arbitrary. However, nonprobability samples are cheap and convenient.
1. Convenience Sampling
One form of nonprobability sampling is convenience sampling. Convenience sampling could be called a "take-them-where-you-find-them" method of obtaining participants. Results generated using this kind of sample might not generalize to your intended population but instead might describe only the biased sample you obtained.
2. Purposive and Snowball Sampling
Purposive sampling is a nonprobability sampling procedure in which the researcher makes a judgment regarding selection of an individual for the sample. The purpose is to obtain a sample of people who meet some predetermined criterion.
Snowball sampling is a nonprobability sampling procedure in which one or more current research participants recruit others to become part of the sample. This method relies on the participants to identify others who possess attributes needed for the sample.
3. Quota Sampling
Another form of nonprobability sampling is quota sampling. A researcher who uses this technique chooses a sample that reflects the numerical composition of various subgroups in the population. Thus, quota sampling is similar to the stratified sampling procedure previously described; however, random sampling does not occur when you use quota sampling.
GG. Evaluating Samples
Samples should be representative of the population from which they are drawn. A completely unbiased sample is one that is highly representative of the population.
Sampling Frame
The sampling frame is the actual population of individuals (or clusters) from which a random sample will be drawn. Rarely will this perfectly coincide with the population of interest; some biases will be introduced.
N. Response Rate
The response rate in a survey is simply the percentage of people who were selected in the sample who actually completed the survey. Thus, if you mail 1,000 questionnaires to a random sample of adults in your community and 500 are completed and returned, the response rate is 50%.
HH. Reasons for Using Convenience Samples
Much of the research in psychology uses nonprobability sampling techniques to obtain participants for either surveys or experiments. The advantage of these techniques is that the investigator can obtain research participants without spending a great deal of money or time on selecting or collecting data from the sample. For example, it is common practice to select participants from students in introductory psychology classes. Why aren't researchers more worried about obtaining random samples from the "general population" for their research? Most psychological research focuses on studying relationships between variables, and these relationships can be examined even when the sample is not randomly drawn from the general population.
Engaging With Research: Survey Research
After reading the article, consider the following. NOTE: Student answers will vary.
1. What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
2. What did these researchers do? What was the method? How did Son et al. (2020) sample students? What sampling strategies did they use?
3. What was measured? These researchers set out to understand the impact of the pandemic on mental health. How did they measure mental health?
4. To what or whom can we generalize the results?
a. Where was this study conducted? Do you think that the results of this study would be generalizable to other states? Why or why not? Do you think that they would be generalizable to other countries? Why or why not?
b. See Table 1: Do you think that this distribution of survey participants by major is representative of all students? Why or why not?
c. Do you think that the results of this study would be generalizable to noncollege populations? Why or why not?
d. Son et al. (2020) did not report the ethnic/racial makeup of their sample, nor gender identity beyond male/female. What are the implications of this lack of data?
5. What did they find? What were the results?
6. Have other researchers found similar results?
7. What are the limitations of this study?
8. What are the ethical issues present in this study?
Sample Answers for Review Questions
1. What is a survey? Describe some research questions you might address with a survey.
Surveys are a form of research that employs questionnaires and interviews to ask people to provide information about themselves. They are a way to collect data directly from research participants. Some research questions you might address in a survey include questions about attitudes and beliefs, demographics, and past or intended future behaviors.
2. What are some factors to take into consideration when constructing questions for surveys (including both questions and response alternatives)?
The following items are important to consider when you are writing questions: simplicity, double-barreled questions, loaded questions, negative wording, and "yea-saying" and "nay-saying." For response alternatives, consider closed- versus open-ended questions, the number of response alternatives, and the different types of rating scales.
3. What are the advantages and disadvantages of using questionnaires versus interviews in a survey?
Answers will vary; students should support their answers with evidence from the chapter.
4. Compare the different questionnaire, interview, and Internet survey administration methods.
Answers will vary, but should include discussion of the administration of questionnaires to
groups or individuals, mail surveys, online surveys, face-to-face interviews, telephone interviews, focus group interviews, and the like.
5. Define interviewer bias.
Interviewer bias is intentional or unintentional influence exerted by an interviewer in such a way that the actual or interpreted behavior of respondents is consistent with the interviewer’s expectations.
6. What is a social desirability response set?
A social desirability response set is a response set in which respondents answer questions to present themselves favorably.
7. How does sample size affect the interpretation of survey results?
Answers will vary and should be supported with evidence from the text.
8. Distinguish between probability and nonprobability sampling techniques. What are the implications of each?
Probability sampling is when one is able to specify the probability that any member of the population will be included in the sample, whereas nonprobability sampling is when you can’t specify the probability that any member of the population will be included in the sample. For the implications, answers will vary.
9. Distinguish between simple random, stratified random, and cluster sampling.
Simple random sampling is when each member of the population has an equal probability of being included in the sample. Stratified random sampling is a probability sampling method in which a population is divided into subpopulation groups, called strata, and then individuals are randomly sampled from each stratum. Cluster sampling is a probability sampling method in which existing groups or geographic areas, called clusters, are identified, randomly sampled, and then everyone in the selected clusters participates in the study.
10. Distinguish between convenience (haphazard) sampling and quota sampling.
Convenience sampling is when researchers select subjects because they are easy to obtain, usually on the basis of availability, and not with regard to having a representative sample of the population. Quota sampling is a sampling procedure in which the sample is chosen to reflect the numerical composition of the various subgroups in the population.
11. Why don’t researchers who want to test hypotheses about the relationships between variables worry a great deal about random sampling?
Answers will vary; a good answer will show an understanding of random sampling as it relates to testing hypotheses.
Sample Answers for Being a Skilled Consumer of Research
1. As we noted at the beginning of the chapter, surveys are being conducted all the time. Many survey reports are not published in peer-reviewed journals. Identify a survey report of interest to you and answer the questions below. Survey reports can be found on the Internet. Here are some examples: Youth Risk Behavior Surveillance System: https://www.cdc.gov/healthyyouth/data/yrbs/index.htm; Pew Religious Landscape Study: https://www.pewforum.org/religious-landscape-study/; National Crime Victimization Survey (NCVS): https://bjs.ojp.gov/data-collection/ncvs; Behavioral Risk Factor Surveillance System: http://cdc.gov/brfss/
a. What kinds of questions were included in the survey? Identify examples of each.
b. How were the questions developed?
c. How and when was the survey administered?
d. What was the nature of the sampling strategy? What was the final sample size?
e. What was the response rate for the survey?
f. What was the confidence interval for the survey findings?
g. Describe at least one survey finding that you found particularly interesting or surprising.
Student’s answers will vary depending on the different survey reports that each of them chooses.
2. Suppose you want to study the relationships between ratings of family satisfaction, job satisfaction, and life satisfaction (happiness). Describe how you might conduct an online survey of adults to obtain your data. What would be the reason to conduct a two-wave panel study rather than a one-wave only procedure?
Students’ answers will vary depending on the method chosen for conducting an online survey of adults. A good answer will demonstrate understanding of the definition of and differences between one-wave and two-wave panel studies. A good answer will also give specific evidence for the advantages to choosing a two-wave panel study in this instance.
3. Graesser et al. (2006) developed an application called QUAID (Question Understanding Aid) that analyzes question-wording. Write three survey questions you might ask on a topic of interest to you and go to the QUAID website (https://quaid.cohmetrix.com/) for feedback. What did you find?
Students’ answers will vary.
4. Corbie-Smith et al. (1999) conducted a focus group with 33 Black adults at an urban public hospital to better understand barriers to the participation of Black people in medical research. They found that the participants in this study were distrustful of the medical community, which was a prominent barrier to their participation in research. Cain et al. (2016) conducted a survey of 304 African American participants from the Washington, D.C., metropolitan area on a similar topic. Compare and contrast the findings. Identify the strengths and weaknesses of each research approach.
Students’ answers will vary.
5. Professional polling operations that conduct polls to describe political opinions of a population use the survey research methods described in this chapter. Political polls are often reported by news media; however, people seldom explore where the data come from. Given what you know now about survey research, educate yourself on a few different political polling operations and ask yourself what effect those methods might have on the answers people give or the nature of the samples that polling operations have access to. Here are a few examples: https://www.surveyusa.net/methodology/; https://www.gallup.com/224855/gallup-poll-work.aspx; https://poll.qu.edu/methodology/.
Students’ answers will vary.
Laboratory Demonstration: Writing Survey Questions
Students gain an appreciation for the advantages and disadvantages of writing and analyzing closed- and open-ended questions when they experience the difficulties in this process firsthand. Divide the class into two equal groups. Assign Group 1 the task of writing five open-ended questions that focus on a particular topic (suggestions for topics are listed below). Assign Group 2 the task of writing five closed-ended questions. Unobtrusively note how long it takes each group to complete the task. Have each group write their questions on the board. Next, have each student in Group 1 answer the questions written by Group 2; have each student from Group 2 answer the questions written by Group 1. After this is accomplished, have each group analyze and/or summarize the data generated from their questions. Again, unobtrusively note how long it takes each group to accomplish this. End with a discussion comparing the relative difficulties of question writing and data summary using the two types of questions. The students themselves will probably notice the differences between the groups in how long it took to write the questions and analyze the data, but if they don't, instructors can report to them the time differences that they observed.
Possible Topics
1. Is there a relationship between family support and success in college?
2. What factors are important influences on a person’s choice of movie? (Possible factors might include movie content, movie type, and actor in leading role.)
3. In a heterosexual couple, does the male or female usually determine the events of an evening out (i.e., what they do, where they eat, what movie they see)? Are there any factors that influence who decides?
Activity: Visit to a Survey Research Center
The campus may have a survey research center that conducts surveys for the college or university, government agencies, or nonprofit organizations. Many newspapers have their own survey
research centers as well. Individual students or groups of students can schedule a visit to such centers to observe the activities of the center. An interview schedule for students should be planned and should include such questions as:
1. What is a recent or ongoing survey topic?
2. What methods do you use for sampling?
3. How were the questions constructed?
4. What will be done with the results?
Activity: Surveys in the Media and on the Internet
Have students find examples of surveys conducted in the media, for example in newspapers or magazines, and on the Internet. The students can then be asked to critically evaluate the quality of the survey and its conclusions. This activity may lead to discussions on topics such as questionnaire construction and issues regarding the sample of individuals who participated in the survey.
Additional Discussion Topics
Discussion: Convenience Sampling
A good discussion can be had about the sampling validity of research that relies predominantly on college students. Students are often required to participate in research, either as part of a subject pool or for extra credit. Ask students how many of them have participated in a Gallup or Nielsen poll. How many would if asked? Even though it doesn't take much time or energy, many will say that they wouldn't. How does that affect the data?
Discussion: Cell Phones and National Polling
A great discussion can involve how cell phones have changed national sampling polls. Ask students how many of them have a home phone. How many of their parents still have a home phone? If companies like the Gallup poll call home phones and an entire age stratum (let’s say 35 and under) does not have home phones, are the data still generalizable? How does this lack of information on such a large group of the population affect data that are collected?
Suggested Readings
For an introduction to Internet survey research, refer to: Birnbaum, M. (2001). Introduction to behavioral research on the internet. Prentice Hall.
An article in the Handbook for Teaching Statistics and Research Methods (2nd ed.): Strube, M. J. (1991). Demonstrating the influence of sample size and reliability on study outcome. Teaching of Psychology, 18, 113–115.
Also recommended: Connor-Greene, P. A. (1993). From the laboratory to the headlines: Teaching critical evaluation of press reports of research. Teaching of Psychology, 20, 167–169.
An interesting website containing information on a variety of surveys and survey research is hosted by the Gallup Organization: http://www.gallup.com
A research article examining considerations for research using the Internet is: Birnbaum, M. H. (2004). Human research and data collection via the internet. Annual Review of Psychology, 55, 803–832.
Chapter 8: Experimental Design
Learning Objectives
Define what a confounding variable is and describe how confounding variables are related to internal validity.
Describe the posttest-only design and the pretest-posttest design, including the advantages and disadvantages of each design.
Compare and contrast an independent groups (between-subjects) design with a repeated measures (within-subjects) design.
Summarize the advantages and disadvantages of using a repeated measures design.
Explain how counterbalancing provides a way of addressing the order effects problem.
Describe a matched pairs design, including reasons to use this design.
Brief Chapter Outline
VI. Confounding Variables and Internal Validity
VII. Basic Experiments
A. Posttest-Only Design
B. Pretest-Posttest Design
C. Comparing Posttest-Only and Pretest-Posttest Designs
VIII. Assigning Participants to Experimental Conditions
A. Independent Groups Design
B. Repeated Measures Design
a. Advantages and Disadvantages of Repeated Measures Design
b. Counterbalancing
i. Complete Counterbalancing
ii. Latin Squares
c. Time Interval Between Treatments
d. Choosing Between Independent Groups and Repeated Measures Designs
C. Matched Pairs Design
Extended Chapter Outline
Please note that much of this information is quoted from the text.
Confounding Variables and Internal Validity
A confounding variable is a variable that varies along with the independent variable. Confounding occurs when the effects of the independent variable and an uncontrolled variable are intertwined so that you cannot determine which of the variables is responsible for the observed effect on the dependent variable.
Good experimental design requires eliminating all possible confounding variables that could result in alternative explanations. When the results of an experiment can confidently be attributed to the effect of the independent variable, the experiment is said to have internal validity.
IX. Basic Experiments
The simplest possible experimental design has two variables: the independent variable and the dependent variable. The independent variable has a minimum of two levels: for example, an experimental group and a control group. Researchers must make every effort to ensure that the only difference between the two groups is the manipulated (independent) variable.
A. Posttest-Only Design
A researcher using a posttest-only design must (1) obtain two equivalent groups of participants, (2) manipulate the independent variable, and (3) measure the effect of the independent variable on the dependent variable.
The first step is to choose the participants and assign them to the two groups. The procedures used must achieve equivalent groups to eliminate any potential selection differences: The people selected to be in the conditions cannot differ in any systematic way.
Next, the researcher must choose two levels of the independent variable, such as an experimental group that receives a treatment and a control group that does not. Finally, the effect of the independent variable is measured.
B. Pretest-Posttest Design
The only difference between the posttest-only design and the pretest-posttest design is that in the latter a pretest is given before the experimental manipulation is introduced.
C. Comparing Posttest-Only and Pretest-Posttest Designs
A pretest is useful whenever there is a possibility that participants will drop out of the experiment; this is most likely to occur in a study that lasts over a long time period. The dropout factor in experiments is called attrition or mortality.
One disadvantage of a pretest, however, is that it may be time-consuming and awkward to administer in the context of the particular experimental procedures being used. Perhaps most important, a pretest can sensitize participants to what one is studying, enabling them to figure out what is being studied and (potentially) why. If awareness of the pretest is a problem, the pretest can be disguised.
It is also possible to assess the impact of the pretest directly with a combination of both the posttest-only and the pretest-posttest design. In this design, half the participants receive only the posttest and the other half receive both the pretest and the posttest. This is formally called a Solomon four-group design.
X. Assigning Participants to Experimental Conditions
In one procedure, participants are randomly assigned to the various conditions so that each participates in only one group. This is called an independent groups design. It is also commonly known as a between-subjects design because comparisons are made between different groups of participants. In the other procedure, participants take part in all conditions. In an experiment with two conditions, for example, each participant is assigned to both levels of the independent variable. This is called a repeated measures design because each participant is measured after receiving each level of the independent variable. This is also called a within-subjects design or a within-person design; in this design, comparisons are made within the same group of participants (subjects, persons).
A. Independent Groups Design
In an independent groups design, different participants are assigned to each of the conditions using random assignment. This means that the decision to assign an individual to a particular condition is completely random and beyond the control of the researcher.
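Random assignment to two conditions might be sketched as follows; the participant labels and group sizes are hypothetical:

```python
import random

# Hypothetical participant pool, randomly assigned to two conditions.
participants = [f"p{i}" for i in range(20)]
random.shuffle(participants)        # chance alone determines the ordering

experimental = participants[:10]    # first half -> experimental group
control = participants[10:]         # second half -> control group
# Each person ends up in exactly one condition.
```

Because the assignment is determined entirely by the shuffle, any participant variable (age, ability, personality) is equally likely to land in either group, which is what makes the groups equivalent on average.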
B. Repeated Measures Design
In a repeated measures design, participants are repeatedly measured on the dependent variable after being in each condition of the experiment.
1. Advantages and Disadvantages of Repeated Measures Design
The repeated measures design has several advantages. An obvious one is that fewer research participants are needed because each individual participates in all conditions. An additional advantage of repeated measures designs is that they are extremely sensitive to finding statistically significant differences between groups.
The major problem with a repeated measures design stems from the fact that the different conditions must be presented in a particular sequence. Suppose that there is greater recall in the high-meaningful condition. Although this result could be caused by the manipulation of the meaningfulness variable, the result could also simply be an order effect: the order of presenting the treatments affects the dependent variable. Performance on the second task might improve merely because of the practice gained on the first task. This improvement is called a practice effect, or learning effect. It is also possible that a fatigue effect could result in a deterioration in performance from the first to the second condition as the research participant becomes tired, bored, or distracted. Finally, the effect of the first treatment can carry over to influence the response to the second treatment; this is known as a carryover effect.
There are two approaches to dealing with order effects: counterbalancing and altering the time interval between treatments.
2. Counterbalancing
a. Complete Counterbalancing
In a repeated measures design, it is very important to counterbalance the order of the conditions. With complete counterbalancing, all possible orders of presentation are included in the experiment.
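The full set of orders can be generated in a couple of lines; the condition names here are hypothetical:

```python
from itertools import permutations

# With complete counterbalancing, every possible order of the conditions
# is used in the experiment.
conditions = ["low", "medium", "high"]   # hypothetical treatment levels
orders = list(permutations(conditions))
# 3 conditions yield 3! = 6 orders; participants are divided equally
# among them.
```

Note how quickly this grows: 4 conditions require 24 orders and 5 require 120, which is why complete counterbalancing becomes impractical with more than a few conditions.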
b. Latin Squares
A technique to control for order effects without having all possible orders is to construct a Latin square: a limited set of orders constructed to ensure that (1) each condition appears at each ordinal position and (2) each condition precedes and follows each condition one time.
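A balanced Latin square of this kind can be built with a standard construction that works for an even number of conditions; the function name is hypothetical and this is a sketch, not the text's own procedure:

```python
def balanced_latin_square(n):
    """Balanced Latin square for an even number n of conditions (0..n-1):
    each condition appears once at each ordinal position, and each
    condition immediately precedes every other condition exactly once."""
    # Standard first row: 0, 1, n-1, 2, n-2, ...
    first = [0]
    left, right = 1, n - 1
    while len(first) < n:
        first.append(left)
        left += 1
        if len(first) < n:
            first.append(right)
            right -= 1
    # Each subsequent row shifts every condition up by 1 (mod n).
    return [[(c + i) % n for c in first] for i in range(n)]

square = balanced_latin_square(4)  # 4 orders instead of 4! = 24
```

For 4 conditions this yields only 4 orders rather than the 24 required by complete counterbalancing, while still satisfying both properties above.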
3. Time Interval Between Treatments
In addition to counterbalancing the order of treatments, researchers need to carefully determine the time interval between presentation of treatments and possible activities between them.
4. Choosing Between Independent Groups and Repeated Measures Designs
Repeated measures designs have two major advantages over independent groups designs: (1) a reduction in the number of participants required to complete the experiment and (2) greater control over participant differences and thus greater ability to detect an effect of the independent variable.
C. Matched Pairs Design
A somewhat more complicated method of assigning participants to conditions in an experiment is called a matched pairs design. Instead of simply randomly assigning participants to groups, the goal is to first match people on a participant variable such as age or personality trait. The matching variable will be either the dependent measure or a variable that is strongly related to the dependent variable.
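The matching logic can be sketched as follows: rank participants on the matching variable, pair adjacent ranks, then randomly assign one member of each pair to each condition. The pretest scores below are hypothetical:

```python
import random

# Hypothetical pretest scores on the matching variable.
scores = {"p1": 85, "p2": 62, "p3": 90, "p4": 58, "p5": 74, "p6": 71}

# Rank participants from highest to lowest score and pair adjacent ranks.
ranked = sorted(scores, key=scores.get, reverse=True)
experimental, control = [], []
for i in range(0, len(ranked), 2):
    pair = [ranked[i], ranked[i + 1]]
    random.shuffle(pair)             # coin flip within each matched pair
    experimental.append(pair[0])
    control.append(pair[1])
```

The result is two groups that are closely matched on the pretest variable, while random assignment within each pair still rules out systematic selection differences.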
Engaging With Research: Experimental Design
After reading the article, answer the following questions (which will be familiar to you from earlier in this chapter!). NOTE: Student answers will vary.
20. What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
21. What did these researchers do? What was the method?
a. What was the independent variable? How many levels did the independent variable have? How were research participants assigned to groups?
Was this study an independent groups design or a repeated measures design?
22. What was measured? How was visuospatial memory measured?
23. To what or whom can we generalize the results?
a. Where did the participants from this study come from?
b. Did the authors include information about the demographics of their sample?
24. What did they find? What were the results?
25. Have other researchers found similar results?
26. What are the limitations of this study?
27. What are the ethical issues present in this study?
Sample Answers for Review Questions
What is a confounding variable?
A confounding variable is a variable that is not controlled in a research investigation. It is a variable that varies along with the independent variable. Confounding occurs when the effects of the independent variable and an uncontrolled variable are intertwined, so you cannot determine which of the variables is responsible for the observed effect on the dependent variable.
28. What is meant by the internal validity of an experiment?
When the results of an experiment can confidently be attributed to the effect of the independent variable, the experiment is said to have internal validity.
29. How do the two true experimental designs eliminate the problem of selection differences? Answers will vary, but will demonstrate an understanding of true experimental designs and their relationship to selection differences.
30. Distinguish between the posttest-only design and the pretest-posttest design. What are the advantages and disadvantages of each?
A posttest-only design is a true experimental design in which the dependent variable (posttest) is measured only once, after manipulation of the independent variable. A pretest-posttest design is a true experimental design in which the dependent variable is measured both before (pretest) and after (posttest) manipulation of the independent variable. For the second question, answers will vary.
31. What is a repeated measures design? What are the advantages of using a repeated measures design? What are the disadvantages?
A repeated measures design is an experiment in which the same participants take part in all conditions. Answers will vary regarding advantages and disadvantages.
32. What are ways of dealing with the problems of a repeated measures design, including counterbalancing?
Answers will vary, but should demonstrate an understanding of counterbalancing, a method of controlling for order effects in a repeated measures design by either including all orders of treatment presentation or randomly determining the order for each subject.
33. When would a researcher decide to use the matched pairs design? What would be the advantages of this design?
Answers will vary, but should demonstrate an understanding of matched pairs design, a method of assigning subjects to groups in which pairs of subjects are first matched on some characteristic and then individually assigned randomly to groups.
34. The procedure used to obtain your sample (i.e., random or nonrandom sampling) is not the same as the procedure for assigning participants to conditions; distinguish between random sampling and random assignment.
Answers will vary, but should demonstrate a clear understanding of random sampling and random assignment and the differences between the two.
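One way to make this distinction concrete for students is a short simulation (the population size and labels are arbitrary): random sampling decides who enters the study, while random assignment decides which condition each sampled person receives.

```python
import random

rng = random.Random(0)

population = [f"person_{i}" for i in range(1000)]

# Random SAMPLING: draw participants from the population.
# This bears on external validity (to whom the results generalize).
sample = rng.sample(population, 20)

# Random ASSIGNMENT: split the sample into conditions.
# This bears on internal validity (equating groups on participant variables).
shuffled = list(sample)
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]
```

A study can use either, both, or neither: most experiments use random assignment with a nonrandom (convenience) sample.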
Sample Answers for Being a Skilled Consumer of Research
Professor Foley conducted a cola taste test. Each participant in the experiment first tasted 2 ounces of Coca-Cola, then 2 ounces of Pepsi, and finally 2 ounces of Sam’s Choice Cola. A rating of the cola’s flavor was made after each taste. What are the potential problems with this experimental design and the procedures used? Revise the design and procedures to address these problems. Consider several alternatives and think about the advantages and disadvantages of each.
A good answer will address the possibility of an order effect and a carryover effect as potential problems with this design. The rating of each cola's flavor may be influenced by the taste of the previous cola; thus, the differences in ratings may be the result of a contrast effect. Professor Foley should include counterbalancing techniques and lengthen the interval of time between taste tests in order to minimize the influence of order effects. Another suggestion is to use a Latin square design, which controls for order effects without having to use all possible orders of presentation.
35. Dr. Kim wanted to study the impact of different volumes of music on dart-throwing performance. She hypothesized that distraction would affect dart-throwing skills. She decided to design a repeated measures experiment with three conditions: no music, soft music, and loud music. Answer these two questions:
Why would she choose a repeated measures design?
Her participants threw darts in a room with the music volume set to none, soft, or loud. All participants started in the no-music room, then proceeded to the soft-music room, and finally to the loud-music room. How could her study results be affected by these three order effects: practice, fatigue, and carryover? How could Dr. Kim address these problems?
Students’ answers will vary.
Laboratory Demonstration: Hemispheric Specialization
Students can gain a concrete understanding of the difference between an independent groups design and a repeated measures design by serving as participants in related experiments that use each approach. The two experiments outlined below are based on the concept of brain lateralization. The organization of the brain is such that certain activities are primarily the function of the left hemisphere, while others are primarily the function of the right (see Springer and Deutsch, 1985). For example, such cognitive tasks as speaking, reading, logical analysis, or doing high-level math calculations take place in the left hemisphere for most right-handed individuals. In contrast, such tasks as musical performance, face recognition, or spatial relations take place in the right hemisphere. Those who are right-handed tend to show more complete lateralization of function than those who are left-handed; the brains of left-handed individuals are often organized such that language is a function shared by both hemispheres (Gloning, Gloning, Haub, & Quatember, 1969).
Because of the inconsistent lateralization found in left-handed individuals, we recommend that only right-handed students serve as participants for this demonstration. Pair each student participant with a student experimenter. The experimenter/participant pairs should be as separate from each other as possible. Randomly assign participants to two groups. One group completes a tapping task while saying aloud the Pledge of Allegiance (Tap + Talk condition). The second group completes the tapping task alone (Tap Alone condition). The tapping task is the same for both groups: With eyes averted, each participant taps on a piece of plain paper with a felt tip pen as rapidly as possible for 15 seconds. In addition, each participant does this twice: once with the right hand and once with the left hand. The order should be counterbalanced so that half the
participants in each group use their right hands first, while the remaining participants use their left hands first. Count the number of dots each participant made during the 15-second period. Create a single score for each participant by subtracting the number of dots made by the left hand from the number of dots made by the right hand. The group means can be compared using an independent groups t test.
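The analysis for this between-groups version of the demonstration can be sketched as follows. The difference scores are hypothetical, and the pooled-variance t statistic is computed from scratch so no statistics package is required.

```python
from statistics import mean, stdev

def independent_t(group1, group2):
    """Pooled-variance independent-groups t statistic."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * stdev(group1) ** 2 +
                  (n2 - 1) * stdev(group2) ** 2) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

# Hypothetical right-hand-minus-left-hand difference scores (dots tapped):
tap_alone = [8, 6, 9, 7, 10, 6]   # expected to be slightly positive
tap_talk = [3, 4, 2, 5, 3, 4]     # talking should lower the difference
t = independent_t(tap_alone, tap_talk)
```

With real classroom data the group difference is often much noisier than in this invented example, which is exactly the point of the demonstration.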
According to Kee, Bathurst, and Hellige (1983), there should be slightly positive scores in the Tap Alone condition, as right-handed participants are generally faster at tapping with the right as compared to the left hand. It is predicted that the difference scores in the Tap + Talk condition should be lower due to the interference of the talking activity with right-hand tapping. This prediction will probably not be confirmed, however, because of the inappropriate use of a between-groups design. Encourage the students to recognize that the individual differences in tapping rate can easily mask the interference effects we are trying to measure. Discuss with the class how a repeated measures design enhances the chance of seeing the effect.
Repeat the experiment using a repeated measures design. All participants tap the pen on the paper for 15 seconds under each condition: Tap Alone and Tap + Talk. Again, each participant taps using the right hand as well as the left hand. (Counterbalance the order of right- and left-hand tapping as well as the order of tapping condition.) This results in a total of four measurements per participant. As before, subtract the tapping score with the left hand from the tapping score with the right hand. Do this for each participant under each condition so there are two scores for each participant. Analyze these scores using a correlated groups t test.
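A matching sketch for the correlated-groups analysis, again with hypothetical difference scores. Because each participant contributes a score in both conditions, the within-participant consistency of the Alone-minus-Talk differences yields a larger t than a between-groups comparison of the same means would.

```python
from statistics import mean, stdev

def paired_t(condition_1, condition_2):
    """Correlated-groups (paired) t statistic on within-participant
    difference scores."""
    diffs = [a - b for a, b in zip(condition_1, condition_2)]
    return mean(diffs) / (stdev(diffs) / len(diffs) ** 0.5)

# Hypothetical scores: each participant contributes one right-minus-left
# difference score under each tapping condition.
tap_alone = [8, 6, 9, 7, 10, 6]
tap_talk = [5, 4, 6, 5, 7, 3]
t = paired_t(tap_alone, tap_talk)
```

The paired analysis removes stable individual differences in tapping rate from the error term, which is why the repeated measures design is the more sensitive one here.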
This effect is small, but it is reliable. If instructors do not wish to take the time to statistically analyze the results, be sure to point out that in the between-groups design instructors must compare means across groups, while in the within-groups design we compare the individual scores across conditions, looking for consistent differences. These differences may then be summarized as a mean difference. In studies of this nature, the mean difference used in a repeated measures design is a more sensitive measure than the difference between two means calculated in a between-groups design.
Gloning, I., Gloning, K., Haub, G., & Quatember, R. (1969). Comparison of verbal behavior in right-handed and non right-handed patients with anatomically verified lesion of one hemisphere. Cortex, 5, 43–52.
Kee, D. W., Bathurst, K., & Hellige, J. B. (1983). Lateralized interference of repetitive finger tapping: Influence of familial handedness, cognitive load and verbal production. Neuropsychologia, 21, 617–624.
Springer, S. P., & Deutsch, G. (1985). Left brain, right brain (2nd ed.). Freeman.
Laboratory Demonstration: Taste Test
Conduct a taste test or "Pepsi challenge" to demonstrate independent groups and repeated measures (between-participants, within-participants) designs. Half the class can design and conduct a taste test using an independent groups design and the rest of the class can use a repeated measures design. Instructors can use this activity to point out many facets of the decision to use one or the other design.
Activity: Confounding
Practice in identifying confounds will sensitize students to the importance of controlling all, even seemingly trivial, extraneous variables and will aid them in designing their own experiments. Handout 3 in this manual describes experiments with confounding problems and questions for students to address. These can be used as homework, discussion items in class, or exam questions.
Additional Discussion Topics
Discussion: Pretest Versus Posttest
An important distinction when designing an experiment is whether or not to include a pretest. Remind students that the area of psychology being studied may influence this decision. For example, if one is studying the development of something, then one may well want a pretest to measure Time 1 to Time 2 differences. Ask students to generate examples of things that could be studied and discuss the advantages and disadvantages of adding a pretest to the design.
Discussion: Attrition
A good way to introduce the concept of attrition is to take the class the students are in as an example. How many of the students who signed up on Day 1 are still in the class? How many are still registered but have "dropped out" in terms of attendance? Does this affect test scores? What if one were collecting other types of data?
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Kohn, A. (1992). Defying intuition: Demonstrating the importance of the empirical technique.
Teaching of Psychology, 19, 217–219.
Polyson, J. A., & Blick, K. A. (1985). Basketball game as psychology experiment. Teaching of Psychology, 12, 52–53.
Stallings, W. M. (1993). Return to our roots: Raising radishes to teach experimental design. Teaching of Psychology, 20, 165–167.
Also recommended:
Christopher, A. N., & Marek, P. (2002). A sweet tasting demonstration of random occurrences. Teaching of Psychology, 29, 122–125.
Enders, C. K., Laurenceau, J.-P., & Stuetzle, R. (2006). Teaching random assignment: A classroom demonstration using a deck of playing cards. Teaching of Psychology, 33, 239–242.
Chapter 9: Conducting Experiments
Learning Objectives
Explain the differences between straightforward and staged manipulations of an independent variable.
Distinguish among the three types of dependent variables: self-report, behavioral, and physiological.
Discuss sensitivity of a dependent variable, contrasting floor effects and ceiling effects.
Describe ways to control participant expectations and experimenter expectations.
Summarize the reasons for conducting pilot studies.
Describe the advantages of including a manipulation check in an experiment.
Brief Chapter Outline
XI. Selecting Research Participants
XII. Manipulating the Independent Variable
D. Setting the Stage
E. Types of Manipulations
29. Straightforward Manipulations
30. Staged Manipulations
F. Strength of the Manipulation
G. Cost of the Manipulation
XIII. Measuring the Dependent Variable
Types of Measures
Self-Report Measures
31. Behavioral Measures
32. Physiological Measures
H. Multiple Measures
I. Sensitivity of the Dependent Variable
J. Cost of Measures
XIV. Additional Controls
Controlling for Participant Expectations
Demand Characteristics
33. Placebo Groups
K. Controlling for Experimenter Expectations
Research on Expectancy Effects
Solutions to the Expectancy Problem
XV. Final Planning Considerations
Research Proposals
L. Pilot Studies
M. Manipulation Checks
N. Debriefing
O. Open Science and Preregistration
XVI. Analyzing and Interpreting Results
XVII. Communicating Research to Others
Professional Meetings
P. Journal Articles
Extended Chapter Outline
Please note that much of this information is quoted from the text.
XXII. Selecting Research Participants
The method used to select participants can have a profound impact on external validity. External validity is defined as the extent to which results from a study can be generalized to other populations and settings.
XXIII. Manipulating the Independent Variable
To manipulate an independent variable, one would have to construct an operational definition of the variable. That is, one must turn a conceptual variable into a set of operations: specific instructions, events, and stimuli to be presented to the research participants.
f. Setting the Stage
In setting the stage, one usually has to supply the participants with the information necessary for them to provide their informed consent to participate. This generally includes information about the underlying rationale of the study.
g. Types of Manipulations
1. Straightforward Manipulations
Researchers are usually able to manipulate an independent variable with relative simplicity by presenting written, verbal, or visual material to the participants. Such straightforward manipulations manipulate variables with instructions and stimulus presentations.
Staged Manipulations
Sometimes, it is necessary to stage events during the experiment in order to manipulate the independent variable successfully. When this occurs, the manipulation is called a staged manipulation or event manipulation.
Staged manipulations are most frequently used for one of two reasons: first, to create some psychological state in the participants, such as frustration, anger, or a temporary lowering or raising of self-esteem; second, to simulate some situation that occurs in the real world.
Staged manipulations frequently employ a confederate (sometimes termed an "accomplice"). Usually, the confederate appears to be another participant in an experiment but is actually part of the manipulation.
Q. Strength of the Manipulation
The simplest experimental design has two levels of the independent variable. In planning the experiment, the researcher has to choose these levels. A general principle to follow is to make the manipulation as strong as possible; the strength of manipulation matters. A strong manipulation maximizes the differences between the two groups and increases the chances that the independent variable will have a statistically significant effect on the dependent variable.
R. Cost of the Manipulation
Cost is another factor in the decision about how to manipulate the independent variable. Researchers who have limited monetary resources may not be able to afford expensive equipment, salaries for confederates, or payments to participants in long-term experiments.
Measuring the Dependent Variable
Types of Measures
The dependent variable in most experiments is one of three general types: self-report, behavioral, or physiological.
Self-Report Measures
Self-reports can be used to measure attitudes, liking for someone, judgments about someone’s personality characteristics, intended behaviors, emotional states, attributions
about why someone performed well or poorly on a task, confidence in one’s judgments, and many other aspects of human thought and behavior.
34. Behavioral Measures
Behavioral measures are direct observations of behaviors. As with self-reports, measurements of an almost endless number of behaviors are possible.
Physiological Measures
Physiological measures are recordings of responses of the body. Many such measurements are available; examples include the galvanic skin response (GSR), electromyogram (EMG), and electroencephalogram (EEG). An MRI provides an image of an individual's brain structure. In addition, a functional MRI (fMRI) allows researchers to scan areas of the brain while a research participant performs a physical or cognitive task or experiences something new in the environment.
S. Multiple Measures
Although it is convenient to describe single dependent variables, most studies include more than one dependent measure. One reason to use multiple measures stems from the fact that a variable can be measured in a variety of concrete ways.
T. Sensitivity of the Dependent Variable
The dependent variable should be sensitive enough to detect differences between groups. The issue of sensitivity is particularly important when measuring human performance. Issues include the ceiling effect, in which the independent variable appears to have no effect on the dependent measure only because participants quickly reach the maximum performance level. The opposite problem occurs when a task is so difficult that hardly anyone can perform well; this is called a floor effect.
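Instructors can illustrate the ceiling problem with a short simulation (all numbers are invented): two groups differ genuinely in ability, but when the test's maximum score is too low, most scores pile up at the top and the observed group difference shrinks.

```python
import random

rng = random.Random(1)

def observed_scores(true_scores, max_points):
    """Scores on a test are capped at the maximum points available."""
    return [min(score, max_points) for score in true_scores]

# A genuine 10-point ability difference between two groups:
group_low = [rng.gauss(90, 5) for _ in range(200)]
group_high = [rng.gauss(100, 5) for _ in range(200)]

def mean_gap(max_points):
    low = observed_scores(group_low, max_points)
    high = observed_scores(group_high, max_points)
    return sum(high) / len(high) - sum(low) / len(low)

gap_easy_test = mean_gap(95)    # ceiling effect shrinks the observed gap
gap_hard_test = mean_gap(200)   # sensitive measure shows the full gap
```

A floor effect is the mirror image: a task so hard that scores pile up at the minimum, which could be simulated by capping from below instead.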
U. Cost of Measures
Another consideration is cost: some measures may be more costly than others. Paper-and-pencil self-report measures are generally inexpensive; measures that require trained observers or elaborate equipment can become quite costly.
XVIII. Additional Controls
The basic experimental design has two groups: in the simplest case, an experimental group that
receives the treatment and a control group that does not. Use of a control group makes it possible to eliminate a variety of alternative explanations for the results, thus improving internal validity.
Controlling for Participant Expectations
Demand Characteristics
We noted previously that experimenters generally do not wish to inform participants about the specific hypotheses being studied or the exact purpose of the research. The reason for this lies in the problem of demand characteristics (Orne, 1962), which is any feature of an experiment that might inform participants of the purpose of the study. The researcher may also attempt to disguise the dependent variable by using an unobtrusive measure or by placing the measure among a set of unrelated filler items on a questionnaire.
35. Placebo Groups
A special kind of participant expectation arises in research on the effects of drugs. In some experiments, we may not know whether a change was caused by the properties of the drug or by the participants' expectations about the effect of the drug; this is called a placebo effect. To control for this possibility, a placebo group can be added. Participants in the placebo group receive a pill or injection containing an inert, harmless substance; they do not receive the drug given to members of the experimental group. If the improvement results from the active properties of the drug, the participants in the experimental group should show greater improvement than those in the placebo group. If the placebo group improves as much as the experimental group, all improvement could be caused by a placebo effect.
V. Controlling for Experimenter Expectations
Experimenters are usually aware of the purpose of the study and thus may develop expectations about how participants should respond. These expectations can in turn bias the results. This general problem is called expectancy effects or experimenter bias.
Research on Expectancy Effects
Expectancy effects have been studied in a variety of ways. Research has shown that experimenter expectancies can be communicated to humans by both verbal and nonverbal means. One particular generalization of this effect is called "teacher expectancy." Research has shown that telling a teacher that a pupil will bloom intellectually over the
next year results in an increase in the pupil’s IQ score.
36. Solutions to the Expectancy Problem
Experimenters should be well trained and should practice behaving consistently with all participants. Another solution is to run all conditions simultaneously so that the experimenter’s behavior is the same for all participants. Expectancy effects are also minimized when the procedures are automated.
A final solution is to use experimenters who are unaware of the hypothesis being investigated. In a single-blind experiment, the participant is unaware of whether a placebo or the actual drug is being administered; in a double-blind experiment, neither the participant nor the experimenter knows whether the placebo or actual treatment is being given.
XIX. Final Planning Considerations
Research Proposals
After putting considerable thought into planning the study, the researcher writes a research proposal. The proposal will include a literature review that provides a background for the study. The intent is to clearly explain why the research is being done, that is, what questions the research is designed to answer. The details of the procedures that will be used to test the idea are then given.
W. Pilot Studies
It is possible to conduct a pilot study in which the researcher does a trial run with a small number of participants. The pilot study will reveal whether participants understand the instructions, whether the total experimental setting seems plausible, whether any confusing questions are being asked, and so on.
X. Manipulation Checks
A manipulation check is an attempt to directly measure whether the independent variable manipulation has the intended effect on the participants. Manipulation checks provide evidence for the construct validity of the manipulation.
Y. Debriefing
After all the data are collected, a debriefing session is usually held. This is an opportunity
for the researcher to interact with the participants to discuss the ethical and educational implications of the study.
Z. Open Science and Preregistration
The Open Science movement emerged as a response to threats to the integrity of modern science. The Center for Open Science is dedicated to "increasing the openness, integrity, and reproducibility of scientific research."
Science is a process; when that process works, over time, uncertainty is reduced. As we study a phenomenon, we understand it better. The integrity of the process directly influences the quality of those outcomes.
The final planning consideration is to preregister the study and analysis plan in an open-access archive. Preregistration is itself an act of open science.
XX. Analyzing and Interpreting Results
After the data have been collected, the next step is to analyze them. Statistical analyses of the data are carried out to allow the researcher to examine and interpret the pattern of results obtained in the study.
XXI. Communicating Research to Others
The final step is to write a report that details why the research was conducted, how the participants were obtained, what procedures were used, and what was found.
Professional Meetings
Meetings sponsored by professional associations are important opportunities for researchers to present their findings to other researchers and the public.
AA. Journal Articles
When a researcher submits a paper to a journal, two or more reviewers read the paper and recommend acceptance (often with the stipulation that revisions be made) or rejection. This process, called peer review, is very important in making sure that research receives careful external review before it is published.
Engaging With Research: Conducting Experiments
After reading the article, answer the following questions (which will be familiar to you from earlier in this chapter!). NOTE: Student answers will vary.
What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
What did these researchers do? What was the method?
P. How did they measure visual attention?
Q. What did they manipulate?
R. What was the independent variable? How many levels did the independent variable have? How were research participants assigned to groups?
1. What was measured? How was visuospatial memory measured?
2. To what or whom can we generalize the results?
Who were the people in the sample?
S. What sampling strategy did they use and what are the implications of that strategy?
T. Where did the participants from this study come from?
U. Did the authors include information about the demographics of their sample?
3. What did they find? What were the results?
4. Have other researchers found similar results?
5. What are the limitations of this study?
6. What are the ethical issues present in this study?
Sample Answers for Review Questions
What is the difference between straightforward and staged manipulations of an independent variable?
Straightforward manipulations manipulate variables with instructions and stimulus presentations. Staged manipulations involve staging events during an experiment in order to
manipulate the independent variable successfully.
7. What are the three general types of dependent variables?
The three types are self-report, behavioral, and physiological.
8. What is meant by the sensitivity of a dependent measure? What are ceiling effects and floor effects?
The dependent variable should be sensitive enough to detect differences between groups. The ceiling effect is when the independent variable appears to have no effect on the dependent measure only because participants quickly reach the maximum performance level. With a floor effect, the task is so difficult that hardly anyone can perform well.
9. What are demand characteristics? Describe ways to minimize demand characteristics.
Demand characteristics are any feature of an experiment that might inform participants of the purpose of the study. One way to minimize demand is to use deception to make participants think that the experiment is studying one thing when actually it is studying something else. One could also simply ask participants about their perceptions of the purpose of the research. Demand characteristics may be eliminated when people are not aware that an experiment is taking place or that their behavior is being observed.
10. What is the reason for having a placebo group?
A placebo group is used to control for the possibility of the placebo effect: an effect that can occur when participants in a study have expectations that may affect the outcome of the study.
11. What are experimenter expectancy effects? What are some solutions to the experimenter bias problem?
Expectancy effects may occur whenever the experimenter knows which condition the participants are in. Some solutions include good training for the experimenter, resulting in consistent behavior with all participants, running all conditions simultaneously to assure equal behavior toward the participants, automating the procedures, and using experimenters who are unaware of the hypothesis being investigated.
12. What is a pilot study?
In a pilot study, the researcher does a trial run with a small number of participants, revealing whether participants understand the instructions, whether the experimental setting seems plausible, and whether any confusing questions are being asked.
13. What is a manipulation check? How does it help the researcher interpret the results of an experiment?
A manipulation check is an attempt to directly measure whether the independent variable manipulation has the intended effect on the participants. These checks provide evidence for the construct validity of the manipulation. They are particularly useful in the pilot study.
14. Describe the value of a debriefing following the study.
A debriefing session is an opportunity for the researcher to interact with the participants to discuss the ethical and educational implications of the study. It is also an opportunity to learn more about what participants were thinking during the experiment, as well as an opportunity to ask participants to refrain from discussing the study with others.
15. What does a researcher do with the findings after completing a research project?
The researcher writes a report that details why he or she conducted the research, how the participants were obtained, what procedures were used, and what the research found. The research findings are usually submitted as journal articles or as papers to be read at scientific meetings. The submitted paper is evaluated by two or more knowledgeable reviewers who decide whether the paper is acceptable for publication or presentation at the meeting.
Sample Answers for Being a Skilled Consumer of Research
Dr. García has decided to conduct an experiment to see if moods impact political attitudes. She is planning on manipulating positive mood and measuring the strength of people’s attitudes toward polarizing political topics (e.g., how strongly a person holds an attitude that climate change is a serious problem). She believes that people who are in a positive mood state will show weaker attitudes; they will care less about divisive issues if they are happy. First, she needs to consider the independent variable.
Should she choose a straightforward manipulation or a staged manipulation? Why or why not?
Student answers will vary.
V. Identify two examples of manipulations that she could make (e.g., she could show a funny video).
Student answers will vary.
W. How should she test the strength of the manipulation?
Student answers will vary.
16. Next, Dr. García needs to consider her dependent variable:
Identify examples of self-report and behavioral measures to recommend to Dr. Garcia.
Student answers will vary.
X. Should she include multiple DVs? Why or why not?
Student answers will vary.
Y. Given the examples that you identified, describe the sensitivity of each measure. Is there potential for a floor effect or a ceiling effect?
Student answers will vary.
17. Then, Dr. García has to consider experimental controls:
What additional controls should Dr. García include in the experiment to control for demand characteristics?
Student answers will vary.
Z. What additional controls should Dr. García include in the experiment to control for experimenter expectancies?
Student answers will vary.
18. Finally, Dr. García decides to conduct a pilot study and finds no significant difference between the experimental conditions. Should she continue with the experiment? Why or why not? What should she do next? Explain your recommendations.
Student answers will vary.
Activity: Data Collection for Later Analysis
The questionnaire shown on the following page is designed to illustrate various procedures for statistical analysis. Variables include gender, age, interests, and shyness. Some are clearly measured using nominal scales and others illustrate interval scales. The shyness scale consists of four items; scores on two items must be reversed and then a sum of scores on the four items must be obtained for an overall score. Students can collect data, code data, and develop hypotheses about what kinds of findings might be expected. When instructors begin discussing data analysis, the data can be used to illustrate descriptive statistics as well as inferential statistics. Instructors may wish to have students do analyses with a computer (see Appendix B activity).
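The reverse-scoring step described above can be sketched in a few lines of Python. This is only an illustration, not part of the manual; based on the item wording, we assume the two sociability-keyed items (items 2 and 4) are the ones reversed, using the usual 6 − response rule for a 1–5 scale.

```python
# Score the four-item shyness scale from the class survey.
# Assumption: items 2 ("parties") and 4 ("outgoing") are the reverse-scored
# items, since higher totals should mean more shyness; check your scoring key.
REVERSED = {1, 3}  # zero-based positions of the reverse-scored items

def shyness_score(responses):
    """responses: four ratings, each 1-5, in the order printed on the survey."""
    assert len(responses) == 4 and all(1 <= r <= 5 for r in responses)
    return sum(6 - r if i in REVERSED else r for i, r in enumerate(responses))
```

For example, a respondent answering 5, 1, 5, 1 (shy, avoids parties, likes being alone, not outgoing) receives the maximum score of 20.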
Examples of some of the questions that students might address include:
Is gender related to interests (χ²)?
19. Is age related to shyness (Pearson r)?
20. Is gender related to shyness (independent groups t test)?
21. Do people judge the likelihood of their own divorce to be lower than the likelihood that the average person will get a divorce (paired-samples t test)?
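Students will typically run these tests in a statistics package, but the underlying computations are simple. As a rough stdlib-only sketch (test statistics only, no p-values), Pearson r and the pooled-variance independent-groups t can be computed by hand:

```python
import statistics as st

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the SDs."""
    mx, my = st.mean(x), st.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def independent_t(g1, g2):
    """Pooled-variance t statistic for an independent groups comparison."""
    n1, n2 = len(g1), len(g2)
    pooled = ((n1 - 1) * st.variance(g1) + (n2 - 1) * st.variance(g2)) / (n1 + n2 - 2)
    return (st.mean(g1) - st.mean(g2)) / (pooled * (1 / n1 + 1 / n2)) ** 0.5
```

For the class survey, `x` and `y` might be age and total shyness score, and `g1`/`g2` the shyness scores of the two gender groups; the paired-samples t for the divorce question would instead operate on each respondent's own-versus-average difference.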
Activity: Data From Personal Ads
For this activity, have students examine personal ads to collect data on a variety of variables. For example, students may collect information on age preference by younger versus older individuals, age preference by males versus females, or whether the individuals placing the ads had been married or not. The data can be used to address a variety of topics regarding data collection and statistical analysis (e.g., identification of the types of scales to which the data belong). The data collected in this activity can also be used in discussions of data analysis in later chapters of the text. Rajecki (2002) has employed this approach to illustrate concepts of statistical analysis.
Rajecki, D. W. (2002). Personal ad content analysis teaches statistical applications. Teaching of Psychology, 29, 119–122.
Activity: Participating in Experiments Online
The University of Mississippi hosts a website containing a number of online experiments that students can complete. The experiments include a mental rotation experiment similar to the one described in the text. It is possible to download the results for only the students in your class or for everyone around the world who has completed the experiment.
Other organizations such as the Social Psychology Network and the American Psychological Society also have links to various experiments online. The sites can be found at: http://socialpsychology.org/expts.htm and http://psych.hanover.edu/research/exponnet.html
Class Survey
Please answer all the questions below. Your answers, along with the responses of many others, will be used by students in a class to learn techniques of data analysis. Thanks for your help!
Please indicate your:
Gender: _____ Male _____ Female
Age: _____
Please rate your liking of the following interests/activities:
Please answer the following questions using this rating scale:
1 = disagree strongly
2 = disagree mildly
3 = neither agree nor disagree
4 = agree mildly
5 = agree strongly
I consider myself to be shy.
1 2 3 4 5
I love to go to parties and be with people.
1 2 3 4 5
I like to be alone most of the time.
1 2 3 4 5
I consider myself to be outgoing.
1 2 3 4 5
On a scale of 0 (very unlikely) to 100 (very likely), how likely do you think it is that you will get a divorce sometime in the future?
On the same 0–100 scale, how likely do you think it is that the average person will get a divorce?
Additional Discussion Topics
Discussion: Placebo Effects
This may be a good time to discuss the research on placebo versus antidepressants. Much research suggests that for mild to moderate depression, placebo effects are on par with the currently prescribed drugs. Ask students what this research suggests. What are the ethical implications if you find that a medical condition can be treated with a placebo?
See the following for more information:
Fournier, J. C., DeRubeis, R. J., Hollon, S. D., Dimidjian, S., Amsterdam, J. D., Shelton, R. C., & Fawcett, J. (2010). Antidepressant drug effects and depression severity: A patient-level meta-analysis. The Journal of the American Medical Association, 303(1), 47–53.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
At this point in the class, many instructors are having students design their own studies. Several articles in the Ware and Brewer book may be useful to instructors designing their courses.
Baird, B. N. (1991). In-class poster sessions. Teaching of Psychology, 18, 27–29.
Goolkasian, P. (1985). A microcomputer-based lab for psychology instruction. Teaching of Psychology, 12, 223–225.
Gore, P. A., & Camp, C. J. (1987). A radical poster session. Teaching of Psychology, 14, 243–244.
Klein, K., & Cheuvront, B. (1990). The subject-experimenter contract: A reexamination of subject pool contamination. Teaching of Psychology, 17, 166–169.
Peden, B. F. (1987). Learning about microcomputers and research. Teaching of Psychology, 14, 217–219.
Rosenberg, J., & Blount, R. L. (1988). Poster sessions revisited: A student research convocation. Teaching of Psychology, 15, 38–39.
Also recommended:
Harcum, E. R. (1989). Demonstrating that an ability does not exist. Teaching of Psychology, 16, 85–86.
O’Dell, C. D., & Hoyert, M. S. (2002). Active and passive touch: A research methodology project. Teaching of Psychology, 29, 292–294.
Chapter 10: Complex Experimental Designs
Learning Objectives
Define factorial design and discuss reasons a researcher would use this design.
Describe the information provided by main effects and interaction effects in a factorial design.
Discuss the role of simple main effects in interpreting interactions.
Compare the assignment of participants in an independent groups design, a repeated measures design, and a mixed factorial design.
Brief Chapter Outline
XXII. Increasing the Number of Levels of an Independent Variable
XXIII. Increasing the Number of Independent Variables: Factorial Designs
BB. Interpretation of Factorial Designs
37. Main Effects
38. Interactions
XXIV. Outcomes of a 2 × 2 Factorial Design
Interactions and Simple Main Effects
Simple Main Effect of Confederate Food Intake
39. Simple Main Effect of Sociability
XXV. Assignment Procedures and Factorial Designs
Independent Groups (Between-Subjects) Design
CC. Repeated Measures (Within-Subjects) Design
DD. Mixed Factorial Design Using Combined Assignment
XXVI. Increasing the Number of Levels of an Independent Variable
XXVII. Factorial Designs With Three or More Independent Variables
Extended Chapter Outline
Please note that much of this information is quoted from the text.
Increasing the Number of Levels of an Independent Variable
In the simplest experimental design, there are only two levels of the independent variable (IV). However, a researcher might want to design an experiment with three or more levels for several reasons.
First, a design with only two levels of one independent variable cannot provide very much information about the exact form of the relationship between the independent and dependent variables.
An experimental design with only two levels of the independent variable cannot detect curvilinear relationships between variables. If a curvilinear relationship is predicted, at least three levels must be used. Further, researchers frequently are interested in comparing more than two groups.
XXVIII. Increasing the Number of Independent Variables: Factorial Designs
Factorial designs are designs with more than one independent variable (or factor). In a factorial design, all levels of each independent variable are combined with all levels of the other independent variables. The simplest factorial design, known as a 2 × 2 (two by two) factorial design, has two independent variables, each having two levels.
Interpretation of Factorial Designs
Factorial designs yield two kinds of information. The first is information about the effect of each independent variable taken by itself: This is called the main effect of an independent variable. In a design with two independent variables, there are two main effects, one for each independent variable.
The second type of information is called an interaction. If there is an interaction between two independent variables, the effect of one independent variable depends on the particular level of the other variable. In other words, the effect that an independent variable has on the dependent variable depends on the level of the other independent variable. Interactions are a new source of information that cannot be obtained in a simple experimental design in which only one independent variable is manipulated.
Main Effects
A main effect is the effect each variable has by itself. The main effect of each independent variable is the overall relationship between that independent variable and the dependent variable.
40. Interactions
An interaction between independent variables indicates that the effect of one independent variable is different at different levels of the other independent variable. That is, an interaction tells one that the effect of one independent variable depends on the particular level of the other. Interactions can be seen easily when the means for all conditions are presented in a graph.
XXIX. Outcomes of a 2 × 2 Factorial Design
A 2 × 2 factorial design has two independent variables, each with two levels. When analyzing the results, researchers deal with several possibilities:
o There may or may not be a significant main effect for independent variable A.
o There may or may not be a significant main effect for independent variable B.
o There may or may not be a significant interaction between the independent variables.
Interactions and Simple Main Effects
A statistical procedure called analysis of variance is used to assess the statistical significance of the main effects and the interaction in a factorial design. When a significant interaction occurs, the researcher must statistically evaluate the individual means. When there is a significant interaction, the next step is to look at the simple main effects. A simple main effect analysis examines mean differences on one independent variable at each level of the other independent variable.
Simple Main Effect of Confederate Food Intake
In Figure 4, we can look at the simple main effect of confederate food intake. This will tell one whether the difference between the low and high confederate food intake is significant when the confederate is (1) sociable and (2) unsociable. In this case, the simple main effect of confederate food intake is significant when the confederate is unsociable (means of 2.14 vs. 10.63), but the simple main effect of confederate food intake is not significant when the confederate is sociable (means of 6.58 and 5.68).
41. Simple Main Effect of Sociability
One could also examine the simple main effect of confederate sociability; here one would compare the sociable versus unsociable conditions when food intake is low and then when food intake is high. The simple main effect that one will be most interested in will depend on the predictions that one made when one designed the study.
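Using the four cell means reported above, the main effects, simple main effects, and interaction contrast are plain arithmetic on (marginal) means; whether any of these differences is statistically significant still requires the analysis of variance described above. A small sketch with the cell means hard-coded from the example:

```python
# Cell means for amount eaten: keys are (confederate sociability, food intake)
m = {("sociable", "low"): 6.58, ("sociable", "high"): 5.68,
     ("unsociable", "low"): 2.14, ("unsociable", "high"): 10.63}

# Simple main effects of food intake at each level of sociability
simple_unsociable = m[("unsociable", "high")] - m[("unsociable", "low")]  # large
simple_sociable = m[("sociable", "high")] - m[("sociable", "low")]        # near zero

# Main effect of food intake: difference between the marginal (column) means
main_intake = (m[("sociable", "high")] + m[("unsociable", "high")]) / 2 \
            - (m[("sociable", "low")] + m[("unsociable", "low")]) / 2

# The interaction contrast is the difference between the two simple main effects
interaction = simple_unsociable - simple_sociable
```

The large positive `interaction` value reflects exactly the pattern in the text: food intake matters when the confederate is unsociable but not when the confederate is sociable.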
XXX. Assignment Procedures and Factorial Designs
There are two basic ways of assigning participants to conditions:
In an independent groups design, different participants are assigned to each of the conditions in the study. In a repeated measures design, the same individuals participate in all conditions in the study.
The design can be completely independent groups, completely repeated measures, or a mixed factorial design, that is, a combination of the two.
Independent Groups (Between-Subjects) Design
In a 2 × 2 factorial design, there are four conditions. If one wants a completely independent groups (between-subjects) design, a different group of participants will be assigned to each of the four conditions.
EE. Repeated Measures (Within-Subjects) Design
In a completely repeated measures (within-subjects) design, the same individuals will participate in all conditions.
Mixed Factorial Design Using Combined Assignment
Some studies use both independent groups and repeated measures procedures in a mixed factorial design.
XXXI. Increasing the Number of Levels of an Independent Variable
The 2 × 2 is the simplest factorial design. With this basic design, the researcher can arrange experiments that are more and more complex. One way to increase complexity is to increase the number of levels of one or more of the independent variables. A 2 × 3 design, for example, contains two independent variables: Independent variable A has two levels, and independent variable B has three levels. Thus, the 2 × 3 design has six conditions.
XXXII. Factorial Designs With Three or More Independent Variables
One can also increase the number of variables in the design. A 2 × 2 × 2 factorial design contains three variables, each with two levels. Thus, there are eight conditions in this design. In a 2 × 2 × 3 design, there are 12 conditions; in a 2 × 2 × 2 × 2 design, there are 16. The rule for constructing factorial designs remains the same throughout.
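The counting rule in this paragraph (the number of conditions is the product of the numbers of levels) can be checked mechanically by enumerating the cells of a design. A quick illustrative sketch:

```python
from itertools import product

def cells(*levels):
    """All conditions of a factorial design; each cell is a tuple of level numbers."""
    return list(product(*(range(1, n + 1) for n in levels)))

# Number of conditions = product of the numbers of levels of each IV
assert len(cells(2, 2)) == 4        # the basic 2 x 2
assert len(cells(2, 3)) == 6        # 2 x 3 design
assert len(cells(2, 2, 2)) == 8     # 2 x 2 x 2 design
assert len(cells(2, 2, 3)) == 12    # 2 x 2 x 3 design
```

Printing `cells(2, 2)` lists the four cells directly: (1, 1), (1, 2), (2, 1), (2, 2).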
Sometimes students are tempted to include in a study as many independent variables as they can think of. A problem with this is that the design may become needlessly complex and require enormous numbers of participants.
Engaging With Research: Complex Experimental Designs
After reading the article, answer the following questions (which will be familiar to you from earlier in this chapter!). NOTE: Student answers will vary.
What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
42. What did these researchers do? What was the method?
a. How many independent variables were there? How many levels did each IV have?
b. How were participants assigned to groups? Was this a repeated measures design or an independent groups design?
c. How was the IV manipulated? Was the manipulation staged or straightforward?
d. How did they conduct a manipulation check?
43. What was measured?
44. To what or whom can we generalize the results?
45. What did they find? What were the results?
Describe the interaction depicted in Figure 1.
46. Have other researchers found similar results?
47. What are the limitations of this study?
48. What are the ethical issues present in this study?
How were research participants compensated?
Sample Answers for Review Questions
Why would a researcher have more than two levels of the independent variable in an experiment?
A design with only two levels of one independent variable cannot provide much information about the exact form of the relationship between the independent and dependent variables; in particular, it cannot detect curvilinear relationships, for which at least three levels are needed. Researchers also frequently want to compare more than two groups.
49. What is a factorial design? Why would a researcher use a factorial design?
Factorial designs are experimental designs with more than one independent variable (or factor). These designs allow for more complex experiments.
50. What are the main effects in a factorial design? What is an interaction?
Main effects of an independent variable are information about the effect of each independent variable taken by itself. Interactions are a new source of information that cannot be obtained in a simple experimental design in which only one independent variable is manipulated.
51. Identify the number of conditions in a factorial design on the basis of knowing the number of independent variables and the number of levels of each independent variable.
Student answers will vary. Student answers should reflect an understanding of factorial design conditions found in the text.
52. Describe 2 × 2 factorial designs in which (a) both independent variables are independent groups (between-subjects), (b) both independent variables are repeated measures (within-subjects), and (c) the design is mixed, so that one independent variable is independent groups and the other is repeated measures.
Student answers will vary. Students should support their answers with evidence from the text covering 2 × 2 factorial designs.
Sample Answers for Being a Skilled Consumer of Research
Participants were randomly assigned to conditions in a 2 × 2 between-subjects (independent groups) experiment exploring the potential impact of distraction on reaction time. In this study, two variables were manipulated: (1) a confederate in the room reading a book, either aloud or to themselves, and (2) popular music playing on a speaker (present, absent). One variable was measured: time in seconds taken to complete a sudoku puzzle.
Describe what you think the results would be if there was a main effect for reading.
Student answers will vary.
e. Describe what you think the results would be if there was also a main effect for music.
Student answers will vary.
f. Describe what you think the results would be if there was an interaction between reading and music.
Student answers will vary.
g. What assignment procedures would you recommend: completely independent groups, completely repeated measures, or a mixed factorial design? Why?
Student answers will vary.
53. Research participants were randomly assigned to read one of four "résumés" which were identical except for being described as written by a 25-year-old high school graduate, a 25-year-old college graduate, a 35-year-old high school graduate, or a 35-year-old college graduate. After reviewing the résumé, participants rated the résumé for overall job qualification. The results (higher numbers indicate more qualification) showed: 25-year-old high school graduate (2.02), 25-year-old college graduate (2.05), 35-year-old high school graduate (3.82), and 35-year-old college graduate (4.90).
Identify the design of this experiment.
Student answers will vary.
h. Identify the independent variable(s) and dependent variable(s).
Student answers will vary.
i. How many conditions are in the experiment?
Student answers will vary.
j. Is there a participant variable in this experiment? If so, identify it. If not, can you suggest a participant variable that might be included?
Student answers will vary.
k. Given the results, do you think that there could be any main effects or an interaction?
Student answers will vary.
Laboratory Demonstration: Proactive Inhibition
The experiment described below may seem complicated, but it is actually simple to execute, generates reliable and sizable effects, and demonstrates a mixed factorial design.
Proactive inhibition (PI) refers to the deterioration of recall performance over repeated trials (Crowder, 1976). In short-term recall, inhibition effects are mild to nonexistent when the stimulus words are unrelated from trial to trial. However, when the stimulus words are from the same semantic category (e.g., all names of flowers, or names of musical instruments), proactive inhibition builds rapidly from trial to trial (Wickens, Born, & Allen, 1963). The importance of the semantic relationship between the words can be demonstrated by switching to a new semantic category after four or five trials: Recall performance instantly improves to the level of the first trial, an effect called release from proactive inhibition. The following experiment uses the Brown–Peterson task to demonstrate the phenomena of proactive inhibition and release from proactive inhibition.
The Brown–Peterson task is commonly used to test recall from short-term memory. The basic procedure is as follows. After the participant is exposed to the stimulus material to be remembered, the participant immediately performs a simple distractor activity. The distractor activity can take many forms, but a common and inexpensive one is to have the participant count aloud backward by threes from some number provided by the experimenter. The distractor activity continues for 30–60 seconds. At the experimenter’s signal, the participant stops the distractor activity and attempts to recall the stimulus material.
Randomly assign participants to three groups: proactive inhibition/no release group (PI/No Release), proactive inhibition/release group (PI/Release), and non-proactive inhibition control (Non-PI Control). The PI/No Release group receives stimulus words from the same semantic category on each of five trials. The PI/Release group receives stimulus words from the same semantic category on the first four trials, but on the fifth trial the stimulus words are from a different semantic category. The Non-PI Control group receives stimulus words from different semantic categories on each of the five trials.
To accomplish this, prepare three sets of five large index cards, one set per condition. For the PI/No Release condition, write three words from the same semantic category on each card. For the PI/Release condition, use the same words as for PI/No Release for the first four cards but switch to a different semantic category for the last card. For the Non-PI Control condition, on each card write three words from the same semantic category but change categories for each card; the words on the last card in this condition should be the same as the words on the last card in the PI/Release condition. Stimulus words for each condition are suggested below.
In order to conduct the experiment simultaneously with all three groups, solicit the help of two assistant experimenters. Have the participants arrange their seats in three groups such that each group faces a different corner of the room. An experimenter faces each group; in this way each group sees only one experimenter but each experimenter can clearly see the other two. Each experimenter has one of the sets of cards.
The Brown–Peterson task is conducted as follows. Experimenters simultaneously present the first card of the set to the groups for 5 seconds, and then one experimenter immediately calls out a three-digit number. Participants must then count aloud backward by threes from the number called out. Counting aloud backward continues for 30 seconds. This part of the procedure is crucial. Instructors will need to lead the participants and keep them counting. If some are not counting, call out their names and get them started. Have fun with it, but keep them occupied; they must not rehearse the words they just saw! When the counting period is over, give them 5 seconds to write down the words from Trial 1. Begin Trial 2 immediately. Repeat this procedure until all five trials are completed. (Instructors may want to do a practice trial before handing out the test booklets to be sure everyone understands what to do.)
Write the 3 × 5 factorial design on the board. Have each student score their recall for each trial with a 0–5 for the number of words correctly recalled. Summarize the data for each group across all five trials. Plot the findings in a figure and discuss the main effects and interaction. The anticipated findings are drawn below:
Crowder, R. G. (1976). Principles of learning and memory. Lawrence Erlbaum Associates.
Wickens, D. D., Born, D. G., & Allen, C. K. (1963). Proactive inhibition and item similarity in short-term memory. Journal of Verbal Learning and Verbal Behavior, 2, 440–445.
Stimulus Words:
Handout 4 in Part II of the instructor’s manual provides seven factorial designs to identify. These can be used as homework assignments or done in class as an activity by groups of students.
Activity: Outcomes of Factorial Designs
Students can also practice identifying outcomes of factorial designs. Handout 5 provides questions on outcomes that can be used for homework or group exercises.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Strube, M. J., & Goldstein, M. D. (1995). A computer program that demonstrates the difference between main effects and interactions. Teaching of Psychology, 22, 207–208.
Vernoy, M. W. (1994). A computerized Stroop experiment that demonstrates the interaction in a 2 × 3 factorial design. Teaching of Psychology, 21, 186–189.
Zerbolio, D. J., Jr., & Walker, J. T., Jr. (1989). Factorial design: Binocular and monocular depth perception in vertical and horizontal stimuli. Teaching of Psychology, 16, 65–66.
Chapter 11: Single-Case, Quasi-Experimental, and Developmental Research
Learning Objectives
Describe single-case experimental designs and discuss reasons to use this design.
Explain the one-group posttest-only design, and describe the situations where it would be useful.
Describe the one-group pretest-posttest design and the associated threats to internal validity that may occur: history, maturation, testing, instrument decay, and regression toward the mean.
Compare and contrast the nonequivalent control group design and nonequivalent control group pretest-posttest design; discuss the advantages of having a control group.
Distinguish between the interrupted time series design and control series design.
Describe cross-sectional, longitudinal, and sequential research designs, including the advantages and disadvantages of each design.
Explain what a cohort effect is.
Brief Chapter Outline
XXXIII. Single-Case Experimental Designs
FF. Reversal Designs
GG. Multiple Baseline Designs
HH. Replications in Single-Case Designs
XXXIV. Quasi-Experimental Designs
One-Group Posttest-Only Design
II. One-Group Pretest-Posttest Design
1. History
2. Maturation
3. Testing
4. Instrument Decay
5. Regression Toward the Mean
JJ. Nonequivalent Control Group Design
KK. Nonequivalent Control Group Pretest-Posttest Design
LL. Propensity Score Matching of Nonequivalent Treatment and Control Groups
MM. Interrupted Time Series Design and Control Series Design
XXXV. Developmental Research Designs
Cross-Sectional Method
NN. Longitudinal Method
OO. Comparison of Longitudinal and Cross-Sectional Methods
PP. Sequential Method
Extended Chapter Outline
Please note that much of this information is quoted from the text.
Single-Case Experimental Design
Single-case experimental designs have traditionally been called single-subject designs; an equivalent term you may see is small N designs.
Single-case experiments were developed from a need to determine whether an experimental manipulation had an effect on a single research participant. In a single-case design, the subject’s behavior is measured over time during a baseline control period.
Reversal Designs
The basic issue in single-case experiments is how to determine that the manipulation of the independent variable had an effect. One method is to demonstrate the reversibility of the manipulation. A simple reversal design takes the following form:
A (baseline period) → B (treatment period) → A (baseline period).
This basic reversal design is called an ABA design; it requires observation of behavior during the baseline control (A) period, again during the treatment (B) period, and also during a second baseline (A) period after the experimental treatment has been removed. (Sometimes this is called a withdrawal design, in recognition of the fact that the treatment is removed or withdrawn.)
The logic of the reversal design can also be applied to behaviors observed in a single setting.
QQ. Multiple Baseline Designs
In a multiple baseline design, the effectiveness of the treatment is demonstrated when a behavior changes only after the manipulation is introduced. To demonstrate the effectiveness of the treatment, such a change must be observed under multiple circumstances to rule out the possibility that other events were responsible.
There are several variations of the multiple baseline design. In the multiple baseline across subjects, the behavior of several subjects is measured over time. In a multiple baseline across behaviors, several different behaviors of a single subject are measured over time. The third variation is the multiple baseline across situations, in which the same behavior is measured in different settings.
RR. Replications in Single-Case Designs
The procedures for use with a single subject can be replicated with other subjects, greatly enhancing the generalizability of the results. Usually, reports of research that employs single-case experimental procedures do present the results from several subjects (and often in several settings).
Single-case designs are useful for studying many research problems and should be considered a powerful alternative to more traditional research designs. They can be especially valuable for someone who is applying some change technique in a natural environment.
XXXVI. Quasi-Experimental Designs
Quasi-experimental designs address the need to study the effect of an independent variable in settings in which the control features of true experimental designs cannot be achieved.
One-Group Posttest-Only Design
The one-group posttest-only design, sometimes called a "one-shot case study," lacks a crucial element of a true experiment: a control or comparison group. With its missing comparison group, the one-group posttest-only design has serious deficiencies in the context of designing an internally valid experiment that will allow us to draw causal inferences about the effect of an independent variable on a dependent variable.
SS. One-Group Pretest-Posttest Design
One way to obtain a comparison is to measure participants before the manipulation (a pretest) and again afterward (a posttest). An index of change from the pretest to the posttest could then be computed. Although this one-group pretest-posttest design sounds fine, there are some major problems with it.
6. History
History refers to any event that occurs between the first and second measurements but is not part of the manipulation. Any such event is confounded with the manipulation. In fact, history effects can be caused by virtually any confounding event that occurs at the same time as the experimental manipulation.
7. Maturation
People change over time. In a brief period, they become bored, fatigued, perhaps wiser, and certainly hungrier; over a longer period, children become more coordinated and analytical. Any changes that occur systematically over time are called maturation effects.
8. Testing
Testing becomes a problem if simply taking the pretest changes the participant's behavior; this is the problem of testing effects.
9. Instrument Decay
Sometimes, the basic characteristics of the measuring instrument change over time; this is called instrument decay.
10. Regression Toward the Mean
Sometimes called statistical regression, regression toward the mean is likely to occur whenever participants are selected because they score extremely high or low on some variable. When they are tested again, their scores tend to change in the direction of the mean. Extremely high scores are likely to become lower (closer to the mean), and extremely low scores are likely to become higher (again, closer to the mean).
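Regression toward the mean is easy to demonstrate by simulation. In this sketch (illustrative numbers, fixed seed), people selected for extremely high first-test scores score closer to the overall mean on retest even though nothing about them has changed; only the noise in the measurement differs between tests.

```python
import random

random.seed(42)  # fixed seed for a reproducible illustration

# Each person has a stable true score; each observed score adds random error.
true = [random.gauss(100, 10) for _ in range(1000)]
test1 = [t + random.gauss(0, 10) for t in true]
test2 = [t + random.gauss(0, 10) for t in true]

# Select the 50 people with the most extreme (highest) first-test scores...
top = sorted(range(1000), key=lambda i: test1[i], reverse=True)[:50]

mean1 = sum(test1[i] for i in top) / 50
mean2 = sum(test2[i] for i in top) / 50
# ...their retest mean (mean2) falls back toward the overall mean of ~100,
# while still remaining above average (their true scores really are higher).
```

Selecting the lowest scorers instead shows the mirror image: extreme low scores drift upward on retest.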
This problem can be eliminated by the use of an appropriate control group. For the control group to be appropriate, the participants in the experimental condition and the control condition must be equivalent.
TT. Nonequivalent Control Group Design
The nonequivalent control group design employs a separate control group, but the participants in the two conditions (the experimental group and the control group) are not equivalent. The differences become a confounding variable that provides an alternative explanation for the results. This problem, called selection differences or selection bias, usually occurs when participants who form the two groups in the experiment are chosen from existing natural groups.
UU. Nonequivalent Control Group Pretest-Posttest Design
The nonequivalent control group posttest-only design can be greatly improved if a pretest is given. When this is done, a nonequivalent control group pretest-posttest design is one of the most useful quasi-experimental designs.
VV. Propensity Score Matching of Nonequivalent Treatment and Control Groups
The nonequivalent control group designs lack random assignment to conditions, and so the groups may in fact differ in important ways. Advances in statistical methods have made it possible to simultaneously match individuals on multiple variables. Instead of matching on just one variable such as health, the researcher can obtain measures of other variables thought to be important when comparing the groups. The scores on these variables are combined to produce what is called a propensity score. Individuals in the treatment and control groups can then be matched on propensity scores; this process is called propensity score matching.
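For instructors who want to make the matching step concrete, it can be sketched in a few lines of Python. The covariates, the score formula, and the one-to-one nearest-neighbor rule below are all invented for illustration; in practice, propensity scores are estimated with logistic regression over many covariates rather than a fixed weighted sum.

```python
# Toy sketch of propensity-style matching (illustrative only).

def propensity_score(person):
    # Hypothetical combination of two covariates into a single score;
    # a real analysis would estimate these weights from the data.
    return 0.5 * person["age"] + 0.5 * person["health"]

def match_controls(treated, controls):
    """Pair each treated unit with the unmatched control whose score is closest."""
    available = list(controls)
    pairs = []
    for t in treated:
        best = min(available,
                   key=lambda c: abs(propensity_score(c) - propensity_score(t)))
        available.remove(best)  # one-to-one matching: each control used once
        pairs.append((t["id"], best["id"]))
    return pairs

treated = [{"id": "T1", "age": 30, "health": 8},
           {"id": "T2", "age": 50, "health": 4}]
controls = [{"id": "C1", "age": 52, "health": 4},
            {"id": "C2", "age": 29, "health": 9},
            {"id": "C3", "age": 70, "health": 2}]

print(match_controls(treated, controls))  # [('T1', 'C2'), ('T2', 'C1')]
```

The matched control group then resembles the treatment group on the combined covariates, which is the basic logic of propensity score matching.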
WW. Interrupted Time Series Design and Control Series Design
A single before-and-after comparison is really a one-group pretest-posttest design, with all of that design's problems of internal validity, chiefly confounding events and third variables. An alternative is an interrupted time series design, which examines the variable of interest over an extended period of time, both before and after the treatment or program has been instituted. One way to improve the interrupted time series design is to find some kind of control group, creating a control series design.
XXXVII. Developmental Research Designs
Developmental psychologists often study the ways that individuals change as a function of age. In all cases, the major variable is age.
There are two general methods for studying individuals of different ages: the cross-sectional method and the longitudinal method. The cross-sectional method shares similarities with the independent groups design, whereas the longitudinal method is similar to the repeated measures design.
Cross-Sectional Method
In a study using the cross-sectional method, persons of different ages are studied at only one point in time.
XX. Longitudinal Method
In the longitudinal method, the same group of people is observed at different points in time, as they grow older.
YY. Comparison of Longitudinal and Cross-Sectional Methods
The cross-sectional method is much more common than the longitudinal method primarily because it is less expensive and immediately yields results. There are, however, some disadvantages to cross-sectional designs. Most important, the researcher must infer that differences among age groups are due to the developmental variable of age. One can think of a cohort as a group of people born at about the same time, exposed to the same events in a society, and influenced by the same demographic trends such as divorce rates and family size. In a cross-sectional study, a difference among groups of different ages may reflect developmental age changes; however, the differences may result from cohort effects.
ZZ. Sequential Method
A compromise between the longitudinal and cross-sectional methods is to use the sequential method.
The first phase of the sequential method begins with the cross-sectional method; for example, you could study groups of 55- and 65-year-olds. These individuals are then studied using the longitudinal method with each individual tested at least one more time.
Engaging With Research: Developmental Research
After reading the article, answer the following questions (which will be familiar to you from earlier in this chapter!). NOTE: Student answers will vary.
36. What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
37. What did these researchers do? What was the method?
AA. Where, how, and by whom were the data collected?
38. What was measured?
How was each of the five outcome variables measured?
39. To what or whom can we generalize the results?
40. What did they find? What were the results?
41. Have other researchers found similar results?
How do these results compare with or contrast with studies that have been conducted in India? What about studies that have been conducted in the United States?
42. What are the limitations of this study?
43. What are the ethical issues present in this study?
Sample Answers for Review Questions
What is a reversal design? Why is an ABAB design superior to an ABA design?
A basic reversal design (ABA design) requires observation of behavior during the baseline control (A) period, again during the treatment (B) period, and also during a second baseline (A) period after the experimental treatment has been removed. An ABAB design is superior because it introduces the experimental treatment a second time, increasing the evidence for the effectiveness of the treatment.
44. What is meant by baseline in a single-case design?
Baseline is a type of control period in which a subject’s behavior is measured over time.
45. What is a multiple baseline design? Why is it used? Distinguish between multiple baseline designs across subjects, across behaviors, and across situations.
In a multiple baseline design, the effectiveness of the treatment is demonstrated when a behavior changes only after the manipulation is introduced. To demonstrate the effectiveness of the treatment, such a change must be observed under multiple circumstances to rule out the possibility that other events were responsible. In the multiple baseline across subjects, the behavior of several subjects is measured over time; for each subject, the manipulation is introduced at a different point in time. In a multiple baseline across behaviors, several different behaviors of a single subject are measured over time. At different times, the same manipulation is applied to each of the behaviors. In the multiple baseline across situations, the same behavior is measured in different settings, such as at home and at work.
46. Why might a researcher use a quasi-experimental design rather than a true experimental design?
Quasi-experimental designs address the need to study the effect of an independent variable in settings in which the control features of true experimental designs cannot be achieved.
47. Why does the use of a control group eliminate the problems associated with the one-group pretest-posttest design?
A control group that does not receive the experimental treatment provides an adequate control for the effects of history, statistical regression, and so on.
48. Describe the threats to internal validity discussed in the text: history, maturation, testing, instrument decay, regression toward the mean, and selection differences.
In a one-group pretest-posttest design, internal validity can be threatened by alternative explanations, including history, maturation, testing, instrument decay, and regression toward the mean.
History refers to any event that occurs between the first and second measurements but is not part of the manipulation. Any such event is confounded with the manipulation.
People change over time. Any changes that occur systematically over time are called maturation effects.
Testing becomes a problem if simply taking the pretest changes the participant's behavior.
Sometimes, the basic characteristics of the measuring instrument change over time; this is called instrument decay.
Sometimes called statistical regression, regression toward the mean is likely to occur whenever participants are selected because they score extremely high or low on some variable. When they are tested again, their scores tend to change in the direction of the mean.
Selection differences occur when the experimental group and the control group are not equivalent. The differences become a confounding variable that provides an alternative explanation for the results.
49. Describe the nonequivalent control group pretest-posttest design. Why is this a quasi-experimental design rather than a true experiment?
This design employs a separate control group, but the participants in the two conditions (the experimental group and the control group) are not equivalent. It is not a true experimental design because assignment to groups is not random; the two groups may not be equivalent.
50. Describe the interrupted time series and the control series designs. What are the strengths of the control series design as compared with the interrupted time series design?
The interrupted time series design examines variables over an extended period of time. To improve this design, researchers can add some kind of control group, creating a control series design, which increases the validity of the conclusions.
51. Distinguish between longitudinal, cross-sectional, and sequential methods.
In a study using the cross-sectional method, persons of different ages are studied at only one point in time. In the longitudinal method, the same group of people is observed at different points in time, as they grow older. A compromise between the longitudinal and cross-sectional methods is to use the sequential method. The cross-sectional method is less expensive than the longitudinal method and yields immediate results. The sequential method takes less time than the longitudinal study, and the researcher has available data sooner.
52. What is a cohort effect?
A cohort is a group of people born at about the same time, exposed to the same events in a society, and influenced by the same demographic trends. A cohort effect occurs when differences among age groups reflect these shared experiences rather than developmental change.
Sample Answers for Being a Skilled Consumer of Research
Go to CLOSER (https://www.closer.ac.uk), a research center housed in the University College London's Social Research Institute and explore their datasets. From their homepage, click on Search Our Data and then Explore (on the right side). Then, choose a topic of interest (e.g., Life Events, Mental Health, Family & Social Networks), and explore answers to survey questions. Or, from the homepage, click on Search Our Data and then CLOSER Discovery and pick a study (e.g., The Millennium Cohort Study) and find out about the cohort, or the methodologies used to collect the data. Keep the key-questions for being a skilled consumer of research in mind as you explore their data:
What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
AAA. What did these researchers do? What was the method?
BBB. What was measured?
CCC. To what or whom can we generalize the results?
DDD. What did they find? What were the results?
EEE. Have other researchers found similar results?
FFF. What are the limitations of this study?
GGG. What are the ethical issues present in this study?
Students’ answers will vary based on their choices.
53. You leave your dog alone at home while you are at work. When you are away, your dog engages in destructive activities like chewing on your shoes, pulling down curtains or strewing wastebasket contents all over the floor. You decide that playing a radio while you are gone might help. How might you determine whether this "treatment" is effective?
A possible method to assess whether playing the radio is an effective treatment would be to employ an ABA design. Destructive behavior could be measured before the radio is introduced, while the radio is played, and again after the radio is removed. If destructive behavior decreases when the radio is played, and when removed, increases back to a level equivalent to when there was no radio, one may conclude that playing the radio is an effective treatment for reducing destructive behavior. Students’ answers will vary.
54. Your best friend frequently suffers from severe headaches. You have noticed that your friend consumes a great deal of diet cola, and so you consider the hypothesis that the artificial sweetener in the cola is responsible for the headaches. Devise a way to test your hypothesis using a single-case design. What do you expect to find if your hypothesis is correct? If you obtain the expected results, what do you conclude about the effect of the artificial sweetener on headaches?
Employing an ABA design, the frequency of headaches could be measured before consumption of diet cola is removed, after it is removed, and again when consumption is reintroduced. If the hypothesis is correct, headaches would decrease when the diet cola is absent and increase when it is present. If one’s friend does not have headaches while they are drinking regular cola, then one might conclude that the headaches were due to the consumption of the artificial sweetener.
55. Gilovich (1991) described an incident he had read about during a visit to Israel. A very large number of deaths had occurred during a brief time period in one region of the country. A group of rabbis attributed the deaths to a recent change in religious practice that allowed women to attend funerals. Women were immediately forbidden to attend funerals in that region, and the number of deaths subsequently decreased. How would you explain this
phenomenon?
A good answer will include a discussion of regression toward the mean. A large number of deaths during a short period of time in a region of the country may be an unusual occurrence. At a subsequent second measurement period, the number of deaths may be quite different because the first measure was so large. Thus, the decrease in number of deaths may not be due to a change in religious practice but actually a change toward the average number of deaths that occurs in the region. In other words, the change is toward the direction of the mean number of deaths.
56. The captain of each precinct of a metropolitan police department selected two officers to participate in a program designed to reduce prejudice by increasing sensitivity to racial and ethnic group differences and community issues. The training program took place every Friday morning for 3 months. At the first and last meetings, the officers completed a measure of prejudice. To assess the effectiveness of the program, the average prejudice score at the first meeting was compared with the average score at the last meeting; it was found that the average score was in fact lower (indicating less prejudice) following the training program. What type of design is this? What specific problems arise if you try to conclude that the training program was responsible for the reduction in prejudice?
This design is an example of a one-group pretest-posttest design. A good answer will address the problems of history, statistical regression, testing effects, and mortality as possible explanations for the change in prejudice scores over time. In addition, because participants went through 3 months of training, they may have become sensitized to the desired effect of the treatment. Students' answers will vary.
57. Many elementary schools have implemented a daily "sustained silent reading" period during which students, faculty, and staff spend 15–20 minutes silently reading a book of their choice. Advocates of this policy claim that the activity encourages pleasure reading outside the required silent reading time. Design a nonequivalent control group pretest-posttest quasi-experiment to test this claim. Include a well-reasoned dependent measure as well. Discuss the advantages and disadvantages of using a quasi-experimental design in contrast to conducting a true experiment.
A possible method would be to compare two similar classrooms from an elementary school. The experimental group will have 20 minutes of sustained silent reading in school and the control group will have no silent reading. The pretest for both groups could be a questionnaire asking the students how much time they spend reading outside of class (verification by the parents could also be obtained). After 1 month of silent reading, all of the students could again be asked to indicate how much time they had spent reading outside of
class. A good answer should include a discussion regarding the internal versus external validity of the different research designs.
Laboratory Demonstration: Regression Toward the Mean
The concept of statistical regression toward the mean can become clearer through a concrete demonstration. Create a "population" of 200 poker chips or slips of paper such that each chip or paper has a single score written on it. (An approximately normal distribution of 200 scores is presented in this section of the instructor's manual for Chapter 1.) Have each student draw a single chip or piece of paper, note the score, and then return it to the population. After everyone has drawn a score, identify the five students with the highest scores and the five students with the lowest scores. Write their scores on the board and compute the two group means.
Next, have each of these students (those with the lowest and highest scores) draw a new chip or paper from the population and record the new score next to the old one. Compute the means for these sets of scores. The probability is high that those students whose original scores were high will draw lower scores than before, while those whose original scores were low will draw higher scores than before. This may not be the case for each individual, but the second mean for each group will most probably be closer to the population mean than was the first mean. Thus, regression toward the mean should occur for both the lowest-scoring group and the highest-scoring group.
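The same demonstration can be run as a quick Python simulation. The population and group sizes below are arbitrary choices for illustration, not values from the text; the point is only that the extreme group's second mean falls back toward the population mean.

```python
import random

random.seed(42)  # fixed seed so the demonstration is repeatable

# A "population" of 200 chips: scores 1-10, 20 chips of each (mean = 5.5).
population = [score for score in range(1, 11) for _ in range(20)]
pop_mean = sum(population) / len(population)

# Everyone draws once (with replacement); keep the 100 highest draws.
first_draws = [random.choice(population) for _ in range(2000)]
high_group = sorted(first_draws)[-100:]
first_high_mean = sum(high_group) / len(high_group)

# The high scorers draw again; their second draws are just new random chips.
second_draws = [random.choice(population) for _ in high_group]
second_high_mean = sum(second_draws) / len(second_draws)

print(first_high_mean, second_high_mean, pop_mean)
```

The second mean regresses toward the population mean because the extreme first draws reflected chance, not any stable property of the students who drew them.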
Laboratory Demonstration: Effect of a Sign on Behavior
Can a sign change behavior? First choose a behavior to target. Examples include littering near some vending machines on campus or talking in a location near a classroom. Have students record the behavior during a baseline period. At a randomly determined time, post one or more signs such as "Please don't litter" or "No talking in this area: Experiment in progress." Continue recording the behavior while the sign is up. At a predetermined time, remove the sign while continuing to record the behavior.
This ABA design can illustrate the principles of single-subject designs. Also, one can contrast the experimental approach taken here with an interrupted time series quasi-experimental design.
Activity: Self-Directed Behavior Change
Students can apply reinforcement principles to change their own behavior using an ABA design. They can select a behavior, measure it during baseline, apply a reinforcement to change the behavior, and then remove the reinforcement.
Additional Discussion Topics
Discussion: Developmental Designs
Remind students that developmental researchers look for changes over time as well as differences between ages. The research question determines which type of design one selects. Money is also an issue. Remind students that the Terman study began in 1921 and will continue until all participants are deceased. That commitment involves not only dedicating one's entire career to a study but also considerable fiscal resources.
Discussion: ABA Designs
Students may question the relevance of some of these designs because they seem too complex on the surface. A good example of the ABA design that may be more appealing to some students is a study on the efficacy of treatment in abnormal psychology. Suppose one wants to determine the best course of treatment for anxiety disorders. One can collect data on the anxiety levels of a group of subjects as baseline data (A, the baseline period), then give subjects a variety of treatments: placebo, cognitive behavioral, or benzodiazepines (B, the treatment period), and then measure the level of anxiety 6 months later (A, a second baseline period). Ask students to generate other examples of when this design would be useful.
Suggested Readings
Article in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Carr, J. E., & Austin, J. (1997). A classroom demonstration of single-subject research designs. Teaching of Psychology, 24, 188–190.
Also recommended:
Treadwell, K. R. H. (2008). Demonstrating experimenter "ineptitude" as a means of teaching internal and external validity. Teaching of Psychology, 35, 184–188.
Chapter 12: Understanding Research Results:
Description and Correlation
Learning Objectives
Compare and contrast the three ways of describing results: comparing group percentages, correlating scores, and comparing group means.
Describe a frequency distribution, including the various ways to display a frequency distribution.
Compare and contrast the three measures of central tendency.
Describe how to determine how much variability exists in a set of scores.
Define what a correlation coefficient is.
Explain what an effect size is.
Describe how researchers use regression equations to predict behavior.
Distinguish between mediation and moderation as ways to explore more complex relationships among variables.
Explain the purpose of more advanced statistical techniques like structural equation modeling.
Brief Chapter Outline
XXIV. Scales of Measurement: A Review
XXV. Describing Results
HHH. Comparing Group Percentages
III. Correlating Individual Scores
JJJ. Comparing Group Means
XXVI. Frequency Distributions
Graphing Frequency Distributions
b. Pie Charts
c. Bar Graphs
d. Frequency Polygons
e. Histograms
XXVII. Descriptive Statistics
Central Tendency
KKK. Variability
XXVIII. Graphing Relationships
XXIX. Correlation Coefficients: Describing the Strength of Relationships
Pearson r Correlation Coefficient
f. Scatterplots
LLL. Important Considerations
g. Restriction of Range
h. Curvilinear Relationship
XXX. Effect Size
XXXI. Regression Equations
XXXII. Multiple Correlation and Regression
XXXIII. Mediating and Moderating Variables
Mediation
MMM. Moderation
NNN. Third Variables
XXXIV. Advanced Statistical Analysis
Extended Chapter Outline
Please note that much of this information is quoted from the text.
XXXVIII. Scales of Measurement: A Review
Whenever a variable is studied, the researcher must create an operational definition of the variable and devise two or more levels of the variable. The levels of the variable can be described using one of four scales of measurement: nominal, ordinal, interval, and ratio. The scale used determines the types of statistics that are appropriate when analyzing data.
The levels of nominal scale variables have no numerical, quantitative properties. The levels are simply different categories or groups. Most independent variables in experiments are nominal. Variables with ordinal scale levels exhibit minimal quantitative distinctions. One can rank order the levels of the variable being studied from lowest to highest.
Interval scale and ratio scale variables have much more detailed quantitative properties. With an interval scale variable, the intervals between the levels are equal in size.
Ratio scale variables have both equal intervals and an absolute zero point that indicates the absence of the variable being measured. Time, weight, length, and other physical measures are the best examples of ratio scales. Interval and ratio scale variables are conceptually different; however, the statistical procedures used to analyze data with such variables are identical.
XXXIX. Describing Results
Depending on the way that the variables are studied, there are three basic ways of describing the results: (1) comparing group percentages, (2) correlating scores of individuals on two variables,
and (3) comparing group means.
Comparing Group Percentages
This first way of describing results is useful when a key variable is measured on a nominal scale. Here, the focus is on percentages because the variable is nominal: for example, one either has, or has not, used public transportation in the past month. After describing the data, the next step would be to perform a statistical analysis to determine whether there is a statistically significant difference between the two groups.
OOO. Correlating Individual Scores
A second type of analysis is needed when one does not have distinct groups of subjects. Instead, individuals are measured on two variables, and each variable has a range of numerical values.
PPP. Comparing Group Means
Much research is designed to compare the mean responses of participants in two or more groups.
XL. Frequency Distributions
A frequency distribution indicates the number of individuals who receive each possible score on a variable.
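As a quick illustration, a frequency distribution can be tallied with Python's `collections.Counter`; the quiz scores below are hypothetical.

```python
from collections import Counter

# Hypothetical quiz scores for a class of 15 students.
scores = [7, 8, 8, 9, 7, 10, 8, 6, 9, 8, 7, 10, 8, 9, 7]

# Counter maps each possible score to the number of students who received it.
frequency = Counter(scores)
for score in sorted(frequency):
    print(score, frequency[score])
```

The printed table (score, count) is exactly the frequency distribution described above, and it is the raw material for the pie charts, bar graphs, frequency polygons, and histograms discussed next.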
Graphing Frequency Distributions
i. Pie Charts
Pie charts divide a whole circle, or "pie," into "slices" that represent relative percentages.
j. Bar Graphs
Bar graphs use a separate and distinct bar for each piece of information.
k. Frequency Polygons
Frequency polygons use a line to represent the distribution of frequencies of scores. This
is most useful when the data represent interval or ratio scales.
l. Histograms
A histogram uses bars to display a frequency distribution for a quantitative variable. In this case, the scale values are continuous and show increasing amounts on a variable such as age, blood pressure, or stress.
XLI. Descriptive Statistics
Descriptive statistics allow researchers to make precise statements about the data. Two statistics are needed to describe the data. A single number can be used to describe the central tendency, or how participants scored overall. Another number describes the variability, or how widely the distribution of scores is spread. These two numbers summarize the information contained in a frequency distribution.
Central Tendency
A central tendency statistic tells us what the sample as a whole, or on the average, is like. There are three measures of central tendency: the mean, the median, and the mode.
The mean of a set of scores is obtained by adding all the scores and dividing by the number of scores.
The median is the score that divides the group in half (with 50% scoring below and 50% scoring above the median). In scientific reports, the median is abbreviated as Mdn.
The mode is the most frequent score. The mode is the only measure of central tendency that is appropriate if a nominal scale is used. The mode does not use the actual values on the scale, but simply indicates the most frequently occurring value.
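Python's standard `statistics` module computes all three measures directly; the five scores below are hypothetical.

```python
import statistics

scores = [2, 3, 3, 5, 7]  # hypothetical set of five scores

mean_score = statistics.mean(scores)      # (2+3+3+5+7) / 5 = 4
median_score = statistics.median(scores)  # middle score of the sorted set = 3
mode_score = statistics.mode(scores)      # most frequent score = 3

print(mean_score, median_score, mode_score)
```

Note that the mean (4) differs from the median and mode (both 3) because the high score of 7 pulls the mean upward, a useful point for class discussion of skewed distributions.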
QQQ. Variability
We can also determine how much variability exists in a set of scores. A measure of variability is a number that characterizes the amount of spread in a distribution of scores. One such measure is the standard deviation, symbolized as s, which indicates the average deviation of scores from the mean.
In scientific reports, it is abbreviated as SD. The standard deviation is derived by first calculating the variance, symbolized as s² (the standard deviation is the square root of the variance).
Another measure of variability is the range, which is simply the difference between the highest score and the lowest score.
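These measures of variability can also be computed with the standard `statistics` module; the scores below are hypothetical.

```python
import statistics

scores = [1, 2, 3, 4, 5]  # hypothetical scores

s_squared = statistics.variance(scores)  # sample variance s^2 = 2.5
s = statistics.stdev(scores)             # standard deviation = square root of variance
score_range = max(scores) - min(scores)  # highest minus lowest = 4

print(s_squared, s, score_range)
```

Squaring the standard deviation recovers the variance, which mirrors the relationship between s and s² described above.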
XLII.Graphing Relationships
A common way to graph relationships between variables is to use a bar graph or a line graph. The levels of the independent variable (no-model and model) are represented on the horizontal x axis, and the dependent variable values are shown on the vertical y axis.
XLIII. Correlation Coefficients: Describing the Strength of Relationships
A correlation coefficient is a statistic that describes how strongly variables are related to one another. Probably most people are familiar with the Pearson product-moment correlation coefficient, which is used when both variables have interval or ratio scale properties. The Pearson product-moment correlation coefficient is called the Pearson r.
Pearson r Correlation Coefficient
The Pearson r provides two types of information about the relationship between the variables: the strength of the relationship and its direction. Values of a Pearson r can range from 0.00 to ±1.00. A correlation of 0.00 indicates that there is no relationship between the variables. The nearer a correlation is to 1.00 (plus or minus), the stronger is the relationship.
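The computation behind the Pearson r can be written out directly in a few lines of Python; the paired scores below are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for paired interval/ratio scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the means...
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # ...divided by the product of the deviation magnitudes for each variable.
    sum_sq_x = sum((a - mean_x) ** 2 for a in x)
    sum_sq_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(sum_sq_x * sum_sq_y)

hours_studied = [1, 2, 3, 4, 5]     # hypothetical predictor scores
exam_scores = [55, 62, 70, 74, 84]  # hypothetical paired scores

r = pearson_r(hours_studied, exam_scores)
print(round(r, 3))  # a strong positive correlation, near +1.00
```

Because higher hours pair with higher scores, the deviations multiply to a large positive sum and r comes out close to +1.00; reversing the direction of one variable would flip the sign of r without changing its magnitude.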
m. Scatterplots
In a scatterplot, each pair of scores is plotted as a single point in a diagram. The values of the first variable are depicted on the x-axis and the values of the second variable are shown on the y-axis.
RRR. Important Considerations
n. Restriction of Range
The problem of restriction of range occurs when the individuals in your sample are very similar on the variable you are studying. If one is studying age as a variable, for instance, testing only 6- and 7-year-olds will reduce one’s chances of finding age effects.
o. Curvilinear Relationship
The Pearson product-moment correlation coefficient (r) is designed to detect only linear
relationships. If the relationship is curvilinear, the correlation coefficient will not indicate the existence of a relationship.
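Both considerations are easy to demonstrate numerically. In the sketch below (all data invented), restricting a strong linear relationship to a narrow slice of the x values lowers r, and a perfectly U-shaped relationship yields an r of zero even though the variables are clearly related.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mean_x) ** 2 for a in x) *
                      sum((b - mean_y) ** 2 for b in y))

# Restriction of range: a strong linear trend with small alternating noise.
x_full = list(range(1, 21))
y_full = [xi + (2 if xi % 2 else -2) for xi in x_full]
r_full = pearson_r(x_full, y_full)

# Keeping only the middle of the range leaves the noise dominant, so r drops.
middle = [(xi, yi) for xi, yi in zip(x_full, y_full) if 8 <= xi <= 13]
r_restricted = pearson_r([p[0] for p in middle], [p[1] for p in middle])

# Curvilinear relationship: a perfect U-shape, yet r is 0.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [xi ** 2 for xi in x_curve]
r_curve = pearson_r(x_curve, y_curve)

print(round(r_full, 2), round(r_restricted, 2), round(r_curve, 2))
```

The U-shaped case is the clearest warning: a scatterplot would reveal the relationship instantly, which is why inspecting scatterplots before trusting r is standard advice.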
XLIV. Effect Size
Effect size refers to the strength of association between variables. The Pearson r correlation coefficient is one indicator of effect size; it indicates the strength of the linear association between two variables.
XLV. Regression Equations
Regression equations are calculations used to predict a person's score on one variable when that person's score on another variable is already known. They are essentially "prediction equations" that are based on known information about the relationship between the two variables.
When researchers are interested in predicting some future behavior (called the criterion variable) on the basis of a person’s score on some other variable (called the predictor variable), it is first necessary to demonstrate that there is a reasonably high correlation between the criterion and predictor variables. The regression equation then provides the method for making predictions on the basis of the predictor variable score only.
XLVI. Multiple Correlation and Regression
A technique called multiple regression is used to analyze the relationship between a criterion variable and more than one predictor variable.
A multiple correlation (symbolized as R to distinguish it from the simple r) is the correlation between a combined set of two or more predictor variables and a single criterion variable.
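For the two-predictor case, there is a standard formula that builds R out of the three pairwise Pearson correlations. In the sketch below (invented data, with the criterion constructed as the exact sum of the two predictors), R comes out at 1.0 even though neither predictor alone correlates perfectly with the criterion.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mean_x) ** 2 for a in x) *
                      sum((b - mean_y) ** 2 for b in y))

x1 = [1, 2, 3, 4, 5]                 # first predictor (hypothetical)
x2 = [2, 1, 4, 3, 5]                 # second predictor (hypothetical)
y = [a + b for a, b in zip(x1, x2)]  # criterion = exact sum of the predictors

r_y1 = pearson_r(y, x1)  # criterion with predictor 1
r_y2 = pearson_r(y, x2)  # criterion with predictor 2
r_12 = pearson_r(x1, x2)  # the two predictors with each other

# Multiple correlation R for two predictors.
R = sqrt((r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2))
print(round(r_y1, 3), round(r_y2, 3), round(R, 3))
```

Each simple r is below 1.00, yet the combined set of predictors accounts for the criterion completely, which is exactly the point of multiple correlation.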
XLVII. Mediating and Moderating Variables
Mediation
The word mediation means that something is coming between two forces, as in coming between two sides in a conflict. In research, a mediating variable is hypothesized to be intervening between variable X and variable Y.
In a mediation model, the independent or predictor variable affects a mediating variable. The mediating variable then affects the dependent or criterion variable.
SSS. Moderation
Moderation implies restraint or change. In research, a moderating variable changes or limits the relationship between variable X and variable Y. Specifically, the relationship depends on the level of the moderator variable M. At one level of M, there may be a positive relationship between X and Y; at the other level of M, there may be no relationship between the X variable and the Y variable. That statement describes an interaction effect.
The two terms, moderation and interaction, developed from different research traditions, but they mean essentially the same thing when interpreting research findings.
TTT. Third Variables
Researchers face the third-variable problem in nonexperimental research when some uncontrolled third variable may be responsible for the relationship between the two variables of interest. When experimental research is properly designed, there is no third-variable problem because all extraneous variables are controlled, either by keeping the variables constant or by using randomization.
Multiple regression can be used to statistically control for the effects of third variables.
XLVIII. Advanced Statistical Analyses
Structural equation modeling (SEM) is a family of related statistical analysis techniques that includes (among others) confirmatory factor analysis, path analysis, and latent growth modeling, in which researchers examine models that specify a set of relationships among variables.
A model is an expected pattern of relationships among a set of variables. The proposed model is based on a theory of how the variables are causally related to one another. After data have been collected, statistical methods can be applied to examine how closely the proposed model actually "fits" the obtained data.
Researchers typically present diagrams to visually represent the models being tested. Such diagrams show the theoretical causal paths among the variables. Besides illustrating how variables are related, a final application of SEM can be to evaluate how closely the obtained data fit the specified model.
Sample Answers for Review Questions
58. What is the difference between the three ways of describing results: comparing percentages, comparing means, and correlating scores?
Comparing percentages is used when there are distinct groups of participants and the variable of interest is nominal (e.g., yes/no responses). Comparing group means involves comparing the responses of two or more groups on a variable that has a range of numerical values. When there are no distinct groups of subjects, individuals are measured on two variables, and each variable has a range of numerical values. This is known as correlating individual scores.
59. What is a frequency distribution?
A frequency distribution indicates the number of individuals who receive each possible score on a variable.
60. Distinguish between a pie chart, bar graph, frequency polygon, and histogram. Construct one of each.
Pie charts divide a whole circle, or "pie," into "slices" that represent relative percentages. Bar graphs use a separate and distinct bar for each piece of information. Frequency polygons use a line to represent the distribution of frequencies of scores. Histograms use bars as well, but the bars touch to represent a continuous variable. Student examples will vary.
61. What is a measure of central tendency? Distinguish between the mean, median, and mode.
A central tendency statistic tells us what the sample as a whole, or on the average, is like. The mean of a set of scores is obtained by adding all the scores and dividing by the number of scores. The median is the score that divides the group in half (with 50% scoring below and 50% scoring above the median). The mode is the most frequent score. It is the only measure of central tendency that is appropriate if a nominal scale is used.
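These three measures can be computed directly with Python's standard statistics module; the scores below are invented for illustration:

```python
# A quick check of the three central tendency measures using
# Python's standard library (scores are made up for illustration).
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 7, 8]

print(statistics.mean(scores))    # (2+3+3+4+5+5+5+7+8)/9 = 42/9
print(statistics.median(scores))  # middle (5th) of the 9 sorted scores
print(statistics.mode(scores))    # most frequent score
```

Because 5 occurs three times, the mode is 5, and with nine sorted scores the median is also 5.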
62. What is a measure of variability? Distinguish between the standard deviation and the range.
A measure of variability is a number that characterizes the amount of spread in a distribution of scores. The standard deviation indicates the average deviation of scores from the mean. The range is the difference between the highest score and the lowest score.
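Both measures are one-liners in the same statistics module (again with invented scores; note that pstdev divides by n, treating the list as the whole population, while stdev would use the sample n - 1 formula):

```python
# Standard deviation and range for a small invented score list.
import statistics

scores = [4, 6, 6, 8, 10, 14]

rng = max(scores) - min(scores)    # range = highest score - lowest score
sd = statistics.pstdev(scores)     # population standard deviation
print(rng, round(sd, 2))           # range 10, SD about 3.27
```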
63. What is a correlation coefficient? What do the size and sign of the correlation coefficient tell us about the relationship between variables?
A correlation coefficient is a statistic that describes how strongly variables are related to one another. The nearer a correlation is to 1.00 (plus or minus), the stronger is the relationship. The sign of the coefficient tells us the direction of the relationship: a positive correlation means the variables increase together; a negative correlation means that as one variable increases, the other decreases.
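A sketch of the definitional formula in plain Python, using a small made-up data set, shows how the coefficient is built from deviations around each mean:

```python
# Pearson r from its definitional formula, in plain Python.
# The data pairs are invented (e.g., hours of study and quiz score).
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
sx = math.sqrt(sum((a - mx) ** 2 for a in x))
sy = math.sqrt(sum((b - my) ** 2 for b in y))
r = cov / (sx * sy)
print(round(r, 3))   # -> 0.775, a fairly strong positive correlation
```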
64. What is a scatterplot?
A scatterplot shows data in which each pair of scores is plotted as a single point in a diagram.
65. What happens when a scatterplot shows the relationship to be curvilinear?
If the relationship is curvilinear, the correlation coefficient will not indicate the existence of a relationship.
66. What is a regression equation? How might an employer use a regression equation?
Regression equations are calculations used to predict a person's score on one variable when that person's score on another variable is already known. Student answers about how an employer might use a regression equation may vary and should reflect an accurate understanding of the practical uses of regression equations.
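As a hypothetical illustration of the employer scenario, the least-squares slope and intercept can be computed from past data and then used to predict a new applicant's criterion score (all numbers are invented):

```python
# Least-squares regression Y' = a + bX, then a prediction.
# Hypothetical data: aptitude test score (X) predicting
# first-year sales (Y, in thousands).
x = [10, 20, 30, 40, 50]
y = [25, 30, 45, 50, 60]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
a = my - b * mx                     # intercept

def predict(score):
    return a + b * score

print(b, a, predict(35))            # slope 0.9, intercept 15, prediction 46.5
```

An employer could use the equation the same way: estimate a and b from current employees, then predict a job applicant's likely performance from the applicant's test score.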
67. How does multiple correlation increase accuracy of prediction?
A multiple correlation is the correlation between a combined set of predictor variables and a single criterion variable. Taking all the predictor variables into account usually permits greater accuracy of prediction than if any single predictor is considered alone.
68. Provide an example of a mediation hypothesis.
In a mediation model, the independent or predictor variable affects a mediating variable. The mediating variable then affects the dependent or criterion variable. As an example, you might propose that adolescents who are bullied (X) subsequently experience depression (M), which in turn is related to misuse of alcohol and marijuana (Y) (Luk et al., 2010).
69. Describe how a variable might moderate the relationship between two other variables.
A moderating variable changes or limits the relationship between two other variables, X and Y. The relationship depends on the level of the moderator variable M: at one level of M, there may be a positive relationship between X and Y; at another level of M, there may be no relationship at all. For example, the relationship between stress (X) and illness (Y) might be strong for people with little social support but weak for people with a great deal of social support; social support moderates the stress–illness relationship.
70. What is the purpose of more advanced statistical techniques like structural equation modeling?
Structural equation modeling (SEM) is a family of related statistical analysis techniques that includes (among others) confirmatory factor analysis, path analysis, and latent growth modeling, in which researchers examine models that specify a set of relationships among variables. A model is an expected pattern of relationships among a set of variables. The proposed model is based on a theory of how the variables are causally related to one another. After data have been collected, statistical methods can be applied to examine how closely the proposed model actually "fits" the obtained data.
71. When a path diagram is shown, what information is conveyed by the arrows leading from one variable to another?
In the diagram, arrows leading from one variable to another depict the paths that relate the variables in the model. The arrows indicate a proposed causal sequence.
Sample Answers for Being a Skilled Consumer of Research
Ask 20 students on campus how many units (credits) they are taking, as well as how many hours per week they work in paid employment. Create a frequency distribution and find the mean for each data set. Construct a scatterplot showing the relationship between class load and hours per week employed. Does there appear to be a relationship between the variables? (Note: There might be a restriction of range problem on your campus because few students work or most students take about the same number of units. If so, ask different questions, such as the number of hours spent studying and watching videos each week.)
Students' answers will vary based on the data they gather.
72. Think of three variables that use a nominal scale, three variables that use an ordinal scale, three variable that use an interval scale, and three variables that use a ratio scale. What would be the best way to describe each of your examples graphically?
Student answers will vary and should reflect an accurate understanding of nominal, ordinal, interval, and ratio scales.
Laboratory Demonstration: Correlation Coefficients
Students can provide their own data for a demonstration of correlation of scores. Collect two scores from each student on variables that seem intuitively related (e.g., degree of loneliness and level of self-esteem, or amount of study time per week and satisfaction with college). Once the data are collected, a student can read the pairs of scores while they construct a scatterplot on the board. A correlation coefficient can then be calculated. If time and class size permit, students could also calculate a regression equation and draw subsets of the data to show the influence of sample size or restriction of range.
Activity: Describing and Visualizing Correlation Coefficients
Students can practice describing correlation coefficients using Handout 6 in this manual. The information and Internet links on correlations in Chapter 12 of the text’s website at http://methods.fullerton.edu can be helpful.
Activity: Identifying Descriptive Statistics
Descriptive statistics can be illustrated by having students examine their own behavior. Instructors might have students indicate the number of hours of sleep they had the previous night, the number of hours they spent studying per week, the number of miles they travel to school, and the time in minutes it takes to get to school. Measures of central tendency and variability can be calculated based on these scores. In addition, the scores can be used to illustrate different methods used in graphing data.
Additional Discussion Topics
Discussion: Interpreting Correlations
To help illustrate the way correlations are interpreted, provide students with the following examples and ask how the correlation could be interpreted:
A correlation between wearing name tags and job satisfaction is .64.
A correlation between number of years married and satisfaction with significant other is a .48.
A correlation between eating sugars and being hyperactive is a .76.
A correlation between class absences and grade in course is .69.
A correlation between caloric intake and weight is .91
A correlation between number of police on the street and violent crime is .52.
Discussion: Graphing
If the class has a computer, pull up an available data set and demonstrate the different graphs that
can be used. Ask students if certain forms of representation seem to work better than others. How can the number of variables affect what form of representation one selects?
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Goldstein, M. D., & Strube, M. J. (1995). Understanding correlations: Two computer exercises. Teaching of Psychology, 22, 205–206.
Huck, S. W., Wright, S. P., & Park, S. (1992). Pearson’s r and spread: A classroom demonstration. Teaching of Psychology, 19, 45–47.
Pittenger, D. J. (1995). Teaching students about graphs. Teaching of Psychology, 22, 125–128.
Also recommended:
Bragger, J. D., & Freeman, M. A. (1999). Using a cost-benefit analysis to teach ethics and statistics. Teaching of Psychology, 26, 34–36.
Connor, J. M. (2003). Making statistics come alive: Using space and students’ bodies to illustrate statistical concepts. Teaching of Psychology, 30, 141–143.
Peden, B. F. (2001). Correlational analysis and interpretation: Graphs prevent gaffes. Teaching of Psychology, 28, 129–131.
Shatz, M. A. (1985). The Greyhound strike: Using a labor dispute to teach descriptive statistics. Teaching of Psychology, 12, 85–86.
Chapter 13: Understanding Research Results: Statistical Inference
Learning Objectives
Explain how researchers use inferential statistics to evaluate sample data.
Distinguish between the null hypothesis and the research hypothesis.
Discuss probability in statistical inference, including the meaning of statistical significance.
Describe the t test and explain the difference between one-tailed and two-tailed tests.
Describe the F test, including systematic variance and error variance.
Describe what a confidence interval tells you about your data.
Distinguish between Type I and Type II errors and discuss the factors that influence the probability of a Type II error.
Discuss the reasons a researcher might obtain nonsignificant results.
Define power of a statistical test and describe how power influences research.
Demonstrate skills in selecting an appropriate statistical test.
Brief Chapter Outline
XXXV. Samples and Populations
XXXVI. Inferential Statistics
XXXVII. Null and Research Hypotheses
XXXVIII. Probability and Sampling Distributions
J. Probability: The Case of ESP
K. Sampling Distributions
L. Sample Size
XXXIX. Group Differences: The t and F Tests
t Test
M. Degrees of Freedom
N. One-Tailed Versus Two-Tailed Tests
O. F Test
P. Calculating Effect Size
Q. Confidence Intervals and Statistical Significance
R. Statistical Significance: An Overview
XL. Type I and Type II Errors
Correct Decisions
S. Type I Errors
T. Type II Errors
U. The Everyday Context of Type I and Type II Errors
XLI. Choosing a Significance Level
XLII. Interpreting Nonsignificant Results
XLIII. Choosing a Sample Size: Power Analysis
XLIV. The Importance of Replications
XLV. Significance of a Pearson r Correlation Coefficient
XLVI. Statistical Analysis Software
XLVII. Selecting the Appropriate Statistical Test
Research Studying Two Variables (Bivariate Research)
V. Research With Multiple Independent Variables
Extended Chapter Outline
Please note that much of this information is quoted from the text.
XIX. Samples and Populations
Inferential statistics are used to determine whether the results match what would happen if we were to conduct the experiment again and again with multiple samples. In essence, we are asking whether we can infer that the difference in the sample means shown in Table 1 reflects a true difference in the population means.
XX. Inferential Statistics
Inferential statistics allow researchers to make inferences about the true difference in the population on the basis of the sample data. Specifically, inferential statistics give the probability that the difference between means reflects random error rather than a real difference.
XXI.Null and Research Hypotheses
The null hypothesis is simply that the population means are equal; the observed difference is due to random error. The research hypothesis is that the population means are, in fact, not equal. The null hypothesis states that the independent variable had no effect; the research hypothesis states that the independent variable did have an effect.
We reject the null hypothesis when we find a very low probability that the obtained results could be due to random error. This is what is meant by statistical significance: a significant result is one that has a very low probability of occurring if the population means are equal. More simply, significance indicates that there is a low probability that the difference between the obtained sample means was due to random error. Significance, then, is a matter of probability.
XXII. Probability and Sampling Distributions
Probability is the likelihood of the occurrence of some event or outcome. We all use probabilities frequently in everyday life.
BB. Probability: The Case of ESP
The text describes a test of ESP. A key question becomes: How unlikely does a result have to be before we decide it is significant? A decision rule is determined prior to collecting the data. The probability required for significance is called the alpha level. The most common alpha-level probability used is .05. The outcome of the study is considered significant when there is a .05 or less probability of obtaining the results; that is, there are only 5 chances out of 100 that the results were due to random error in one sample from the population.
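The logic of the decision rule can be made concrete with a short binomial calculation. This is a hypothetical coin-guessing version of the ESP test, not the exact numbers from the text:

```python
# Probability of getting k or more correct out of n chance (p = .5)
# guesses, from the binomial distribution -- a hypothetical version
# of the ESP demonstration.
from math import comb

def p_at_least(k, n, p=0.5):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# With 10 guesses, how surprising is 9 or more correct?
print(round(p_at_least(9, 10), 4))   # 11/1024, about .011 -- below the .05 alpha level
```

Because .011 is less than the .05 alpha level, 9 or more correct guesses would be declared statistically significant under this decision rule.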
CC. Sampling Distributions
The probabilities shown in Table 2 were derived from a probability distribution called the binomial distribution; all decisions about statistical significance are based on probability distributions such as this one. Such distributions are called sampling distributions. The sampling distribution is based on the assumption that the null hypothesis is true.
All statistical tests rely on sampling distributions to determine the probability that the results are consistent with the null hypothesis.
DD. Sample Size
As the size of your sample increases, you are more confident that your outcome is actually different from the null hypothesis expectation.
XXIII. Group Differences: The t and F Tests
The t test is commonly used to examine whether two groups are significantly different from each other. The F test is a more general statistical test that can be used to ask whether there is a difference among three or more groups or to evaluate the results of factorial designs.
t Test
The sampling distribution of all possible values of t is shown in Figure 1.
The t value is a ratio of two aspects of the data, the difference between the group means and the variability within groups.
EE. Degrees of Freedom
The term degrees of freedom is abbreviated df. When comparing two means, the degrees of freedom are equal to n1 + n2 - 2, or the total number of participants in the groups minus the number of groups.
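Both ingredients of the t value (the mean difference and the within-group variability) plus the df formula can be sketched in a few lines; the scores are invented:

```python
# Independent-groups t from its two ingredients: the difference
# between the group means and the variability within groups.
# Scores are hypothetical.
import math
import statistics

g1 = [5, 7, 8, 9, 6]
g2 = [3, 4, 5, 6, 2]
n1, n2 = len(g1), len(g2)

diff = statistics.mean(g1) - statistics.mean(g2)
pooled_var = ((n1 - 1) * statistics.variance(g1) +
              (n2 - 1) * statistics.variance(g2)) / (n1 + n2 - 2)
t = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
df = n1 + n2 - 2        # total participants minus number of groups

print(round(t, 2), df)  # t = 3.0 with df = 8
```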
FF. One-Tailed Versus Two-Tailed Tests
In the table, you must choose a critical t for the situation in which your research hypothesis either (1) specified a direction of difference between the groups (e.g., "group 1 will be greater than group 2") or (2) did not specify a predicted direction of difference (e.g., "group 1 will differ from group 2"). Somewhat different critical values of t are used in the two situations:
The first situation is called a one-tailed test.
The second situation is called a two-tailed test.
GG. F Test
The analysis of variance, or F test, is an extension of the t test. The analysis of variance is a more general statistical procedure than the t test. When a study has only one independent variable with two groups, F and t are virtually identical: the value of F equals t² in this situation. However, analysis of variance is also used when there are more than two levels of an independent variable and when a factorial design with two or more independent variables has been used.
The F statistic is a ratio of two types of variance: systematic variance and error variance (hence the term analysis of variance). Systematic variance is the deviation of the group means from the grand mean, or the mean score of all individuals in all groups, and error variance is the deviation of the individual scores in each group from their respective group means. The larger the F ratio is, the more likely it is that the results are significant.
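The two variance estimates and their ratio can be computed straight from these definitions; the three small groups below are invented:

```python
# One-way analysis of variance computed from the definition:
# F = (systematic variance) / (error variance). Data are invented.
import statistics

groups = [[3, 4, 5], [5, 6, 7], [7, 8, 9]]
all_scores = [s for g in groups for s in g]
grand_mean = statistics.mean(all_scores)
k, n = len(groups), len(all_scores)

# Systematic (between-groups) variance: group means around the grand mean
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Error (within-groups) variance: scores around their own group mean
ss_within = sum((s - statistics.mean(g)) ** 2 for g in groups for s in g)
ms_within = ss_within / (n - k)

F = ms_between / ms_within
print(round(F, 2))   # F = 12.0 here: large between-group spread, small within-group spread
```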
HH. Calculating Effect Size
After determining that there was a statistically significant effect of the independent variable, researchers will want to know the magnitude of the effect. Therefore, one should calculate an estimate of effect size.
II. Confidence Intervals and Statistical Significance
Confidence intervals were described in the chapter "Asking People About Themselves: Survey Research." After obtaining a sample value, we can calculate a confidence interval. An interval of values defines the most likely range of actual population values. The interval has an associated confidence level: a 95% confidence interval indicates that we are 95% sure that the population value lies within the range; a 99% interval would provide greater certainty, but the range of values would be larger.
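A minimal sketch using the normal approximation with a 1.96 critical value (a t critical value would be slightly more accurate for a sample this small; scores are invented):

```python
# A 95% confidence interval for a sample mean, using the normal
# approximation (critical value 1.96). Scores are hypothetical.
import math
import statistics

scores = [22, 25, 27, 30, 31, 29, 26, 24, 28, 30]
n = len(scores)
m = statistics.mean(scores)
se = statistics.stdev(scores) / math.sqrt(n)   # standard error of the mean

low, high = m - 1.96 * se, m + 1.96 * se
print(round(low, 2), round(high, 2))
```

Replacing 1.96 with the larger critical value for a 99% interval widens the range, mirroring the certainty/precision trade-off described above.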
JJ. Statistical Significance: An Overview
Some general concepts are important when you conduct a statistical test. First, the goal of the test is to allow you to make a decision about whether your obtained results are reliable; you want to be confident that you would obtain similar results if you conducted the study over and over again. Second, the significance level (alpha level) you choose indicates how confident you wish to be when making the decision. A .05 significance level says that you are 95% sure of the reliability of your findings; however, there is a 5% chance that you could be wrong. Third, you are most likely to obtain significant results when you have a large sample size because larger sample sizes provide better estimates of true population values. Finally, you are most likely to obtain significant results when the effect size is large, that is, when differences between groups are large and variability of scores within groups is small.
XXIV. Type I and Type II Errors
The decision to reject the null hypothesis is based on probabilities rather than on certainties. That is, the decision is made without direct knowledge of the true state of affairs in the population.
Correct Decisions
One correct decision occurs when we reject the null hypothesis and the research hypothesis is true in the population. The other correct decision is to accept the null hypothesis, and the null hypothesis is true in the population: The population means are in fact equal.
KK. Type I Errors
A Type I error is made when we reject the null hypothesis but the null hypothesis is actually true. Our decision is that the population means are not equal when they actually are equal.
LL. Type II Errors
A Type II error occurs when the null hypothesis is accepted although in the population the research hypothesis is true. The population means are not equal, but the results of the experiment do not lead to a decision to reject the null hypothesis.
MM. The Everyday Context of Type I and Type II Errors
The decision matrix used in statistical analyses can be applied to the kinds of decisions people frequently must make in everyday life. One illustration of the use of a decision matrix involves the important decision to marry someone. If the null hypothesis is that the person is "wrong" for you, and the true state is that the person is either "wrong" or "right," you must decide whether to go ahead and marry the person.
XXV. Choosing a Significance Level
Researchers traditionally have used either a .05 or a .01 significance level in the decision to reject the null hypothesis. If there is less than a .05 or a .01 probability that the results occurred because of random error, the results are said to be significant. However, there is nothing magical about a .05 or a .01 significance level. The significance level chosen merely specifies the probability of a Type I error if the null hypothesis is rejected.
XXVI. Interpreting Nonsignificant Results
Although ―accepting the null hypothesis‖ is convenient terminology, it is important to recognize that researchers are not generally interested in accepting the null hypothesis. Research is designed to show that a relationship between variables does exist, not to demonstrate that variables are unrelated. More important, a decision to accept the null hypothesis when a single study does not show significant results is problematic, because negative or nonsignificant results are difficult to interpret.
XXVII.Choosing a Sample Size: Power Analysis
Most researchers take note of the sample sizes in the research area being studied and select a sample size that is typical for studies in the area. A more formal approach is to select a sample size on the basis of a desired probability of correctly rejecting the null hypothesis. This probability is called the power of the statistical test. It is obviously related to the probability of a Type II error.
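Power can also be estimated by simulation: generate many samples in which the research hypothesis is true and count how often the test rejects the null. This is a rough Monte Carlo sketch, not a formal power analysis; it assumes normal populations and uses an approximate two-tailed .05 cutoff of 2.0 for t:

```python
# Estimating power by simulation: the proportion of samples that
# reject the null when a real effect exists. Numbers are invented.
import math
import random
import statistics

random.seed(1)

def one_study(n, effect):
    # Control scores ~ N(0, 1); treatment scores shifted up by `effect`
    g1 = [random.gauss(0, 1) for _ in range(n)]
    g2 = [random.gauss(effect, 1) for _ in range(n)]
    pooled = math.sqrt((statistics.variance(g1) + statistics.variance(g2)) / 2)
    t = (statistics.mean(g2) - statistics.mean(g1)) / (pooled * math.sqrt(2 / n))
    return abs(t) > 2.0          # rough two-tailed .05 cutoff for moderate n

runs = 2000
power = sum(one_study(n=30, effect=0.5) for _ in range(runs)) / runs
print(round(power, 2))           # larger n or a larger effect raises power
```

Rerunning with a larger n or a larger effect size shows the power (and thus the chance of avoiding a Type II error) climbing toward 1.0.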
XXVIII. The Importance of Replications
If the results of a single research investigation (the means and standard deviations) are statistically significant, you conclude that they would likely be obtained over and over again if the study were repeated.
XXIX. Significance of a Pearson r Correlation Coefficient
The Pearson r correlation coefficient is used to describe the strength of the relationship between two variables when both variables have interval or ratio scale properties. However, there remains the issue of whether the correlation is statistically significant.
The null hypothesis in this case is that the true population correlation is 0.00; that is, the two variables are not related. What if you obtain a correlation of 0.27 (plus or minus)? A statistical significance test will allow you to decide whether to reject the null hypothesis and conclude that the true population correlation is, in fact, greater than 0.00.
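The significance test for r uses the relation t = r * sqrt(n - 2) / sqrt(1 - r^2) with df = n - 2; here is a sketch for a hypothetical sample of 62 (the sample size and the df = 60 critical value of about 2.00 are illustrative assumptions, not figures from the text):

```python
# Testing whether a Pearson r differs from 0.00:
# t = r * sqrt(n - 2) / sqrt(1 - r**2), with df = n - 2.
import math

r, n = 0.27, 62                       # hypothetical sample correlation and size
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
df = n - 2

# For df = 60, the two-tailed .05 critical value of t is about 2.00,
# so an r of 0.27 with this sample size would just reach significance.
print(round(t, 2), df)
```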
XXX. Statistical Analysis Software
Some of the major statistical software programs include SPSS, SAS, SYSTAT, and the freely available R.
XXXI. Selecting the Appropriate Statistical Test
Research Studying Two Variables (Bivariate Research)
In these cases, the researcher is studying whether two variables are related. In general, we would refer to the first variable as the independent variable (IV) and the second variable as the dependent variable (DV). However, because it does not matter whether we are doing experimental or nonexperimental research, we could just as easily refer to the two variables as Variable X and Variable Y or Variable A and Variable B.
NN. Research With Multiple Independent Variables
The research design situations described in Table 7 have been described in previous chapters. There are of course many other types of designs.
Sample Answers for Review Questions
54. Distinguish between the null hypothesis and the research hypothesis. When does the researcher decide to reject the null hypothesis?
Answers will vary. Students should contrast the null and research hypotheses so that the differences between the two are clear. Students should demonstrate their understanding that the null hypothesis is rejected when researchers find a very low probability that the obtained results could be due to random error. Additionally, students should demonstrate their understanding that the decision to reject the null hypothesis is based on probabilities rather than on certainties. That is, the decision is made without direct knowledge of the true state of affairs in the population. Thus, the decision might not be correct; errors may result from the use of inferential statistics.
55. What is meant by statistical significance?
Answers will vary. Students should demonstrate an understanding that a significant result is one that has a very low probability of occurring if the population means are equal. More simply, significance indicates that there is a low probability that the difference between the obtained sample means was due to random error. Significance, then, is a matter of probability.
56. What factors are most important in determining whether obtained results will be significant?
Answers will vary. Students should list the important factors in determining whether obtained results will be significant and explain the reasoning behind each factor. When possible, students should support their explanations with examples.
57. Distinguish between a Type I and a Type II error. Why is your significance level the probability of making a Type I error?
Answers will vary. Students should contrast Type I and Type II errors in a way that demonstrates their understanding that a Type I error is an incorrect decision to reject the null hypothesis when it is true, and a Type II error is an incorrect decision to accept the null hypothesis when it is false. Answers should also demonstrate an understanding that the probability of making a Type I error is determined by the choice of significance or alpha level, and the probability of making a Type I error can be changed by either decreasing or increasing the significance level.
58. What factors are involved in choosing a significance level?
Answers will vary. Students should list the factors involved in choosing a significance level and explain why they are necessary.
59. What influences the probability of a Type II error?
Answers will vary. Students should demonstrate an understanding of the following: The probability of making a Type II error is related to three factors. The first is the significance (alpha) level. If we set a very low significance level to decrease the chances of a Type I error, we increase the chances of a Type II error. In other words, if we make it very difficult to reject the null hypothesis, the probability of incorrectly accepting the null hypothesis increases. The second factor is sample size. True differences are more likely to be detected if the sample size is large. The third factor is effect size. If the effect size is large, a Type II error is unlikely. However, a small effect size may not be significant with a small sample.
60. What is the difference between statistical significance and practical significance?
Answers will vary. Students should demonstrate an understanding that circumstances can determine whether or not an effect has practical significance.
61. Discuss the reasons a researcher might obtain nonsignificant results.
Answers will vary. Students should demonstrate an understanding that nonsignificant results do not necessarily indicate that the null hypothesis is correct. However, there are circumstances in which we can accept the null hypothesis and conclude that two variables are, in fact, not related.
Sample Answers for Being a Skilled Consumer of Research
In an experiment, one group of research participants is given 10 pages of material to proofread for errors. Another group proofreads the same material on a computer screen. The dependent variable is the number of errors detected in a 5-minute period. A .05 significance (alpha) level is used to evaluate the results.
h. What statistical test would you use?
i. What is the null hypothesis? The research hypothesis?
j. Given the hypothesis, what is the Type I error? The Type II error?
k. What is the probability of making a Type I error?
l. When Professor Rodríguez conducted the proofreading study, the average number of errors detected in the print and computer conditions was 38.4 and 13.2, respectively; this difference was not statistically significant. When Professor Seuss conducted the same experiment, the means of the two groups were 21.1 and 14.7, respectively, but the difference was statistically significant. Explain how this could happen.
h. t test
i. Null hypothesis: the population means of the noncomputer group and the computer group are equal.
Research hypothesis: the population means of the noncomputer group and the computer group are not equal.
j. Type I: concluding the groups differ in the number of errors detected when in fact they do not.
Type II: concluding the groups do not differ in the number of errors detected when in fact they do.
k. The probability of making a Type I error is based on the alpha level. In this example, the probability of a Type I error is 5% or less.
l. A good answer would address sample size, alpha level, and the possibility of a Type II error on Professor Rodríguez’s part as possible reasons for the difference between his and Professor Seuss’s results. With a small sample size, a greater mean difference in errors detected would be required to obtain a significant difference. Also, if Professor Rodríguez had set a more stringent alpha level for the test (e.g., .01 instead of .05), this would also influence the likelihood of obtaining significant results.
62. A researcher investigated attitudes toward individuals in wheelchairs. The question was: Would people react differently to a person they perceived as being temporarily confined to a wheelchair than to a person who had a permanent disability? Participants were randomly assigned to two groups. Individuals in one group each worked on various tasks with a confederate in a wheelchair; members of the other group worked with the same confederate in a wheelchair, but this time the confederate wore a leg cast. After the session was over, participants filled out a questionnaire regarding their reactions to the study. One question asked, "Would you be willing to work with your test partner in the future on a class assignment?" with "yes" and "no" as the only response alternatives. What would be the appropriate significance test for this experiment? Can you offer a critique of the dependent variable? If you changed the dependent variable, would it affect your choice of significance tests? If so, how?
As stated, the data in the experiment represent a nominal scale. Because of this, the appropriate statistical test for the experiment would be the chi-square (χ²) test. Employing only "yes" and "no" responses for the dependent variable would be less sensitive than employing a scale with several points from which to choose. If the new scale were not based on nominal data, then a test other than χ² would be more appropriate.
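A chi-square test of independence for a 2 x 2 table like this one can be computed directly from observed and expected frequencies; the counts below are invented for illustration:

```python
# Chi-square test of independence for a 2 x 2 table of "yes"/"no"
# answers by condition. Counts are invented.

observed = [[30, 10],   # leg-cast condition: yes, no
            [18, 22]]   # wheelchair-only condition: yes, no

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Each expected frequency is (row total * column total) / grand total
chi_sq = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / total) ** 2
    / (row_totals[i] * col_totals[j] / total)
    for i in range(2) for j in range(2)
)
# With df = 1, the .05 critical value of chi-square is 3.84
print(round(chi_sq, 2))   # 7.5 here, so these invented counts would be significant
```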
Laboratory Demonstration: Sampling Distribution of the Mean
The concept of the sampling distribution of the mean is often a difficult one because it is abstract. Instructors can make it clearer by using a concrete demonstration. The demonstration described in Chapter 1 using samples of various sizes works well for this. Use the same procedures described for the demonstration in Chapter 1, but have the student groups select 10 samples instead of 5. Have them plot a frequency distribution using means rounded to whole numbers. In addition, have each group compute the mean of the sample means. Most will be very close to the population mean. While this does not create a true theoretical sampling distribution of the mean, this empirical distribution allows the student to understand the principle underlying the theoretical sampling distribution of the mean.
Laboratory Demonstration: Type I and Type II Errors
Type I and Type II errors can be demonstrated using the population of scores described in the demonstration for Chapter 1. To set the scene, tell the students to pretend they are testing the effectiveness of a "memory pill" that is supposed to increase the memory abilities of humans. Instructors have a population of participants whose known mean recall for a 30-word list is 17, with a standard deviation of 4.66.
For the first part of the demonstration, instructors will make reality such that the pill truly has no effect; in other words, the null hypothesis is true. Therefore, any participant who is randomly drawn and takes the pill will recall the same as he or she normally would. (For the purpose of
this scenario, there will be no placebo effects.) Have a student draw 10 "participants" (scores) from the population. Write these on the board and have another student calculate the mean while yet another student draws a second sample of size 10. Repeat until there are 20 samples on the board. Since instructors know the population mean and standard deviation, they can use a z test to calculate the probability of drawing each sample. According to the rules of probability, about one of these samples should have a 5% or less chance of occurring. This would result in a Type I error. In actuality, there may be none that are this improbable, or there may be two or three. Instructors can use these occurrences to discuss the nature of probability and how probabilistic predictions hold in the long run but not in every instance.
To demonstrate Type II errors, instructors must create an ―effect‖ to fail to detect. Now have the students pretend that the pill truly does increase memory. Assume the effect of the pill is to add 4 points to each individual score. Using the 20 samples already on the board, add 4 points to each score and calculate the new mean. (Students will rapidly recognize they only need to add 4 points to each old mean because the new mean is always equal to the old mean plus four.)
Calculate the probability of obtaining each new mean given the parameters of the original population. All but a very few (one to three generally) of the samples should lead to a rejection of the null hypothesis. Those samples that result in failure to reject would be the Type II errors.
Use the following formulas to calculate the probabilities:
n = 10
μ = 17
σ = 4.66
σM = σ/√n = 4.66/√10 = 1.47
z = (M − μ)/σM
Reject H0 if z ≥ 1.645
Retain H0 if z < 1.645
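The whole two-part demonstration can be simulated under these parameters. This is a sketch under the stated assumptions (one-tailed test, α = .05, an effect of exactly +4 points); the seed is arbitrary, and, as in the classroom version, the same 20 samples are reused for both parts.

```python
# Simulate the Type I / Type II demonstration: 20 samples of n = 10 from
# a population with mu = 17 and sigma = 4.66, tested with a one-tailed
# z test at the .05 level.
import math
import random

random.seed(2)
MU, SIGMA, N = 17, 4.66, 10
SE = SIGMA / math.sqrt(N)                # 4.66 / sqrt(10) ~ 1.47
CRIT = 1.645                             # one-tailed .05 critical z

type1 = type2 = 0
for _ in range(20):                      # 20 samples, as on the board
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    mean = sum(sample) / N

    # Part 1: the pill has no effect, so any rejection is a Type I error.
    if (mean - MU) / SE >= CRIT:
        type1 += 1

    # Part 2: the pill adds 4 points to every score, so failing to
    # reject is a Type II error.
    if ((mean + 4) - MU) / SE < CRIT:
        type2 += 1

print(f"Type I errors:  {type1} of 20")
print(f"Type II errors: {type2} of 20")
```

As the text predicts, a typical run produces roughly zero to three errors of each type, which makes a good starting point for discussing probability.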
Activity: Type I and Type II Errors
Handout 7 in this manual can be used for a homework assignment or a group exercise that focuses on the concepts of Type I and Type II errors.
Activity: Inferential Statistics Films/Videos
There are several film and video resources, including Inferential Statistics: Hypothesis Testing (Wiley) and Against All Odds (Annenberg/CPB Multimedia Collection). Against All Odds consists of 26 half-hour programs covering most topics in an introductory statistics course. Instructors might be particularly interested in the last program, "Case Study," which depicts planning a study, collecting data, and drawing inferences.
Activity: Selecting Statistical Tests
Identify a variety of current social issues of interest. Have groups of students discuss how they would address the difference in attitudes people have toward the issues. Next, have the students identify the statistical test that would be most appropriate to analyze the data.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Cronan-Hillix, T. (1988). Teaching students the importance of accuracy in research. Teaching of Psychology, 15, 205–207.
Johnson, D. E. (1989). An intuitive approach to teaching analysis of variance. Teaching of Psychology, 16, 67–68.
Peden, B. F. (1991). Teaching the importance of accuracy in preparing references. Teaching of Psychology, 18, 102–105.
Strube, M. J. (1991). Demonstrating the influence of sample size and reliability on study outcome. Teaching of Psychology, 18, 113–115.
Weaver, K. A. (1992). Elaborating selected statistical concepts with common experience. Teaching of Psychology, 19, 178–179.
Zerbolio, D. J. (1989). A "bag of tricks" for teaching about sampling distributions. Teaching of Psychology, 16, 207–209.
Also recommended:
Aberson, C. L., Berger, D. E., Healy, M. R., & Romero, V. L. (2003). Evaluation of an interactive tutorial for teaching hypothesis testing concepts. Teaching of Psychology, 30, 75–78.
Connor, J. M. (2003). Making statistics come alive: Using space and students’ bodies to illustrate statistical concepts. Teaching of Psychology, 30, 141–143.
Tomcho, T. J., & Foels, R. (2002). Teaching acculturation: Developing multiple "cultures" in the classroom and role-playing the acculturation process. Teaching of Psychology, 29, 226–229.
Treadwell, K. R. H. (2008). Demonstrating experimenter "ineptitude" as a means of teaching internal and external validity. Teaching of Psychology, 35, 184–188.
Zimbardo, P. G. (2004). Does psychology make a significant difference in our lives? American Psychologist, 59, 339–351.
Chapter 14: Generalization
Learning Objectives
Define external validity.
Discuss how our ability to generalize research findings to broader populations is affected by sex and gender, race and ethnicity, and culture.
Describe threats to external validity related to using college students and volunteers as participants, the location where a study takes place, and the use of online samples.
Describe the issues surrounding external validity when researchers study nonhuman animals.
Describe how laboratory settings, the use of a pretest, and the characteristics of the research team may impact external validity.
Define and discuss the importance of replications; distinguish between exact replications and conceptual replications.
Describe what a literature review and meta-analysis are. Compare and contrast their role in providing evidence for external validity.
Brief Chapter Outline
I. Generalizing Across People
A. Sex and Gender Identity
B. Racial and Ethnic Identity
C. Culture
D. Threats to External Validity
1. College Students
2. Volunteers
3. Location
4. Online Research Participants
E. Nonhuman Animals
II. Generalizing Across Situations
III. Replications
A. Exact Replications
B. Conceptual Replications
IV. Assessing External Validity via Literature Reviews and Meta-Analyses
V. Using Research to Improve Lives
Extended Chapter Outline
External validity is the extent to which findings may be generalized to other populations and settings.
I. Generalizing Across People
Researchers may randomly assign participants to experimental conditions, yet rarely are participants randomly selected from the general population. Individuals who participate in psychological research are usually selected because they are available. For academic researchers, the most readily available population of potential participants is college students. Most commonly, the students are in their first or second year and enrolled in an introductory psychology course to satisfy a general education requirement.
A. Sex and Gender Identity
Although the terms sex and gender are frequently used interchangeably, it is critical to remember that sex generally refers to biological classification, typically assigned at birth based on the external appearance of specific genitalia, and denoted most often with the terms male and female. Sex differences, then, refer to physiological differences.
Gender is a sociocultural classification shaped by cultural and historical forces and often denoted with the terms man and woman. The term gender identity refers to a person’s personal and psychological experience of gender. The term cisgender refers to people whose biological sex matches their gender identity.
Much research in the recent past ignored gender. Today it is more common for researchers to report that an analysis of the data did not reveal a gender difference or, if there was a difference, to report it. To be a skilled consumer of research, it is essential to review the gender makeup of a sample and consider to whom the results might apply.
Studies must be built from samples that represent humanity. That said, it can also be essential to study subgroups: studying almost any topic in samples of people who identify as men, women, or transgender may reveal something important.
B. Racial and Ethnic Identity
Race is the social categorization of humans based on skin color and other physical characteristics. No science of human behavior can be complete without considering this critical identity.
The construct of race is related to but different from ethnicity. Ethnicity is how a person belongs or is thought to belong to a population or subpopulation made up of people who share a common cultural background or descent. Ethnicity, too, is a complex variable to measure.
Race and ethnicity are related to many critical psychological processes and outcomes. When designing a research project, researchers should not overlook the impact that race and ethnicity may have on their variables of interest. They should collect data from diverse samples and report results accordingly.
C. Culture
Culture is the set of behaviors, customs, values, beliefs, knowledge, language, and expressions of a social group. Culture is passed from generation to generation and is often, but not always, associated with a specific time or place.
In many cases, research samples consist primarily of college students from the United States, other English-speaking countries, and Europe. But if psychologists want to understand human behavior, they must understand human behavior across and among cultures.
Much of the research on culture centers on identifying similarities and differences in personality and other psychological characteristics and how individuals from different cultures respond to the same environments. Behavioral research across the globe has increased dramatically, and researchers everywhere now have more direct access to this literature.
D. Threats to External Validity
1. College Students
College students have been oversampled. Because research using college students draws on a highly restricted population (generally first- and second-year students taking the introductory psychology class), the participants tend to be young and to possess the characteristics of emerging adults.
It is easy to criticize research based on participant characteristics, yet criticism by itself does not mean that results cannot be generalized.
2. Volunteers
Researchers often ask people to volunteer to participate in their research. When using volunteers, the external validity of the findings may be limited because the data from volunteers may be different from what would be obtained with a more general sample. Research indicates that volunteers differ in various ways from nonvolunteers.
3. Location
The location where participants are recruited can also impact a study’s external validity.
4. Online Research Participants
By recruiting a sample of internet users, researchers introduce some biases that need to be recognized when interpreting the results.
E. Nonhuman Animals
Most research with other species is undertaken to study the behavior of those animals directly, to gather information that may help with the survival of endangered species, and to increase our understanding of our bonds with nonhuman animals such as dogs, cats, and horses. The basic research that psychologists conduct with nonhuman animals is usually done with the expectation that the findings can be generalized to humans.
II. Generalizing Across Situations
The generalization of a study's results from its participants to the broader population is a critical aspect of external validity. Three aspects of a study's methodology need to be considered when thinking about external validity: the influence of the people conducting the study, the effects of a pretest, and the differences between a field study and a laboratory study.
The characteristics of the individuals conducting research can impact the external validity of the results. A warm, friendly experimenter will almost certainly produce different results from a cold, unfriendly experimenter. Participants also may behave differently with experimenters based on visible differences in gender, race, ethnicity, or disability status.
III. Replications
Replication of research is a way of overcoming any problems of generalization that occur in a single study. There are two types of replications to consider: exact replications and conceptual replications.
A. Exact Replications
An exact replication is an attempt to replicate precisely the procedures of a study to see whether the same results are obtained.
B. Conceptual Replications
A conceptual replication is the use of different procedures to replicate a research finding. In a conceptual replication, researchers attempt to understand the relationships among abstract conceptual variables by using new, or different, operational definitions of those variables.
IV. Assessing External Validity via Literature Reviews and Meta-Analyses
In a literature review, a reviewer reads a number of studies that address a particular topic and then writes a paper that summarizes and evaluates the literature. Another technique for comparing a large number of studies in an area is meta-analysis. In a meta-analysis, the researcher combines the actual results of a number of studies. The analysis consists of a set of statistical procedures that employ effect sizes to compare a given finding across many different studies.
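The core statistical step in a meta-analysis can be sketched in a few lines. The example below pools hypothetical effect sizes (Cohen's d) from four invented studies using inverse-variance weighting, one common way of combining findings; the numbers are purely illustrative.

```python
# Fixed-effect pooling: each study's effect size is weighted by the
# inverse of its sampling variance, so more precise studies count more.
effects = [0.42, 0.30, 0.55, 0.18]    # hypothetical effect sizes (d)
variances = [0.04, 0.09, 0.02, 0.06]  # hypothetical sampling variances

weights = [1 / v for v in variances]
pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
print(f"pooled effect size = {pooled:.3f}")
```

Notice that the pooled estimate is pulled toward the most precise study (d = 0.55, variance 0.02), which illustrates why a meta-analysis can weigh a large, well-powered study more heavily than several small ones.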
V. Using Research to Improve Lives
The impact of psychological research can be seen in areas such as health (programs to promote health-related behaviors related to stress, heart disease, and sexually transmitted diseases), education (providing methods for encouraging academic performance), and work environments (providing workers with more control and improving how people interact with computers and other machines in the workplace).
The APA itself has made significant contributions. Its Multicultural Guidelines provide a comprehensive framework for all psychologists, including clinicians, researchers, consultants, and educators.
Engaging With Research: Generalizing Results
After reading the article, answer the following questions (which will be familiar to you from earlier in this chapter!). NOTE: Student answers will vary.
What is the primary goal of this study? Description, Prediction, Determining Cause, or Explaining? Do the authors achieve their goals?
What did these researchers do? What was the method?
a. The authors described the study as mixed-methods. What did they mean?
What was measured?
a. How did they measure COVID-19-related stress?
b. How did they measure the positive experiences that participants may have had during the pandemic?
c. How did the researchers operationalize "low-income"?
To whom or what can we generalize the results?
a. Should we generalize the results of this study to a sample of economically disadvantaged mothers from Alaska? Or Syria? Or Bangladesh?
b. If another researcher with the same research question used a different method, how do you think the results would differ?
c. What was the racial and ethnic breakdown of the participants?
d. How did the participants identify their gender?
What did they find? What were the results?
Have other researchers found similar results?
What are the limitations of this study?
What are the ethical issues present in this study?
Sample Answers for Review Questions
What is external validity?
External validity is the extent to which the results of a study can be generalized to other populations and settings. It is an aspect of a study that we try to assess but cannot truly know.
Why should a researcher be concerned about generalizing to other populations?
The results of the research may not represent the general population, since most research participants are people who are available. Often, the participants may be college students from a particular university, or they might be people who have time to volunteer or be more heavily represented by one gender.
How can the fact that most studies are conducted with college students, volunteers, and individuals from a limited location and culture potentially impact external validity?
The key way to support external validity is through a study’s methodology. Using a census or a random sample produces better external validity than using a nonrandom sample. When studies are conducted with participants from a limited location and culture, the participants may not accurately represent the general population.
How does the use of the Internet to recruit subjects and collect data impact external validity?
If you are studying populations other than college students, you are even more dependent on volunteers. Conducting research on the Internet with volunteers limits external validity, because the data from volunteers may be different from what would be obtained with a more general sample. In addition, Internet users represent a unique demographic.
What is the source of the problem of generalizing to other experimenters? How can this problem be solved?
Most research involves only one experimenter, and rarely is much attention paid to the researcher’s personal characteristics. There is always the possibility that the results are generalizable only to certain types of experimenters. One solution to the problem of generalizing to other experimenters is to use two or more experimenters.
Why is it important to pretest a problem for generalization? Discuss why including a pretest may affect the ability to generalize results.
The researcher can be sure that the groups are equivalent on the pretest, and it is often more satisfying to see that individuals changed their scores than it is to look only at group means on a posttest. A pretest also enables the researcher to assess mortality (attrition) effects when it is likely that some participants will withdraw from an experiment. If you give a pretest, you can determine whether the people who withdrew are different from those who completed the study. Pretesting, however, may limit the ability to generalize to populations that did not receive a pretest. Simply taking the pretest may cause subjects to behave differently than they would without the pretest.
Distinguish between an exact replication and a conceptual replication. What is the value of a conceptual replication?
An exact replication is an attempt to precisely replicate the procedures of a study to see whether the same results are obtained. A conceptual replication is the use of different
procedures to replicate a research finding. In a conceptual replication, the same independent variable is operationalized in a different way, and the dependent variable may also be measured differently. Conceptual replications are extremely important in the social sciences because the variables used are complex and can be operationalized in many ways.
What is a meta-analysis?
In a meta-analysis, the researcher combines the actual results of a number of studies. The analysis consists of a set of statistical procedures that employ effect sizes to compare a given finding across many different studies.
Sample Answers for Being a Skilled Consumer of Research
If you conducted a study using a sample of participants from your college or university, in what ways do you think the sample would:
a. differ from the population of the city that your college/university is in?
Answers will vary.
b. differ from the population of the state? or the population of the country?
Answers will vary.
c. differ from the population of another country?
Answers will vary.
d. differ from other college student samples by gender, race, and/or ethnicity?
Answers will vary.
e. differ from other college student samples by other factors (e.g., regional, cultural, social class)?
Answers will vary.
Activity: Discussion of Gender and Cultural Issues
Papers by Denmark et al. and Miller are excellent sources for a discussion of gender and cultural
issues. Students can read the articles and write a short summary and reaction. Class or small group discussion can focus on issues directly related to sexism as well as cultural biases.
Denmark, F., Russo, N. F., Frieze, I. H., & Sechzer, J. A. (1988). Guidelines for avoiding sexism in psychological research: A report of the Ad Hoc Committee on Nonsexist Research. American Psychologist, 43, 582–585.
Miller, J. G. (1999). Cultural psychology: Implications for basic psychological theory. Psychological Science, 10, 85–91.
Activity: Search for Research on Gender and Ethnicity
Students can do a literature search for articles that examine gender, race, or ethnicity. In their descriptions of studies that they found, students should include information on whether there was an overall group difference (e.g., males were always more competitive than females) or an interaction with some independent variable (e.g., males were more competitive when money was involved and females were more competitive when nonmonetary resources were involved). Another possibility would be to have students report the percentage of articles in two or three journals that studied gender, race, or ethnicity in three different years. The issue of nonbinary gender identity or transgender individuals in research may also be raised.
Suggested Readings
Articles in the Handbook for Teaching Statistics and Research Methods (2nd ed.)
Cronan-Hillix, T. (1988). Teaching students the importance of accuracy in research. Teaching of Psychology, 15, 205–207.
Peden, B. F. (1991). Teaching the importance of accuracy in preparing references. Teaching of Psychology, 18, 102–105.
Also recommended:
Tomcho, T. J., & Foels, R. (2002). Teaching acculturation: Developing multiple "cultures" in the classroom and role-playing the acculturation process. Teaching of Psychology, 29, 226–229.
Treadwell, K. R. H. (2008). Demonstrating experimenter "ineptitude" as a means of teaching internal and external validity. Teaching of Psychology, 35, 184–188.
Zimbardo, P. G. (2004). Does psychology make a significant difference in our lives? American Psychologist, 59, 339–351.
Part II
Handouts
Handout 1
Reading a Journal Article
Choose an empirical study from a journal (see the list in Chapter 2 of the text).
1. Write the reference for this article in APA style.
2. Write an "overview" or brief summary of the article. Indicate your assessment of what the study is about and the major findings of the study.
3. According to the introduction, what information was already known about the topic (look for references to previous research)?
4. What variables were studied? What were the hypotheses concerning these variables?
5. What were the operational definitions of the variables studied?
6. Who were the participants in this study? Were there any special participant characteristics?
7. What were the procedures used to test the hypotheses? Did you notice any problematic features of the procedure?
8. Was the experimental or nonexperimental method used? Were there attempts to control any extraneous variables?
9. What were the major results of the study? Were the results consistent with the hypotheses?
10. How do the results relate to the other studies cited in the introduction?
11. How did the researcher interpret the results? Can you think of alternative interpretations?
12. Did the author give suggestions for future research or applications? Can you provide other suggestions?
13. What would you do if you wished to find out more about this research topic?
Handout 2
Hypothesis and Operational
Definition Exercise
Provide the same information for each hypothesis.
Hypothesis 1: Job stress is associated with absenteeism rates at work. What type of relationship would you predict (positive linear, negative linear, and so on)?
1. Graph and label the relationship you predict. State a prediction based on the relationship shown by your graph.
2. Which method (experimental or nonexperimental) would you use to test this hypothesis? Why?
3. Which is the independent or "cause" variable and which is the dependent or "effect" variable?
4. State an operational definition for each variable. If experimental, describe how the independent variable might be manipulated and the dependent variable measured. If nonexperimental, describe how both variables might be measured.
Hypothesis 2: A pregnant person’s diet affects the birth weight of the baby.
Hypothesis 3: The size of a meeting is related to the length of the meeting.
Hypothesis 4: Frequency of child abuse is related to the parents’ ages when they married.
Hypothesis 5: Political orientation is related to attitude toward gun control.
Hypothesis 6: Visual imagery improves memory.
Hypothesis 7: Jury decisions are influenced by the attractiveness of the defendant.
Hypothesis 8: Exercise is related to levels of stress.
Hypothesis 9: Age of instructor is related to students’ evaluation of the instructor.
Hypothesis 10: Number of hours worked is related to the number of hours spent studying.
Handout 3
Confounding Variables
Answer the same four questions for each of the experiments described below. Hint: One of the selections contains no confounds.
Confound Selection 1:
Tom Rogers wanted to test a new "singalong" method to teach math to fourth graders (e.g., "I love to multiply" to the tune of "God Bless America"). He used the singalong method in his first period class. His sixth period students continued solving math problems with the old method. At the end of the term, Mr. Rogers found that the first period class scored significantly lower than the sixth period class on a mathematics achievement test. He concluded that the singalong method was a total failure.
1. Identify the independent variable(s).
2. Identify the dependent variable(s).
3. Identify any confounding variable(s).
4. Propose a method to "unconfound" the experiment.
Confound Selection 2:
An airport administrator investigated the attention spans of air traffic controllers to determine how many incoming flights the average controller can coordinate at the same time. Each randomly selected controller was tested, without their knowledge, by a computer program that fed false flight information to a computer terminal. The controller first "received" information from one plane, and by the end of an hour the controller was coordinating 10 planes simultaneously. The administrator analyzed the errors collected by the computer program. The analysis revealed that the maximum number of planes a controller could handle without making potentially fatal errors was six planes. Also, no errors occurred when only one to three planes were incoming. The administrator concluded that a controller should never coordinate more than six incoming flights.
Confound Selection 3:
A drug company developed a new medication to control the manic phase of bipolar disorder. The firm hired a hospital psychiatrist to test the effectiveness of the drug. The psychiatrist identified a group of patients with bipolar disorder and randomly assigned them to a drug or placebo group. Nurse Ratched was told to administer the drug and Nurse Johnson was told to administer the placebo. Each nurse made daily observations of their patients during
treatment. A month later the observations were compared. In general, patients in the drug group had behaved more "normally" than patients in the placebo group. The drug company publicized its product's effectiveness.
Confound Selection 4:
Dr. Goodrich wanted to demonstrate that Dr. Goodrich’s tires were better than those of Dr. Goodyear. From car registration and leasing records, Dr. Goodrich found 40 salespeople who drove the same model of automobile approximately the same number of miles per week. Anonymously, Dr. Goodrich hired an independent research assistant, who was unaware of the purpose of the study, to randomly assign to 20 of the salespeople a new set of unmarked Goodrich tires, and to the other 20 a new set of unmarked Goodyear tires of the same price and quality. After six months and an average of 15,000 miles traveled by both groups, the assistant arranged for the salespeople to exchange tires. After another six months, and similar mileage, the assistant measured the amount of tread wear and reported that the Goodrich tires had actually worn more than the Goodyear tires.
Confound Selection 5:
An investigator was interested in studying the effect of taking a course in child development upon attitudes toward childrearing. At the end of the semester, the researcher distributed a questionnaire to students who had taken the child development course. Questionnaires were also given to an equal number of students who had not taken the course. The students who had taken the child development course had different attitudes from the students who had not taken the course (e.g., they had more positive attitudes about having large families).
Sample Answers for Handout 3:
Selection #1:
The identified independent variable in this study is the method of teaching math: singalong vs. the traditional. The dependent variable is the students’ performance on the mathematics achievement test. The most obvious confounding variable is the time of day. Students who received the singalong method had math during the first period of the day while students receiving the traditional method had math during the sixth period. It is possible that the lower scores for the singalong students are due to Mr. Rogers’ lack of enthusiasm for the subject early in the day or because the students are not yet ready to pay attention. Any variables that influenced the assignment of students to the class periods would also operate to confound the study. For example, suppose the sixth period class had a disproportionate number of advanced students because the school’s honors program assigned these students the same schedule. This could result in higher scores in the sixth period even though the singalong method does
facilitate learning.
Selection #2:
The identified independent variable in this experiment is the number of incoming flights the controllers handled. The dependent variable is the number of errors made by the controllers. The introduction of additional incoming flights is confounded with time and fatigue. The higher numbers of flights occurred only during the latter part of the hour's trial, since the number of flights gradually increased from 1 to 10 over the course of the hour. It may be that controllers are perfectly able to coordinate more than six flights at a time but only for short periods of time. To determine this, the number of flights the controller must coordinate should vary randomly throughout the hour's trial. The fatigue variable could be examined by looking at the errors across the hour interval; this would create a mixed factorial design with number of flights as the independent variable and the amount of time elapsed as the repeated measure.
Selection #3:
The identified independent variable is the drug vs. placebo treatments. The dependent variable is the observer ratings or observations of the patients. The confounding variable in this situation is the observer. Nurse Ratched makes all the observations for the drug group while Nurse Johnson makes all the observations for the placebo. This is especially troublesome because each observer knows the group membership of the patient being observed, thereby increasing the probability of observer effects. Additionally, if one observer consistently interprets behavior differently from the other, the resulting difference may have nothing to do with the variable but only represent scoring bias. To rectify these problems, both nurses should make observations on all patients without knowing the group membership of the patients.
Selection #4:
The identified independent variable is the type of tire, either a Goodrich or Goodyear tire. The dependent variable is the amount of wear on the tires. This is a well-designed study with random assignment to groups.
Selection #5:
The identified independent variable is the varying experience with the subject of child development: Half the participants took a child development class while the other participants did not. The dependent variable was the attitude toward childrearing. The confounding variable in this study stems from the possibility of nonequivalent groups since the participants were not
randomly assigned to groups, but rather, were self-selected into groups. It is very likely that those who choose to take a class in child development already have more positive attitudes toward childrearing (e.g., wanting large families) and that the class did not produce this difference. To correct this design problem, the experimenter should arrange to randomly assign the participants to groups.
Handout 4
Design Identification
Answer the same six questions for each of the experiments described below:
Design 1:
College sophomores were given a short course in speed reading. Three groups had courses lasting for 5, 15, or 25 sessions. At the conclusion of the course, participants were asked to read a paragraph, followed by a test of comprehension. Before taking the test, participants in each group were offered a monetary incentive: no money, $1, or $10 for a certain level of performance. The researcher collected the reading time and number of correct items on the comprehension test for each participant.
1. Identify the design (e.g., 2 X 2 factorial).
2. Identify the total number of conditions.
3. Identify the manipulated variable(s).
4. Is this an IV X PV design? If so, identify the participant variable(s).
5. Is this a repeated measures design? If so, identify the repeated variable(s).
6. Identify the dependent variable(s).
Design 2:
A researcher interested in weight control wondered whether normal and overweight individuals differ in their reaction to the availability of food. Thus, normal and overweight participants were told to eat as many peanuts as they desired while working on a questionnaire. One manipulation was the proximity of the peanut dish (close or far from the participant); the second manipulation was whether the peanuts were shelled or unshelled. After filling out the questionnaire, the peanut dish was weighed to determine the amount of peanuts consumed.
Design 3:
A researcher studied the influence of intensity of room illumination (low, medium, and high) on reading speed among fifth graders. Also, children were classified as "good" or "poor" readers from achievement test scores. Each group of children read 750-word passages under all three levels of illumination (three reading trials). The order of trials for each child was randomly determined.
Design 4:
A researcher investigated the effect of a child's hair length on judgments of personality and intelligence. Teachers were shown photographs of children to obtain their "first impressions" of the children. Each teacher was shown a boy or girl whose hair was either very short, shoulder length, or very long. Teachers rated the friendliness of the child and estimated the child's intelligence level.
Design 5:
An investigator was interested in the effects of various treatments on reduction of fear in phobic participants. He suspected that type of phobia may interact with therapeutic treatments; specifically, that the types of treatments for agoraphobics (fear of open spaces) and claustrophobics (fear of closed spaces) might be different. He divided participants into two groups based upon type of fear and then assigned members of each group to treatment groups: desensitization, insight, or implosive therapies. After three months of treatment, participants’ anxiety in the feared situation was measured.
Design 6:
Participants took part in a driving simulation study to investigate night-driving reactions as a function of alcohol consumption and road conditions. Participants drank "cocktails" containing either no alcohol, 3 ounces of alcohol, or 6 ounces of alcohol. After 30 minutes, they began the driving simulation test. Each participant simulated a drive on a straight road, a road with gentle curves, or a road with many sharp curves and on which the participants encountered various road hazards. Driving speed and the number of accidents were measured.
Design 7:
A researcher was interested in the effects of sexual arousal on the ability to concentrate, and also wondered whether gender and age are important factors. The researcher had participants read passages that were low, medium, or high in sexual arousal content. The participants included both males and females and were divided into three age categories (18–24, 25–35, and 36–50 years). After reading the passage, participants were asked to perform a proofreading task;
the researcher measured the number of errors detected on the task.
Sample Answers for Handout 4:
Design 1:
1. Identify the design: 3 x 3
2. Identify the total number of conditions: 9
3. Identify the manipulated variable(s): length of sessions, monetary incentive
4. Is this an IV X PV design? If so, identify the participant variable(s): no
5. Is this a repeated measures design? If so, identify the repeated variable(s): no
6. Identify the dependent variable(s): reading time, number of correct items
Design 2:
1. Identify the design: 2 x 2 x 2
2. Identify the total number of conditions: 8
3. Identify the manipulated variable(s): proximity of peanut dish, type of peanut
4. Is this an IV X PV design? If so, identify the participant variable(s): yes, weight of participant
5. Is this a repeated measures design? If so, identify the repeated variable(s): no
6. Identify the dependent variable(s): weight of peanut dish/peanuts consumed
Design 3:
1. Identify the design: 3 x 2
2. Identify the total number of conditions: 6
3. Identify the manipulated variable(s): room illumination
4. Is this an IV X PV design? If so, identify the participant variable(s): yes, level of reader
5. Is this a repeated measures design? If so, identify the repeated variable(s): yes, room illumination
6. Identify the dependent variable(s): reading speed
Design 4:
1. Identify the design: 2 x 3
2. Identify the total number of conditions: 6
3. Identify the manipulated variable(s): gender of child, length of hair
4. Is this an IV X PV design? If so, identify the participant variable(s): no
5. Is this a repeated measures design? If so, identify the repeated variable(s): no
6. Identify the dependent variable(s): friendliness, intelligence level
Design 5:
1. Identify the design: 2 x 3
2. Identify the total number of conditions: 6
3. Identify the manipulated variable(s): type of treatment
4. Is this an IV X PV design? If so, identify the participant variable(s): yes, type of fear
5. Is this a repeated measures design? If so, identify the repeated variable(s): no
6. Identify the dependent variable(s): anxiety/reduction of fear
Design 6:
1. Identify the design: 3 x 3
2. Identify the total number of conditions: 9
3. Identify the manipulated variable(s): alcohol consumption, road conditions
4. Is this an IV X PV design? If so, identify the participant variable(s): no
5. Is this a repeated measures design? If so, identify the repeated variable(s): no
6. Identify the dependent variable(s): driving speed and number of accidents
Design 7:
1. Identify the design: 3 x 2 x 3
2. Identify the total number of conditions: 18
3. Identify the manipulated variable(s): sexual content of passages
4. Is this an IV X PV design? If so, identify the participant variable(s): yes, gender of participant, age of participant
5. Is this a repeated measures design? If so, identify the repeated variable(s): no
6. Identify the dependent variable(s): number of errors detected
Handout 5
Outcomes of Factorial Designs
Outcome 1:
     B1   B2
A1    4    6
A2    3    8
1. Is there a main effect of IV A?
   Overall means of A1 _____ A2 _____
2. Is there a main effect of IV B?
   Overall means of B1 _____ B2 _____
3. Is there an A X B interaction?
4. Graph the results.
Outcome 2:
1. Is there a main effect of IV A?
   Overall means of A1 _____ A2 _____
2. Is there a main effect of IV B?
   Overall means of B1 _____ B2 _____
3. Is there an A X B interaction?
4. Graph the results.
Outcome 3:
     B1   B2
A1    1    5
A2    2    4
A3    3    3
1. Is there a main effect of IV A?
   Overall means of A1 _____ A2 _____ A3 _____
2. Is there a main effect of IV B?
   Overall means of B1 _____ B2 _____
3. Is there an A X B interaction?
4. Graph the results.
Outcome 4:
1. Is there a main effect of IV A?
   Overall means of A1 _____ A2 _____ A3 _____
2. Is there a main effect of IV B?
   Overall means of B1 _____ B2 _____ B3 _____
3. Is there an A X B interaction?
4. Graph the results.
Outcome 5:
1. Is there a main effect of IV A?
   Overall means of A1 _____ A2 _____
2. Is there a main effect of IV B?
   Overall means of B1 _____ B2 _____
3. Is there a main effect of IV C?
   Overall means of C1 _____ C2 _____
4. Are there any two-way interactions? (Create a data matrix or figure to help you determine the answer.)
   A X B interaction: A X C interaction: B X C interaction:
5. Is there a three-way interaction? (Draw a figure for the A X B interaction at C1, and another for the A X B interaction at C2, to help you answer.)
Sample Answers for Handout 5:
Outcome 1:
1. Is there a main effect of IV A? Yes
   Overall means: A1: 5.0 A2: 5.5
2. Is there a main effect of IV B? Yes
   Overall means: B1: 3.5 B2: 7.0
3. Is there an A X B interaction? Yes
Outcome 2:
1. Is there a main effect of IV A? No
   Overall means: A1: 30 A2: 30
2. Is there a main effect of IV B? Yes
   Overall means: B1: 25 B2: 35
3. Is there an A X B interaction? Yes
Outcome 3:
1. Is there a main effect of IV A? No
   Overall means: A1: 3 A2: 3 A3: 3
2. Is there a main effect of IV B? Yes
   Overall means: B1: 2 B2: 4
3. Is there an A X B interaction? Yes
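Main effects and interactions can be checked directly from a table of cell means: a main effect is a difference between marginal (row or column) means, and an interaction appears when the simple effect of one factor changes across levels of the other. A minimal Python sketch, using illustrative cell values chosen to be consistent with the Outcome 3 answers above (all A means 3; B1 mean 2, B2 mean 4):

```python
# Cell means for a hypothetical 3 x 2 design (rows A1-A3, columns B1-B2).
cells = {
    ("A1", "B1"): 1, ("A1", "B2"): 5,
    ("A2", "B1"): 2, ("A2", "B2"): 4,
    ("A3", "B1"): 3, ("A3", "B2"): 3,
}

def marginal_mean(level, position):
    """Average all cells sharing a level of one factor (0 = A, 1 = B)."""
    vals = [m for key, m in cells.items() if key[position] == level]
    return sum(vals) / len(vals)

a_means = {a: marginal_mean(a, 0) for a in ("A1", "A2", "A3")}
b_means = {b: marginal_mean(b, 1) for b in ("B1", "B2")}
print(a_means)  # all 3.0 -> no main effect of A
print(b_means)  # B1: 2.0, B2: 4.0 -> main effect of B

# Interaction check: the simple effect of B (B2 - B1) differs across A.
simple_effects = {a: cells[(a, "B2")] - cells[(a, "B1")] for a in a_means}
print(simple_effects)  # {'A1': 4, 'A2': 2, 'A3': 0} -> A X B interaction
```

If the simple effects were all equal, the lines in the graph would be parallel and there would be no interaction.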
Outcome 4:
1. Is there a main effect of IV A? Yes
   Overall means: A1: 6.00 A2: 7.67 A3: 9.67
2. Is there a main effect of IV B? Yes
   Overall means: B1: 5.67 B2: 7.00 B3: 10.67
3. Is there an A X B interaction? Yes
Outcome 5:
1. Is there a main effect of IV A? Yes
   Overall means: A1: 4.25 A2: 6.25
2. Is there a main effect of IV B? Yes
   Overall means: B1: 7.5 B2: 3.0
3. Is there a main effect of IV C? No
   Overall means: C1: 5.25 C2: 5.25
4. Are there any two-way interactions? A X B: Yes A X C: Yes B X C: No
5. Is there a three-way interaction? Yes
Handout 6
Describing Correlation Coefficients
Describe in words the nature of the relationship between each pair of variables as indicated by the value of the correlation coefficient. Be sure to include:
a. an appropriate graph of the relationship.
b. the direction of the relationship.
c. the strength of the relationship.
d. a verbal description of the way the variables "go together."
1. r = -.96 between craving for pizza and ability to concentrate on studying.
2. r = +.02 between length of marriage and marital satisfaction.
3. r = +.55 between parent and child intelligence test scores.
4. r = -.90 between amount of alcohol consumed and performance on a motor coordination task.
5. r = +.75 between amount of money won and the number of hot dogs purchased at race tracks.
6. r = +.67 between scores on a hyperactivity scale and an aggressiveness scale.
7. r = -.82 between a job applicant's age and likelihood of being hired.
8. r = +.06 between the number of yearly predictions by psychics and the number of correct predictions.
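Students can check their verbal descriptions against a small Python helper. The sign of r gives the direction; the cutoffs used here (.1, .3, .7) are conventional rules of thumb for strength, not values from the text:

```python
# Describe the direction and approximate strength of a correlation coefficient.
def describe_r(r):
    direction = "positive" if r > 0 else "negative" if r < 0 else "no"
    mag = abs(r)
    if mag >= 0.7:
        strength = "strong"
    elif mag >= 0.3:
        strength = "moderate"
    elif mag > 0.1:
        strength = "weak"
    else:
        strength = "negligible"
    return f"{strength} {direction} relationship (r = {r:+.2f})"

print(describe_r(-0.96))  # strong negative relationship (r = -0.96)
print(describe_r(+0.02))  # negligible positive relationship (r = +0.02)
print(describe_r(+0.55))  # moderate positive relationship (r = +0.55)
```

For item 1, for example, this matches the expected answer: a strong negative relationship, meaning that as craving for pizza increases, ability to concentrate decreases.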
Handout 7
Statistical Decisions
Answer the same five questions for each of the experiments described below.
Statistical Decision 1:
A researcher compared the number of cavities of children who had used either Toothpaste brand X or Toothpaste brand Y for a year. At the end of the year, the researcher found that the children who had used brand X had significantly fewer cavities than the children who had used
brand Y. The difference was significant at the .05 level.
1. What is the null hypothesis?
2. What is the research hypothesis?
3. What would be the Type I error?
4. What would be the Type II error?
5. What is the probability of a Type I error?
Statistical Decision 2:
The effect of a monetary incentive on performance on a cognitive task was investigated. The researcher predicted that greater monetary incentives would result in higher performance. Participants were told that they would receive either 5 cents, 25 cents, or 50 cents for each word puzzle that they correctly solved. A statistical test showed that there was a significant effect of incentive on performance at the .01 level. The greater the incentive, the more puzzles were solved.
Statistical Decision 3:
A researcher investigated whether job applicants with popular names are viewed more favorably than equally qualified applicants with less popular names. Participants in one group read a resume of a job applicant with a popular name; participants in the other group read the same resume but the applicant had an unpopular name. There were five participants in each group. The results showed that the difference in evaluation of the applicants was not significant (p < .22).
Statistical Decision 4:
A social psychologist predicted that ratings of an individual’s social desirability would be influenced by their physical attractiveness. Participants received photos of attractive, average, or unattractive individuals. The researcher found that the more attractive the individual, the higher their rating of social desirability. This difference was significant at the .05 level.
Sample Answers for Handout 7:
Statistical Decision 1:
1. What is the null hypothesis?
   There will be no difference in cavities between users of Toothpaste Brand X and Toothpaste Brand Y.
2. What is the research hypothesis?
   There will be a difference in cavities between users of Toothpaste Brand X and Toothpaste Brand Y.
3. What would be the Type I error?
   Claiming there is a difference in cavities between users of Toothpaste Brand X and Toothpaste Brand Y when there actually is not.
4. What would be the Type II error?
   Claiming there is no difference in cavities between users of Toothpaste Brand X and Toothpaste Brand Y when there actually is a difference.
5. What is the probability of a Type I error? .05
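The answer to question 5 reflects a general point: the probability of a Type I error equals the significance level. This can be illustrated by simulation: when the null hypothesis is true (both groups come from the same population), a two-tailed test at the .05 level falsely rejects about 5% of the time. A Python sketch, using a simplified z-test with known variance rather than the test the researcher would actually run:

```python
# Simulate the Type I error rate when the null hypothesis is true.
import math
import random

random.seed(1)
n, trials, crit = 30, 20000, 1.96  # two-tailed test, alpha = .05
false_alarms = 0
for _ in range(trials):
    # Both "brands" draw from the same population, so H0 is true.
    g1 = [random.gauss(0, 1) for _ in range(n)]
    g2 = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(g1) / n - sum(g2) / n) / math.sqrt(2 / n)
    if abs(z) > crit:       # "significant" difference = false positive
        false_alarms += 1
rate = false_alarms / trials
print(rate)  # approximately 0.05
```

Lowering alpha to .01, as in Statistical Decision 2, would reduce the false-alarm rate to about 1% at the cost of a higher Type II error rate.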
Statistical Decision 2:
1. What is the null hypothesis?
   Greater monetary incentives will not result in higher performance.
2. What is the research hypothesis?
   Greater monetary incentives will result in higher performance.
3. What would be the Type I error?
   Claiming monetary incentive resulted in higher performance when it actually did not.
4. What would be the Type II error?
   Claiming monetary incentive did not result in higher performance when it actually did.
5. What is the probability of a Type I error? .01
Statistical Decision 3:
1. What is the null hypothesis?
   There will be no difference in ratings between applicants with popular names and applicants with less popular names.
2. What is the research hypothesis?
   There will be a difference in ratings between applicants with popular names and applicants with less popular names.
3. What would be the Type I error?
   Claiming there is a difference in ratings when there actually is not.
4. What would be the Type II error?
   Claiming there is no difference in ratings when there actually is.
5. What is the probability of a Type I error? Because the result was not significant, the null hypothesis was not rejected, so a Type I error cannot occur here; the relevant risk is a Type II error.
Statistical Decision 4:
1. What is the null hypothesis?
   There will be no difference in ratings of social desirability between attractive, average, and unattractive individuals.
2. What is the research hypothesis?
   There will be a difference in ratings of social desirability between attractive, average, and unattractive individuals.
3. What would be the Type I error?
   Claiming there is a difference in ratings of social desirability when there actually is not.
4. What would be the Type II error?
   Claiming there is no difference in ratings of social desirability when there actually is.
5. What is the probability of a Type I error? .05
Handout 8
Is it a Correlation or is it an Experiment?
For each of the following studies, indicate whether it is a correlational study or an experiment. If it is correlational, state whether the correlation is positive or negative. If it is an experiment, identify the independent and dependent variables.
1. Two researchers are interested in the effects of smoking marijuana on learning. They place several ferrets in a box. If the ferrets press a lever they will get a food pellet. Some ferrets are exposed to marijuana smoke three times a day, some once a day, and some not at all. After two weeks of exposure, the researchers measure which groups of ferrets learn to push the lever faster. They find that less marijuana smoke is associated with faster learning.
2. Researchers are interested in the effects of bystanders on altruistic (helping) behaviors. They have someone pretend to have a seizure when either several people are present or only one person is present, and then they observe whether helping behaviors are affected. They find that people are more likely to help with fewer bystanders.
3. Researchers are interested in the effects of patterns of TV watching on children's aggressive behavior. They have children keep a diary of what they are watching and for how long and then compare it to school reports of aggressive actions. They find that the more aggressive TV a child watches, the more schools report those children as aggressive.
4. Dr. Mnemonic is interested in the effect of types of questioning on memory. She has several subjects watch a video of car accidents and then asks some of the subjects leading questions like "was the car red?" Others are asked open-ended questions like "what color was the car?" She then looks at how people remember events based on question type.
5. Dr. Ratrunner is interested in predictors of teen pregnancy. He hands out surveys to 1,000 teenagers, asking about their sexual behaviors, background information, and a slew of miscellaneous items. He finds that the best predictor of teen use of birth control is the number of electrical appliances in the family house.
Handout 8: Suggested Answers
1. Experiment: IV: marijuana smoke; DV: speed of learning
2. Correlation: Negative
3. Correlation: Positive
4. Experiment: IV: type of question; DV: memory
5. Correlation: Positive