
Volume 29 / Number 1 / 2017

Journal of Media Psychology
Theories, Methods, and Applications

Editor-in-Chief
Nicole Krämer

Associate Editors
Gary Bente, Nick D. Bowman, Jesse Fox, Christoph Klimmt, Diana Rieger

Special Issue: New Evidentiary Standards for the Science of Technology and Human Behavior




Journal of Media Psychology
Theories, Methods, and Applications

Volume 29, No. 1, 2017


Editor-in-Chief: Nicole Krämer, Social Psychology: Media and Communication, Department of Computer Science and Applied Cognitive Science, University of Duisburg-Essen, Forsthausweg 2, 47057 Duisburg, Germany, Tel. +49 203 379-2482, Fax +49 203 379-3670, E-mail nicole.kraemer@uni-due.de

Editorial Assistant: German Neubaum, Social Psychology: Media and Communication, University of Duisburg-Essen, Forsthausweg 2, Office LE 243, 47057 Duisburg, Germany, Tel. +49 203 379-2442, Fax +49 203 379-3670, E-mail german.neubaum@uni-due.de

Associate Editors: Gary Bente, University of Cologne, Cologne, Germany, E-mail bente@uni-koeln.de; Nick D. Bowman, West Virginia University, Morgantown, WV, USA, E-mail Nicholas.Bowman@mail.wvu.edu; Jesse Fox, The Ohio State University, Columbus, OH, USA, E-mail fox775@osu.edu; Christoph Klimmt, Hanover University of Music, Drama and Media, Department of Journalism and Communication Research, Germany, E-mail christoph.klimmt@ijk.hmtm-hannover.de; Diana Rieger, University of Mannheim, Germany, E-mail diana.rieger@uni-mannheim.de

Editorial Board

Markus Appel (Koblenz-Landau, Germany) Florian Arendt (München, Germany) Omotayo Banjo (Cincinnati, OH, USA) Anne Bartsch (München, Germany) Paul Bolls (Columbia, MO, USA) Johannes Breuer (Cologne, Germany) Jennings Bryant (Tuscaloosa, AL, USA) Caleb Carr (Normal, IL, USA) Elizabeth Cohen (Morgantown, WV, USA) Enny Das (Nijmegen, The Netherlands) Kevin Durkin (Glasgow, UK) Allison Eden (Amsterdam, The Netherlands) Nicole Ellison (Ann Arbor, MI, USA) Malte Elson (Bochum, Germany) David Ewoldsen (Columbus, OH, USA) Christopher Ferguson (DeLand, FL, USA) Jesse Fox (Columbus, OH, USA) Sabine Glock (Wuppertal, Germany) Melanie Green (Chapel Hill, NC, USA) Matthew Grizzard (Buffalo, NY, USA) Dorothée Hefner (Hannover, Germany) Shirley S. Ho (Singapore) Matthias Hofer (Zurich, Switzerland) Juan José Igartua (Salamanca, Spain) Jimmy Ivory (Blacksburg, VA, USA) Jeroen Jansz (Rotterdam, The Netherlands) Julia Kneer (Rotterdam, The Netherlands) Elly Konijn (Amsterdam, The Netherlands) Maja Krakowiak (Colorado Springs, CO, USA)

Editorial Board (continued): Annie Lang (Bloomington, IN, USA) Eun-Ju Lee (Seoul, South Korea) Jörg Matthes (Vienna, Austria) Peter Nauroth (Marburg, Germany) Anne Oeldorf-Hirsch (Mansfield, CT, USA) Jochen Peter (Amsterdam, The Netherlands) Daniel Pietschmann (Chemnitz, Germany) Robert F. Potter (Bloomington, IN, USA) Arthur A. Raney (Tallahassee, FL, USA) Leonard Reinecke (Mainz, Germany) Meghan Sanders (Baton Rouge, LA, USA) Frank Schwab (Würzburg, Germany) Stephan Schwan (Tübingen, Germany) Michael Slater (Columbus, OH, USA) Kaveri Subrahmanyam (Los Angeles, CA, USA) Ron Tamborini (East Lansing, MI, USA) Catalina Toma (Madison, WI, USA) Sabine Trepte (Hohenheim, Germany) Mina Tsay-Vogel (Boston, MA, USA) Dagmar Unz (Würzburg-Schweinfurt, Germany) Sonja Utz (Tübingen, Germany) Sebastián Valenzuela (Santiago de Chile, Chile) Brandon van der Heide (East Lansing, MI, USA) Christian von Sikorski (Vienna, Austria) Peter Vorderer (Mannheim, Germany) Patrick Weber (Hohenheim, Germany) René Weber (Santa Barbara, CA, USA) Stephan Winter (Duisburg-Essen, Germany) Mike Yao (Hong Kong)

Publisher: Hogrefe Publishing, Merkelstr. 3, 37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail publishing@hogrefe.com, Web www.hogrefe.com. North America: Hogrefe Publishing, 7 Bulfinch Place, 2nd floor, Boston, MA 02114, USA, Tel. (866) 823-4726, Fax (617) 354-6875, E-mail publishing@hogrefe.com

Production: Regina Pinks-Freybott, Hogrefe Publishing, Merkelstr. 3, 37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail production@hogrefe.com

Subscriptions: Hogrefe Publishing, Herbert-Quandt-Str. 4, 37081 Göttingen, Germany, Tel. +49 551 99950-999, Fax +49 551 99950-998

Advertising/Inserts: Melanie Beck, Hogrefe Publishing, Merkelstr. 3, 37085 Göttingen, Germany, Tel. +49 551 99950-423, Fax +49 551 99950-425, E-mail marketing@hogrefe.com

ISSN: ISSN-L 1864-1105, ISSN-Print 1864-1105, ISSN-Online 2151-2388

Copyright Information: © 2017 Hogrefe Publishing. This journal as well as the individual contributions and illustrations contained within it are protected under international copyright law. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without prior written permission from the publisher. All rights, including translation rights, reserved.

Publication: Published in 4 issues per annual volume. The Journal of Media Psychology is the continuation of Zeitschrift für Medienpsychologie (ISSN 1617-6383), the last annual volume of which (Volume 19) was published in 2007.

Subscription Prices: Calendar year subscriptions only. Rates for 2017: Institutions US $324.00 / €249.00; Individuals US $159.00 / €114.00 (all plus US $16.00 / €12.00 postage & handling). Single issue US $81.00 / €62.50 (plus postage & handling).

Payment: Payment may be made by check, international money order, or credit card to Hogrefe Publishing, Merkelstr. 3, 37085 Göttingen, Germany. US and Canadian subscriptions can also be ordered from Hogrefe Publishing, 7 Bulfinch Place, 2nd floor, Boston, MA 02114, USA.

Electronic Full Text: The full text of the Journal of Media Psychology is available online at http://econtent.hogrefe.com and in PsycARTICLES.

Abstracting/Indexing Services: Abstracted/indexed in Current Contents/Social and Behavioral Sciences (CC/S&BS), Social Sciences Citation Index (SSCI), IBR, IBZ, PsycINFO, PsycLit, PSYNDEX, and Scopus. Impact Factor (2015): 0.694

Journal of Media Psychology (2017), 29(1)

© 2017 Hogrefe Publishing


Contents

Editorial
  The Science of Technology and Human Behavior: Standards, Old and New
  Malte Elson and Andrew K. Przybylski  1

Pre-Registered Reports
  The Impact of Immersive Technology on Nature Relatedness and Pro-Environmental Behavior
  Monica Soliman, Johanna Peetz, and Mariya Davydenko  8

  Effects of Subtitles, Complexity, and Language Proficiency on Learning From Online Education Videos
  Tim van der Zee, Wilfried Admiraal, Fred Paas, Nadira Saab, and Bas Giesbers  18

  "Drive the Lane; Together, Hard!": An Examination of the Effects of Supportive Coplaying and Task Difficulty on Prosocial Behavior
  Johannes Breuer, John Velez, Nicholas Bowman, Tim Wulf, and Gary Bente  31

  Video Game Use as Risk Exposure, Protective Incapacitation, or Inconsequential Activity Among University Students: Comparing Approaches in a Unique Risk Environment
  Adrienne Holz Ivory, James D. Ivory, and Madison Lanier  42

  Interactive Narratives Affecting Social Change: A Closer Look at the Relationship Between Interactivity and Prosocial Behavior
  Sharon T. Steinemann, Glena H. Iten, Klaus Opwis, Seamus F. Forde, Lars Frasseck, and Elisa D. Mekler  54

Meeting Calendar  67


Editorial

The Science of Technology and Human Behavior: Standards, Old and New

Malte Elson (1) and Andrew K. Przybylski (2,3)

(1) Educational Psychology Research Group, Ruhr University Bochum, Germany
(2) Oxford Internet Institute, University of Oxford, UK
(3) Department of Experimental Psychology, University of Oxford, UK

The Present

Technology and Human Behavior – a title that could not be more generic for a special issue. On its face, the phenomena examined in this issue are not distinct from others published in the Journal of Media Psychology (JMP): Can immersive technology be used to promote pro-environmental behaviors? (Soliman, Peetz, & Davydenko, 2017). How do subtitles and the complexity of MOOC-type videos affect learning outcomes? (van der Zee, Admiraal, Paas, Saab, & Giesbers, 2017). Does cooperative video game play foster prosocial behavior? (Breuer, Velez, Bowman, Wulf, & Bente, 2017). What is the role of video game use in the unique risk environment of college students? (Holz Ivory, Ivory, & Lanier, 2017). Do interactive narratives have the potential to advocate social change? (Steinemann, Iten, Opwis, Forde, Frasseck, & Mekler, 2017).

What makes this issue special is not a thematic focus but the nature of the scientific approach to hypothesis testing: It is explicitly confirmatory. All five studies are registered reports, which are reviewed in two phases. First, the theoretical background, hypotheses, methods, and analysis plan of a study are peer-reviewed before the data are collected. If these are evaluated as sound, the study receives an "in-principle" acceptance, and the researchers proceed to conduct it (taking potential changes or additions suggested by the reviewers into consideration). Consequently, the data collected can be used as a true (dis-)confirmatory hypothesis test. In the second phase, the soundness of the analyses and the discussion section are reviewed, but the publication decision is not contingent on the outcome of the study (see our call for papers; Elson, Przybylski, & Krämer, 2015). All additional, nonpreregistered analyses are clearly labeled as exploratory and serve to discover alternative explanations or generate new hypotheses.

Further, the authors were required to provide a sampling plan designed to achieve at least 80% statistical power (or a comparable criterion for Bayesian analysis strategies) for all of their confirmatory hypothesis tests, and to make all materials, data, and analysis scripts freely available on the Open Science Framework (OSF) at https://osf.io/5cvkr/. We believe that making these materials available to anyone increases the value of the research, as it allows others to reproduce analyses, replicate the studies, or build on and extend the empirical foundation. The five studies are thus the first in JMP to employ these new practices. It is our hope that these contributions will serve as an inspiration and model for other media researchers, and encourage scientists studying media to preregister designs and to share their data and materials openly.

All research proposals were reviewed by content experts from within the field and by additional outside experts in methodology and statistics. Their reviews, too, are available on the OSF, and we deeply appreciate their contributions to improving each individual research report and their commitment to open and reproducible science: Marko Bachl, Chris Chambers, Julia Erdmann, Pete Etchells, Alexander Etz, Karin Fikkers, Jesse Fox, Chris Hartgerink, Moritz Heene, Joe Hilgard, Markus Huff, Rey Junco, Daniël Lakens, Benny Liebold, Patrick Markey, Jörg Matthes, Candice Morey, Richard Morey, Michèle Nuijten, Elizabeth Page-Gould, Daniel Pietschmann, Michael Scharkow, Felix Schönbrodt, Cary Stothart, Morgan Tear, Netta Weinstein, and additional reviewers who would like to remain anonymous.

Finally, we would like to extend our sincerest gratitude to JMP's Editor-in-Chief Nicole Krämer and editorial assistant German Neubaum for their support and guidance from the conception to the publication of this issue.

Journal of Media Psychology (2017), 29(1), 1–7. DOI: 10.1027/1864-1105/a000212



The Past

Concerns have been raised about the integrity of the empirical foundation of psychological science, such as low average statistical power and publication bias (Schimmack, 2012), the poor availability of data (Wicherts, Borsboom, Kats, & Molenaar, 2006), and the rate of statistical reporting errors (Nuijten, Hartgerink, van Assen, Epskamp, & Wicherts, 2015). Currently, there is little information about the extent to which these issues also exist within the media psychology literature. Therefore, to provide a first prevalence estimate, and to illustrate how some of the practices adopted for this special issue can help reduce these problems, we surveyed the research designs, availability of data, statistical reporting errors, and statistical power of studies published in the traditional format in JMP. We analyzed the research published in JMP between volume 20/1, when it became an English-language publication, and volume 28/2 (the most recent issue when this analysis was planned). The raw data, analysis code, and code book are freely available at https://osf.io/5cvkr/.

Sample of Publications

Publications in JMP represent a rich range of empirical approaches. Of the N = 146 original research articles[1] identified, n_emp = 131 (89.7%) report data from at least one empirical study (147 studies in total[2]). Of those, more than half are experiments (54.4%) or quasi-experiments (8.8%), followed by cross-sectional surveys (23.8%) and longitudinal studies (7.5%). The rest are content analyses, observational studies, or interview studies (5.4%).

Availability of Data and Materials

Recently, a number of open science initiatives, including the Transparency and Openness Promotion Guidelines, the Peer Reviewers' Openness Initiative, and the Commitment to Research Transparency[3], have been successful in raising awareness of the benefits of open science and in increasing the rate of publicly shared datasets (Kidwell et al., 2016). Historically, the availability of research data in psychology has been poor (Wicherts et al., 2006). Our sample of JMP publications suggests that media psychology is no exception, as we were not able to identify a single publication reporting a link to research data in a public repository or in the journal's supplementary materials.[4]

Statistical Reporting Errors

Most conclusions in empirical media psychology, and in psychology overall, are based on Null Hypothesis Significance Tests (NHSTs). It is therefore important that all statistical parameters and NHST results be reported accurately. However, a recent study by Nuijten et al. (2015) indicates a high rate of reporting errors in psychological research reports. The consequences of such inconsistencies are potentially serious, as the analyses reported and the conclusions drawn may not be supported by the data. Similar concerns have been voiced about published empirical studies in communication research (Vermeulen et al., 2015).

To make sure such inconsistencies were avoided in this special issue, we validated all accepted research reports with statcheck (version 1.2.2; Epskamp & Nuijten, 2015), a package for the statistical programming language R (R Core Team, 2016) that works like a spellchecker for NHSTs by automatically extracting reported statistics from documents and recomputing[5] p-values. For our own analyses, we downloaded all n_emp = 131 JMP publications as HTML files[6] and scanned them with statcheck to obtain an estimate of the reporting error rate in JMP.

Statcheck extracted a total of K = 1,036 NHSTs[7] reported in n_nhst = 98 articles. Initially, 134 tests were flagged as inconsistent (i.e., the reported test statistic and degrees of freedom do not match the reported p-value), of which 27 were grossly inconsistent (the reported p-value is < .05 while the recomputed p-value is > .05, or vice versa). For one paper, a correction had been published that reports a consistent p-value. A number of inconsistent tests were marked as being consistent with one-tailed testing; we therefore manually checked those papers for any indication that one-tailed rather than two-tailed tests were conducted. Four tests were explicitly one-tailed in the corresponding publications, reducing the count to 129 inconsistent NHSTs (12.5% of K), of which 23 (2.2% of K) were grossly inconsistent. Forty-one publications (41.8% of n_nhst) reported at least one inconsistent NHST (range 1–21), and 16 publications (16.3% of n_nhst) reported at least one grossly inconsistent NHST (range 1–4) (see Figure 1). Thus, a substantial proportion of publications in JMP seem to contain inaccurately reported statistical analyses, some of which might affect the conclusions drawn from them.

[1] Editorials, calls for papers, volume information tables, meeting calendars, and other announcements were excluded.
[2] Pilot studies were excluded.
[3] https://cos.io/top/; https://opennessinitiative.org; http://www.researchtransparency.org
[4] It is, of course, entirely possible that some authors have made their data publicly available without clarifying this in the publication.
[5] p-values are recomputed from the reported test statistics and degrees of freedom. Thus, for the purpose of recomputation, it is assumed that test statistics and degrees of freedom are correctly reported, and that any inconsistency is caused by errors in the reporting of p-values. The actual inconsistencies, however, can just as well be caused by errors in the reporting of test statistics and/or degrees of freedom.
[6] Due to copyright restrictions, these are not available at https://osf.io/5cvkr/, but will be shared on request.
[7] Note that statcheck might not extract NHSTs from figures, tables, or supplementary materials, or when their reporting style deviates from the APA guidelines. For further details on the extraction method, see Nuijten et al. (2015).

Types of Errors

Many of the inconsistencies are probably clerical errors that do not alter the inferences or conclusions in any way. For example, in 20 cases the authors reported p = .000, which is mathematically impossible (for each of these, the recomputed p-value was < .001). Other inconsistencies might be explained by authors not declaring that their tests were one-tailed (which is relevant for their interpretation). Of course, in many cases we could not determine the source of the errors without access to the study data or analysis scripts.

Although nearly one in six of the papers with NHSTs contains gross inconsistencies potentially affecting reported conclusions, caution is advised when speculating about the causes. As with other inconsistencies, random human error certainly plays an important part. However, with some concern, we observe that it is unlikely to be the only cause: In 19 of the 23 cases, the reported p-value was equal to or smaller than .05 while the recomputed p-value was larger than .05, whereas the opposite pattern was observed in only four cases. If incorrectly reported p-values resulted merely from clerical errors, we would expect inconsistencies in both directions to occur at approximately equal frequencies.

We acknowledge that before the development of valuable tools like statcheck, there was little awareness of the high prevalence of reporting errors in psychology generally (including media psychology).
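The force of this directional argument can be made concrete with a quick binomial tail calculation (our own back-of-the-envelope illustration, not part of the published analysis, and it assumes errors are independent): if the direction of a reporting error were random, the number of the 23 gross inconsistencies erring toward significance would follow a Binomial(23, .5) distribution.

```python
from math import comb

def binom_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    errors in a given direction if the error direction were random."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 19 of the 23 grossly inconsistent p-values erred toward significance;
# under a 50/50 clerical-error model this is very unlikely:
print(round(binom_tail(19, 23), 4))  # → 0.0013
```

A probability of roughly .001 of observing a split this lopsided by chance is what motivates the suspicion that random error is not the whole story.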
All of these inconsistencies can easily be detected using the freely available R package statcheck or via www.statcheck.io for those who do not use R. JMP will adopt this practice for all forthcoming papers prior to publication, and we recommend researchers use statcheck for their own manuscripts and for the works of others in their role as reviewers.
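The core of such a check can be illustrated in a few lines. The Python sketch below is a deliberate simplification of the idea, not the statcheck package itself: it handles only z statistics reported in an APA-like style (the function names are ours), whereas the real package also parses t, F, r, and χ² values and recomputes p from the appropriate distribution.

```python
import math
import re

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def check_nhst(report):
    """Parse a string like 'z = 1.80, p = .03', recompute the p-value,
    and flag (gross) inconsistencies, in the spirit of statcheck."""
    m = re.match(r"z\s*=\s*(-?[\d.]+),\s*p\s*=\s*(\.\d+)", report)
    z, p_reported = float(m.group(1)), float(m.group(2))
    p_recomputed = two_sided_p(z)
    decimals = len(m.group(2)) - 1  # digits reported after the decimal point
    inconsistent = round(p_recomputed, decimals) != p_reported
    gross = inconsistent and (p_reported < .05) != (p_recomputed < .05)
    return p_recomputed, inconsistent, gross

# Reported p = .03, but z = 1.80 implies p ≈ .07: grossly inconsistent.
print(check_nhst("z = 1.80, p = .03"))
```

Running the real statcheck (or www.statcheck.io) on a manuscript automates exactly this comparison across every APA-style test it can find.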

Figure 1. Nested chart of publication characteristics in JMP volumes 20/1 to 28/2. Each square represents one paper. Figure created with the R package waffle (version 0.6.0; Rudis & Gandy, 2016).

Sample Sizes and Statistical Power

High statistical power is paramount to reliably detect true effects in a sample and, thus, to correctly reject the null hypothesis when it is false. Further, low power reduces the confidence that a statistically significant result actually reflects a true effect (Button et al., 2013; Schimmack, 2012). A generally low-powered field is more likely to yield unreliable estimates of effect sizes and low reproducibility of results. We are not aware of any previous attempts to estimate average power in media psychology.

One obvious strategy for estimating average statistical power is to examine the reported power analyses in empirical research articles. For publications in JMP, however, this is difficult: Searching all papers for the word "power" yielded only a single article reporting an a priori determined sample size. This is not to say media psychologists are generally unaware of the concept of power. In 19 further articles power is mentioned, in many cases either to demonstrate observed or post hoc power (which is redundant with the reported NHSTs; see, e.g., Lakens, 2014), to suggest that larger samples should be used in future research, or to explain that an observed nonsignificant "trend" would have been significant had the statistical power been higher. Another strategy is to examine the power to detect effects of different sizes, e.g., using Cohen's (1988) rules of thumb[8], given the average sample sizes found in the literature.

The median sample size in JMP is 139, with considerable range across all experiments and surveys (see Table 1). As in other fields, surveys tend to have healthy sample sizes apt to reliably detect medium to large relationships between variables. The median sample size for survey studies is 327, allowing researchers to detect small bivariate correlations of r = .1 at 44% power (rs of .3 and .5 both at > 99%).[9] Longitudinal research exhibits similar characteristics, with a median sample size of 378.50, allowing researchers to detect r = .1 at 49% power (rs of .3 and .5 at > 99%). For experiments (including quasi-experiments), the outlook is rather different, with a median sample size of 107. To determine average power in experimental designs, two further parameters must be considered: (a) the study design (between-subjects or within-subjects) and (b) the number of cells (or conditions) realized. Across all types of designs (see Table 2), the median cell size is 30.67. Thus, the average power of experiments published in JMP to detect small differences between conditions (d = .20) is 12%; it is 49% for medium effects (d = .50) and 87% for large effects (d = .80).

Again, we currently do not have reliable estimates of the average true, expected, or even observed effect size in media psychology. But even assuming that the effects examined in the media psychology literature are as large as those in social psychology (average d = .43 according to Richard, Bond, & Stokes-Zoota, 2003), our results indicate that the chance that an experiment published in JMP will detect them (38%) is worse than flipping a coin – an operation that would also be considerably less expensive. We do not think this is a sustainable way of accumulating scientific knowledge and spending (public) resources.

Table 1. Sample sizes by research design in empirical studies published in JMP (volumes 20/1 to 28/2)

Research design            n    MD      M       SD      Min  Max
(Quasi-)experiments        93   107     139.71  130.53  18   748
  Between-subjects         71   119     149.03  117.34  29   748
  Mixed                    12   54      73.58   44.24   29   176
  Within-subjects          10   40      152.90  240.77  18   666
Cross-sectional surveys    35   327     500.51  531.07  58   2582
Longitudinal studies       8    378.50  595.00  489.70  97   1536
Total                      136  139     259.35  355.94  18   2582

Notes. n = number of published studies; MD = median sample size; M = mean sample size; SD = standard deviation of the sample size; Min/Max = smallest/largest reported sample size.

Table 2. Average cell sizes and power of (quasi-)experiments published in JMP (volumes 20/1 to 28/2) for different effect sizes

Design   n   MD/cell  M/cell  SD/cell  Min/cell  Max/cell  1−β (d=.2)  1−β (d=.5)  1−β (d=.8)
Between  71  30       43.57   46.87    14.50     374       12%         48%         86%
Mixed    12  26       34.25   21.71    14.50     88        11%         42%         81%
Within   10  40       152.90  240.77   18        666       23%         87%         99%
Total    93  30.67    54.34   93.18    14.50     666       12%         49%         87%

Notes. n = number of published studies; MD/cell = median cell size; M/cell = mean cell size; SD/cell = standard deviation of the cell size; Min/cell and Max/cell = smallest/largest reported cell size; 1−β = power to detect small (d = .2), medium (d = .5), or large (d = .8) differences between cells. For between-subjects designs, mixed designs, and the total, we assumed independent t-tests; for within-subjects designs, we assumed dependent t-tests.

[8] Small: r = .10 / d = .20; medium: r = .30 / d = .50; large: r = .50 / d = .80; α fixed at .05, all tests two-tailed. Power analyses were conducted with the R package pwr (version 1.20; Champely, 2016).
[9] Naturally, when anticipating more complex relationships between multiple variables, those numbers are dramatically different.
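The power figures reported above were computed with R's pwr package, which uses the noncentral t distribution. They can be roughly cross-checked in a few lines of Python (our own sketch, not the published analysis code) via the normal approximation, which lands within about a percentage point of the exact values; for example, at a cell size of 30 and d = .5 it yields 49% versus pwr's exact 48%.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(n_per_cell, d, alpha=0.05):
    """Approximate power of a two-sided independent-samples t-test,
    using the normal approximation to the noncentral t distribution."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    delta = d * sqrt(n_per_cell / 2)  # approximate noncentrality parameter
    return nd.cdf(delta - z_crit) + nd.cdf(-delta - z_crit)

# Median cell size of 30 (Table 2), Cohen's small/medium/large effects:
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: {power_two_sample(30, d):.0%}")
```

The same function makes the coin-flip comparison in the text easy to reproduce: plugging in d = .43 at the typical cell size gives power in the neighborhood of the reported 38%.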



The Future

Psychological and communication scientists use a wide range of methodologies to enhance our understanding of the role of media in human behavior. Unfortunately, as in other fields of social science (Pashler & Harris, 2012), much of what we think we know may rest on a tenuous empirical foundation. As a first estimate for this field, our analysis of JMP publications indicates that the materials and data of few, if any, media psychology reports are openly available, that many studies lack the statistical power required to reliably detect the effects they set out to detect, and that a substantial number contain statistical errors, some of which might alter the conclusions drawn. Although these observations are deeply worrying, they provide some clear guiding points on how to improve our field.

Self-Reflection

Our observations could lead readers to believe that we are concerned about the quality of publications in JMP in particular. If anything, the opposite is true, as this journal recently committed itself to a number of changes in its publishing practices to promote open, reproducible, high-quality research. The space provided by the editor-in-chief for our analysis is simply another step in this phase of sincere self-reflection. Similar analyses in other fields suggest that the issues we discuss here go far beyond media psychology (or JMP): About half of all articles in major psychology journals report at least one inconsistent NHST (Nuijten et al., 2015) and at least one mean value that is inconsistent with the reported sample size and integer data (Brown & Heathers, 2016). Estimates of average statistical power in social psychology are similar to those in JMP (Fraley & Vazire, 2014), but as low as 18% in neuroscience (Button et al., 2013). Thus, we would like these findings, troubling as they are, to be taken not as a verdict but as an opportunity for researchers, journals, and organizations to reflect similarly on their own practices and hence improve the field as a whole.

Construction and Testing of Theories

One key area which would be improved in response to these challenges is how researchers create, test, and refine the psychological theories used to study media. Like other psychology subfields, media psychology is characterized by the frequent emergence of new theories which purport to explain phenomena of interest (Anderson, 2016). This generativity may, in part, be a consequence of the fuzzy boundaries between the exploratory and confirmatory modes of social science research (Wagenmakers, Wetzels, Borsboom, van der Maas, & Kievit, 2012).

Exploratory work, research meant to introduce new ideas and generate hypotheses, often involves looking at the results of a study and flexibly listening to "what the data have to say." This mode is fundamental to social science in general and also creatively informs media psychology. However, conclusions drawn from this mode are of limited strength, as they may reflect chance variations in the data. Confirmatory work, by contrast, involves theory testing, a process that requires research questions and hypotheses to be stated clearly in advance of data collection. This mode allows researchers and audiences to trust the results of a study rigorously testing a specific prediction.

A problem confronting theories in media psychology is that the boundaries between exploratory and confirmatory work are often blurred. This means that young scholars, parents, and policymakers cannot know which studies present tentative results and which are worthy of investing limited resources to build on or to implement in the real world.

Articles in this special issue provide a clear example of how exploratory and confirmatory modes of research can coexist and how science can thrive as a result. Most present both exploratory and confirmatory elements, each clearly labeled. As a result, it is easier to see and understand what the researchers expected based on knowledge of the relevant literature, and what they eventually found in their studies. It allows readers to build a clearer idea of the research process and of the elements of the studies that came as inspiration after the reviews and the data collection were completed. Both modes of research – studying previously observed phenomena and exploring uncharted territory – benefit from preregistration.
Drawing this distinction helps the reader determine which hypotheses carefully test ideas derived from theory and previous empirical research, and it liberates exploratory research from the pressure to present an artificial hypothesis-testing narrative. The Registered Reports model is also an effective countermeasure against psychology's aversion to statistically nonsignificant or "null" results, as study protocols are reviewed and accepted before the results are known. Further, it ensures that p-values can be meaningfully interpreted: p-values lose their meaning in data exploration because the Type I error inflation is unknown (Wagenmakers et al., 2012). If adopted by media psychologists, this approach could allow us to rigorously test and extend promising theories, and to retire theories that do not reliably account for observed data.

Increasing the Value of Media Psychology With Open Science Tools

As technology experts, media psychology researchers are well positioned to use and study the new tools which shape our science. A range of new Internet-based platforms have been built by scientists and engineers at the Center for Open Science, including their flagship, the OSF (http://www.osf.io), and preprint services like PsyArXiv (http://www.psyarxiv.com) and SocArXiv (http://www.socarxiv.com). Designed to work with scientists' existing research workflows, these tools can help prevent data loss due to hardware malfunctions, misplacement, or the relocation of researchers, while enabling scientists to claim more credit by allowing others to use and cite their materials, protocols, and data.[10]

The High Stakes Facing Media Psychology

Like psychological science as a whole, media psychology faces a pressing credibility gap. Unlike some other areas of psychological inquiry, however, media research – whether concerning the Internet, video games, or film – speaks directly to everyday life in the modern world. It affects how the public forms its perceptions of media effects (Przybylski & Weinstein, 2016), and how professional groups and governmental bodies make policies and recommendations (Council on Communications and Media, 2016). In part because it informs professional policy, findings disseminated to caregivers, practitioners, and educators should rest on an empirical foundation of sufficient rigor. If policy makers and the public are to value our views as experts, we must take steps to demonstrate that this trust is warranted. Such challenges and high stakes are by no means unique to media psychology. Indeed, in medical and drug research, the study registration movement (Goldacre & Gray, 2016) has led the way with publicly accessible registries that include all the studies, published or not, conducted in a research area. To build good faith with the general public, industry collaborators, and policy makers, we propose the creation of a registry (https://osf.io/registries/) for confirmatory media psychology research. Creative exploratory research (i.e., theory building) would continue as it does now, but confirmatory work (i.e., theory testing) could be registered in a central repository so that its results – positive, null, or negative – would be available for scrutiny. This would also allow media psychologists to educate the general public and journalists about the distinction between exploratory and confirmatory research.
Through such a public registry, researchers and policy makers could quickly determine which evidence is promising (though tentative), and which conclusions are suitable as a basis for interventions, policy decisions, caregiver guidance, or new products.

Closing Thoughts

We are, on balance, optimistic that media psychologists can meet these challenges and lead the way for psychologists in other areas. This special issue and the registered reports submission track represent an important step in this direction, and we thank the JMP editorial board, our expert reviewers, and, of course, the dedicated researchers who devoted their limited resources to this effort. The promise of building an empirically based understanding of how we use, shape, and are shaped by technology is an alluring one. We firmly believe that incremental steps taken toward scientific transparency and empirical rigor will help us realize this potential.

Acknowledgments

We sincerely thank Charleen Brand and Hannah Borgmann for their invaluable assistance with the data preparation for this editorial.

References

Anderson, J. A. (2016). Communication descending. International Communication Gazette, 78(7), 612–620. doi: 10.1177/1748048516655708

Breuer, J., Velez, J., Bowman, J., Wulf, T., & Bente, G. (2017). “Drive the lane; together, hard!” An examination of the effects of supportive co-playing and task difficulty on prosocial behavior. Journal of Media Psychology, 29, 31–41. doi: 10.1027/1864-1105/a000209

Brown, N. J. L., & Heathers, J. A. J. (2016). The GRIM test: A simple technique detects numerous anomalies in the reporting of results in psychology. Social Psychological and Personality Science. doi: 10.1177/1948550616673876

Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376. doi: 10.1038/nrn3475

Champely, S. (2016). pwr: Basic functions for power analysis. Retrieved from http://cran.r-project.org/package=pwr

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Council on Communications and Media. (2016). Media and young minds. Pediatrics, 138(5), e20162591. doi: 10.1542/peds.2016-2591

Elson, M., Przybylski, A. K., & Krämer, N. C. (2015). Technology and human behavior: A preregistered special issue of the Journal of Media Psychology. Journal of Media Psychology, 27(4), 203–204. doi: 10.1027/1864-1105/a000170

Epskamp, S., & Nuijten, M. B. (2015). statcheck: Extract statistics from articles and recompute p values. Retrieved from https://cran.r-project.org/package=statcheck

Fraley, R. C., & Vazire, S. (2014). The N-pact factor: Evaluating the quality of empirical journals with respect to sample size and statistical power. PLOS ONE, 9(10), e109019. doi: 10.1371/journal.pone.0109019

10 A public repository for media psychology research materials is already in place at https://osf.io/wzb32/

Journal of Media Psychology (2017), 29(1), 1–7

© 2017 Hogrefe Publishing



Goldacre, B., & Gray, J. (2016). OpenTrials: Towards a collaborative open database of all available information on all clinical trials. Trials, 17(1), 164. doi: 10.1186/s13063-016-1290-8

Holz Ivory, A., Ivory, J. D., & Lanier, M. (2017). Video game use as risk exposure, protective incapacitation, or inconsequential activity among university students: Comparing approaches in a unique risk environment. Journal of Media Psychology, 29, 42–53. doi: 10.1027/1864-1105/a000210

Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., . . . Nosek, B. A. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLoS Biology, 14(5), e1002456. doi: 10.1371/journal.pbio.1002456

Lakens, D. (2014). Observed power, and what to do if your editor asks for post-hoc power analyses. The 20% Statistician. Retrieved from http://daniellakens.blogspot.de/2014/12/observed-power-and-what-to-do-if-your.html

Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2015). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods. doi: 10.3758/s13428-015-0664-2

Pashler, H., & Harris, C. R. (2012). Is the replicability crisis overblown? Three arguments examined. Perspectives on Psychological Science, 7(6), 531–536. doi: 10.1177/1745691612463401

Przybylski, A. K., & Weinstein, N. (2016). How we see electronic games. PeerJ, 4, e1931. doi: 10.7717/peerj.1931

R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.r-project.org

Richard, F. D., Bond, C. F., & Stokes-Zoota, J. J. (2003). One hundred years of social psychology quantitatively described. Review of General Psychology, 7(4), 331–363. doi: 10.1037/1089-2680.7.4.331

Rudis, B., & Gandy, D. (2016). waffle: Create waffle chart visualizations in R.

Schimmack, U. (2012). The ironic effect of significant results on the credibility of multiple-study articles. Psychological Methods, 17(4), 551–566. doi: 10.1037/a0029487

Soliman, M., Peetz, J., & Davydenko, M. (2017). The impact of immersive technology on nature relatedness and pro-environmental behavior. Journal of Media Psychology, 29, 8–17. doi: 10.1027/1864-1105/a000213

Steineman, S. T., Iten, G. H., Opwis, K., Forse, S. F., Frasseck, L., & Mekler, E. D. (2017). Interactive narratives affecting social change: A closer look at the relationship between interactivity and prosocial behavior. Journal of Media Psychology, 29, 54–66. doi: 10.1027/1864-1105/a000211

van der Zee, T., Admiraal, W., Paas, F., Saab, N., & Gisbers, B. (2017). Effects of subtitles, complexity, and language proficiency on learning from online education videos. Journal of Media Psychology, 29, 18–30. doi: 10.1027/1864-1105/a000208

Vermeulen, I., Beukeboom, C. J., Batenburg, A., Avramiea, A., Stoyanov, D., van de Velde, B., & Oegema, D. (2015). Blinded by the light: How a focus on statistical “significance” may cause p-value misreporting and an excess of p-values just below .05 in communication science. Communication Methods and Measures, 9(4), 253–279. doi: 10.1080/19312458.2015.1096333

Wagenmakers, E.-J., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632–638. doi: 10.1177/1745691612463078

Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726–728. doi: 10.1037/0003-066X.61.7.726

Malte Elson
Educational Psychology Research Group
Department of Education
Ruhr University Bochum
Universitätsstr. 150
44801 Bochum
Germany
malte.elson@rub.de

Andrew K. Przybylski
Department of Experimental Psychology
University of Oxford
1 St Giles'
Oxford OX1 3JS
UK
andy.przybylski@oii.ox.ac.uk

Malte Elson (PhD) is a behavioral psychologist and postdoctoral researcher in the Educational Psychology Research Group at Ruhr University Bochum. He studies human learning in various contexts, such as the contingencies of behaviors in academic research (metascience), human interaction with technology, and the effects of entertainment media.

Andrew Przybylski (PhD) is a senior research fellow based at the Oxford Internet Institute and Department of Experimental Psychology at the University of Oxford. His research focuses on applying motivational theory to understand the universal aspects of video games and social media that draw people in, the role of game structure and content on human aggression, and the factors that lead to successful versus unsuccessful self-regulation of gaming contexts and social media use.



Pre-Registered Report

The Impact of Immersive Technology on Nature Relatedness and Pro-Environmental Behavior Monica Soliman, Johanna Peetz, and Mariya Davydenko Psychology Department, Carleton University, Canada

Abstract: Those who feel connected to nature tend to be more likely to engage in pro-environmental behavior. How can this connection with nature be created? We examined whether viewing nature-related videos – specifically, the immersiveness of the technological devices used to display these videos – can enhance connection with nature and increase pro-environmental behavior. Participants watched videos of either natural or built environments through a head-mounted display (immersive technology) or a regular computer screen. We predicted that watching a nature video would enhance nature relatedness and pro-environmental behaviors, more so when presented with immersive technology than with a traditional computer monitor. There was limited support for the hypotheses; watching the nature video significantly enhanced nature relatedness but not pro-environmental behaviors. The type of technology used did not influence the effect of the videos.

Keywords: immersive technology, presence, nature, environmental behavior

Over the last few decades, knowledge and concern about environmental problems have increased, and yet many people fail to translate their knowledge of these problems into responsible environmental actions (Bashir, Wilson, Lockwood, Chasteen, & Alisat, 2014; Pelletier, Dion, Tuson, & Green-Demers, 1999). Perhaps one of the main reasons why people in modern-day society are not sufficiently engaged in pro-environmental actions is their detachment from the natural world. The majority of the world population lives in urban settings, and that number is expected to grow (Montgomery, 2007). Hence, many individuals worldwide may not have the opportunity to spend time in nature. Indeed, research shows that having direct contact with nature tends to increase the degree to which people feel related to, or connected with, nature (Mayer, Frantz, Bruehlman-Senecal, & Dolliver, 2009; Nisbet & Zelenski, 2011; Schultz & Tabanico, 2007), and that nature relatedness captures many of the predictors of environmentally responsible behavior (Nisbet, Zelenski, & Murphy, 2009). Technology has the potential to be a useful tool for bringing nature closer to individuals in urban settings. Some forms of technology – such as 3D videos and head-mounted displays – may provide a particularly immersive viewing experience and enhance the realism of virtual environments to resemble direct contact with nature. The role of immersive technology in influencing nature relatedness and environmental behavior is thus important to study. In the present research, we examine whether the immersiveness of the technology used to watch videos of natural (vs. built) environments can affect nature relatedness and pro-environmental behavior.

Journal of Media Psychology (2017), 29(1), 8–17 DOI: 10.1027/1864-1105/a000213

Nature Relatedness and Pro-Environmental Behavior

Over the past decade, psychologists have started exploring the concept of nature relatedness. A number of self-report questionnaires have been created to capture individual differences in the extent to which people associate themselves with the natural environment, such as the Connectedness to Nature Scale (Mayer & Frantz, 2004), the Connectivity With Nature Scale (Dutcher, Finley, Luloff, & Johnson, 2007), the Nature Relatedness Scale (Nisbet et al., 2009), and the inclusion of nature in the self measure (Schultz, 2002). The degree to which people feel connected to nature tends to be a robust predictor of happiness (e.g., Zelenski & Nisbet, 2014; for a review see Capaldi, Dopko, & Zelenski, 2014) and of engagement in environmental behavior (e.g., Dutcher et al., 2007; Nisbet et al., 2009). Mayer and Frantz (2004) explain the latter relationship by suggesting that if people feel that they are connected to nature, harming the environment would be akin to inflicting harm on the self. Although nature relatedness has been generally conceptualized as an individual difference, it can shift in the moment. Several empirical studies attest to the malleability of self–nature associations. Spending time in nature (e.g., a short walk) was shown to increase connections with nature as well as positive affect (Mayer et al., 2009; Nisbet & Zelenski, 2011; Schultz & Tabanico, 2007). There is some evidence that exposure to virtual nature can also have restorative effects, such as decreased stress and reduced negative affect (e.g., De Kort, Meijnders, Sponselee, & IJsselsteijn, 2006; Kjellgren & Buhrkall, 2010; Valtchanov, Barton, & Ellard, 2010). In experiments comparing the effects of real and virtual nature, Mayer et al. (2009) found that spending time experiencing nature virtually (e.g., by watching a video of nature) can also increase connectedness to nature, but to a smaller degree than experiencing real nature. Given that many individuals – particularly those living in urban settings – may not always have the opportunity to be in direct contact with nature, virtual nature experienced through technology may be a promising avenue for boosting nature relatedness and pro-environmental behavior. Indeed, there is some evidence supporting the idea that watching nature videos may boost sustainable behavior. Zelenski, Dopko, and Capaldi (2015) found that watching a video of nature (vs. built or neutral control stimuli) can increase sustainable behavior in the context of a commons dilemma game. Zelenski et al. (2015) also found that videos of nature increased nature relatedness, but this effect was apparent in only one of two studies.
Because the effects of watching videos of nature on nature relatedness and environmentally responsible behavior have been generally small and inconclusive (Mayer et al., 2009; Zelenski et al., 2015), more empirical research is needed on experiencing nature in virtual environments and its potential role. We propose that using immersive technology can be one way of bridging the gap between experiencing actual and virtual nature. Immersive technology tends to enhance the realism of virtual experiences. Hence, watching a nature-related video via immersive technology may be powerful enough to produce a strong effect on nature relatedness and thereby change actual behavior.

The Impact of Immersive Technology

Historically, technological advancements have enabled social science researchers to conduct research more effectively and efficiently. The development of immersive virtual environment technology presents immense opportunities for psychologists. Researchers are now able to create realistic virtual environments and to employ sophisticated devices that enable participants to be immersed in such environments and to experience them as though they were real (Blascovich et al., 2002; Fox, Arena, & Bailenson, 2009). A call for greater use of immersive technology to enhance the mundane realism of experiments and to strengthen the effects of experimental manipulations (Blascovich et al., 2002) remains relevant today. Immersive technology (e.g., head-mounted displays, computer-generated kinesthetic and tactile feedback) tends to increase users’ subjective sense of presence – defined as “the subjective experience of being in one place or environment, even when one is physically situated in another” (Witmer & Singer, 1998, p. 225) – compared with less immersive forms of technology (e.g., traditional computer monitors; Baños et al., 2004; De Kort et al., 2006; Moreno & Mayer, 2002). Researchers have examined the consequences of experiencing a sense of presence in virtual environments such as video games (Skalski, Tamborini, Shelton, Buncher, & Lindmark, 2011), learning environments (Moreno & Mayer, 2002), and advertising contexts (Li, Daugherty, & Biocca, 2002). However, most of this research examined potential affective (e.g., enjoyment) and cognitive (e.g., memory) consequences of experiencing presence, and it yielded mixed results. One study (De Kort et al., 2006) showed that increasing the immersiveness of technology by using larger monitors to display a nature video can influence physiological responses to that video, with more immersive technology leading to greater restorative effects. Further research is needed to understand the behavioral consequences of experiencing presence in virtual environments. Can experiencing an enhanced sense of presence when using immersive technology actually change human behavior outside of the virtual environment?
We examine the impact of immersive technology in the domain of environmental connectedness and pro-environmental behavior. To our knowledge, this is the first study to experimentally assess the effect of using immersive technology (head-mounted display vs. traditional desktop display) when viewing nature on environmental attitudes and behavior. If immersive technology enhances connectedness with nature and pro-environmental behaviors after viewing nature videos, such a finding would underscore the importance of the type of technology chosen when presenting stimuli. If immersive technology makes no difference to connectedness with nature and pro-environmental behavior, such a finding would underscore the validity of purely online or less immersive research procedures.




The Present Research The purpose of the present research is two-fold: First, we would like to provide a replication of the effects of nature-related videos on nature relatedness (Mayer et al., 2009; Zelenski et al., 2015). Second, we would like to assess whether using immersive technologies would enhance such effects. Specifically, participants were randomly assigned to watch a video of nature or a built environment. We also varied the degree of the immersion in the virtual environment by having half the participants view the video on a traditional computer monitor and the other half use a head-mounted display. We expected participants to report greater connectedness with nature after watching a video of a natural (vs. built) environment (Hypothesis 1). Moreover, immersive technology was expected to enhance the effect of watching the nature video on nature relatedness (Hypothesis 2). Similarly, we expected participants to engage in more pro-environmental behaviors after watching a video of a natural (vs. built) environment (Hypothesis 3) and we expected immersive technology to enhance the effect of watching the nature video on environmental behavior (Hypothesis 4).

Method

Participants

In all, 230 undergraduate students from a Canadian university participated in this study in exchange for partial course credit. The sample size was determined based on the 2 (video) × 2 (display device) between-subjects design. Based on calculations in G*Power 3.1 (Faul, Erdfelder, Buchner, & Lang, 2009), a sample size of 199 participants should be large enough to detect a small-to-medium effect (an effect size of 0.20) with over 80% power for analyses like the ones outlined here (see Electronic Supplementary Material, ESM 1). We assumed a small-to-medium effect as a conservative benchmark, given that documented effect sizes of nature exposure vary widely and effect sizes for behavioral outcomes are usually smaller. For example, effects of nature exposure below 0.20 have typically been deemed small and nonsignificant (e.g., Zelenski et al., 2015, Study 2). Our power analysis was thus calculated to detect the smallest effect size of interest. We recruited 230 participants to account for possible attrition or dropout. Three participants withdrew their data from the study, so the final sample included 227 participants (67 male, 158 female, 1 other, 1 undisclosed; Mage = 21.20, SD = 6.42).1
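The a priori power analysis described above can be approximated in code. The following is an illustrative sketch, not the authors' G*Power session: it computes the power of an F test with numerator df = 1 in a four-cell design from the noncentral F distribution, under the assumption that the reported effect size of 0.20 is Cohen's f (the effect-size metric G*Power uses for ANOVA).

```python
# Illustrative power sketch (assumes the reported effect size of 0.20 is
# Cohen's f, as in G*Power's ANOVA module); not the authors' own code.
from scipy.stats import f as f_dist, ncf

def anova_power(n_total, f_effect=0.20, alpha=0.05, df1=1, groups=4):
    """Power of the F test for one main effect in a 2 x 2 design."""
    df2 = n_total - groups                    # error degrees of freedom
    lam = (f_effect ** 2) * n_total           # noncentrality parameter
    f_crit = f_dist.ppf(1 - alpha, df1, df2)  # critical F under H0
    return 1 - ncf.cdf(f_crit, df1, df2, lam)

# Smallest total N that reaches 80% power
n = 8
while anova_power(n) < 0.80:
    n += 1
print(n, round(anova_power(n), 3))
```

With these assumptions the loop lands near the 199 participants reported above; small deviations reflect rounding conventions in G*Power.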

Procedure and Materials

Participants were randomly assigned to one of four conditions in a 2 (video: nature vs. built) × 2 (display device: desktop screen vs. head-mounted display) between-subjects design. The materials used in the study are presented in Electronic Supplementary Material, ESM 2. After providing consent and reporting demographics (age, gender), participants viewed a 4-min video. The nature video depicted various landscape scenes (e.g., forests, mountains, rivers, and wildlife), whereas the built environment video depicted various scenes from a city (e.g., vehicles, skyscrapers, bridges, and crowds). The videos were equal in duration and featured accompanying audio consistent with being in a natural (e.g., birds chirping) or built (e.g., vehicles honking their horns) environment. Pilot testing of sample videos with similar content indicated that the two settings were equally pleasant and fun to watch (Davydenko & Peetz, 2015). Participants in the low-immersion condition viewed the video on a regular desktop screen with speakers, whereas those in the high-immersion condition viewed the same video using a virtual reality head-mounted display with headphones. Participants in the low-immersion condition watched the video on a DELL E228WFPc 22" LCD monitor with a resolution of 1,680 × 1,050. Participants in the high-immersion condition wore a Sony HMZ-T2 head-mounted display, which features a 45° field of view of a virtual screen; its resolution was 1,280 × 720 with an aspect ratio of 16:9. The rest of the procedure was the same in all conditions. Specifically, participants rated their attitudes toward nature, their mood, and their experience and evaluation of the video. Then they completed behavioral measures, as outlined in the next section. Questionnaires were answered using the same medium (i.e., computer-based surveys) in all conditions to ensure that differences in responses cannot be attributed to the technology used to record ratings.
The only difference between immersion conditions was the technology used for presenting the video stimuli.

First, participants reported their attitudes toward nature in two measures. They reported the degree to which nature was important to their self-concept. The inclusion of nature in the self measure (Schultz, 2002) consists of pairs of circles that overlap to varying degrees. In each pair, one circle is labeled self and the other is labeled nature. Participants were asked to select the pair of circles that best represents how connected they felt to nature. This measure was used in previous research (e.g., Schultz, 2001; Zelenski & Nisbet, 2014). It was adapted from the Inclusion of Other in the Self Scale (Aron, Aron, & Smollan, 1992; Aron, Aron, Tudor, & Nelson, 1991), which has been widely used in psychological research and has been well validated. Participants also rated their connectedness to nature using a 14-item scale (Mayer & Frantz, 2004) on Likert scales from 1 = strongly disagree to 5 = strongly agree. This scale includes items such as “I often feel a kinship with animals and plants” and had good reliability in the present study (Cronbach’s α = .83). Second, as a measure of mood, participants rated the extent to which they felt happy and sad (reverse-scored) at that moment on Likert scales from 1 = not at all to 5 = extremely. These items correlated (r = .43, p < .001) and were averaged to indicate positive mood. Third, we measured participants’ experience while watching the video. Specifically, we assessed participants’ sense of presence using an adapted version of a presence scale developed by Witmer and Singer (1998). The adapted scale includes 13 items (e.g., “How completely were all of your senses engaged?”), rated on 7-point Likert scales. Some items from the original scale were not relevant to our procedure (e.g., items assessing haptic experiences) and were thus excluded. This measure was included for exploratory purposes only; analyses involving it are presented on the Open Science Framework (see Footnote 1). In addition to their sense of presence, participants also rated how fun and how pleasant the video was to watch on 5-point Likert scales. Finally, the experimenter assessed participants’ engagement in three pro-environmental behaviors.

1 The data, syntax, and additional analyses files are available for download through the Open Science Framework (OSF): https://osf.io/76796/?view_only=14b7ae2e02454663bead8a8e1e480654
First, the experimenter asked whether participants wanted to receive a copy of the debriefing form in hard copy or by e-mail (paper-saving format = pro-environmental choice). Second, the experimenter asked participants if they wanted to sign up for a monthly nature and sustainability newsletter (yes = pro-environmental choice). Participants were told that this online newsletter contains practical tips on sustainable living. Past research indicates that acquiring knowledge about environmental issues and sustainable living is one form of pro-environmental behavior and that it tends to correlate significantly with other forms of pro-environmental behavior (Bashir et al., 2014; El Gamal, Wilson, Schuett, & Courneya, 2014). Third, the experimenter asked participants whether they wanted to receive the download link for the campus sustainability strategic plan (yes = pro-environmental choice). The experimenter recorded the number of pro-environmental choices that participants made (ranging between 0 and 3). Finally, participants were debriefed.

Results

Confirmatory Analyses

To test our main hypotheses, we conducted 2 (video: nature vs. built) × 2 (display device: desktop screen vs. head-mounted display) ANOVAs examining the effects of our experimental manipulations on nature connectedness and pro-environmental behaviors. The two measures assessing nature connectedness – the Connectedness to Nature Scale (Mayer & Frantz, 2004) and the Inclusion of Nature in the Self Scale (Schultz, 2002) – were only moderately correlated, r = .56, p < .001; thus, the two measures were analyzed separately. The number of environmental choices made was positively correlated with the inclusion of nature in the self (r = .26, p < .001) and with connectedness to nature (r = .29, p < .001). Zero-order correlations between the variables are presented in Electronic Supplementary Material, ESM 3.

Attitudes Toward Nature

Consistent with Hypothesis 1, participants reported incorporating nature into their self-concept more (Schultz, 2002) after watching a video of a natural (vs. built) environment, M = 4.63, SD = 1.46 vs. M = 4.12, SD = 1.55, respectively, F(1, 223) = 6.70, p = .010, d = 0.34, ηp² = 0.03.2 However, the main effect of the display device and the video × display device interaction were not significant, F(1, 223) = 0.001, p = .977, d = 0, ηp² = 0 and F(1, 223) = 0.004, p = .952, ηp² = 0, respectively (see Figure 1). Thus, the type of device did not influence inclusion of nature in the self. Similarly, on the Connectedness to Nature Scale (Mayer & Frantz, 2004), participants reported greater connectedness with nature after watching a video of a natural (vs. built) environment, M = 3.50, SD = 0.61 vs. M = 3.30, SD = 0.65, F(1, 223) = 5.63, p = .018, d = 0.32, ηp² = 0.03. Again, the main effect of the display device and the interaction were not significant, F(1, 223) = 0.33, p = .564, d = 0.08, ηp² = 0 and F(1, 223) = 0.15, p = .698, ηp² = 0, respectively (see Figure 2). In other words, when examining attitudes toward nature, the content of the video mattered (supporting Hypothesis 1), but the type of technology used to deliver it did not (not supporting Hypothesis 2).

Figure 1. Frequency distribution of inclusion of nature in the self in each experimental condition.

Pro-Environmental Behaviors

The number of pro-environmental choices that participants made did not differ significantly between the natural environment condition (M = 2.20, SD = 0.86) and the built environment condition (M = 2.00, SD = 0.87), F(1, 222) = 2.97, p = .086, d = 0.23, ηp² = 0.01. The main effect of display device and the video × display device interaction were also not significant, F(1, 222) = 0.47, p = .495, d = 0.09, ηp² = 0 and F(1, 222) = 0.32, p = .574, ηp² = 0, respectively (see Figure 3). In other words, neither the content of the video nor the type of technology used to deliver it had a significant influence on environmental behavior (not supporting Hypotheses 3 and 4).3

2 Cohen’s d was calculated for main effects after running the ANOVA, based on the means and standard deviations of the relevant groups. The omnibus effect size, partial eta-squared, was computed in SPSS.
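The Cohen's d values reported for the main effects can be reproduced from the group means and standard deviations alone. The following is a minimal sketch, not the authors' SPSS syntax; it assumes (roughly) equal group sizes, so the pooled SD reduces to the root mean square of the two group SDs.

```python
# Illustrative sketch of the footnoted effect-size computation; assumes
# roughly equal group sizes (as in this randomized 2 x 2 design).
import math

def cohens_d(m1, sd1, m2, sd2):
    """Cohen's d using the pooled SD of two equal-sized groups."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (m1 - m2) / pooled_sd

# Inclusion of nature in the self: nature vs. built condition
print(round(cohens_d(4.63, 1.46, 4.12, 1.55), 2))  # 0.34, as reported
# Connectedness to Nature Scale: nature vs. built condition
print(round(cohens_d(3.50, 0.61, 3.30, 0.65), 2))  # 0.32, as reported
```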

Exploratory Analyses

Next, we examined additional variables that were not part of the main hypotheses but that represent aspects in which the videos or display devices might have differed.

3 There were two unforeseen limitations in the measure of environmental behavior. First, a Shapiro–Wilk test indicated that the data deviated significantly from a normal distribution, p < .001. Second, the relationships between the three environmental behaviors were not as strong as one would expect, suggesting low reliability of the three-item environmental behavior measure: There was a significant positive relationship between choosing to receive the monthly newsletter and choosing to receive the sustainability plan (ϕ = .47, p < .001). However, choosing to receive the debriefing form by e-mail (vs. on paper) was not significantly related to the choice of receiving the monthly newsletter (ϕ = .12, p = .065) or downloading the sustainability plan (ϕ = .02, p = .743). To address these limitations, further analyses (logistic regression analyses) were conducted and are presented on the OSF: https://osf.io/76796/?view_only=14b7ae2e02454663bead8a8e1e480654
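The ϕ coefficients in this footnote are Pearson correlations between two binary variables, which can be computed directly from a 2 × 2 table of counts. A sketch with invented counts (the study's actual counts are in the OSF data):

```python
# Illustrative phi-coefficient sketch; the 2x2 counts below are invented
# for demonstration, not taken from the study's data.
import math

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table [[a, b], [c, d]]."""
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

# e.g., 120 chose both options, 30 newsletter only, 25 plan only, 52 neither
print(round(phi(120, 30, 25, 52), 2))  # ~0.47 with these illustrative counts
```

A ϕ of 1.0 would indicate that the two choices always co-occur; values near 0, as for the debriefing-form choice, indicate the behaviors are essentially unrelated, which is why the three-item sum has low reliability.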






Figure 2. Frequency distribution of nature relatedness in each experimental condition.

Mood
A 2 (video: nature vs. built) × 2 (display device: desktop screen vs. head-mounted display) MANOVA with the two mood items as dependent variables indicated that participants’ mood was not affected by the type of video they watched, F(2, 215) = 0.07, p = .929, ηp² = 0, the type of display device, F(2, 215) = 1.72, p = .181, ηp² = 0.02, or their interaction, F(2, 215) = 1.82, p = .164, ηp² = 0.02. This null effect of video content on mood was surprising given the well-established association in the literature between nature and happiness (Capaldi et al., 2014).


Video Properties
We conducted 2 (video: nature vs. built) × 2 (display device: desktop screen vs. head-mounted display) ANOVAs examining effects of the experimental manipulations on ratings of how fun and how pleasant the videos were to watch. Neither the type of video, F(1, 222) = 0, p = .972, d = 0, ηp² = 0, the type of device, F(1, 222) = 2.09, p = .150, d = 0.19, ηp² = 0.01, nor their interaction, F(1, 222) = 0.03, p = .855, ηp² = 0, affected how fun the video was to watch. Likewise, the type of device, F(1, 222) = 1.53, p = .217, d = 0.15, ηp² = 0.01, and the video × display device interaction, F(1, 222) = 0.04, p = .851, ηp² = 0, did not affect how pleasant the video was to watch. However, there was a significant main effect of the type of video, F(1, 223) = 23.69, p < .001, d = 0.65, ηp² = 0.10, indicating that the nature video (M = 4.70, SD = 0.59) was rated as more pleasant than the video depicting an urban environment (M = 4.21, SD = 0.89). This difference in video ratings was unexpected, given that the sample videos were rated as similarly fun and similarly pleasant in a pilot study with a sample drawn from the same population (Davydenko & Peetz, 2015).

Figure 3. Frequency distribution of number of environmental behaviors chosen in each experimental condition.

Discussion
In this study, we examined the effects of watching videos of natural (vs. built) environments on nature attitudes and environmental behavior. By varying the type of technology used to present these videos (immersive head-mounted display vs. traditional desktop monitor), we sought to examine whether immersive technology can enhance the effects of viewing nature videos on nature relatedness and on environmental behavior. The results of the study provided limited support for the hypotheses. Exposure to virtual nature increased the degree to which participants felt connected to nature compared with exposure to built environments. However, exposure to videos of nature (vs. built environments) did not meaningfully alter the propensity to make environmentally responsible choices. The type of technology used (and specifically, how immersive it is designed to be) did not influence the effect of watching these videos on either nature connectedness or environmental behaviors.

This research contributes to the body of work on nature relatedness and environmental behavior. Whereas the effects of exposure to real offline natural environments



have been well established (e.g., Mayer et al., 2009; Nisbet & Zelenski, 2011; Schultz & Tabanico, 2007), the effects of exposure to virtual nature have been much less studied (Mayer et al., 2009; Zelenski et al., 2015), with mixed results. The present research provides further support that virtual nature may offer a promising avenue for bringing the benefits of nature closer to people living in urban environments who may have little opportunity for frequent direct contact with nature. The study is also the first to test whether exposure to virtual nature can increase actual pro-environmental behavior outside the virtual environment, as past research to date has only looked at effects on performance in a computer game (Zelenski et al., 2015). Watching the video depicting the natural environment did not meaningfully change the number of environmental choices that participants made, despite its effect on attitudes toward nature. This suggests a possible attitude–behavior gap whereby increased connectedness with nature following exposure to virtual nature may not always translate into more pro-environmental behavior. It is also possible that effects of virtual nature on actual pro-environmental behavior are smaller than previously assumed, and that a larger sample than the one in the present research would be needed to detect such a small effect. A challenge for future research is to identify conditions under which virtual nature can impact behavior (e.g., duration of exposure, type and familiarity of the natural environment). This study also contributes to research on immersive technology. We explored the effects of using immersive forms of technology, such as head-mounted displays with headphones, as opposed to simpler desktop screens with speakers. Contrary to expectations, the use of head-mounted display devices did not meaningfully alter the psychological effects of the videos on nature connectedness or pro-environmental behavior.
The study also contributes to emerging research on immersive technology and environmental behavior (Ahn, Bailenson, & Park, 2014; Bailey et al., 2014). That research tested the effect of viewing an avatar performing nonsustainable behaviors – such as consuming coal to heat water or cutting down a tree – in a virtual reality head-mounted display on directly relevant behaviors (heating up water and using paper, respectively). In contrast to the findings of the present research, this past work demonstrated contexts in which using virtual reality technology can have an impact on environmental behaviors. The discrepancy in findings may be attributed to a number of differences between the studies. First, the content of the videos in this past research depicted specific actions that harmed the environment, such as cutting down a tree, thus triggering embodied experiences of these actions. The content of the videos used in the present research depicted a natural


environment, for which the underlying psychological mechanism – embodiment – is less applicable. Second, the behavior assessed in the previous research was closely tied to the content of the video (cutting down a tree vs. using paper). Perhaps this direct link is necessary for virtual experiences to translate into meaningful real-world behavior. Third, the head-mounted displays used in these past studies enabled participants to interact with the virtual environment and to experience vibrations corresponding to actions happening in that environment. These additional features of the display device may have further enhanced immersion in the virtual experience, thus strengthening its effects. Of course, it is difficult to conclude which of these factors causally played a role in the differences in research findings. This line of work is still in its nascent stages, and more research is needed to establish the replicability of these past findings with larger sample sizes. Future research should also systematically vary the factors outlined here in an experimental setting to see whether they moderate effects of immersive technology on environmental behavior. The present research assessed actual choices that participants made among options that vary in how pro-environmental they are (saving paper vs. using a hard copy; acquiring more knowledge about nature and sustainability vs. rejecting the resources offered). These behaviors were not directly linked to the content of the videos. Exploring an even wider array of environmental behaviors might uncover contexts in which virtual environments have a greater bearing on behavior, particularly where the link with the natural world is more salient (e.g., resource conservation). Nowadays, there is great variability in the types of technological devices designed to provide immersive viewing experiences, and the level of immersion resulting from these devices can vary considerably along a continuum.
For instance, larger screens are designed to promote greater immersion relative to smaller computer monitors (De Kort et al., 2006). Head-mounted displays are among the devices designed to immerse viewers, but they can also vary considerably in terms of the features they offer (e.g., resolution, two-dimensional vs. three-dimensional viewing, ability to interact with the virtual environment). One possibility is that participants’ experiences with the videos were affected by their expectations of the head-mounted display used. Indeed, a few participants in the head-mounted display condition commented on having had greater expectations of the device relative to their actual experience. For example, they expected that they would be able to manipulate the environment or see things in a true three-dimensional space when using the head-mounted display; in reality, the video was not three-dimensional and the device did not enable them to control aspects of the virtual environment. The gap




between participants’ expectations and their experience may have resulted in unanticipated consequences, such as dissatisfaction with the environment that they encountered, thus counteracting any potential benefits of the device. Future research may consider assessing participants’ expectations prior to the viewing experience. More broadly, the question of expectations of virtual reality devices warrants further investigation: If modern-day users have greater expectations of a given technological device than what the device actually offers, how does that impact the user’s experience and subsequent downstream psychological effects? It is also possible that using more advanced displays, in which the viewer can control some aspects of the virtual environment, would be more effective. The findings also highlight the important distinction between the designer’s intentions and the user’s experience. Some technological devices are designed to provide an immersive viewing experience, but the way in which different users experience the same device may vary considerably. It is possible that some people experience virtual exposure to natural environments as fake and artificial regardless of the medium used to present it. Future research can examine potential moderators of the psychological effects of exposure to virtual environments, as people may vary in their attitudes toward new technology and their responsiveness to it. For instance, some people may perceive technology to be a root cause of sustainability problems and of people’s lack of connection with nature. Virtual nature is unlikely to be effective in this case. Participants in the present research were all undergraduate students. It is possible that this sample was very familiar with technology, including head-mounted displays.
Future research should examine whether benefits of virtual nature could be experienced to the same degree among individuals who may be less familiar with technology (e.g., older adults) and across a variety of research samples in order to assess the generalizability of these findings.

Conclusion
The development of immersive technology offers a world of opportunity for psychologists, both as a tool and as an area of investigation. Using immersive technology as a tool in research can increase realism in an experimental setting while maintaining experimental control; hence, such technologies can play a significant role in facilitating replication initiatives (Blascovich et al., 2002). The present study presented a preliminary step in this area of research, showing that virtual environments changed attitudes such as nature relatedness. Furthermore, this study suggests that technologies intended to be more immersive may not provide an advantage over less advanced technology and

did not, at least in this case, confer any additional impact beyond the effect of the virtual environment. Future technologies may, however, bridge the gap between exposure to actual versus virtual environments more completely.

Acknowledgments
The authors would like to thank Adrienne Paynter, Jonathan Capaldi, Josh Hulley-Carroll, and Taylor Apperley for their assistance with the data collection.

Electronic Supplementary Materials
The electronic supplementary material is available with the online version of the article at http://dx.doi.org/10.1027/1864-1105/a000213
ESM 1. Text (PDF). Protocol of power analysis.
ESM 2. Figure, Text (PDF). Study materials.
ESM 3. Table (PDF). Zero-order correlations between nature attitudes, behavior, mood, and video properties.

References
Ahn, S. J., Bailenson, J. N., & Park, D. (2014). Short- and long-term effects of embodied experiences in immersive virtual environments on environmental locus of control and behavior. Computers in Human Behavior, 39, 235–245. doi: 10.1016/j.chb.2014.07.025
Aron, A., Aron, E. N., & Smollan, D. (1992). Inclusion of Other in the Self Scale and the structure of interpersonal closeness. Journal of Personality and Social Psychology, 63, 596–612.
Aron, A., Aron, E. N., Tudor, M., & Nelson, G. (1991). Close relationships as including other in the self. Journal of Personality and Social Psychology, 60, 241–253.
Bailey, J. O., Bailenson, J. N., Flora, J., Armel, K. C., Voelker, D., & Reeves, B. (2014). The impact of vivid messages on reducing energy consumption related to hot water use. Environment and Behavior, 47(5), 570–592.
Baños, R. M., Botella, C., Alcañiz, M., Liaño, V., Guerrero, B., & Rey, B. (2004). Immersion and emotion: Their impact on the sense of presence. CyberPsychology & Behavior, 7(6), 734–741.
Bashir, N. Y., Wilson, A. E., Lockwood, P., Chasteen, A. L., & Alisat, S. (2014). The time for action is now: Subjective temporal proximity enhances pursuit of remote-future goals. Social Cognition, 32(1), 83–93.
Blascovich, J., Loomis, J., Beall, A., Swinth, K., Hoyt, C., & Bailenson, J. N. (2002). Immersive virtual environment technology as a methodological tool for social psychology. Psychological Inquiry, 13, 103–124.
Capaldi, C. A., Dopko, R. L., & Zelenski, J. M. (2014). The relationship between nature connectedness and happiness: A meta-analysis. Frontiers in Psychology, 5, 976.
Davydenko, M., & Peetz, J. (2015). Effects of nature on time perception. Unpublished manuscript.
De Kort, Y. A. W., Meijnders, A. L., Sponselee, A. A. G., & IJsselsteijn, W. A. (2006). What’s wrong with virtual trees? Restoring from stress in a mediated environment. Journal of Environmental Psychology, 26(4), 309–320.



Dutcher, D. D., Finley, J. C., Luloff, A. E., & Johnson, J. B. (2007). Connectivity with nature as a measure of environmental values. Environment and Behavior, 39, 474–493.
El Gamal, M., Wilson, A. E., Schuett, K., & Courneya, D. (2014, April). What do people mean by going green? Understanding lay perceptions of pro-environmental action. Paper presented at the Earth Day Colloquium, University of Western Ontario, London, ON.
Faul, F., Erdfelder, E., Buchner, A., & Lang, A. G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149–1160.
Fox, J., Arena, D., & Bailenson, J. N. (2009). Virtual reality: A survival guide for the social scientist. Journal of Media Psychology, 21(3), 95–113. doi: 10.1027/1864-1105.21.3.95
Kjellgren, A., & Buhrkall, H. (2010). A comparison of the restorative effect of a natural environment with that of a simulated natural environment. Journal of Environmental Psychology, 30(4), 464–472.
Li, H., Daugherty, T., & Biocca, F. (2002). Impact of 3-D advertising on product knowledge, brand attitude, and purchase intention: The mediating role of presence. Journal of Advertising, 31(3), 43–57.
Mayer, F. S., & Frantz, C. M. (2004). The Connectedness to Nature Scale: A measure of individuals’ feeling in community with nature. Journal of Environmental Psychology, 24, 504–515.
Mayer, F. S., Frantz, C. M., Bruehlman-Senecal, E., & Dolliver, K. (2009). Why is nature beneficial? The role of connectedness to nature. Environment and Behavior, 41(5), 607–643.
Montgomery, M. (2007). United Nations Population Fund: State of world population 2007: Unleashing the potential of urban growth. Population and Development Review, 33(3), 639–641.
Moreno, R., & Mayer, R. E. (2002). Learning science in virtual reality multimedia environments: Role of methods and media. Journal of Educational Psychology, 94(3), 598.
Nisbet, E. K., & Zelenski, J. M. (2011). Underestimating nearby nature: Affective forecasting errors obscure the happy path to sustainability. Psychological Science, 22(9), 1101–1106.
Nisbet, E. K., Zelenski, J. M., & Murphy, S. A. (2009). The Nature Relatedness Scale: Linking individuals’ connection with nature to environmental concern and behavior. Environment and Behavior, 41(5), 715–740.
Pelletier, L. G., Dion, S., Tuson, K., & Green-Demers, I. (1999). Why do people fail to adopt environmental protective behaviors? Toward a taxonomy of environmental amotivation. Journal of Applied Social Psychology, 29, 2481–2504.
Schultz, P. W. (2001). The structure of environmental concern: Concern for self, other people, and the biosphere. Journal of Environmental Psychology, 21, 1–13.
Schultz, P. W. (2002). Inclusion with nature: The psychology of human–nature relations. In P. Schmuck & P. W. Schultz (Eds.), Psychology of sustainable development (pp. 61–78). Boston, MA: Kluwer.
Schultz, P. W., & Tabanico, J. J. (2007). Self, identity, and the natural environment: Exploring implicit connections with nature. Journal of Applied Social Psychology, 37, 1219–1247.
Skalski, P., Tamborini, R., Shelton, A., Buncher, M., & Lindmark, P. (2011). Mapping the road to fun: Natural video game controllers, presence, and game enjoyment. New Media & Society, 13(2), 224–242.
Valtchanov, D., Barton, K. R., & Ellard, C. (2010). Restorative effects of virtual nature settings. Cyberpsychology, Behavior, and Social Networking, 13(5), 503–512.



Witmer, B. G., & Singer, M. J. (1998). Measuring presence in virtual environments: A presence questionnaire. Presence: Teleoperators and Virtual Environments, 7(3), 225–240.
Zelenski, J. M., Dopko, R. L., & Capaldi, C. A. (2015). Cooperation is in our nature: Nature exposure may promote cooperative and environmentally sustainable behavior. Journal of Environmental Psychology, 42, 24–31.
Zelenski, J. M., & Nisbet, E. K. (2014). Happiness and feeling connected: The distinct role of nature relatedness. Environment and Behavior, 46(1), 3–23.

Received January 15, 2016
Revision received December 29, 2016
Accepted January 10, 2017
Published online March 21, 2017

Monica Soliman Psychology Department Social Science Research Building (SSRB 314D) Carleton University 1125 Colonel By Drive Ottawa, ON, K1S 5B6 Canada monica.soliman@carleton.ca

Monica Soliman (PhD) is a postdoctoral researcher at Carleton University, Canada. Her research can be broadly situated in the area of judgment and behavioral decision-making. She is interested in studying the factors that motivate people to make positive changes in their behavior, such as engaging in more pro-environmental actions.

Johanna Peetz completed her doctoral studies at Wilfrid Laurier University, Canada, and is currently Associate Professor at Carleton University in Ottawa, Canada. She studies time perception, identity, and financial reasoning.

Mariya Davydenko is a master’s student at Carleton University, Canada. She studies the effect of nature settings on time perception.



Pre-Registered Report

Effects of Subtitles, Complexity, and Language Proficiency on Learning From Online Education Videos

Tim van der Zee,1 Wilfried Admiraal,1 Fred Paas,2 Nadira Saab,1 and Bas Giesbers2

1 ICLON, Universiteit Leiden, Leiden, The Netherlands
2 Erasmus University Rotterdam, The Netherlands

Abstract: Open online education has become increasingly popular. In Massive Open Online Courses (MOOCs), videos are generally the most used method of teaching. While most MOOCs are offered in English, the global availability of these courses has attracted many non-native English speakers. To ensure not only the availability but also the accessibility of open online education, courses should be designed to minimize detrimental effects of a language barrier, for example by providing subtitles. However, with many conflicting research findings, it is unclear whether subtitles are beneficial or detrimental for learning from a video, and whether this depends on characteristics of the learner and the video. We hypothesized that the effect of second-language subtitles on learning outcomes depends on the language proficiency of the student, as well as on the visual-textual information complexity of the video. This three-way interaction was tested in an experimental study. No main effect of subtitles was found, nor any interaction. However, the student's language proficiency and the complexity of the video do have a substantial impact on learning outcomes.

Keywords: online education, multimedia, subtitles, video, MOOCs

Journal of Media Psychology (2017), 29(1), 18–30. DOI: 10.1027/1864-1105/a000208

Open online education has rapidly become a highly popular method of education. The promise – global and free access to high-quality education – has often been applauded. With a reliable Internet connection comes free access to a large variety of massive open online courses (MOOCs) found on platforms such as Coursera and edX. MOOC participants indeed come from all over the world, although participants from Western countries are still overrepresented (Nesterko et al., 2013). In any case, many non-native English speakers enroll in English-language courses. This raises the question of the extent to which non-native English speakers can benefit from these courses compared with native speakers. While open online education may be available to most, its content might not be as accessible for many owing to language barriers. It is important to design online education in such a way that it minimizes the detrimental effects of potential language barriers, to increase its accessibility for a wider audience. MOOCs typically feature a high number of videos that are central to the student learning experience (Guo, Kim, & Rubin, 2014; Liu et al., 2014). The central position of educational videos is reflected by students' behavior and their intentions: Most students plan to watch all videos in a MOOC, and also spend the majority of their time watching these videos (Campbell, Gibbs, Najafi, & Severinski, 2014; Seaton, Bergner, Chuang, Mitros, & Pritchard, 2014). In this study, we investigate the impact of subtitles on learning from educational videos in a second language. Providing subtitles is a common approach to cater to diverse audiences and support non-native English speakers. The Web Content Accessibility Guidelines 2.0 (WCAG, 2008) prescribe subtitles for any audio media to ensure a high level of accessibility. Intuitively, there seems to be nothing wrong with this advice, and many studies have indeed found a positive effect of subtitles on learning (e.g., Markham et al., 2001). However, a different set of studies provides evidence that subtitles can also hamper learning (e.g., Kalyuga, Chandler, & Sweller, 1999). In the current study, the effects of subtitles are further examined. This paper is organized as follows: First, conflicting findings on the effects of subtitles on learning are discussed. Second, a framework is proposed that can explain these conflicting findings by considering the interaction between subtitles, language proficiency, and visual-textual information complexity (VTIC). In turn, an experimental study is described that tests the main hypothesis of the framework. Although this study is situated in an online educational setting, the results may also be of relevance for other media-oriented fields, such as film studies and video production.


T. van der Zee et al., Effects of Subtitles, Complexity, and Language Proficiency

Subtitles: Beneficial or Detrimental for Learning?
Research on the effects of subtitles typically differentiates between subtitles in someone's native language, called L1, and subtitles in one's second language, or L2. A meta-analysis of 18 studies showed positive effects of L2 subtitles for language learning (Perez, Noortgate, & Desmet, 2013). Specifically, enabling subtitles for language-learning videos substantially increases student performance on recognition tests, and to a lesser extent on production tests. Other studies have found similar positive effects of subtitles on learning from videos, and there appears to be a consensus that subtitles are beneficial for learning a second language (e.g., Baltova, 1999; Chung, 1999; Markham, 1999; Winke, Gass, & Sydorenko, 2013). However, these are all studies that focus on learning a language and not on learning about a non-linguistic topic in a second language. There are important differences between language learning and what we will call content learning. When learning a language, practicing with reading and understanding L2 subtitles is directly relevant to this goal. By contrast, when learning about a specific topic, apprehending L2 subtitles is not a goal in itself but only serves the purpose of better understanding the actual content. As such, we would argue that findings from studies focusing on language learning are by themselves not convincing enough to be directly applied to content learning, as subtitles have a different relationship with the content and the learning goals. In contrast to studies on language learning, only a few studies have investigated the effects of subtitles for content learning. These studies have shown positive effects of subtitles for content learning in a second language.
For example, when watching a short Spanish educational clip, English-speaking students benefited substantially from Spanish subtitles, but even more so from English subtitles (Markham et al., 2001). Another study, focused on different combinations of languages, similarly showed that students performed better on comprehension tests when watching an L2 video with subtitles enabled (Hayati & Mohmedi, 2011). Although several studies did find positive effects of subtitles, a range of other studies yielded contradictory findings. For example, Kalyuga et al. (1999) found that narrated videos without subtitles are better for learning than are videos with subtitles. In this study, subtitles were shown to lead to lower performance, an increased perceived cognitive load, and more reattempts during the learning phase (i.e., rewatching videos). This is in contrast to the studies discussed earlier, which showed positive effects of learning from videos with subtitles. In a different study on learning from narrated videos, two experiments showed that enabling subtitles led to lower knowledge retention and


transfer (Mayer, Heiser, & Lonn, 2001). With Cohen’s d effect sizes ranging from 0.36 to 1.20, the detrimental effects of subtitles in these studies were quite substantial. A range of other studies found similar evidence that for content and language learning alike, narrated explanations are typically better than showing only subtitles, or narration combined with subtitles (Harskamp, Mayer, & Suhre, 2007; Mayer, Dow, & Mayer, 2003; Mayer & Moreno, 1998; Moreno & Mayer, 1999). Finally, some studies showed neither a positive nor a negative effect of subtitles on learning (e.g., Moreno & Mayer, 2002a).

Explaining Conflicting Findings on the Effects of Subtitles
The previously discussed literature provides a confusing paradox for instructional designers: Are subtitles beneficial, detrimental, or irrelevant for learning? Here we present an attempt to explain the conflicting findings using a framework built on theories of attention and information processing. In short, we propose that the conflicting findings can be integrated by considering the interaction between subtitles, language proficiency, and the level of visual-textual information complexity (VTIC) of the video.

Working Memory Limitations
An essential characteristic of the human cognitive architecture is that not every type of information is processed in an identical way. Working memory is characterized by modality-specific channels, one for auditory and one for visual information (Baddeley, 2003). Both have a limited capacity and can hold information chunks for only a few moments before they decay (Baddeley, 2003). During learning tasks, working memory acts as a bottleneck for processing novel information; as more cognitive load is imposed on the learner, fewer cognitive resources are available for the integration of information into long-term memory, effectively impairing learning (Ginns, 2006; Sweller, Van Merrienboer, & Paas, 1998). For novel information, the cognitive resources required for processing appear to be primarily dictated by measurable attributes of the information itself, such as the number of words and their interactivity (Sweller, 2010). As each channel has its own capacity, it is generally more effective to distribute processing load between both channels instead of relying on only one modality (Mayer, 2003). When two sources of information are presented in the same modality, this can (more) easily overload our limited processing capacity (Kalyuga et al., 1999).
This provides an explanation of why a range of studies found negative effects of subtitles when learning from videos, as both are sources of visual information.





Textual Versus Nontextual Visual Information
Up to now, we have not distinguished between textual and nontextual visual information. As previously discussed, auditory and visual information are initially processed in separate channels. However, after this initial processing, any language presented either visually or verbally is processed in the same working memory subcomponent: the phonological loop (Baddeley, 2003). By contrast, nontextual visual information is processed in a different component, the visuospatial sketchpad. This notion can further clarify the findings presented earlier. Specifically, the presence of textual visual information, as compared with nontextual visual information, becomes an important variable to account for. A video that contains three different sources of language – narration, subtitles, and in-video text – is likely to induce cognitive overload. If the visual information in the video does not have a language component, we can expect a reduced or even no detrimental effect. More precisely, subtitles are expected to be detrimental to learning when a video already has a high level of VTIC. However, when a video has a relatively low level of VTIC, adding subtitles will not necessarily lead to cognitive overload. Should the addition of subtitles be desired, it then becomes necessary to ensure that the VTIC of a video is low enough to prevent detrimental effects due to cognitive overload. We propose two ways in which the VTIC of an educational video can be manipulated while maintaining the educationally relevant content.

Amount of Visual–Textual Information
The first, and most straightforward, aspect of VTIC is the amount of visual–textual information shown in a video. That is, a video in which much more text is shown is arguably more complex to process than a video with much less text.
However, removing information that is vital for understanding the topic of the video might reduce its complexity, but it will also harm the video's educational value. By contrast, removing or adding visual–textual information that is not strictly relevant to the learning goals can be used to decrease or increase, respectively, the VTIC of a video. Given that such information does not help the student master the learning goal, the validity of the video as an educational tool is fully maintained. For example, take a complex image such as a schematic representation of the human eye, with many labels referring to the individual parts of the eye. Labels that are not relevant to the learning goals can be removed, possibly greatly limiting the amount of visual–textual information presented to the student. Evidence for the beneficial effect of removing irrelevant information has been found in several studies and is typically referred to as the "coherence effect" (Butcher, 2006; Mayer et al., 2001; Moreno & Mayer, 2000).

Presentation Rate of Visual–Textual Information

The second proposed component of VTIC is the presentation rate of the visual–textual information. As discussed earlier, working memory is limited in how much information it can hold and process at any given time. Therefore, introducing many concepts simultaneously risks overloading a student with more information than he or she can effectively handle. This can be prevented by spreading the information over time, while maintaining the same overall amount of information. For example, the detrimental effects of subtitles disappear when verbal and written text explanations are presented before the visual information is shown (Moreno & Mayer, 2002b). With such a sequential presentation the student does not need to process the spoken word, the written word, and the visual information simultaneously. Instead, the narration and subtitles are processed first, and only afterward is the visual information shown. This effectively removes split-attention effects and spreads cognitive load over time, thus reducing the risk of cognitive overload. However, while this form of segmentation makes videos easier to process and understand, it also increases the video duration, which is often not desirable. Visual–textual information can be segmented without affecting the video duration by showing new information only from the moment it is mentioned in the narration and becomes relevant. Using the previous example of a complex schematic image with many labels: at the start of a video segment the image can be shown without any labels, with each label becoming visible from the moment it is verbally discussed. In this format the total duration and the narration remain unchanged, while the overall VTIC is decreased through a segmented presentation style.
Split-Attention Effects

As discussed, subtitles add a source of information that needs to be processed, leaving fewer cognitive resources for learning processes. Subtitles also draw visual attention, such that less attention is spent on other – possibly important – aspects of the video. Like other cognitive resources, attention is limited. That subtitles can cause a so-called split-attention effect has been made clear by several eye-tracking studies. In general, viewers spend a substantial amount of time paying attention to subtitles (Schmidt-Weigand, Kohnert, & Glowalla, 2010). In a video with a lecturer and subtitles, non-native speakers spent 43% of the time looking at the subtitles (Kruger, Hefer, & Matthew, 2014). The finding that subtitles draw so much attention further signifies their importance. Even when subtitles are beneficial for learning in certain circumstances, it should be taken into account that students will have less attention left for other visual information. In situations where subtitles do not significantly aid the learner, a substantial amount of attention will have been wasted. We propose two additional factors contributing to the VTIC of a video.

Attention Cuing

When presented with novel information, it can be difficult to immediately understand where to look. Profound differences in visual search and attention anticipation have been reported between expertise levels in many areas, such as chess, driving, and clinical reasoning (Chapman & Underwood, 1998; Krupinski et al., 2006; Reingold, Charness, Pomplun, & Stampe, 2001). Given the already high attentional load present in visually complex videos, the presence of subtitles can be expected to have detrimental effects. However, attention can be guided, and attentional load thereby lowered, by using attentional cues such as arrows pointing to the most relevant area in a video, or by underlining or highlighting these sections. Such attentional cues help novice learners direct their attention more effectively when and where it is necessary (Boucheix & Lowe, 2010; Ozcelik, Arslan-Ari, & Cagiltay, 2010), possibly lowering the detrimental effects of subtitles.

Physical Distances

The final proposed factor of VTIC relates to the physical organization of related information in a video – specifically, the physical distance between a header (such as a label) and its referent. Nontrivial physical distances between headers and referents are detrimental to learning, as longer distances require more cognitive resources to hold and process information (Mayer, 2008). Additionally, longer distances can induce a split-attention effect, as the increased distances require more attention, which can thus not be spent on other, more relevant parts of the video (Mayer & Moreno, 1998). A split-attention effect can further explain the contradictory findings: Subtitles will cause split attention in the presence of other visual information, such as graphics, texts, annotated pictures, or diagrams with textual explanations.
Furthermore, physical distances can be manipulated to increase or decrease the VTIC without affecting the educational content itself. Using the earlier example of the complex image with labels, the physical distances between the labels and their positions in the image can be changed to manipulate the VTIC of a video.

The Possible Role of Language Proficiency

It has been argued that subtitles (whether L1 or L2) are beneficial for the comprehension of L2 video content because they help students bridge the gap between their language proficiency and the target language (Chung, 1999; Vanderplank, 1988). More specifically, L2 written text is often easier to understand than spoken word, as reading comprehension skills are typically better developed in students (Danan, 2004; Garza, 1991). Perez et al. (2013) report different learning gains based on L2 proficiency, although their study provides insufficient evidence to verify a moderating role of L2 proficiency. Furthermore, L2 subtitles typically draw more attention than subtitles in one's native language, presumably because L1 subtitles can be processed more automatically and require only peripheral vision (Kruger et al., 2014). A final reason to consider L2 proficiency as an influential factor is that information that is already known requires much less, or possibly no, cognitive resources to operate on in working memory (Diana, Reder, Arndt, & Park, 2006; Sweller et al., 1998). As such, processing L2 subtitles can be expected to require fewer cognitive resources when a student has a higher L2 proficiency. At first sight, this appears to conflict with the argument that subtitles specifically help students with a lower L2 proficiency to bridge the language barrier. A possible integration of these findings would be that with a lower L2 proficiency subtitles do indeed require more effort to process, but can also aid learning, though only if no other visual information is present.

Putting the Pieces Together

Based on the literature discussed, we argue that to better understand the effects of subtitles on learning it is essential to consider both language proficiency and the complexity of visual–textual information. For example, consider videos showing a teacher explaining a topic simultaneously with a written summary, annotated pictures, or diagrams with textual explanations. The inclusion of subtitles in such videos can be detrimental to learning, especially for students with a low English proficiency. The many different visual sources of information will put more strain on the limited capacity of the visual working memory channel, such that the subtitles potentially cause not only cognitive overload but also a split-attention effect.

Hypothesis

The main hypothesis of the proposed model is a three-way interaction effect, specifically:
- There is a three-way interaction effect between English proficiency, subtitles, and VTIC on test performance.
Additionally, we predict specific directions in this three-way interaction:
- For low-VTIC videos, lower English proficiency is related to a higher performance gain when subtitles are enabled, and
- For high-VTIC videos, lower English proficiency is related to a higher performance loss when subtitles are enabled.
Note that the hypotheses concern relative differences in performance change. No claim is made about absolute differences between students with different levels of English proficiency, or between videos with different levels of visual–textual information. The underlying reasoning is that the presented framework predicts different effects of subtitles depending on the amount of additional visual information and the level of English proficiency, but it does not necessarily predict absolute differences.

Method

Videos

Four types of videos were used: videos with high/low VTIC, with/without subtitles. To ensure ecological validity, actual videos from MOOCs on the Coursera platform were used as base material; to make them usable for this experiment, however, they were extensively edited, as described below. Four videos served as raw material and were manipulated to create four versions of each, resulting in 16 videos. To manipulate the complexity of the videos, the four proposed VTIC components were used as a guideline, as summarized in Table 1. All other video characteristics were kept the same for each video. Each video lasts approximately 7 min, with no differences between the versions of a video. None of the videos in any version shows the narrator or teacher. Each video was narrated by the same person to exclude a narrator effect. In the versions with subtitles, the subtitles are shown in the bottom part of the screen, where they do not overlap with any other content. The narration and subtitles are verbatim identical. The video topics are: "The Kidney," "History of Genetics," "The Visual System," and "The Peripheral Nervous System."

English Proficiency Test

To test the hypothesis of the proposed model it was necessary to estimate the English proficiency of the participants. As the goal of this study is to generate results that can easily be implemented in online education, a short and easy-to-implement test was preferred. With this in mind, the English proficiency placement test made by TrackTest was used (TrackTest, 2016). TrackTest is a placement test that estimates the user's English proficiency on the widely used Common European Framework of Reference (CEFR) scales (Council of Europe, 2001). The CEFR identifies six levels, from A1 to C2, signifying beginner to advanced proficiency. The TrackTest is adaptive, meaning that subsequent questions are based on the performance on earlier questions. In total, each participant was presented 15 multiple-choice questions from a pool of 90. The test takes less than 10 min to complete. In a pilot test with 800 users who took the test twice, the test–retest reliability was satisfactory, with a Spearman's ρ of .736.

Table 1. Overview of the manipulations to create more and less complex versions

Complexity factor       High-complexity version   Low-complexity version
Irrelevant information  Included                  Removed
Segmentation            None                      Segmented
Attentional cues        No cues                   Cues
Physical distances      Increased                 Minimal

Table 2. Counterbalance list

List   Video 1   Video 2   Video 3   Video 4
1      C+ S+     C- S-     C+ S-     C- S+
2      C- S-     C+ S-     C- S+     C+ S+
3      C+ S-     C- S+     C+ S+     C- S-
4      C- S+     C+ S+     C- S-     C+ S-

Note. C+ and C- refer to more and less complex versions of a video, respectively. S- and S+ refer to the absence and presence of subtitles, respectively.

Procedure

Upon registration, each participant was randomly allocated to one of four counterbalance lists, which are presented in Table 2. The annotations C- and C+ refer to the video versions with decreased and increased levels of VTIC, respectively. Likewise, S+ and S- refer to videos with and without English subtitles, respectively. As shown, each participant views one video in each condition. Before the study, the participants were asked to rate their prior knowledge about each of the four topics. For example, regarding the video on the organization of the human eye, the participants were asked how much they knew about the different parts and the organization of the human eye. For these questions 5-point Likert scales were used; participants who scored a 3 or higher (i.e., who self-reported a moderate or greater amount of prior knowledge) were excluded from the study, to eliminate a confounding influence of expertise. The participants were allowed to watch each video only once. After every video, the participants were asked to rate how much mental effort they had to invest to understand the video, on a 9-point Likert scale. The difference in average mental effort ratings for the videos high and low in VTIC serves as a measure of manipulation success. Subsequently, participants were presented with a knowledge test of 10 multiple-choice questions containing factual questions about the content of the video. After completing the questions, participants continued with the next video, until all videos and tests were completed. Afterward, the participants took the short English proficiency placement test. Finally, they completed some questions about possible technical issues while watching the videos; these questions were asked for quality assurance. No relevant technical issues were reported.
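The counterbalancing in Table 2 has a Latin-square structure: across the four lists, each video appears in each of the four conditions exactly once. A minimal sketch of how such lists can be generated and verified (the rotation order shown here is illustrative, and `allocate` is a hypothetical helper, not part of the study's materials):

```python
import random

# The four within-subject conditions: complexity (C+/C-) crossed with
# subtitles (S+/S-), matching the notation of Table 2.
CONDITIONS = ["C+S+", "C-S-", "C+S-", "C-S+"]

def make_counterbalance_lists():
    """Rotate the conditions across the four videos so that, over the four
    lists, each video is watched in each condition exactly once."""
    return [[CONDITIONS[(video + shift) % 4] for video in range(4)]
            for shift in range(4)]

def allocate(lists):
    """Random allocation of a newly registered participant to one list."""
    return random.choice(lists)

lists = make_counterbalance_lists()
# Latin-square check: every video (column) is covered by all four conditions.
for video in range(4):
    assert {lst[video] for lst in lists} == set(CONDITIONS)
```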

Participants

As this study focuses on online education, participants were recruited and tested online, using the Prolific platform (Prolific Academic, 2015). Participants were eligible if they were over 18 years of age and non-native English speakers. Upon completion of the study the participants received compensation at a rate of €6.50 per hour. Instead of conducting a power analysis to decide on a fixed sample size a priori, the study started with an initial sample of 50 participants, and sampling continued in batches of 25 until sufficient evidence was present in the data, as explained in the next section.

Preregistered Analysis Plan

To test the hypothesis, a Bayesian repeated measures ANOVA was performed on the mean test scores, with subtitles (yes/no) and video complexity (high/low) as within-subject variables and English proficiency (1–5) as a between-subject variable. A Bayesian model comparison was used to decide on the model with the strongest evidence relative to the other models. This analysis was performed in JASP version 0.7.5 (Love et al., 2015), which uses a default Cauchy prior on effect sizes, centered on 0 with a scaling of 0.707, as argued for by Rouder, Morey, Speckman, and Province (2012). A Bayes factor of 3–10 for one model over another is interpreted as moderate evidence, 10–30 as strong, and above 30 as very strong. This analysis was performed after every batch of participants, and sampling continued until one model had a Bayes factor of at least 10 compared with every other model. This was the case after 125 participants. All the analyses described here were performed on the data from all 125 participants.
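The optional-stopping rule just described can be sketched as follows; `smallest_bf_of_best_model` is a hypothetical stand-in for rerunning the Bayesian repeated measures ANOVA in JASP after each batch, and only the sampling logic is shown:

```python
def samples_until_decisive(smallest_bf_of_best_model, start=50, batch=25,
                           threshold=10):
    """Grow the sample in batches until the leading model's Bayes factor
    against every rival model reaches `threshold` (here, BF >= 10)."""
    n = start
    while smallest_bf_of_best_model(n) < threshold:
        n += batch
    return n

# Illustrative (made-up) evidence trajectory: the criterion is first met at
# n = 125, the sample size at which data collection stopped in the study.
trajectory = {50: 2.1, 75: 3.4, 100: 6.8, 125: 11.5}
print(samples_until_decisive(lambda n: trajectory[n]))  # 125
```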

Results

First the descriptive statistics are reported, followed by the confirmatory analysis and several exploratory analyses. The data and the analysis scripts are available on the Open Science Framework at: https://osf.io/n6zuf/.


Descriptive Statistics

A total of 125 participants successfully completed the entire study. As shown in Table 3, the group of participants is well balanced in terms of gender and age. The language proficiency is skewed, with 43% of the participants at a high level, but all levels of language proficiency are sufficiently represented in the sample. Note that all participants are non-native English speakers, including the students at the highest proficiency levels (C1–C2). In the study, the participants watched four videos, each in a different condition. Table 4 shows the within-subject differences in test scores and self-reported mental effort ratings for each condition pair. These descriptive results give a mixed image. The mean differences between the conditions with the same complexity but subtitles enabled or disabled are the smallest, both for the test scores and the mental effort ratings. The differences between conditions with the same subtitle setting but different levels of complexity are larger, suggesting a main effect of complexity. Furthermore, this difference appears larger when subtitles are disabled, which might mean there is an interaction between complexity and subtitles. Note that Table 4 does not consider a possible main effect or interaction of language proficiency. The analysis of the full model with all main effects and interactions is reported in the next section.

Confirmatory Analysis

In accordance with the preregistered analysis plan, a Bayesian repeated measures ANOVA was performed on the test scores, with the following predictors: video complexity (high/low), subtitles (yes/no), and the participant's language proficiency (1–5). The results take the form of a model comparison: all possible combinations of main effects and interactions between the three predictors are compared in terms of how well they explain the data. Note that, in contrast to frequentist ANOVAs, multiple comparisons between all models can be performed without the need for corrections. The results of the Bayesian repeated measures ANOVA are displayed in Table 5, which lists all models in descending order of evidence. Model 1, which consists only of the main effects of complexity and language proficiency, with no main effect of subtitles and no interactions between any of the factors, has the most evidence: nearly 10^8 times more than the null model. Importantly, the evidence provided by this study favors the complexity + language proficiency model over the complexity + subtitles + language proficiency model (the second-best model) by a factor of 10.30:1.



Table 3. Descriptive statistics

Gender            Age              Language proficiency
67 Male (54%)     Min: 17 yr       A1: 14 (11%)
56 Female (46%)   Max: 53 yr       A2: 16 (13%)
                  Mean: 27.62 yr   B1: 23 (18%)
                  Sd: 7.24 yr      B2: 18 (14%)
                                   C1–2: 54 (43%)

Notes. Language proficiency: A1 is the lowest level. Two participants did not report their age and gender.

Table 4. Within-subject differences between conditions

Conditions                    Test difference (Sd)   Mental effort difference (Sd)
Complex: Subs – No Subs       0.10 (2.04)            0.09 (2.06)
Simple: Subs – No Subs        0.17 (2.10)            0.06 (1.71)
Subs: Complex – Simple        -0.50 (2.04)           0.32 (2.33)
No Subs: Complex – Simple     -0.77 (2.33)           0.18 (2.00)

In other words, there is 10.3 times more evidence for the C + L model than for the C + S + L model. Furthermore, every model that does not contain a main effect of subtitles is stronger than its counterpart that includes an effect of subtitles. The preregistered hypothesis was that the data would be best explained by the full three-way interaction model (Model 15). While there is more evidence for this model than for the null model, the data favor the simpler C + L model by a factor of roughly 400,000:1. The last column of Table 5 shows the posterior probability of each model. When considering only these models, and having no preference for any model before the study, P(M|D) gives the probability that the model is true, given the data and priors.
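The P(M|D) column of Table 5 follows directly from the BF(M, 0) column: with equal prior probabilities, each model's posterior probability is its Bayes factor against the null, normalized over all models (the null itself has BF = 1). A quick check with the Table 5 values:

```python
# BF(M, 0) values from Table 5, in descending order, including the null
# model (1.00) and the subtitles-only model (0.10).
bfs = [99_980_000.00, 9_704_000.00, 2_439_000.00, 2_086_000.00,
       524_242.34, 250_015.78, 119_737.40, 61_208.53, 52_917.54,
       13_331.75, 6_125.66, 2_818.57, 1_747.43, 321.62, 249.57,
       155.37, 39.09, 1.00, 0.10]

# Equal prior model probabilities cancel, so P(M|D) is the normalized BF.
total = sum(bfs)
posteriors = [bf / total for bf in bfs]

print(round(posteriors[0], 3))  # 0.868 -- P(M|D) of the winning C + L model
print(round(posteriors[1], 3))  # 0.084 -- P(M|D) of the C + S + L model
```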

Exploratory Analyses

While the Bayes factors quantify the relative evidence for the models, they do not provide estimates of population parameters such as means and effect sizes. Using Markov chain Monte Carlo (MCMC) methods from the BayesFactor package in R, we estimated population parameters using all the available data, with all the factors and their interactions (Morey & Rouder, 2015; R Core Team, 2016). Chains were constructed with 10^6 iterations; visual inspection of the chains and autocorrelation plots revealed no quality issues. We put Cauchy priors on the effect size parameters with a scaling factor of 1/2, as further described by Rouder et al. (2012). A Cauchy with a scaling factor of 1/2 has half its probability mass between -0.5 and 0.5, and the remaining half on more extreme values. In other words, we expect effect sizes of around ±0.5, but the prior is diffuse enough to be sensitive to more extreme effects. Using much wider or narrower scaling factors (from 1/6 to 4) does not affect the estimates in a consequential manner. As these priors cover all the effect sizes in the literature discussed, we consider the results to be insensitive to all plausible alternative priors. Complexity and subtitles were entered as factors (yes/no), while language was entered as a continuous variable (1–5). This analysis was done separately for the effects on test scores (described in the next section) and on mental effort ratings (described in the subsequent section).

Effects on Test Scores

A visualization of the posterior probability densities of the three main effects on test scores is shown in Figure 1. While Figure 1 shows only the main effects, the entire model with all factors and interactions was used to generate these posterior distributions. The density plot shows the most likely values of each effect size parameter, such that any point in the plot that is twice as high as another point is twice as likely. Note how the effect of subtitles is centered around 0, with higher and lower values becoming increasingly unlikely. By contrast, the effect of complexity is much stronger and most likely to be around -0.62 (for complex compared with simple). Language has a positive effect, with an effect size slope of around 0.55. Note that these are unstandardized effect size measures in grade points, on a scale of 0 to 10. In other words, while the effect of subtitles is most likely to be (close to) zero, both complexity and language have a noticeable effect on test scores. Compared with complexity and subtitles, the effect of language proficiency can be estimated with relatively little uncertainty. The parameter estimates of the main effects and all interactions are shown in Table 6.
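The half-mass property of the Cauchy prior described above follows from the Cauchy CDF, F(x) = 1/2 + arctan(x/γ)/π: for any centered Cauchy, half the mass lies within one scale parameter of zero, so a scaling factor of γ = 1/2 puts exactly half the prior mass in (−0.5, 0.5). A quick stdlib check:

```python
import math

def cauchy_cdf(x, scale):
    """CDF of a Cauchy distribution centered on 0 with the given scale."""
    return 0.5 + math.atan(x / scale) / math.pi

# Prior mass between -0.5 and 0.5 under a Cauchy prior with scale 1/2.
mass = cauchy_cdf(0.5, 0.5) - cauchy_cdf(-0.5, 0.5)
print(round(mass, 6))  # 0.5
```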
As can be seen in Table 6, the difference between two otherwise identical videos that differ only in complexity is 0.62 grade points. Dividing this by the standard deviation yields a Cohen's d effect size of 0.31. The slope of language proficiency (measured on a scale of 1–5) is 0.55 (Cohen's d of 0.27), with a 95% credible interval of 0.43–0.68. The effect of subtitles, however, is 0.04 grade points (Cohen's d of 0.02), and we cannot even be certain about the direction of the effect, as the credible interval spans both negative and positive values. This means that it is very likely to be (close to) zero. All the interaction effects are similarly centered around 0, with credible intervals that span both negative and positive values. These findings are fully consistent with the confirmatory analysis, which suggested that the best model includes only the main effects of complexity and language proficiency, and not the effect of subtitles or any of the interactions. Only complexity and



Table 5. Bayes factors of all models relative to the null model

#    Model                      BF(M, 0)        % Error   BF(M, M+1)   P(M|D)
1    C+L                        99,980,000.00   1.48%     10.30        0.868
2    C+S+L                      9,704,000.00    1.35%     3.98         0.084
3    C+L+C×L                    2,439,000.00    1.44%     1.17         0.021
4    C+S+L+C×S                  2,086,000.00    2.51%     3.98         0.018
5    C+S+L+S×L                  524,242.34      1.67%     2.10         0.005
6    C+S+L+C×L                  250,015.78      1.98%     2.09         0.002
7    C+S+L+C×S+S×L              119,737.40      2.77%     1.93         0.001
8    L                          61,208.53       0.62%     1.17         < 0.001
9    C+S+L+C×S+C×L              52,917.54       2.37%     3.97         < 0.001
10   C+S+L+C×L+S×L              13,331.75       2.19%     2.18         < 0.001
11   S+L                        6,125.66        1.06%     2.17         < 0.001
12   C+S+L+C×S+C×L+S×L          2,818.57        2.33%     1.31         < 0.001
13   C                          1,747.43        8.04%     5.43         < 0.001
14   S+L+S×L                    321.62          1.64%     1.29         < 0.001
15   C+S+L+C×S+C×L+S×L+C×S×L    249.57          2.80%     1.61         < 0.001
16   C+S                        155.37          1.26%     3.97         < 0.001
17   C+S+C×S                    39.09           13.77%    39.09        < 0.001
0    Null (intercept + subject) 1.00            n/a       10.10        < 0.001
18   S                          0.10            1.13%     n/a          < 0.001

Note. C = Complexity (high/low). S = Subtitles (yes/no). L = Language Proficiency (1–5). BF(M, 0) = Bayes factor of the model compared with the null model. BF(M, M + 1) = Bayes factor of the model compared with the next model. P(M|D) = Posterior probability of the model given the data, if each model had an equal probability of being true before this study.

Figure 1. Posterior probability density plots for the effects on test score (1–10).

language proficiency have 95% credible intervals that do not include zero, such that we can be confident about the direction of the effect, while the effects of the other factors are close to zero.

Effects on Mental Effort Ratings

In addition to the effects on test scores, we analyzed the effects of complexity, language proficiency, and subtitles on the participants' self-reported mental effort ratings for the videos. This analysis is identical to the previous analysis in every aspect other than the outcome variable. As described earlier, the participants were asked how much mental effort they had to invest in watching and understanding each video, on a 9-point Likert scale, with higher scores meaning more invested effort. A visualization of the posterior probability densities of the three main effects on mental effort ratings is shown in Figure 2. As can be seen in Figure 2, the effects of subtitles and language proficiency on the participants' mental effort ratings are both centered near 0. For subtitles, the effect is estimated at a 0.015 difference in mental effort ratings, 95% credible interval [-0.16, 0.19]. The effect of language proficiency is estimated at 0.017, 95% credible interval [-0.10, 0.14]. Of the three effects, complexity is the only one not centered around zero, and is estimated at 0.24,



Table 6. Parameter estimates of the intercept and factor effects

Parameter            Estimate   95% Credible interval
Intercept            4.81       [4.63, 4.98]
Standard deviation   2.00       [1.88, 2.13]
C0                   0.31       [0.13, 0.48]
C1                   -0.31      [-0.48, -0.13]
S0                   0.02       [-0.16, 0.19]
S1                   -0.02      [-0.19, 0.16]
Lang                 0.55       [0.43, 0.68]
C0 × S0              0.06       [-0.11, 0.24]
C0 × S1              -0.06      [-0.24, 0.11]
C1 × S0              -0.06      [-0.24, 0.11]
C1 × S1              0.06       [-0.11, 0.24]
C0 × Lang            0.03       [-0.10, 0.15]
C1 × Lang            -0.03      [-0.15, 0.10]
S0 × Lang            -0.04      [-0.17, 0.08]
S1 × Lang            0.04       [-0.08, 0.17]
C0 × S0 × Lang       -0.05      [-0.18, 0.07]
C0 × S1 × Lang       0.05       [-0.07, 0.18]
C1 × S0 × Lang       0.05       [-0.07, 0.18]
C1 × S1 × Lang       -0.05      [-0.18, 0.07]

Notes. C = Complexity (1 = high, 0 = low). S = Subtitles (1 = yes, 0 = no). Lang = Language Proficiency (slope).

95% credible interval [0.06, 0.41]. When transformed into standardized Cohen's d effect sizes, the effect of subtitles is 0.008, for language proficiency it is 0.010, and complexity has an effect of 0.120. The (unstandardized) effects of the interactions are all smaller than 0.02, which – on a scale of 1 to 10 – is so small that they will not be discussed further.
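In both analyses, the standardized Cohen's d values are simply the unstandardized estimates divided by the outcome's standard deviation; with the test-score numbers reported in Table 6 (SD = 2.00), for example:

```python
SD = 2.00  # standard deviation of the test scores (Table 6)

def cohens_d(raw_effect, sd=SD):
    """Convert a raw grade-point effect into a standardized Cohen's d."""
    return raw_effect / sd

print(cohens_d(0.62))  # 0.31 -- complexity effect on test scores
print(cohens_d(0.04))  # 0.02 -- subtitle effect on test scores
```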

Discussion

Open online education plays an important role in the globalization and democratization of education. To ensure not only the availability but also the accessibility of open online education, it is vital to remove potential obstacles and biases that put certain students at a disadvantage, for example, students with lower levels of English proficiency. This is not yet a given, as MOOCs are still provided primarily in English. In this study, we investigated whether the presence of English subtitles has beneficial or possibly detrimental effects on students' understanding of the content of English-language videos. Specifically, we tested the hypothesis that the effect of subtitles on learning depends on the English proficiency of the students and the VTIC of the video. Contrary to this hypothesis, we found strong evidence that there is no main effect of subtitles on learning, nor any interaction; there are only main effects of complexity and language proficiency. We discuss these findings in that order.


No Main Effect of Subtitles

Contrary to a range of previous studies, we found strong evidence that subtitles have neither a beneficial nor a detrimental effect on learning from educational videos. In addition, the presence or absence of subtitles appears to have no effect on self-reported mental effort ratings. This is surprising given the apparent consensus that enabling subtitles increases the general accessibility of online content, as stated in the Web Content Accessibility Guidelines 2.0 (WCAG, 2008). These null findings contradict two lines of research, one showing beneficial effects of subtitles, the other showing detrimental effects. Earlier research showing beneficial effects of subtitles comes primarily from studies on second-language learning, which show that second-language subtitles help students learn that language (e.g., Baltova, 1999; Chung, 1999; Markham, 1999; Winke et al., 2013). While this appears to conflict with the results of the present study, the important difference is that the current study used content videos rather than language-learning videos, and did not measure gains in second-language proficiency. Based on the current study, it seems that for content videos there is little to no benefit to enabling subtitles, even for students with a low language proficiency and for visually complex videos. A different body of research has shown detrimental effects of subtitles. This is often labeled the redundancy effect, the reasoning being that because the subtitles are verbatim identical to the narration, they are redundant and can only hinder the learning process (e.g., Mayer et al., 2001, 2003). This is in clear contrast to the findings of the current study, which estimates the effect of subtitles to be (close to) zero. Importantly, the English language proficiency of the students did not moderate the effect of subtitles, even though the study included participants with the full range of English proficiency levels.
As noted before, it might be that the subtitles helped the students with lower proficiency levels to increase their understanding of English, but this did not affect their test performance. With the Bayesian analyses we showed that subtitles do not merely have an indistinguishable effect (e.g., a nonsignificant effect in frequentist statistics) but that there is strong evidence for the absence of a subtitle effect on learning and mental effort. While these conclusions are based only on the selection of videos used in the current study, they call the generalizability of the redundancy effect into question by showing that it does not hold for these specific videos, and arguably not for a wider range of similar videos either. More research is needed to further establish the potential (lack of) effects of subtitles on learning from videos, both in highly controlled settings and in real-life educational settings. Specifically, it is essential to study the generalizability of findings like the redundancy effect and establish

© 2017 Hogrefe Publishing


T. van der Zee et al., Effects of Subtitles, Complexity, and Language Proficiency


Figure 2. Posterior probability density plots for the effects on mental effort ratings (1–10).

boundary conditions. Even though the current study used four different videos, each with four different versions, this is not sufficient to generalize to all kinds of educational videos. However, by manipulating the complexity of the videos, we were able to show that the null effect of subtitles cannot be explained by complexity or element interactivity (Paas, Renkl, & Sweller, 2003; Sweller, 1999). Furthermore, we compared the amount of evidence for a wide range of different models and found that every model that does not include a main effect of subtitles is stronger than its respective alternative model that does include subtitles. In addition, the within-subject design of the study severely reduces the plausibility of confounding participant characteristics. Finally, it is noteworthy that the current study only used second-language subtitles, meaning that providing subtitles in the native language of students can still have a positive effect on learning and accessibility (Hayati & Mohmedi, 2011; Markham et al., 2001).
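The model-comparison logic described above can be illustrated with the common BIC-based approximation to the Bayes factor. Note that the original analyses used JASP and the BayesFactor package in R, so this Python sketch and its BIC values are purely illustrative, not the authors' computation:

```python
import math

def bf01_from_bic(bic_alt: float, bic_null: float) -> float:
    """BIC approximation to the Bayes factor BF01, i.e., evidence for the
    null model relative to the alternative. Values > 1 favor the null."""
    return math.exp((bic_alt - bic_null) / 2.0)

# Hypothetical BICs for a model with vs. without a subtitle main effect:
# a 6-point BIC difference corresponds to BF01 = exp(3) ~ 20, conventionally
# read as strong evidence for the absence of the effect.
print(bf01_from_bic(bic_alt=110.0, bic_null=104.0))  # ~20.09
```

This shows why a null result can carry positive evidential weight under Bayesian model comparison, rather than being a mere failure to reject.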

Main Effect of Complexity

The effect of video complexity shows how video design can have a noticeable effect on test performance, either positively or negatively. In this study, the effect was estimated at 0.62 grade points (on a scale of 0–10), which translates to a Cohen’s d of 0.31. In addition, the self-reported mental effort rating was 0.24 higher for complex videos (on a scale of 1–10), which corresponds to a Cohen’s d of 0.12. This study did not use measures of engagement such as video dwelling time. However, a recent study showed that the textual complexity of videos in open online education explains over 20% of the variance in dwelling time (Van der Sluis, Ginn, & Van der Zee, 2016). As the quizzes took place immediately after each video, the current study only provides insight into how visual–textual information complexity (VTIC) affects short-term performance on tests. Effects on long-term learning are unknown, but it is plausible that the performance gap remains stable, or even widens as the test delay increases, since initial (test) performance typically strongly predicts future (test)

performance (e.g., Gow et al., 2011; Harackiewicz, Barron, Tauer, Carter, & Elliot, 2000; Karpicke & Roediger, 2007). Furthermore, the current study used individual videos, while most online courses have multiple related videos that build on each other. Whether such inter-video dependency strengthens or weakens the effect of VTIC is yet unknown, but warrants further investigation. In this study, the complexity of the videos was manipulated based on four principles extracted from the literature on multimedia learning: the segmentation effect, the signaling effect, the spatial contiguity effect, and the coherence effect, all of which are further explained and discussed in the Introduction, as well as by Mayer and Moreno (2003). This resulted in two different versions of each video that differ only in the (mainly visual) complexity of the presentation of information. While the mentioned manipulations have each been investigated independently, this is – to the best of our knowledge – the first study that combined all four to experimentally manipulate the complexity of videos. Surprisingly, while the individual manipulations had effect sizes ranging from Cohen’s d values of 0.48 to 1.36, the combined effect is estimated at a Cohen’s d of 0.31. We note several plausible interpretations for this discrepancy: the effect of the manipulations may vary with (a) video characteristics, (b) student characteristics, (c) the specific implementation, and/or (d) study design. First, while the current study used multiple videos and different versions of each video, a moderating effect of video characteristics cannot be ruled out. For example, the size of the effect might partly depend on characteristics such as the video’s length, educational content, or other aspects that were not manipulated in this study. Should this be the case, the generalizability of the four effects is limited by these moderating variables.
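As a quick sanity check on the effect-size comparisons above: Cohen’s d is simply the raw mean difference divided by the pooled standard deviation. The SD of roughly 2.0 used below is back-derived from the reported numbers (0.62 grade points ↔ d = 0.31) and is not a value reported in the paper:

```python
def cohens_d(mean_diff: float, pooled_sd: float) -> float:
    """Standardized mean difference (Cohen's d)."""
    return mean_diff / pooled_sd

# 0.62 grade points with an implied pooled SD of ~2.0 gives d = 0.31;
# the 0.24-point mental effort difference likewise gives d = 0.12.
print(cohens_d(0.62, 2.0))  # 0.31
print(cohens_d(0.24, 2.0))  # 0.12
```

Because d depends on the SD estimate, differences in how the SD is computed across studies (see Baguley, 2009) can by themselves shift reported effect sizes.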
Secondly, characteristics of the students in the different studies might partly explain the discrepancy in effect sizes. While many of the cited studies used the relatively homogeneous subpopulation of psychology students, the current study used participants from various countries, with varying



levels of education as well as levels of English proficiency. Given the wider and less selective range of participants, one would typically expect a more accurate estimation of the size and generalizability of the studied effects. Furthermore, given the within-subject design of the current study, it seems unlikely that potentially relevant participant characteristics confounded the results, which would be more likely in a between-subject design. Thirdly, it is important to note that the current study necessarily employed a specific operationalization of the four effects. For example, there are many ways to operationalize the signaling effect, using attentional cues of different kinds such as underlining, highlighting, or different kinds of arrows or circles. Given the wide range of possible operationalizations, some variation in the effects of these manipulations is to be expected. While this likely plays a role, it remains unclear whether it is a sufficient explanation. Finally, the fourth potential explanation of the difference in effect sizes is based on differences in study design and methodology. For example, the current study took place online, and not in a physical location such as a university. Another potential explanation lies in the way Cohen’s d is calculated, as well as in different estimations of the standard deviation or other choices in statistical procedures that can differ between studies (Baguley, 2009). In sum, while there are many plausible reasons for the differences in effect sizes, it remains unclear what the exact causes are, and whether they are systematic or due to random variation. This further emphasizes the need to study these instructional design guidelines for videos using a wide range of videos, in different educational contexts, and with a representative sample of participants.
While it is unrealistic to expect to predict the effect size of such manipulations with great precision across many different situations, a better understanding of moderating variables and boundary conditions is paramount for making better recommendations on how to create high-quality educational videos.

Main Effect of Language Proficiency

Students with a higher English language proficiency scored substantially higher than students with a lower proficiency. The slope of this effect was estimated at 0.55 grade points per proficiency level. Given the proficiency range of 1–5, the grade point difference between the students with the highest and lowest proficiency levels amounts to over 2 grade points. This further signifies the issue that open online courses such as MOOCs are not equally accessible to everyone, as the majority of the courses are provided in English. By extension, this calls for research investigating interventions or design strategies that might help close this performance gap. However, it is important to mention that

the design of the current study does not directly translate to how non-native English speakers engage with online courses. For example, the participants in this study were not allowed to re-watch or pause videos, take notes, or use any other strategy that might be particularly helpful for non-native speakers in online courses. Students who experience trouble understanding a video might use such strategies to counteract their initial disadvantage. However, it is also plausible that non-native English speakers are put off by the predominantly English online courses and choose not to engage at all, or drop out early, which should be prevented.
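The arithmetic behind the reported proficiency gap can be made explicit. The function name below is ours; the slope and the 1–5 proficiency scale are taken from the study, and a linear relationship is assumed as in the reported model:

```python
def proficiency_gap(slope: float, min_level: int, max_level: int) -> float:
    """Predicted grade-point gap between the lowest- and highest-proficiency
    students, given a linear slope per proficiency level."""
    return slope * (max_level - min_level)

# 0.55 grade points per level across a 1-5 proficiency scale:
print(proficiency_gap(0.55, 1, 5))  # 2.2
```

On a 0–10 grading scale, a gap of this size is substantial, which underlines the accessibility concern raised above.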

Summary and Consequences for Practice

To summarize, the visual–textual information complexity of a video and especially the language ability of the student are both strong predictors of learning from content videos. By contrast, English subtitles neither increased nor decreased the students’ ability to learn from the videos. However, this does not lead to the conclusion that English subtitles should not be made available, as they are vital for students with hearing disabilities. Furthermore, students might prefer watching videos with subtitles for other reasons, even though this might not directly affect their learning. The extent to which subtitles in the students’ native language might help them cope with limited English proficiency is as yet unknown and remains to be investigated. Another possibility would be to provide dubbed versions of each video to cater to more languages, but this is a costly intervention. Overall, we have shown that both the language proficiency of the student and the complexity of the video can have a substantial effect on learning from educational videos, which deserves attention in order to increase the quality and accessibility of open online education. These results have several consequences for educational practice. First, it is important to ensure that educational videos are designed in such a way that they do not hamper the learning process. Specifically, the visual–textual information complexity of educational videos should not be too high, for example through too much irrelevant information or a suboptimal physical organization of information. Secondly, educators of online courses should be aware of the possible detrimental effects of lower levels of English proficiency and aim to help these students as much as possible; merely providing English subtitles is not enough to guarantee accessibility.

References

Baddeley, A. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience, 4(10), 829–839. doi: 10.1038/nrn1201



Baguley, T. (2009). Standardized or simple effect size: What should be reported? British Journal of Psychology, 100(3), 603–617. doi: 10.1348/000712608X377117
Baltova, I. (1999). Multisensory language teaching in a multidimensional curriculum: The use of authentic bimodal video in core French. Canadian Modern Language Review, 56(1), 31–48. doi: 10.3138/cmlr.56.1.31
Boucheix, J.-M., & Lowe, R. K. (2010). An eye tracking comparison of external pointing cues and internal continuous cues in learning with complex animations. Learning and Instruction, 20(2), 123–135. doi: 10.1016/j.learninstruc.2009.02.015
Butcher, K. R. (2006). Learning from text with diagrams: Promoting mental model development and inference generation. Journal of Educational Psychology, 98(1), 182–197. doi: 10.1037/0022-0663.98.1.182
Campbell, J., Gibbs, A. L., Najafi, H., & Severinski, C. (2014). A comparison of learner intent and behaviour in live and archived MOOCs. The International Review of Research in Open and Distributed Learning, 15(5), 235–262. Retrieved from http://www.irrodl.org/index.php/irrodl/article/view/1854
Chapman, P. R., & Underwood, G. (1998). Visual search of driving situations: Danger and experience. Perception, 27(8), 951–964. doi: 10.1068/p270951
Chung, J. M. (1999). The effects of using video texts supported with advance organizers and captions on Chinese college students’ listening comprehension: An empirical study. Foreign Language Annals, 32(3), 295–308. doi: 10.1111/j.1944-9720.1999.tb01342.x
Council of Europe. (2001). Common European Framework of Reference for Languages: Learning, teaching, assessment. Cambridge, UK: Cambridge University Press.
Danan, M. (2004). Captioning and subtitling: Undervalued language learning strategies. Meta, 49(1), 67–77. doi: 10.7202/009021ar
Diana, R. A., Reder, L. M., Arndt, J., & Park, H. (2006). Models of recognition: A review of arguments in favor of a dual-process account. Psychonomic Bulletin & Review, 13(1), 1–21. doi: 10.3758/bf03193807
Garza, T. J. (1991). Evaluating the use of captioned video materials in advanced foreign language learning. Foreign Language Annals, 24(3), 239–258. doi: 10.1111/j.1944-9720.1991.tb00469.x
Ginns, P. (2006). Integrating information: A meta-analysis of the spatial contiguity and temporal contiguity effects. Learning and Instruction, 16(6), 511–525. doi: 10.1016/j.learninstruc.2006.10.001
Gow, A. J., Johnson, W., Pattie, A., Brett, C. E., Roberts, B., Starr, J. M., & Deary, I. J. (2011). Stability and change in intelligence from age 11 to ages 70, 79, and 87: The Lothian birth cohorts of 1921 and 1936. Psychology and Aging, 26(1), 232. doi: 10.1037/a0021072
Guo, P. J., Kim, J., & Rubin, R. (2014). How video production affects student engagement: An empirical study of MOOC videos. In Proceedings of the First ACM Conference on Learning @ Scale (pp. 41–50). doi: 10.1145/2556325.2566239
Harackiewicz, J. M., Barron, K. E., Tauer, J. M., Carter, S. M., & Elliot, A. J. (2000). Short-term and long-term consequences of achievement goals: Predicting interest and performance over time. Journal of Educational Psychology, 92(2), 316. doi: 10.1037/0022-0663.92.2.316
Harskamp, E. G., Mayer, R. E., & Suhre, C. (2007). Does the modality principle for multimedia learning apply to science classrooms? Learning and Instruction, 17(5), 465–477. doi: 10.1016/j.learninstruc.2007.09.010
Hayati, A., & Mohmedi, F. (2011). The effect of films with and without subtitles on listening comprehension of EFL learners. British Journal of Educational Technology, 42(1), 181–192. doi: 10.1111/j.1467-8535.2009.01004.x
Kalyuga, S., Chandler, P., & Sweller, J. (1999). Managing split-attention and redundancy in multimedia instruction. Applied Cognitive Psychology, 13(4), 351–371. doi: 10.1002/(SICI)1099-0720(199908)13:4<351::AID-ACP589>3.0.CO;2-6



Karpicke, J. D., & Roediger, H. L. (2007). Repeated retrieval during learning is the key to long-term retention. Journal of Memory and Language, 57(2), 151–162. doi: 10.1016/j.jml.2006.09.004
Kruger, J., Hefer, E., & Matthew, G. (2014). Attention distribution and cognitive load in a subtitled academic lecture: L1 vs. L2. Journal of Eye Movement Research, 7(5), 1–15. doi: 10.16910/jemr.7.5.4
Krupinski, E. A., Tillack, A. A., Richter, L., Henderson, J. T., Bhattacharyya, A. K., Scott, K. M., & Weinstein, R. S. (2006). Eye-movement study and human performance using telepathology virtual slides: Implications for medical education and differences with experience. Human Pathology, 37(12), 1543–1556. doi: 10.1016/j.humpath.2006.08.024
Liu, M., Kang, J., Cao, M., Lim, M., Ko, Y., Myers, R., & Schmitz Weiss, A. (2014). Understanding MOOCs as an emerging online learning tool: Perspectives from the students. American Journal of Distance Education, 28(3), 147–159. doi: 10.1080/08923647.2014.926145
Love, J., Selker, R., Marsman, M., Jamil, T., Dropmann, D., Verhagen, A. J., . . . Wagenmakers, E. J. (2015). JASP. Retrieved from https://jasp-stats.org/
Markham, P. (1999). Captioned videotapes and second-language listening word recognition. Foreign Language Annals, 32(3), 321–328. doi: 10.1111/j.1944-9720.1999.tb01344.x
Markham, P., Peter, L. A., & McCarthy, T. J. (2001). The effects of native language vs. target language captions on foreign language students’ DVD video comprehension. Foreign Language Annals, 34(5), 439–445. doi: 10.1111/j.1944-9720.2001.tb02083.x
Mayer, R. E. (2003). The promise of multimedia learning: Using the same instructional design methods across different media. Learning and Instruction, 13(2), 125–139. doi: 10.1016/s0959-4752(02)00016-6
Mayer, R. E. (2008). Applying the science of learning: Evidence-based principles for the design of multimedia instruction. American Psychologist, 63(8), 760–769. doi: 10.1037/0003-066x.63.8.760
Mayer, R. E., Dow, G. T., & Mayer, S. (2003). Multimedia learning in an interactive self-explaining environment: What works in the design of agent-based microworlds? Journal of Educational Psychology, 95(4), 806–812. doi: 10.1037/0022-0663.95.4.806
Mayer, R. E., Heiser, J., & Lonn, S. (2001). Cognitive constraints on multimedia learning: When presenting more material results in less understanding. Journal of Educational Psychology, 93(1), 187–198. doi: 10.1037/0022-0663.93.1.187
Mayer, R. E., & Moreno, R. (1998). A split-attention effect in multimedia learning: Evidence for dual processing systems in working memory. Journal of Educational Psychology, 90(2), 312–320. doi: 10.1037/0022-0663.90.2.312
Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38(1), 43–52. doi: 10.1207/S15326985EP3801_6
Moreno, R., & Mayer, R. E. (1999). Cognitive principles of multimedia learning: The role of modality and contiguity. Journal of Educational Psychology, 91(2), 358–368. doi: 10.1037/0022-0663.91.2.358
Moreno, R., & Mayer, R. E. (2000). A coherence effect in multimedia learning: The case for minimizing irrelevant sounds in the design of multimedia instructional messages. Journal of Educational Psychology, 92(1), 117. doi: 10.1037/0022-0663.92.1.117
Moreno, R., & Mayer, R. E. (2002a). Learning science in virtual reality multimedia environments: Role of methods and media. Journal of Educational Psychology, 94(3), 598–610. doi: 10.1037/0022-0663.94.3.598
Moreno, R., & Mayer, R. E. (2002b). Verbal redundancy in multimedia learning: When reading helps listening. Journal of Educational Psychology, 94(1), 156–163. doi: 10.1037/0022-0663.94.1.156





Morey, R. D., & Rouder, J. N. (2015). BayesFactor: Computation of Bayes factors for common designs [Computer software manual].
Nesterko, S. O., Dotsenko, S., Han, Q., Seaton, D., Reich, J., Chuang, I., & Ho, A. (2013). Evaluating the geographic data in MOOCs. Neural Information Processing Systems. Retrieved from http://nesterko.com/files/papers/nips2013-nesterko.pdf
Ozcelik, E., Arslan-Ari, I., & Cagiltay, K. (2010). Why does signaling enhance multimedia learning? Evidence from eye movements. Computers in Human Behavior, 26(1), 110–117. doi: 10.1016/j.chb.2009.09.001
Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38(1), 1–4. doi: 10.1207/S15326985EP3801_1
Perez, M. M., Noortgate, W. V. D., & Desmet, P. (2013). Captioned video for L2 listening and vocabulary learning: A meta-analysis. System, 41(3), 720–739. doi: 10.1016/j.system.2013.07.013
Prolific Academic. (2015). Prolific Academic. Retrieved from https://prolificacademic.co.uk
R Core Team. (2016). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
Reingold, E. M., Charness, N., Pomplun, M., & Stampe, D. M. (2001). Visual span in expert chess players: Evidence from eye movements. Psychological Science, 12(1), 48–55. doi: 10.1111/1467-9280.00309
Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012). Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology, 56(5), 356–374. doi: 10.1016/j.jmp.2012.08.001
Schmidt-Weigand, F., Kohnert, A., & Glowalla, U. (2010). A closer look at split visual attention in system- and self-paced instruction in multimedia learning. Learning and Instruction, 20(2), 100–110. doi: 10.1016/j.learninstruc.2009.02.011
Seaton, D. T., Bergner, Y., Chuang, I., Mitros, P., & Pritchard, D. E. (2014). Who does what in a massive open online course? Communications of the ACM, 57(4), 58–65. doi: 10.1145/2500876
Sweller, J. (1999). Instructional design in technical areas. Camberwell, Australia: ACER Press.
Sweller, J. (2010). Element interactivity and intrinsic, extraneous, and germane cognitive load. Educational Psychology Review, 22(2), 123–138. doi: 10.1007/s10648-010-9128-5
Sweller, J., Van Merrienboer, J. J., & Paas, F. G. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10(3), 251–296. doi: 10.1023/A:1022193728205
TrackTest. (2016). TrackTest English. Retrieved from http://tracktest.eu/
Vanderplank, R. (1988). The value of teletext sub-titles in language learning. ELT Journal, 42(4), 272–281. doi: 10.1093/elt/42.4.272
Van der Sluis, F., Ginn, J., & Van der Zee, T. (2016). Explaining student behavior at scale: The influence of video complexity on student dwelling time. In Proceedings of the Third (2016) ACM Conference on Learning @ Scale (pp. 51–60). ACM. doi: 10.1145/2876034.2876051
WCAG. (2008). Web content accessibility guidelines (WCAG) 2.0. Retrieved from http://www.w3.org/WAI/WCAG20/glance/
Winke, P., Gass, S., & Sydorenko, T. (2013). Factors influencing the use of captions by foreign language learners: An eye-tracking study. The Modern Language Journal, 97(1), 254–275. doi: 10.1111/j.1540-4781.2013.01432.x

Received January 25, 2016
Revision received November 30, 2016
Accepted December 2, 2016
Published online March 21, 2017


Tim van der Zee ICLON Universiteit Leiden Wassenaarseweg 62A 2333 AL Leiden The Netherlands t.van.der.zee@iclon.leidenuniv.nl

Tim van der Zee is a PhD student at ICLON, Leiden University Graduate School of Teaching, The Netherlands. He studies how students learn from educational videos in online learning environments such as MOOCs. As an experimental psychologist, he strives to better understand learning processes and how we can enhance the quality of online education.

Wilfried Admiraal (Leiden University) is a full professor of Educational Sciences and academic director of the Leiden University Graduate School of Teaching. His research interest combines the use of technology in education and social psychology in secondary and higher education.

Nadira Saab (PhD) is an assistant professor at ICLON, Leiden University Graduate School of Teaching, The Netherlands. Her research interests involve the impact of powerful and innovative learning methods and approaches – such as collaborative learning, technology-enhanced learning, (formative) assessment, and motivation – on learning processes and learning results.

Fred Paas is a Professor of Educational Psychology at Erasmus University Rotterdam in the Netherlands and a professorial fellow at the University of Wollongong in Australia. His research focuses on cognitive load theory and instructional design.

Bas Giesbers is a learning innovation consultant and researcher at the Learning Innovation Team of Rotterdam School of Management. His research interests involve synchronous and asynchronous online communication, motivation, collaborative learning, and the use of learning analytics to understand and improve learning.



Pre-Registered Report

“Drive the Lane; Together, Hard!” An Examination of the Effects of Supportive Coplaying and Task Difficulty on Prosocial Behavior

Johannes Breuer,1 John Velez,2 Nicholas Bowman,3 Tim Wulf,1 and Gary Bente1

1 Department of Psychology, University of Cologne, Germany
2 College of Media and Communication, Texas Tech University, Lubbock, TX, USA
3 Department of Communication Studies, West Virginia University, Morgantown, WV, USA

Abstract: As an entertainment technology, video games are a popular social activity that can allow multiple players to cooperatively engage on-screen challenges. Emerging research has found that when people play together, the resulting teamwork can have beneficial impacts on their prosocial orientations after gameplay – especially when the players are cooperative with one another. The present study aimed to expand the scope of these beneficial interpersonal effects by considering both inter- and intrapersonal factors. In an experimental study (N = 115) we manipulated the difficulty of a game (easy or hard) and the behavior of a confederate teammate (supportive or unsupportive playing style). We found that neither coplayer supportiveness nor game difficulty had an effect on the expectations of a teammate’s prosocial behavior or one’s own prosocial behavior toward the teammate after the game (operationalized as willingness to share small amounts of money with one’s teammate after playing). Increased expectations of prosocial behavior from one’s teammate were related to one’s own prosocial behaviors, independent of our manipulations. Considering these results, we propose alternative theoretical approaches to understanding complex social interactions in video games. Furthermore, we suggest exploring other types of manipulations of game difficulty and cooperation between video game players as well as alternative measures of prosocial behavior.

Keywords: video games, social play, reciprocity, prosocial behavior, challenge

Journal of Media Psychology (2017), 29(1), 31–41. doi: 10.1027/1864-1105/a000209

Although often studied when played in isolation, video games can be a surprisingly social entertainment technology. The earliest games, such as SpaceWar! (1962), required two players, and most home consoles from the Atari 2600 (1977) to the Microsoft XBox One (2013) have at least two controller ports. Advances in computing technology, such as the spread of high-speed and wireless Internet access, have further increased the opportunities for playing with others, whether colocated (playing on the same device or in front of the same screen) or otherwise. Research has found that the prosocial effects of cooperative video game play are determined by in-game behaviors that provide or withhold expected helpful behaviors from teammates (Velez, 2015). However, little is known about how different modes of social video game play can change expectations of teammates’ in-game behaviors and how this may influence subsequent expectations of reciprocity and prosocial behaviors. The current study provides an extension of previous research examining the benefits of supportive versus unsupportive teammates as applied from the perspective of bounded generalized reciprocity (Velez, 2015). Additionally, game difficulty is manipulated to examine whether a greater need for helpful in-game behaviors from teammates (i.e., hard difficulty settings) or a lower need (i.e., easy difficulty settings) influences

how teammates who satisfy or deny such expectations affect subsequent prosocial behaviors.

Interpersonal Processes and Video Game Play

A growing body of research suggests that video game play, when engaged in as a shared social activity, is more about the act of playing together than the content being played (e.g., Elson & Breuer, 2013), which requires a shift of theoretical frameworks to examine its interpersonal effects. One such framework is the theory of bounded generalized reciprocity (BGR; Yamagishi, Jin, & Kiyonari, 1999), proposed to explain people’s prosocial behaviors in the most basic inter- and intragroup interactions: minimal groups (i.e., groups formed by arbitrarily assigning strangers to group membership). BGR suggests that people rely on a set of instinctual expectations of others’ prosocial behaviors (i.e., the group heuristic) in order to maximize personal gain. People expect ingroup members to reciprocate prosocial behaviors, which deems them safe for prosocial interactions, whereas providing prosocial behaviors to outgroup members is considered risky because of the expected low




chance of reciprocation, even if individuals have never previously interacted (Yamagishi et al., 1999). Prosocial behaviors are proposed to be generalized to all ingroup members, such that prosocial behaviors are expected to be provided and reciprocated whether or not two ingroup members have previously interacted. Ingroup members who adhere to these expectations are rewarded, whereas those who disregard them are punished by being excluded from further benefits (Yamagishi et al., 1999). Research suggests that teammates’ behaviors during cooperative video game play are similarly rewarded or punished depending on whether teammates adhere to or disregard expectations according to the group heuristic. For example, teammates who confirmed expectations by providing and reciprocating helpful behaviors during video game play received the most prosocial behaviors, whereas teammates who defied expectations and provided no helpful behaviors received the lowest amount of prosocial behaviors – even lower than minimal-group teammates who did not play a video game together (Velez, 2015). Additionally, the same study found that participants’ prosocial behaviors toward teammates after video game play were mediated by expectations of teammates to provide their own subsequent prosocial behaviors, as suggested by BGR. That is, supportive teammates indicated their participation in the group heuristic and thus allowed participants to fulfill expectations of their own behaviors (e.g., to behave prosocially toward a teammate) with the assurance of collecting prosocial behaviors from their teammate. The current study aims to conceptually replicate and extend this previous research, which leads to our first set of hypotheses:

Hypothesis 1 (H1): People who play with a supportive teammate will be more likely to expect that teammate to provide subsequent prosocial behaviors than those who play with an unsupportive teammate.

Hypothesis 2 (H2): People who play with a supportive teammate will behave more prosocially toward that teammate than those who play with an unsupportive teammate.

Task Difficulty in Social Video Game Play
The research discussed in the previous section suggests helpful teammates confirm reciprocity expectations and unhelpful teammates defy them, but researchers also need to take into account other dimensions of social video game play that may increase or decrease what is expected of teammates during (and after) game play. For instance, the challenges faced by teams likely determine how much is expected of teammates, such that easy challenges result in fewer expectations of teammates, while hard challenges increase expectations due to the elevated need for teamwork. In addition, previous research has shown that success in a game is an important factor that also influences subsequent social interactions. For example, a study by Breuer, Scharkow, and Quandt (2015) found that losing in a competitive game increases negative emotions, which, in turn, increase the tendency to (re)act aggressively toward the opponent in subsequent interactions. Following this line of reasoning, one could expect more prosocial behavior to be shown after playing an easy game. Other research suggests that increases in difficulty can monopolize players' attention and redirect players' focus from social influences to the effort necessary to undertake the increase in challenge (Bowman, Weber, Tamborini, & Sherry, 2013). However, it is plausible that cooperatively taking on a difficult challenge might be a more powerful bonding activity than tackling an easy one. As both theoretical reasoning and the findings from previous studies suggest that the effect of game difficulty on prosocial behavior and on expectations about the prosocial behavior of a teammate could be positive or negative, we formulated two sets of competing hypotheses:

Hypothesis 3a (H3a): People who play a more difficult game will be more likely to expect their teammate to provide subsequent prosocial behaviors than those who play an easy game.

Hypothesis 3b (H3b): People who play a more difficult game will be less likely to expect their teammate to provide subsequent prosocial behaviors than those who play an easy game.

Hypothesis 4a (H4a): People who play a more difficult game will behave more prosocially toward their teammate than those who play an easy game.

Hypothesis 4b (H4b): People who play a more difficult game will behave less prosocially toward their teammate than those who play an easy game.

Given our expectations for independent main effects of cooperation and game difficulty on prosocial behaviors (both in terms of the player and the player's expectations of their teammate), it seems appropriate to consider the potential interaction of these main effects. In the case of our study, it makes sense that manipulations of teammate supportiveness might alter perceptions of the game's difficulty, and, conversely, it is sensible to assume that a more difficult game might impact perceptions of a teammate's helpfulness, due to frustration arising from the increased game challenge that might be misattributed (either explicitly or implicitly) to one's teammate. Thus, we proposed a research question regarding the (potential) interaction between supportiveness and difficulty and kept it open for exploration.


J. Breuer et al., Drive the Lane; Together, Hard!

Research Question 1 (RQ1): How will the effects of teammate supportiveness and game difficulty on expectations and prosocial behavior interact?

Finally, as the core of BGR is the expectation of reciprocity, we also assumed that expectations about the behavior of the teammate would affect one's own prosocial behavior:

Hypothesis 5 (H5): The expectation of prosocial behaviors from a teammate will predict prosocial behavior toward the teammate.

Method
We employed a 2 (teammate style: supportive vs. unsupportive) × 2 (game difficulty: hard vs. easy) between-subjects factorial design to examine the impact of both independent variables (and their potential interaction) on players' subsequent expectations of reciprocity and prosocial behaviors toward teammates.

Participants
We conducted an a priori power analysis with G*Power (version 3.1.9; Faul, Erdfelder, Lang, & Buchner, 2007) to determine an optimal sample size. We consulted previous work in this area to choose a realistic effect size. As there were no previous studies that investigated the effect of game difficulty on prosocial behavior, we had to restrict our a priori power analysis to the player interaction variable (i.e., supportiveness). Using an effect size estimate of f = 0.25 (Cohen's d = 0.5), we arrived at a suggested sample size of N = 128 for our 2 × 2 ANCOVA with one covariate to test Hypotheses 2, 4a, and 4b. As we expected that roughly 15% of the participants would have to be removed based on our exclusion criteria (see Data Analysis section), our targeted gross sample size was N = 148.[1] Participants were recruited via postings in student groups on social media, university mailing lists, and leaflets distributed among students of the institution of the first author as well as at a university of applied sciences in the same city and a local eSports bar. Psychology students were offered


course credit[2] for their participation. Psychology students who did not want course credit and participants who did not study psychology were entered into a drawing for one of 12 €25 Amazon.de gift certificates. A total of 122 individuals participated in the study (68 female, 54 male). The average age of the sample was 23.83 years (SD = 4.68).[3]
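The a priori power analysis can be approximated without G*Power. The following stdlib Python sketch (our own helper, not the authors' code) uses the normal approximation for a two-sided two-sample t test plus a standard small-sample correction, assuming the conventional α = .05 and power = .80:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-sample t test on Cohen's d,
    via the normal approximation plus a standard small-sample correction."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) / d) ** 2  # normal approximation
    return ceil(n + z_alpha ** 2 / 4)      # correction toward the exact t solution

n = n_per_group(0.5)
print(n, 2 * n)  # 64 per group, N = 128 in total
```

With d = 0.5 this reproduces the suggested total sample size of N = 128 reported above.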

Procedure
All interested participants were directed to a scheduling website via a URL in the e-mail/posting/leaflet where they could choose one 45-min laboratory session. These sessions were then randomly assigned to one of four experimental conditions. The laboratory setup for the study included a Nintendo Wii console connected to a standard television screen and two comfortable theater chairs to create a living-room-style atmosphere. Upon arrival at the laboratory, the participants read and signed an informed consent document before they were introduced to the video game through a controller information sheet and given a 3-min training session to familiarize themselves with the game. To manipulate supportiveness, we used confederates who were trained in playing the game prior to data collection. In order to avoid suspicion, the confederate was also presented with the controller information sheet and given time to "become familiar with the controls" (to reinforce their guise as a naive participant in the study). After the practice session, the game was reset and both participant and confederate played a full game of 12 min (four 3-min quarters). Following gameplay, the confederate was taken to an adjacent room while the participant remained in the same room to complete an online questionnaire. At the end of the session, the participants were thanked and debriefed and received the full €1 from the sharing task. The study was run by four experimenters (all female).

Stimulus Material
Video Game
Participants played NBA Jam: On Fire Edition (released in 2011 by EA Sports). The game is well suited for studying colocated social game play with two players as it (a) features teams of two cooperative players on screen at once,

1. Details about our a priori power analysis can be found in the preregistration document for this study, available via the Open Science Framework (OSF): https://osf.io/5ubwm/
2. At the institution of the first author, psychology students have to participate in a certain number of studies (measured in hours). With some buffer time, we rewarded psychology students with credits for 1 hr.
3. The reason we did not meet the targeted sample of N = 148 is that data collection was originally planned to be distributed across the institutions of the first, second, and third authors of this paper; however, due to a delay in the submission and revision process for the preregistration document for this study, data collection was only possible at the institution of the first author, as the semester break had already started at the other two institutions when the data collection phase began.

© 2017 Hogrefe Publishing

Journal of Media Psychology (2017), 29(1), 31–41



(b) simplifies gameplay to require only four buttons in total, and (c) presents a version of a popular global sport (basketball) in a simplified fashion that focuses on two core play mechanics: scoring points and preventing the other team from scoring.

Supportive Teammate Manipulation
Participants were assigned to play with one of four trained male confederates, posing as naive participants and trained to be helpful or not according to a script adapted from Velez (2015). Confederates who were instructed to be supportive began the game by stating, "Let's use some teamwork," and at three time points attempted to engage the participant in a cooperative move (an alley-oop, where one player passes the ball to another player who, while jumping, catches the ball and slams it into the basketball hoop) that requires both players' participation.[4] Supportive confederates were instructed to pass the ball to participants as much as possible throughout the game. Confederates who were instructed to be unsupportive used neutral statements at the beginning ("Looks like we are on the same team") and at three points throughout the game ("We are playing for 12 minutes, it looks like we have some more time to play.") in order to ensure all conditions had equal amounts of verbal statements from the confederate. Unsupportive confederates were instructed to never pass the ball to the participant. The detailed script that was given to the confederates (in German) as well as its English translation can be found in our Open Science Framework (OSF) project for this study (see https://osf.io/bsd97).

Game Difficulty Manipulation
To alter the difficulty of NBA Jam, players were assigned to play in either the "easy" or "hard" mode in the game's options menu. To ensure that the participants were unaware of this manipulation, the difficulty was chosen by the experimenter before the participant and the confederate entered the laboratory.

Measures
All of the following measures were presented in the online questionnaire administered after the main playing session. The order of the measures in the questionnaire was as follows: (a) demographic information, (b) manipulation checks, (c) reciprocity expectations, and (d) prosocial behavior (sharing). Measures were written in English and translated into German by one of the experimenters. Back-translations were performed by the authors to ensure the face validity of the items. All measures (in both English and German) are available via our shared OSF project link.


Reciprocity Expectations
Replicating past work (Greitemeyer, Traut-Mattausch, & Osswald, 2012; Velez, 2015), participants were told that they would engage in a money transaction game with their teammate. Specifically, they were told that both they and their teammate had ten 10-cent coins (€1 in total) and that they could donate any number of those to their teammate and/or keep as many as they liked. They were also told that any coins they donated would double in value for their teammate, but any coins they kept would not (and that the same was, of course, true for their teammate). After reading this instruction, they were first asked: "Out of the ten 10-cent coins possible to donate, how many do you think your teammate will choose to donate to you?" The response options ranged from 0 to 10.

Prosocial Behavior (Sharing)
To assess prosocial (sharing) behavior, participants were asked to indicate how many coins they would like to give to their teammate (from 0 to 10). For practical (there was no real interaction) and ethical reasons, all participants received the full €1 as payment when they were debriefed at the end of the study.

Manipulation Checks
To ensure that our manipulation of game difficulty was successful, we used both objective and subjective indicators of difficulty and performance. For objective performance, we noted the final score as well as the points scored by the participant and the confederate. To assess the players' subjective experience of success, we asked them to indicate how difficult the game was and how successful they felt after playing the game. As manipulation checks for the supportiveness of the teammate (confederate), we also used objective (number of assists and alley-oops) and subjective indicators (rating the confederate teammate in terms of sympathy, supportiveness, and competence, and indicating how much support they expected from their teammate).
Response options for self-report items ranged from 1 = strongly disagree to 7 = strongly agree.

Control Measures
As previous research has shown that the outcome of a video game (i.e., the success or score) can influence subsequent social interactions (Breuer et al., 2015), we wanted to use the difference between points scored by the team controlled by the participant and the confederate and the computer-controlled team as a covariate in our analyses.

Additional Measures
For the purpose of describing the sample, we asked participants to indicate their age, gender, and for how many

4. If at least one of these attempts was successful, the confederates were free to use this move again at any time.


hours they play video games in an average week. We also asked participants whether they had known the other player (confederate) before the study. If they indicated that they did, we asked them how well they knew them on a scale ranging from 1 = barely (e.g., "You saw her/him in a lecture or on the bus without really talking to her/him") to 5 = very well (e.g., "You are close friends, are/have been roommates"). At the end of the questionnaire, participants were asked in two open-ended questions if they noticed anything particular during the study or if they had any comments. These items were used to identify participants who guessed the true purpose of the study or noticed that they played against a confederate.
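The incentive structure of the coin task used for the expectation and sharing measures can be made concrete with a short sketch (the function name and variable names are ours; amounts are in euros):

```python
def payoffs(given_a, given_b, n_coins=10, coin_value=0.10):
    """Final payoffs for two players who each start with n_coins.
    Kept coins retain their value; donated coins double for the recipient."""
    payoff_a = (n_coins - given_a) * coin_value + given_b * 2 * coin_value
    payoff_b = (n_coins - given_b) * coin_value + given_a * 2 * coin_value
    return payoff_a, payoff_b

print(payoffs(0, 0))    # both keep everything: (1.0, 1.0)
print(payoffs(10, 10))  # full mutual donation doubles both payoffs: (2.0, 2.0)
print(payoffs(10, 0))   # a lone donor ends up with nothing: (0.0, 3.0)
```

The sketch shows why the task operationalizes reciprocity: full mutual donation is collectively optimal, but donating is only individually worthwhile if the teammate is expected to reciprocate.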

Data Analysis
Following the exclusion criteria defined in the preregistration document for this study, six participants were excluded because they guessed the true purpose of the study or indicated that they knew their teammate was a confederate, either in the open comments section of the questionnaire or in a verbal statement to the experimenter. Another participant was excluded because s/he received zero assists and completed zero alley-oops in the supportive condition. None of the participants indicated that they knew the confederate before the study. We chose not to use the fourth exclusion criterion described in the preregistration document (i.e., excluding participants who gave the confederate a supportiveness rating of 7 in the unsupportive condition or a rating of 1 in the supportive condition). Using this criterion would have led to the removal of 17 participants (in addition to the seven excluded based on the other three criteria). We discussed the potential reasons for this surprisingly high number, and the amount of high supportiveness ratings in the unsupportive condition (n = 15) supported our assumption that this is likely due to an issue of wording or terminology. Participants were asked whether their teammate was supportive. As the confederates were skilled players and instructed


to play as well as possible, the participants might have perceived the "egoistic" performance of the confederate in the unsupportive condition as "supportive" (or helpful) for being successful in the game.[5] Applying exclusion criteria 1 to 3 defined in the preregistration document resulted in a sample of N = 115 (65 female, 50 male) with an average age of 23.77 years (SD = 4.68) and an average amount of weekly video game playing of 4.23 hr (SD = 7.23).[6] Of the net sample, 59 participants were in the supportive (31 easy difficulty, 28 hard difficulty) and 56 in the unsupportive conditions (28 easy, 28 hard).[7] In conducting tests of our a priori hypotheses and probing the research question, our observed data were analyzed using two approaches in parallel: (a) null hypothesis significance tests (NHST), which are derived from a frequentist interpretation of probability and directly test for rejection of (or failure to reject) a null hypothesis, and (b) Bayesian hypothesis testing, which tests the probability of the observed data under different hypotheses (including the null). Providing an overview of both analyses and their relative affordances and constraints is beyond the scope of this paper, but a critical difference between the two approaches is that NHST does not conceptually allow for direct tests of the proposed alternative hypothesis (it only allows us to reject or fail to reject the null), whereas Bayesian hypothesis testing does (as it tests for competing likelihoods of both the predicted and the null hypothesis).
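The reciprocal relationship between BF10 and BF01, and the verbal evidence categories of Lee and Wagenmakers (2013) used later in the Results, can be sketched as follows (the function names are ours; the thresholds follow the published category scheme):

```python
def bf01(bf10):
    """BF01 is simply the multiplicative inverse of BF10."""
    return 1 / bf10

def evidence_category(bf10):
    """Verbal labels for evidence in favor of H1 (Lee & Wagenmakers, 2013)."""
    if bf10 < 1:
        return "evidence favors H0 (inspect BF01 = 1/BF10)"
    for bound, label in [(3, "anecdotal"), (10, "moderate"),
                         (30, "strong"), (100, "very strong")]:
        if bf10 < bound:
            return label
    return "extreme"

print(evidence_category(8.3))  # the difficulty manipulation check: moderate
print(round(bf01(8.3), 2))     # 0.12, the corresponding BF01
```

A BF10 of 8.3 thus means the data are 8.3 times more likely under the alternative than under the null, which falls in the "moderate" band.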
While our original plan was to include the final score of the game as a covariate (see "Analysis Plan" in the OSF preregistration document), we refrained from doing so because we found in an ANOVA with the experimental manipulations as independent and the score as dependent variables that, despite the extensive training for the confederates and the instruction to always play at their best and try to win, the final score was heavily influenced by the difficulty condition, F(1, 111) = 125.8, p < .001, ω² = 0.52, BF10 > 1,000.[8][9] In the easy condition the average score (i.e., the difference between points scored by the team

5. Besides not fully meeting the targeted sample size, this change of exclusion criteria was the only major deviation from our preregistration document. A detailed list of deviations (both major and minor) along with explanations for those can be found in our OSF project for this study (https://osf.io/db9af/).
6. Participants who do not (currently) play video games were asked to enter a 0 into the corresponding box in the online questionnaire. Removing these n = 63 individuals increased our sample's average video game use per week to 9.35 hr (SD = 8.25).
7. We also ran all of our confirmatory analyses (i.e., the manipulation checks and tests of our hypotheses) with all of the original exclusion criteria applied (N = 98). The results of these analyses are available as a separate JASP file in the OSF project.
8. To provide some orientation for readers who are not familiar at all with Bayesian hypothesis testing: Bayes factors are measures of the strength of the relative evidence in the data for a certain hypothesis (Morey, 2014). More specifically, "Bayes factors provide a numerical value that quantifies how well a hypothesis predicts the empirical data relative to a competing hypothesis" (Schönbrodt, 2014). In the case of the BF10 that we will report for the Bayesian independent-samples t tests in the Results section, higher numbers indicate stronger evidence for the alternative hypothesis, while numbers < 1 provide more evidence for the null hypothesis the closer they are to 0 (Lakens, 2014). The BF01, on the other hand, indicates support for the null hypothesis (i.e., a higher BF01 means stronger support for the null hypothesis). The BF01 is simply the multiplicative inverse of the BF10 and vice versa (i.e., BF01 = 1/BF10 and BF10 = 1/BF01). For very large or very small numbers, JASP uses e-notation. The exact BF10 for difficulty was 5.278e+16 (i.e., 5.278 × 10^16).
9. To keep the numbers (and tables) in a readable format, we chose to report Bayes factors as > 1,000 or < .001 if they are larger or smaller than these values.


Table 1. Manipulation checks for difficulty

| Measure | Easy M (Mdn, SD) | Hard M (Mdn, SD) | U | p | Cohen's d | BF10 | BF01 |
|---|---|---|---|---|---|---|---|
| Self-rated difficulty | 3.39 (3, 1.73) | 4.32 (4, 1.70) | 1,155 | .005 | 0.54 | 8.3 | 0.12 |
| Self-rated success | 4.08 (4, 1.87) | 3.52 (4, 1.61) | 1,950 | .092 | 0.33 | 0.77 | 1.3 |
| Score | 8.76 (10, 10.15) | −12.64 (−12, 10.44) | 3,064 | < .001 | 2.08 | > 1,000 | < .001 |
| Points by participant | 18.10 (18, 12.24) | 14.21 (12.5, 12.57) | 1,980.5 | .066 | 0.31 | 0.7 | 1.43 |
| Points by confederate | 35.39 (32, 19.38) | 31.07 (31, 17.56) | 1,849.5 | .270 | 0.23 | 0.4 | 2.5 |
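The U statistics reported for the manipulation checks come from Mann–Whitney tests; the statistic itself reduces to a rank sum, as the following stdlib sketch with made-up toy ratings (not the study's raw data) illustrates:

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: rank sum of group_a (with average ranks
    for ties) minus the minimum possible rank sum n_a(n_a + 1)/2."""
    pooled = sorted(group_a + group_b)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n_a = len(group_a)
    return sum(rank[x] for x in group_a) - n_a * (n_a + 1) / 2

easy = [2, 3, 3, 4]  # hypothetical self-rated difficulty scores
hard = [4, 5, 6, 6]
print(mann_whitney_u(easy, hard))  # → 0.5
```

U counts the pairs in which a value from the first group exceeds one from the second (half credit for ties), so the two directional U statistics always sum to n_a × n_b.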

controlled by participant and confederate and the AI-controlled team) was in favor of the human players (M = 8.76, SD = 10.15), whereas the opposite was true for the hard condition (M = −12.64, SD = 10.44). While we expected the absolute value of the scores to differ significantly between the easy and difficult conditions, we had hoped that the values would be positive in both conditions – that is, that players would always defeat their opponent, but the magnitude of that victory would be diminished in the difficult condition (a scenario that would have provided a more direct conceptual replication of Bowman et al., 2013). Hence, we tested our Hypotheses 1–4b in two separate ANOVAs and did not consider score magnitude as a covariate, as it was naturally confounded with game difficulty. Hypothesis 5 was tested in a bivariate regression with reciprocity expectation as the predictor and sharing as the dependent variable. All manipulation checks were done in a series of independent-samples t tests. Data preparation and descriptive analyses were done with SPSS 22.0, while all inferential tests (both frequentist and Bayesian[10]) were conducted using JASP version 0.8.0.0 (JASP Team, 2016). The SPSS datasets and syntax as well as the complete JASP project (including the data, the analyses, and the results) are available in our OSF project.
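The reported ω² for the difficulty effect on the score can be recovered from the F statistic alone, using the common conversion ω² = df_effect(F − 1) / (df_effect(F − 1) + N). A small sketch (our own helper, not the authors' code):

```python
def omega_squared(f, df_effect, n_total):
    """Omega squared effect size recovered from an ANOVA F ratio."""
    num = df_effect * (f - 1)
    return num / (num + n_total)

# F(1, 111) = 125.8 for the difficulty effect on score, with N = 115
print(round(omega_squared(125.8, 1, 115), 2))  # → 0.52, matching the reported value
```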

Results
Confirmatory Analyses
Manipulation Checks
As stated in the Measures section, we used both objective (based on the performance in the game) and subjective


(based on self-report) indicators for our manipulation checks. Because the Shapiro–Wilk test suggested deviations from normality for almost all of the manipulation check variables, we used the non-parametric Mann–Whitney U test for the frequentist analyses. The results from the difficulty manipulation checks are shown in Table 1. As expected, participants in the hard condition rated the difficulty higher than those in the easy condition (Cohen's d = 0.54, BF10 = 8.3).[11] According to the suggestions of Cohen (1988), this constitutes a medium effect, and the Bayes factor provides moderate evidence (Lee & Wagenmakers, 2013).[12] There was also a large difference in the final score between the easy and hard condition (Cohen's d = 2.08, BF10 > 1,000). While the differences for self-rated success and points scored by the participant were in the expected direction and the impact of difficulty can be interpreted as a small effect sensu Cohen (1988), these were not significant and the Bayes factors were indecisive (slightly in favor of the null hypothesis). That difficulty had almost no impact on the number of points scored by the confederates indicates that the training they received prior to the study seems to have been effective. Table 2 shows the results for our supportiveness manipulation checks. Participants in the supportive conditions reported that they received more support, gave the confederate higher sympathy ratings[13], received more assists from the confederate, and scored more alley-oops in the game. Notably, while the latter two were experimentally controlled behaviors (and, hence, the large effect sizes are to be expected), the former two are perception measures, and provide strong evidence that our manipulations worked in that the conditions differed from each other in these respects (especially

10. For the Bayesian analyses we used the default settings in JASP: a Cauchy prior width of 0.707 for the independent-samples t tests, an r-scale of 0.5 for fixed and 1 for random effects in the ANOVAs, and an r-scale of 0.354 for the predictors in the regression analyses.
11. There was stronger evidence for the effect of the difficulty manipulation on self-rated difficulty in the smaller sample using all four exclusion criteria (Cohen's d = 0.69, BF10 = 29.86).
12. Although some authors have criticized the use of labels to categorize the evidence provided by Bayes factors (Morey, 2015; Rouder, Speckman, Sun, Morey, & Iverson, 2009), we will use the categories proposed by Lee and Wagenmakers (2013) to provide some guidance, especially for readers who are unfamiliar with Bayesian hypothesis testing. Schönbrodt (2015) contrasts these views and provides a handy "grades of evidence cheat sheet."
13. In the smaller sample (i.e., with all four of the original exclusion criteria applied), the effect of the supportiveness manipulation on sympathy ratings for the confederate was noticeably larger (Cohen's d = 0.99, BF10 > 1,000).


Table 2. Manipulation checks for supportiveness

| Measure | Unsupportive M (Mdn, SD) | Supportive M (Mdn, SD) | U | p | Cohen's d | BF10 | BF01 |
|---|---|---|---|---|---|---|---|
| Received support | 4.95 (5, 1.89) | 6.07 (7, 1.38) | 1,052.5 | < .001 | 0.68 | 64.30 | 0.02 |
| Expected support | 4.70 (5, 1.67) | 4.86 (5, 1.67) | 1,540 | .525 | 0.10 | 0.23 | 4.43 |
| Confederate competence | 6.02 (6, 1.18) | 6.00 (6, 1.29) | 1,654 | .993 | 0.01 | 0.20 | 5.03 |
| Confederate sympathy | 5.05 (5, 1.30) | 5.76 (6, 1.37) | 1,075 | .005 | 0.53 | 7.05 | 0.14 |
| Assists by confederate | 0.86 (0, 1.66) | 8.86 (8, 5.13) | 128.5 | < .001 | 2.08 | > 1,000 | < .001 |
| Alley-oops scored by participant | 0.04 (0, 0.19) | 5.53 (4, 5.85) | 555 | < .001 | 1.31 | > 1,000 | < .001 |

the received support metric, Cohen's d = 0.68, BF10 = 64.3).[14] As we had hoped, the ratings of the confederates in terms of competence and the expectations about the supportiveness of the confederate did not differ between the unsupportive and supportive conditions.

Reciprocity Expectations
To test the impact of our experimental manipulations on reciprocity expectations, we calculated an ANOVA with supportiveness and difficulty as independent and the expectations about how many coins the teammate would share as the dependent variable. The number of coins participants expected their teammate to share did not differ between the easy (M = 6.20, SD = 2.88) and hard (M = 6.36, SD = 2.86) or the unsupportive (M = 6.20, SD = 3.12) and supportive (M = 6.36, SD = 2.62) conditions (see Table 3 for test statistics, including effect sizes and Bayes factors).[15] For Hypothesis 1, the BF01 for supportiveness indicates that the data were 4.85 times more likely under the null hypothesis than under the alternative hypothesis. In the case of our competing Hypotheses 3a and 3b, the BF01 for difficulty suggests that the data were 4.86 times more likely under the null hypothesis than under the alternative hypothesis. According to the verbal categories proposed by Lee and Wagenmakers (2013), this is moderate evidence for the null hypotheses.

Prosocial Behavior
The effect of supportiveness and difficulty on prosocial behavior (sharing) was tested in an ANOVA with the experimental conditions as independent and the number


of 10-cent coins shared as the dependent variable. The results of this ANOVA are displayed in Table 4. Contrary to our expectations, there was no effect of co-player supportiveness on prosocial behavior. The number of coins shared differed neither between the unsupportive (M = 7.5, SD = 2.89) and supportive conditions (M = 7.63, SD = 2.46) nor between the easy (M = 7.51, SD = 2.59) and hard conditions (M = 7.62, SD = 2.77).[16] With regard to our Hypothesis 2, the BF01 for supportiveness indicates that the data were 4.9 times more likely under the null hypothesis than under the alternative hypothesis. For Hypotheses 4a and 4b, the BF01 for difficulty suggests that the data were 4.92 times more likely under the null hypothesis than under the alternative hypothesis. Again, this is moderate evidence for the null hypotheses according to Lee and Wagenmakers (2013). In order to test our fifth hypothesis, in which we assumed that the expectation of prosocial behaviors from a teammate will predict prosocial behavior toward the teammate, we used bivariate linear regression. As can be seen in Table 5, expectations about how much the teammate (confederate) would share strongly predicted the participants' own prosocial behavior. Accordingly, our data strongly support Hypothesis 5.[17]
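In a bivariate regression the standardized β equals Pearson's r, so the regression entries can be cross-checked against each other via t = r√(n − 2)/√(1 − r²) and F = t². A sketch using the rounded β = 0.72 and N = 115 from this study (the helper is ours; the tolerance accounts for rounding of β):

```python
from math import sqrt

def f_from_beta(beta, n):
    """F statistic implied by a standardized beta (= r) in a bivariate regression."""
    t = beta * sqrt(n - 2) / sqrt(1 - beta ** 2)
    return t ** 2

implied_f = f_from_beta(0.72, 115)
print(round(implied_f, 1))  # ≈ 121.6, within rounding error of the reported F = 119.9
```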

Exploratory Analyses
As Breuer et al. (2015) have found that the outcome of a competitive video game can affect aggressive (i.e., antisocial) behavior toward the opponent, we investigated in

14. Unsurprisingly, the evidence for this manipulation check was substantially stronger when the fourth exclusion criterion was also applied (Cohen's d = 1.56, BF10 > 1,000).
15. While the means for the expectations were around 6 in all conditions, the most common expectation across conditions was that the teammate would share five coins (n = 37), followed by 10 coins (n = 32).
16. The most common amount of coins shared across conditions was 10 (n = 54), and the second most frequent choice was five coins (n = 30).
17. There were no differences in the results for any of our hypothesis tests between the sample with all four and the one with only the first three exclusion criteria applied.


Table 3. Results of the ANOVA for expectations about how much the teammate will share

| Effect | F | p | η²p | ω² | BFInclusion* |
|---|---|---|---|---|---|
| Difficulty | 0.05 | .82 | 0 | 0 | 0.14 |
| Supportiveness | 0.07 | .79 | 0 | 0 | 0.14 |
| Difficulty × Supportiveness | 0.14 | .712 | 0 | 0 | 0.03 |

Table 4. Results of the ANOVA for prosocial behavior (sharing)

| Effect | F | p | η²p | ω² | BFInclusion* |
|---|---|---|---|---|---|
| Difficulty | 0.08 | .782 | 0 | 0 | 0.14 |
| Supportiveness | 0.10 | .751 | 0 | 0 | 0.15 |
| Difficulty × Supportiveness | 0.37 | .544 | 0 | 0 | 0.04 |

Note. *Bayes factor in favor of including the variable.

additional exploratory analyses whether the result of the game also affected reciprocity expectations and prosocial behavior in our study. A Mann–Whitney U test revealed that the number of coins shared did not differ between games that were won (n = 53, M = 7.43, Mdn = 8, SD = 2.69) and games that were lost (n = 62, M = 7.68, Mdn = 9, SD = 2.67), U = 1,740.5, p = .561, d = 0.09, BF10 = 0.22. Similarly, the participants' expectations about how many coins their teammate would share also did not differ between games that were won (M = 6.25, Mdn = 5, SD = 2.89) and games that were lost (M = 6.31, Mdn = 5, SD = 2.86), U = 1,663, p = .910, d = 0.021, BF10 = 0.2.[18] This is further corroborated by the fact that the score did not predict the prosocial behavior (β = .08, p = .394, BF10 = 0.28)[19] or the expectations of the participant (β = .02, p = .809, BF10 = 0.2). Since we did not fully reach the 128 participants (after exclusion) suggested by our a priori power analysis (see preregistration document), we used G*Power (version 3.1.9; Faul et al., 2007) to calculate the power we had with our net sample size to detect an effect of d = .5 (i.e., the effect size we used for our a priori power calculations based on the literature) in t tests for the main effects. This analysis indicated that with our final net sample of N = 115 we had a power of 0.76 to detect a main effect of our experimental manipulations in the magnitude of d = .5.
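The achieved-power figure can be reproduced, up to the normal approximation of the noncentral t distribution that G*Power evaluates exactly, with a short stdlib sketch (the helper is ours; both factors split the net sample into groups of 59 and 56):

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(d, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-sample t test, using the
    normal approximation to the noncentral t distribution."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    delta = d * sqrt(n1 * n2 / (n1 + n2))  # noncentrality parameter
    return z.cdf(delta - z_alpha) + z.cdf(-delta - z_alpha)

print(round(power_two_sample(0.5, 59, 56), 2))  # → 0.76, as reported
```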

Discussion Previous research found that cooperative video game play can have prosocial effects for players (Adachi, Hodson, Willoughby, & Zanette, 2015; Ewoldsen, Eno, Okdie, Velez, Guadagno, & DeCoster, 2012; Greitemeyer & Cox, 2013; Velez, 2015; Velez, Mahood, Ewoldsen, & Moyer-Guse,

18

19

Table 5. Linear regression with expectation as predictor and prosocial behavior as dependent variable Frequentist

Expectation

Bayesian

F

p

B

β

BFInclusion

119.9

<.001

0.67

0.72

>1,000

2014; Waddell & Peng, 2014) and bounded generalized reciprocity theory suggests people naturally form expectations of prosocial reciprocity from ingroup members in minimal group settings (i.e., arbitrary group formation between strangers; Yamagishi et al., 1999). Recent research has explored how naturally formed ingroup reciprocity expectations (i.e., the group heuristic) are influenced when playing a video game with others and the subsequent effect of these changes on prosocial behaviors. Specifically, research suggests that, compared with minimal groups (i.e., strangers arbitrarily assigned to groups and did not play a video game), video game dyads with a helpful teammate confirmed players’ ingroup reciprocity expectations, while dyads with an unhelpful teammate disconfirmed expectations, which then led to increases or decreases in prosocial behaviors, respectively (Velez, 2015). The present study tried to extend this work by examining how supportive and unsupportive teammate behaviors can influence players’ ingroup reciprocity expectations and their resulting prosocial behaviors when playing under hard or easy game difficulty settings. While our study supported the BGR assertion that one’s own prosocial behaviors are largely determined by expectations of others’ reciprocity, our data do not lend support to the hypothesis that these expectations (and, consequently, one’s own prosocial behavior) are affected by the degree of supportiveness shown by a teammate in hard or easy cooperative video game play. As the current study suggests, supportive or unsupportive behaviors in hard or easy difficulty settings may not

18. It should be noted that the incidences of winning and losing were distributed unequally across the conditions: In the easy/unsupportive condition, 27 games were won and only one was lost, compared with 22 wins and nine defeats in the easy/supportive condition, two wins and 26 defeats in the hard/unsupportive condition, and also two wins and 26 defeats in the hard/supportive condition.

19. In the case of bivariate regression (i.e., if there is only one predictor), BF10 for the model and BFInclusion for the predictor are the same.

Journal of Media Psychology (2017), 29(1), 31–41

© 2017 Hogrefe Publishing


J. Breuer et al., Drive the Lane; Together, Hard!

confirm or disconfirm prior reciprocity expectations as they do under intermediate settings (Velez, 2015). There are several possible reasons why (un)supportive behaviors in easy and hard difficulty settings did not effectively convey reciprocity expectation information. For instance, when examining difficulty settings at the ends of the spectrum (e.g., hard and easy settings), it is possible that the increased focus on scoring points (i.e., unsupportive behaviors) may be perceived positively under easy difficulty settings, considering cooperative behaviors are not needed to score (at least to a lesser degree than under hard difficulty), and, thus, scoring might have substituted as a supportive behavior. However, it is also possible that the more difficult game might have caused participants to focus more on their own performance or mastery of the game than on the behavior of their teammate. Consistent with a social facilitation theory interpretation (Bowman et al., 2013; Bowman, 2016), the cognitive and behavioral demands of competing in the high-challenge game might have pulled attention away from the social elements; in such a scenario, it is possible that teammates’ behaviors do not have a strong influence on subsequent prosocial behaviors, regardless of their supportiveness.20

These potentially different interpretations of teammate behaviors in hard and easy difficulty settings may also require different or more nuanced manipulations of supportive teammates during video game play, as well as alternative measures of prosocial postgame behaviors. For example, when in-game behaviors are not sufficient or effective at influencing players’ subsequent reciprocity expectations, then other aspects of social video game play may carry more significance for players.
In the study by Velez (2015), confederates asked whether they should “pass more” or “set more screens” after each attempted helpful behavior, which provided two additional attempts at collaborative dialogue that were absent from the current study. Perhaps collaborative or supportive dialogue may be more effective at drawing players’ attention toward helpful teammates under circumstances of difficult game play. With regard to alternative postgame measures, it is important to remember that reciprocal interactions are often predicated on equal contributions from interactants. However, it is possible that, in comparison with the less practiced and competent participants, the skillful confederates created an environment of inequality between teammates, particularly in the current study’s manipulation of game difficulty. The prisoner’s dilemma game used in the current study adequately examines prosocial behaviors between interactants of similar standing but may not have been an appropriate prosocial measure here, given the dominance of the confederates’ contributions (e.g., points scored and defense against the opposing team) and the resulting reliance of participants on confederates. Future research should examine possible alternative measures, such as a dictator game (Guala & Mittone, 2010), which may be more suitable for examining the effects of social interactions in which one person takes a leading and dominant role.

Of course, our results have not only methodological but also theoretical implications. Previous research (Velez, 2015; Velez & Ewoldsen, 2013; Velez, Greitemeyer, Whitaker, Ewoldsen, & Bushman, 2016) has advocated BGR as an appropriate theoretical background for examining the dynamics of social video game behaviors, particularly in comparison with other theoretical frameworks that overlook the social implications of how players treat each other during video games (e.g., social identity theory, the general learning model, and Deutsch’s theory of cooperation and competition; see Velez et al., 2016, for further examples and elaboration). Aside from the need to more systematically identify, include (on the theoretical level), and take into account (on the methodological level) relevant boundary conditions and potential moderators, it is important to discuss additional or alternative theoretical approaches in order to better understand why the predictions we made in our hypotheses for this study might have been wrong or at least imprecise. As suggested in previous research (Velez et al., 2016), other theoretical frameworks outside or related to BGR may be needed to examine increasingly complex social video game interactions.
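The contrast drawn above between the prisoner's dilemma and the dictator game can be made concrete. The payoff values below are illustrative only, not those used in either study: in a prisoner's dilemma, outcomes depend on both players' simultaneous choices and presume interactants of roughly equal standing, whereas the dictator game lets one player unilaterally divide an endowment, so prosocial giving is measurable even when one partner dominated the preceding interaction:

```python
# Prisoner's dilemma: outcomes depend on BOTH players' simultaneous moves
# (C = cooperate, D = defect); hypothetical payoffs follow the usual
# ordering T > R > P > S.
PD_PAYOFFS = {  # (my_move, their_move) -> (my_points, their_points)
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def dictator_split(endowment, share_given):
    """Dictator game: one player alone decides the division, so prosociality
    is observable without assuming equal contributions from both sides."""
    given = endowment * share_given
    return endowment - given, given
```

A dictator-game measure thus sidesteps the equal-standing assumption that may have been violated by the confederates' dominant in-game role.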
For example, interdependence theory (Kelley & Thibaut, 1978) has been suggested as a useful theory for research on cooperative play that may be used to examine the moderating role of players being more or less dependent on teammates for success, similar to interactions typically found in hard and easy game settings between players of unequal skill levels. Furthermore, interdependence theory suggests that players’ comfort with this vulnerability or responsibility likely influences subsequent prosocial behaviors. To connect the methodological and theoretical implications of our study, future research examining hard and easy social video game play should utilize the moderators and mediators suggested by interdependence theory, given the current study’s unexpected variations in team scores, wins versus losses (see McGloin, Hull, & Christensen, 2016), and points scored by confederates.

In sum, there are several potential reasons why the predictions we made in Hypotheses 1–4b were wrong. It may be that our predictions were imprecise because there are relevant boundary conditions that we did not take into account, such as previous game experience and skill, the personal relevance of success in the game, or an imbalance of power in the player interactions. Testing this would require alternative methodological approaches (some of which we have outlined). It may also be that BGR has less explanatory power for complex social interactions in video games and their effects than we previously assumed; using and explicitly testing the predictions made by other theories that aim to explain cooperation and prosocial behavior, such as interdependence theory, would be a way to address this in future research. While our findings do not inherently invalidate BGR as a useful theoretical framework for studying cooperative play and prosocial behavior in video games, they do suggest that the generalizability of BGR to complex social video game interactions is potentially limited, and future research should draw on other theories geared toward understanding dynamic social interaction. Finally, given the methodological limitations of the study discussed earlier, the administered video game session may have been too weak to influence social–cognitive processes in the way we expected. Verifying this assumption would require additional studies with alternative and potentially stronger manipulations of cooperative behavior and possibly also other, more subtle or nuanced, measures of prosocial behavior.

20. While a 2 × 2 ANOVA with the data from our study did not provide evidence for an interaction effect of the difficulty and supportiveness manipulations on perceived coplayer supportiveness (p = .388, η²p = .01, BFInclusion = 0.26), we cannot rule out that the meaning of support or supportiveness was understood differently by participants (depending not only on the type of challenge that they faced, but also on factors like their own skills).

Acknowledgments

The authors thank Jennifer Meier, Jennifer Suckow, Benedikt Senk, Nadine Jarosch, Fabian Macholdt, Ilya Botkin, and Nina van Doorn for their work as experimenters and confederates in this study.

References

Adachi, P. J., Hodson, G., Willoughby, T., & Zanette, S. (2015). Brothers and sisters in arms: Intergroup cooperation in a violent shooter game can reduce intergroup bias. Psychology of Violence, 5(4), 455–462. doi: 10.1037/a0037407

Bowman, N. D. (2016). Video gaming as co-production. In R. Lind (Ed.), Producing 2.0: The intersection of audiences and production in a digital world (Vol. 2, pp. 107–123). New York, NY: Peter Lang Publishing.

Bowman, N. D., Weber, R., Tamborini, R., & Sherry, J. L. (2013). Facilitating game play: How others affect performance at and enjoyment of video games. Media Psychology, 16(1), 39–64. doi: 10.1080/15213269.2012.742360



Breuer, J., Scharkow, M., & Quandt, T. (2015). Sore losers? A reexamination of the frustration–aggression hypothesis for colocated video game play. Psychology of Popular Media Culture, 4(2), 126–137. doi: 10.1037/ppm0000020

Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.

Elson, M., & Breuer, J. (2013). Isolated violence, isolated players, isolated aggression: The social realism of experimental research on digital games and aggression. In T. Quandt & S. Kröger (Eds.), Multiplayer: The social aspects of digital gaming (pp. 226–233). London, UK: Routledge.

Ewoldsen, D. R., Eno, C. A., Okdie, B. M., Velez, J. A., Guadagno, R. E., & DeCoster, J. (2012). Effect of playing violent video games cooperatively or competitively on subsequent cooperative behavior. Cyberpsychology, Behavior, and Social Networking, 15(5), 277–280. doi: 10.1089/cyber.2011.0308

Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. doi: 10.3758/BF03193146

Greitemeyer, T., & Cox, C. (2013). There’s no “I” in team: Effects of cooperative video games on cooperative behavior. European Journal of Social Psychology, 43(3), 224–228. doi: 10.1002/ejsp.1940

Greitemeyer, T., Traut-Mattausch, E., & Osswald, S. (2012). How to ameliorate negative effects of violent video games on cooperation: Play it cooperatively as a team. Computers in Human Behavior, 28, 1465–1470. doi: 10.1016/j.chb.2012.03.009

Guala, F., & Mittone, L. (2010). Paradigmatic experiments: The dictator game. The Journal of Socio-Economics, 39(5), 578–584. doi: 10.1016/j.socec.2009.05.007

JASP Team. (2016). JASP (Version 0.8.0.0) [Computer software]. Available at https://jasp-stats.org/download/

Kelley, H. H., & Thibaut, J. W. (1978). Interpersonal relations: A theory of interdependence. New York, NY: Wiley.

Lakens, D. (2014, September 15). Bayes factors and p-values for independent t-tests [Web log post]. Retrieved from http://daniellakens.blogspot.de/2014/09/bayes-factors-and-p-values-for.html

Lee, M. D., & Wagenmakers, E.-J. (2013). Bayesian cognitive modeling: A practical course. Cambridge, UK: Cambridge University Press.

McGloin, R., Hull, K. S., & Christensen, J. L. (2016). The social implications of casual online gaming: Examining the effects of competitive setting and performance outcome on player perceptions. Computers in Human Behavior, 59, 173–181. doi: 10.1016/j.chb.2016.02.022

Morey, R. D. (2014, February 9). What is a Bayes factor? [Web log post]. Retrieved from http://bayesfactor.blogspot.co.uk/2014/02/the-bayesfactor-package-this-blog-is.html

Morey, R. D. (2015, January 30). On verbal categories for the interpretation of Bayes factors [Web log post]. Retrieved from http://bayesfactor.blogspot.de/2015/01/on-verbal-categories-for-interpretation.html

Rouder, J. N., Speckman, P. L., Sun, D., Morey, R. D., & Iverson, G. (2009). Bayesian t tests for accepting and rejecting the null hypothesis. Psychonomic Bulletin & Review, 16(2), 225–237. doi: 10.3758/PBR.16.2.225

Schönbrodt, F. (2014, January 21). A short taxonomy of Bayes factors [Web log post]. Retrieved from http://www.nicebread.de/a-short-taxonomy-of-bayes-factors/

Schönbrodt, F. (2015, April 17). Grades of evidence – a cheat sheet [Web log post]. Retrieved from http://www.nicebread.de/grades-of-evidence-a-cheat-sheet/




Velez, J. A. (2015). Extending the theory of Bounded Generalized Reciprocity: An explanation of the social benefits of cooperative video game play. Computers in Human Behavior, 48, 481–491. doi: 10.1016/j.chb.2015.02.015

Velez, J. A., & Ewoldsen, D. R. (2013). Helping behaviors during video game play. Journal of Media Psychology: Theories, Methods, and Applications, 25(4), 190.

Velez, J. A., Greitemeyer, T., Whitaker, J. L., Ewoldsen, D. R., & Bushman, B. J. (2016). Violent video games and reciprocity: The attenuating effects of cooperative game play on subsequent aggression. Communication Research, 43(4), 447–467.

Velez, J. A., Mahood, C., Ewoldsen, D. R., & Moyer-Guse, E. (2014). Ingroup versus outgroup conflict in the context of violent video game play: The effect of cooperation on increased helping and decreased aggression. Communication Research, 41(5), 607–626. doi: 10.1177/0093650212456202

Waddell, J. C., & Peng, W. (2014). Does it matter with whom you slay? The effects of competition, cooperation and relationship type among video game players. Computers in Human Behavior, 38, 331–338. doi: 10.1016/j.chb.2014.06.017

Yamagishi, T., Jin, N., & Kiyonari, T. (1999). Bounded generalized reciprocity: Ingroup boasting and ingroup favoritism. Advances in Group Processes, 16(1), 161–197.

Received January 29, 2016
Revision received December 2, 2016
Accepted December 2, 2016
Published online March 21, 2017

Johannes Breuer
Media and Communication Psychology
University of Cologne
Richard-Strauss-Str. 2
50931 Cologne
Germany
johannes.breuer1@uni-koeln.de

Johannes Breuer (PhD, 2013) is a postdoctoral researcher at the professorship for media and communication psychology at the University of Cologne (Germany) and the project “Redefining Tie Strength” (ReDefTie) at the Leibniz-Institut für Wissensmedien (Knowledge Media Research Center), Tübingen (Germany). His research interests include the uses and effects of video games, learning with new media, and methods of media effects research.


John Velez is an Assistant Professor in the College of Media and Communication at Texas Tech University, USA. His Ph.D. from the Ohio State University in 2014 focused on Mass Communication Uses and Effects. His research explores the psychological processes underlying new media selection and effects. His primary focus examines prosocial effects of video games.

Nick Bowman is an associate professor in the Department of Communication Studies at West Virginia University, USA. He received his PhD in communication with an emphasis on media psychology from Michigan State University in 2010. His research focuses on the cognitive, emotional, behavioral, and social demands of interactive media such as video games and social media.

Tim Wulf is a PhD student at the University of Cologne, Germany. He works as a research assistant at the University of Mannheim, Germany, and his PhD project on media use and nostalgia is funded by the Foundation of German Business (sdw). His research interests include the role of cooperation and competition in video games and media-induced nostalgia.

Gary Bente (PhD) is Professor of Media and Communication in the Department of Psychology, University of Cologne, and appointed as Professor in the Department of Communication, Michigan State University. His research interests include nonverbal behavior and person perception in face-to-face as well as mediated interactions, Virtual Reality as a research tool and an emergent communication medium, and emotional and cognitive media effects.



Pre-Registered Report

Video Game Use as Risk Exposure, Protective Incapacitation, or Inconsequential Activity Among University Students: Comparing Approaches in a Unique Risk Environment

Adrienne Holz Ivory, James D. Ivory, and Madison Lanier

Virginia Tech, Blacksburg, VA, USA

Abstract: While there is extensive literature exploring the possible negative effects of video games, much of it based on college student samples, there is little research on how video game use relates to the unique risk environment of college students. This study focuses on the distinctive risk aspects of the college and university environment with a preregistered survey comparing three competing models of video games’ possible role (games as risk, incapacitation, or inconsequential) in predicting alcohol and substance use, sexual risk, interpersonal violence, bullying victimization, suicide, disordered eating, and exercise, providing a baseline measure of what role, if any, video games play in the college and university risk environment. Video game play was most consistently associated with outcomes related to suicide and interpersonal violence, and more sporadically associated with some other outcomes.

Keywords: video games, risk, college, university health

Journal of Media Psychology (2017), 29(1), 42–53. DOI: 10.1027/1864-1105/a000210

One of the most prolific and prominent topics in media psychology scholarship over the past few decades has been the effects of video games on their users. Hundreds of studies have examined questions such as whether violent video games increase users’ aggressive behavior or whether video games are addictive for their users. Much of the research investigating the potential impact of video games on their users has involved either laboratory experiments or surveys. The former have tested causal links between video game play and artificial measures conceptually related to more important real-world outcomes, such as the effects of violent video games on administration of noise blasts to a confederate (see Elson, Mohseni, Breuer, Scharkow, & Quandt, 2014). The latter have tended to examine correlations between self-reported video game play and self-reported outcomes such as school performance (e.g., Gentile, Choo, Liau, Sim, Li, & Fung, 2011) and risk behavior (e.g., Hull, Brunelle, Prescott, & Sargent, 2014). It suffices here to say that the results and interpretation of all of this research are mixed.

Opinions vary widely among scholars about the strength of evidence for negative effects of video games, so much so that there is even animated disagreement about the extent of the existing scholarly consensus on the issue (see Bushman, Gollwitzer, & Cruz; Griffiths et al., 2016; Ivory et al., 2015; Petry et al., 2014; Quandt et al., 2015). Many of these disagreements revolve around the extent to which a host of methodological issues affect the validity and generalizability of research in the area. One of many such issues is the heavy use of college student samples in research on games’ effects (Ferguson & Olson, 2014), particularly given that much of the public discussion about societal effects of video games concerns children. Few of these studies of college students, however, address the unique health risks of that population; instead, studies generalize from their postsecondary student samples to broader societal groups. This is a missed opportunity, considering that (a) at least some video game use is nearly universal among college and university students and a majority of them play regularly (Jones, 2003), and (b) several risk behaviors are known to be particularly prevalent among college and university students – in some cases at rates higher than in the general population


A. Holz Ivory et al., Video Game Use and Risk at Universities

despite college and university students’ youthful age and low prevalence of many risk factors for poor health (e.g., low socioeconomic status; Centers for Disease Control & Prevention, 1997). It is possible that the effect of video games on college students’ health is negligible among other risk factors, as has been argued about the effect of video games in general (e.g., Ferguson & Olson, 2014). It is also possible that some video games are a contributing factor to certain unhealthy behaviors among college students, as has also been argued in the case of other populations (e.g., Gentile & Bushman, 2012; Hull et al., 2014). A third possibility, given that exposure to the social environment of colleges and universities introduces a constellation of health risks (Weitzman, Nelson, & Wechsler, 2003), is that time spent playing video games is a protective factor against campus risks simply because spending time with games causes users to spend less time in high-risk settings. This incapacitation approach has been examined with video game sales and crime data (Cunningham, Engelstätter, & Ward, 2016) but has remained largely unexplored in laboratory and survey research.

This study’s purpose is twofold: (a) to provide baseline survey data indicating what potential role video games may have in the unique health risk environment of college and university campuses, and (b) to test the competing predictions of the risk, incapacitation, and inconsequential approaches to the possible role of video games in the postsecondary risk environment.

Literature Review


Video Games as Added Risk, Incapacitating Distraction, or Inconsequential Activity in the Postsecondary Environment

The friction between different findings and interpretations of the literature about video game effects complicates predictions of what role, if any, video game use might play in the unique problems of the college and university risk environment. For several of the most prominently identified health risks in college, competing predictions are plausible based on three competing conceptual mechanisms:

1. Video game exposure as an added risk influencing negative outcomes along with other factors, whether because of the effects of message content or the effects of the game play setting, as predicted by popular models of learned negative game effects (Gentile & Bushman, 2012; Hull et al., 2014).

2. Time spent with video games as an incapacitating distraction from the dangerous elements of the university social environment, as predicted by economic approaches to video game use as time displacement from other activities (Cunningham et al., 2016).

3. Video game use as an inconsequential activity not uniquely influential in the face of greater risk factors, as predicted by models focusing on behavior as a product of biology and long-standing traits with minimal influence of short-term environmental triggers (e.g., Ferguson & Olson, 2014).

Health Risks in Colleges and Universities

Health risks can be influenced by a range of factors, which include not only individuals’ overt risk-taking behaviors but also circumstances that expose individuals to the risk of victimization by others. A commonly identified risk behavior among college and university students is abuse of alcohol and other substances, which is itself a risk and is also understood to be an important contributor to other associated health risks among students (Weitzman et al., 2003). Campus alcohol use is associated with negative outcomes for both drinkers and nondrinkers, and is implicated as a factor in injuries, fights, sexual assaults, and risky sexual behavior (Abbey, 1991; Cooper, 2002) – all of which are behaviors identified as substantial risks for college students (Centers for Disease Control and Prevention, 1997). Health risks such as tobacco use, drug abuse, eating disorders, and decline in physical activity are often adopted in college, bullying remains prevalent in college, and college students attempt suicide at a higher rate than the general population (Chappell et al., 2004; O’Neill, 2007; Taylor et al., 2006).

Tobacco, Alcohol, and Substance Use

In the case of abuse of alcohol and other substances, the friction between these three competing approaches to video game effects is apparent. There are findings indicating that video games and media that idealize and celebrate risk may promote risk-taking among users (e.g., Hull et al., 2014); this effect is not unique to oft-violent action games (a genre that includes first-person shooters and fighting games), but has been observed with racing games as well (Fischer, Greitemeyer, Kastenmüller, Vogrincic, & Sauer, 2011). Additionally, playing games, whether alone or with friends, may co-occur with substance abuse. Meanwhile, however, the incapacitation argument that time spent playing games may be time not spent in the alcohol-drenched college social environment, as well as the inconsequential argument that game effects are not consistently observed on meaningful outcomes, both remain relevant to alcohol and substance abuse – although for these approaches, game content and genre are not



important to effects. Therefore, we propose competing hypotheses based on the competing approaches in turn:

Hypothesis 1a (H1a): Overall video game use, action game use, and sports and racing game use will be positively associated with use of tobacco, alcohol, and other substances (risk).

Hypothesis 1b (H1b): Overall video game use will be negatively associated with use of tobacco, alcohol, and other substances (incapacitation).

Hypothesis 1c (H1c): Overall video game use, action game use, and sports and racing game use will not be associated with use of tobacco, alcohol, and other substances (inconsequential).

Sexual Risk

For sexual risk, the risk approach again predicts genre-specific increases in risk, as well as effects of overall game use given the prevalence of risk-taking in popular video games (Hull et al., 2014); the incapacitation approach predicts a broad reduction in risk; and the inconsequential approach predicts that video game play will be irrelevant:

Hypothesis 2a (H2a): Overall video game use, action game use, and sports and racing game use will be positively associated with sexual risk (risk).

Hypothesis 2b (H2b): Overall video game use will be negatively associated with sexual risk (incapacitation).

Hypothesis 2c (H2c): Overall video game use, action game use, and sports and racing game use will not be associated with sexual risk (inconsequential).

Interpersonal Violence

The same thread of competing hypotheses is plausible for interpersonal violence, although with a unique focus on violent action games given the literature’s focus on aggression as a possible effect of exposure to violent games (e.g., Bushman et al., 2015; Hull et al., 2014). Here, the risk approach predicts unique effects of oft-violent action video game use, although the generally high prevalence of violence in video games would be consistent with the risk prediction as well:

Hypothesis 3a (H3a): Overall video game use and action game use will be positively associated with carrying weapons and fighting (risk).

Hypothesis 3b (H3b): Overall video game use will be negatively associated with carrying weapons and fighting (incapacitation).

Hypothesis 3c (H3c): Overall video game use, action game use, and sports and racing game use will not be associated with carrying weapons and fighting (inconsequential).

Bullying Victimization

While bullying victimization is a concern on college and university campuses, it is less clear what the competing approaches might predict in terms of game effects, given that bullying victimization is not instigated by the user. Therefore, we ask:

Research Question 1 (RQ1): Are overall video game use, action game use, and sports and racing game use associated with bullying victimization?

Suicide

It is also unclear how suicide can be interpreted as a risk affected by game messages or time displacement. Therefore, we ask:

Research Question 2 (RQ2): Are overall video game use, action game use, and sports and racing game use associated with suicide risk?

Disordered Eating

Evidence for the presence of idealized body images in video game characters (Lynch, Tompkins, van Driel, & Fritz, 2016) suggests a risk prediction that overall video game use would be associated with disordered eating. However, the incapacitation hypothesis is relevant if video game use distracts from unhealthy social influences, and the inconsequential approach also remains plausible:

Hypothesis 4a (H4a): Overall video game use will be positively associated with disordered eating (risk).

Hypothesis 4b (H4b): Overall video game use will be negatively associated with disordered eating (incapacitation).

Hypothesis 4c (H4c): Overall video game use will not be associated with disordered eating (inconsequential).

Exercise

In the case of exercise, the incapacitation hypothesis loses relevance, as time displacement could detract from exercise opportunity. Therefore, only a risk hypothesis of game time detracting from exercise and an inconsequential hypothesis are proposed:

Hypothesis 5a (H5a): Overall video game use will be negatively associated with exercise (risk).



Hypothesis 5b (H5b): Overall video game use will not be associated with exercise (inconsequential).

Method

Design

An online survey of full-time students enrolled at colleges and universities in the United States measured participants’ age, sex, parents’ education level, self-reported overall video game use, video game use by genre, and risk-related behaviors pertaining to tobacco, alcohol, and substance use, sexual risk, interpersonal violence, bullying victimization, suicide, disordered eating, and exercise.

Participants

Participants were recruited for this study using Amazon’s Mechanical Turk crowdsourcing Internet marketplace, which has been found to contain a disproportionately high prevalence of college students (e.g., Levay, Freese, & Druckman, 2016; Ross, Irani, Silberman, Zaldivar, & Tomlinson, 2010). Recruiting materials indicated that only full-time college and university students 18 years and older were eligible to participate, and participants were asked to confirm their full-time enrollment status and list the college or university they were attending. To limit irrelevant, spurious, and careless responses, eligibility criteria based on Mechanical Turk participation history from Levay et al. (2016) were applied, and data from participants who provided disqualifying responses (e.g., not a full-time student) or impossible responses (e.g., smoking cigarettes 40 days per month), or who completed the instrument in fewer than 5 min, were excluded from analyses.

Given the results of an a priori power analysis (multiple regressions with three predictors including two controls; f² = .02 with power = .80; see Analysis Strategy section), the targeted number of valid participants was n = 543. Recruitment was conducted over a period of about 36 hr until the target number was reached after eliminating disqualified respondents. The resulting final sample comprised 553 respondents included in analyses (of 1,313 total respondents). Respondents were paid US $1.

Respondents were 57.3% male (n = 317) with a median age of 23 years (M = 25.02, SD = 5.67). Respondents’ ethnic make-up was 68.2% White (n = 377), 13.6% Black, not Hispanic (n = 75), 8.7% Hispanic or Latino (n = 48), 6.5% Asian or Pacific Islander (n = 36), 1.4% American Indian or Alaskan Native (n = 8), and 1.6% other races (n = 9). Compared with the US population of postsecondary students, the sample is proportionally more male, slightly more White, and older (National Center for Educational Statistics, 2016).
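The a priori power analysis described above is typically run in a tool such as G*Power, but it can be approximated from first principles. The sketch below is an illustration under stated assumptions, not the authors' exact computation: it solves for the smallest sample size at which an F test for a multiple regression with three predictors reaches the target power for f² = .02, using the noncentral F distribution with the common noncentrality convention λ = f²·n:

```python
from scipy.stats import f as f_dist, ncf


def required_n(f2=0.02, n_predictors=3, alpha=0.05, target_power=0.80):
    """Smallest n at which the regression F test reaches the target power.

    Assumes the noncentral F distribution with noncentrality lambda = f2 * n,
    a convention used by common power-analysis tools for multiple regression.
    """
    n = n_predictors + 2  # smallest n with positive error df
    while True:
        df2 = n - n_predictors - 1
        crit = f_dist.ppf(1 - alpha, n_predictors, df2)       # rejection cutoff
        power = 1 - ncf.cdf(crit, n_predictors, df2, f2 * n)  # P(reject | effect)
        if power >= target_power:
            return n
        n += 1
```

Under these assumptions the result lands in the mid-500s, in the neighborhood of the paper's target of n = 543; the exact figure depends on the λ convention and on whether the test targets the full model or a single coefficient.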
Ó 2017 Hogrefe Publishing


Participants reported spending a mean of 6.41 hr per week (SD = 7.79) playing video games, with 84.45% (n = 467) reporting at least 1 hr of play per week.
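The a priori power analysis reported above (three predictors, f2 = .02, power = .80) can be reproduced approximately from the noncentral F distribution. The sketch below is our own illustration, not the authors' procedure; it assumes Python with scipy, and the exact target n depends on the rounding conventions of the power software used.

```python
from scipy.stats import f as f_dist, ncf

def regression_power(n, predictors=3, f2=0.02, alpha=0.05):
    """Power of the overall F test in a multiple regression with n cases."""
    df1 = predictors
    df2 = n - predictors - 1
    crit = f_dist.ppf(1 - alpha, df1, df2)  # critical F under the null
    lam = f2 * n                            # noncentrality: f2 * (df1 + df2 + 1)
    return 1 - ncf.cdf(crit, df1, df2, lam)

def required_n(target=0.80, **kwargs):
    """Smallest n reaching the target power."""
    n = 20
    while regression_power(n, **kwargs) < target:
        n += 1
    return n
```

With these inputs the required sample size lands in the mid-500s, consistent with the reported target of n = 543.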

Questionnaire Instrument and Measures

The Qualtrics online survey tool was used to deliver a questionnaire to participants. Measures of video game use and demographics are original. Measures of risk behaviors and parents' education level were adapted from the Centers for Disease Control and Prevention's (1997) National College Health Risk Behavior Survey (NCHRBS) and the American College Health Association's (2015) National College Health Assessment.

Overall Weekly Video Game Use
One question asked how many hours participants spend playing video games (including games played on a game console, computer, or mobile device) in a typical week at college.

Weekly Video Game Use by Genre
A subsequent question, presented only to participants who reported some weekly video game use, measured the proportion of time spent in a typical week at college playing games from each of a short and broad list of genres: action, role-playing, simulation, strategy, sports (including racing), and puzzle and trivia. Percentage responses for each genre were multiplied by overall weekly video game use to produce weekly video game use measures for each genre.

Tobacco, Alcohol, and Other Substance Use
In all, 12 questions asked how many times in a typical month at college participants (a) smoke cigarettes, (b) smoke electronic cigarettes, (c) use chewing tobacco or snuff, (d) have at least one drink of alcohol, (e) have five or more drinks of alcohol in a row, within a few hours, (f) use marijuana, (g) use any form of cocaine including powder, crack, or freebase, (h) use inhalants, (i) use other illegal drugs, (j) use steroids without a prescription, (k) take prescription drugs that were not prescribed to them, and (l) ride in a vehicle driven by someone who has been drinking alcohol.

Sexual Risk
Questions assessed general sexual risk behavior and sexual assault risk exposure specifically.
One question asked the number of sexual partners with whom participants have sexual intercourse, oral sex, or anal sex in a typical month at college. Three questions asked how many times in a typical month at college participants have (a) sexual intercourse, oral sex, or anal sex, (b) sexual intercourse or anal sex using a condom, and (c) sexual intercourse or anal sex without using a condom. One question asked how many times in the past year while at college participants have become pregnant or gotten someone pregnant. Two questions asked how many times in the past year participants (a) had someone have sexual intercourse, oral sex, or anal sex with them without their permission or when too intoxicated to provide consent, and (b) had sexual intercourse, oral sex, or anal sex without being sure their partner gave permission or when the partner may have been too intoxicated to provide consent.

A. Holz Ivory et al., Video Game Use and Risk at Universities

Fighting
Three questions asked how many times in the past year while at college participants have (a) carried a weapon such as a gun, knife, or club (not for work), (b) been in a physical fight, and (c) been in a physical fight in which they were injured and had to be treated by a doctor or nurse.

Bullying Victimization
Two questions asked how many times in the past year while at college participants have been (a) bullied physically, and (b) bullied electronically online.

Suicide
Two questions asked how many times in the past year while at college participants have (a) seriously considered suicide and (b) attempted suicide.

Disordered Eating
One question asked for participants' perceptions of their weight (1 = very underweight, 3 = about the right weight, 5 = very overweight). Three questions asked how many times in a typical month at college participants (a) diet to lose weight or keep from gaining weight, (b) vomit or take laxatives to lose weight or keep from gaining weight, and (c) take diet pills to lose weight or keep from gaining weight.

Exercise
Four questions asked how many days in a typical month at college participants (a) participate in sports activities for at least 20 min that made them sweat or breathe hard, (b) do stretching exercises, (c) do strength exercises, and (d) walk or bicycle for at least 30 min at a time.

Demographic Measures
Three questions asked participants' age, sex, and race. Sex was used as a control variable in analyses.

Parents' Education
To control for an indirect indicator of socioeconomic status, two questions asked the education level of participants' mother and father, with four responses ranging from did not finish high school to graduated from college and a not sure option. The variable was treated as continuous in analyses, with responses scored 1-4 and averaged across the two measures (Cronbach's α = .671). Respondents who selected the not sure option for both parents did not receive a score.

Other Questions
Questions measuring class in school, television use, reading for school, and reading for pleasure were included, but not analyzed.

Analysis Strategy

While some outcome measures were conceptually related (e.g., three measures of tobacco use), they were not considered unidimensional enough to be examined in indices, and normality of data was not expected to be consistent across measures. Therefore, analyses were conducted individually for each measure. Each of a series of multiple regression analyses included the control variables of sex and parents' education, the appropriate video game use predictor variable, and the appropriate outcome measure entered as the dependent measure. For each of the risk hypotheses, a regression equation included the control measures and the overall game use measure predicting each outcome measure, with analyses then repeated for any predicted game genre measure (e.g., action games and sports and racing games for H1a and H2a; action games for H3a). The incapacitation hypotheses were tested with the same regression analyses for overall game use, and the inconsequential hypotheses were tested with the same series of regressions for overall game use as well as additional regressions for the game use genres predicted for the corresponding risk hypotheses.

Corrections to critical alpha values were made to adjust for the number of measures included for each conceptual outcome variable using the Bonferroni method. Specifically, the traditional critical alpha of p < .05 was divided by the number of outcome measures examined for each conceptual variable (3 for tobacco use, 3 for alcohol use, 6 for other substance use, 5 for general sexual behavior, 2 for sexual assault, 3 for fighting, 2 for bullying, 2 for suicide, 4 for disordered eating, and 4 for exercise).

While hierarchical OLS regression may be appropriate for variables with normally distributed data, event count data are likely to be positively skewed and better fit by the Poisson regression model, which is modeled on the Poisson distribution rather than the normal distribution.
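The per-family Bonferroni corrections described above amount to a one-line division; the family sizes below are taken directly from the text. A minimal sketch:

```python
# Bonferroni-corrected critical alpha for each conceptual outcome family:
# the traditional .05 divided by the number of measures in the family.
FAMILY_SIZES = {
    "tobacco": 3, "alcohol": 3, "other substances": 6,
    "general sexual behavior": 5, "sexual assault": 2,
    "fighting": 3, "bullying": 2, "suicide": 2,
    "disordered eating": 4, "exercise": 4,
}

def critical_alpha(family, base_alpha=0.05):
    """Per-test critical alpha within one conceptual outcome family."""
    return base_alpha / FAMILY_SIZES[family]
```

For example, the tobacco family yields a critical alpha of .05/3 ≈ .0167, the threshold used in the Results.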
That said, the Poisson model assumes the variance and mean of the data to be equal and sometimes fits event count data poorly. In such cases, the negative binomial regression model, which allows the mean and variance to be estimated independently, may provide a superior fit. Finally, some count variables have a disproportionately high number of zero scores because certain circumstances may lead some cases to be in a structural zero group where the event never occurs (e.g., committed nondrinkers), while other circumstances predict event counts among cases where the event may occur (e.g., factors influencing number of drinks for sporadic or heavier drinkers). For these variables, zero-inflated Poisson regression models and zero-inflated negative binomial models jointly model two processes for a given outcome: the likelihood of zero or nonzero outcomes using logistic regression, and counts for the outcome using either Poisson or negative binomial regression, respectively (see Atkins & Gallop, 2007).

Given that some event count outcome measures were not expected to be normally distributed, an a priori analysis plan was developed to account for skewness. If the skewness of an outcome measure did not exceed a threshold of ±2, a hierarchical OLS regression was used for that outcome measure, with the control measures entered in the first step and the predictor measure added in the second step. If skewness did exceed ±2, a series of regression models (each increasingly deviant from assumptions of normality and dispersion) were conducted and tested for fit in this order: Poisson model, negative binomial model, zero-inflated Poisson model, and zero-inflated negative binomial model. However, sparseness of data for some counts led all Poisson and negative binomial models to produce Hessian matrix singularity, yielding unstable parameter estimates and unreliable model fit data. Additionally, some zero-inflated negative binomial models did not produce results because optimization failed to converge. Thus, zero-inflated Poisson models with the control variables as covariates are reported for all analyses of outcome variables with skewness exceeding ±2.


Results

Study data and materials are available at https://osf.io/evknr/?view_only=f5d377bfa06b47fda1354e5b33737b51. See Table 1 for a list of all study outcome measures, associated descriptive statistics, and a summary of regression analysis results.

Tobacco, Alcohol, and Other Substance Use

H1a predicted that overall video game use, action game use, and sports and racing game use would be positively associated with use of tobacco, alcohol, and other substances (risk); H1b predicted that overall video game use would be negatively associated with substance use (incapacitation); and H1c predicted that overall video game use, action game use, and sports and racing game use would not be associated with substance use (inconsequential).

Tobacco
A series of nine zero-inflated Poisson regression model equations were conducted with each combination of the three weekly game play variables predicting the three skewed (> ±2) tobacco use variables, with the two control measures as covariates and a critical alpha value of .0167. Weekly action game play was a significant predictor in the zero-inflation model for e-cigarette use (coefficient estimate = −.069, SE = .027, z = −2.529, p = .011), as well as in the count model for chewing tobacco use (coefficient estimate = −.073, SE = .024, z = −3.080, p = .002). No other coefficients were significant (p > .0167). Overall, results for tobacco use favor confirmation of H1c (inconsequential), except that action game play was associated with a decreased likelihood of never smoking e-cigarettes in a month (the structural zero group), in partial support of H1a (risk). Action game play was associated with fewer days per month using chewing tobacco, partially consistent with H1b (incapacitation), but the predicted mechanism of H1b was not genre-specific.

Alcohol
A series of three hierarchical linear multiple regression equations and six zero-inflated Poisson regression equations were conducted with each combination of the three weekly game play variables predicting the one nonskewed (< ±2) and two skewed (> ±2) alcohol use variables, with the two control measures as covariates and a critical alpha value of .0167. Weekly sports and racing game play was a significant predictor in the count model for having at least five drinks in a sitting (coefficient estimate = .091, SE = .017, z = 5.406, p < .001), as well as in the zero-inflation model for riding with a drunk driver (coefficient estimate = −.316, SE = .087, z = −3.615, p < .001). No other coefficients were significant (p > .0167). Overall, results for alcohol use favor confirmation of H1c (inconsequential), except that sports and racing game play was associated with more days per month having five or more drinks in a sitting and a decreased likelihood of never riding with a drunk driver in a month, in partial support of H1a (risk).

Other Substances
A series of 18 zero-inflated Poisson regression model equations were conducted with each combination of the three weekly game play variables predicting the six skewed (> ±2) substance use variables, with the two control measures as covariates and a critical alpha value of .0083. Overall weekly game play was a significant predictor in the count model for using other illegal drugs (coefficient estimate = .046, SE = .015, z = 3.107, p = .002). Weekly action game play was a significant predictor in the count


Journal of Media Psychology (2017), 29(1), 42–53




Table 1. Descriptive statistics and results of hypothesis and research question tests for outcome measures

Outcome measure | M | SD | Skewness | Video game use predictors tested*
Days per month smoking cigarettes | 3.449 | 8.867 | 2.479 | Overall (null); Action (null); Sports/Racing (null)
Days per month smoking e-cigarettes | 1.440 | 5.823 | 4.364 | Overall (null); Action (−zero); Sports/Racing (null)
Days per month using chewing tobacco | 0.440 | 2.969 | 8.073 | Overall (null); Action (−count); Sports/Racing (null)
Days per month having at least one drink | 5.780 | 6.434 | 1.446 | Overall (null); Action (null); Sports/Racing (null)
Days per month having five or more drinks in a sitting | 1.960 | 3.575 | 2.985 | Overall (null); Action (null); Sports/Racing (+count)
Days per month riding with a drunk driver | 0.570 | 2.304 | 6.668 | Overall (null); Action (null); Sports/Racing (−zero)
Days per month using marijuana | 3.101 | 7.804 | 2.617 | Overall (null); Action (null); Sports/Racing (null)
Days per month using cocaine | 0.083 | 0.593 | 11.113 | Overall (null); Action (null); Sports/Racing (null)
Days per month using inhalants | 0.060 | 0.609 | 12.644 | Overall (null); Action (+count); Sports/Racing (−count)
Days per month using other illegal drugs | 0.180 | 1.438 | 17.064 | Overall (+count); Action (null); Sports/Racing (null)
Days per month using unprescribed steroids | 0.070 | 0.645 | 11.420 | Overall (null); Action (null); Sports/Racing (null)
Days per month using drugs prescribed to others | 0.310 | 1.557 | 7.654 | Overall (null); Action (null); Sports/Racing (null)
Number of sexual partners per month | 1.580 | 5.480 | 17.107 | Overall (null); Action (null); Sports/Racing (+count)
Times per month having sex | 7.690 | 9.559 | 3.081 | Overall (+count); Action (null); Sports/Racing (+count)
Times per month having sex using a condom | 4.160 | 6.797 | 2.727 | Overall (null); Action (null); Sports/Racing (null)
Times per month having sex not using a condom | 3.610 | 8.237 | 4.889 | Overall (+count); Action (+count); Sports/Racing (null)
Times in past year involved in a pregnancy | 0.050 | 0.228 | 5.175 | Overall (null); Action (null); Sports/Racing (null)
Times in past year victim of someone else having sex without consent | 0.100 | 0.745 | 10.627 | Overall (+count); Action (null); Sports/Racing (+count)
Times in past year had sex with someone else without consent | 0.110 | 0.734 | 9.699 | Overall (null); Action (null); Sports/Racing (+count)
Times in past year carrying a weapon not for work | 8.620 | 49.350 | 6.477 | Overall (+count); Action (+count)
Times in past year in a physical fight | 0.220 | 0.997 | 7.123 | Overall (+count); Action (null)
Times in past year needing medical attention from a physical fight | 0.080 | 0.568 | 11.672 | Overall (+count); Action (+count)
Times in past year a victim of physical bullying | 0.110 | 0.797 | 13.873 | Overall (null); Action (null); Sports/Racing (null)
Times in past year a victim of online bullying | 2.630 | 42.989 | 22.749 | Overall (+count); Action (−count); Sports/Racing (null)
Times considered suicide in past year | 1.050 | 12.966 | 22.338 | Overall (+count, −zero); Action (+count); Sports/Racing (−count)
Times attempted suicide in past year | 0.030 | 0.440 | 21.381 | Overall (+count); Action (+count); Sports/Racing (null)
Weight (1 = very underweight, 5 = very overweight) | 3.320 | 0.710 | 0.401 | Overall (+)
Days per month dieting | 6.290 | 9.874 | 1.400 | Overall (null)
Days per month vomiting or taking laxatives | 0.170 | 1.132 | 11.164 | Overall (−count)
Days per month taking diet pills | 0.830 | 4.362 | 5.892 | Overall (−count)
Days per month playing sports | 10.670 | 8.405 | 0.551 | Overall (null)
Days per month stretching | 10.250 | 9.192 | 0.631 | Overall (null)
Days per month doing strength exercises | 9.058 | 8.311 | 0.732 | Overall (null)
Days per month walking or bicycling | 13.268 | 9.957 | 0.280 | Overall (null)

Notes. *Results of tests indicated in parentheses: "(null)" represents no significant finding; "(+)" and "(−)" represent significant positive and negative coefficients in a linear regression model; "(+count)" and "(−count)" represent significant positive and negative coefficients in the count model of a zero-inflated Poisson regression model; and "(+zero)" and "(−zero)" represent significant positive and negative coefficients in the zero-inflation model of a zero-inflated Poisson regression model (i.e., increased or decreased likelihood of being in the "always zero" group). "Overall," "Action," and "Sports/Racing" denote overall weekly play, weekly action play, and weekly sports and racing play, respectively.

model for inhalant use (coefficient estimate = 2.579, SE = .855, z = 3.016, p = .003). Weekly sports and racing game play was a significant predictor in the count model for inhalant use (coefficient estimate = −2.577, SE = .728, z = −3.541, p < .001). No other coefficients were significant (p > .0083). Overall, results for other substance use favor confirmation of H1c (inconsequential), except that overall game play was associated with more days per month using illegal drugs and action game play was associated with more days per month sniffing inhalants, both in partial support of H1a (risk). Sports and racing game play was associated with fewer days sniffing inhalants, partially consistent with H1b (incapacitation), but the predicted mechanism of H1b was not genre-specific.

Sexual Risk

H2a predicted that overall video game use, action game use, and sports and racing game use would be positively associated with sexual risk (risk); H2b predicted that overall video game use would be negatively associated with sexual risk (incapacitation); and H2c predicted that overall video game use, action game use, and sports and racing game use would not be associated with sexual risk (inconsequential).

General Sexual Behavior
A series of 15 zero-inflated Poisson regression model equations were conducted with each combination of the three weekly game play variables predicting the five skewed (> ±2) general sexual behavior variables, with the two control measures as covariates and a critical alpha value of .01. Overall weekly game play was a significant predictor in the count model for number of times per month having sexual intercourse (coefficient estimate = .011, SE = .002, z = 5.991, p < .001), as well as in the count model for number of times per month having sexual intercourse without a condom (coefficient estimate = .015, SE = .002, z = 7.158, p < .001). Weekly action game use was a significant predictor in the count model for number of times per month having sex without a condom (coefficient estimate = .036, SE = .005, z = 6.606, p < .001). Weekly sports and racing game use was a significant predictor in the count model for number of sexual partners per month (coefficient estimate = .119, SE = .022, z = 5.293, p < .001), as well as in the count model for number of times having sex per month (coefficient estimate = .049, SE = .011, z = 4.469, p < .001). No other coefficients were significant (p > .01). Overall, results for general sexual behavior favor confirmation of H2c (inconsequential), but several findings partially support H2a (risk): Overall game play was associated with having sex more times per month and having sex without a condom more times per month, action game play was associated with having sex without a condom more times per month, and sports and racing game play was associated with more sexual partners per month and having sex more times per month.
Sexual Assault
A series of six zero-inflated Poisson regression model equations were conducted with each combination of the three weekly game play variables predicting the two skewed (> ±2) sexual assault variables, with the two control measures as covariates and a critical alpha value of .025. Overall weekly game play was a significant predictor in the count model for number of times in the past year being sexually victimized without giving consent (coefficient estimate = .052, SE = .017, z = 2.946, p = .003). Weekly sports and racing game play was a significant predictor in the count model for number of times in the past year being sexually victimized without giving consent (coefficient estimate = .344, SE = .126, z = 2.737, p = .006), and in the count model for number of times in the past year having sex with others without obtaining consent (coefficient estimate = .391, SE = .131, z = 2.990, p = .003). No other coefficients were significant (p > .025). Overall, results for sexual assault are mixed. The findings from half the analyses partially support H2c (inconsequential), while the other half partially support H2a (risk): Overall weekly game play and sports and racing game play were associated with being sexually victimized without consent more often in the past year, while sports and racing game play was also associated with having sex with others without obtaining consent more often in the past year.

Interpersonal Violence

H3a predicted that overall video game use and action video game use would be positively associated with carrying weapons and fighting (risk); H3b predicted that overall video game use would be negatively associated with carrying weapons and fighting (incapacitation); and H3c predicted that overall video game use and action game use would not be associated with carrying weapons and fighting (inconsequential). A series of six zero-inflated Poisson regression model equations were conducted with each combination of the two relevant weekly game play variables predicting the three skewed (> ±2) interpersonal violence variables, with the two control measures as covariates and a critical alpha value of .0167. Overall weekly game play was a significant predictor in the count model for number of times carrying a weapon in the past year (coefficient estimate = .006, SE = .002, z = 2.957, p = .003), in the count model for number of times in the past year getting in a physical fight (coefficient estimate = .027, SE = .010, z = 2.616, p = .009), and in the count model for number of times in the past year getting in a physical fight where someone needed medical attention (coefficient estimate = .069, SE = .019, z = 3.670, p < .001). Weekly action game play was also a significant predictor in the count model for times carrying a weapon in the past year (coefficient estimate = .081, SE = .004, z = 18.746, p < .001), and in the count model for getting in a physical fight where a participant needed medical attention (coefficient estimate = .119, SE = .034, z = 3.490, p < .001). No other coefficients were significant (p > .0167).
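Because these are Poisson count models on a log scale, the coefficients reported above can be translated into rate ratios by exponentiation. The arithmetic below uses the overall-play count-model coefficients from this section and assumes the predictor is scaled in the questionnaire's weekly-hours units; it is our own illustration, not an analysis reported by the authors.

```python
import math

# Count-model coefficients for overall weekly play reported above.
coefs = {
    "carrying a weapon": 0.006,
    "physical fight": 0.027,
    "fight needing medical attention": 0.069,
}

# exp(b) gives the multiplicative change in the expected event count per
# one-unit increase in the predictor, among cases outside the structural
# "always zero" group of the zero-inflated model.
rate_ratios = {name: math.exp(b) for name, b in coefs.items()}
```

For example, exp(.027) ≈ 1.03, roughly a 3% higher expected yearly fight count per additional unit of weekly play.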
Overall, results for interpersonal violence support H3a (risk), with overall game play associated with increased frequency of carrying a weapon, getting in a fight, and getting in a fight requiring medical attention in the past year, and action game play associated with increased frequency of carrying a weapon and getting in a fight requiring medical attention in the past year.

Bullying Victimization

RQ1 asked whether overall video game use, action game use, and sports and racing game use are associated with bullying victimization. A series of six zero-inflated Poisson regression model equations were conducted with each combination of the three weekly game play variables predicting the two skewed (> ±2) bullying variables, with the two control measures as covariates and a critical alpha value of .025. Overall weekly game play was a significant predictor in the count model for number of times bullied online in the past year (coefficient estimate = .031, SE = .002, z = 13.259, p < .001). Weekly action game play was a significant predictor in the count model for number of times bullied online in the past year (coefficient estimate = −.051, SE = .007, z = −7.590, p < .001). No other coefficients were significant (p > .025). Overall, results for bullying are mixed. Overall weekly play is associated with more frequent bullying victimization online, action game play is associated with less frequent bullying victimization online, and no game play measure is associated with risk of physical bullying.

Suicide

RQ2 asked whether overall video game use, action game use, and sports and racing game use are associated with suicide risk. A series of six zero-inflated Poisson regression model equations were conducted with each combination of the three weekly game play variables predicting the two skewed (> ±2) suicide variables, with the two control measures as covariates and a critical alpha value of .025. Overall weekly game play was a significant predictor in both the count model (coefficient estimate = .046, SE = .003, z = 17.576, p < .001) and the zero-inflation model (coefficient estimate = −.033, SE = .014, z = −2.367, p = .018) for number of times suicide was considered in the past year, and in the count model for number of times attempted suicide in the past year (coefficient estimate = .102, SE = .020, z = 5.089, p < .001). Weekly action game play was a significant predictor in the count model for number of times suicide was considered in the past year (coefficient estimate = .245, SE = .010, z = 23.856, p < .001), and in the count model for number of times suicide was attempted in the past year (coefficient estimate = .779, SE = .154, z = 5.068, p < .001). Weekly sports and racing game play was a significant predictor in the count model for number of times suicide was considered in the past year (coefficient estimate = −.288, SE = .085, z = −3.371, p < .001). No other coefficients were significant (p > .025). Overall, results for suicide risk suggest relationships with video game play. Overall weekly play is associated with more frequently considering suicide, a lower likelihood of never considering suicide in a year, and more frequently attempting suicide, and weekly action play is associated with more frequently considering and attempting suicide. By contrast, playing sports and racing games is associated with less frequently considering suicide.


Disordered Eating

H4a predicted that overall video game use would be positively associated with disordered eating (risk); H4b predicted that overall video game use would be negatively associated with disordered eating (incapacitation); and H4c predicted that overall video game use would not be associated with disordered eating (inconsequential). A series of two hierarchical linear multiple regression equations and two zero-inflated Poisson regression equations were conducted with overall weekly game play predicting the two nonskewed (< ±2) and two skewed (> ±2) disordered eating variables, with the two control measures as covariates and a critical alpha value of .0125. Overall video game play was a significant predictor of weight (B = .013, SE = .004, β = .138, t = 3.235, p = .001, R2 change = .018), as well as in the count model for days per month vomiting or taking laxatives (coefficient estimate = −.103, SE = .029, z = −3.576, p < .001), and in the count model for days per month taking diet pills (coefficient estimate = −.025, SE = .007, z = −3.466, p < .001). No other coefficients were significant (p > .0125). Overall, results for disordered eating tend to support H4b (incapacitation), with overall weekly play associated with fewer days vomiting or taking laxatives and fewer days taking diet pills. The finding that overall video game play is associated with higher weight may be ambiguous, as it may indicate a tendency toward greater likelihood of being overweight in support of H4a (risk) or a decreased tendency to be underweight in support of H4b (incapacitation).

Exercise

H5a predicted that overall video game use would be negatively associated with exercise behavior (risk), while H5b predicted that overall video game use would not be associated with exercise behavior (inconsequential). A series of four hierarchical linear multiple regression equations were conducted with overall weekly game play predicting the four nonskewed (< ±2) exercise variables, with the two control measures as covariates and a critical alpha value of .0125. Overall weekly game play was not a significant predictor of any of the exercise measures (p > .0125). Results support H5b (inconsequential).

Discussion

This study provides baseline information about the role of video games in an understudied set of risk factors, specifically, health risks prevalent in postsecondary student populations. The unique risk circumstances of college and university students merit attention given the population's prevalent video game use and disproportionately high rate of some negative health outcomes. While correlational in nature, and thus not appropriate evidence for causal claims, this study provides a first step toward understanding whether video games may have a meaningful harmful or protective role in the college risk environment.

It is also important that research investigating video game use and behavioral outcomes be conducted using preregistration and open science practices. Even though this study cuts a broad conceptual swath by analyzing many potential outcomes and competing conceptual predictions, preregistered research with open data is an effective way to extend knowledge by ensuring that procedures are determined a priori rather than conducted or reported flexibly or selectively with prejudice toward one mechanism or outcome versus another. With the attention of the public and policymakers on researchers' findings in this area, there is too much at stake for investigations to proceed any other way.

Our findings suggest that video game use may be a predictor of outcomes related to suicide and interpersonal violence, as well as of unprotected sex and sexual assault, as predicted by the risk approach. It may be that video game play influences these behaviors, or that video game play is a marker of lifestyle and health status factors associated with these behaviors. In either case, more research is needed. If a strong body of further preregistered research replicates our findings of video game use variables predicting interpersonal violence, suicide, unprotected sex, and sexual assault (whether causally or as a lifestyle marker), colleges, universities, campus organizations, and video game producers may do well to consider targeting video game enthusiasts with preventative campaigns. Video game play was largely unrelated to the other outcomes analyzed here, most consistently exercise behavior and use of tobacco and other substances, in support of the inconsequential approach.
This study adds little to speculation about video games influencing these behaviors, although more studies should be conducted before firm conclusions are drawn. Further, while there was evidence that video game play was associated with higher weight – which may conflate increased risk of being overweight with decreased risk of being underweight – video game use was associated with reduced rates of some disordered eating behavior, in a manner consistent with the protective incapacitation approach. It may be useful to explore what mechanism might account for video game play as a marker of reduced disordered eating behavior. For other outcomes, there was little evidence for the incapacitation approach.

Despite the advantages of this study's preregistered approach and its openly available data for further analysis, the study also has many limitations. It cannot establish causal direction, nor does it isolate a conceptual mechanism by which video game use might be associated with outcomes. Also, the strength and robustness of observed

A. Holz Ivory et al., Video Game Use and Risk at Universities

associations in a practical setting is difficult to assess from one cross-sectional survey with limited covariates. Aside from the usual limitations involved with self-report data, this study's use of the Mechanical Turk platform may affect its representativeness. While college and university students are plentiful among Mechanical Turk users, they may differ from the general college and university student population in many ways. This study should thus be replicated with other samples of postsecondary student populations.

Further, the arbitrary 5-min minimum completion time for included responses may have been too conservative given the number of disqualified responses. While preregistered disqualification criteria are preferable to flexible criteria conducive to a "fishing expedition" in search of preferred findings, consistent best practices for a priori criteria can be developed as more preregistered studies are conducted. Several other possible control measures were not captured. Finally, many measures were imprecise. For example, the broad sports and racing category, measured in the questionnaire as sports including racing, was more general than measures in previous research targeting racing games as a specific predictor, and the measure of frequency of unprotected sex did not distinguish unprotected sex between committed partners from unprotected sex in riskier settings.

The risks facing the college and university student population are unique, and the roles of video game use in predicting those risks are unclear. This study represents an early broad step toward a better understanding of which risks game users face to a greater or lesser degree than their peers. Why some risks are unique to the game-playing population, and what can be done to alleviate them, is something we must investigate further.
Electronic Supplementary Material

The electronic supplementary material is available with the online version of the article at http://dx.doi.org/10.1027/1864-1105/a000210

ESM 1. Text (PDF). Survey Questionnaire Instrument Items (24).

References

Abbey, A. (1991). Acquaintance rape and alcohol consumption on college campuses: How are they linked? Journal of American College Health, 39, 165–169. doi: 10.1080/07448481.1991.9936229

American College Health Association. (2015). National college health assessment. Retrieved from http://www.acha-ncha.org/

Atkins, D. C., & Gallop, R. J. (2007). Rethinking how family researchers model infrequent outcomes: A tutorial on count regression and zero-inflated models. Journal of Family Psychology, 21, 726–735. doi: 10.1037/0893-3200.21.4.726

Bushman, B. J., Gollwitzer, M., & Cruz, C. (2015). There is broad consensus: Media researchers agree that violent media increase aggression in children, and pediatricians and parents concur. Psychology of Popular Media Culture, 4, 200–214. doi: 10.1037/ppm0000046

Centers for Disease Control and Prevention. (1997, November 14). Youth risk behavior surveillance: National college health risk behavior survey – United States, 1995. Morbidity and Mortality Weekly Report, 46(SS-6). Retrieved from http://www.cdc.gov/mmwr/PDF/ss/ss4606.pdf

Chappell, M., Casey, D., De la Cruz, C., Ferrell, J., Forman, J., Lipkin, R., . . . Whittaker, S. (2004). Bullying in college by students and teachers. Adolescence, 39, 53–64.

Cooper, M. L. (2002). Alcohol use and risky sexual behavior among college students and youth: Evaluating the evidence. Journal of Studies on Alcohol, s14, 101–117. doi: 10.15288/jsas.2002.s14.101

Cunningham, S., Engelstätter, B., & Ward, M. R. (2016). Violent video games and violent crime. Southern Economic Journal, 82, 1247–1265. doi: 10.1002/soej.12139

Elson, M., Mohseni, M. R., Breuer, J., Scharkow, M., & Quandt, T. (2014). Press CRTT to measure aggressive behavior: The unstandardized use of the competitive reaction time task in aggression research. Psychological Assessment, 26, 419–432. doi: 10.1037/a0035569

Ferguson, C. J., & Olson, C. K. (2014). Video game violence use among "vulnerable" populations: The impact of violent games on delinquency and bullying among children with clinically elevated depression or attention deficit symptoms. Journal of Youth and Adolescence, 43, 127–136. doi: 10.1007/s10964-013-9986-5

Fischer, P., Greitemeyer, T., Kastenmüller, A., Vogrincic, C., & Sauer, A. (2011). The effects of risk-glorifying media exposure on risk-positive cognitions, emotions, and behaviors: A meta-analytic review. Psychological Bulletin, 137, 367–390. doi: 10.1037/a0022267

Gentile, D. A., & Bushman, B. J. (2012). Reassessing media violence effects using a risk and resilience approach to understanding aggression. Psychology of Popular Media Culture, 1, 138–151. doi: 10.1037/a0028481

Gentile, D. A., Choo, H., Liau, A., Sim, T., Li, D., & Fung, D. (2011). Pathological video game use among youths: A two-year longitudinal study. Pediatrics, 127, e319–e329. doi: 10.1542/peds.2010-1353

Griffiths, M. D., van Rooij, A. J., Kardefelt-Winther, D., Starcevic, V., Király, O., Pallesen, S., . . . Demetrovics, Z. (2016). Working towards an international consensus on criteria for assessing Internet gaming disorder: A critical commentary on Petry et al. (2014). Addiction, 111, 167–175. doi: 10.1111/add.13057

Hull, J. G., Brunelle, T. J., Prescott, A. T., & Sargent, J. D. (2014). A longitudinal study of risk-glorifying video games and behavioral deviance. Journal of Personality and Social Psychology, 107, 300–325. doi: 10.1037/a0036058

Ivory, J. D., Markey, P. M., Elson, M., Colwell, J., Ferguson, C. J., Griffiths, M. D., . . . Williams, K. D. (2015). Manufacturing consensus in a diverse field of scholarly opinions: A comment on Bushman, Gollwitzer, and Cruz (2015). Psychology of Popular Media Culture, 4, 222–229. doi: 10.1037/ppm0000056

Jones, S. (2003). Let the games begin: Gaming technology and entertainment among college students. Pew Internet and American Life Project. Retrieved from http://www.pewinternet.org/files/old-media//Files/Reports/2003/PIP_College_Gaming_Reporta.pdf.pdf

Levay, K. E., Freese, J., & Druckman, J. N. (2016). The demographic and political composition of Mechanical Turk samples. SAGE Open, 6(1), 1–17. doi: 10.1177/2158244016636433

Lynch, T., Tompkins, J. E., van Driel, I. I., & Fritz, N. (2016). Sexy, strong, and secondary: A content analysis of female characters in video games across 31 years. Journal of Communication, 66, 564–584. doi: 10.1111/jcom.12237

National Center for Educational Statistics. (2016). Fast facts. Retrieved from http://nces.ed.gov/fastfacts/display.asp?id=98

O'Neill, E. K. (2007). Differences in health risk behaviors between college freshmen living in special interest housing and traditional housing (Unpublished doctoral dissertation). Retrieved from http://hdl.handle.net/10919/28081

Petry, N. M., Rehbein, F., Gentile, D. A., Lemmens, J. S., Rumpf, H.-J., Mößle, T., . . . O'Brien, C. P. (2014). An international consensus for assessing Internet gaming disorder using the new DSM-5 approach. Addiction, 109, 1399–1406. doi: 10.1111/add.12457

Quandt, T., Van Looy, J., Vogelgesang, J., Elson, M., Ivory, J. D., Consalvo, M., & Mäyrä, F. (2015). Digital games research: A survey study on an emerging field and its prevalent debates. Journal of Communication, 65, 975–996. doi: 10.1111/jcom.12182

Ross, J., Irani, L., Silberman, M. S., Zaldivar, A., & Tomlinson, B. (2010). Who are the crowdworkers? Shifting demographics in Mechanical Turk. In CHI EA '10: Extended Abstracts on Human Factors in Computing Systems (pp. 2863–2872). New York, NY: ACM. doi: 10.1145/1753846.1753873

Taylor, C. B., Bryson, S., Luce, K., Cunning, D., Doyle, A. C., Abascal, L. B., . . . Wilfley, D. E. (2006). Prevention of eating disorders in at-risk college-age women. Archives of General Psychiatry, 63, 881–888. doi: 10.1001/archpsyc.63.8.881

Weitzman, E. R., Nelson, T. F., & Wechsler, H. (2003). Taking up binge drinking in college: The influences of person, social group, and environment. Journal of Adolescent Health, 32, 26–35. doi: 10.1016/S1054-139X(02)00457-3

Received January 23, 2016
Revision received December 5, 2016
Accepted December 14, 2016
Published online March 21, 2017

James D. Ivory
111 Shanks Hall (Mail Code: 0311)
181 Turner Street NW
Virginia Tech
Blacksburg, VA 24061
USA
jivory@vt.edu

© 2017 Hogrefe Publishing

Adrienne Holz Ivory (PhD, Human Development, Virginia Tech) is an assistant professor in the Department of Communication at Virginia Tech, USA.

James D. Ivory (PhD, Mass Communication, University of North Carolina at Chapel Hill) is an associate professor in the Department of Communication at Virginia Tech, USA.

Madison Lanier (BA, Communication and Political Science, Virginia Tech) is a master's student in the Department of Communication at Virginia Tech, USA.



Pre-Registered Report

Interactive Narratives Affecting Social Change: A Closer Look at the Relationship Between Interactivity and Prosocial Behavior

Sharon T. Steinemann,¹ Glena H. Iten,¹ Klaus Opwis,¹ Seamus F. Forde,¹ Lars Frasseck,¹ and Elisa D. Mekler¹,²

¹ Center for Cognitive Psychology and Methodology, University of Basel, Switzerland
² HCI Games Group, Games Institute, University of Waterloo, Canada

Abstract: Interactive narratives offer interesting opportunities for the study of the impact of media on behavior. A growing body of research on games advocating social change, including those focusing on interactive narratives, has highlighted their potential for attitudinal and behavioral impact. In this study, we examine the relationship between interactivity and prosocial behavior, as well as potential underlying processes. A yoked study design with 634 participants compared an interactive with a noninteractive narrative. Structural equation modeling revealed no significant differences in prosocial behavior between the interactive and noninteractive conditions. However, support for the importance of appreciation for and engagement with a narrative on subsequent prosocial behavior was observed. In summary, while results shed light on processes underlying the relationship between both noninteractive and interactive narratives and prosocial behavior, they also highlight interactivity as a multifaceted concept worth examining in further detail.

Keywords: prosocial behavior, interactive narrative, appreciation, games for change, yoked design

Journal of Media Psychology (2017), 29(1), 54–66. DOI: 10.1027/1864-1105/a000211

A growing body of research supports the idea that interactive narratives and games can be used not only for entertainment but also for education and health, and to further social change and prosocial behavior (Green & Jenkins, 2014; Steinemann, Mekler, & Opwis, 2015). Games for change are designed to motivate their players to support the social change they advocate. They have been created on a wide variety of subjects, from the humanitarian crisis in Darfur (Darfur Is Dying), to the working poor in the United States (Spent), to the social status of women around the world (Half the Sky). In recent years, studies have provided empirical support for the potential of interactive media to improve attitudes toward stigmatized groups (Ruggiero, 2015), increase willingness to help (Peng, Lee, & Heeter, 2010), and impart knowledge about peace efforts among people living in conflict zones (Kampf & Stolero, 2015). Notably, however, to our knowledge only one study to date has examined the effect of games for change on actual behavior. In that study, Steinemann et al. (2015) compared a game in which the player takes the role of a refugee in Darfur with an interactive text, a noninteractive text, and a video, all telling the same story as the game. After engaging with the story,

participants were asked whether they would be willing to donate a percentage of a monetary reward they were receiving to a charity helping refugees in Darfur. The study found that participants in the interactive conditions (i.e., the interactive text and the game) donated significantly more than participants in the noninteractive conditions. Understanding the impact that interactive media, such as games for change, can have on behavior, and specifically on prosocial behavior, is highly relevant from both an academic perspective (Ruggiero, 2015; Sundar, 2009) and a practical perspective, as affecting behavior is arguably a crucial goal of games for change (Klimmt, 2009). In light of this first empirical support that games for change can indeed lead to prosocial behavior, the following sections outline possible foundations for this effect.

Theoretical Background

Interactivity

Games for change vary widely in their visual presentation, use of game features, and narrative structure. What they


S. T. Steinemann et al., Interactive Narratives Affecting Social Change


Figure 1. The conceptual model of the processes and outcomes of interactivity as proposed by Green and Jenkins (2014).

have in common, however, is that each game puts players in a role they would most likely never encounter in their day-to-day life, has them make decisions in this role, and lets them experience the narrative consequences (Green & Jenkins, 2014). This taking of an active role in the narrative is referred to as interactivity (Green & Jenkins, 2014). While an exact definition of interactivity is hampered by the fact that different forms of media exhibit interactivity in a wide variety of ways (Bucy & Tao, 2007; Sundar, 2009), for narrative-heavy games in particular, the ability to allow decision-making is arguably one of interactivity's most basic and defining features (Elson, Breuer, Ivory, & Quandt, 2014; Green & Jenkins, 2014).

Several studies have highlighted interactivity as crucial to the impact of games for change (Peng et al., 2010; Ruggiero, 2015; Steinemann et al., 2015). Notably, Steinemann et al. (2015) found an interactive text to be just as effective at increasing donations as an animated game. This finding lends credence to interactive texts as a valuable form of game for change. Indeed, several games for change either are designed as interactive texts or rely heavily on interactive text as a primary game feature (e.g., Spent, Depression Quest, or NationStates). In this study, we therefore focus on interactivity in text-based narratives, understood as decisions guiding the narrative, as opposed to, for example, the dexterity-based interactivity possible in digital games.

Beyond empirically demonstrating the importance of interactivity in affecting behavior, it is necessary to further understand the psychological processes that mediate this effect (Bucy & Tao, 2007). In the study by Steinemann et al. (2015), for instance, the effect of interactivity on donating behavior was mediated by appreciation.
Yet none of the other examined factors, which included willingness to help and enjoyment, were both impacted by interactivity and positively related to donating. The aim of this study therefore is to more closely examine the relationship between interactivity and prosocial behavior.

Hence, we refer to the theoretical model of Green and Jenkins (2014), which discusses a number of variables that may help to explain the processes involved in the effects of interactive narratives on outcomes such as behavior (see Figure 1). In this model, interactivity leads to behavioral change by giving readers control and allowing them to adapt the narrative structure (i.e., the course of the story) to their individual personality and interests. This in turn fosters engagement (which includes factors such as identification) and allows readers to play with different roles of the self, for example, through an increased sense of responsibility toward the characters in the interactive narrative or by exploring different aspects of their personality through possible selves presented in the narrative. Together, these variables are expected to affect outcomes such as enjoyment, appreciation, and attitudinal and behavioral change. The current study aims to empirically examine some of these processes. We focus on variables that may be of particular interest when attempting to explain the impact of interactivity on prosocial behavior as the outcome.

Prosocial Behavior

While there is still little research specifically about the impact of games for change on actual behavior, the study by Steinemann et al. (2015) gives a first indication of such an effect, and of interactivity as its source. While prosocial behavior can manifest itself in countless ways, in the study by Steinemann et al. (2015) it was operationalized as the percentage of a monetary reward that, after engaging with a narrative, participants donated to a charity helping people like the main character in the narrative. Based on these results, combined with the findings of other studies that link interactive media with increased prosocial attitudes and behaviors (Green & Jenkins, 2014; Ruggiero, 2015), we hypothesize that:

Hypothesis 1 (H1): Interactivity will lead to a higher percentage donated.




Identification

In the context of media, identification describes the process of taking on the role of a character and sharing their goals and emotions (Cohen, 2001). In contrast to engagement with the narrative world, identification describes merging with a character (Green & Jenkins, 2014). This merging is facilitated by interactivity, as interactivity allows players to choose actions for the character with which they personally agree (Vorderer, Knobloch, & Schramm, 2001). According to social identity theory, identification is crucial in the categorization of in- and outgroups, drawing the line between people an individual considers to be like themselves and treats more favorably and those they do not (Hogg, 2003). Identification has its basis in empathy, itself a well-established antecedent of prosocial behavior (Eisenberg & Miller, 1987). In the context of games for change, increased identification has been associated with higher willingness to help (Peng et al., 2010) and donating behavior (Steinemann et al., 2015). We therefore hypothesize that:

Hypothesis 2 (H2): Interactivity will lead to more identification with the character.

Hypothesis 3 (H3): Identification will be positively related to a higher percentage donated.

Responsibility

As argued by Green and Jenkins (2014), while empathy with a character may occur in noninteractive narratives, feeling responsible for their actions is rare. By making decisions in the interactive narrative, however, the reader can see a direct link between their actions and their consequences. Through this sense of agency over the narrative, the likelihood of feeling responsible for the outcome and how it affects the main character increases (Green & Jenkins, 2014). A lack of agency has been associated with an increase in moral disengagement, which in turn is related to a decrease in prosocial behavior (Bandura, Barbaranelli, Caprara, & Pastorelli, 1996).
Alternatively, priming participants on their responsibility can increase empathy, which is related to prosocial behavior (Čehajić, Brown, & González, 2009). While there are no studies directly linking responsibility with prosocial behavior in interactive narratives, on the basis of these findings we hypothesize that:

Hypothesis 4 (H4): Interactivity will lead to more responsibility.

Hypothesis 5 (H5): Responsibility will be positively related to a higher percentage donated.


Appreciation

Finally, appreciation describes media experiences that are valued not necessarily for being fun but for their capability to be meaningful, moving, and thought-provoking (Oliver & Bartsch, 2010), such as when the player's character has to make a hard choice in the narrative. While games research has long focused primarily on enjoyment, recent studies have highlighted the ability of games to lead to meaningful experiences (Elson et al., 2014; Oliver et al., 2015; Steinemann et al., 2015). A possible explanation for this effect is that interactivity may allow players to create a story that is more personally meaningful to them than a noninteractive equivalent (Elson et al., 2014). Both feelings of meaningfulness and the ability of media to be moving have been repeatedly associated with an increased likelihood of compassion and prosocial behavior (Morgan, Movius, & Cody, 2009; Myrick & Oliver, 2015; Small & Simonsohn, 2008). Furthermore, in the study by Steinemann et al. (2015), appreciation was not only higher in the interactive condition, it was also positively related to an increase in donations.

In the conceptual model of Green and Jenkins (2014), appreciation is an outcome, similar to behavior. However, as behavior is the focus of this study, and because of the aforementioned research linking appreciation with both interactivity and prosocial behavior, we treat appreciation as an additional process between interactivity and prosocial behavior (see Figure 2). We therefore expect that:

Hypothesis 6 (H6): Interactivity will lead to more appreciation.

Hypothesis 7 (H7): Appreciation will be positively related to a higher percentage donated.

While identification, responsibility, and appreciation offer the clearest indications for their role as mediators between interactivity and prosocial behavior, other variables should also be considered in a comprehensive model of these processes.
Therefore, we also controlled for the role of three additional variables. To control for individual differences in empathy, which may particularly affect identification, empathic concern was included (Cohen, 2001). Additionally, enjoyment, which is related to appreciation (Oliver & Bartsch, 2010), and narrative engagement (Busselle & Bilandzic, 2009), which may be related to all three potential mediators, were controlled for (see Figure 2). To sum up, the goal of this study was to examine how an interactive narrative, compared with a noninteractive narrative, affects prosocial behavior, identification with the character, responsibility toward the character, and




Figure 2. A model of the expected processes between interactivity and prosocial behavior. Lines in bold indicate hypotheses-relevant pathways. [In the figure, interactivity predicts identification (H2), responsibility (H4), appreciation (H6), and prosocial behavior (H1); identification (H3), responsibility (H5), and appreciation (H7) in turn predict prosocial behavior, with empathic concern, enjoyment, and narrative engagement as control variables.]

appreciation of the narrative experience. Furthermore, we examined how these different variables in turn relate to prosocial behavior (see Figure 2). The results thereby offer a closer empirical examination of the theoretical model of Green and Jenkins (2014), as well as a more sophisticated look at the relationship between interactivity and prosocial behavior.

Method

Ethics Statement

This research was registered with the Institutional Review Board of the authors' university. Written informed consent was obtained from all participants.

Design

To test our hypotheses, a between-subjects experimental design was used. The independent variable was interactivity (interactive, noninteractive). The primary dependent variable was prosocial behavior, measured as the percentage of the reward that participants donated at the end of the study. The further dependent variables – expected to mediate the relationship between interactivity and prosocial behavior – were identification, responsibility, and appreciation. Empathic concern, enjoyment, and narrative engagement were added to the model as control variables.

An additional variable, text comprehension, served as a quality check and was analyzed across groups prior to testing the model, to ensure that interactivity did not affect participants’ ability to understand the text.

Participants

To achieve acceptable power for the specified model (see Figure 2), a sample of 580 was needed. To ensure we would end up with a sufficient sample size, we aimed to recruit approximately 730 participants on the crowdsourcing platform CrowdFlower (http://www.crowdflower.com).¹ As recruitment over this platform was slow, Mechanical Turk was also included (https://www.mturk.com/mturk/).² In all, 854 participants finished the study, of whom 796 correctly answered a bogus item ("This is a control item, please select 'completely disagree'"). To ensure data quality, an additional 162 participants were subsequently excluded, owing to technical issues (n = 7), outliers (±3.00 SD) in completion time (n = 81), indicating that they had not carefully answered the study questions (n = 9), participating more than once (n = 25), or answering fewer than three of the six text comprehension questions correctly (n = 40). The final dataset consisted of a total sample of 634 participants (331 in the interactive and 303 in the noninteractive condition). To ensure the samples collected on Mechanical Turk (n = 270) and CrowdFlower (n = 363) did not differ significantly in terms of the impact of interactivity on the

¹ Recruitment on CrowdFlower took place from June 2, 2016, to July 13, 2016.
² Recruitment on Mechanical Turk took place from July 8, 2016, to July 12, 2016.





dependent variables, a two-way multivariate analysis of variance (MANOVA) was conducted to examine the combined effects of platform and condition on identification, responsibility, appreciation, and donation. A significant main effect of condition was found (p < .001), but neither the main effect of platform nor the interaction between platform and condition reached significance (p values between .53 and .97). The two samples therefore did not differ in terms of hypothesis-relevant effects. To examine whether text comprehension differed between the interactive (M = 5.27, SD = 0.62) and noninteractive conditions (M = 5.25, SD = 0.93), Welch's two-sample t test was conducted. No significant difference was found (p = .419).

As good English skills were essential for understanding the questionnaires and the stimulus material, we restricted recruitment to countries with English as a primary language. The majority of participants reported their nationality as US American (n = 301), Canadian (n = 106), or British (n = 96), with the remaining 131 reporting one of 31 other nationalities. Of the participants, 381 identified as female, 245 as male, three as transgender or nonbinary, and four preferred not to say. Participants reported a wide variety of employment types, the largest groups being professional or managerial (n = 268), unemployed (n = 111), student (n = 91), blue collar or service (n = 80), and self-employed (n = 84).

Participants received US $0.20 for their participation, paid after they entered a code on CrowdFlower or Mechanical Turk that they were given at the end of the study. In addition, they received a reward of up to US $1 for carefully filling out the questionnaires and open questions, with respect to the aforementioned data quality checks. A percentage between 0% and 100% of this reward could be donated and served as our measure of prosocial behavior.
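The Welch comparison of text comprehension can be reproduced from the reported summary statistics alone. The sketch below assumes the condition sizes of the final sample (n = 331 interactive, n = 303 noninteractive) and uses a standard-normal approximation for the p-value, which is reasonable given that the Welch-Satterthwaite degrees of freedom are in the hundreds; the exact published p (.419) need not be recovered from two-decimal summaries, but the nonsignificance is:

```python
import math

def welch_t_from_stats(m1, sd1, n1, m2, sd2, n2):
    """Welch's two-sample t test computed from summary statistics,
    with a normal-approximation two-sided p-value."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))) is the standard normal CDF.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))
    return t, df, p

# Reported comprehension scores: interactive M = 5.27, SD = 0.62;
# noninteractive M = 5.25, SD = 0.93 (per-condition ns are assumptions).
t, df, p = welch_t_from_stats(5.27, 0.62, 331, 5.25, 0.93, 303)
```

With these inputs the t statistic is around 0.3 on roughly 519 degrees of freedom, clearly nonsignificant, consistent with the conclusion that interactivity did not affect comprehension.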

Stimuli

An interactive and a noninteractive version of a narrative were created on the authors' university webserver. Both versions contained the same story, told over 23 paragraphs. The text was based on the article "How I Became Homeless" (Marcus, 2014, December), which tells the story of how a single parent with three children becomes unexpectedly homeless and the struggles they face while trying to find a place to stay. For the interactive condition, eight decisions were added (e.g., opening a letter immediately or waiting until the evening) and the original article's text was slightly modified (e.g., sentences were added in order to include the

S. T. Steinemann et al., Interactive Narratives Affecting Social Change

decisions). These decisions were designed to feel impactful but at the same time to have minimal impact on the narrative (e.g., choosing to open a letter a day later would lead to losing 1 day out of 4 for packing, but had no further impact on the story). However, to further ensure that the content of the specific decisions would not confound the effect of interactivity on our dependent variables, a yoked design was used: Every time a participant in the interactive condition finished their version of the story based on their decisions, this version was saved and given to a participant in the noninteractive condition. This meant that the story was presented in as many different versions in the noninteractive condition as in the interactive condition. This "yoking" of the story version presented across conditions ensured that any differences between the two groups would be due to interactivity and not to differences in the story or information presented.

The yoked design was implemented using Storyboard (Version 0.1), a software developed by the fifth author. The software utilizes a MySQL database and the PHP programming language. User interactions were recorded with our user-tracking solution Datamice (Version 0.4), which was implemented with jQuery, PHP, Zend Framework, and MySQL. An example of a noninteractive version of the story and the interactive story, as well as the code for the yoked design, can be viewed on the Open Science Framework website.³
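The yoking logic described above can be sketched independently of the original PHP/MySQL implementation. The queue-based assignment below is an illustrative assumption about how such a design can be realized, not the internals of Storyboard; the class and method names are hypothetical:

```python
from collections import deque

class YokedAssigner:
    """Minimal sketch of yoked condition assignment: each noninteractive
    participant reads a story version produced by an earlier interactive
    participant, so both conditions see the same set of story versions."""

    def __init__(self):
        # Completed interactive story versions awaiting a yoked reader.
        self.finished_versions = deque()

    def assign(self):
        """Assign an arriving participant: hand out a saved story version
        if one is waiting, otherwise start a new interactive run."""
        if self.finished_versions:
            return "noninteractive", self.finished_versions.popleft()
        return "interactive", None

    def record_finished(self, decisions):
        """Save a completed interactive run (its sequence of decisions)
        so it can be shown verbatim to a yoked noninteractive reader."""
        self.finished_versions.append(tuple(decisions))
```

With participants arriving one at a time, this handoff produces matched story versions across conditions, mirroring the design's guarantee that group differences cannot stem from differences in story content.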

Measures

Donating Behavior

Donating behavior was measured by asking participants what percentage of their participation reward they wished to donate to a charity. The charity chosen for this study was Habitat for Humanity, a nonprofit organization that aims to build and rehabilitate affordable houses around the world so as to help eliminate homelessness (http://www.habitat.org/). Participants chose the amount to donate from a drop-down list of ten-percent increments from 0% (no donation) to 100% (complete donation). This method was a slightly modified version of the method used by Steinemann et al. (2015), which informed participants of their reward in advance (instead of it being an unexpected bonus). This was done to increase the likelihood that participants would treat the money as their own (Clark, 2002). While US $1 was a fairly small amount of money, several previous studies have used this or similarly small amounts to examine donating behavior (e.g., Steinemann et al., 2015; Tsvetkova, Macy, & Szolnoki, 2014).
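The measure maps a constrained drop-down choice to a dollar split of the US $1 bonus. A small sketch (the function name and value checks are hypothetical, not part of the study materials):

```python
def donation_split(percent_choice, reward=1.00):
    """Map a drop-down donation choice (0, 10, ..., 100 percent) to
    (donated, kept) dollar amounts of the bonus reward."""
    if percent_choice not in range(0, 101, 10):
        raise ValueError("drop-down offers only ten-percent increments")
    donated = round(reward * percent_choice / 100, 2)
    return donated, round(reward - donated, 2)
```

Restricting choices to ten-percent increments keeps the dependent variable an eleven-point ordinal scale rather than a free-form amount, which simplifies comparison with the percentage measure of Steinemann et al. (2015).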

3 Our project InteractiveNarratives can be accessed at https://osf.io/jstzv/

Journal of Media Psychology (2017), 29(1), 54–66

© 2017 Hogrefe Publishing


S. T. Steinemann et al., Interactive Narratives Affecting Social Change

Responsibility
To measure responsibility, the 2-item scale by Jenkins (2014) was used (Cronbach’s α = .95), which asks participants to what extent they feel responsible for the outcome of the story and the character’s decisions. All items for this and all following measures were presented on a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree).

Identification With the Character and Empathic Concern
The 10-item identification scale by Cohen (2001) was used to measure identification with the main character (Cronbach’s α = .92). The items for this as well as all following measures were modified to be applicable to both interactive and noninteractive narratives. To control for individual differences in empathy, the 7-item Empathic Concern subscale by Davis (1983) was used (Cronbach’s α = .87).

Appreciation and Enjoyment
Appreciation (Cronbach’s α = .88) and enjoyment (Cronbach’s α = .89) were measured using the scale developed by Oliver and Bartsch (2010). This scale contains three items each for appreciation, that is, how meaningful, moving, and thought-provoking the story was, and enjoyment, that is, to what extent reading through the story was fun, considered a good time, and entertaining.

Narrative Engagement
To control for narrative engagement, the 12-item scale for narrative engagement developed by Busselle and Bilandzic (2009) was used (Cronbach’s α = .85).

Text Comprehension
Based on the questionnaire originally developed for viewing comprehension by Hobbs and Frost (2003), a 6-item questionnaire was included to control for text comprehension. While the original questionnaire asked for open answers, given our large sample size a multiple-choice format was used.

Procedure
After clicking on a link on CrowdFlower or Mechanical Turk, participants were informed on an introduction page of the approximate time that the study would take and that they would receive a US $1 reward for careful completion of the study, in addition to the upfront payment of US $0.20. Next, participants were asked to fill out the questionnaire for empathic concern. Following this, participants were randomly assigned to one of the experimental conditions. Afterward, participants were asked to fill out the identification, responsibility, appreciation, enjoyment, and narrative engagement questionnaires. Next, participants were thanked and told that they now had the opportunity to donate a percentage of their US $1 reward to a charity. The percentage they chose to keep for themselves was later paid to them as a bonus on CrowdFlower or Mechanical Turk; the percentage they wished to have donated was donated to the charity. Finally, participants were asked to fill out the text comprehension questionnaire and demographic questions (including a 1-item question on whether they had experienced circumstances similar to the ones described in the narrative), were thanked a second time, and were given a code to enter on their respective crowdsourcing platform in order to receive their compensation and reward.4
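The completion codes described in the procedure encode the donation choice, so that the donated and paid-out amounts can be reconstructed from the code alone. A hypothetical sketch of one way such a mapping could work; the hashing scheme and salt are our own illustration, not the authors’ implementation:

```python
import hashlib

def completion_code(percent_donated, salt="study-2016"):  # salt is illustrative
    """Derive a short, deterministic code for each donation choice, so the
    platform payout and the donated amount can be recovered from the code."""
    if percent_donated not in range(0, 101, 10):
        raise ValueError("donation must be a ten-percent increment")
    return hashlib.sha256(f"{salt}:{percent_donated}".encode()).hexdigest()[:12]

# One distinct code per drop-down option (eleven in total).
codes = {p: completion_code(p) for p in range(0, 101, 10)}
print(len(set(codes.values())))  # -> 11
```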

Results
The dataset and R script used in this analysis can be found on the Open Science Framework.5

Preliminary Analysis
Using boxplots, univariate outliers were detected for empathic concern, narrative engagement, identification, and appreciation. These variables were subsequently winsorized (threshold: 95%) to minimize the influence of the outliers on the statistical estimates. Inspection of normal Q–Q plots showed the distributions of donation and responsibility to be substantially non-normal. Additionally, inspection of the scatterplots of the standardized residuals against the standardized predicted scores indicated heteroscedasticity among the residuals, likely due to the non-normal distributions of donation and responsibility (Kline, 2011). Therefore, subsequent analyses were conducted using bootstrapping and Spearman’s rank correlation, as these are robust to violations of normality. Examination of the scatterplots indicated that all visible relations between the outcome variables were linear. Means and standard deviations for all dependent and control variables across the two levels of interactivity are listed in Table 1. Participants in both conditions donated approximately 30% of their reward to the charity, which

4 In order to donate and pay out the correct amounts, participants received different codes depending on the amount they had chosen to donate.
5 InteractiveNarratives (https://osf.io/jstzv/)




Table 1. Descriptive statistics: means and standard deviations by condition

Variable               Noninteractive Narrative   Interactive Narrative
                       M (SD)                     M (SD)
Percentage Donated     29.47 (37.35)              31.21 (38.10)
Responsibility         2.22 (1.48)                3.09 (1.83)
Identification         5.58 (0.94)                5.57 (0.94)
Appreciation           5.85 (0.95)                5.81 (0.97)
Empathic Concern       5.21 (1.02)                5.24 (1.05)
Enjoyment              4.46 (1.46)                4.69 (1.60)
Narrative Engagement   5.28 (0.91)                5.31 (0.90)

resulted in a total donation of US $214 for Habitat for Humanity. Further, the high values for identification and appreciation indicated that in both conditions, participants identified strongly with the character and found the story to be meaningful. Spearman’s rank correlations are listed in Table 2. Of special note are the high correlations between appreciation, identification, and narrative engagement, contrasted with the fairly low correlations with donation.
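The winsorizing step described in the preliminary analysis can be sketched in a few lines. This upper, nearest-rank variant is our own illustration, not the authors’ R script:

```python
def winsorize_upper(values, percentile=95):
    """Cap every value above the given percentile at that percentile's
    value (nearest-rank definition), instead of discarding outliers."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(len(ordered) * percentile / 100))
    cutoff = ordered[index]
    return [min(v, cutoff) for v in values]

# With scores 1..100, the 95% cutoff lands on 96: the top four values
# (97, 98, 99, 100) are pulled down to 96, the rest are untouched.
scores = list(range(1, 101))
capped = winsorize_upper(scores)
print(max(capped), capped.count(96))  # -> 96 5
```

Unlike trimming, winsorizing keeps the sample size constant while bounding the leverage of extreme scores, which is why the subsequent rank-based correlations and bootstrapped estimates are less affected by outliers.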

Model Estimation
To test H1–H7 (Figure 2), a path analysis model was estimated with R (R Core Team, 2016) and the package lavaan (Rosseel, 2012), using bootstrapped standard errors and the Satorra–Bentler correction due to non-normality (Kline, 2011). Inspection of the fit indices showed the resulting model to have a good fit, χ² = 3.68, df = 3, p = .299, comparative fit index (CFI) = .99, root mean square error of approximation (RMSEA) = .02, 90% CI [.00, .07]. This model can be seen in Figure 3. Next, the importance of the control variables empathic concern, enjoyment, and narrative engagement was examined by trimming the paths between them and the dependent variables. A χ² difference test determined that trimming these paths resulted in a significantly poorer fit (χ²diff = 927, dfdiff = 15, p < .001). Therefore, the original model was retained. Despite the high covariance between identification, appreciation, and narrative engagement, multicollinearity was within acceptable ranges (VIF between 2.40 and 3.14, tolerance values between .32 and .42; Field, Miles, & Field, 2013).
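The chi-square difference test used here compares nested models. As a hedged sketch of the logic: the trimmed model’s chi-square of 930.68 below is reconstructed from the reported values (3.68 + 927) and is therefore our own inference, and 37.70 is the standard .001 critical value for 15 degrees of freedom:

```python
def chi_square_difference(chisq_full, df_full, chisq_trimmed, df_trimmed,
                          critical_value):
    """Nested-model comparison: if the increase in chi-square exceeds the
    critical value for the gained degrees of freedom, the trimmed model
    fits significantly worse and the full model is retained."""
    diff = chisq_trimmed - chisq_full
    df_diff = df_trimmed - df_full
    return diff, df_diff, diff > critical_value  # True -> keep full model

# Trimming the control-variable paths raised chi-square by 927 over 15 df,
# far beyond the .001 critical value -- so the original model was retained.
diff, df_diff, keep_full = chi_square_difference(3.68, 3, 930.68, 18, 37.70)
print(round(diff), df_diff, keep_full)  # -> 927 15 True
```

(Note that with Satorra–Bentler scaled statistics, as used in the paper, the difference test additionally requires a scaling correction rather than a raw subtraction; the sketch shows only the unscaled logic.)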

Confirmatory Analysis
Hypotheses were tested using the estimated model (Figure 3). Our first hypothesis predicted that interactivity would lead to a higher percentage donated. This was not supported (β = .02, b = 0.01, SE = 0.03, p = .696). H2 and H3 predicted that interactivity would lead to more identification, which in turn would lead to a higher percentage donated. H2 was not supported (β = .03, b = 0.06, SE = 0.05, p = .169), whereas for H3 a significant relationship in the opposite direction was found, with identification being negatively related to percentage donated (β = −.17, b = −0.07, SE = 0.03, p = .013). H4 and H5 predicted that interactivity would lead to more responsibility, which in turn would be related to a higher percentage donated. H4 was supported (β = .23, b = 0.80, SE = 0.12, p < .001), while H5 was not (β = .08, b = 0.02, SE = 0.01, p = .08). H6 and H7 predicted that interactivity would lead to more appreciation, which in turn would be related to a higher percentage donated. H6 was not supported (β = .05, b = 0.10, SE = 0.05, p = .056); however, H7 was supported (β = .17, b = 0.07, SE = 0.02, p = .005). An overview of all hypotheses and corresponding results can be seen in Table 3.

Exploratory Analysis
As 148 participants (23.30% of the study sample) indicated that they had themselves experienced circumstances similar to the ones described in the narrative, we added “experienced similar circumstances” (yes/no) as a further control variable in the model, as this may have simultaneously facilitated identification with the character in the story while also making participants less likely to donate, given that they might still be in more difficult financial circumstances than someone who had never experienced similar circumstances. The resulting model had a good fit, χ² = 3.82, df = 4, p = .431, CFI = 1.00, RMSEA = .00, 90% CI [.00, .06]. Of particular interest is the finding that the previously negative relationship between identification and donation was no longer significant in this model (β = −.12, b = −0.05, SE = 0.03, p = .112), but that instead having experienced similar circumstances was significantly negatively related to donation (β = −.13, b = −0.11, SE = 0.03, p = .001). To further improve the model, the nonsignificant paths between experienced similar circumstances and appreciation and responsibility, as well as the nonsignificant covariance between experienced similar circumstances and enjoyment, were trimmed. A χ² difference test showed this not to significantly reduce the model fit (χ²diff = 3.33, dfdiff = 3, p = .34). Next, the nonsignificant paths from interactivity to identification, appreciation, and donation, as well as the nonsignificant paths from identification to donation, responsibility to donation, empathic concern to donation, and narrative engagement to responsibility, were trimmed. A χ² difference test showed this trimming to likewise not



Table 2. Spearman’s rank-order correlations between Empathic Concern, Narrative Engagement, Enjoyment, Appreciation, Identification, Responsibility, and Percentage Donated

Variables              Empathic   Narrative    Enjoyment   Appreciation   Identification   Responsibility
                       Concern    Engagement
Narrative Engagement   .53***
Enjoyment              .14***     .26***
Appreciation           .49***     .69***       .39***
Identification         .55***     .73***       .33***      .77***
Responsibility         .11**      .13***       .25***      .20***         .25***
Donation               .11**      .19***       .09*        .15***         .10**            .08*

Note. *p < .05; **p < .01; ***p < .001.

Figure 3. Structural equation model of the processes between interactivity and prosocial behavior examined in the confirmatory analysis, including standardized estimates of direct effects. Dotted lines indicate nonsignificant pathways.

significantly reduce the model fit (χ²diff = 12.7, dfdiff = 10, p = .239). The resulting model fit was good, χ² = 16.60, df = 14, p = .278, CFI = 0.998, RMSEA = .02, 90% CI [.00, .04]. This exploratory model can be seen in Figure 4.6

Discussion
This study aimed to investigate how and why interactive narratives may impact prosocial behavior. Of the variables examined, responsibility alone was impacted by interactivity. Prosocial behavior was positively related to appreciation and narrative engagement and negatively related to enjoyment and (in the confirmatory analysis) to identification. Responsibility and empathic concern were not significantly related to prosocial behavior. Narrative engagement was strongly related to both identification and appreciation. The clearest result found was that interactivity in the form examined did not impact the percentage donated. These findings are in contrast to those previously found in other studies (Green & Jenkins, 2014; Peng et al., 2010; Ruggiero, 2015; Steinemann et al., 2015). One possible explanation is that the experimental manipulation of interactivity did not work. However, considering that interactivity was defined here merely as the ability to make decisions, which the story afforded, and that participants did experience more responsibility for the story and the character, which has previously been strongly associated with interactivity (Green & Jenkins, 2014), the conditions did appear to differ, at least in these most basic respects.

6 Further analyses included an analysis of variance for each of the four outcome variables, which found the same effects as the path analysis (i.e., responsibility was the only variable that differed significantly across the conditions of interactivity), and a multiple group analysis to test for a moderation effect of “experienced similar circumstances,” which, however, found no significant differences in model fit. More information on these analyses can be found on the Open Science Framework.




Table 3. Overview of hypotheses, exploratory analyses, and corresponding results

Confirmatory Analysis
      Hypothesis                                                                 Finding        Hypothesis confirmed
H1    Interactivity will lead to a higher percentage donated                     βH1 = .02      No
H2    Interactivity will lead to more identification with the character          βH2 = .03      No
H3    Identification will be positively related to a higher percentage donated   βH3 = −.17     No
H4    Interactivity will lead to more responsibility                             βH4 = .23      Yes
H5    Responsibility will be positively related to a higher percentage donated   βH5 = .08      No
H6    Interactivity will lead to more appreciation                               βH6 = .05      No
H7    Appreciation will be positively related to a higher percentage donated     βH7 = .17      Yes

Exploratory Analysis
      Research Question                                                          Finding        Supported
RQ1   Does experiencing similar circumstances impact the percentage donated?     βRQ1 = −.13    Yes

Figure 4. Structural equation model of the processes between interactivity and prosocial behavior examined in the exploratory analysis, including standardized estimates of direct effects.

If, therefore, the conditions can be argued to differ in terms of interactivity, but the effects of interactivity were not comparable to those found in other studies on prosocial behavior and attitudes, this raises the question of whether the form of interactivity examined across these studies may have differed in fundamental ways that would account for these differences. To attempt to answer this question, we take a closer look at the stimuli used in this study compared with studies that have previously found interactivity to affect prosocial behavior and attitudes (Peng et al., 2010; Ruggiero, 2015; Steinemann et al., 2015). In the current study, a noninteractive article about a single parent who becomes homeless was used as a basis, to which interactive elements were added to examine the difference between an interactive and a noninteractive story. The actions included options such

as deciding whether to stay with one’s mother or one’s best friend, or how to respond to uncomfortable questions asked by coworkers. The interactive narrative ended for all players with a friend offering them and their children a place to stay for as long as they wished. While these decisions were designed to feel meaningful, they differed notably from the decisions in the interactive conditions used in the studies by Peng et al. (2010), Ruggiero (2015), and Steinemann et al. (2015), who utilized the games for change Spent and Darfur Is Dying. In Darfur Is Dying, the player takes on the role of a person living in a refugee camp, who must venture out of the camp while avoiding capture by the militia patrolling the area. In Spent, the player is a single parent who recently lost their job and must try to survive the month on US $1,000 while facing difficult choices, such as whether or not to send their child



to an expensive gifted program. While these games tackle separate issues using different design approaches, they have two crucial factors in common. First, almost every decision in these games had drastic consequences – either bringing the player ever closer to being caught by the militia or to running out of money. Often, one wrong decision could mean losing the game. Second, both games are quite difficult; in Steinemann et al. (2015), for example, the vast majority of players of Darfur Is Dying lost the game. Contrasted with the far less severe consequences of choosing to stay with one’s mother or a friend and ultimately ending up in a safe and stable environment, it could be argued that the decisions made in games such as Darfur Is Dying and Spent could be experienced as far more important and meaningful. Described in terms used by Green and Jenkins (2014), user control over the narrative structure was likely more strongly felt when players could see the clear consequences of their actions. This is supported by previous research that found that inspirational and motivational video clips were only associated with increased prosocial behavior when combined with perceived choice (Ellithorpe, Ewoldsen, & Oliver, 2015). Another study found that participants were more satisfied making decisions themselves, rather than having a decision made for them, only when they could clearly differentiate between the two options, and that only such differentiated options led to a higher sense of responsibility (Botti & McGill, 2006). In the current study, while responsibility did differ between the interactive and noninteractive conditions, responsibility in neither condition was particularly high.
The low sense of responsibility even in the interactive narrative could well be due to the fact that decisions were rarely followed by clear consequences; opening the letter in the morning instead of the evening, for example, left one day less for packing but had no further or lasting repercussions. Furthermore, the most important positive relationships with prosocial behavior were engagement with and appreciation for the narrative. We first hypothesized that interactivity would lead to more appreciation (Elson et al., 2014; Oliver et al., 2015; Steinemann et al., 2015) and that this in turn would relate to more prosocial behavior (Morgan et al., 2009; Myrick & Oliver, 2015; Small & Simonsohn, 2008). Perhaps, however, the concept of interactivity should be considered in more nuanced terms: interactivity may lead to more appreciation through the meaningfulness of the decisions it entails. In other words, the more meaningful the interactivity is perceived to be, the more appreciation is felt, and the more this will in turn lead to prosocial behavior. While further research comparing different forms of interactive narrative is necessary, the present findings suggest that interactivity is more complex than simply adding decisions to a story. Taken together, the differences


between the interactive narratives used in the current study and those used by Peng et al. (2010), Ruggiero (2015), and Steinemann et al. (2015) imply that decisions must feel meaningful and offer clear consequences with emotional ramifications for the player. To be more effective than their noninteractive counterparts, interactive narratives must be capable of impacting variables such as appreciation and narrative engagement. Another possible explanation for the failure to find a relationship between interactivity and prosocial behavior could be that interactivity does not, in fact, lead to an increase in prosocial behavior. Arguably, previous studies have suffered from methodological drawbacks, with the studies of both Peng et al. (2010) and Steinemann et al. (2015) being underpowered, which may have led to an overestimation of effects (Button et al., 2013). Furthermore, to our knowledge no previous studies examining the effects of interactivity on prosocial behavior or attitudes have utilized a yoked design (e.g., Peng et al., 2010; Ruggiero, 2015; Steinemann et al., 2015). Yoked designs have been used in the past to allow for conclusive results on the effects of interactivity on a number of topics, from neural activation (Cole, Yoo, & Knutson, 2012) to learning performance (Kickmeier-Rust, Marte, Linek, Lalonde, & Albert, 2008) to the amount of voluntary reading children with dyslexia are willing to do (Ward, McKeown, Utay, Medvedeva, & Crowley, 2012). When the interactive and noninteractive conditions are not yoked, it becomes difficult to ensure that any differences between the conditions are truly due to interactivity and not due to differences in the information presented in the conditions.
Owing to the high power of the present study, its yoked design, and its preregistered confirmatory analysis, the finding that interactivity does not impact prosocial behavior – at least under the conditions used in this study – can be considered robust. To examine whether interactivity affects prosocial behavior under other conditions, future studies should therefore aim both for sufficient power and, importantly, for the use of a yoked design. Preregistration of confirmatory analyses is recommended for research across fields. While interactivity failed to impact any process save responsibility in the estimated model, a number of interesting effects between the examined psychological processes and prosocial behavior were observed. For one, the positive relationship between appreciation and prosocial behavior corroborates previous findings (Steinemann et al., 2015), further establishing appreciation as an important experience to consider when designing for prosocial behavior in contexts such as, but not limited to, games for change. The previously unexamined positive relationship between narrative engagement and prosocial behavior suggests an interesting factor to keep in mind in further research.



The negative effect of identification on prosocial behavior was unexpected. The exploratory analysis provided a possible explanation, as having oneself experienced circumstances similar to those depicted in the narrative was associated both with higher identification with the character and with a smaller donation. Including this variable in the model caused the negative relationship between identification and prosocial behavior to disappear. A possible interpretation is that having experienced circumstances similar to those of a homeless family might be associated with an increased chance of still being in difficult circumstances, potentially needing the money more, and therefore being less willing to donate. It is also possible that in the context of the story used in this study, experiencing similar circumstances, and thereby identifying more with the character, affected donations negatively because participants who had experienced similar circumstances in the past did not believe that donations to charities would necessarily improve the situation of the person affected. In future studies, it may therefore be worth controlling for the perceived efficacy of proposed solutions. However, even when controlling for the effect of previous experience, the hypothesized positive relationship between identification and prosocial behavior was not observed in the model. Considering that appreciation and narrative engagement were instead related to prosocial behavior, this may suggest that, at least under certain circumstances, a narrative’s meaningfulness and its ability to engage the reader may be more important for promoting prosocial behavior than character identification is (Bartsch, Kalch, & Oliver, 2014; Small & Simonsohn, 2008). Put differently, a reader could identify with a character or a character’s actions, but would not necessarily think of the issue as meaningful or engaging enough to donate.
The finding that enjoyment was negatively related to prosocial behavior, while appreciation was positively related, further supports the differentiation between these two forms of media experience (Oliver & Bartsch, 2010). For games for change, the finding that the less fun and entertaining, yet the more meaningful and moving, the experience is, the more people will donate at the end hints at the importance of focusing on creating experiences that are appreciated rather than enjoyed (Bartsch et al., 2014; Myrick & Oliver, 2015; Steinemann et al., 2015). This finding comes, however, with the caveat that it relates solely to whether people will donate. Other behaviors, such as willingness to share the interactive narrative with other people or to start playing in the first place, may be impacted by the degree of enjoyment experienced or expected to be experienced (Cohen, 2014). Further research on the impact of appreciation and enjoyment on prosocial behavior other than donating is therefore recommended.



Limitations and Outlook
While this study offers several promising findings, it also has clear limitations. Most importantly, the main question of this study – how and why interactivity impacts prosocial behavior – presupposed that a significant impact of interactivity on prosocial behavior would be found. As this was not the case, mediation effects could not be observed. While these remain interesting research questions, the findings of this study as observed may offer valuable insights into why interactivity may work in some cases but not in others. Future studies on the relationship between interactive narratives and prosocial behavior should therefore carefully consider how interactivity is manipulated, in particular whether the decisions are considered meaningful by participants. Furthermore, the high values for appreciation and identification may have led to a ceiling effect, which would make differentiating between experimental conditions more difficult and may therefore have impeded the analysis. However, while not the main focus of the study, the positive relationships of appreciation and narrative engagement with prosocial behavior suggest interesting avenues for future research on interactive narratives. For example, the possibility of losing and facing negative consequences when wrong decisions are made, or the simple uncertainty of the outcome and the resulting suspense, may be crucial factors worth future study (Hall, 2015; Ruggiero & Becker, 2015).

Conclusion
The results of this study support the importance of appreciation, enjoyment, and narrative engagement in the context of media trying to further prosocial behavior. The results, however, also indicate that the relationship between interactivity and prosocial behavior may not be as simple as previously assumed. We argue that the examination of further interactivity-related variables, such as the emotional consequences of the decisions made, as well as the outcome of the story (i.e., whether one can lose or experience a negative outcome), may be crucial when creating interactive narratives with the goal of encouraging prosocial behavior. Lastly, while donating behavior as an operationalization of prosocial behavior is both relevant and meaningful, other behavioral consequences of interacting with narratives – for example, how willing people are to share the narrative with friends or to start reading the narrative in the first place – may offer interesting themes for future research.




Acknowledgments
We would like to sincerely thank Markus Stöcklin, Mathias Jenny, and Michelle Wobmann for their valuable advice and assistance. Furthermore, we would like to express our warmest gratitude to the editors, Malte Elson and Andrew K. Przybylski, and to the anonymous reviewers, who throughout the development phases of this preregistered research offered careful, competent, and extremely helpful feedback.

References
Bandura, A., Barbaranelli, C., Caprara, G. V., & Pastorelli, C. (1996). Mechanisms of moral disengagement in the exercise of moral agency. Journal of Personality and Social Psychology, 71(2), 364.
Bartsch, A., Kalch, A., & Oliver, M. B. (2014). Moved to think: The role of emotional media experiences in stimulating reflective thoughts. Journal of Media Psychology, 26(3), 125.
Botti, S., & McGill, A. L. (2006). When choosing is not deciding: The effect of perceived responsibility on satisfaction. Journal of Consumer Research, 33(2), 211–219.
Bucy, E. P., & Tao, C.-C. (2007). The mediated moderation model of interactivity. Media Psychology, 9(3), 647–672.
Busselle, R., & Bilandzic, H. (2009). Measuring narrative engagement. Media Psychology, 12(4), 321–347.
Button, K. S., Ioannidis, J. P., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365–376.
Čehajić, S., Brown, R., & González, R. (2009). What do I care? Perceived ingroup responsibility and dehumanization as predictors of empathy felt for the victim group. Group Processes & Intergroup Relations, 12(6), 715–729.
Clark, J. (2002). House money effects in public good experiments. Experimental Economics, 5(3), 223–231.
Cohen, E. L. (2014). What makes good games go viral? The role of technology use, efficacy, emotion and enjoyment in players’ decision to share a prosocial digital game. Computers in Human Behavior, 33, 321–329.
Cohen, J. (2001). Defining identification: A theoretical look at the identification of audiences with media characters. Mass Communication & Society, 4(3), 245–264.
Cole, S. W., Yoo, D. J., & Knutson, B. (2012). Interactivity and reward-related neural activation during a serious videogame. PLoS ONE, 7(3), e33909. doi: 10.1371/journal.pone.0033909
Davis, M. H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44(1), 113.
Eisenberg, N., & Miller, P. A. (1987). The relation of empathy to prosocial and related behaviors. Psychological Bulletin, 101(1), 91.
Ellithorpe, M. E., Ewoldsen, D. R., & Oliver, M. B. (2015). Elevation (sometimes) increases altruism: Choice and number of outcomes in elevating media effects. Psychology of Popular Media Culture, 4(3), 236.
Elson, M., Breuer, J., Ivory, J. D., & Quandt, T. (2014). More than stories with buttons: Narrative, mechanics, and context as determinants of player experience in digital games. Journal of Communication, 64(3), 521–542.
Field, A., Miles, J., & Field, Z. (2013). Discovering statistics using R. London, UK: Sage.



Green, M. C., & Jenkins, K. M. (2014). Interactive narratives: Processes and outcomes in user-directed stories. Journal of Communication, 64(3), 479–500.
Hall, A. E. (2015). Entertainment-oriented gratifications of sports media: Contributors to suspense, hedonic enjoyment, and appreciation. Journal of Broadcasting & Electronic Media, 59(2), 259–277.
Hobbs, R., & Frost, R. (2003). Measuring the acquisition of media-literacy skills. Reading Research Quarterly, 38(3), 330–355.
Hogg, M. A. (2003). Social identity. New York, NY: Guilford Press.
Jenkins, K. M. (2014). Choose your own adventure: Interactive narratives and attitude change (Unpublished doctoral dissertation). The University of North Carolina at Chapel Hill.
Kampf, R., & Stolero, N. (2015). Computerized simulation of the Israeli–Palestinian conflict, knowledge gap, and news media use. Information, Communication & Society, 18(6), 644–658.
Kickmeier-Rust, M. D., Marte, B., Linek, S., Lalonde, T., & Albert, D. (2008). The effects of individualized feedback in digital educational games. In T. Conolly & M. Stansfield (Eds.), Proceedings of the 2nd European Conference on Games Based Learning (pp. 227–236). Barcelona, Spain: Academic Publishing.
Klimmt, C. (2009). Serious games and social change: Why they (should) work. In U. Ritterfeld, M. Cody, & P. Vorderer (Eds.), Serious games: Mechanisms and effects (pp. 248–270). New York, NY: Routledge.
Kline, R. B. (2011). Principles and practice of structural equation modeling. New York, NY: Guilford Publications.
Marcus, M. (2014, December). How I became homeless. Retrieved from http://www.salon.com/2014/12/25/how_i_became_homeless
Morgan, S. E., Movius, L., & Cody, M. J. (2009). The power of narratives: The effect of entertainment television organ donation storylines on the attitudes, knowledge, and behaviors of donors and nondonors. Journal of Communication, 59(1), 135–151.
Myrick, J. G., & Oliver, M. B. (2015). Laughing and crying: Mixed emotions, compassion, and the effectiveness of a YouTube PSA about skin cancer. Health Communication, 30(8), 820–829.
Oliver, M. B., & Bartsch, A. (2010). Appreciation as audience response: Exploring entertainment gratifications beyond hedonism. Human Communication Research, 36(1), 53–81.
Oliver, M. B., Bowman, N. D., Woolley, J. K., Rogers, R., Sherrick, B. I., & Chung, M.-Y. (2015). Video games as meaningful entertainment experiences. Psychology of Popular Media Culture. Advance online publication. doi: 10.1037/ppm0000066
Peng, W., Lee, M., & Heeter, C. (2010). The effects of a serious game on role-taking and willingness to help. Journal of Communication, 60(4), 723–742.
R Core Team. (2016). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. Retrieved from http://www.jstatsoft.org/v48/i02/
Ruggiero, D. (2015). The effect of a persuasive social impact game on affective learning and attitude. Computers in Human Behavior, 45, 213–221.
Ruggiero, D., & Becker, K. (2015). Games you can’t win. The Computer Games Journal, 4(3–4), 169–186.
Small, D. A., & Simonsohn, U. (2008). Friends of victims: Personal experience and prosocial behavior. Journal of Consumer Research, 35(3), 532–542.
Steinemann, S. T., Mekler, E. D., & Opwis, K. (2015). Increasing donating behavior through a game for change: The role of interactivity and appreciation. In Proceedings of the 2015 Annual Symposium on Computer-Human Interaction in Play (pp. 319–329). New York, NY: ACM.

Journal of Media Psychology (2017), 29(1), 54–66



S. T. Steinemann et al., Interactive Narratives Affecting Social Change

Sundar, S. S. (2009). Media effects 2.0: Social and psychological effects of communication technologies. In R. L. Nabi & M. B. Oliver (Eds.), The Sage handbook of media processes and effects (pp. 545–560). Thousand Oaks, CA: Sage Publications.
Tsvetkova, M., Macy, M. W., & Szolnoki, A. (2014). The social contagion of generosity. PLoS ONE, 9(2).
Vorderer, P., Knobloch, S., & Schramm, H. (2001). Does entertainment suffer from interactivity? The impact of watching an interactive TV movie on viewers’ experience of entertainment. Media Psychology, 3(4), 343–363.
Ward, A., McKeown, M., Utay, C., Medvedeva, O., & Crowley, R. (2012). Interactive stories and motivation to read in the RAFT dyslexia fluency tutor. In Y. Nakano, M. Neff, A. Paiva, & M. Walker (Eds.), International Conference on Intelligent Virtual Agents (pp. 260–267). Berlin, Germany: Springer.

Received January 15, 2016
Revision received December 14, 2016
Accepted December 15, 2016
Published online March 21, 2017

Sharon T. Steinemann
Center for Cognitive Psychology and Methodology
University of Basel
Missionsstrasse 62a
4055 Basel
Switzerland
sharon.steinemann@unibas.ch

Sharon Steinemann is a research assistant and PhD candidate in the Human-Computer Interaction Lab at the Center for Cognitive Psychology and Methodology at the University of Basel. Her research interests focus on the impact of interactive media on prosocial behavior and attitude change, and on understanding the psychological processes involved.

Glena Iten is a research assistant and PhD candidate in the Human-Computer Interaction Lab at the Center for Cognitive Psychology and Methodology at the University of Basel. Her research interests focus on media recovery, moral reasoning in games, and applied cognitive psychology.


Klaus Opwis is a Full Professor of Cognitive Psychology and Methodology at the University of Basel. His research interests focus on applied cognitive psychology (e.g., training of working memory, HCI, aesthetics, and visual perception) and on methodological topics (e.g., measurement theory, structural equation models).

Seamus Forde is an HCI Master’s student and research assistant in the HCI research group at the University of Basel. His research interests focus on computer games, gamification, and motivation.

Lars Frasseck is a research associate and software engineer at the Department of Psychology at the University of Basel. His research interests focus on web software as a means of data collection and analysis in psychological studies.

Elisa Mekler is a post-doctoral fellow at the HCI Games Group at the University of Waterloo. Her research focuses on the characteristics of enjoyable and meaningful user experiences with games, gamification, and interactive technology in general. She is particularly interested in the value of emotional experiences with technology and their potential to motivate attitude and behavior change.



Meeting Calendar March 26–29, 2017 TeaP 2017 – 59th Conference of Experimental Psychologists Dresden, Germany Contact: TeaP 2017 organizing team, E-mail info@teap2017.de, http://www.teap2017.de/

July 15–17, 2017 WAPOR 70th Annual Conference Lisbon, Portugal Contact: World Association for Public Opinion Research, E-mail renae@wapor.org, http://wapor.org/contacts/

October 18–21, 2017 18th Annual Meeting of the AoIR 2017 Tartu, Estonia Contact: Association of Internet Researchers, E-mail ac@aoir.org, http://aoir.org/

May 6–11, 2017 CHI2017 Denver, CO, USA Contact: Conference Chairs: Gloria Mark, University of California, Irvine, and Susan Fussell, Cornell University, E-mail generalchairs@chi2017.acm.org, https://chi2017.acm.org/organizers.html

July 16–20, 2017 IAMCR 2017 Conference Cartagena, Colombia Contact: International Association for Media and Communication Research, http://iamcr.org/news/iamcr2017

November 16–17, 2017 2nd International Conference on Communication & Media Studies Vancouver, Canada Contact: Communication & Media Studies Knowledge Community, http://oncommunicationmedia.com/2017-conference

May 25–29, 2017 67th ICA – International Communication Association Annual Conference San Diego, CA, USA Contact: International Communication Association, Washington, DC, https://www.icahdq.org/conf/

May 30–June 2, 2017 Canadian Communication Association – Annual Meeting Toronto, Canada Contact: Canadian Communication Association, Ottawa, Canada, http://www.acc-cca.ca/annualmeeting

July 11–14, 2017 ECP – 15th European Congress of Psychology Amsterdam, The Netherlands Contact: ENIC Meetings and Events, Florence, Italy, E-mail info@ecp2015.it, http://www.ecp2015.it/


August 3–6, 2017 APA Annual Convention Washington, DC, USA Contact: American Psychological Association, 750 First Street, Washington, DC, 20002-4242, http://www.apa.org/convention/future.aspx

November 16–19, 2017 NCA 102nd Annual Convention Dallas, TX, USA Contact: National Communication Association, E-mail inbox@natcom.org, http://www.natcom.org/convention/

August 9–12, 2017 AEJMC’s 100th Annual Conference Chicago, IL, USA Contact: Association for Education in Journalism and Mass Communication, http://www.aejmc.org/

September 6–8, 2017 10th Conference of the DGPs Media Psychology Division Landau, Germany Contact: University of Koblenz-Landau, Landau, Germany, E-mail info@mediapsychology2017.com, http://www.mediapsychology2017.com

Journal of Media Psychology (2017), 29(1), 67 DOI: 10.1027/1864-1105/a000220


Instructions to Authors
Journal of Media Psychology (JMP) is committed to publishing original, high-quality papers which cover the broad range of media psychological research. This peer-reviewed journal focuses on how human beings select, use, and experience various media, as well as how media (use) can affect their cognitions, emotions, and behaviors. It is also open to research from neighboring disciplines insofar as this work ties in with psychological concepts of the uses and effects of the media. In particular, it publishes multidisciplinary papers that reflect a broader theoretical and methodological spectrum and comparative work, e.g., cross-media, cross-gender, or cross-cultural. As JMP is intended to foster Open Science practices, authors are invited to publish their data and materials (i.e., stimuli and surveys) as Electronic Supplementary Material on the publisher’s website at https://econtent.hogrefe.com. In line with the Peer Reviewers’ Openness Initiative, authors may be asked by reviewers to share their data and materials at any stage of the reviewing process. Journal of Media Psychology publishes the following types of articles: Original Articles, Theoretical Articles, Research Reports, and Pre-Registered Reports.

Manuscript submission: All manuscripts should in the first instance be submitted electronically at http://www.editorialmanager.com/jmp. Detailed instructions to authors are provided at https://www.hogrefe.com/j/jmp

Copyright Agreement: By submitting an article, the author confirms and guarantees on behalf of him-/herself and any coauthors that the manuscript has not been submitted or published elsewhere, and that he or she holds all copyright in and titles to the submitted contribution, including any figures, photographs, line drawings, plans, maps, sketches, and tables, and that the article and its contents do not infringe in any way on the rights of third parties. The author indemnifies and holds harmless the publisher from any third-party claims. The author agrees, upon acceptance of the article for publication, to transfer to the publisher the exclusive right to reproduce and distribute the article and its contents, both physically and in nonphysical, electronic, or other form, in the journal to which it has been submitted and in other independent publications, with no limitations on the number of copies or on the form or the extent of distribution. These rights are transferred for the duration of copyright as defined by international law. Furthermore, the author transfers to the publisher the following exclusive rights to the article and its contents:
1. The rights to produce advance copies, reprints, or offprints of the article, in full or in part, to undertake or allow translations into other languages, to distribute other forms or modified versions of the article, and to produce and distribute summaries or abstracts.
2. The rights to microfilm and microfiche editions or similar, to the use of the article and its contents in videotext, teletext, and similar systems, to recordings or reproduction using other media, digital or analog, including electronic, magnetic, and optical media, and in multimedia form, as well as for public broadcasting in radio, television, or other forms of broadcast.
3. The rights to store the article and its contents in machine-readable or electronic form on all media (such as computer disks, compact disks, magnetic tape), to store the article and its contents in online databases belonging to the publisher or third parties for viewing or downloading by third parties, and to present or reproduce the article or its contents on visual display screens, monitors, and similar devices, either directly or via data transmission.
4. The rights to reproduce and distribute the article and its contents by all other means, including photomechanical and similar processes (such as photocopying or facsimile), and as part of so-called document delivery services.
5. The right to transfer any or all rights mentioned in this agreement, as well as rights retained by the relevant copyright clearing centers, including royalty rights, to third parties.
Online Rights for Journal Articles: Guidelines on authors’ rights to archive electronic versions of their manuscripts online are given in the “Guidelines on sharing and use of articles in Hogrefe journals” on the journal’s web page at https://www.hogrefe.com/j/jmp

October 2016



Alternatives to traditional self-reports in psychological assessment “A unique and timely guide to better psychological assessment.” Rainer K. Silbereisen, Research Professor, Friedrich Schiller University Jena, Germany Past-President, International Union of Psychological Science

Tuulia Ortner / Fons J. R. van de Vijver (Editors)

Behavior-Based Assessment in Psychology: Going Beyond Self-Report in the Personality, Affective, Motivation, and Social Domains (Series: Psychological Assessment – Science and Practice – Vol. 1) 2015, vi + 234 pp. US $63.00 / € 44.95 ISBN 978-0-88937-437-9 Also available as eBook

Traditional self-reports can be an insufficient source of information about personality, attitudes, affect, and motivation. What are the alternatives? This first volume in the authoritative series Psychological Assessment – Science and Practice discusses the most influential, state-of-the-art forms of assessment that can take us beyond self-report. Leading scholars from various countries describe the theoretical background and psychometric properties of alternatives to self-report, including behavior-based assessment, observational methods, innovative computerized procedures, indirect assessments, projective techniques, and narrative reports. They also look at the validity and practical application of such forms of assessment in domains as diverse as health, forensic, clinical, and consumer psychology.

www.hogrefe.com


Social Psychology

ISSN-Print 1864-9335 ISSN-Online 2151-2590 ISSN-L 1864-9335 6 issues per annum (= 1 volume)

Subscription rates (2017) Libraries / Institutions US $478.00 / € 374.00 Individuals US $223.00 / € 159.00 Postage / Handling US $24.00 / € 18.00

www.hogrefe.com

Free sample issue online

Editor-in-Chief Kai Epstude University of Groningen, The Netherlands

Editorial Office Wim Meerholz University of Groningen, The Netherlands

Associate Editors Julia Becker, Osnabrück, Germany Malte Friese, Saarbrücken, Germany Ilka Gleibs, London, UK Michael Häfner, Berlin, Germany Hans J. IJzerman, Amsterdam, The Netherlands

Ulrich Kühnen, Bremen, Germany Toon Kuppens, Groningen, The Netherlands Ruth Mayo, Jerusalem, Israel Christian Unkelbach, Cologne, Germany Michaela Wänke, Mannheim, Germany

About the Journal Social Psychology publishes innovative and methodologically sound research and serves as an international forum for scientific discussion and debate in the field of social psychology. Topics include all basic social psychological research themes, methodological advances in social psychology, as well as research in applied fields of social psychology. The journal focuses on original empirical contributions to social psychological research, but is open to theoretical articles, critical reviews, and replications of published research. The journal was published until volume 38 (2007) as the Zeitschrift für Sozialpsychologie (ISSN 0044-3514). Drawing on over 30 years of experience and tradition in publishing high-quality, innovative science as the Zeitschrift für Sozialpsychologie, Social Psychology has an internationally renowned team of editors and consulting editors from all areas of basic and applied social psychology, thus ensuring that the highest international standards are maintained.

Manuscript Submissions All manuscripts should be submitted online at www.editorialmanager.com/sopsy, where full instructions to authors are also available. Electronic Full Text The full text of the journal – current and past issues (from 1999 onward) – is available online at econtent.hogrefe.com/loi/zsp (included in subscription price). A free sample issue is also available there. Abstracting Services The journal is abstracted / indexed in Current Contents / Social and Behavioral Sciences (CC / S&BS), Social Sciences Citation Index (SSCI), PsycINFO, PSYNDEX, ERIH, Scopus, and EMCare. Impact Factor (Journal Citation Reports®, Thomson Reuters): 2015 = 1.979
