Expsy 2016 63 issue 1 by Hogrefe

Volume 63 / Number 1 / 2016

Experimental Psychology

Editor-in-Chief: Christoph Stahl. Editors: Tom Beckers, Arndt Bröder, Adele Diederich, Chris Donkin, Gesine Dreisbach, Kai Epstude, Magda Osman, Manuel Perea, Klaus Rothermund, Samuel Shaki



Editors

C. Stahl (Editor-in-Chief), Köln, Germany T. Beckers, Leuven, Belgium A. Bröder, Mannheim, Germany A. Diederich, Bremen, Germany C. Donkin, Sydney, Australia G. Dreisbach, Regensburg, Germany

K. Epstude, Groningen, The Netherlands M. Osman, London, UK M. Perea, Valencia, Spain K. Rothermund, Jena, Germany S. Shaki, Samaria, Israel

Editorial Board

U. J. Bayen, Düsseldorf, Germany H. Blank, Portsmouth, UK J. De Houwer, Ghent, Belgium R. Dell’Acqua, Padova, Italy G. O. Einstein, Greenville, SC, USA E. Erdfelder, Mannheim, Germany M. Goldsmith, Haifa, Israel D. Hermans, Leuven, Belgium R. Hertwig, Berlin, Germany J. L. Hicks, Baton Rouge, LA, USA P. Juslin, Uppsala, Sweden Y. Kareev, Jerusalem, Israel D. Kerzel, Geneva, Switzerland A. Kiesel, Freiburg, Germany K. C. Klauer, Freiburg, Germany R. Kliegl, Potsdam, Germany I. Koch, Aachen, Germany J. I. Krueger, Providence, RI, USA S. Lindsay, Victoria, BC, Canada

E. Loftus, Irvine, CA, USA T. Meiser, Mannheim, Germany K. Mitchell, West Chester, PA, USA N. W. Mulligan, Chapel Hill, NC, USA B. Newell, Sydney, Australia K. Oberauer, Zürich, Switzerland F. Parmentier, Palma, Spain M. Regenwetter, Champaign, IL, USA R. Reisenzein, Greifswald, Germany J. N. Rouder, Columbia, MO, USA D. Shanks, London, UK M. Steffens, Landau, Germany S. Tremblay, Quebec, Canada C. Unkelbach, Köln, Germany M. Waldmann, Göttingen, Germany E. Walther, Trier, Germany P. A. White, Cardiff, UK D. Zakay, Tel Aviv, Israel

Publisher

Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail publishing@hogrefe.com, Web http://www.hogrefe.com

Production

Regina Pinks-Freybott, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail production@hogrefe.com

Subscriptions

Hogrefe Publishing, Herbert-Quandt-Str. 4, D-37081 Göttingen, Germany, Tel. +49 551 99950-956, Fax +49 551 99950-998

Advertising/Inserts

Marketing, Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, Tel. +49 551 99950-0, Fax +49 551 99950-425, E-mail marketing@hogrefe.com

ISSN

ISSN-L 1618-3169, ISSN-Print 1618-3169, ISSN-Online 2190-5142

© 2016 Hogrefe Publishing. The journal as well as the individual contributions to it are protected under international copyright law. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, digital, mechanical, photocopying, microfilming or otherwise, without prior written permission from the publisher. All rights, including translation rights, are reserved.

Publication

Published in six issues per annual volume. Experimental Psychology is the continuation of Zeitschrift für Experimentelle Psychologie (ISSN 0949-3964), the last annual volume of which (Volume 48) was published in 2001.

Subscription Prices

Annual subscription (2016): Individuals: US $259.00/€185.00/£148.00; Institutions: US $498.00/€380.00/£304.00 (postage & handling: US $24.00/€18.00/£15.00). Single issue: US $83.00/€63.50/£51.00 (+ postage & handling).

Payment

Payment may be made by check, international money order, or credit card to Hogrefe Publishing, Merkelstr. 3, D-37085 Göttingen, Germany, or, for customers in North America, to Hogrefe Publishing, Inc., Journals Department, 38 Chauncy Street, Suite 1002, Boston, MA 02111, USA.

Electronic Full Text

The full text of Experimental Psychology is available online at http://econtent.hogrefe.com/loi/zea

Abstracting/Indexing Services

Experimental Psychology is abstracted/indexed in Current Contents/Social & Behavioral Sciences (CC/S&BS), Social Science Citation Index (SSCI), Medline, PsyJOURNALS, PsycINFO, PSYNDEX, ERIH, Scopus, and EMCare. Impact Factor (2014): 2.076.

Printed on acid-free paper

Experimental Psychology 2016; Vol. 63(1)


Contents

Editorial

Experimental Psychology: A Call for Confirmatory Research
Christoph Stahl

Theoretical Articles

Single- and Dual-Process Models of Biased Contingency Detection
Miguel A. Vadillo, Fernando Blanco, Ion Yarritu, and Helena Matute

Research Articles

The Moderating Impact of Distal Regularities on the Effect of Stimulus Pairings: A Novel Perspective on Evaluative Conditioning
Sean Hughes, Jan De Houwer, and Dermot Barnes-Holmes

Exploring the Subjective Feeling of Fluency
Michael Forster, Helmut Leder, and Ulrich Ansorge

The Influence of Presentation Order on Category Transfer
Fabien Mathy and Jacob Feldman

Registered Report

Evaluative Priming in the Pronunciation Task: A Preregistered Replication and Extension
Karl Christoph Klauer, Manuel Becker, and Adriaan Spruyt


Editorial

Experimental Psychology: A Call for Confirmatory Research

Christoph Stahl
Department of Psychology, University of Cologne, Germany

Looking back at the journal’s performance, it is noteworthy that some things have not changed: In the previous year, Experimental Psychology continued its tradition of publishing highest-quality experimental psychological research. This research continues to have a strong impact on psychology as a basic science, as underscored by citation metrics (IF for 2014: 2.076; 5-year IF: 2.475; h5-index: 24; see Footnote 1). Experimental Psychology also continues to be a particularly fast outlet: The majority of editorial decisions – concerning initial submissions as well as revisions – were again made within 6 weeks.

Other things have changed: Most prominently, the positive effects of the data-sharing policies adopted last year are clearly visible. For almost all articles published during the second half of 2015, the data necessary to reproduce the reported results are available as electronic supplements. This makes Experimental Psychology one of the few journals for which the widely shared goal of open data has already become reality. Since 2015, authors have had the option to also share other electronic supplements – for instance, materials or methods, additional analyses and results – alongside the electronic version of the article on the journal’s website (see Footnote 2). In their efforts toward making their research more open and reproducible, authors have made great use of this new option: About a quarter of all articles published in the past year are accompanied by such supplemental materials. These materials will help other researchers reproduce the reported results and replicate the original findings – two important steps in building a cumulative science. We encourage all authors to make use of the options for sharing materials and methods.

I am especially glad to see the first Registered Report in Experimental Psychology appear in the current issue (Klauer, Becker, & Spruyt, 2016). It investigated a moderating variable that successfully explained why an independent lab failed to replicate an Affective Priming effect. The article nicely illustrates how an adversarial collaboration can make excellent use of this new format to settle an open empirical question.

Call for Registered Report Submissions

We encourage authors to submit their confirmatory research proposals as Registered Reports whenever suitable. Submitted proposals will be reviewed, and editorial decisions will be made, before the research has been conducted (see Instructions to Authors for details, Footnote 2). This comes with at least two benefits: First, authors will be able to improve the quality of their research proposal by building on reviewers’ constructive comments even before their research is conducted. Second, the publication decision will be result-blind, that is, based solely on the theoretical relevance and methodological soundness of the research proposal. In particular – and in contrast to other forms of preregistration – once the submitted proposal has been accepted at Stage 1, publication of the final article is independent of the direction or statistical significance of the results. As a third argument for submitting a Registered Report in 2016, note that funds for conducting registered research may be covered in part by a Preregistration Prize awarded by the Center for Open Science (see https://cos.io/prereg/; see also Footnote 3).

The Registered Report article format is well suited for original research aimed at testing a new hypothesis, but it is not at all limited to this case. For instance, a to-be-tested hypothesis may well be the result of research exploring a new phenomenon across a series of studies; manuscripts reporting exploratory research can greatly benefit from including a final registered study aimed at confirming the main prediction. As another example, studies attempting to replicate prior work should be submitted as Registered Reports because result-blind review and editorial decisions are especially critical for replication attempts. Submissions of replication proposals are explicitly welcomed at Experimental Psychology. Needless to say, all submissions will be held to the same high standards of theoretical innovation and methodological rigor.

Footnote 1. The impact factor (IF) is calculated by dividing the number of citations, in 2014, of articles published in 2012 and 2013 by the number of articles published in that 2-year period. Retrieved from Thomson Reuters, December 13, 2015. The h5-index is the h-index for articles published in the last 5 complete years. It is the largest number h such that h articles have at least h citations each. Retrieved from Google Scholar, December 13, 2015.

Footnote 2. See Instructions to Authors at http://www.hogrefe.com/periodicals/experimental-psychology/advice-for-authors/.

Footnote 3. It is our impression that these advantages are especially relevant for younger researchers who often struggle with replicating findings they plan to investigate, and are expected to publish a certain number of articles in a brief period of time. With a Registered Report accepted at Stage 1, researchers can rest assured that their work will in fact be published, and the time course until publication is predictable. Acceptance at Stage 1 may even come to be seen as partial fulfillment of publication requirements by thesis committees. Finally, (partial) funding on the basis of a peer-reviewed proposal can help enable young researchers to investigate independent research questions.

Experimental Psychology 2016; Vol. 63(1):1–2 DOI: 10.1027/1618-3169/a000317

Reporting Standards

Openness and transparency are key ingredients of the scientific method. In the past, transparent reporting of design, methods, and results has too often remained an unattained goal; in response to this, a broad coalition of scientific societies, journals, publishers, and research funding organizations has adopted new guidelines for reporting standards (see http://cos.io/top). At Experimental Psychology, we support this initiative, and we have made explicit the required reporting standards in the Instructions to Authors (see Footnote 2). Submissions are expected to adhere to the following five reporting requirements:

1. Citation: Authors must acknowledge all kinds of intellectual contributions made by other researchers by citing their work in accordance with APA’s publication manual – not only contributions of the theoretical and empirical kind, but also materials, methods, data, and analysis code;
2. Design and Analysis: Authors must transparently describe study design and data analysis; this includes reporting all manipulations and measures, describing how sample size was determined, and explaining any data exclusions;
3. Materials and Methods: Authors are encouraged to share any original materials and methods used in their study; the manuscript must state whether and by which means they do so;
4. Preregistration and Replication: The journal offers the Registered Reports article category, for which preregistration of study methods and analysis plan is required; replications should be submitted as Registered Reports;
5. Data sharing: Data must be shared publicly (i.e., uploaded as an electronic supplement or deposited in an independent repository).

These reporting standards largely reflect current editorial policy (with the one exception that authors are now asked to make explicit whether and how materials and methods are shared). Publishing in accordance with the new transparency guidelines should therefore come without noticeable additional effort for authors because, at Experimental Psychology, a high level of transparency already represents established practice.

Continuing a Tradition: Call for Special Issues

Until a few years back, Experimental Psychology used to publish excellent thematic Special Issues on topics within the journal’s scope – for instance, on working memory and cognition (Issue 4/2004), bilingual sentence processing (Issue 3/2003), and Internet-based psychological experimenting (Issue 4/2002) – with some of the articles in these issues among the most influential articles in the journal. We have decided to revive this tradition: Beginning in 2016, thematic Special Issues will again be published in Experimental Psychology. Special Issue topics can be proposed from any area of experimental psychology as a basic science. Proposals can be submitted at any time. A Call for Proposals is available on the journal’s website (http://www.hogrefe.com/periodicals/experimental-psychology/call-for-papers/). I invite researchers to use this great opportunity to present the state-of-the-art research conducted in their area of expertise.

A final note concerns the editorial team: In 2015, Christian Frings (University of Trier, Germany) and Frederick Verbruggen (University of Exeter, UK) decided to leave the editorial team due to other important commitments. I hereby express my great gratitude for their long and excellent service as associate editors and their important help in implementing innovation. With their departure, the journal needed new editorial expertise in the domain of action control, and I am very grateful to be able to welcome Gesine Dreisbach (University of Regensburg, Germany) as a new associate editor.

References

Klauer, K. C., Becker, M., & Spruyt, A. (2016). Evaluative priming in the pronunciation task: A preregistered replication and extension. Experimental Psychology, 63, 70–78. doi: 10.1027/1618-3169/a000286

Christoph Stahl
Department of Psychology
University of Cologne
Herbert-Lewin-Str. 2
50931 Cologne
Germany
Tel. +49 221 470-3428
Fax +49 221 470-7089
E-mail christoph.stahl@uni-koeln.de



Theoretical Article

Single- and Dual-Process Models of Biased Contingency Detection

Miguel A. Vadillo,1,2 Fernando Blanco,3 Ion Yarritu,3 and Helena Matute3

1 Primary Care and Public Health Sciences, King’s College London, UK
2 Department of Experimental Psychology, University College London, UK
3 Departamento de Fundamentos y Métodos de la Psicología, Universidad de Deusto, Bilbao, Spain

Abstract: Decades of research in causal and contingency learning show that people’s estimations of the degree of contingency between two events are easily biased by the relative probabilities of those two events. If two events co-occur frequently, then people tend to overestimate the strength of the contingency between them. Traditionally, these biases have been explained in terms of relatively simple single-process models of learning and reasoning. However, more recently some authors have found that these biases do not appear in all dependent variables and have proposed dual-process models to explain these dissociations between variables. In the present paper we review the evidence for dissociations supporting dual-process models and we point out important shortcomings of this literature. Some dissociations seem to be difficult to replicate or poorly generalizable and others can be attributed to methodological artifacts. Overall, we conclude that support for dual-process models of biased contingency detection is scarce and inconclusive.

Keywords: associative models, cognitive biases, contingency learning, cue-density bias, dual-process models, illusory correlations, outcome-density bias, propositional models

© 2016 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001
Experimental Psychology 2016; Vol. 63(1):3–19 DOI: 10.1027/1618-3169/a000309

Contingency learning is the ability to detect that different events in the environment are statistically related. Classical and instrumental conditioning are probably the simplest and most popular examples of contingency learning. However, this ability is also an essential part of more sophisticated cognitive processes like language acquisition (Ellis, 2008), visual search (Chun & Turk-Browne, 2008), causal induction (Holyoak & Cheng, 2011), or categorization (Kruschke, 2008). Given the importance of these processes, it is hardly surprising that people tend to be very good at detecting statistical correlations and causal relations from their first years of life (Beckers, Vandorpe, Debeys, & De Houwer, 2009; Gopnik, Sobel, Schulz, & Glymour, 2001; Saffran, Aslin, & Newport, 1996). Unfortunately, we are so eager to detect statistical patterns that we also tend to perceive them when they are absent (Chapman & Chapman, 1969; Matute, 1996; Redelmeier & Tversky, 1996). Understanding how and why we misperceive contingency between unrelated events has become one of the most interesting topics of research in cognitive psychology (Gilovich, 1991; Vyse, 1997). It has been suggested that biased contingency detection might play a role in the development of pseudoscientific thinking, clinical errors, social stereotyping, and pathological behavior, among others (Hamilton & Gifford, 1976; Lilienfeld, Ritschel, Lynn, Cautin, & Latzman, 2014; Matute, Yarritu, & Vadillo, 2011; Orgaz, Estevez, & Matute, 2013; Reuven-Magril, Dar, & Liberman, 2008). From this point of view, basic research on the mechanisms underlying these biases can make a potential contribution to the improvement of debiasing and educational strategies aimed at counteracting them (Barbería, Blanco, Cubillas, & Matute, 2013; Matute et al., 2015).

In the present review, we focus on two specific biases that are assumed to distort our perception of contingency, namely, the cue-density bias and the outcome-density bias (Allan & Jenkins, 1983; López, Cobos, Caño, & Shanks, 1998; Wasserman, Kao, Van-Hamme, Katagiri, & Young, 1996). In the following sections we briefly explain these two biases and their contribution to our understanding of why we perceive illusory correlations and why we infer causal relations between events that are actually independent.

Although biases in contingency detection have been explored extensively for decades, there is still little consensus about their underlying mechanisms. Traditionally, they have been explained in terms of relatively simple single-process models that put the stress on basic learning and memory processes (Fiedler, 2000; López et al., 1998; Shanks, 1995). However, more recent theories have suggested that multiple processes are needed to fully understand biases in contingency detection (Allan, Siegel, & Hannah, 2007; Allan, Siegel, & Tangen, 2005; Perales, Catena, Shanks, & González, 2005; Ratliff & Nosek, 2010). In general, these theories fit very well with the increasing popularity of dual-process models in cognitive psychology (Sherman, Gawronski, & Trope, 2014).

The goal of the present paper is to assess critically the evidence for dual-process models of biased contingency learning. To summarize our review, we first present the basic methodology used to explore biases in contingency detection and the main (single- and dual-process) theories designed to explain them. Then, we present the results of a reanalysis of our own published work, which suggests that the findings that support dual-process models were not replicated in our own data set, comprising data from 848 participants. In addition, computer simulations show that the use of insensitive dependent measures might explain some results that are typically interpreted in terms of dual-process models. Finally, we explore the experiments that have tested the predictions of dual-process models with implicit measures and we argue that the pattern of results is too heterogeneous to draw any firm conclusions. In light of this, we conclude that for the moment it would be premature to abandon traditional, single-process models. Unless future research shows otherwise, these models still provide the best and simplest framework to understand biases in contingency detection and to design successful debiasing strategies.

Figure 1. Panel 1A represents a standard 2 × 2 contingency table. Panels 1B–1G represent examples of contingency tables yielding different Δp values.

Biased Contingency Detection and Illusory Correlations

Imagine that you were asked to evaluate whether a new medicine produces an allergic reaction as a side effect. To accomplish this task, you are shown the individual medical records of a number of patients where you can find out whether each patient took the medicine and whether he/she suffered an allergic reaction. How should you assess the relation between taking the medicine and suffering the allergy? As shown in Figure 1A, to make this judgment you would need four pieces of information that can be summarized in a 2 × 2 contingency table. You would need to know how many patients took the medicine and suffered an allergy (cell a), how many patients took the medicine and did not suffer an allergy (cell b), how many patients did not take the medicine but suffered an allergy nevertheless (cell c), and, finally, how many patients did not take the medicine and did not suffer an allergy (cell d). Based on this information, you could compute some measure of contingency and estimate whether or not that level of contingency is substantially different from zero. Although there are alternative ways to measure contingency, the Δp rule is usually considered a valid normative index (Allan, 1980; Cheng & Novick, 1992; Jenkins & Ward, 1965). According to this rule, if you want to assess the degree of contingency between a cue (e.g., taking a medicine) and an outcome (e.g., an allergy), then you need to compute

$$\Delta p = p(o \mid c) - p(o \mid \neg c), \tag{1}$$

where p(o|c) is the probability of the outcome given the cue and p(o|¬c) is the probability of the outcome in the absence of the cue. As shown in Figure 1A, these two probabilities can be easily computed from the information contained in a contingency table. Positive values of Δp indicate that the probability of the outcome is higher when the cue is present than when it is absent (see, e.g., Figure 1B). In contrast, negative values indicate that the probability of the outcome is reduced when the cue is present (e.g., Figure 1C). Finally, if the probability of the outcome is the same in the presence as in the absence of the cue, the value of Δp is always 0 (e.g., Figure 1D). In these latter cases, there is no contingency between cue and outcome. When laypeople are asked to estimate the contingency between two events, do their judgments agree with this normative rule? As we will see, the answer is both “yes” and “no.”

To study how people detect contingency, researchers typically rely on a very simple task that has become a standard procedure in contingency-learning research. During the task, participants are exposed to a series of trials in which a cue and an outcome may be either present or absent, and they are instructed to discover the relationship between both. As in our previous example, the cue can be a fictitious medicine taken by some patients and the outcome can be an allergic reaction. On each trial, participants are first shown information about whether a patient took the drug on a specific day and they are asked to predict whether or not they think that this patient will develop an allergic reaction. After entering a yes/no response, they receive feedback and they proceed to the next trial. Once they have seen the whole sequence of trials in random order, the participants are asked to rate their perceived strength of the relationship between the medicine and the allergic reaction.
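
As a minimal sketch of the Δp rule in Equation 1, the following Python function computes Δp from the four cells of a 2 × 2 contingency table, assuming the cell layout described in the text (a: cue and outcome, b: cue without outcome, c: outcome without cue, d: neither). The function name `delta_p` and the cell counts in the usage lines are hypothetical illustrations; they are not the values shown in Figure 1.

```python
from fractions import Fraction


def delta_p(a, b, c, d):
    """Return Δp = p(o|c) - p(o|¬c) for a 2 x 2 contingency table.

    a: cue present, outcome present    b: cue present, outcome absent
    c: cue absent,  outcome present    d: cue absent,  outcome absent
    """
    if a + b == 0 or c + d == 0:
        # Each conditional probability needs at least one trial of its kind.
        raise ValueError("need both cue-present and cue-absent trials")
    p_o_given_c = Fraction(a, a + b)      # p(o|c)
    p_o_given_not_c = Fraction(c, c + d)  # p(o|¬c)
    return p_o_given_c - p_o_given_not_c


# Hypothetical counts, chosen only for illustration.
positive = delta_p(15, 5, 5, 15)  # outcome more likely when the cue is present
negative = delta_p(5, 15, 15, 5)  # outcome less likely when the cue is present
null = delta_p(10, 10, 10, 10)    # outcome equally likely either way
print(positive, negative, null)
```

Exact fractions are used here instead of floating-point numbers so that a null contingency comes out as exactly zero; this is a design choice of the sketch, not something prescribed by the Δp rule itself.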
The usual result is that participants’ judgments tend to covary with the objective drug-allergy contingency as measured by Δp (e.g., López et al., 1998; Shanks & Dickinson, 1987; Wasserman, 1990). Therefore, to some extent participants seem to be able to track the actual cue-outcome contingency. However, departures from the objective contingency are also observed. For instance, participants’ judgments tend to be biased by the marginal probability of the outcome, defined as the proportion of trials in which the outcome is present, that is, p(outcome) = (a + c)/(a + b + c + d). Figure 1E depicts an example where there is no contingency between cue and outcome, but the outcome tends to appear very frequently. In situations like this, participants tend to overestimate contingency (Allan & Jenkins, 1983; Allan et al., 2005; Buehner, Cheng, & Clifford, 2003; López et al., 1998; Musca, Vadillo, Blanco, & Matute, 2010; Wasserman et al., 1996). Similarly, other things being equal, participants’ judgments tend to covary with the marginal probability of the cue, defined as the proportion of trials in which the cue is present; that is, p(cue) = (a + b)/(a + b + c + d). Figure 1F represents an example where the probability of the cue is high, but there is no contingency between cue and outcome. Again, participants tend to overestimate contingency in situations like this (Allan & Jenkins, 1983; Matute et al., 2011; Perales et al., 2005; Vadillo, Musca, Blanco, & Matute, 2011; Wasserman et al., 1996). The biasing effects of the probability of the outcome and the probability of the cue are typically known as outcome- and cue-density biases. As the astute reader might guess, the most problematic situation is that in which both the probability of the outcome and the probability of the cue are large (Figure 1G). Participants seem to find it particularly difficult to detect the lack of contingency in these cases (Blanco, Matute, & Vadillo, 2013).

It is interesting to note that biases akin to these have also been found in the social psychology literature on illusory correlations in stereotype formation (Hamilton & Gifford, 1976; Kutzner & Fiedler, 2015; Murphy, Schmeer, Vallée-Tourangeau, Mondragón, & Hilton, 2011). In these experiments, participants are shown information about the personality traits of members of two social groups. Across trials, participants see more information about one of the groups than about the other and they also see more people with positive traits than people with negative traits.
Most importantly, the proportion of positive and negative traits is identical in both social groups. Therefore, there is no correlation between membership in the majority or the minority group and the quality (positive vs. negative) of personality traits. As can be seen, if one assumes that social groups play the role of cues and that positive and negative traits play the role of outcomes, this situation is identical to the one represented in Figure 1G. Although there is no correlation between groups and traits, when participants are asked to rate the traits of both groups, they systematically tend to judge the majority group more favorably than the minority group. In other words, despite the absence of a real correlation, participants tend to associate the majority group with the most frequent (positive) personality traits and the minority group with the least frequent (negative) personality traits. The interesting point, for our current purposes, is that we can interpret illusory correlations as a combination of cue- and outcome-density biases, which means that this effect may be explained as a contingency-learning phenomenon.
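The quantities defined above can be made concrete with a short sketch. It assumes the usual cell labels of a 2 × 2 contingency table (a: cue and outcome, b: cue without outcome, c: outcome without cue, d: neither) and the standard definition Δp = a/(a + b) − c/(c + d). The function name and the trial counts are hypothetical and were chosen only to mimic a situation in which both densities are high and the contingency is null.

```python
def contingency_summary(a, b, c, d):
    """Summarize a 2x2 contingency table.

    a: cue present, outcome present    b: cue present, outcome absent
    c: cue absent, outcome present     d: cue absent, outcome absent
    """
    n = a + b + c + d
    return {
        "delta_p": a / (a + b) - c / (c + d),  # p(outcome|cue) - p(outcome|no cue)
        "p_outcome": (a + c) / n,              # marginal probability of the outcome
        "p_cue": (a + b) / n,                  # marginal probability of the cue
    }


# Hypothetical counts: the outcome follows 80% of trials with or without the cue,
# and the cue is present on 80% of trials.
table = contingency_summary(a=64, b=16, c=16, d=4)
print(table)
```

With these counts Δp is zero while both marginal probabilities equal .80, which is the kind of table in which participants would be expected to overestimate the contingency.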

Single-Process Models of Cue- and Outcome-Density Biases

Traditionally, demonstrations of cue/outcome-density biases and illusory correlations have been explained in terms of simple associative processes analogous to those assumed to account for classical and instrumental conditioning (e.g., Alloy & Abramson, 1979; López et al., 1998; Matute, 1996; Murphy et al., 2011; Shanks, 1995; Sherman et al., 2009; Van Rooy, Van Overwalle, Vanhoomissen, Labiouse, & French, 2003). The associative learning rule proposed by Rescorla and Wagner (1972) provides the simplest example of this family of models. According to the Rescorla-Wagner model, when a cue is followed by an outcome, an association or link is formed between the representations of both stimuli. After each pairing, the strength of the association is assumed to increase or decrease according to the formula:

$$\Delta V_{C-O} = \alpha \cdot \beta \cdot (\lambda - V_{TOTAL}), \qquad (2)$$

where ΔV_{C-O} is the change in the strength of the cue-outcome association after that trial, α and β are learning rate parameters depending on the salience of the cue and the salience of the outcome, respectively, λ is a dummy variable coding whether the outcome was present or absent in that trial, and V_{TOTAL} is the sum of the associative strengths of all the potential cues of the outcome present in that trial. In addition to the target cue, a contextual cue is assumed to remain present in all trials. The association of the contextual cue with the outcome is also updated according to Equation 2.

To illustrate how this simple model accounts for cue- and outcome-density biases, in Figure 2 we show the predictions of the model when given as input the six contingencies depicted in Figures 1B–1G. The top panel shows the predictions of the model when the contingency is positive (1B), negative (1C), or zero (1D). As can be seen, eventually the strength of the cue-outcome association tends to converge to the true contingency, as defined by Δp. By the end of training, the model learns a positive association when the cue-outcome contingency is positive and a negative association when the cue-outcome contingency is negative. When the cue-outcome contingency is exactly zero, the associative strength of the cue also tends to move toward this value. Therefore, the model does a good job at explaining why people are good at detecting contingencies (see Chapman & Robbins, 1990; Danks, 2003; Wasserman, Elek, Chatlosh, & Baker, 1993). However, the model also predicts some systematic deviations from the true contingency. In the four conditions where the contingency is zero, depicted in the bottom panel, the model predicts an overestimation of contingency during the initial stages of learning. These overestimations are larger when the outcome (1E) or the cue (1F) is very frequent, and even larger when both of them are very frequent (1G). Therefore, the model also provides a nice explanation for cue- and outcome-density biases (Matute, Vadillo, Blanco, & Musca, 2007; Shanks, 1995; Vadillo & Luque, 2013).

Figure 2. Results of a computer simulation of the six contingencies represented in Figures 1B–1G using the Rescorla-Wagner learning algorithm. The simulation was conducted using the Java simulator developed by Alonso, Mondragón, and Fernández (2012). For this simulation, the learning rate parameters were set to α_cue = 0.3, α_context = 0.1, β_outcome = β_¬outcome = 0.8.

Regardless of the merits and limits of associative models (Mitchell, De Houwer, & Lovibond, 2009; Shanks, 2010), for our present purposes, their most important feature is that, according to them, the same mechanism explains (1) why people are sensitive to contingency and (2) why their judgments are also biased under some conditions. A single process accounts for accurate and biased contingency detection. As we will discuss below, this is the key feature of single-process models that distinguishes them from their dual-process counterparts.

It is interesting to note that this property is also shared by other early models of biased contingency detection that do not rely on associative learning algorithms. For example, instance-based models assume that each cue-outcome trial is stored in a separate memory trace in long-term memory (Fiedler, 1996, 2000; Meiser & Hewstone, 2006; Smith, 1991). Parts of these memory traces may be lost during the encoding process. In a situation like the one represented in Figure 1G, the loss of information has very little impact on the encoding of cell a events, because there are many redundant memory traces representing the same type of event. However, information loss can have a severe impact on the encoding of cell c and d events because there are fewer traces representing them. As a result, the information encoded in memory contains more (or better) information about frequent events (cell a) than about infrequent events (cells c and d). Therefore, information loss explains why participants tend to perceive a positive contingency whenever type a events are more frequent than other events in the contingency table. Most importantly, according to these models we do not need to invoke different mechanisms to explain the cases in which participants are sensitive to the actual contingency and the instances in which their judgments are biased. Accurate and biased contingency detection are supposed to arise from the same operating mechanisms. Therefore, from our point of view, instance-based theories also belong to the category of single-process models.

For our present purposes, propositional models can be considered yet another case of single-process models. In a thought-provoking series of papers, De Houwer and colleagues (De Houwer, 2009, 2014; Mitchell et al., 2009) have suggested that all instances of human contingency learning might depend exclusively on the formation and truth evaluation of propositions. In contrast to simple associations, propositions do not just represent that events in the environment are related to each other: They also qualify how they are related (Lagnado, Waldmann, Hagmayer, & Sloman, 2007).
For instance, “cholesterol is a cause of cardiovascular disease” and “cholesterol is a predictor of cardiovascular disease” are different propositions. However, the difference between them cannot be represented in terms of a simple association. Although propositional models do not necessarily exclude the contribution of associative processes (see Moors, 2014), the representational power of propositions allows these models to explain aspects of learning that fall beyond the scope of simple associative models (De Houwer, Beckers, & Glautier, 2002; Gast & De Houwer, 2012; Zanon, De Houwer, & Gast, 2012). These ideas have not been formalized in a mathematical model, but nothing in their current formulation suggests that separate mechanisms would be needed to account for accurate and biased contingency detection. For our present purposes, the idea that all learning depends on the evaluation of propositions represents yet another example of a single-process model.
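The transient overestimation that the Rescorla-Wagner rule (Equation 2) predicts under null contingencies can be reproduced with a few lines of code. The following is only a minimal sketch, not the Java simulator used for Figure 2: the learning rates are those reported in the Figure 2 caption, but the trial probabilities, the number of trials, the number of runs, and all function names are assumptions made for illustration, because the exact contingency tables of Figures 1B–1G are not reproduced here.

```python
import random


def simulate_rw(p_cue, p_out_cue, p_out_no_cue, n_trials,
                alpha_cue=0.3, alpha_ctx=0.1, beta=0.8, rng=None):
    """Return the associative strength of the target cue after each trial."""
    if rng is None:
        rng = random.Random()
    v_cue = v_ctx = 0.0
    history = []
    for _ in range(n_trials):
        cue = rng.random() < p_cue
        outcome = rng.random() < (p_out_cue if cue else p_out_no_cue)
        lam = 1.0 if outcome else 0.0
        # V_TOTAL sums the strengths of all cues present; the context is always present.
        v_total = v_ctx + (v_cue if cue else 0.0)
        error = lam - v_total
        v_ctx += alpha_ctx * beta * error
        if cue:  # the target cue is only updated on trials where it is present
            v_cue += alpha_cue * beta * error
        history.append(v_cue)
    return history


def mean_curve(n_runs, seed, **kwargs):
    """Average the learning curve over several simulated runs."""
    rng = random.Random(seed)
    runs = [simulate_rw(rng=rng, **kwargs) for _ in range(n_runs)]
    return [sum(values) / n_runs for values in zip(*runs)]


# Two hypothetical null contingencies (delta-p = 0) that differ only in density.
high = mean_curve(300, seed=1, p_cue=0.8, p_out_cue=0.8, p_out_no_cue=0.8, n_trials=500)
mid = mean_curve(300, seed=1, p_cue=0.5, p_out_cue=0.5, p_out_no_cue=0.5, n_trials=500)
print(f"high density: trial 25 = {high[24]:.2f}, trial 500 = {high[-1]:.2f}")
print(f"mid density:  trial 25 = {mid[24]:.2f}, trial 500 = {mid[-1]:.2f}")
```

Under these assumptions the cue acquires a clearly positive strength early in training, more so when cue and outcome are frequent, and this strength drifts back toward zero as the context absorbs the prediction of the outcome.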

Ó 2016 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001

Dissociations and Dual-Process Models

During the last decade, some researchers have abandoned these explanations in favor of more complex dual-process theories (Allan et al., 2005, 2007; Perales et al., 2005; Ratliff & Nosek, 2010). Although differing in the detail, the core idea of these proposals is that they call upon one mechanism to explain how people (correctly) track contingencies and a different mechanism to explain why their judgments are sometimes biased by cue and outcome density. This proposal is based on the results of several experiments showing what appear to be systematic dissociations between different dependent measures. Cue/outcome density biases and illusory correlations are typically assessed with a numerical causal or contingency rating that participants provide at the end of the experiment. As explained, these judgments show sensitivity to both actual contingency and to the biasing effects of cue and outcome density. However, according to these authors (Allan et al., 2005, 2007; Perales et al., 2005; Ratliff & Nosek, 2010), other dependent measures seem to be sensitive only to the actual contingency, showing no trace of cue- or outcome-density biases. These alternative measures are assumed to be relatively uninfluenced by higher-order reasoning processes, or at least, less influenced by them than the numerical judgments typically used as dependent variables. From this point of view, it follows naturally that there must be a very basic learning mechanism that explains how people accurately track contingencies and whose output can be directly observed in these dependent variables. In contrast, judgments are affected both by contingency and by cue/outcome density biases. Because measures that directly address the learning mechanism do not seem to be sensitive to biases, these must be operating through a different mechanism that influences judgments but not the original encoding of information.
This is the reason why these models incorporate different processes to account for accurate and biased contingency detection. A schematic representation of the role of learning and judgment processes of biased contingency detection is offered in Figure 3. Therefore, in dual-process models two different and successively operating mechanisms are invoked to explain why people are sensitive to contingency but they are also biased by the marginal probabilities of the cue and the outcome. The first mechanism would include basic encoding and retrieval processes that are highly sensitive to the objective contingency. The information gathered by this mechanism would then feed forward to other mechanisms involved in judgment and decision-making processes. Biases would appear only at this latter stage.

Experimental Psychology 2016; Vol. 63(1):3–19

Figure 3. Schematic representation of dual-process models of biased contingency detection.

These findings are certainly challenging for the theories of contingency detection discussed in the previous section, which in the absence of additional assumptions would typically anticipate similar effects in all dependent measures of contingency learning. Note, however, that dissociations are not a perfect basis to draw inferences about the presence of one or multiple systems. Borrowing an example from Chater (2003), following the logic of dissociations one might conclude that human beings must have different digestive systems, because some people are allergic to prawns, while others are allergic to peanuts. Tiny differences in the way a single mechanism tackles similar problems might create the illusion that several mechanisms are involved in an operation that is actually best described in terms of a single-system process. In spite of these concerns about the interpretation of dissociations, in the following sections we do not question this logic, but the reliability of the findings that support dual-process models of contingency learning. The evidence for these models stems from three papers published during the last decade (Allan et al., 2005; Perales et al., 2005; Ratliff & Nosek, 2010). Although their theoretical conclusions are quite consistent, the empirical findings reported in each of them are noticeably different. In the following sections we review each of them in turn and discuss their merits and shortcomings. To overview our criticisms, we argue that some of these findings seem to be poorly replicable or generalizable, while others are based on possibly faulty dependent measures.

Cue- and Outcome-Density Biases in Trial-by-Trial Predictions

The first piece of evidence suggesting that these biases are not observed in all dependent variables comes from an interesting experiment conducted by Allan et al. (2005). Two groups of participants were instructed to discover the effect of a series of fictitious chemicals (playing the role of the cue) on the survival of a sample of bacteria in a petri dish (playing the role of the outcome). In each trial, participants saw whether or not the chemical was present in a sample and they were asked to predict whether or not the bacteria would survive. Immediately after entering their responses, they were informed of the outcome of the trial (i.e., whether the bacteria survived) and they proceeded to the next trial. At the end of training, participants were asked to rate to what extent the chemicals had a positive or a negative impact on the survival of bacteria, using a numerical scale from −100 (negative impact) to 100 (positive impact). The overall chemical-survival contingency was different for each group of participants. For one of them, the contingency was moderately positive (Δp = .467), while for the other one the contingency was always null (Δp = .000). Each participant was asked to complete three of these contingency-detection problems, all of them with the same overall contingency, but with different probabilities of the outcome. Therefore, the experiment relies on a 2 × 3 factorial design with contingency as a between-groups manipulation and probability of the outcome as a within-participant factor.

As the reader might expect, the first finding of Allan et al. (2005) was that the numerical ratings that participants provided at the end of each problem were influenced both by contingency and outcome-probability. That is to say, participants were able, in general, to track the objective degree of contingency between each of the chemicals and the survival of bacteria; however, their ratings were also biased by the probability of the outcome. This is a replication of the well-known outcome-density bias discussed in previous sections.

Most interestingly, Allan et al. (2005) found that other dependent measures seemed to be unaffected by outcome density, although they were sensitive to the overall cue-outcome contingency. Specifically, Allan et al. used the discrete yes/no predictions made by participants in every trial to compute an alternative measure of their sensitivity to contingency. If the participant believes that there is a statistical connection between the chemicals and the survival of bacteria, then he or she should predict the survival (i.e., respond “yes” to the question of whether the bacteria would survive in the current trial) more frequently when the chemicals are present than when they are not. Following this reasoning, it would be possible to measure the extent to which a participant believes that there is a relationship between the cue and the outcome using the formula:

$$\Delta p_{pred} = p(\text{“yes”} \mid \text{cue}) - p(\text{“yes”} \mid \neg\text{cue}). \qquad (3)$$
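Equation 3 can be computed directly from a participant's trial-by-trial responses. The sketch below assumes that each trial is stored as a pair of booleans (cue present, predicted "yes"); the function name and the example responses are hypothetical.

```python
def delta_p_pred(trials):
    """Compute Equation 3 from a list of (cue_present, predicted_yes) pairs."""
    yes_with_cue = [pred for cue, pred in trials if cue]
    yes_without_cue = [pred for cue, pred in trials if not cue]
    if not yes_with_cue or not yes_without_cue:
        raise ValueError("need at least one cue-present and one cue-absent trial")
    p_yes_cue = sum(yes_with_cue) / len(yes_with_cue)
    p_yes_no_cue = sum(yes_without_cue) / len(yes_without_cue)
    return p_yes_cue - p_yes_no_cue


# Hypothetical participant: "yes" on 3 of 4 cue-present trials
# and on 1 of 4 cue-absent trials.
responses = [(True, True), (True, True), (True, True), (True, False),
             (False, True), (False, False), (False, False), (False, False)]
print(delta_p_pred(responses))  # 0.75 - 0.25 = 0.5
```

Only the participant's predictions enter the computation; the outcomes that actually occurred are not used.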

Note that this index is based on the same logic that underlies the computation of Δp in Equation 1, only that the real occurrence of the outcome is replaced by the outcome predictions made by the participant (see Collins & Shanks, 2002). Therefore, Δp_pred does not measure the objective contingency between cue and outcome, but it aims at measuring the subjective contingency that the participant perceives, as revealed by the trial-by-trial predictions made during training.

When using Δp_pred as their dependent variable, Allan et al. (2005) found that this measure was sensitive to cue-outcome contingency. However, it was absolutely unaffected by manipulations of the probability of the outcome. This result led them to conclude that there must be a stage of processing in which the contingency between cue and outcome or the target conditional probabilities have already been encoded but the outcome-density bias is still absent. Then, Δp_pred would provide an insight into this basic mechanism that encodes contingency in a format that is not yet influenced by the outcome-density bias. Given that the numerical judgments are influenced by both contingency and outcome-probability, this must mean that judgments are affected not only by the original encoding of information (sensitive to contingency but free from bias) but also by processes that take place after encoding (see Figure 3). In other words, the outcome-density bias is not due to learning or encoding, but to more sophisticated processes related to judgment and decision making. Although Allan et al. (2005) explored only the outcome-density bias, they suggested that a similar approach might explain the complementary cue-density bias as well.

The results of Allan et al. (2005) and their interpretation are certainly appealing. If they proved to be reliable, they would pose insurmountable problems for single-process models that aim to explain cue- and outcome-density biases as learning effects. But how strong is this evidence? To answer this question, we decided to reanalyze data from our own laboratory using the strategy followed by Allan et al.
Specifically, we reanalyzed data from nine experimental conditions exploring the cue-density bias (originally published in Blanco et al., 2013; Matute et al., 2011; Vadillo et al., 2011; Yarritu & Matute, 2015; Yarritu, Matute, & Vadillo, 2014) and three experimental conditions exploring the outcome-density bias (originally published in Musca et al., 2010; Vadillo, Miller, & Matute, 2005). All these experiments were conducted using the standard experimental paradigm outlined above. In the original reports of those experiments we only analyzed the judgments that participants reported at the end of the training phase. However, we also collected trial-by-trial predictive responses to maintain participants’ attention and to make sure that they were following the experiment. These responses can be used to compute the Δp_pred index using Equation 3. This allows us to compare the size of the bias observed in the Δp_pred scores with the size of the bias that we observed in judgments. On the basis of the results of Allan et al., one should expect a dissociation between these two measures. More specifically, cue- and outcome-density biases should have an effect on judgments, but not on Δp_pred.

In the following analyses, we included data from 848 participants tested in 12 conditions included in the articles mentioned in the previous paragraph.¹ Figure 4 plots the effect size (Cohen’s d) of density biases on the Δp_pred index against the effect size of the same manipulation on judgments collected at the end of the experiment. As can be seen, overall, most experiments found clear evidence for density biases in Δp_pred. A random-effects meta-analysis yielded a statistically significant effect of size d = 0.49, 95% CI [0.08, 0.90], z = 2.37, p = .018. Therefore, overall our results do not replicate the original findings of Allan et al. (2005): Cue and outcome density do have an effect on Δp_pred scores. However, the confidence intervals plotted in Figure 4 show that the effect of density bias did not reach statistical significance in some experimental conditions. In two cases, the effect was even negative. A closer look at the data shows that on the rare occasions when biases were not observed in Δp_pred, they also tended to be absent or smaller than usual in judgments. A meta-regression confirmed that the effect size of biases on Δp_pred was moderated by the effect size of biases on judgments, Q(1) = 5.54, p = .019. These analyses suggest that cue- and outcome-density biases can be observed in Δp_pred and that, when they are absent, it is not due to a dissociation between judgments and Δp_pred, but to some other factor that affects both measures.

To show that these results are robust, we also analyzed the individual data of the participants tested in all the experimental conditions shown in Figure 4. Across all conditions, the cue-/outcome-density manipulation had an effect on Δp_pred, t(846) = 6.20, p < .001, d = 0.43. This result also held when data from experiments exploring cue- and outcome-density biases were analyzed separately: t(659) = 4.97, p < .001, d = 0.37, and t(185) = 4.01, p < .001, d = 0.59, respectively.

Figure 4. Scatterplot of effect sizes (Cohen’s d) of the cue- and outcome-density manipulations on the Δp_pred index and on judgments in 12 experimental conditions. Error bars denote 95% confidence intervals.

¹ Blanco et al. (2013) reported two experiments, each of them including two conditions where the effect of the cue-density manipulation was tested. Cue-density was also manipulated in Matute et al. (2011), Vadillo et al. (2011), and Yarritu and Matute (2015, Experiment 2). The latter contributed to our analyses with two experimental conditions. Yarritu et al. (2014) reported two experiments manipulating cue density, but due to their design requirements, trial-by-trial predictions were only requested in the yoked condition of Experiment 1. In that condition the probability of the cue could actually adopt any value from 0 to 1. Following the original data analysis strategy of Yarritu et al. (2014), we categorized participants in the “low probability of the cue” condition if they belonged to the one third of the sample with the lowest probability of the cue, and we categorized them in the “high probability of the cue” condition if they belonged to the one third of the sample with the highest probability of the cue. As mentioned above, we also included in our analyses three experimental conditions exploring the outcome-density bias. One of them was originally reported by Vadillo et al. (2005, Experiment 3, Group 0.50–0.00 vs. Group 1.00–0.50) and two were reported by Musca et al. (2010). In most of these experiments participants were asked to provide only one judgment at the end of training. However, in Matute et al. (2011) and Vadillo et al. (2005, 2011) they were asked to provide several judgments. In the case of Matute et al. (2011) and Vadillo et al. (2011), we included in the analyses the judgment that yielded the stronger cue-probability bias in each experiment. If anything, selecting judgments that show strong biases should make it easier to observe any potential dissociation between judgments and trial-by-trial predictions. In the case of Vadillo et al. (2005) the largest effect of outcome-density was observed for prediction judgments, but this cannot be considered a bias (because it is normatively appropriate to expect the outcome to happen when its probability is very large; see De Houwer, Vandorpe, & Beckers, 2007). Because of that, in this case we analyzed their predictive-value judgments.
Not surprisingly, the two dependent measures, judgments and Δp_pred, were significantly correlated, r = .27, p < .001, and this correlation remained significant when data from cue- and outcome-density biases were analyzed independently: r = .28, p < .001, and r = .24, p < .001, respectively. Overall the data shown in Figure 4 and these additional analyses are inconsistent with the hypothesis that trial-by-trial predictions are unbiased by the probability of the cue/outcome or that radically different results are observed with judgments and trial-by-trial predictions. Thus, there is no need to postulate different mechanisms to account for judgments and for trial-by-trial predictions.

This being said, Figure 4 also reveals that some of our studies failed to find a significant effect of the cue/outcome density manipulation on Δp_pred. To some extent, this feature of our results can be considered a replication of Allan et al. (2005). But as seen in Figure 4 and in the previous analyses, this can hardly be considered strong evidence for a dissociation between judgments and Δp_pred. The fact that sometimes Δp_pred fails to yield significant results may be due to its reduced reliability compared to judgments. There are many situations where the use of unreliable or insensitive measures can produce patterns of results that look like dissociations but do not require a dual-process account (Shanks & St. John, 1994; Vadillo, Konstantinidis, & Shanks, 2016). Consistent with this interpretation, there are good reasons why Δp_pred might be an imperfect index of contingency learning, as we discuss in subsequent sections.

Signal Detection Theory Analyses of Trial-By-Trial Predictions

Perales et al. (2005, Experiment 1) found a dissociation strikingly similar to the one reported by Allan et al. (2005). In the study of Perales et al., two groups of participants were exposed to several contingency-learning problems where they had to learn the relationship between the activation of a fictitious minefield and the explosion of enemy tanks. In each trial, participants were first presented with information about whether the minefield was active and were asked to predict by means of a yes/no response whether or not they thought that the tanks would explode in that trial. After entering their predictions, they were given feedback and they proceeded to the next trial. For one of the groups, the contingency between the minefield and the explosions was always positive, Δp = .50, while the contingency was always zero for all the problems presented to the other group of participants. Within participants, the probability of the cue was manipulated with two levels, high (.75) and low (.25). At the end of each problem, participants were asked to rate the strength of the causal relation between the activation of the minefield and the explosion of tanks.

Consistent with previous reports, Perales et al. found that these numerical judgments were sensitive not only to the contingency manipulation, but also to the cue-density manipulation. That is to say, for a specific level of contingency, judgments tended to vary with the probability of the cue, replicating the well-known cue-density bias. However, as in the case of Allan et al., Perales et al. found that other dependent measures, also computed from participants’ trial-by-trial responses, were sensitive only to contingency and were largely immune to the cue-density bias.

Unlike Allan et al. (2005), Perales et al. (2005, Experiment 1) did not convert trial-by-trial predictions to Δp_pred but, instead, they computed two alternative measures inspired by Signal Detection Theory (SDT) analyses. One of them, the criterion for responding, was a measure of participants’ overall tendency to predict that the outcome will occur. The second one, d′, was the discriminability index of SDT analyses and aimed at measuring participants’ ability to discriminate when the outcome was more likely to appear and when it was less likely to appear. For reasons that will become obvious later, only this second measure is relevant for the present discussion. To compute the d′ index, Perales et al. registered the “hit rate” of each participant (i.e., the proportion of trials in which they correctly predicted that the outcome would occur, among all trials where the outcome was present) and their “false alarm rate” (i.e., the proportion of trials in which they incorrectly predicted the outcome, among all trials where the outcome was absent). Based on these two measures, d′ can be easily computed as

$$d' = z(\text{hit rate}) - z(\text{false alarm rate}), \qquad (4)$$

where z is the inverse of the standard normal cumulative distribution function. Crucially, this equation ignores completely whether participants made their predictions on cue-present or on cue-absent trials. The only important thing is whether they correctly predicted the outcome when it was going to happen and whether they incorrectly predicted the outcome when it was not going to happen. In other words, the d′ index measures to what extent participants are good at discriminating when the outcome will be presented and when it will not. The rationale for using this index as a measure of participants’ sensitivity to contingency is that, in principle, if participants have learned the correct cue-outcome contingency, they should be able to make more accurate predictions, and this should yield a higher d′.

The key finding of Perales et al. (2005, Experiment 1) was that participants’ d′ scores turned out to be sensitive just to the contingency manipulation, but not to the cue-density manipulation. This parallels Allan et al.’s (2005) finding that Δp_pred was affected by manipulations of contingency, but not by manipulations of the probability of the outcome. Taken collectively, both experiments converged on the same idea: There are some dependent measures that reflect that participants have learned the cue-outcome contingency, but which nevertheless show no trace of cue- or outcome-density bias. This stands in stark contrast with the patterns of results found in numerical judgments, which are sensitive to both contingency and density biases. Perales et al. discussed several dual-process accounts that could explain these dissociations. Although differing in the detail, all of them dovetailed with the idea that there is a basic encoding mechanism that tracks cue-outcome contingency in a format free from any cue- or outcome-density bias. The d′ index would be a direct measure of this unbiased learning process.
The density biases observed in judgments must then be attributed to other mechanisms that intervene in later stages of processing. This account fits well with the general framework outlined in Figure 3.
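As a concrete illustration of Equation 4, the following sketch computes the discriminability index from the four response counts of a participant. The log-linear correction (adding 0.5 to each count) is an assumption of this sketch that keeps the rates away from 0 and 1, where z is undefined; it is not necessarily the procedure followed by Perales et al. (2005). The function name and the example counts are hypothetical.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Discriminability index: z(hit rate) - z(false alarm rate)."""
    # Log-linear correction (an assumption of this sketch) avoids rates of exactly 0 or 1.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts. Note that the function never receives information
# about whether the cue was present on a given trial.
print(d_prime(hits=15, misses=5, false_alarms=5, correct_rejections=15))
print(d_prime(hits=10, misses=10, false_alarms=10, correct_rejections=10))
```

The second call returns zero because the hit rate equals the false alarm rate, whatever the participant may believe about the cue.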


However, a closer inspection of the results reported by Perales et al. (2005, Experiment 1) suggests that alternative interpretations are possible. A first striking feature of the results is that the manipulation of the probability of the cue did in fact seem to have an effect on d′, although this effect was only marginally significant (p = .09, η_p² = .067). The authors argued that this effect was “far too small to account for the significant effect of cause-density [i.e., cue-density] on judgments” (p. 1109). However, this argument is only valid if one assumes that the validity, reliability, and sensitivity of d′ as a dependent measure are comparable to those of judgments. If d′ turned out to be less sensitive, then the smaller effect size of the cue-density effect found on d′ would be very poor evidence for a dissociation.

Are there any reasons to suspect that d′ is not a sensitive measure for cue- or outcome-density biases? We think so. The problems of d′ as a measure of learning are particularly obvious in the two null-contingency conditions of Perales et al. (2005, Experiment 1). In those conditions, the probability of the outcome was always .50, no matter whether the cue was present or not. In this situation, there was nothing participants could do to predict the outcome successfully. The outcome and its absence were equally likely and the cue did not offer any information to make the outcome more predictable. Furthermore, this was the case in both the high cue-density and the low cue-density conditions: The probability of the cue was higher in one condition than in the other, but this did not change the fact that the outcome was equally unpredictable in both conditions. Therefore, it is not surprising that participants in both conditions got d′ values indistinguishable from zero. Note that this does not mean that participants had the same perception of contingency in those two conditions.
It only means that in those particular conditions, nothing participants learn can help them make better outcome predictions as measured by d0 . But crucially, if all participants produce a d0 of zero regardless of their predictions, then their score cannot be used as a measure of their perception of contingency. A computer simulation provides a simple means to illustrate the problems of d0 as a measure of contingency learning. In the following simulation we computed d0 for a large number of simulated participants exposed to the same contingencies used in Perales et al. (2005, Experiment 1). The labels PC-HD, PC-LD, NC-HD, and NC-LD shown in Figure 5 refer to the four experimental conditions tested by Perales et al. These labels denote whether contingency was positive (PC) or null (NC) and whether the density of the cue was high (HD) or low (LD). For each experimental condition, we computed how many correct and incorrect responses a simulated participant would get. Then, we used

Experimental Psychology 2016; Vol. 63(1):3–19

M. A. Vadillo et al., Biased Contingency Detection

Figure 5. Simulated d0 scores of participants with different response strategies in the four experimental conditions included in Perales et al. (2005, Experiment 1).

this information to compute the d0 of that simulated participant using Equation 4.2 For each condition, we simulated 1,000 participants with nine different response distributions. The rationale for simulating a wide variety of response strategies is that, if d0 is a valid measure of learning, this index should adopt different values when participants behave differently. Imagine that one participant learns that there is positive contingency between the cue and the outcome. This participant should say “yes” very frequently when asked whether the outcome will follow the cue and he/she should say “no” very frequently when asked whether the outcome will appear in a trial in which the cue was absent. Now imagine a second participant who learns that there is no contingency between cue and outcome. This participant will be just as likely to predict the outcome in cue-present and in cue-absent trials. If d0 is a good measure of contingency learning, these two participants should get different d0 scores. In contrast, if participants who act on the basis of different beliefs about the cue-outcome contingency obtain the same d0 score, this would imply that d0 is not a valid measure of contingency detection. To minimize the impact of sampling error on the exact number of hits and false alarms, we run 1,000 simulations for each combination of experimental condition and response distribution. Each of the nine series of data shown in Figure 5 refers to a different response distribution (e.g., 25/75, 50/50, . . .). The first number (25, 50, or 75) refers to the probability of predicting the outcome in the presence of the cue and the second number (also 25, 50, or 75) refers to the probability of predicting the outcome in the absence of the cue.

For instance, a simulated participant with response strategy 75/25 would predict the outcome with probability .75 if the cue was present and with probability .25 if the cue was absent (which is consistent with the belief in a positive contingency). Similarly, a simulated participant with response strategy 75/75 would predict the outcome with probability .75 regardless of whether the cue is present or absent (which is consistent with the belief in a null contingency). Consistent with our previous discussion, Figure 5 shows that all simulated participants got almost identical d′ scores in the null-contingency conditions (right-hand half of the figure). In other words, the d′ scores obtained by those participants reveal absolutely nothing about their pattern of performance. In these null-contingency conditions, a participant who acted as if there were a positive cue-outcome contingency (e.g., 75/25) would receive virtually the same score as a participant who acted as if contingency were negative (e.g., 25/75). Similarly, a participant who was very prone to predicting the outcome (e.g., 75/75) and a participant who was very reluctant to predict the outcome (e.g., 25/25) would obtain similar d′ scores. These predictions do not differ across conditions with different cue-densities. This confirms our suspicion that the d′ scores obtained by Perales et al. (2005, Experiment 1) in the NC conditions tell us nothing about the participants’ performance, let alone about their beliefs regarding the cue-outcome contingency or their outcome expectancies.

In contrast, the left-hand side of Figure 5 shows that d′ can be a sensitive measure of the perception of contingency in the positive contingency (PC) conditions tested by Perales et al. In these conditions, different patterns of responding do give rise to different d′ scores, suggesting that d′ can reveal something about participants’ beliefs or about their response strategies. Therefore, it is only in these conditions with positive contingencies that one can expect to measure differences in performance with d′. It is interesting to note that visual inspection of the data reported by Perales et al. (2005) confirms that the trend towards a cue-density bias was stronger in the PC condition than in the NC condition. When one takes into account that one half of the experiment (the two NC conditions) is affected by a methodological artifact, it becomes less surprising that the cue-density manipulation only had a marginally significant effect on d′.

² In our simulations, occasionally the hit or the false-alarm rates had values of 0 or 1. The z function for these values yields −∞ and +∞, respectively. To avoid this problem, we followed the correction suggested by Snodgrass and Corwin (1988). Specifically, in the computation of the hit rate, we added 0.001 to the number of hits and 0.002 to the number of outcome-present trials. The same correction was used in the computation of the false-alarm rate. This correction only makes a minimal difference in the value of d′, except when either the hit or the false-alarm rates have extreme (0 or 1) values.
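The logic of the simulation described above can be illustrated with a short Python sketch. It is not the original simulation code. It assumes that Equation 4 is the standard signal-detection definition, d′ = z(hit rate) − z(false-alarm rate), where a hit is an outcome prediction on an outcome-present trial. Only the null-contingency outcome probability (.50) is taken from the text. The positive-contingency probabilities, the cue densities, the number of trials, and the number of simulated participants are hypothetical values chosen for illustration, and all function names are our own.

```python
import random
from statistics import NormalDist, mean

z = NormalDist().inv_cdf  # inverse standard-normal CDF (the "z function")

# P(outcome | cue present), P(outcome | cue absent).
# The NC values come from the text; the PC values are ASSUMED for illustration.
CONTINGENCY = {"PC": (0.75, 0.25), "NC": (0.50, 0.50)}
# P(cue) in the high- and low-density conditions (ASSUMED for illustration).
CUE_DENSITY = {"HD": 0.75, "LD": 0.25}


def d_prime(hits, n_outcome, false_alarms, n_no_outcome):
    """d' = z(hit rate) - z(false-alarm rate), with the small additive
    correction described in the footnote (0.001 and 0.002)."""
    hit_rate = (hits + 0.001) / (n_outcome + 0.002)
    fa_rate = (false_alarms + 0.001) / (n_no_outcome + 0.002)
    return z(hit_rate) - z(fa_rate)


def simulate_participant(rng, condition, strategy, n_trials=100):
    """Simulate one participant and return his/her d' score.

    condition: e.g. "NC-HD"; strategy: (P(predict | cue), P(predict | no cue)).
    n_trials is a hypothetical value, not taken from the original study.
    """
    contingency, density = condition.split("-")
    p_out_cue, p_out_nocue = CONTINGENCY[contingency]
    p_cue = CUE_DENSITY[density]
    resp_cue, resp_nocue = strategy
    hits = false_alarms = n_outcome = n_no_outcome = 0
    for _ in range(n_trials):
        cue = rng.random() < p_cue
        outcome = rng.random() < (p_out_cue if cue else p_out_nocue)
        predicted = rng.random() < (resp_cue if cue else resp_nocue)
        if outcome:
            n_outcome += 1
            hits += predicted          # outcome predicted and present
        else:
            n_no_outcome += 1
            false_alarms += predicted  # outcome predicted but absent
    return d_prime(hits, n_outcome, false_alarms, n_no_outcome)


def mean_d_prime(condition, strategy, n_participants=200, seed=1):
    """Average d' over simulated participants (the text used 1,000)."""
    rng = random.Random(seed)
    return mean(simulate_participant(rng, condition, strategy)
                for _ in range(n_participants))


if __name__ == "__main__":
    strategies = [(c, a) for c in (0.25, 0.50, 0.75) for a in (0.25, 0.50, 0.75)]
    for condition in ("PC-HD", "PC-LD", "NC-HD", "NC-LD"):
        for resp_cue, resp_nocue in strategies:
            score = mean_d_prime(condition, (resp_cue, resp_nocue))
            label = f"{int(resp_cue * 100)}/{int(resp_nocue * 100)}"
            print(f"{condition} {label}: mean d' = {score:+.2f}")
```

Under these assumptions, the cue and the outcome are statistically independent in the NC conditions, so the expected hit and false-alarm rates coincide for every response strategy and the mean d′ stays close to zero. In the PC conditions the two rates diverge, and strategies such as 75/25 and 25/75 yield scores of opposite sign.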

Different Processes or Different Strategies?

This being said, we note that in the PC conditions the cue-density effect observed in d′ still looks relatively small, compared to the large effect found in participants’ judgments. Going back to the reanalysis of our own studies presented in the previous section, we also found there that, on some occasions, cue/outcome density manipulations seemed to have a stronger effect on judgments than on Δppred. Based on this evidence, it appears that, in general, all the dependent measures computed from trial-by-trial predictions (either Δppred or d′) are less sensitive than numerical judgments. Is there any reason why these dependent measures should be less reliable? As we will show below, we suspect that the data reported by Allan et al. (2005) and Perales et al. (2005) provide an interesting insight into this question.

Imagine that two participants, A and B, have been exposed to exactly the same sequence of trials and that, as a result, they end up having the same beliefs about the relationship between a cue and an outcome. For instance, imagine that both of them have learned that the probability of the outcome given the cue is .75 and that the probability of the outcome given the absence of the cue is .25. This means that both of them would believe, implicitly or explicitly, that there is a moderate positive contingency between cue and outcome (i.e., that the outcome is more likely to appear in the presence of the cue than in its absence). Now, let us assume that both participants are presented again with a series of trials where the cue is present or absent and they are asked to predict whether or not the outcome will be presented in each trial. Participant A might consider that, because the probability of the outcome given the cue is .75, he should predict the outcome in roughly 75% of the trials where the cue is present. And, similarly, because the probability of the outcome given the absence of the cue is .25, he predicts the outcome in approximately 25% of the trials where the cue is absent. The behavior of this participant would show what researchers call “probability matching” (Nies, 1962; Shanks, Tunney, & McCarthy, 2002; Tversky & Edwards, 1966), that is, his predictions would match the probabilities seen (or perceived) in the environment.

Now imagine that Participant B is asked to do the same task. But Participant B has a different goal in mind: He wants to be right as many times as possible. If the outcome appears 75% of the time when the cue is present, then predicting the outcome on 75% of the trials is not a perfect strategy. If he did that, on average, he would correctly predict the outcome on only 56.25% of the cue-present trials (i.e., .75 × .75); even counting the trials on which he correctly predicts its absence (.25 × .25), he would be right on just 62.5% of them. In contrast, if he always predicts the outcome when the cue is present, he will be correct 75% of the time (i.e., 1.00 × .75). If he wants to maximize the number of correct outcome predictions, this is a much more rational strategy. Following the same logic, if the probability of the outcome in the absence of the cue is .25, it makes sense to always predict the absence of the outcome. Doing that will allow him to be correct on 75% of the trials. Our point is that if a participant wants to maximize the number of correct predictions, he will predict the outcome whenever he thinks that its probability is higher than .50 and he will predict the absence of the outcome in any other case. Not surprisingly, research with human and nonhuman animals shows that maximization is a typical response strategy in many situations (e.g., Unturbe & Corominas, 2007). Interestingly, although both participants, A and B, base their responses on the same “beliefs,” their behavior is radically different because they pursue different goals. This has important implications for our review of the results reported by Allan et al. (2005) and Perales et al. (2005).
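The arithmetic behind the comparison of Participants A and B can be checked with a few lines of Python. This is only an illustrative sketch: the function names are our own, and it assumes that, on cue-present trials, each prediction is made independently of whether the outcome actually occurs.

```python
def expected_hits(p_outcome, p_predict):
    """Expected proportion of trials on which the outcome is predicted
    AND actually occurs."""
    return p_outcome * p_predict


def expected_accuracy(p_outcome, p_predict):
    """Expected proportion of trials with a correct prediction, counting
    both correct 'outcome' and correct 'no outcome' predictions."""
    return p_outcome * p_predict + (1 - p_outcome) * (1 - p_predict)


if __name__ == "__main__":
    p_outcome_given_cue = 0.75
    participants = (("A (probability matching)", 0.75),
                    ("B (maximization)", 1.00))
    for label, p_predict in participants:
        hits = expected_hits(p_outcome_given_cue, p_predict)
        accuracy = expected_accuracy(p_outcome_given_cue, p_predict)
        print(f"Participant {label}: correct outcome predictions = {hits:.4f}, "
              f"overall accuracy = {accuracy:.4f}")
```

Whichever of the two measures is used, always predicting the more likely event does at least as well as matching, which is why maximization is the better strategy for a participant whose only goal is to be right.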
If trial-by-trial predictions depend not only on the perceived contingency but also on response strategies like probability matching or maximization, then any dependent variable computed from them (like Δppred or d′) will be necessarily noisy and unreliable. This is particularly problematic when many participants rely on a maximization strategy. For instance, if we computed Δppred for Participant A in our previous example, that would yield an approximate value of .50, which reflects quite well his beliefs about the contingency between cue and outcome. However, if we computed Δppred for Participant B, this would yield a value of 1.00, which is a gross overestimation of his true beliefs. These two participants would also receive different d′ scores.

In the case of Perales et al. (2005, Experiment 1), there is clear evidence of probability maximization. The data reported in the Appendix of Perales et al. confirm that participants in the PC group predicted the outcome in around 92–93% of cue-present trials and in 12–13% of cue-absent trials. In the case of Allan et al. (2005) there is no obvious evidence for maximization in their noncontingent condition, but there appears to be such a trend in the contingent condition. Their Figure 5 suggests that although the actual probability of the outcome given the cue varied from .567 to .900, participants predicted the outcome in 70–90% of cue-present trials. Similarly, although the probability of the outcome in the absence of the cue ranged from .100 to .667, participants predicted the outcome in 10–30% of cue-absent trials. This pattern is perhaps less extreme than the one found in Perales et al. (2005), but it does nevertheless suggest that many of their participants must have used a maximization strategy. In either case, this strategy makes Δppred and d′ less sensitive to any manipulation. The lack of sensitivity might explain why they failed to find any effect of cue and outcome density on these dependent variables.

It is interesting to note that participants are more likely to become “maximizers” in relatively long experiments, which provide more opportunities to develop optimal response strategies (Shanks et al., 2002). This might explain the divergence between the results obtained by Allan et al. (2005) and Perales et al. (2005) and those of our own experiments. In their experiments, cue and outcome density were manipulated within participants. To accomplish this, all participants had to complete the contingency learning task several times. In contrast, in our experiments all the manipulations were conducted between groups, reducing substantially the length of the experiment and, consequently, the opportunities to develop sophisticated response strategies like maximization.
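Returning to the example of Participants A and B, the following Python sketch illustrates why a maximization strategy makes Δppred insensitive to what participants actually believe. It takes Δppred to be the difference between the probability of predicting the outcome on cue-present trials and on cue-absent trials. The function names and the second set of beliefs (.60 and .40) are hypothetical and serve only as an illustration.

```python
def prediction_probability(believed_p, strategy):
    """Probability of predicting the outcome, given the believed probability
    of the outcome and the response strategy."""
    if strategy == "matching":
        return believed_p
    if strategy == "maximizing":
        # Predict the outcome only when it is believed to be more likely than not.
        return 1.0 if believed_p > 0.5 else 0.0
    raise ValueError(f"unknown strategy: {strategy}")


def delta_p_pred(p_out_cue, p_out_nocue, strategy):
    """Expected Δppred: P(predict outcome | cue) - P(predict outcome | no cue)."""
    return (prediction_probability(p_out_cue, strategy)
            - prediction_probability(p_out_nocue, strategy))


if __name__ == "__main__":
    # (believed P(outcome | cue), believed P(outcome | no cue))
    beliefs = ((0.75, 0.25),   # beliefs of Participants A and B in the text
               (0.60, 0.40))   # weaker perceived contingency (hypothetical)
    for p_cue, p_nocue in beliefs:
        for strategy in ("matching", "maximizing"):
            value = delta_p_pred(p_cue, p_nocue, strategy)
            print(f"beliefs {p_cue:.2f}/{p_nocue:.2f}, {strategy}: "
                  f"Δppred = {value:.2f}")
```

For a probability matcher, Δppred tracks the perceived contingency (.50 versus .20 in this sketch). For a maximizer it equals 1.00 in both cases, so a manipulation that changed perceived contingency from one set of beliefs to the other would leave the measure untouched.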

Illusory Correlations in the Implicit Association Test

Dissociations between judgments and trial-by-trial predictions are not the only piece of evidence in favor of dual-process models of biased contingency detection. This approach received convergent support from a recent study by Ratliff and Nosek (2010) that found a similar dissociation between different measures of illusory correlations in stereotype formation, suggesting that two or more processes might also be involved in this effect. As explained in previous sections, most experiments on illusory correlations in stereotype formation rely on a fairly standard procedure (Hamilton & Gifford, 1976). Participants are presented with positive and negative traits of members of two different social groups on a trial-by-trial basis. Crucially, there are more members of one group (majority) than of the other (minority) and, regardless of group, most of the members show positive traits. Although the proportion of positive and negative traits is identical for the majority and the minority groups, people tend to make more positive evaluations of the majority group when asked to judge both groups at the end of training. Illusory correlations in stereotypes and cue/outcome density biases have been explored in quite different literatures, but both effects are clearly related and can be explained by the same or very similar models (Murphy et al., 2011; Sherman et al., 2009; Van Rooy et al., 2003).

Illusory correlations are typically assessed by means of numerical ratings (similar to judgments in contingency-detection experiments) or by asking participants to recall which positive or negative traits were observed in the majority or the minority group. However, Ratliff and Nosek (2010) wondered whether the same illusory-correlation effects would be found in an alternative test that is supposed to provide a cleaner measure of the underlying attitudes of participants: the Implicit Association Test (IAT; Greenwald, McGhee, & Schwartz, 1998). Unlike traditional questionnaires, the IAT is a reaction-time test that is traditionally assumed to measure implicit attitudes with little interference from higher-order cognitive processes (De Houwer, Teige-Mocigemba, Spruyt, & Moors, 2009; Gawronski, LeBel, & Peters, 2007; Nosek, Hawkins, & Frazier, 2011). Following this idea, if illusory correlations require the operation of reasoning or inferential processes, then they should not be observed in the IAT. In contrast, if only very elemental associative processes are responsible for illusory correlations, then the IAT should be able to detect them. Ratliff and Nosek (2010) found the standard illusory-correlation effect in the responses to the explicit questionnaire. However, there was no hint of the effect in the IAT scores in either of their two experiments.
Most importantly, the absence of effects cannot be attributed to the lack of validity of the IAT: The IAT was sensitive to the valence of the majority and the minority groups when there was a correlation between membership in one of them and the positive or negative personality traits. Nor can it be attributed to a lack of statistical power, given that the results were replicated in an online experiment with almost 900 participants.

Ratliff and Nosek (2010) interpreted these results in terms of a dual-process model (Gawronski & Bodenhausen, 2006) surprisingly similar to that invoked by Allan et al. (2005) and Perales et al. (2005) in the domain of contingency learning. According to them, the IAT would be only sensitive to the original encoding of associations in memory. From this point of view, the fact that performance in the IAT was unaffected by the illusory-correlation manipulation indicates that participants in their experiment correctly learned that there was no correlation between belonging to one group or the other and having positive or negative personality traits. Therefore, the illusory-correlation effect observed in explicit judgments must have been due to additional higher-order cognitive processes that took place at a later stage, and not to the original learning mechanism responsible for the initial encoding of the information (see Figure 3). As in the case of the results reported by Allan et al. and Perales et al., these results pose problems for any single-process model that assumes that illusory correlations are the product of the same mechanisms responsible for the detection of contingency. If a single process were responsible for both sensitivity to contingency and for density biases, why should an IAT be sensitive to one of these manipulations (contingency) but not to the other (density)?

Before drawing any conclusion, it is worth reviewing all the available evidence regarding this dissociation. Until recently, the study conducted by Ratliff and Nosek (2010) was the only experimental work that had tried to detect illusory correlations with the IAT. However, the latest attempt to replicate this result using a similar methodology has failed to find any dissociation between explicit measures and the IAT. Using a procedure very similar to that of Ratliff and Nosek, Carraro, Negri, Castelli, and Pastore (2014) did find an illusory correlation on the IAT, showing that the original dissociation was either not reliable or, more likely, not generalizable to similar but not identical conditions. In a similar vein, a recent experiment conducted in our laboratory (Vadillo, De Houwer, De Schryver, Ortega-Castro, & Matute, 2013) found an outcome-density effect using the IAT. The divergences with Ratliff and Nosek are less surprising in this case, because Vadillo et al. used a radically different design and procedure. However, this discrepancy converges on the idea that the failure of the IAT to detect illusory correlations might not be a generalizable result.

Figure 6. Forest plot of a meta-analysis exploring the results of the experiments that have measured illusory correlation effects in the IAT. Error bars denote 95% confidence intervals.

To better illustrate the results found with the IAT, Figure 6 depicts a forest plot with the divergent results of these studies.³ As can be seen, the only firm conclusion that can be drawn on the basis of this evidence is that the results are strikingly variable. In fact, the meta-analysis of these studies yielded an unusually large heterogeneity, Q(4) = 134.18, p < .001. Even the replication that Carraro et al. (2014) reported in their general discussion yielded results notably distant from those of their main study, although both of them were statistically significant. This variability suggests that whether or not illusory correlations are observed in the IAT probably depends on a number of moderators that remain unknown. As shown in Figure 6, the 95% confidence interval of the random-effects model includes zero. An advocate of dual-process models might claim that this null result of the meta-analysis supports the claim that illusory correlations are not found in implicit measures like the IAT. However, the confidence interval does not just include zero: It extends over a large number of positive effect sizes. On the basis of the collective evidence, any value from −0.13 to a massive 1.16 could be an accurate estimate of the average Cohen’s d. We doubt that this evidence is clear or robust enough to abandon single-process models of illusory correlations, which offer a simple and parsimonious explanation for a large body of data (Allan, 1993; Fiedler, 2000; López et al., 1998; Shanks, 1995). This is even more the case if we keep in mind that the only converging evidence, from Allan et al. (2005) and Perales et al. (2005), is open to criticism.
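For readers unfamiliar with the quantities reported above, the following Python sketch shows how a heterogeneity statistic Q and a random-effects confidence interval can be obtained from a set of effect sizes. It is not a reproduction of the reported analysis: it uses the simple DerSimonian-Laird estimator of between-study variance, which may differ from the estimator used in the original meta-analysis, and the effect sizes and sampling variances in the example are invented placeholders, not the values of the studies reviewed here. The function name is also our own.

```python
import math
from statistics import NormalDist


def random_effects_meta(effects, variances, level=0.95):
    """DerSimonian-Laird random-effects meta-analysis.

    effects: study effect sizes (e.g., Cohen's d).
    variances: their sampling variances (all > 0).
    Returns Q, its degrees of freedom, tau^2, the pooled estimate and its CI.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights, pooled estimate, and confidence interval.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return {"q": q, "df": k - 1, "tau2": tau2, "pooled": pooled,
            "ci_low": pooled - z_crit * se, "ci_high": pooled + z_crit * se}


if __name__ == "__main__":
    # INVENTED placeholder values, for illustration only.
    example_d = [-0.05, 0.10, 0.90, 1.40, 0.45]
    example_var = [0.01, 0.02, 0.04, 0.05, 0.03]
    result = random_effects_meta(example_d, example_var)
    print(f"Q({result['df']}) = {result['q']:.2f}, tau^2 = {result['tau2']:.3f}")
    print(f"pooled d = {result['pooled']:.2f}, 95% CI "
          f"[{result['ci_low']:.2f}, {result['ci_high']:.2f}]")
```

When the study estimates disagree strongly, Q greatly exceeds its degrees of freedom, the estimated between-study variance grows, and the confidence interval of the pooled effect widens accordingly. This is the pattern described in the text: a wide interval that includes zero is uninformative about the average effect rather than being evidence for a null effect.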

³ All the effect sizes included in the meta-analysis were computed from the t-values reported in the studies following the equations suggested by Lakens (2013), even when this resulted in d values slightly different from those reported by the authors. The random-effects meta-analysis was conducted using the “metafor” R package (Viechtbauer, 2010).

General Discussion

In the previous sections we have reviewed the studies that have found dissociations in cue- and outcome-density effects across dependent variables (Allan et al., 2005; Perales et al., 2005; Ratliff & Nosek, 2010). A common result of these experiments is that there are some dependent measures that only show sensitivity to contingency (e.g., Δppred, d′, or IAT scores), while other dependent measures (e.g., contingency judgments) show sensitivity to both contingency and cue/outcome density. On the basis of this evidence, it has been suggested that two separate mechanisms are needed to explain (1) why people are able to learn the correct cue-outcome contingencies and (2) why their judgments are influenced not only by contingency but also by the overall probabilities of the cue and the outcome.

However, on closer inspection, it appears that this evidence might not be reliable enough to justify this theoretical interpretation. A review of the available evidence from our laboratory and from other research groups shows that some of these results do not seem to be replicable or do not generalize easily to similar experimental settings. In our experiments, cue- and outcome-density manipulations seem to have a significant impact on all dependent measures. Similarly, our simulations show that some of the dissociations previously reported in the literature might be due to simple methodological artifacts resulting from aspects of the design and the computation of the dependent variable.

Given the lack of strong evidence in favor of these dual-process accounts, we think that it is premature to abandon the idea that sensitivity to contingency and density biases are both attributable to the operation of a single mechanism. As mentioned in the Introduction, this idea is an essential feature of many associative models that were originally invoked to account for cue- and outcome-density effects (López et al., 1998; Shanks, 1995; Sherman et al., 2009). It is also a central feature of alternative models of biased contingency detection, such as instance-based models (Fiedler, 1996; Meiser & Hewstone, 2006; Smith, 1991). Beyond the specific details of these models, their common feature is the assumption that the same mechanism that is responsible for detecting and encoding the relationship between cues and outcomes is also responsible for density biases. In other words, there is no level of representation in the cognitive system where contingency information is represented in a format that is free from cue- or outcome-density biases.

Note that, although we favor single-process models of biased contingency detection, we do not ignore the fact that different processes might contribute to each of the dependent variables used in this kind of research. We do not question the idea that there are manipulations that might affect one dependent variable without affecting others.
In fact, part of our own research has been directed at showing that judgments of contingency can vary considerably depending on seemingly minor procedural details like the wording of the test question (Matute, Vegas, & De Marez, 2002; Vadillo & Matute, 2007; Vadillo et al., 2005; see also Crocker, 1982; De Houwer et al., 2007; Perales & Shanks, 2008). Moreover, part of our discussion of Perales et al. (2005) relies on the idea that participants sometimes adopt response strategies that might mask their true perception of contingency. What we are suggesting here is that there are no strong reasons to assume that there are two different levels of representation of contingency information, one that closely mirrors objective contingencies and one where that information is biased by factors like the probability of the cue or the probability of the outcome. In the absence of stronger evidence, it appears more parsimonious to assume that a single representation is learned during contingency-learning experiments and that this representation is biased by cue and outcome density at the encoding level.

Dual-process models of biased contingency detection share some ideas with other dichotomous models of cognition (Evans & Over, 1996; Kahneman, 2011; Osman, 2004; Sloman, 1996; Stanovich & West, 2000). There is a crucial difference, though, between the dual models reviewed in this article and other dual-process models. Traditionally, dual models have tended to explain biases and cognitive illusions by attributing them to very simple cognitive mechanisms that operate effortlessly and in a relatively automatic manner, usually related to simple encoding or retrieval processes. More complex cognitive mechanisms, usually strategic processes related to judgment and decision making, have been invoked to explain why people are sometimes able to overcome the harmful effect of these biases and intuitive reactions (e.g., Kahneman, 2011). Interestingly, the dual-process models of biased contingency detection that we review here make the opposite interpretation: Correct contingency detection is attributed to the operation of basic encoding and retrieval processes, while biased judgments are attributed to more sophisticated judgment and decision-making processes. We might even say that these models present a benign view of biases in contingency detection: Although people might show a bias in their judgments, deep inside their cognitive system there is some level of representation where information is represented accurately (a similar perspective can be found in De Neys, 2012).

The question of whether biases in contingency detection are due to basic encoding and retrieval processes or whether they reflect the operation of judgment and decision-making processes is not only important from a theoretical point of view.
As mentioned above, it has been suggested that these biases might contribute to the development of superstitious and pseudoscientific thinking (Gilovich, 1991; Lindeman & Svedholm, 2012; Matute et al., 2011, 2015; Redelmeier & Tversky, 1996; Vyse, 1997). Given the societal costs of these and other biases, cognitive psychologists have started to develop a number of interventions and guidelines for protecting people from cognitive biases (Barbería et al., 2013; Lewandowsky, Ecker, Seifert, Schwarz, & Cook, 2012; Lilienfeld, Ammirati, & Landfield, 2009; Schmaltz & Lilienfeld, 2014). Interventions designed to reduce biases can only be successful to the extent that they are based on an accurate view of their underlying mechanisms. If the underlying information was somehow encoded in a bias-free format, as suggested by dual-process models of contingency learning, these beliefs should be relatively easy to modify. Teaching people how to use the information and intuitions they already have more rationally should suffice to overcome these biases. This prediction stands in stark contrast with the well-known fact that superstitions are difficult to modify or eradicate (Arkes, 1991; Lilienfeld et al., 2009; Nyhan, Reifler, Richey, & Freed, 2014; Pronin, Gilovich, & Ross, 2004; Smith & Slack, 2015). We think that the persisting effect of biases is more consistent with the view that these beliefs are hardwired in the way people encode information about the relationship between events, as suggested by single-process models. Based on the evidence we have discussed so far, it seems safe to suggest that attempts to debias superstitions and misperceptions of contingency should include teaching people to look for unbiased information.

Acknowledgments

The authors were supported by Grant PSI2011-26965 from Dirección General de Investigación of the Spanish Government and Grant IT363-10 from Departamento de Educación, Universidades e Investigación of the Basque Government. We are indebted to José Perales and David Shanks for their valuable comments on earlier versions of this article.

References

Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgement tasks. Bulletin of the Psychonomic Society, 15, 147–149.
Allan, L. G. (1993). Human contingency judgments: Rule based or associative? Psychological Bulletin, 114, 435–448.
Allan, L. G., & Jenkins, H. M. (1983). The effect of representations of binary variables on judgment of influence. Learning and Motivation, 14, 381–405.
Allan, L. G., Siegel, S., & Hannah, S. (2007). The sad truth about depressive realism. The Quarterly Journal of Experimental Psychology, 60, 482–495.
Allan, L. G., Siegel, S., & Tangen, J. M. (2005). A signal detection analysis of contingency data. Learning & Behavior, 33, 250–263.
Alloy, L. B., & Abramson, L. Y. (1979). Judgements of contingency in depressed and nondepressed students: Sadder but wiser? Journal of Experimental Psychology: General, 108, 441–485.
Alonso, E., Mondragón, E., & Fernández, A. (2012). A Java simulator of Rescorla and Wagner’s prediction error model and configural cue extensions. Computer Methods and Programs in Biomedicine, 108, 346–355.
Arkes, H. (1991). Costs and benefits of judgment errors: Implications for debiasing. Psychological Bulletin, 110, 486–498.
Barbería, I., Blanco, F., Cubillas, C. P., & Matute, H. (2013). Implementation and assessment of an intervention to debias adolescents against causal illusions. PLoS One, 8, e71303.
Beckers, T., Vandorpe, S., Debeys, I., & De Houwer, J. (2009). Three-year-olds’ retrospective revaluation in the blicket detector task: Backward blocking or recovery from overshadowing? Experimental Psychology, 56, 27–32.
Blanco, F., Matute, H., & Vadillo, M. A. (2013). Interactive effects of the probability of the cue and the probability of the outcome on the overestimation of null contingency. Learning & Behavior, 41, 333–340.


Buehner, M. J., Cheng, P. W., & Clifford, D. (2003). From covariation to causation: A test of the assumption of causal power. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 1119–1140.
Carraro, L., Negri, P., Castelli, L., & Pastore, M. (2014). Implicit and explicit illusory correlation as a function of political ideology. PLoS One, 9, e96312.
Chapman, G. B., & Robbins, S. J. (1990). Cue interaction in human contingency judgment. Memory & Cognition, 18, 537–545.
Chapman, L. J., & Chapman, J. P. (1969). Illusory correlation as an obstacle to the use of valid psychodiagnostic signs. Journal of Abnormal Psychology, 74, 271–280.
Chater, N. (2003). How much can we learn from double dissociations? Cortex, 39, 167–169.
Cheng, P. W., & Novick, L. R. (1992). Covariation in natural causal induction. Psychological Review, 99, 365–382.
Chun, M. M., & Turk-Browne, N. B. (2008). Associative learning mechanisms in vision. In S. J. Luck & A. Hollingworth (Eds.), Visual memory (pp. 209–245). New York, NY: Oxford University Press.
Collins, D. J., & Shanks, D. R. (2002). Momentary and integrative response strategies in causal judgment. Memory & Cognition, 30, 1138–1147.
Crocker, J. (1982). Biased questions in judgment of covariation studies. Personality and Social Psychology Bulletin, 8, 214–220.
Danks, D. (2003). Equilibria of the Rescorla-Wagner model. Journal of Mathematical Psychology, 47, 109–121.
De Houwer, J. (2009). The propositional approach to associative learning as an alternative for association formation models. Learning & Behavior, 37, 1–20.
De Houwer, J. (2014). Why a propositional single-process model of associative learning deserves to be defended. In J. W. Sherman, B. Gawronski, & Y. Trope (Eds.), Dual-process theories of the social mind (pp. 530–541). New York, NY: Guilford Press.
De Houwer, J., Beckers, T., & Glautier, S. (2002). Outcome and cue properties modulate blocking. The Quarterly Journal of Experimental Psychology, 55A, 965–985.
De Houwer, J., Teige-Mocigemba, S., Spruyt, A., & Moors, A. (2009). Implicit measures: A normative analysis and review. Psychological Bulletin, 135, 347–368.
De Houwer, J., Vandorpe, S., & Beckers, T. (2007). Statistical contingency has a different impact on preparation judgements than on causal judgements. The Quarterly Journal of Experimental Psychology, 60, 418–432.
De Neys, W. (2012). Bias and conflict: A case for logical intuitions. Perspectives on Psychological Science, 7, 28–38.
Ellis, N. C. (2008). Usage-based and form-focused language acquisition: The associative learning of constructions, learned attention and the limited L2 endstate. In P. Robinson (Ed.), Handbook of Cognitive Linguistics and Second Language Acquisition (pp. 372–405). New York, NY: Taylor & Francis.
Evans, J. S. B. T., & Over, D. E. (1996). Rationality and reasoning. Hove, UK: Psychology Press.
Fiedler, K. (1996). Explaining and simulating judgment biases as an aggregation phenomenon in probabilistic, multiple-cue environments. Psychological Review, 103, 193–214.
Fiedler, K. (2000). Illusory correlations: A simple associative algorithm provides a convergent account of seemingly divergent paradigms. Review of General Psychology, 4, 25–58.
Gast, A., & De Houwer, J. (2012). Evaluative conditioning without directly experienced pairings of the conditioned and the unconditioned stimuli. The Quarterly Journal of Experimental Psychology, 65, 1657–1674.
Gawronski, B., & Bodenhausen, G. V. (2006). Associative and propositional processes in evaluation: An integrative review of implicit and explicit attitude change. Psychological Bulletin, 132, 692–731.
Gawronski, B., LeBel, E. P., & Peters, K. R. (2007). What do implicit measures tell us? Scrutinizing the validity of three common assumptions. Perspectives on Psychological Science, 2, 181–193.
Gilovich, T. (1991). How we know what isn’t so: The fallibility of human reason in everyday life. New York, NY: Free Press.
Gopnik, A., Sobel, D., Schulz, L., & Glymour, C. (2001). Causal learning mechanisms in very young children: Two-, three-, and four-year-olds infer causal relations from patterns of variation and covariation. Developmental Psychology, 37, 620–629.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. K. L. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464–1480.
Hamilton, D. L., & Gifford, R. K. (1976). Illusory correlation in interpersonal perception: A cognitive basis of stereotypic judgments. Journal of Experimental Social Psychology, 12, 392–407.
Holyoak, K. J., & Cheng, P. W. (2011). Causal learning and inference as a rational process: The new synthesis. Annual Review of Psychology, 62, 135–163.
Jenkins, H. M., & Ward, W. C. (1965). Judgment of contingency between responses and outcomes. Psychological Monographs, 79, 1–17.
Kahneman, D. (2011). Thinking, fast and slow. New York, NY: Farrar, Straus and Giroux.
Kruschke, J. K. (2008). Models of categorization. In R. Sun (Ed.), The Cambridge handbook of computational psychology (pp. 267–301). New York, NY: Cambridge University Press.
Kutzner, F. L., & Fiedler, K. (2015). No correlation, no evidence for attention shift in category learning: Different mechanisms behind illusory correlations and the inverse base-rate effect. Journal of Experimental Psychology: General, 144, 58–75.
Lagnado, D. A., Waldmann, M. R., Hagmayer, Y., & Sloman, S. A. (2007). Beyond covariation: Cues to causal structure. In A. Gopnik & L.
Schultz (Eds.), Causal learning: Psychology, philosophy, and computation (pp. 154–172). Oxford, UK: Oxford University Press. Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. Lewandowsky, S., Ecker, U. K. H., Seifer, C. M., Schwarz, N., & Cook, J. (2012). Misinformation and its correction: Continued influence and successful debiasing. Psychological Science in the Public Interest, 13, 106–131. Lilienfeld, S. C., Ammirati, R., & Landfield, K. (2009). Giving debiasing away: Can psychological research on correcting cognitive errors promote human welfare? Perspectives on Psychological Science, 4, 390–398. Lilienfeld, S. O., Ritschel, L. A., Lynn, S. J., Cautin, R. L., & Latzman, R. D. (2014). Why ineffective psychotherapies appear to work: A taxonomy of causes of spurious therapeutic effectiveness. Perspectives on Psychological Science, 9, 355–387. Lindeman, M., & Svedholm, A. (2012). What’s in a term? Paranormal, superstitious, magical and supernatural beliefs by any other name would mean the same. Review of General Psychology, 16, 241–255. López, F. J., Cobos, P. L., Caño, A., & Shanks, D. R. (1998). The rational analysis of human causal and probability judgment. In M. Oaksford & N. Chater (Eds.), Rational models of cognition (pp. 314–352). Oxford, UK: Oxford University Pres. Matute, H. (1996). Illusion of control: Detecting response-outcome independence in analytic but not in naturalistic conditions. Psychological Science, 7, 289–293. Matute, H., Blanco, F., Yarritu, I., Díaz-Lago, M., Vadillo, M. A., & Barbería, I. (2015). Illusions of causality: How they bias our

Experimental Psychology 2016; Vol. 63(1):3–19

M. A. Vadillo et al., Biased Contingency Detection

everyday thinking and how they could be reduced. Frontiers in Psychology, 6, 888. Matute, H., Vadillo, M. A., Blanco, F., & Musca, S. C. (2007). Either greedy or well informed: The reward maximization – unbiased evaluation trade-off. In S. Vosniadou, D. Kayser, & A. Protopapas (Eds.), Proceedings of the European Cognitive Science Conference (pp. 341–346). Hove, UK: Erlbaum. Matute, H., Vegas, S., & De Marez, P.-J. (2002). Flexible use of recent information in causal and predictive judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 714–725. Matute, H., Yarritu, I., & Vadillo, M. A. (2011). Illusions of causality at the heart of pseudoscience. British Journal of Psychology, 102, 392–405. Meiser, T., & Hewstone, M. (2006). Illusory and spurious correlations: Distinct phenomena or joint outcomes of exemplar-based category learning? European Journal of Social Psychology, 36, 315–336. Mitchell, C. J., De Houwer, J., & Lovibond, P. F. (2009). The propositional nature of human associative learning. Behavioral and Brain Sciences, 32, 183–246. Moors, A. (2014). Examining the mapping problem in dual process models. In J. W. Sherman, B. Gawronski, & Y. Trope (Eds.), Dual-process theories of the social mind (pp. 20–34). New York, NY: Guilford Press. Murphy, R. A., Schmeer, S., Vallée-Tourangeau, F., Mondragón, E., & Hilton, D. (2011). Making the illusory correlation effect appear and then disappear: The effects of increased learning. The Quarterly Journal of Experimental Psychology, 64, 24–40. Musca, S. C., Vadillo, M. A., Blanco, F., & Matute, H. (2010). The role of cue information in the outcome-density effect: evidence from neural network simulations and a causal learning experiment. Connection Science, 22, 177–192. Nies, R. C. (1962). Effects of probable outcome information on two choice learning. Journal of Experimental Psychology, 64, 430–433. Nosek, B. A., Hawkins, C. B., & Frazier, R. S. (2011). 
Implicit social cognition: From measures to mechanisms. Trends in Cognitive Sciences, 15, 152–159. Nyhan, B., Reifler, J., Richey, S., & Freed, G. L. (2014). Effective messages in vaccine promotion: A randomized trial. Pediatrics, 133, 1–8. Orgaz, C., Estevez, A., & Matute, H. (2013). Pathological gamblers are more vulnerable to the illusion of control in a standard associative learning task. Frontiers in Psychology, 4, 306. Osman, M. (2004). An evaluation of dual-process theories of reasoning. Psychonomic Bulletin & Review, 11, 988–1010. Perales, J. C., Catena, A., Shanks, D. R., & González, J. A. (2005). Dissociation between judgments and outcome-expectancy measures in covariation learning: A signal detection theory approach. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 1105–1120. Perales, J. C., & Shanks, D. R. (2008). Driven by power? Probe question and presentation format effects on causal judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 1482–1494. Pronin, E., Gilovich, T., & Ross, L. (2004). Objectivity in the eye of the beholder: Divergent perceptions of bias in self versus others. Psychological Review, 111, 781–799. Ratliff, K. A., & Nosek, B. A. (2010). Creating distinct implicit and explicit attitudes with an illusory correlation paradigm. Journal of Experimental Social Psychology, 46, 721–728. Redelmeier, D. A., & Tversky, A. (1996). On the belief that arthritis pain is related to the weather. Proceedings of the National Academy of Sciences, 93, 2895–2896.

Ó 2016 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001

M. A. Vadillo et al., Biased Contingency Detection

Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64– 99). New York, NY: Appleton-Century-Crofts. Reuven-Magril, O., Dar, R., & Liberman, N. (2008). Illusion of control and behavioral control attempts in obsessivecompulsive disorder. Journal of Abnormal Psychology, 117, 334–341. Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Science, 274, 1926–1928. Schmaltz, R., & Lilienfeld, S. O. (2014). Hauntings, homeopathy, and the Hopkinsville goblins: Using pseudoscience to teach scientific thinking. Frontiers in Psychology, 5, 336. Shanks, D. R. (1995). Is human learning rational? The Quarterly Journal of Experimental Psychology, 48A, 257–279. Shanks, D. R. (2010). Learning: From association to cognition. Annual Review of Psychology, 61, 273–301. Shanks, D. R., & Dickinson, A. (1987). Associative accounts of causality judgment. In G. H. Bower (Ed.), The psychology of learning and motivation, Vol. 21: Advances in research and theory (pp. 229–261). San Diego, CA: Academic Press. Shanks, D. R., & St. John, M. F. (1994). Characteristics of dissociable human learning systems. Behavioral and Brain Sciences, 17, 367–447. Shanks, D. R., Tunney, R. J., & McCarthy, J. D. (2002). A reexamination of probability matching and rational choice. Journal of Behavioral Decision Making, 15, 233–250. Sherman, J. W., Gawronski, B., & Trope, Y. (Eds.). (2014). Dualprocess theories of the social mind. New York, NY: Guilford Press. Sherman, J. W., Kruschke, J. K., Sherman, S. J., Percy, E., Petrocelli, J. V., & Conrey, F. R. (2009). Attentional processes in stereotype formation: A common model for category accentuation and illusory correlation. Journal of Personality and Social Psychology, 96, 305–323. Sloman, S. A. (1996). 
The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3–22. Smith, B. W., & Slack, M. B. (2015). The effect of cognitive debiasing training among family medicine residents. Diagnosis, 2, 117–121. Smith, E. R. (1991). Illusory correlation in a simulated exemplarbased memory. Journal of Experimental Social Psychology, 27, 107–123. Snodgrass, J. G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to dementia and amnesia. Journal of Experimental Psychology: General, 117, 34–50. Stanovich, K. E., & West, R. F. (2000). Individual differences in reasoning: Implications for the rationality debate? Behavioral and Brain Sciences, 23, 645–665. Tversky, A., & Edwards, W. (1966). Information versus reward in binary choices. Journal of Experimental Psychology, 71, 680–683. Unturbe, J., & Corominas, J. (2007). Probability matching involves rule-generating ability: A neuropsychological mechanism dealing with probabilities. Neuropsychology, 21, 621–630. Vadillo, M. A., De Houwer, J., De Schryver, M., Ortega-Castro, N., & Matute, H. (2013). Evidence for an illusion of causality when using the Implicit Association Test to measure learning. Learning and Motivation, 44, 303–311. Vadillo, M. A., Konstantinidis, E., & Shanks, D. R. (2016). Underpowered samples, false negatives, and unconscious learning.

Ó 2016 Hogrefe Publishing. Distributed under the Hogrefe OpenMind License http://dx.doi.org/10.1027/a000001

Psychonomic Bulletin & Review, 23, 87–102. doi: 10.3758/ s13423-015-0892-6 Vadillo, M. A., & Luque, D. (2013). Dissociations among judgments do not reflect cognitive priority: An associative explanation of memory for frequency information in contingency learning. Canadian Journal of Experimental Psychology, 67, 60–71. Vadillo, M. A., & Matute, H. (2007). Predictions and causal estimations are not supported by the same associative structure. The Quarterly Journal of Experimental Psychology, 60, 433–447. Vadillo, M. A., Miller, R. R., & Matute, H. M. (2005). Causal and predictive-value judgments, but not predictions, are based on cue–outcome contingency. Learning & Behavior, 33, 172–183. Vadillo, M. A., Musca, S. C., Blanco, F., & Matute, H. (2011). Contrasting cue-density effects in causal and prediction judgments. Psychonomic Bulletin & Review, 18, 110–115. Van Rooy, D., Van Overwalle, F., Vanhoomissen, T., Labiouse, C., & French, R. (2003). A recurrent connectionist model of group biases. Psychological Review, 110, 536–563. Viechtbauer, W. (2010). Conducting meta-analyses in R with the metaphor package. Journal of Statistics Software, 36, 1–48. Vyse, S. (1997). Believing in magic: The psychology of superstition. New York, NY: Oxford University Press. Wasserman, E. A. (1990). Detecting response-outcome relations: Toward an understanding of the causal texture of the environment. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 26, pp. 27–82). San Diego, CA: Academic Press. Wasserman, E. A., Elek, S. M., Chatlosh, D. L., & Baker, A. G. (1993). Rating causal relations: The role of probability in judgments of response-outcome contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 174–188. Wasserman, E. A., Kao, S.-F., Van-Hamme, L. J., Katagiri, M., & Young, M. E. (1996). Causation and association. In D. R. Shanks, K. J. Holyoak, & D. L. Medin (Eds.), The psychology of learning and motivation, Vol. 
34: Causal learning (pp. 207–264). San Diego, CA: Academic Press. Yarritu, I., & Matute, H. (2015). Previous knowledge can induce an illusion of causality through actively biasing behavior. Frontiers in Psychology, 6, 389. Yarritu, I., Matute, H., & Vadillo, M. A. (2014). Illusion of control: The role of personal involvement. Experimental Psychology, 61, 38–47. Zanon, R., De Houwer, J., & Gast, A. (2012). Context effects in evaluative conditioning of implicit evaluations. Learning and Motivation, 43, 155–165. Received February 3, 2015 Revised September 16, 2015 Accepted September 17, 2015 Published online March 29, 2016 Miguel A. Vadillo Primary Care and Public Health Sciences King’s College London Addison House, Guy's Campus London SE1 1UL UK Tel. +44 207 848-6620 Fax +44 207 848-6652 E-mail miguel.vadillo@kcl.ac.uk

Experimental Psychology 2016; Vol. 63(1):3–19

Theoretical Article

The Moderating Impact of Distal Regularities on the Effect of Stimulus Pairings
A Novel Perspective on Evaluative Conditioning

Sean Hughes, Jan De Houwer, and Dermot Barnes-Holmes
Department of Experimental Clinical and Health Psychology, Ghent University, Belgium

Abstract: Throughout much of the past century psychologists have focused their attention on a seemingly simple question: How do people come to like or dislike stimuli in the environment? Evaluative Conditioning (EC) – a change in liking due to the pairing of stimuli – has been offered as one avenue through which novel preferences may be formed and existing ones altered. In the current article, we offer a new look at EC from the perspective of Contextual Behavioral Science (CBS) and, more specifically, Relational Frame Theory (RFT). We briefly review the EC literature, introduce CBS and RFT, and then describe a behavioral phenomenon known as arbitrarily applicable relational responding (AARR). Afterwards, we examine the relationship between EC and AARR. This novel perspective offers ways to organize existing EC effects and to predict new ones, contributes to debates on “genuine” EC and on human versus nonhuman EC, and facilitates the development and refinement of cognitive theories of EC.

Keywords: evaluative conditioning, Relational Frame Theory, AARR

Experimental Psychology 2016; Vol. 63(1):20–44
DOI: 10.1027/1618-3169/a000310
© 2016 Hogrefe Publishing

Although humans may be biologically prepared to prefer certain stimuli over others, many of our likes and dislikes are learned through ongoing interactions in and with the environment (De Houwer, 2007). These evaluations are thought to play a causal role in a diverse spectrum of behaviors such as consumer choices (Gibson, 2008; Hollands, Prestwich, & Marteau, 2011), voting intentions (Galdi, Arcuri, & Gawronski, 2008), in-group favoritism, and stigmatization (Walther, Nagengast, & Trasselli, 2005), to mention just a few. In order to understand, predict, and influence these behaviors in a sophisticated manner, researchers have sought to identify the factors responsible for the formation and change of evaluative responses.

Evaluative responses toward stimuli in the environment can be established in a wide variety of ways, from mere exposure (Bornstein, 1989), to socialization (Pettigrew & Tropp, 2006), descriptive information (Rydell & McConnell, 2006), and category membership (Pinter & Greenwald, 2004). However, many psychologists have focused on Evaluative Conditioning (EC) as a means of establishing and manipulating likes and dislikes (see Gast, Gawronski, & De Houwer, 2012). Broadly speaking, EC refers to a change in liking that is due to the pairing of stimuli. Typically, a neutral stimulus acquires the valence of a positive or negative stimulus with which it was previously paired. For example, contiguous presentations of an unknown Pokémon character with pleasant images often result in that character being rated positively whereas pairing it with negative images results in it being rated negatively (Olson & Fazio, 2001).

Over the past several decades, researchers from two intellectual traditions have studied changes in evaluative responding due to stimulus pairings from their respective scientific perspectives. On the one hand, the vast majority of this work has been conducted by social and cognitive psychologists interested in the mental processes and representations that mediate the impact of stimulus pairings on liking (i.e., the mental level of analysis; see Hofmann, De Houwer, Perugini, Baeyens, & Crombez, 2010, for a recent review). On the other hand, and unknown to much of psychological science, researchers who describe themselves as contextual behavioral scientists and/or behavior analysts have also studied a range of behavioral effects that seem directly relevant to EC (e.g., Smyth, Barnes-Holmes, & Forsyth, 2006; Valdivia-Salas, Dougher, & Luciano, 2013). Unlike their cognitive counterparts, however, these researchers make no appeal to, or assumptions about, mental constructs or their causal agency in behavior and its change. Rather, they aim to determine what factors in the environment influence our behavior, including those behaviors that seem to be involved in liking and disliking specific stimuli (i.e., the functional level of analysis; see De Houwer, 2011a).¹

Unsurprisingly, developments in the cognitive EC literature have very rarely filtered into and informed progress in Contextual Behavioral Science (CBS), and vice versa, because both communities draw upon radically different sets of scientific goals, values, and assumptions. Yet we believe that there is much to be gained and little to be lost by fostering greater communication between these two traditions. Cognitive researchers willing to venture into the CBS literature will find a previously undiscovered country populated with novel procedures, conceptual analyses, and findings that may contribute to their understanding of human likes and dislikes. Similarly, contextual behavioral scientists may acquire new ideas, procedures, and information through active and full engagement with their cognitive counterparts. In the current paper, we hope to showcase just what can be achieved when the functional approach to EC (adopted by CBS) interacts with the mental level of analysis and vice versa.

In Part I, we begin with a brief introduction to the EC literature (for a more comprehensive review and meta-analysis see Hofmann et al., 2010). This overview will clarify how empirical and theoretical attention in EC research has largely been situated at the mental level of analysis, with changes in liking (behavior) that occur due to the pairing of stimuli (environment) explained in terms of mediating constructs such as mental associations or propositions.

In Part II, we turn our attention to CBS and examine how the functional approach adopted in this tradition differs from the functional approach that is typically employed throughout the EC literature.
Differentiating between these two functional approaches will require that we first consider the philosophical underpinnings and scientific goals of the former (CBS) and the various ways in which it deviates from the latter.

In Part III, we introduce a functional theory of human language and cognition known as Relational Frame Theory (RFT; Hayes, Barnes-Holmes, & Roche, 2001). At its core, RFT is concerned with a behavioral phenomenon known as arbitrarily applicable relational responding (AARR) which, broadly speaking, refers to the ability to act as if stimuli are related to one another (1) regardless of their physical properties and (2) in the absence of direct training or instruction. Evidence indicates that, once learned, the ability to AARR changes how humans interact with, and adapt to, their physical, social, and verbal environments.

In Part IV, we draw upon the EC and RFT literatures to put forward a novel perspective on EC. First, we argue that stimulus pairings may – in principle – function as a mere proximal cause of changes in liking. The expression “mere proximal cause” entails that stimulus pairings only lead to a change in liking because there is a regularity in how stimuli are presented in space and time in the current (i.e., proximal) situation. Second, we argue that a more likely scenario is one in which distal regularities (i.e., regularities in the past environment of the organism) moderate the impact that proximal regularities (stimulus pairings) have on changes in liking (i.e., EC as an instance of moderated learning). Third, there may be many types of (proximal and distal) regularities that can potentially moderate the impact that stimulus pairings have on liking. Yet we believe that a particular set of distal regularities (those which give rise to the ability to AARR) cause stimulus pairings to be transformed from a mere proximal cause of liking to a proximal cue signaling that the CS and US are related in some way. The specific relationship between the CS and US that pairings signal will determine the properties of the changes in liking that are observed.

We showcase how this new perspective may accelerate research on and theorizing about EC. For instance, our approach equips researchers with a means to systematically organize different types of EC effects so that their similarities and differences are made evident (i.e., it has heuristic value). It may also lead to the discovery of novel and unexpected ways in which liking can be changed via stimulus pairings (i.e., it has predictive value). This perspective also provides valuable input into debate on what can be considered as “genuine” EC and raises the question of how EC in humans is related to EC in nonhuman animals. Finally, we argue that CBS in general, and RFT in particular, offers a language capable of describing the entire gamut of EC (and related) effects in purely functional, nonmental terms. Adopting this language may facilitate the development and refinement of cognitive theorizing in several important ways.

¹ The relationship between Contextual Behavioral Science (CBS) and behavior analysis is quite complex and a general consensus on the nature of that relationship has yet to emerge. However, for ease of communication in the current article we will simply use the acronym “CBS” to refer to the broad group of researchers who conduct their scientific work at the functional level of analysis.

Part I: EC as Procedure, Effect, and Mental Process

The ease and versatility with which likes and dislikes can be formed, manipulated, and eliminated via the pairing of stimuli has led to a wave of EC research that has swept through many areas of psychological science, including health psychology (Hollands et al., 2011), consumer psychology (Gibson, 2008), social psychology (Walther et al., 2005), and clinical psychology (Houben, Schoenmakers, & Wiers, 2010). This work has offered valuable insight into the procedures that give rise to a change in liking when stimuli are paired, the effects generated by those procedures as well as the mediating mental processes assumed to govern changes in liking resulting from the pairing of stimuli (De Houwer, 2007). In what follows we consider how recent empirical developments have shaped our understanding of EC in each of these respective areas.

Evaluative Conditioning as a Procedure

When defined as a procedure, EC refers to ways of arranging the environment so that stimuli are paired and changes in the liking of those stimuli can be observed (De Houwer, 2007, 2011b). In a prototypical EC study, a neutral stimulus (often referred to as a conditioned stimulus, CS) is repeatedly paired with a positive or negative unconditioned stimulus (US) and it is determined if these pairings lead to a change in liking. Such procedures are typically considered to be a specific subclass of Pavlovian conditioning (PC) procedures, the only difference being that EC procedures focus on changes in evaluative responses, whereas PC procedures can capture any type of behavioral change (De Houwer, 2007). Furthermore, whereas the term PC is typically reserved for procedures that involve biologically significant USs (e.g., food or electric shocks), in most EC studies, neither CSs nor USs are biologically significant.

Broadly speaking, EC procedures may involve the pairing of neutral stimuli such as fictitious consumer products (Pleyers, Corneille, Luminet, & Yzerbyt, 2007), cartoon characters (Olson & Fazio, 2001), nonsense words (Stahl & Unkelbach, 2009), and unknown individuals (Hütter, Sweldens, Stahl, Unkelbach, & Klauer, 2012) with valenced words (Walther, Langer, Weil, & Komischke, 2011) or images (Corneille, Yzerbyt, Pleyers, & Mussweiler, 2009) with the aim of producing a change in liking. These procedures are not restricted to the use of visual stimuli but may also involve gustatory (Gast & De Houwer, 2012), olfactory (Hermans, Baeyens, Lamote, Spruyt, & Eelen, 2005), tactile (Hammerl & Grabitz, 2000), and auditory stimuli (van Reekum, van den Berg, & Frijda, 1999).
Although changes in liking generated by such procedures have often been indexed via self-reported ratings, indirect tasks such as evaluative priming (Fazio, Jackson, Dunton, & Williams, 1995; Wittenbrink, Judd, & Park, 1997), the Implicit Association Test (IAT; Greenwald, McGhee, & Schwartz, 1998), and the Affect Misattribution Procedure (AMP; Payne, Cheng, Govorun, & Stewart, 2005) have allowed automatic evaluative responses to be captured and subjected to empirical scrutiny (see Nosek, Hawkins, & Frazier, 2011). Although these latter procedures typically make use of reaction-time or accuracy-based performances, EC can – in principle – also be studied using physiological and neurological measures of evaluation (e.g., Klucken et al., 2009).
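
As a rough illustration of the prototypical procedure described above, the following Python sketch builds a CS-US pairing schedule and computes a simple change score from self-reported ratings. The stimulus names, the number of pairings, the rating scale, and the ratings themselves are invented for illustration and are not taken from any study cited here. The difference score is only one plausible way to index a change in liking, not a prescribed analysis.

```python
from statistics import mean

# Hypothetical pairing schedule: each CS is paired with a US of fixed valence.
# Stimulus names and the number of pairings are illustrative assumptions.
PAIRINGS = {"cs_a": "positive", "cs_b": "negative"}
N_PAIRINGS = 8


def build_trials(pairings, n_pairings):
    """Return a list of (cs, us_valence) trials, one entry per CS-US pairing."""
    return [(cs, valence) for cs, valence in pairings.items() for _ in range(n_pairings)]


def ec_effect(pre, post, pairings):
    """Mean rating change for positively paired CSs minus that for negatively paired CSs."""
    change = {cs: post[cs] - pre[cs] for cs in pairings}
    pos = [change[cs] for cs, valence in pairings.items() if valence == "positive"]
    neg = [change[cs] for cs, valence in pairings.items() if valence == "negative"]
    return mean(pos) - mean(neg)


# Made-up liking ratings on an arbitrary -10..+10 scale (not empirical data).
pre_ratings = {"cs_a": 0, "cs_b": 1}
post_ratings = {"cs_a": 3, "cs_b": -2}

trials = build_trials(PAIRINGS, N_PAIRINGS)
effect = ec_effect(pre_ratings, post_ratings, PAIRINGS)
print(len(trials), effect)
```

A positive value indicates that CSs paired with positive USs gained liking relative to CSs paired with negative USs. Whether such a difference can be attributed to the pairings depends on the experimental design, not on the arithmetic.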

Evaluative Conditioning as an Effect

EC as an effect refers to the observed change in liking that is due to the pairing of stimuli. Hence, it involves more than the mere pairing of stimuli or the observation of a change in liking. This extra element is the presence of a causal relation between the two elements. More specifically, the claim that an EC effect has occurred implies that the observed change in liking was determined by the pairing of the stimuli.

When defined as an effect, similarities and differences between EC and other evaluative learning effects become clear. It has been argued, for example, that EC differs from other types of evaluative learning with regard to the regularity in the environment that produces the change in evaluative responding (De Houwer, 2007; De Houwer, Barnes-Holmes, & Moors, 2013). For instance, changes in liking that are due to the successive presentation of a single stimulus (i.e., the mere exposure effect; Bornstein, 1989) or to response-dependent contingencies (i.e., approach/avoidance learning; Kawakami et al., 2007) are typically not considered to be instances of EC.
We now know that changes in liking due to the pairing of stimuli are sensitive to the order, number, and timing of stimulus presentations (e.g., Bar-Anan, De Houwer, & Nosek, 2010; Jones, Fazio, & Olson, 2009; Stahl & Unkelbach, 2009), the manner in which the CS-US relation is established (e.g., sensory preconditioning, higher-order conditioning; Hammerl & Grabitz, 1996; Walther, 2002), the sources of contextual control (e.g., discriminative stimuli; Baeyens, Crombez, De Houwer, & Eelen, 1996; Gawronski, Rydell, Vervliet, & De Houwer, 2010), postacquisition modifications to the CS-US contingency (e.g., extinction, counterconditioning, US revaluation; Hofmann et al., 2010; Kerkhof, Vansteenwegen, Baeyens, & Hermans, 2010; Walther, Gawronski, Blank, & Langer, 2009) as well as the type of organism that is tested (e.g., psychology student, child, or nonhuman; Boakes, Albertella, & Harris, 2007; Field, 2006; Fulcher, Mathews, & Hammerl, 2008). Although the majority of this research has directly related the CS and US on the basis of spatiotemporal contiguity, changes in evaluative responding have also been obtained as a result of observation (Baeyens, Eelen, Crombez, & De Houwer, 2001), written narratives (Gregg, Seibt, & Banaji, 2006), inferences (Gast & De Houwer, 2012), and verbal instructions (Balas & Gawronski, 2012; De Houwer, 2006).


Evaluative Conditioning as a Mental Process

To date, the EC literature has almost exclusively been guided by researchers operating at the mental level of analysis. When approached from this perspective psychological events are conceptualized as being similar to a machine, composed of discrete parts that interact and are subject to specific operating conditions (e.g., Bechtel, 2008; Chiesa, 1992). The primary scientific goal at this level of analysis is to identify the mental mechanism(s) that mediate between input (environment) and output (behavior). The researcher’s role is to develop an account of mental representations and processes that mediate changes in behavior. The truth or scientific value of a mental model is therefore based on the correspondence between the mental mechanism it proposes and the set of behavioral observations that it aims to predict. Put another way, research at the mental level of analysis is focused primarily on the prediction of behavioral effects through the use of theoretical models that bridge environmental events and behavioral outcomes.

When applied to EC, this approach assumes that mental processes and representations mediate between the pairing of stimuli and the observed change in liking. The researcher’s goal is to postulate a mental theory that can explain (1) how the pairing of stimuli produces a change in evaluative responding, and (2) why the magnitude and direction of such responses are dependent on specific properties within the wider environment.

Broadly speaking, the universe of mental accounts of EC can be subdivided into two overarching sets. The first and currently dominant position is that the pairing of stimuli results in the automatic “bottom-up” formation of mental associations in memory. For example, Baeyens, Eelen, Crombez, and Van den Bergh (1992) argued that CS-US pairings result in the formation of an association between the representation of the CS and the representation of the US in memory.
These changes at the representational level are thought to emerge in the absence of conscious awareness, attentional resources, or the intention to relate the CS and US (i.e., they demonstrate many of the features of automaticity; see Jones et al., 2009, and Martin & Levey, 1994, for related accounts). Broadly speaking, once the association has been formed, subsequently encountering the CS will result in the automatic activation of the (valenced) US representation, which in turn leads to a change in liking of the CS.

A second class of mental models has recently emerged that rejects the associative position and proposes that most if not all forms of classical conditioning – including EC – arise due to the formation of propositions about the CS-US relation (Mitchell, De Houwer, & Lovibond, 2009). Whereas mental associations are simply hypothetical structures in memory through which activation can spread between representations, propositions are qualified truth statements about the environment. As statements about the environment, they can be true or false (i.e., correspond to the actual environment or not) and can specify not only that stimuli in the environment are related but also how those elements are related (e.g., “the CS is opposite to the US”). According to this propositional account, participants utilize their knowledge of how stimuli are related as the basis upon which to evaluate the CS. These propositions can be acquired in a number of ways, from prior knowledge and direct experience, to verbal instructions and deductive reasoning (De Houwer, 2009). The formation of these propositions is assumed to require an awareness of the stimulus relation, as well as the time, cognitive resources, and intention to relate those stimuli.²

Part II: Contextual Behavioral Science

In this section, we introduce CBS, and consider how research conducted within this tradition, particularly from the perspective of RFT, may provide an alternative approach to the study of EC. Before proceeding, it is worth considering several points. First, we recognize that for cognitive and social psychologists, the nonmental approach discussed below represents a significant departure from their own pre-analytic assumptions and scientific aims. Indeed, many readers may find the philosophy of science (functional contextualism) and theoretical perspectives (RFT) from which contextual behavioral scientists operate to be entirely new territory. It therefore seems prudent to provide some background information about both of these topics so that the reader can better appreciate the nature of CBS, how it differs from the cognitive approach – and perhaps most importantly – how it may contribute to EC research and cognitive theorizing. At the same time, researchers might argue that they have already adopted a functional approach when it comes to the study of EC. Although there may be some substance to this argument, we will attempt to show that the functional (effect-centric) approach adopted within psychological science is very different to the functional (analytic-abstractive) approach found within CBS. Furthermore, we contend that the latter functional approach offers EC researchers possible advantages that the former functional approach does not.

² From a mental (mechanistic) perspective, EC effects can – in principle – be mediated by any type or combination of mental processes. Thus, while many researchers subscribe to either an associative or propositional position, others argue that EC effects can be driven by both processes operating singularly, or in interaction, in an automatic or nonautomatic fashion (e.g., Gawronski & Bodenhausen, 2011).

© 2016 Hogrefe Publishing

Experimental Psychology 2016; Vol. 63(1):20–44

Contextual Behavioral Science

While various forms of functionally oriented psychologies have emerged over the last century, the most empirically and theoretically productive (contemporary) branch is arguably that of CBS. At the core of this tradition resides a philosophy of science known as functional contextualism, which specifies (1) the assumptions, goals, and values of the researcher, and by implication, (2) the principles, theories, and methodologies that they draw upon (for more on CBS and how it relates to radical behaviorism see Hayes, Barnes-Holmes, & Wilson, 2012). As a contextualist behavioral scientist sees it, (psychological) science involves a single unified goal: to predict-and-influence behavior with precision (applying a restricted set of principles to any event), scope (explain a comprehensive range of behaviors across a variety of situations), and depth (cohere across analytical levels and domains such as biology, psychology, and anthropology). In what follows we highlight how a CBS approach differs from the mental (mechanistic) approach to psychological science.

Absence of Mental Mediators

Contextual Behavioral Science adopts an exclusively functional epistemology that makes no appeal to mental mediators. Instead, the functional relationships between environment and behavior (which unfold across time and context) are treated as causal or explanatory factors. To illustrate this point more clearly, consider the phenomenon of latent learning in which experiences at an earlier point in time do not influence responding at that point but instead only impact behavior at a later date (e.g., Díaz & De la Casa, 2002; Lubow & Gewirtz, 1995). If one operates at the mental level of analysis then this change in behavior is viewed as the product of some intervening (mental) mechanism that mediates between the experienced event at time 1 and the observed response at time 2.
However, from a CBS perspective, it is enough to say that behavior at time 2 is a function of experience at time 1 (regardless of the delay between an event and behavior), provided that this functional relation allows the researcher to predict-and-influence that behavior with precision, scope, and depth (see also De Houwer, Barnes-Holmes, et al., 2013). In other words, appeals to – or assumptions about – mental constructs or their causal agency in producing behavior are a strategy that contextual behavioral scientists choose not to adopt. Rather, they define behavior as an ongoing act that always occurs within, and in response to, a current and historical context. This “act-in-context” can vary from the most proximal behavioral instance (e.g., evaluative responses to stimuli in the current environment) to temporally distal and remote behavioral sequences (e.g., the impact of a particular experience 2 years ago on choosing whether to sit beside a member of another racial group at the cinema). Given that the temporal and spatial parameters of a context can vary dramatically, it can be difficult to determine which contextual framing of an act is “true” in the sense of “most correct.” CBS researchers therefore adopt a “pragmatic truth criterion” that qualifies the success, meaning, or validity of a scientific analysis in terms of its ability to achieve prediction-and-influence, with precision, scope, and depth over the behavior of interest. Thus, in CBS, the study of EC effects involves a shift away from the search for mental mediators of EC (e.g., associations or propositions via which pairings produce changes in evaluative responding) and toward the identification of environmental moderators of EC (e.g., stimuli or events that determine whether pairings have a particular impact on evaluative responding).

The Challenge of Predicting-and-Influencing Behavior

The choice of functional researchers not to focus on mental mediators follows, in part, from their aim to predict-and-influence behavior. Consider, for example, the mental association models of EC outlined above, wherein a change in liking due to the pairing of stimuli is mediated by the formation, activation, or modification of mental associations in memory. As pointed out by Gardner (1987), these mental entities may be treated as the cause of a particular behavior (e.g., evaluative associations as mental causes of ratings on a Likert scale) but they are not open to direct manipulation. For instance, there appears to be no method with which to interact directly with mental constructs such as associations or propositions.
Instead (experimental) researchers must (1) act on the world in some way, such as presenting stimuli to an organism, and (2) observe a change in behavior in that organism. Based on the effect of their actions on behavior, researchers can then postulate a mediating mechanism that is responsible for the obtained change in behavior or outcome. This “manipulation argument” is not intended to dismiss or undermine analysis at the mental level, nor does it imply that mental constructs cannot be studied in a scientific manner. Rather, the take-home message is that one’s preference for using (or not using) mental constructs in scientific analyses depends on the aim of one’s scientific endeavor. If a researcher’s analytic goal involves predicting behavioral effects under certain environmental conditions, then associations, propositions, emotions, or any other mental or nonmental variable (e.g., neurological activity) can be used to achieve prediction, provided that it reliably precedes that effect. On the other hand, if a researcher’s primary objective is to predict-and-influence behavior, then focusing exclusively on mental explanations is ultimately insufficient and could be deemed as distracting the researcher from the goal at hand. In order to exert influence over behavior the researcher must successfully manipulate events external to that behavior, and only contextual variables can be directly manipulated; mental variables cannot (see De Houwer, 2011a; Hayes & Brownstein, 1986). Consequently, from a CBS perspective, scientific analysis is not complete until the causal variables external to the behavior of interest have been identified – not because of some dogmatic adherence to a physical monism that excludes the nonphysical, mental world, but rather as a means to achieve its scientific goals (for more on CBS see Hayes et al., 2012).

Development of Behavioral Principles and Evaluation of Functional Theories

The development of behavioral principles and evaluation of functional theories within CBS also differ from those at the mental level of analysis. While accounts at the mental level involve the postulation of mental processes or representations in order to understand past and predict future behavior, functional theories are different: they start out by identifying (1) specific functional relationships between behavior and the environment, and then (2) abstract these relationships into overarching “behavioral principles” that are high in precision, scope, and depth. Examples include the principles of reinforcement, punishment, stimulus generalization, and discrimination. These principles are primarily inductive in nature, built from the bottom up, and “apply across a broad array of topographically distinct behaviors of varying complexity while maintaining coherence and parsimony” (Levin & Hayes, 2009, p. 6). Functionally oriented researchers treat these principles as their primary analytic tools and draw upon them to study and evaluate behavior.
For example, the development of temper tantrums in children, emergence and change of problematic behaviors in pets, as well as the tendency to check one’s phone upon hearing a ringtone, can be accounted for by drawing upon the notion of reinforcement (see Catania, 1997). Likewise, stopping one’s car at a red traffic light and accelerating in the presence of a green light or taking a cake out of an oven after an alarm rings can be explained in terms of stimulus control. Analyzing an individual’s interactions with the environment in terms of these behavioral principles is often defined as carrying out a “functional analysis.” Theories at the functional level emerge when researchers seek to move from functional analyses of individual behaviors to more general phenomena such as language, reasoning, and cognition. Such theories constitute organized sets of interrelated behavioral principles. These models or theories are considered to be “true,” successful, meaningful, or valid to the extent that they “work” (i.e., they allow the researcher to achieve prediction-and-influence over the phenomenon of interest). Thus the functional level of analysis is richly theoretical. But it is a type of theorizing that differs markedly from that seen elsewhere in psychological science (see Barnes-Holmes & Hussey, 2016; Wilson, Whiteman, & Bordieri, 2013).

Functional Level of Analysis: Analytic-Abstractive Versus Effect-Centric

One could argue that many psychologists adopt a functional approach both inside and outside of CBS. For instance, all experiments involve the manipulation of independent variables whose effect on dependent variables is examined. What should now be clear, however, is that the functional approach adopted in CBS differs markedly from the functional approach seen in many other areas of psychological science. The former begins by conducting functional analyses of specific environment-behavior interactions (e.g., temper tantrums, disobedient pets) and then abstracts these relations into overarching behavioral principles (e.g., reinforcement) which apply to a wide variety of situations and behaviors (i.e., it is analytic-abstractive). The functional perspective adopted in psychological science is typically of a different kind: the description of individual environment-behavior relations with the aim of predicting the behavioral effect obtained from a certain procedure or set of procedures (i.e., it is effect-centric).

These two functional approaches, and the analyses that they engender, differ in a number of important ways. Whereas an analytic-abstractive approach strives to explain a wide variety of seemingly unrelated behaviors using a restricted number of behavioral principles (i.e., it aims for high scope and precision), an effect-centric approach is typically interested in only those aspects of the current environment or procedure that give rise to a specific or narrow range of behavioral effects (i.e., it emphasizes precision over scope).
Consequently, while the former approach provides a means to conceptualize seemingly unrelated behaviors as instances of the same underlying phenomenon (i.e., it has heuristic value), the latter does so to a much lesser extent. To illustrate, take the previous examples of temper tantrums in children, a disobedient pet, and persistent checking of one’s phone. Approaching these behaviors from an effect-centric position might lead researchers to postulate three separate effects (e.g., a temper tantrum effect, a disobedience effect, and a checking effect) and suggest that, because those behaviors look different, involve different organisms, stimuli, and events, they must reflect three separate and unrelated phenomena. Yet from an analytic-abstractive position these three types of behaviors can be viewed as different instances of the same underlying phenomenon (operant learning). Similarly, an effect-centric perspective might lead researchers interested in cognitive control to view outcomes such as the Stroop effect, Simon effect, and Task-Rule Congruency effect topographically, rather than as different instances of the same underlying behavioral principle (stimulus control; Liefooghe & De Houwer, 2016). Although these different effects are sometimes related at nonfunctional levels (e.g., at the mental level by assuming that the same mental processes mediate different effects; e.g., Miyake et al., 2000), they remain distinct within an effect-centric functional approach. Returning to the topic of evaluative conditioning, an effect-centric approach might view changes in liking due to the pairing of stimuli via instructions, observation, inferences, or spatiotemporal contiguity as entirely different phenomena given that they are instantiated in ways that, at least on the surface, appear to differ from one another. Once again an analytic-abstractive perspective might suggest that these topographically distinct outcomes may stem from a common functional source (we will return to this issue later in the paper).

Effect-centric and analytic-abstractive approaches also differ in terms of the emphasis that they place on distal regularities and their impact on proximal regularities. Effect-centric approaches often appear to ignore or play down the learning histories that likely gave rise to specific capacities associated closely with the human species, such as complex language skills and advanced reasoning (for related arguments see Fiedler, 2014). Instead, an effect-centric approach typically focuses on those proximal regularities which need to be manipulated in order to produce the desired outcome (e.g., pairing stimuli on a computer screen). Although certain moderators of this effect – such as the context, stimuli, or organism – can also be manipulated, more distal regularities and learning histories are often taken for granted or are entirely ignored. This may result in a disconnect between the past and present events.
For instance, imagine that the word “vomit” is repeatedly paired with a neutral face and a change in liking is observed. Or that the word “loathes” is presented along with pictures of neutral faces and aggressive looking dogs. Researchers may not be interested in the learning history that allows the word “vomit” to function as a US or how the word “loathes” acquires its ability to transform the relation between neutral and valenced stimuli. The fact that such stimuli give rise to and influence changes in liking is enough for the researcher to achieve their scientific goal of predicting EC effects. Yet, as outlined above, such a stance is unacceptable for researchers interested in achieving both prediction-and-influence over behavior. It is only by identifying the learning processes that underpin the very abilities to engage in complex language and reasoning that we may (ultimately) influence the above changes in liking in a sophisticated manner. Therefore an analytic-abstractive approach emphasizes the importance of an individual’s learning history (or distal regularities) and the impact they have on proximal regularities.

The take-home message here is that adopting an effect-centric versus analytic-abstractive functional approach will have important consequences for the researcher, and by implication, the perceived value they place on a functional level of analysis. Those who adopt an effect-centric position will tend to generate ever-growing piles of behavioral effects that are typically described in purely topographical terms or linked narrowly to specific procedures in the current context (see Meiser, 2011). This in turn may lead to the view that many effects are unrelated to or somehow independent from others, and as a result commonalities between different outcomes are unlikely to be identified. The end result of a purely effect-centric functional approach is a deeply unsatisfying one: an accumulating set of behavioral outcomes that lack a conceptual framework capable of systematizing such outcomes or making novel predictions. In contrast, the analytic-abstractive approach adopted within CBS offers a more sophisticated functional perspective: one capable of identifying functional relations between the environment and behavior and then abstracting those relations into overarching behavioral principles and theories that apply to both specific (precision) and general (scope) behaviors. In other words, the functional analytic-abstractive approach is focused on organizing and systematizing seemingly unrelated findings in the immediate service of increasing prediction-and-influence over the phenomena of interest. And it attempts to do this by drawing attention to the dynamic interplay between distal and proximal regularities in the environment.

Strangely, this distinction between effect-centric and analytic-abstractive approaches has rarely, if ever, been made in psychological science or within the EC literature (see Barnes-Holmes & Hussey, 2016).
Instead, the prevailing norm is to treat the functional level as synonymous with the former rather than the latter perspective (e.g., Proctor & Urcuioli, 2016). It therefore comes as no surprise that many cognitive and social psychologists consider the functional (effect-centric) level as unsustainable by itself because it leads to a proliferation of behavioral effects in the absence of any theoretical framework capable of organizing those findings or making novel predictions (Meiser, 2011). We agree. It also comes as no surprise that they choose to devise frameworks to systematize findings and to do so at the mental level of analysis given their own scientific goals, values, and assumptions. This is not a problem. Yet to equate the functional level with an effect-centric approach is a mistake. The analytic-abstractive position outlined here and adopted within CBS circumvents many of the perceived issues with an effect-centric approach and offers new insight into much of human psychological life (for recent reviews see Hughes & Barnes-Holmes, 2016a, 2016b). In what follows we showcase how an analytic-abstractive theory known as Relational Frame Theory has done just that. Over the last 25 years this account of human language and cognition has sparked a flurry of conceptual and empirical work and has come to dominate a considerable portion of the CBS literature. In Part III, we outline the core features of this analytic-abstractive theory and then in Part IV examine how it can be used to accelerate empirical and theoretical developments in the domain of EC.

Part III: Relational Frame Theory

At its core, RFT argues that organisms throughout the animal kingdom can learn in many different ways, from the pairing of stimuli (classical conditioning) to the relationship that exists between particular responses and their consequences (operant learning). However, early on in their lives humans develop an “advanced” type of operant behavior known as arbitrarily applicable relational responding (AARR). RFT argues that this type of behavior is something special, the basic “building block” from which human language and cognition spring forth. Put simply, “relational responding” refers to a type of learned operant behavior that involves responding to one event in terms of another. Although humans and nonhumans can both respond to the relationship between stimuli and events, the former quickly develop a more complex type of behavior (AARR) that fundamentally alters how they interact with the world around them. In the following section we examine how RFT carves this type of operant behavior into two different varieties (nonarbitrarily and arbitrarily applicable).

Nonarbitrarily Applicable Relational Responding (NAARR)

Mammals, birds, fish, and insects can all be trained to respond to the relationship between stimuli in the environment. However, for many different species these relational responses appear to be characterized by two key properties: (1) they are rooted in a prior history of direct experience and (2) they are defined by the physical features of the to-be-related stimuli (Giurfa, Zhang, Jennett, Menzel, & Srinivasan, 2001; Harmon, Strong, & Pasnak, 1982; Reese, 1968; Vaughan, 1988). RFT refers to this type of behavior as an instance of nonarbitrarily applicable relational responding (or NAARR) because the organism is relating stimuli based on their formal or physical properties. Properties such as color, shape, quantity, and size are considered “nonarbitrary” because they are based on the physical characteristics of the stimuli, unlike arbitrary or arbitrarily applicable properties, which are largely determined by social convention (e.g., the meaning of letter strings such as “honesty,” “beauty,” or “freedom”).

To illustrate the concept of NAARR more clearly, imagine that a pigeon is exposed to a simple learning task in which a sample stimulus (e.g., a red circle) is presented at the top of a computer screen and two comparison stimuli (e.g., a red and a green circle) are presented at the bottom of the screen. On trials where a red circle serves as a sample stimulus, selecting the red circle from the available comparisons is reinforced. Whenever a green circle is the sample, selecting the green circle from the available comparisons is reinforced. Training continues in this way across a whole host of different colors and shapes. Once the bird is consistently correct across a large number of trials it is then presented with entirely novel stimuli (the matching of which was never directly reinforced in the past). Research suggests that the pigeon will continue to select a shape or color from the bottom of the screen that is physically identical to a shape or color at the top of the screen even when that particular response was never previously reinforced (Frank & Wasserman, 2005). Now consider a series of studies wherein adult rhesus monkeys (Harmon et al., 1982) or marmosets (Yamazaki, Saiki, Inada, Iriki, & Watanabe, 2014) were trained to select the taller of two items which differed only in terms of their respective height. When subsequently presented with a previously “correct” item (i.e., a stimulus that was taller than its comparison stimulus) as well as a novel item that was even taller, monkeys consistently selected the latter item despite the fact that selecting the former was reinforced at an earlier point in time.
These studies, in addition to many others, suggest that animals can respond to the nonarbitrary (i.e., physical) relationship that exists between stimuli. In the above examples, pigeons related colored shapes based on their physical similarity to one another, while rhesus monkeys responded to the comparative relationship between items that differed in their respective height.
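The two nonarbitrary relations described above (physical identity and relative height) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the `Stimulus` class and both functions are hypothetical names introduced here, and the response rule is hard-coded, whereas the animals in the cited studies acquired it through reinforcement. The point it makes is that both rules are computed purely from physical properties, so they apply unchanged to stimuli that were never encountered before.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Stimulus:
    """A stimulus described only by physical (nonarbitrary) properties."""
    color: str
    shape: str
    height: float


def match_to_sample(sample: Stimulus, comparisons: Sequence[Stimulus]) -> Optional[Stimulus]:
    """Identity matching: pick the comparison with the same color and shape as the sample."""
    for comparison in comparisons:
        if (comparison.color, comparison.shape) == (sample.color, sample.shape):
            return comparison
    return None


def select_taller(first: Stimulus, second: Stimulus) -> Stimulus:
    """Comparative responding: pick whichever item is physically taller."""
    return first if first.height > second.height else second


# Novel stimuli that never appeared during "training" (hypothetical values).
purple_star = Stimulus("purple", "star", 1.0)
orange_star = Stimulus("orange", "star", 1.0)
chosen = match_to_sample(purple_star, [orange_star, purple_star])

# A previously "correct" item loses to a novel item that is even taller.
previously_correct = Stimulus("grey", "block", 10.0)
novel_taller = Stimulus("grey", "block", 15.0)
winner = select_taller(previously_correct, novel_taller)
```

Because neither function refers to any property beyond color, shape, and height, the sketch cannot represent relations set by social convention, which is the distinction the next section draws.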

Arbitrarily Applicable Relational Responding (AARR)

RFT argues that while humans and nonhumans can both demonstrate NAARR, the former also display all the hallmarks of a more advanced type of relational behavior known as AARR. This behavior is assumed to arise from a history of generalized operant learning early on in our development and is characterized by three core properties known as mutual entailment, combinatorial entailment, and the transformation of function (see Hughes & Barnes-Holmes, 2016a).

Imagine, for example, that a child learns that a picture of a dog (A) is the same as the word DOG (B), and that the word DOG is related to the sound D-O-G (C). The first property of AARR (mutual entailment) refers to the fact that when the child is taught that a picture of a dog (A) is the same as the word DOG (B) she will subsequently act as if the word (B) is the same as the picture (A) without any explicit training or instruction to do so. The second property of AARR (combinatorial entailment) refers to the additional stimulus relations that tend to emerge between two or more mutually entailed stimuli. Imagine that a child is shown three coins from a foreign currency of which she has no prior experience, and is told that “Coin A is worth less than Coin B” and “Coin B is worth less than Coin C”. It is now likely that she will act in a number of novel and untrained ways that do not depend on the physical properties of those stimuli (e.g., she will act as if Coin A is worth less than C; that Coin C is worth more than A; that Coin B is worth more than A and that Coin C is worth more than B). The third and final property of AARR (transformation of function) refers to the finding that once stimuli have been related to one another, the psychological properties of those stimuli may be changed in accordance with how they were related. For instance, imagine that you are told that four previously unknown brand products (A, B, C, D) are all basically the same, and that A is then given a positive valence through repeatedly pairing it with positively valenced images. Not only will A acquire a positive valence but B, C, and D will as well, despite the fact that the latter were never directly paired with positive images and were all physically dissimilar to one another.
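The coin and brand examples can be made concrete with a small Python sketch. It is an illustration only, not a model proposed by RFT: the triple encoding, the `CONVERSE` table, and the functions `derive` and `transform_valence` are all hypothetical, and encoding “all basically the same” as a chain of three sameness relations is an assumption made here. The sketch shows how two trained relations yield four untrained ones, and how a valence given to A alone spreads through sameness relations.

```python
from itertools import product

# Hypothetical encoding: a relation is a (stimulus, relation, stimulus) triple.
CONVERSE = {"same_as": "same_as", "less_than": "more_than", "more_than": "less_than"}


def derive(trained):
    """Return the trained relations plus mutually and combinatorially entailed ones."""
    derived = set(trained)
    # Mutual entailment: "A rel B" implies "B converse(rel) A".
    for a, rel, b in trained:
        derived.add((b, CONVERSE[rel], a))
    # Combinatorial entailment: chain relations of the same kind
    # ("A rel B" and "B rel C" give "A rel C") until nothing new appears.
    changed = True
    while changed:
        changed = False
        for (a, r1, b), (c, r2, d) in product(list(derived), repeat=2):
            if r1 == r2 and b == c and a != d and (a, r1, d) not in derived:
                derived.add((a, r1, d))
                changed = True
    return derived


def transform_valence(relations, valence):
    """Copy a directly established valence to every stimulus related as the same."""
    result = dict(valence)
    for a, rel, b in relations:
        if rel == "same_as" and a in valence:
            result.setdefault(b, valence[a])
    return result


# Coins: only "A less than B" and "B less than C" are trained.
coins = derive({("A", "less_than", "B"), ("B", "less_than", "C")})

# Brands: A, B, C, D are related as the same; only A is paired with positive images.
brands = derive({("A", "same_as", "B"), ("B", "same_as", "C"), ("C", "same_as", "D")})
brand_valence = transform_valence(brands, {"A": "positive"})
```

Here `coins` contains the two trained relations plus the four derived ones listed in the text, and `brand_valence` assigns a positive valence to all four brands. The copying rule in `transform_valence` is deliberately limited to sameness relations; other kinds of relation would call for a different transformation rule.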
Critically, this transformation of function depends on the relationship established between or among stimuli: if two neutral images (A and B) are related as opposite to one another, and A is then paired with shock, the fear arousing properties of A will not necessarily transfer to B. Rather the emotional properties of B may come to be transformed in line with the nature of the stimulus relation; that is, B may elicit relaxation instead of a fear response (see Whelan & Barnes-Holmes, 2004).

AARR Is Contextually Controlled

If humans have the capacity to relate stimuli arbitrarily in increasingly complex ways, one might ask why this ability does not lead to complete and utter chaos. For example, why do people not try to eat the word “apple,” lick the words “ice cream” off a page, or even swat the word “fly” from a book? RFT proposes that the way in which people relate stimuli (and transform functions through those relations) is under the control of stimuli in the environment known as contextual cues. While certain types of contextual cues specify how stimuli are related (e.g., “A same as B” or “A prevents B”), others specify the psychological properties that are transformed through those relations (e.g., “A tastes disgusting” or “B feels soft”). RFT researchers usually refer to the former as “relational cues” given that they specify how stimuli and events should be related. These cues can be used to relate stimuli in a near infinite number of ways, from relations based on equivalence or similarity (e.g., “Hond is the same as dog”; Cahill et al., 2007) to those based on opposition (e.g., “Good is the opposite of evil”; Dymond, Roche, Forsyth, Whelan, & Rhoden, 2008), hierarchy (“Cat is a type of mammal”; Gil, Luciano, Ruiz, & Valdivia-Salas, 2012), comparison (“Fruit is better than candy”; Vitale, Barnes-Holmes, Barnes-Holmes, & Campbell, 2008), deictics (“I am not you”; McHugh & Stewart, 2012), temporality (“March comes before May”; O’Hora et al., 2008), and causality (“If X then Y”). Importantly, these relational cues need not always be words; other properties of the environment such as sounds, symbols, shapes, background color (e.g., Gawronski et al., 2010), or verbal rules (e.g., Zanon, De Houwer, & Gast, 2012) may also function in a similar capacity.

At the same time, responding can also be controlled by “functional cues” in the environment that specify the type of psychological properties that are transformed in accordance with stimulus relations. For example, the verbal stimulus “ice cream” could in principle evoke many of the psychological properties of actual ice cream (such as its taste, smell, appearance, or its coolness) based on the equivalence relation between the word and the food item. If, however, someone asks you to picture ice cream, the visual properties of ice cream would likely predominate. Likewise, in the sentence “imagine what ice cream tastes like,” the expression “tastes like” may serve as a functional cue that is responsible for the fact that only the gustatory (and not other) functions of ice cream predominate.
We now know that functional cues in the environment can influence (1) the ability of a stimulus to signal whether an outcome will occur after a given response (Dougher, Hamilton, Fink, & Harrington, 2007), (2) whether a stimulus will elicit an emotional response (Barnes-Holmes, Barnes-Holmes, Smeets, & Luciano, 2004), (3) whether a stimulus should be approached (Gannon, Roche, Kanter, Forsyth, & Linehan, 2011) or avoided (Roche, Kanter, Brown, Dymond, & Fogarty, 2008), as well as (4) whether a stimulus will elicit a sexual response (Roche, Barnes-Holmes, Smeets, Barnes-Holmes, & McGeady, 2000). Once again, many different stimuli in the environment – above and beyond words – may serve as functional cues.

AARR Is Learned Early on in our Development

RFT argues that AARR is a type of operant behavior that is learned early on in our development through interactions

© 2016 Hogrefe Publishing

S. Hughes et al., The Moderating Impact of Distal Regularities on the Effect of Stimulus Pairings

with the socio-verbal community. To illustrate, imagine that you are attempting to teach your infant son how to name a number of objects around the house. You will likely begin by pointing at an item (e.g., a toy bear), uttering its name in the presence of your son, and then encouraging any orientating response that he makes toward the item (i.e., hear the word bear → look at the bear). At the same time, you will also present the item to your son and then display or encourage appropriate responses (see the bear → say the word bear). Both of these interactions will take place in the presence of contextual cues – and in natural language interactions these cues typically take the form of questions such as “What is this?” or “What is the name of that?” In the language of RFT, you are directly reinforcing bidirectional responding to an object and its name in the presence of a contextual cue (e.g., look at the bear when hearing “bear” as well as say “bear” when you see the bear). Importantly, this bidirectional training will not stop here: you and others in the social community (teachers, friends, family) will likely engage in the same exercise with your son across a vast spectrum of different objects, from toys (“where is your bike”), to people (“who is that”), food (“this is an apple”), properties of the environment (“that is called the sun ... what is that called?”), and do so in a wide variety of different contexts: at the park, home, shopping mall, school, and so on. Although the particular stimuli, people, and contexts change across time, the functional relation between those stimuli is always held constant: the child’s relational responding is reinforced in both directions and in the presence of arbitrary contextual cues.
Hence, through multiple exemplar training, your son learns a type of generalized bidirectional responding which no longer depends on the physical features of the stimuli involved and that leads to mutually entailed responses being emitted in the absence of direct reinforcement. Now when you present him with a novel object and a word he has never encountered before (e.g., a laptop and its name) he will respond in a bidirectional manner without any reinforcement for doing so (e.g., he will point to the laptop when asked “where is the laptop” and answer with “laptop” when asked “what is this”). According to RFT, this bidirectional relation between an object and word represents an instance of mutual entailment in which stimuli are related on the basis of their arbitrary “similarity” to one another (i.e., your son has been taught to treat a word and its referent as functionally similar in certain contexts). After more training in which a series of stimuli are related (e.g., a teddy bear being related to the sound “teddy bear” which is afterwards related to the written words “teddy

bear”), other patterns of behavior such as combinatorial entailment emerge (e.g., look at the teddy bear when seeing the written words “teddy bear”).3

Why Is AARR so Important?

Over the past 40 years AARR has captured the imagination of functional researchers due to its symbolic, flexible, and generative properties. From the beginning they realized that this type of behavior was inherently generative. Providing humans with a small set of direct experiences consistently causes them to act as if those stimuli are related to one another in a staggering number of novel and untrained ways. For instance, after learning that the written word “poison” (A) is the same as a picture containing an unknown symbol (B), and that the latter is the same as the spoken word “G-I-F” (C), people will likely avoid consuming any items that contain images of B or that are labeled “G-I-F.” In other words, they will spontaneously act as if the above stimuli are related in four new ways based on two directly trained relations (i.e., that C is the same as A; A the same as C; B the same as A and C the same as B). When subsequently taught that three new stimuli are to be treated as equivalent to one another (X-Same-Y; Y-Same-Z) people will act as if these stimuli are also related in four untrained ways, and when the first equivalence relation is related to the second equivalence relation 16 additional relations will arise. Indeed, this exponential increase in the number of untrained relations continues to grow as more and more stimuli are related, so that by the time eight stimulus relations are trained people can – in principle – act as if those stimuli are related in several thousand untrained ways. Thus AARR represents a type of behavior that rapidly accelerates learning as more and more stimuli are related.

A second reason why AARR has attracted so much attention is that it equips humans with an unparalleled degree of flexibility when interacting with the world around them.
Once a person has learned how to respond in an arbitrarily applicable fashion they can relate any stimulus to any other stimulus in a near infinite number of ways. They can relate stimuli with no physical resemblance (like spoken words, written words, and pictures) and these relations can come to control how they subsequently respond. People can also act as if stimuli have acquired, changed, or lost their psychological properties without the need to directly contact contingencies in the environment. To illustrate, suppose that a person learns that a novel item (A) is less than a second item (B) and that B is less than a third item (C). Thereafter, B is repeatedly paired with electrical shocks. Evidence indicates that people will display greater fear

3 For communicative purposes we have presented a simplified version of the origins, properties, and implications of AARR for psychological science. For a far more nuanced perspective we encourage readers to consider Hayes et al. (2001); Törneke (2010); or Hughes and Barnes-Holmes (2016a, 2016b).


toward C than B and more fear to B than A, despite the fact that C and A were never paired with shock and that none of the stimuli share any physical similarity (Dougher et al., 2007). Functionally speaking, it seems unlikely that this behavior is a simple case of (nonarbitrary) relational learning given that the three items did not differ along any relevant physical dimension, such as size. At the same time, it does not appear to be an instance of second- or higher-order Pavlovian conditioning because the fear functions of the A and C stimuli differed substantively from that of the B stimulus. Thus it appears that when organisms learn to relate events in arbitrarily applicable ways, the possibilities of manipulating and changing the world are dramatically increased.

Summary

In short, AARR unshackles humans from the need to learn via direct experiences and rapidly accelerates their ability to flexibly adapt to the world around them. Over the past several decades functional researchers have explored, both conceptually and empirically, the histories of learning that give rise to AARR and how they appear to be critical to the development of complex human behaviors, including the ability to engage in complex language and reasoning (Hayes et al., 2001) and to develop a rich and elaborate sense of self as well as advanced perspective-taking capacities (McHugh & Stewart, 2012). AARR is also argued to play a key role in implicit and explicit cognition (Hughes, Barnes-Holmes, & Vahey, 2012), developmental disorders (Rehfeldt & Barnes-Holmes, 2009), psychopathology (Hayes, Levin, Plumb-Vilardaga, Villatte, & Pistorello, 2013), intelligence, and many other psychological phenomena (for a recent review, see Hughes & Barnes-Holmes, 2016b). In what follows we turn our attention to the domain of evaluative learning and consider how the concept of AARR may help reshape the basic understanding of EC.4
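The generativity argument made earlier in this part can be illustrated with a short sketch. It assumes, purely for illustration, that each trained “same as” relation is coded as an ordered pair of stimulus labels and that a set of such relations behaves like a mathematical equivalence class. Under those assumptions the code lists the relations that are entailed but never directly trained in the “poison” example (A same as B; B same as C). The function name `derived_relations` and the pair coding are hypothetical conveniences, not part of RFT, and the sketch covers only a single “same as” relation; it does not model opposition, comparison, or any other relational cue.

```python
from itertools import permutations


def derived_relations(trained):
    """Return the ordered 'same as' pairs that are entailed but not trained.

    `trained` is a list of (stimulus, stimulus) tuples, each read as
    "first is the same as second". This coding is an illustrative assumption.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the classes of every directly trained pair.
    for a, b in trained:
        parent[find(a)] = find(b)

    # Group the stimuli by the class they ended up in.
    classes = {}
    for stimulus in list(parent):
        classes.setdefault(find(stimulus), []).append(stimulus)

    # Every ordered pair within a class is entailed (mutual and
    # combinatorial entailment); subtract the pairs that were trained.
    entailed = {
        pair
        for members in classes.values()
        for pair in permutations(members, 2)
    }
    return entailed - set(trained)


trained = [("A", "B"), ("B", "C")]
print(sorted(derived_relations(trained)))
```

For the two trained relations of the example, the function returns the four derived relations listed in the text: B same as A, C same as B, A same as C, and C same as A.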

Part IV: EC as an Interaction Between Proximal and Distal Regularities

In this final section of the paper we introduce a novel perspective on EC that draws upon developments within both the EC and RFT literatures. At the core of this account

reside two ideas. The first is that stimulus pairings may constitute a mere proximal cause for changes in liking. In this case, the observed changes in liking are due only to the proximal regularities in how stimuli are presented here and now in space and time. That is, the impact of the stimulus pairings on liking is not moderated by distal experiences. A second possibility is that the impact of proximal regularities (stimulus pairings) on liking may be moderated by other regularities. This moderated learning perspective argues that EC effects depend on more than the mere experience of stimulus pairings and can be moderated by additional factors in the environment. In what follows we briefly discuss these two possibilities. We then introduce a third idea that builds on the second idea, namely that a specific set of distal regularities (i.e., those that give rise to the ability to AARR) are central to many of the effects observed in the EC literature. These regularities transform spatiotemporal contiguity from a mere proximal cause to a proximal cue signaling that the CS and US are related to one another. The specific relationship between the CS and US that pairings signal determines the properties of the changes in liking that are observed. This position corresponds to the analytic-abstractive function approach typically adopted in the CBS literature insofar as different EC effects are viewed as instances of the same behavioral phenomenon (AARR). Once we discuss the above possibilities we then show how an “EC as AARR” account leads to new perspectives on existing EC effects, new perspectives on what constitutes “genuine” EC, clarifies differences between human and nonhuman EC, and provides a novel impetus for the development of cognitive theories of EC.

Evaluative Conditioning as the Product of Proximal Regularities

A first possibility, one that is often entertained in the EC literature, is that stimulus pairings function as a mere proximal cause of liking. The expression “mere proximal cause” entails that stimulus pairings exert their influence on liking solely due to a regularity in how stimuli are presented in space and time. Although the organism may have contacted distal regularities during their past interactions with the environment, these regularities are not assumed to influence the way in which the pairings influence liking. Let us be clear here. Stating that pairings function as a mere

4 We realize that it may be tempting, and entirely possible, to explain the behavioral observations that are typically associated with AARR in terms of mental events. But it is important to realize that the very concept of AARR constitutes a functional analytic-abstractive explanation in terms of overarching patterns of interactions between the behavior of human beings and their physical, social, and verbal environments. When we use terms such as mutual and combinatorial entailment we are referring to the specific ways in which people behave based on a particular history of learning (i.e., the way people act as if stimuli are related in novel and untrained ways). Likewise, terms such as the transformation of function refer to patterns of behavior rather than mental events: once stimuli have been related, people will act as if the psychological properties of those stimuli have changed in some predictable way.


proximal cause of liking certainly allows for a distal history of learning experiences to alter the responses to each of the individual stimuli that are paired. For instance, a distal history of language learning may transform arbitrary lines on a page (e.g., the word “cancer”) into a negatively valenced stimulus that might change the liking of stimuli it is paired with. But from a mere proximal cause perspective, that same history of learning is not assumed to influence the effect of the pairings (i.e., pairings are not thought to be an interpreted event or one that is given semantic meaning by the organism). Thus, to adopt a “pairings as a mere proximal cause” position, one has to assume that when people have acquired verbal skills, they either do not apply those skills when confronted with stimulus pairings, or apply them selectively to the individual stimuli that are paired but not to the fact that stimuli are paired. In what follows we argue that such a position is certainly possible yet highly unlikely, and that distal regularities may not only transform the meaning of individual stimuli but also the function or “meaning” of pairings as a proximal event.

Evaluative Conditioning as an Interaction Between Proximal and Distal Regularities

A second possibility, and one that we alluded to above, is that the impact of proximal stimulus pairings on liking may be moderated by the organism’s previous learning experiences. In line with De Houwer, Barnes-Holmes, et al. (2013) we define moderated learning as the effect of regularities in the environment on how other regularities in the environment influence behavior. When applied to EC, a moderated learning perspective argues that distal regularities may influence the impact that proximal regularities (stimulus pairings) have on liking. They may do so in two ways, by either influencing: (1) the way in which organisms respond to the individual stimuli that are paired and/or (2) the way in which organisms respond to pairings as a proximal event. Note that at this stage, a “moderated learning” perspective of EC does not commit to any specific theory about the development, nature, representation, or deployment of those distal learning experiences that give rise to verbal skills or language. Rather it merely appeals to the idea that humans have encountered distal regularities in the past, and as a result of these interactions with the environment, they possess and can utilize verbal skills in a flexible manner. For instance, our arguments do not depend on whether language necessarily involves propositional representations (mental level) or is an instance of arbitrarily applicable relational responding (AARRing; functional level). When we say that verbal skills influence how people respond to stimuli or pairings as a proximal event, we simply mean that EC effects involving those stimuli and


pairings would either not occur or would not have the same properties for organisms that have not acquired the ability to use language (De Houwer & Hughes, in press). Thus a moderated learning perspective on EC is – at least in the first instance – situated at a very abstract level of analysis (see De Houwer & Moors, 2015). Let us now consider just one possible way in which this moderated learning perspective could be further specified.

Evaluative Conditioning as an Instance of AARR

Earlier in this paper we described empirical (AARR) and theoretical (RFT) developments that are currently shaping the functional (analytic-abstractive) approach to human language and cognition. Interestingly, when these developments are combined with the aforementioned moderated learning account of EC, an entirely novel perspective emerges – one in which distal regularities that give rise to AARR moderate the impact of stimulus pairings on liking. This perspective on EC is based on the idea that a particular set of regularities (i.e., those that give rise to the ability to AARR) moderates the impact of stimulus pairings on liking. According to this perspective, when participants enter the experimental context they bring with them a long history of relational learning that influences how they respond to stimuli that are paired in space and time. This arbitrarily applicable relational responding is not “switched off,” discarded, or somehow “set aside” when one encounters such pairings but directly informs and dictates how individuals respond to those regularities. Put simply, once people have learned how to AARR, pairings as a proximal event may be transformed from a mere proximal cause to a proximal (contextual) cue signaling that stimuli are related in some way. The relational “meaning” or function of stimulus pairings will determine the manner in which the CS and US are related, and by implication, the nature of the change in liking observed. This account of EC is analytic-abstractive insofar as different EC effects are viewed as instances of the same underlying behavioral phenomenon (AARR). Critically, however, EC effects differ from other instances of AARR in that the distal learning history that gives rise to AARR is brought to bear on one particular proximal regularity in the environment, namely the pairing of stimuli.
It might well be that other types of evaluative learning, such as those involved in mere exposure or approach/avoidance, also qualify as instances of AARR. However, in these cases, the learning history that gives rise to AARR is brought to bear on other proximal regularities (e.g., the repeated presentation of a stimulus or the relation between behavior and its consequences) which in turn leads to changes in liking.


The claim that a particular EC effect is an instance of AARR implies that this EC effect has core functional properties in common with other instances of AARR. More specifically, it implies that (1) the EC effect depends on the same distal learning history that allows for other instances of AARR, (2) manipulations of relational and functional cues should have an impact on that effect, and (3) that changes in liking should coincide with other changes in behavior that are instances of AARR. The first implication is difficult (although not impossible) to test empirically because it requires a manipulation of learning experiences during childhood (e.g., Luciano, Becerra, & Valverde, 2007). The second implication can more readily be tested by manipulating the nature or presence of relational and functional cues. The third implication can also be tested. If the change in liking of the CS due to CS-US pairings is an instance of AARR, then the change in liking should be just one element of acting as if the CS and US are related in a certain way. Hence, given appropriate contextual cues, the change in liking should coincide with other changes in behavior that are indicative of AARR, such as the ability to pick the US from a range of stimuli when presented with the CS as a sample.5 In what follows, we evaluate whether there are logical and empirical arguments for the claim that many EC (and related) effects are instances of AARR and highlight the various contributions that this perspective offers. We first examine EC effects that are based on the pairing of stimuli in the presence or absence of other stimuli. We then examine so-called EC effects that have been established or manipulated via instruction, inference, and observation. Doing so will demonstrate how topographically distinct behavioral effects may actually be instances of the same underlying functional phenomenon (AARR). 
Contribution 1: AARR Allows for a Sophisticated Functional Description of EC Effects

Throughout much of the past three decades EC researchers have conceptualized “pairings” as referring to the direct experience of a CS and US in spatiotemporal contiguity. A subset of these studies has also examined the impact of relational “qualifiers” such as words or sentences on pairings given that these qualifiers provide additional information about the nature of the CS-US relation. In this section, we first consider the possibility that EC effects observed in the latter type of studies qualify as instances of AARR. We then examine whether EC effects in studies without explicit relational cues also qualify as instances of AARR.

Evaluative Conditioning Effects With Explicit Relational Cues

As we outlined above, two different types of contextual cues can moderate AARR by either (1) indicating the way in which events are related (relational cues) or (2) determining the functions that transfer between related stimuli (functional cues). If EC effects are an instance of AARR then these cues should also moderate those EC effects. This idea provides a functional perspective on recent studies about the impact of relational information on EC. To illustrate, take the work of Förderer and Unkelbach (2012) who found that presenting CS and positively valenced US images in the presence of a relational cue “loves” resulted in a standard EC effect. However, when those same stimuli were related in the presence of a different cue (“loathes”) the effect was completely reversed (see also Förderer & Unkelbach, 2011; Peters & Gawronski, 2011; Zanon, De Houwer, Gast, & Smith, 2014). Interpreting these “relational EC effects” as instances of AARR provides a novel perspective on these phenomena. According to RFT, words such as “goes with,” “loves,” “hates,” and “is opposite to” may function as contextual cues which moderate the impact of CS-US pairings on liking, much like the relational cues we described in Part III moderate other instances of AARR. Contextual cues can have this moderating effect because of a protracted history of operant learning that begins in infancy and continues through the lifetime of the individual (see Hughes & Barnes-Holmes, 2016a).

Evaluative Conditioning Effects in the Absence of Explicit Relational Cues

As we also noted above, words are not the only stimuli that can function as relational cues. Physical objects (e.g., a red traffic light), signs (e.g., mathematical or musical notation), gestures (e.g., a raised eyebrow during a job interview), or any other stimulus can signal that people should respond to other stimuli as being related in a particular way.
According to this perspective, even spatiotemporal contiguity might constitute a proximal (relational) cue, signaling that stimuli which are paired in space and time are related to one another (e.g., Leader, Barnes, & Smeets, 1996; Zanon et al., 2014). This certainly seems plausible given that contiguity is a diffuse and ubiquitous property of the environment that is present in the “background” of much of everyday life: it is present between the flick of a switch and the subsequent presence of a light, or between a loud bang and a broken window. It is also present when an unknown stimulus (novel brand product) appears with a positively

5 The relation between EC and AARR should now be clear: There might be instances of EC that are not instances of AARR (i.e., those instances in which stimulus pairings function as a mere proximal cause), there might be instances of EC that are instances of AARR (i.e., those instances in which stimulus pairings function as a proximal cue), and there are many instances of AARR that do not qualify as instances of EC (i.e., those instances that do not involve changes in liking or that involve proximal cues other than stimulus pairings).


or negatively valenced image, or when a neutral face is paired with an evaluative statement. Stated more precisely, humans encounter a vast array of stimuli that co-occur across a multitude of contexts throughout their entire lives. While the stimuli involved in these various learning experiences constantly differ, the functional relation between stimuli is held constant (i.e., stimuli that co-occur tend to be similar in some regard). Given a sufficient number of exemplars, irrelevant factors such as what the stimuli look, smell, taste, or feel like may be “washed out” and the functional relation itself “abstracted” so that it becomes a proximal (contextual) cue for relating stimuli in the future. When participants encounter novel stimuli under similar environmental conditions (as they do in an EC experiment), spatiotemporal contiguity may be used as it has in the past – as a relational cue indicating that stimuli are similar or equivalent to one another. Once stimuli are related in this way their evaluative properties may change in line with the nature of the relation: in this case, a transfer of evaluative functions from the US to the CS. This functional analysis can be tested empirically. For example, if spatiotemporal contiguity is a proximal (relational) cue as we suggest, then manipulating its relational function should moderate the impact that stimulus pairings have on liking. Within the context of an experiment, it may be possible to establish contiguity as a cue meaning that stimuli should be related not as equivalent but as “opposite” to one another. This could be achieved by intermixing the CS-US pairs with a large number of filler trials on which two stimuli with an opposite meaning are presented (e.g., pictures of night and day, hot and cold, black and white stimuli).
Following such training, the CS might acquire a valence opposite to the valence of the US with which it was paired given that previously contiguous stimuli were opposite to one another (see Zanon et al., 2014, Experiment 3, for an alternative strategy to assess the impact of spatiotemporal contiguity as a relational cue). In short, we argue that contextual cues present in the proximal environment specify the manner in which stimuli are related and that these cues can either be obvious (e.g., words or symbols) or subtle (e.g., spatiotemporal contiguity). Even the pairing of stimuli in space and time can function as a proximal cue signaling that those events are related in a particular manner. For humans with a history of AARRing, the “default” option may be to treat objects and events as similar to one another whenever they are simply presented together in space and time (e.g., Leader & Barnes-Holmes, 2001; Smyth et al., 2006). Yet when the relational properties of pairings as a cue are changed, or when people are exposed to other relational cues specifying alternative relationships between stimuli, more complex relations and evaluative responses tend to emerge. The core message here is that humans may not only give

meaning to, or semantically interpret, the individual stimuli that are paired in EC studies, but also give meaning to or interpret the pairings themselves. EC effects are not solely determined by stimulus pairings but also by the way in which people respond relationally to those pairings.

Instructed, Observed, and Inference-Based EC Effects as Instances of AARR

Evaluative Conditioning effects are typically defined as changes in liking due to the pairing of stimuli. A number of researchers have sought to pair a CS with a US, not via repeated co-occurrence, but rather via instructions, inferences, and observation. Once again, we believe that these learning effects also represent instances of AARR. For the time being, we will not address the question of whether these effects qualify as “genuine” instances of EC. This question will be discussed in detail below. For now, it suffices to say that some have argued that EC can be brought about via instructions, inferences, and observation. We now discuss how these phenomena relate to more traditional instances of EC from the perspective of RFT.

Evaluative Conditioning via Instructions

According to RFT, instructions represent complex stimulus relations that serve to modify the psychological properties of stimuli in those relations (e.g., O’Hora, Barnes-Holmes, Roche, & Smeets, 2004; Törneke, Luciano, & Valdivia Salas, 2008). Put simply, instructions are composed of arbitrary stimuli (words) that typically “stand for” other stimuli in the environment. For instance, the statement “John helps his landlady with her trash” does not involve the simple physical pairing of John with trashbags or his landlady. Rather it involves two types of stimulus relating. On the one hand, the words “John” and “landlady” have previously been related as equivalent to John and his landlady, respectively.
On the other hand, these words are related via contextual cues contained in the statement itself (in this instance, words such as “helps” and “with”), which lead to stimuli acquiring new psychological properties or changing their existing ones (in this case, it is John who becomes the helper and the landlady who becomes the person who is helped). As an empirical example, consider a recent study by Gast and De Houwer (2013) who observed a change in liking toward a CS (toothpaste) when participants were provided with the following instruction: “If you see an image of toothpaste then a positive photo will appear.” For participants with a history of AARRing the word “toothpaste” is typically treated as equivalent to actual toothpaste while the words “positive photo” are treated as equivalent to previously encountered positive stimuli. At the same time, temporal and causal cues such as “If” and “Then” specify the order of events and their contiguous relationship to one another (i.e., that positive images only follow those of


toothpaste and not other stimuli). By specifying this spatiotemporal relation the aforementioned cues lead to a transfer of evaluative properties from one stimulus (US) to another (CS; see also De Houwer, 2006; Gast & De Houwer, 2013; Raes, De Houwer, De Schryver, Brass, & Kalisch, 2014; Zanon et al., 2014). In other words, instruction-following represents a complex type of AARRing that can lead to rapid changes in existing and novel preferences for objects and events. The relational cues that comprise instructions are able to specify that stimuli were related in the past (e.g., “CS caused cancer”), are currently related in the present (“CS is causing cancer”), or will be related at some point in the future (“CS will cause cancer”) without the need for people to directly experience those events or even encounter the stimuli that they refer to. It should be noted that RFT also views changes in liking due to other types of verbal information, like statements (“Mike cheated during a poker game”; Mann & Ferguson, 2015; Peters & Gawronski, 2011), stories (“Luppites are savage, ruthless and brutal”; Gregg et al., 2006), relational qualifiers (“CS loathes cute kittens”; Förderer & Unkelbach, 2012), and persuasive messages (“Jonathan is a Yale-educated chemist who works for his state’s Department of Water Resources”; Smith, De Houwer, & Nosek, 2013) as instances of AARRing as well. In each case, arbitrary stimuli (words) are related to one another via relational cues, and specific psychological functions are occasioned by functional cues, such that the psychological properties of words in those relations (and the objects that they refer to) are modified as a result (for a detailed RFT account of instruction- or rule-following see O’Hora & Barnes-Holmes, 2004). As we noted previously, in defining EC as an instance of AARR we do not argue that all instances of EC are necessarily the same or somehow identical.
Rather, the aforementioned EC effects appear to differ in terms of the proximal regularity that gives rise to AARR. We unpack this issue in greater detail below.

Evaluative Conditioning via Inferences

Recently, several researchers have argued that EC effects may emerge on the basis of inferences about (rather than direct experience with) CS-US pairings. For instance, Gast and De Houwer (2012) exposed participants to an EC procedure in which a positive picture (USpos) was directly paired with a gray square containing the number 1 (CS1) while a negative picture (USneg) was paired with a gray square containing the number 2 (CS2). Following training, participants were informed that a neutral image (CS3) was always hidden behind the first gray square while another neutral image (CS4) was hidden behind the second square. Results indicated that automatic and self-reported liking was more positive for CS3 than for CS4 despite the

Experimental Psychology 2016; Vol. 63(1):20–44

fact that neither CS was ever related with any US during training or instructions. These findings bear a striking similarity to the aforementioned research on AARR. In particular, they mirror the fact that when humans encounter a small set of directly trained relations between stimuli, they often act as if those stimuli are related in novel and untrained ways. In the abovementioned study, for example, the emergence of inferred EC effects toward CS3 and CS4 likely reflected, from an RFT perspective, the formation of two equivalence relations comprised of valenced stimuli, numbered squares, and neutral images (i.e., USpos–CS1–CS3 and USneg–CS2–CS4). Specifically, direct experience with one relation (e.g., USpos–CS1) followed by instructions about another (e.g., CS1–CS3) may have allowed participants to act as if those stimuli were related in untrained ways (e.g., USpos–CS3 and USneg–CS4). Once stimuli were related as equivalent to one another, given appropriate contextual cues, a transfer of (evaluative) functions from one stimulus (USpos) to another (CS3) may have taken place, explaining the observed pattern of automatic and self-reported responding. Thus, from an RFT perspective, when researchers use terms such as “inferences,” “reasoning,” and “deduction” in humans they refer to the generative properties of AARRing – namely – the ability to relate stimuli and events to one another in untrained and predictable ways (for a more detailed treatment see Barnes-Holmes, Keane, Barnes-Holmes, & Smeets, 2000; Dack, Reed, & McHugh, 2010; Dougher et al., 2007; Dymond et al., 2008; Hayes et al., 2001; Valdivia-Salas et al., 2013).

Evaluative Conditioning via Observation

Finally, changes in liking due to the pairing of stimuli can not only take place via spatiotemporal contiguity, verbal instructions, or inference, but can also stem from observing other people interact with the world.
Observational learning refers to the capacity to learn how stimuli are related to one another by observing others when they are in contact with the environment. This type of learning allows organisms to change their existing (or acquire novel) behaviors without the need to ever directly encounter the aversive or appetitive consequences of their behavior for themselves (see Bandura, 1965; Zentall, 2012). For example, exposing rhesus monkeys to a real or videotaped counterpart reacting with fear toward a snake causes the observer to respond fearfully in the presence of, and attempt to avoid contact with, that same snake (Mineka & Cook, 1988). Similarly, children who observe their mother reacting with distress when she places her hand in cold water subsequently display a lower pain threshold when they have to place their own hand in water (Goodman & McGrath, 2003). When applied to EC, observational learning refers to changes in liking that are due to the organism observing another


organism interacting with the environment. For instance, watching another individual consume a drink (CS) and seeing them facially express a like or dislike for the drink (US) can cause the observer to produce a corresponding evaluative response toward that beverage (e.g., Baeyens et al., 2001; Baeyens, Kaes, Eelen, & Silverans, 1996). Although nonhumans can modify their behavior based on imitation, modeling, or social learning (Zentall, 2012), AARR may influence how humans learn via observation in two important ways. On the one hand, observing events in the environment may alter not only how people respond to those events but also impact how they respond to other related stimuli. For instance, suppose that a human watches a model bite into a large, sour lemon. Thereafter, the model emits a number of negative evaluative responses (e.g., displays signs of physical discomfort), and as a result, the observer avoids tasting lemons. Now suppose that the observer learns (either through training or instruction) that a novel brand product (Ettalas) tastes similar to lemons while a second product (Gageleer) is opposite to Ettalas. People may automatically display and self-report a preference for Gageleer relative to Ettalas and even opt to physically approach and consume the former relative to the latter – despite having never encountered either stimulus in the past. In other words, observing others interact with regularities in the environment may not only alter the psychological properties of those stimuli but also modify the properties of other related stimuli as well. The manner in which these properties are established or altered will once again depend on the relational and functional cues that control how stimuli are related. On the other hand, AARR may also influence how people respond to the observed event itself. Take the above example of someone biting a sour lemon. 
Imagine that negative facial expressions are related with sour tastes and sour tastes with positive outcomes (e.g., by telling people that “items which taste sour may appear to be unpleasant but they are actually extremely healthy for you” or “If you eat the lemon now I will give you a large sum of money later on”). Now suppose that an individual is exposed to the same scenario as before in which a model bites into a lemon and emits a negative reaction. Rather than attempting to avoid any contact with lemons, they may respond positively toward that stimulus in a variety of ways.

Conclusion

In short, there are good logical and empirical arguments to assume that different effects that have been reported in the EC literature may qualify as instances of AARR. Growing evidence indicates that once humans have learned to AARR they respond relationally to stimuli that are paired together and that this relation is (1) under the control of contextual cues in the environment, and that (2) these cues determine the magnitude, direction, and nature of the change in

© 2016 Hogrefe Publishing

liking. Contextual cues can vary in their complexity, from simple words and symbols to relatively complex statements and instructions. They can either be encountered through the individual’s interactions with the environment or from observing the behavior of others in similar situations. Once people have the ability to AARR any stimulus or event can come to function as a contextual cue – even spatiotemporal contiguity between paired stimuli. Thus defining different EC effects as instances of AARR helps to systematize research (i.e., it has heuristic value). It highlights possible functional similarities between topographically distinct outcomes as well as possible functional differences between EC effects and other evaluative learning effects. It highlights that effects due to contiguity, instructions, observations, and inferences may be different instances of the same underlying phenomenon (AARR). Before we continue, several points are worth noting here. First, our analysis does not imply that all effects that have been referred to as EC effects are identical simply because they are all instances of AARR. As we mentioned above, any given instance of AARR may be distinguished based on the type of proximal regularity that generates a relevant class (or classes) of AARRing (e.g., the pairing of stimuli, verbal instructions about the pairing of stimuli, verbal instructions about the properties of stimuli, etc.). Although all of these instances of AARR are assumed to share some basic properties (e.g., depend on the distal learning history that allows for AARR, be sensitive to the nature and presence of relational and functional cues, coincide with other changes of behavior that are instances of AARR), different types of AARR might also differ in nontrivial ways depending on the proximal regularity that gave rise to AARR. 
For instance, the changes in liking due to contiguous stimulus pairings may differ in important ways from instructions about pairings, observing others interact with stimulus pairings or the pairing of stimuli via inferences. The nature of these differences is an empirical matter. Regardless of the outcome of this future research, it seems likely that different classes of AARRing may result through interaction with different types of proximal regularities in the environment. Second, approaching the EC literature from the perspective of RFT not only has heuristic value (insofar as it identifies commonalities and differences across topographically distinct effects) but also has conceptual value. More specifically, it allows for complex phenomena such as learning via instructions, inferences, and deductions to be defined in purely functional terms. We will return to this point below.

Contribution 2: The Concept of AARR Can Contribute to the Debate on “Genuine” EC

Contribution 1 highlights an important point: the current definition of EC allows for considerable degrees of freedom


when determining what actually qualifies as a stimulus pairing. As we have seen, several authors have interpreted “pairings” strictly as the directly experienced contiguous pairings of “actual” stimuli (Jones et al., 2009; Walther et al., 2011), whereas others have interpreted “pairings” in a more liberal manner, allowing them to be embedded in instructions (De Houwer, 2006; Field, 2006), to emerge via inference (Gast & De Houwer, 2012) or to take place based on observations of other organisms interacting in and with the environment (Baeyens et al., 2001). Thus the extent to which an outcome currently qualifies as a “genuine” or “real” instance of EC is often a matter of debate (see De Houwer & Hughes, in press, and Gast et al., 2012 for more on this issue). We see two ways in which this debate could be resolved. First, EC researchers could restrict the study of EC to those situations in which stimulus pairings function as a mere proximal cause of likes and dislikes. This would require EC effects to meet two strict conditions: (1) that stimulus pairings are the regularity responsible for changes in liking and (2) that those pairings are a mere proximal cause of liking rather than a cue for relational responding. The first condition can easily be tested by eliminating or controlling for other potential proximal regularities (see De Houwer, 2007; De Houwer, 2011b). However, it seems relatively more difficult to determine whether stimulus pairings are functioning as a mere proximal cause of liking or a cue for relational responding. One possible way to test this would be to keep the spatiotemporal properties of pairings constant while varying the relational properties of pairings as a contextual cue (e.g., see Zanon et al., 2012). Note that one cannot settle this issue simply by inspecting the nature of the procedure. 
Even procedures that merely involve the pairing of stimuli and that are devoid of words or sentences might produce changes in liking only because stimulus pairings function as a proximal cue for the relation between those stimuli. A second (and more pragmatic) option would be to restrict the study of EC to situations in which stimulus pairings function as the proximal regularity responsible for changes in liking, regardless of whether those pairings function as a proximal cause of liking or cue for relational responding. This position allows for other (distal and proximal) regularities to influence the function of stimulus pairings while ensuring that pairings are the proximal regularity responsible for the observed change in liking. Put another way, in order for an effect to qualify as an instance of

EC, responding to the CS and US has to be under the control of stimulus pairings even when the function or “meaning” of those pairings has been altered by other regularities or events in the environment (e.g., Zanon et al., 2014). Such an approach would disqualify instructed and many inferred effects as instances of EC, not because they are dependent on distal learning experiences, but because they involve more than an impact of stimulus pairings as a proximal event (for a more detailed treatment of this topic see De Houwer & Hughes, in press).⁶ The take-home message here is that the current definition of EC places few constraints on how researchers interpret the term pairings and this can lead to disagreement about whether an effect actually qualifies as an instance of EC or not. We propose that EC can be delineated from other evaluative learning phenomena in two ways, either restrictively as changes in liking that result from stimulus pairings functioning as a mere proximal cause of liking, or more liberally, as changes in liking that result from stimulus pairings, regardless of whether those pairings function as a mere proximal cause or as a proximal cue. We favor the latter position because it remains true to the longstanding idea that EC is unique in its emphasis on stimulus pairings as a cause of liking (De Houwer, 2007) while acknowledging the potential complexity in the determinants of EC. Although the argument that we developed in the previous paragraphs can be developed independent of the literature on RFT and AARR, we believe that this literature supports the argument. First, it suggests that any attempt to restrict EC to situations in which stimulus pairings function as a proximal cause of liking is going to encounter immediate problems.
Based on research spanning more than 40 years, it appears that people do not respond based simply on the spatiotemporal properties of stimuli that they encounter (i.e., stimuli as they are present in the here and now) but rather based on how they relate those stimuli in the presence of contextual cues. The concept of AARR also allows for the possibility that there are two different types of EC effects: those that are instances of AARR and those that are not. For example, it may be that EC effects in humans who have acquired the ability to AARR are in fact instances of AARR, but EC effects observed in preverbal infants, severely verbally-impaired humans, or nonhumans reflect changes in behavior due to the stimulus pairings functioning as a mere proximal cause of liking. Thus approaching EC as an instance of AARR directs attention

⁶ On a side note, we are not suggesting that the effects of instructions about stimulus pairings are irrelevant for EC researchers. On the contrary, if EC research centers on the effects of stimulus pairings on liking, then instructions about stimulus pairings represent a unique subclass of instructions given that they refer to a proximal event at the core of EC research (i.e., stimulus pairings). Studying the similarities and differences between effects that result from contiguous versus instructed stimulus pairings could unlock important information about the (mental) processes underlying both of these effects. It could also offer unique insight into the added value of experiencing events versus being instructed about those events (see Raes et al., 2014).


away from questions about “real” or “genuine” effects and toward questions about the function of stimulus pairings and their role in changing liking. This in turn allows for a new cognitive perspective wherein EC is mediated by fundamentally different underlying mental processes depending upon an individual’s level of verbal ability (for more see De Houwer & Hughes, in press). Second, the concept of AARR also offers insight into the learning experiences necessary to transform pairings from a mere proximal cause of liking to a proximal cue for relational responding. Existing knowledge concerning AARR may further refine our understanding of EC by specifying those distal regularities which are a prerequisite for EC effects involving words, sounds, and other symbolic stimuli. Although researchers are of course free to appeal to other distal regularities that give rise to language or relational learning, and explore the impact of these regularities on stimulus pairings, the literature on AARR clearly articulates what these distal regularities look like and how they may be brought to bear on proximal events (see Hughes & Barnes-Holmes, 2016a, 2016b).

Contribution 3: The Concept of AARR Can Contribute to our Understanding of Human Versus Nonhuman EC

If changes in liking can result from stimulus pairings functioning as a proximal cue, and if those learning experiences responsible for AARR are critical for establishing stimulus pairings as a proximal cue, then the concept of AARR may also inform the debate on human versus nonhuman EC. This is because over forty years of research indicates that the ability to AARR is unique to, or at least highly elaborated in, humans while largely absent elsewhere in the animal kingdom (e.g., Dugdale & Lowe, 2000; Lionello-DeNolf, 2009).
This failure to observe evidence of AARRing in nonhumans was somewhat unexpected given the sheer number of learning principles that are common to humans and nonhumans (e.g., reinforcement, punishment, generalization, discrimination, extinction, recovery, and habituation). It was originally assumed that these principles would also stretch all the way up to complex phenomena such as language and higher-order cognition; yet, a number of important findings emerged hinting at learning processes or principles that may be unique to, or at least largely elaborated in, some species relative to others (Hayes, 1989; Hughes & Barnes-Holmes, 2014). Today, and despite extensive efforts to find or train it,

evidence for AARRing in nonhuman animals is still extremely limited. Although there are some indications that the most basic aspects of AARR might be found in nonhuman animals under strict laboratory conditions (e.g., Urcuioli, 2008; Zentall, Wasserman, & Urcuioli, 2014), the flexibility and complexity of AARRing as observed in humans far exceeds that seen in their nonhuman counterparts.⁷ These findings have important implications for our understanding of human and nonhuman EC. They suggest that there should be a clear divide between the types of EC effects observed in humans and those seen elsewhere in the animal kingdom. First, it seems reasonable to assume that the evaluative responses of animals can be changed whenever pairings between a CS and US function as a mere proximal cause of liking. For instance, pairing a tone (CS) and a tasty treat (US) in the presence of a dog may increase the probability that he will salivate and demonstrate other signs of “evaluative responding” (e.g., tail wagging, barking, or jumping) whenever the tone is presented in the future. Second, changes in liking that result from pairing stimuli that have previously acquired their “meaning” via a history of AARRing may be beyond the reach of nonhuman organisms. Indeed, it seems almost obvious that pairing a tone (CS) with positive adjectives (US) will not produce a change in liking for nonhumans in the same way that it does for humans. This is because the history of learning necessary to establish the “meaning” or function of many verbal stimuli is absent in one case and present in the other. Third, and more importantly for the present purposes, changes in liking that result from stimulus pairings functioning as a proximal cue signaling an arbitrarily applicable relational response will likely also be absent in nonhumans and yet present in humans.
As we have argued before, this is because pairings have different functions for organisms with than for organisms without the learning history which gives rise to AARR (i.e., pairings as a mere proximal cause vs. contextual cue). In short, pairing stimuli that have acquired their function or “meaning” via AARR should lead to EC effects that are evident in some species but not others. Likewise, when pairings function as a contextual cue for AARRing (or at the mental level: when pairings are interpreted or given semantic meaning), EC effects should be evident in certain species but not others, or EC might have fundamentally different properties in verbal organisms (for which pairings can function as a relational cue for AARRing)

⁷ It is worth noting that RFT has always remained agnostic to the possibility that AARR is a uniquely human capacity and it has never been argued that derived stimulus relating is forever beyond the grasp of other species. Rather, RFT has simply viewed this claim as an empirical rather than purely theoretical one (see Dymond, Roche, & Barnes-Holmes, 2003). As the evidence currently stands, it seems likely that there is some “glass ceiling” in terms of relational complexity, contextual control, and generalizability that humans are capable of that is not evident elsewhere in the animal kingdom. That said, RFT allows for the possibility that providing nonhuman animals with the learning history that enables humans to AARR may also cause other organisms to behave in similar ways. Whether this potential can be realized is an empirical question.


than in nonhumans (for which pairings cannot function as a relational cue for AARRing). It may be that testing whether a behavior shows the properties of AARR provides one means by which to distinguish what types of EC effects different species can and cannot produce (for more on this topic see De Houwer & Hughes, in press).

Contribution 4: The Concept of AARR Can Help Cognitive Researchers Refine Existing and Develop New Cognitive Theories of EC

The findings arising from research on AARR may also be used to constrain hypotheses and theories about the mental constructs that are assumed to mediate EC effects. For instance, some changes in liking that are instances of AARR appear to conflict with associative accounts, wherein the direct pairing of stimuli leads to the formation of unqualified links between mental representations in memory (Baeyens et al., 1992; Martin & Levey, 1994). In principle, from the perspective of typical associative accounts, presenting one stimulus with positive and another with negative images should result in the same mental associations and thus changes in liking regardless of the presence of relational cues. But research indicates that relational qualifiers do appear to moderate EC effects, both during (Förderer & Unkelbach, 2012) and after pairings take place (Peters & Gawronski, 2011; Zanon et al., 2014). Perhaps more importantly, they often lead to entirely different evaluative responses than those implied by the valence of the US that a CS was previously paired with. These accounts also encounter difficulties in explaining why EC effects emerge in the absence of any direct pairings between stimuli or why people will act as if stimuli are related in novel and untrained ways once a small set of relations have been directly trained.
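
To make concrete what being "related in novel and untrained ways" involves, the following sketch may help. It is an illustration added for this passage, not a model taken from the RFT or EC literature. It treats directly trained "same as" relations as links between stimuli, derives the untrained relations by symmetric and transitive closure, and lets valence spread across each resulting class, as in the Gast and De Houwer (2012) example described earlier. The stimulus labels follow the text; the function names, and the simplifying rule that every member of a class takes on the valence of the US it contains, are assumptions. Contextual cues other than "same as" (e.g., "opposite") are deliberately left out.

```python
from itertools import combinations


def equivalence_classes(trained_pairs):
    """Group stimuli into classes using the symmetric, transitive closure
    of the directly trained 'same as' relations (a union-find structure)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shorten the path as we go
            x = parent[x]
        return x

    for a, b in trained_pairs:
        parent[find(a)] = find(b)  # merge the two classes

    classes = {}
    for stimulus in list(parent):
        classes.setdefault(find(stimulus), set()).add(stimulus)
    return list(classes.values())


def derived_relations(trained_pairs):
    """Return the relations that follow from the trained set but were
    never trained directly (each as an alphabetically ordered pair)."""
    trained = {frozenset(pair) for pair in trained_pairs}
    derived = set()
    for cls in equivalence_classes(trained_pairs):
        for a, b in combinations(sorted(cls), 2):
            if frozenset((a, b)) not in trained:
                derived.add((a, b))
    return derived


def transfer_valence(trained_pairs, us_valence):
    """Assumed rule: every member of a class takes the valence of the US
    in that class, provided the class contains exactly one US valence."""
    liking = {}
    for cls in equivalence_classes(trained_pairs):
        valences = {us_valence[s] for s in cls if s in us_valence}
        if len(valences) == 1:
            value = valences.pop()
            for stimulus in cls:
                liking[stimulus] = value
    return liking


# Relations from the example: two experienced pairings, two instructed ones.
trained = [("USpos", "CS1"), ("CS1", "CS3"), ("USneg", "CS2"), ("CS2", "CS4")]
print(sorted(derived_relations(trained)))
print(transfer_valence(trained, {"USpos": 1, "USneg": -1}))
```

In this sketch the only untrained relations are those linking CS3 to USpos and CS4 to USneg, and CS3 ends up with a positive and CS4 with a negative value, which mirrors the pattern of liking reported in the text. An account built only on unqualified links formed through direct pairings has no counterpart to the derivation step.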
Data from research on AARR may not only conflict with associative mental models but could be seen as providing support for propositional accounts that involve qualified links between mental representations in memory. The idea that EC effects result from the relating of stimuli under the control of contextual cues fits well with the idea that EC is mediated by the formation of propositions concerning those relations. Whereas associations simply convey the strength with which representations are linked in memory, propositions specify their strength, structure, and content, and as such appear to articulate more readily with the RFT view of human cognition as being inherently relational. Likewise, while associations gradually develop with many experienced pairings (Dickinson, 2012; Gawronski & Bodenhausen, 2011)

propositions can be formed on the basis of direct training or inferred via deductive reasoning and language (De Houwer, 2009). When combined, these two properties of propositions may explain the different patterns of evaluative responding that emerge when stimulus relations are established between and among stimuli. For instance, imagine that people act as if two stimuli (a CS and US) are the same after observing them being paired. It may be that this behavior is mediated by a mental propositional representation such as “CS is the same as US.” If the relational “meaning” or function of pairings is altered so that contiguity now signals that stimuli are opposite to one another, and people act as if the CS and US are opposite, then this behavior may be mediated by another proposition – namely – that the “CS is opposite to US.” Indeed, any relation established between stimuli via contiguity, whether comparative (CS-More than-US), hierarchical (CS-Part of-US), temporal (CS-Comes before-US), causal (CS-Causes-US), or deictic (I-am-US), could potentially lead to outcomes that are mediated by comparable propositions. At the same time, it is possible that the entirely untrained and novel ways in which people act toward stimuli after a small set of stimulus pairings may be mediated by participants making what we label a “propositional leap.” In other words, learning that CS1 is the same as CS2 and that CS2 is the same as CS3, and then subsequently acting as if CS1 is the same as CS3 and CS3 is the same as CS1, may be mediated by the formation of a small set of propositions (e.g., “CS1 is the same as CS2” and “CS2 is the same as CS3”) which in turn generate an additional set of inferred propositions (“CS1 is the same as CS3”). It may be that these inferred propositions – rather than those that arise from direct experience – play a key role in changes in liking toward stimuli that were never paired with valenced objects or events in the past.⁸

The core message here is twofold.
First, the literature on RFT, and AARR more generally, represents a largely untapped reservoir of behavioral effects and procedures that cognitive researchers can use to improve existing or develop new mental theories. Just as defining a temper tantrum as an instance of reinforcement increases our ability to predict-and-influence such behavior (e.g., by specifying the antecedents and consequences that control such behavior), so too does defining EC as an instance of AARR. Indeed, when defined in this way, existing knowledge about the origins of AARR, its functional properties, and boundary conditions can be used to further increase our understanding of EC effects. For instance, work in the RFT literature

⁸ It may be tempting to treat behavioral (AARR) and mental (propositions) phenomena as equivalent to one another in light of their apparent overlap (i.e., their flexibility, generativity, and symbolic nature). However, as we pointed out above, it is important to realize that AARR refers to a pattern of contextually controlled behavior while propositions represent one possible mental mediator of the behavioral observations that functional researchers attempt to explain using the concept of AARR. In other words, AARR is not another name for propositional processes nor are propositions instances of AARR. For cognitive researchers, the former simply refers to a broad set of behavioral observations that need to be explained while the latter is one (of many) possible mental explanations for those behaviors.


has already shown that pairings can function as a contextual cue for relational responding, and as a result, change how people act toward those stimuli (e.g., Leader & Barnes-Holmes, 2001). These studies typically exposed adults or children to a number of stimuli which are paired contiguously in space and time (e.g., A1 is paired with B1 and B1 then is paired with C1). Results consistently reveal that contiguity can function as a proximal cue signaling that stimuli are equivalent to one another (e.g., following such pairings participants tend to select A1 in the presence of C1; see Clayton & Hayes, 2004; Smyth et al., 2006). Extensive work has also examined the formation and manipulation of contextual cues and how those cues can lead to contrastive, comparative, opposition, and other types of relations between stimuli (for a review see Hughes & Barnes-Holmes, 2016a). This work could provide valuable insight into ongoing debates about the nature or “meaning” of stimulus pairings (proximal cause vs. cue) as well as the impact of relational qualifiers on stimulus pairings (e.g., Bar-Anan & Dahan, 2013; Förderer & Unkelbach, 2012; Moran & Bar-Anan, 2013; Zanon et al., 2014). Similarly, a wealth of data on AARR may also inform research on the generalization of EC effects (Valdivia-Salas et al., 2013), their sensitivity to forward/backward conditioning procedures (Barnes-Holmes et al., 2000), and their resistance to extinction (Roche et al., 2008). Take, for example, the work of Valdivia-Salas and colleagues who found that when stimuli are related as equivalent to one another the EC effect established for one stimulus can “symbolically” generalize to other related stimuli even when those stimuli share no perceptual similarities with one another (note that stimulus generalization is typically restricted to perceptually similar stimuli).
This may offer a new functional perspective on cognitive phenomena such as the “spreading attitude effect” (Walther, 2002) or “lateral attitude change” (Glaser et al., 2015), which often involve the generalization of evaluative responses along non-perceptual dimensions in ways that are difficult to explain using existing concepts such as second-order or sensory preconditioning. Research that has been conducted in CBS may also lead to new predictions and ideas about EC. To illustrate, imagine that a positively valenced face (US) is repeatedly paired with a neutral face (CS) in the presence of the word “opposite.” If we are correct and contiguity functions as a proximal cue signaling that stimuli are equivalent to one another, then contiguity provides a source of information about the CS-US relation that is in direct contradiction to the relation implied by “opposite” (i.e., there is a competition between two different proximal cues). This may lead to a reduced rather than fully reversed EC effect relative to situations in which contiguity and the word “opposite” have the same relational meaning (i.e., a consistency or

coherence between proximal cues). This competition-coherence hypothesis may provide a (functional) perspective on studies which have already pitted contiguity against other relational cues, either at the same time (Moran, Bar-Anan, & Nosek, 2015) or after a delay (Peters & Gawronski, 2011), as well as offer new predictions about the impact of relational qualifiers on liking. For instance, and as we alluded to above, it may be that stronger EC effects will emerge in those situations where contiguity and other cues signal the same relation between the CS and US compared to when they signal divergent or contradictory relations. Procedures developed within the functional tradition – such as the Matching to Sample task – could prove useful when asking and answering these questions. They would allow EC researchers to manipulate the meaning of proximal cues, establish relations between stimuli in ways that are under tight experimental control, and test whether EC effects co-occur with other instances of AARR.

The second core message is that given that the functional level is focused solely on the interplay between the environment and behavior, it does not place any a priori restrictions on what constructs mediate interactions between individuals and the environment. Actually, it does the opposite: it unlocks greater “theoretical freedom” such that researchers can deploy associations, propositions, or any other mental concept in order to explain how changes in liking that could be seen as involving AARR come about. This may accelerate cognitive theorizing by freeing it from historical or preexisting biases such as the notion that EC is exclusively associative in nature (for related arguments see Hughes, Barnes-Holmes, & De Houwer, 2011). However, and as noted in the Introduction, findings from the functional literature do constrain mental theorizing in an a posteriori fashion. That is, they provide outcomes that mental models should be able to explain.
Contribution 5: The Concept of AARR Provides EC Researchers With a Means to Functionally Define Complex Psychological Phenomena

Finally, the concept of AARR provides cognitive researchers with a way of talking about an entire spectrum of EC and related effects in purely functional terms. Given that there are always multiple possible mental theories of behavior (and because behavior is never determined by just one set of mental processes), it is important to maintain a clear conceptual distinction between the to-be-explained behavioral effect and the proposed mental theory (see De Houwer & Moors, 2015). Over the past decade this conceptual separation has been widely adopted within the EC literature. However, the concept of AARR allows one to also define related phenomena such as EC via instructions, inferences, and observation in functional terms. Because knowledge about these effects can be useful for EC research regardless of whether they are considered to be “real” EC effects, it is also good for EC researchers to be able to conceptualize these effects in functional terms. Although effects due to instructions, inferences, and observation have historically been defined at the mental level of analysis, there is no a priori reason why they cannot be defined in purely functional terms. Whereas the functional literature may have been ill-equipped to grapple with these concepts in the past, times have changed. With the development of RFT, and its emphasis on the concept of AARR, many complex psychological phenomena (including instructed and inferential learning) can and have been conceptualized, studied, and explored at the functional level of analysis (see Hayes et al., 2001; Hughes & Barnes-Holmes, 2016b). Adopting a sophisticated functional (analytic-abstractive) language for EC-related effects may allow for even greater theoretical innovation at the mental level, insofar as it ensures that no preexisting restrictions are placed on the potential mental processes and representations that can mediate the effect of stimulus pairings on liking. For instance, it may be that instructed, inferred, and observed EC effects are mediated by some complex interplay between mental associations in memory. Although such an explanation of the behavioral observations that AARR attempts to explain (at the functional level) appears difficult to reconcile with the literature, it is not excluded from consideration on an a priori basis. In short, CBS may help to equip cognitive researchers with a sophisticated nonmental language that maintains a conceptual distinction between the event that needs to be explained (behavioral effect) and the event that is used to explain (mental mediator).

Conclusion

The current paper introduced an intellectual tradition known as CBS, a functional account of human language and cognition known as RFT, as well as a behavioral phenomenon known as AARR. We then forwarded a novel perspective on EC wherein stimulus pairings function as either a mere proximal cause of liking or a proximal cue signaling that the CS and US are related. This latter possibility leads to the idea that the impact of proximal regularities (stimulus pairings) on liking may be moderated by distal regularities in the organism’s past. While fully recognizing that other accounts are possible, we draw upon developments in the RFT and EC literatures to propose that a specific set of distal regularities (those that give rise to AARRing) may advance our understanding of EC. According to this perspective, humans not only give meaning to the individual stimuli that are paired in EC studies but also to pairings as a proximal event. That is, they respond relationally to the fact that stimuli are paired together: (1) this behavior is under the control of contextual cues in the environment, and (2) these cues determine the magnitude, direction, and nature of the change in liking. Not only instructions, observations, and inferences but also contiguous stimulus pairings lead to changes in liking because of a host of learning experiences that have endowed the individual with the ability to AARR. In this way, EC effects that have been traditionally attributed to the presence of a single (proximal) regularity may in fact be moderated by other (distal) regularities. Indeed, if one adopts an RFT perspective, they may even represent different instances of the same behavioral phenomenon (AARR). We presented logical and empirical arguments for the usefulness of this conceptualization and discussed how it contributes to our understanding of EC in a variety of ways, from heuristically organizing existing effects and predicting novel ones, to contributing to debates on “genuine” and human versus nonhuman EC effects, to facilitating the development and refinement of cognitive theories of EC. We also showed how the functional analytic concept of AARR provides a means to discuss and define increasingly complex phenomena such as so-called EC via instructions and inferences in purely functional terms, and in doing so, enables researchers to maintain a firm separation between mental and functional levels of analysis.

We can see three ways in which cognitively oriented EC researchers might respond to our arguments. First, they might take note of the idea that some instances of EC go beyond the effect of stimulus pairings as a mere proximal cause and that RFT might help us understand those instances, but dismiss these instances as lying beyond the scope of EC research. Such a position would imply that EC research should focus on changes in liking in which stimulus pairings are a mere proximal cause. Second, they could include instances of EC in which stimulus pairings are used to give meaning to stimulus relations but without taking on board the CBS literature. Third, they could turn to the CBS literature as a source of inspiration for their cognitive analysis of EC, including instances of EC in which stimulus pairings are used as a proximal cue for giving meaning to stimulus relations. Based on the arguments presented above, we believe that the third option would optimize progress in our understanding of EC. In any case, our hope is that this paper serves to stimulate new debate on the role that AARR in particular, and distal regularities in general, may play in EC and related phenomena, and that it leads to renewed dialog between cognitive and functional researchers interested in the study of likes and dislikes.

Acknowledgments

The preparation of this paper was supported by a Government of Ireland Research Fellowship to Sean Hughes, Ghent University Methusalem Grant BOF09/01M00209 to Jan De Houwer, the Interuniversity Attraction Poles Program initiated by the Belgian Science Policy Office (IUAPVII/33), and an Odysseus I grant of the Research Foundation – Flanders (FWO) to Dermot Barnes-Holmes. The authors would like to thank Pieter Van Dessel, Anne Gast, and Klaus Fiedler for their comments on an earlier draft of the paper.

References

Baeyens, F., Crombez, G., De Houwer, J., & Eelen, P. (1996). No evidence for modulation of evaluative flavor-flavor associations in humans. Learning and Motivation, 27, 200–241. Baeyens, F., Eelen, P., Crombez, G., & De Houwer, J. (2001). On the role of beliefs in observational flavor conditioning. Current Psychology, 20, 183–203. Baeyens, F., Eelen, P., Crombez, G., & Van den Bergh, O. (1992). Human evaluative conditioning; acquisition trials, presentation schedule, evaluative style and contingency awareness. Behavior Research and Therapy, 30, 133–142. Baeyens, F., Kaes, B., Eelen, P., & Silverans, P. (1996). Observational evaluative conditioning of an embedded stimulus element. European Journal of Social Psychology, 26, 15–28. Balas, R., & Gawronski, B. (2012). On the intentional control of conditioned evaluative responses. Learning and Motivation, 43, 89–98. Bandura, A. (1965). Influence of models’ reinforcement contingencies on the acquisition of imitative responses. Journal of Personality and Social Psychology, 1, 589–595. Bar-Anan, Y., & Dahan, N. (2013). The effect of comparative context on evaluative conditioning. Cognition & Emotion, 27, 367–375. Bar-Anan, Y., De Houwer, J., & Nosek, B. A. (2010). Evaluative conditioning and conscious knowledge of contingencies: A correlational investigation with large samples. The Quarterly Journal of Experimental Psychology, 63, 2313–2335. Barnes-Holmes, D., & Hussey, I. (2016). The functional-cognitive meta-theoretical framework: Reflections, possible clarifications and how to move forward. International Journal of Psychology, 51, 50–57. doi: 10.1002/ijop.12166 Barnes-Holmes, D., Keane, J., Barnes-Holmes, Y., & Smeets, P. M. (2000). A derived transformation of emotive functions as a means of establishing differential preferences for soft drinks. The Psychological Record, 50, 493–511. Barnes-Holmes, Y., Barnes-Holmes, D., Smeets, P. M., & Luciano, C. (2004). The derived transfer of mood functions through equivalence relations. The Psychological Record, 54, 95–114. Bechtel, W. (2008). Mechanisms in cognitive psychology: What are the operations? Philosophy of Science, 75, 995–1007. Boakes, R. A., Albertella, L., & Harris, J. A. (2007). Expression of flavor preference depends on type of test and on recent drinking history. Journal of Experimental Psychology: Animal Behavior Processes, 33, 327–338. Bornstein, R. F. (1989). Exposure and affect: Overview and meta-analysis of research, 1968–1987. Psychological Bulletin, 106, 263–289.

Cahill, J., Barnes-Holmes, Y., Barnes-Holmes, D., Rodríguez-Valverde, M., Luciano, C., & Smeets, P. M. (2007). The derived transfer and reversal of mood functions through equivalence relations II. The Psychological Record, 57, 373–389. Catania, A. C. (1997). Learning. Upper Saddle River, NJ: Prentice-Hall. Chiesa, M. (1992). Radical behaviorism and scientific frameworks: From mechanistic to relational accounts. American Psychologist, 47, 1287–1299. Clayton, M. C., & Hayes, L. J. (2004). A comparison of match-to-sample and respondent-type training of equivalence classes. The Psychological Record, 54, 579–602. Corneille, O., Yzerbyt, V., Pleyers, G., & Mussweiler, T. (2009). Beyond awareness and resources: Evaluative conditioning may be sensitive to processing goals. Journal of Experimental Social Psychology, 45, 279–282. Dack, C., Reed, P., & McHugh, L. (2010). Multiple determinants of transfer of evaluative function after conditioning with free-operant schedules of reinforcement. Learning & Behavior, 38, 348–366. De Houwer, J. (2006). Using the Implicit Association Test does not rule out an impact of conscious propositional knowledge on evaluative conditioning. Learning and Motivation, 37, 176–187. De Houwer, J. (2007). A conceptual and theoretical analysis of evaluative conditioning. The Spanish Journal of Psychology, 10, 230–241. De Houwer, J. (2009). The propositional approach to associative learning as an alternative for association formation models. Learning & Behavior, 37, 1–20. De Houwer, J. (2011a). Why the cognitive approach in psychology would profit from a functional approach and vice versa. Perspectives on Psychological Science, 6, 202–209. De Houwer, J. (2011b). Evaluative conditioning: Methodological considerations. In K. C. Klauer, C. Stahl, & A. Voss (Eds.), Cognitive methods in social psychology (pp. 124–147). New York, NY: Guilford. De Houwer, J., Barnes-Holmes, D., & Moors, A. (2013). What is learning? 
On the nature and merits of a functional definition of learning. Psychonomic Bulletin & Review, 20, 631–642. doi: 10.3758/s13423-013-0386-3 De Houwer, J., Gawronski, B., & Barnes-Holmes, D. (2013). A functional-cognitive framework for attitude research. European Review of Social Psychology, 24, 252–287. De Houwer, J., & Hughes, S. (in press). Evaluative conditioning as a symbolic phenomenon: On the relation between evaluative conditioning, evaluative conditioning via instructions, and persuasion. Social Cognition. De Houwer, J., & Moors, A. (2015). Levels of analysis in social psychology. In B. Gawronski & G. Bodenhausen (Eds.), Theory and explanation in social psychology (pp. 24–40). NY: Guilford. Díaz, E., & De la Casa, L. G. (2002). Latent inhibition in human affective learning. Emotion, 2, 242–250. Dickinson, A. (2012). Associative learning and animal cognition. Philosophical Transactions of the Royal Society of Biological Sciences, 367, 2733–2742. Dougher, M. J., Hamilton, D., Fink, B., & Harrington, J. (2007). Transformation of the discriminative and eliciting functions of generalized relational stimuli. Journal of the Experimental Analysis of Behavior, 88, 179–197. Dugdale, N., & Lowe, C. F. (2000). Testing for symmetry in the conditional discriminations of language-trained chimpanzees. Journal of the Experimental Analysis of Behavior, 73, 5–22. Dymond, S., Roche, B., & Barnes-Holmes, D. (2003). The continuity strategy, human behavior and behavior analysis. The Psychological Record, 53, 333–347.

Dymond, S., Roche, B., Forsyth, J. P., Whelan, R., & Rhoden, J. (2008). Derived avoidance learning: Transformation of avoidance response functions in accordance with same and opposite relational frames. The Psychological Record, 58, 269–286. Fazio, R. H., Jackson, J. R., Dunton, B. C., & Williams, C. J. (1995). Variability in automatic activation as an unobtrusive measure of racial attitudes: A bona fide pipeline? Journal of Personality and Social Psychology, 69, 1013–1027. Fiedler, K. (2014). From intrapsychic to ecological theories in social psychology: Outlines of a functional theory approach. European Journal of Social Psychology, 44, 657–670. Förderer, S., & Unkelbach, C. (2011). Beyond evaluative conditioning! Evidence for transfer of non-evaluative attributes. Social Psychological and Personality Science, 2, 479–486. Förderer, S., & Unkelbach, C. (2012). Hating the cute kitten or loving the aggressive pit-bull: EC effects depend on CS-US relations. Cognition & Emotion, 26, 534–540. Field, A. P. (2006). I don’t like it because it eats sprouts: Conditioning preferences in children. Behavior Research and Therapy, 44, 439–455. Frank, A. J., & Wasserman, E. A. (2005). Associative symmetry in the pigeon after successive matching-to-sample training. Journal of the Experimental Analysis of Behavior, 84, 147–165. Fulcher, E. P., Mathews, A., & Hammerl, M. (2008). Rapid acquisition of emotional information and attentional bias in anxious children. Journal of Behavior Therapy and Experimental Psychiatry, 39, 321–339. Galdi, S., Arcuri, L., & Gawronski, B. (2008). Automatic mental associations predict future choices of undecided decision-makers. Science, 321, 1100–1102. Gannon, S., Roche, B., Kanter, J., Forsyth, J. P., & Linehan, C. (2011). A derived relations analysis of approach-avoidance conflict: Implications for the behavioral analysis of human anxiety. The Psychological Record, 61, 227–252. Gardner, H. (1987). 
The mind’s new science: A history of the cognitive revolution. New York, NY: Basic Books. Gast, A., & De Houwer, J. (2012). Evaluative conditioning without directly experienced pairings of the conditioned and the unconditioned stimuli. The Quarterly Journal of Experimental Psychology, 65, 1657–1674. Gast, A., & De Houwer, J. (2013). The influence of extinction and counterconditioning instructions on evaluative conditioning effects. Learning and Motivation. Advance online publication. doi: 10.1016/j.lmot.2013.03.003 Gast, A., Gawronski, B., & De Houwer, J. (2012). Evaluative conditioning: Recent developments and future directions. Learning and Motivation, 43, 79–88. Gawronski, B., & Bodenhausen, G. V. (2011). The associative-propositional evaluation model: Theory, evidence, and open questions. Advances in Experimental Social Psychology, 44, 59–127. Gawronski, B., Rydell, R. J., Vervliet, B., & De Houwer, J. (2010). Generalization versus contextualization in automatic evaluation. Journal of Experimental Psychology: General, 139, 683–701. Gibson, B. (2008). Can evaluative conditioning change attitudes toward mature brands? New evidence from the Implicit Association Test. Journal of Consumer Research, 35, 178–188. Gil, E., Luciano, C., Ruiz, F. J., & Valdivia-Salas, V. (2012). A preliminary demonstration of transformation of functions through hierarchical relations. International Journal of Psychology and Psychological Therapy, 12, 1–19. Giurfa, M., Zhang, S. W., Jennett, A., Menzel, R., & Srinivasan, M. V. (2001). The concepts of sameness and difference in an insect. Nature, 410, 930–933.

Glaser, T., Dickel, N., Liersch, B., Rees, J., Sussenbach, P., & Bohner, G. (2015). Lateral attitude change. Personality and Social Psychology Review, 19(3), 257–276. Goodman, J. E., & McGrath, P. J. (2003). Mothers’ modeling influences children’s pain during a cold-pressor task. Pain, 104, 559–565. Greenwald, A. G., McGhee, D. E., & Schwartz, J. K. L. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464–1480. Gregg, A. I., Seibt, B., & Banaji, M. R. (2006). Easier done than undone: Asymmetry in the malleability of implicit preferences. Journal of Personality and Social Psychology, 90, 1–20. Hammerl, M., & Grabitz, H. J. (1996). Human evaluative conditioning without experiencing a valued event. Learning and Motivation, 27, 278–293. Hammerl, M., & Grabitz, H. J. (2000). Affective-evaluative learning in humans: A form of associative learning or only an artifact? Learning and Motivation, 31, 345–363. Harmon, K., Strong, R., & Pasnak, R. (1982). Relational responses in tests of transposition with rhesus monkeys. Learning and Motivation, 13, 495–504. Hayes, S. C. (1989). Nonhumans have not yet shown stimulus equivalence. Journal of the Experimental Analysis of Behavior, 51, 385–392. Hayes, S. C., Barnes-Holmes, D., & Roche, B. (Eds.). (2001). Relational Frame Theory: A Post-Skinnerian account of human language and cognition. New York, NY: Plenum Press. Hayes, S. C., Barnes-Holmes, D., & Wilson, K. (2012). Contextual behavioral science: Creating a science more adequate to the challenge of the human condition. Journal of Contextual Behavioral Science, 1, 1–16. Hayes, S. C., & Brownstein, A. J. (1986). Mentalism, behavior-behavior relations, and a behavior-analytic view of the purposes of science. The Behavior Analyst, 9, 175–190. Hayes, S. C., Levin, M. E., Plumb-Vilardaga, J., Villatte, J. L., & Pistorello, J. (2013). 
Acceptance and commitment therapy and contextual behavioral science: Examining the progress of a distinctive model of behavioral and cognitive therapy. Behavior Therapy, 44, 180–198. Hermans, D., Baeyens, F., Lamote, S., Spruyt, A., & Eelen, P. (2005). Affective priming as an indirect measure of food preferences acquired through odor conditioning. Experimental Psychology, 52, 180–186. Hofmann, W., De Houwer, J., Perugini, M., Baeyens, F., & Crombez, G. (2010). Evaluative conditioning in humans: A meta-analysis. Psychological Bulletin, 136, 390–421. Hollands, G. J., Prestwich, A., & Marteau, T. M. (2011). Using aversive images to enhance healthy food choices and implicit attitudes: An experimental test of evaluative conditioning. Health Psychology, 30, 195–203. Houben, K., Schoenmakers, T. M., & Wiers, R. W. (2010). I didn’t feel like drinking but I don’t know why: The effects of evaluative conditioning on alcohol-related attitudes, craving and behavior. Addictive Behaviors, 35, 1161–1163. Hughes, S. J., & Barnes-Holmes, D. (2016a). Relational Frame Theory: The basic account. In R. D. Zettle, S. C. Hayes, D. Barnes-Holmes, & A. Biglan (Eds.), The Wiley handbook of contextual behavioral science (pp. 178–226). West Sussex, UK: Wiley-Blackwell. Hughes, S. J., & Barnes-Holmes, D. (2016b). Relational Frame Theory: Implications for the study of human language and cognition. In R. D. Zettle, S. C. Hayes, D. Barnes-Holmes, & A. Biglan (Eds.), The Wiley handbook of contextual behavioral science (pp. 129–178). West Sussex, UK: Wiley-Blackwell.

Hughes, S., & Barnes-Holmes, D. (2014). Associative concept learning, stimulus equivalence, and Relational Frame Theory: Working out the similarities and differences between human and nonhuman behavior. Journal of the Experimental Analysis of Behavior, 101, 156–160. Hughes, S., Barnes-Holmes, D., & De Houwer, J. (2011). The dominance of associative theorizing in implicit attitude research: Propositional and behavioral alternatives. The Psychological Record, 61, 465–498. Hughes, S., Barnes-Holmes, D., & Vahey, N. (2012). Holding on to our functional roots when exploring new intellectual islands: A voyage through implicit cognition research. Journal of Contextual Behavioral Science, 1, 17–38. Hütter, M., Sweldens, S., Stahl, C., Unkelbach, C., & Klauer, K. C. (2012). Dissociating contingency awareness and conditioned attitudes: Evidence of contingency-unaware evaluative conditioning. Journal of Experimental Psychology: General, 141, 539–557. Jones, C. R., Fazio, R. H., & Olson, M. A. (2009). Implicit misattribution as a mechanism underlying evaluative conditioning. Journal of Personality and Social Psychology, 96, 933–948. Kawakami, K., Phills, C. E., Steele, J. R., & Dovidio, J. F. (2007). (Close) distance makes the heart grow fonder: Improving implicit racial attitudes and interracial interactions through approach behaviors. Journal of Personality and Social Psychology, 92, 957–971. Kerkhof, I., Vansteenwegen, D., Baeyens, F., & Hermans, D. (2010). Counterconditioning: An effective technique for changing conditioned preferences. Experimental Psychology, 58, 31–38. Klucken, T., Kagerer, S., Schweckendiek, J., Tabbert, K., Vaitl, D., & Stark, R. (2009). Neural, electrodermal and behavioral response patterns in contingency aware and unaware subjects during a picture-picture conditioning paradigm. Neuroscience, 158, 721–731. Leader, G., & Barnes-Holmes, D. (2001). 
Matching-to-sample and respondent-type training as methods for producing equivalence relations: Isolating the critical variable. The Psychological Record, 51, 429–444. Leader, G., Barnes, D., & Smeets, P. (1996). Establishing equivalence relations using a respondent-type training procedure. The Psychological Record, 46, 685–706. Levin, M. E., & Hayes, S. C. (2009). ACT, RFT, and contextual behavioral science. In J. T. Blackledge, J. Ciarrochi, & F. P. Deane (Eds.), Acceptance and Commitment Therapy: Contemporary research and practice (pp. 1–40). Sydney, Australia: Australian Academic Press. Liefooghe, B., & De Houwer, J. (2016). A functional approach for research on cognitive control: Analyzing cognitive control tasks and their effects in terms of operant conditioning. International Journal of Psychology, 51, 28–32. Lionello-DeNolf, K. M. (2009). The search for symmetry: 25 years in review. Learning & Behavior, 37, 188–203. Lubow, R. E., & Gewirtz, J. C. (1995). Latent inhibition in humans: Data, theory, and implications for schizophrenia. Psychological Bulletin, 117, 87–103. Luciano, C., Becerra, I. G., & Valverde, M. R. (2007). The role of multiple-exemplar training and naming in establishing derived equivalence in an infant. Journal of the Experimental Analysis of Behavior, 87, 349–365. Mann, T. C., & Ferguson, M. J. (2015). Can we undo our first impressions? The role of reinterpretation in reversing implicit evaluations. Journal of Personality and Social Psychology, 108, 823–849. Martin, I., & Levey, A. B. (1994). The evaluative response: Primitive but necessary. Behavior Research and Therapy, 32, 301–305.

McHugh, L., & Stewart, I. (2012). The self and perspective taking: Contributions and applications from modern behavioral science. Oakland, CA: New Harbinger. Meiser, T. (2011). Much pain, little gain? Paradigm-specific models and methods in experimental psychology. Perspectives on Psychological Science, 6, 183–191. Mineka, S., & Cook, M. (1988). Social learning and the acquisition of snake fear in monkeys. In T. R. Zentall (Ed.), Social learning: psychological and biological perspectives (pp. 51–72). New York, NY: Routledge. Mitchell, C. J., De Houwer, J., & Lovibond, P. F. (2009). The propositional nature of human associative learning. Behavioral and Brain Sciences, 32, 183–198. Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. (2000). The unity and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent variable analysis. Cognitive Psychology, 41, 49–100. Moran, T., & Bar-Anan, Y. (2013). The effect of object-valence relations on automatic evaluation. Cognition and Emotion, 4, 743–752. Moran, T., Bar-Anan, Y., & Nosek, B. (2015). Processing goals moderate the effect of co-occurrence on automatic evaluation. Journal of Experimental Social Psychology, 60, 157–162. Nosek, B. A., Hawkins, C. B., & Frazier, R. S. (2011). Implicit social cognition: From measures to mechanisms. Trends in Cognitive Sciences, 15, 152–159. O’Hora, D., & Barnes-Holmes, D. (2004). Instructional control: Developing a relational frame analysis. International Journal of Psychology and Psychological Therapy, 4, 263–284. O’Hora, D., Barnes-Holmes, D., Roche, B., & Smeets, P. M. (2004). Derived relational networks and control by novel instructions: A possible model of generative verbal responding. The Psychological Record, 54, 437–460. O’Hora, D., Pelaez, M., Barnes-Holmes, D., Rae, G., Robinson, K., & Chaudhary, T. (2008). 
Temporal relations and intelligence: Correlating relational performance with performance on the WAIS-III. The Psychological Record, 58, 569–584. Olson, M. A., & Fazio, R. H. (2001). Implicit attitude formation through classical conditioning. Psychological Science, 12, 413–417. Payne, B. K., Cheng, S. M., Govorun, O., & Stewart, B. D. (2005). An inkblot for attitudes: Affect misattribution as implicit measurement. Journal of Personality and Social Psychology, 89, 277–293. Peters, K. R., & Gawronski, B. (2011). Are we puppets on a string? Comparing the impact of contingency and validity on implicit and explicit evaluations. Personality and Social Psychology Bulletin, 37, 557–569. Pettigrew, T. F., & Tropp, L. R. (2006). A meta-analytic test of intergroup contact theory. Journal of Personality and Social Psychology, 90, 751–783. Pinter, B., & Greenwald, A. G. (2004). Exploring implicit partisanship: Enigmatic (but genuine) group identification and attraction. Group Processes & Intergroup Relations, 7, 283–396. Pleyers, G., Corneille, O., Luminet, O., & Yzerbyt, V. (2007). Aware and (dis)liking: Item-based analyses reveal that valence acquisition via evaluative conditioning emerges only when there is contingency awareness. Journal of Experimental Psychology: Learning, Memory, & Cognition, 33, 130–144. Proctor, R. W., & Urcuioli, P. J. (2016). Functional relations and cognitive psychology: Lessons from human performance and animal research. International Journal of Psychology, 51, 58–63. Raes, A. K., De Houwer, J., De Schryver, M., Brass, M., & Kalisch, R. (2014). Do CS-US pairings actually matter? A within-subject comparison of instructed fear conditioning with and without actual CS-US pairings. PLoS One, 9, e84888. doi: 10.1371/journal.pone.0084888 Reese, H. W. (1968). The perception of stimulus relations: Discrimination learning and transposition. New York, NY: Academic Press. Rehfeldt, R. A., & Barnes-Holmes, Y. (2009). Derived relational responding: Applications for learners with autism and other developmental disabilities. Oakland, CA: New Harbinger. Roche, B., Barnes-Holmes, D., Smeets, P. M., Barnes-Holmes, Y., & McGeady, S. (2000). Contextual control over the derived transformation of discriminative and sexual arousal functions. The Psychological Record, 50, 267–292. Roche, B. T., Kanter, J. W., Brown, K. R., Dymond, S., & Fogarty, C. C. (2008). A comparison of “direct” versus “derived” extinction of avoidance responding. The Psychological Record, 58, 443–464. Rydell, R. J., & McConnell, A. R. (2006). Understanding implicit and explicit attitude change: A systems of reasoning analysis. Journal of Personality and Social Psychology, 91, 995–1008. Smith, C. T., De Houwer, J., & Nosek, B. A. (2013). Consider the source: Persuasion of implicit evaluations is moderated by source credibility. Personality and Social Psychology Bulletin, 39, 193–205. Smyth, S., Barnes-Holmes, D., & Forsyth, J. P. (2006). A derived transfer of simple discrimination and self-reported arousal functions in spider fearful and non-spider fearful participants. Journal of the Experimental Analysis of Behavior, 85, 223–246. Stahl, C., & Unkelbach, C. (2009). Evaluative learning with single versus multiple unconditioned stimuli: The role of contingency awareness. Journal of Experimental Psychology: Animal Behavior Processes, 35, 286–291. Törneke, N. (2010). Learning RFT: An introduction to relational frame theory and its clinical applications. Oakland, CA: New Harbinger. Törneke, N., Luciano, C., & Valdivia Salas, S. (2008). Rule-governed behavior and psychological problems. 
International Journal of Psychology and Psychological Therapy, 8, 141–156. Urcuioli, P. J. (2008). Associative symmetry, anti-symmetry, and a theory of pigeons’ equivalence class formation. Journal of the Experimental Analysis of Behavior, 90, 257–282. van Reekum, C. M., van den Berg, H., & Frijda, N. H. (1999). Cross-modal preference acquisition: Evaluative conditioning of pictures by affective olfactory and auditory cues. Cognition and Emotion, 13, 831–836. Valdivia-Salas, S., Dougher, M., & Luciano, C. (2013). Derived relations and generalized alteration of preferences. Learning and Behavior, 41, 205–217. Vaughan, W. Jr. (1988). Formation of equivalence sets in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 14, 36–42. Vitale, A., Barnes-Holmes, Y., Barnes-Holmes, D., & Campbell, C. (2008). Facilitating responding in accordance with the relational frame of comparison: Systematic empirical analyses. The Psychological Record, 58, 365–390. Walther, E. (2002). Guilty by mere association: Evaluative conditioning and the spreading attitude effect. Journal of Personality and Social Psychology, 82, 919–934.

Received April 5, 2014 Revised August 26, 2015 Accepted September 29, 2015 Published online March 29, 2016

Sean Hughes, Department of Experimental Clinical and Health Psychology, Ghent University, Henri Dunantlaan 2, 9000 Ghent, Belgium, Tel. +32 473 132-983, Fax +32 926 46-489, E-mail sean.hughes@ugent.be

© 2016 Hogrefe Publishing

Research Article

Exploring the Subjective Feeling of Fluency

Michael Forster, Helmut Leder, and Ulrich Ansorge

Department of Basic Psychological Research and Research Methods, Faculty of Psychology, University of Vienna, Austria

Abstract: According to the processing fluency theory, higher ease of processing a stimulus leads to higher feelings of fluency and more positive evaluations. However, it is unclear whether feelings of fluency are positive or an unspecific activation and whether feelings of fluency are directly attributed to the stimulus even without much positive feeling. In two experiments, we tested how variations in the ease of processing influenced feelings of fluency and affect, in terms of evaluations (Exp. 1) and physiological responses (Exp. 2). Higher feelings of fluency were associated with more positive stimulus ratings and did not affect stimulus arousal ratings, but perceivers’ feelings showed higher felt arousal ratings and left felt valence ratings unaffected. Physiological indices only showed small effects of a subtle positive reaction. These findings show that feelings of fluency can be sources of positive object evaluations, but do not affect one’s own positive feelings.

Keywords: feeling of fluency, ease of processing, arousal, valence, evaluation

According to the feelings-as-information theory, feelings can impact our everyday judgments (Schwarz, 2011). One source for these feelings is processing fluency, which is the ease of processing a stimulus.¹ Previous research showed that individuals can consciously report feelings of fluency coming along with a higher ease of processing (Forster, Leder, & Ansorge, 2013; Reber, Wurtz, & Zimmermann, 2004; Regenberg, Häfner, & Semin, 2012; but see Topolinski & Strack, 2009). However, it is as yet unclear how these feelings of fluency impact affective evaluations. Therefore, in two experiments we tested how variations in the ease of processing influenced feelings of fluency and affect, in terms of evaluations (Exp. 1) and physiological responses (Exp. 2). In a first behavioral experiment we tried to identify whether the feeling of fluency is associated with the affective evaluation of the perceived object and/or the perceivers’ evaluations of their own feelings. This influence was tested on the two core dimensions of affective processing, valence and arousal (Russell, 2003). In a second experiment we applied a combined measure of facial electromyography (fEMG) and skin conductance (SC) to test whether feelings of fluency are hedonically marked (as indicated by fEMG activation) and/or reflect (valence-independent) arousal (as indicated by skin conductance variation).

The theoretical rationale of our study was the following: The theory of processing fluency states that higher ease of processing leads to a higher subjective feeling of fluency within the perceivers, which, in turn, can influence their evaluations of, for example, liking, familiarity, or truth (for an overview see Reber, Schwarz, & Winkielman, 2004). This connection between fluency and evaluations can be explained as follows: Ease of processing signals a positive state of affairs, error-free processing, and absence of threat (Winkielman, Schwarz, Fazendeiro, & Reber, 2003). Thus, ease of processing signals safety and therefore is said to be positively valenced (see also Zajonc, 2001). Previous research has shown that the ease of processing can be subjectively felt and reported (Forster et al., 2013; Reber, Wurtz, et al., 2004). When evaluating how much an object is liked, ratings for easier to process stimuli are enhanced because the positive feeling elicited by the ease is falsely interpreted as a positive reaction toward the stimulus. Empirical tests of the intrinsic positivity of ease of processing are, however, sparse (see Gerger, Leder, Tinio, & Schacht, 2011; Topolinski, Likowski, Weyers, & Strack, 2009; or Winkielman & Cacioppo, 2001, for notable exceptions). Yet, some findings in processing fluency remain difficult to explain by an intrinsic positivity of the feeling of fluency. Higher ease of processing increases evaluations also on dimensions without a clear valence, such as stimulus brightness or darkness (Mandler, Nakamura, & Van Zandt, 1987), or loudness (Jacoby, Allan, Collins, & Larwill, 1988). These findings were accommodated with a perceptual fluency/attributional account (Bornstein & D’Agostino, 1994).

¹ From the literature there is no indication that the terms “processing fluency” and “ease of processing” denote two distinct processes. Here, we use “ease of processing” to describe the process of fluency and “processing fluency” when referring to the theory.


Experimental Psychology 2016; Vol. 63(1):45–58 DOI: 10.1027/1618-3169/a000311

This account states that higher ease of processing is subjectively felt as valence-independent unspecific activation, which is then attributed to the stimulus. Recent evidence furthermore suggests that fluency experiences do not unanimously lead to more positive evaluations. Manipulating the ease of processing of negative, neutral, and positive IAPS pictures, Albrecht and Carbon (2014) showed that higher processing ease amplified the affective evaluation; with higher processing ease, positive IAPS pictures were rated more positively and negative IAPS pictures more negatively. Neutral IAPS pictures were not affected by the manipulations.

To sum up, previous research on processing fluency suggested that the feeling of fluency, elicited by a higher ease of processing, is either of positive valence or of an unspecific activation. In both cases this feeling should be attributed to the stimulus as its origin so that an evaluation of the stimulus should reflect the influences of ease of processing. To shed light on this explanation, in a first experiment we measured the effect of a fluency manipulation on evaluations of stimuli on the two core dimensions of affective processing: In one group valence was evaluated – representing a positive or negative reaction – and in a second group arousal was evaluated – representing the valence-independent unspecific activation. Depending on the theory, there are different predictions for these two dimensions: First, if ease of processing is positive, stimuli should be evaluated more positively with higher ease of processing. Stimulus arousal ratings, on the other hand, might or might not increase with higher ease of processing, as high arousal is not necessarily of positive valence. According to the circumplex model of emotions, high arousal can be characteristic of positive affect, such as when feeling excited, but it can also be characteristic of negative affect, such as when feeling alarmed (Russell, 1980, 2003).
Second, if ease of processing elicits a valence-independent unspecific activation, stimulus arousal ratings should increase with higher ease of processing. Following the notion of Bornstein and D’Agostino (1994), who claimed that stimulus ratings on all dimensions are influenced by higher ease, stimulus positivity ratings should also increase with higher ease of processing.

Moreover, the first experiment addresses a second important question regarding the feeling of fluency. According to the theory of processing fluency, the ease of processing a stimulus leads to a feeling of fluency, which then affects the evaluation of the stimulus. In previous studies, mainly this evaluation of the stimulus was measured. It is thus unclear whether and how the feelings of the perceivers themselves are affected. It has long been clear that attributions of emotions can occur quickly (Arnold, 1960). Therefore, the second aim of the study


M. Forster et al., Fluency and Affective Processing

was to identify whether the feeling of fluency, elicited by the stimulus, is only reflected in the evaluation of the stimulus or whether evaluations of the participants’ own feelings are affected as well. On the one hand, it is possible that interpreting their feeling of fluency the perceivers feel aroused or positive themselves. On the other hand, if felt fluency is directly attributed to the stimulus which elicited the feeling of fluency, a higher feeling of fluency might not be felt as generally positive or arousing, but might only be found in the evaluations of the stimulus. We tested these possibilities by including two additional conditions, in which two separate groups of participants in each trial were asked to rate their feeling of valence or arousal. If the feeling of fluency directly affects the evaluation of the stimulus, the perceiver might not feel aroused or more positive. On the other hand, if the feeling of fluency changes the feelings of the perceivers, which persist even if they are attributed to the stimulus as their origin, the evaluations of the perceivers about their own feelings might be affected, too. Capturing the affective nature of the feelings of fluency with self-reports offers insight into the consciously accessible and reportable experiences participants have after a manipulation of processing ease. However, those parts of the fluency experience that influence our evaluations without being consciously accessible or which are only experienced at the “fringes of consciousness” (Reber, Fazendeiro, & Winkielman, 2002; Topolinski & Strack, 2009) can hardly be captured by self-reports. Thus, in a second experiment we measured physiological indices of valence (fEMG) and arousal (SC) while a new set of participants performed a similar experimental task as in Experiment 1. 
The method of facial electromyography (facial EMG) allows unobtrusive measurement of even subtle changes of affective processing (see, e.g., Larsen, Berntson, Poehlmann, Ito, & Cacioppo, 2008; Larsen, Norris, & Cacioppo, 2003). Affectively positive responses are indicated by activations in the M. zygomaticus major region, a muscle region involved in smiling, and by relaxations in the M. corrugator supercilii region, a muscle region involved in frowning. Affectively negative responses are indicated by activation in the M. corrugator supercilii region. Previous research has already successfully demonstrated that manipulations of ease of processing influence fEMG activation (Gerger et al., 2011; Topolinski et al., 2009; Winkielman & Cacioppo, 2001). Thus, if fluency is hedonically marked (Winkielman et al., 2003), high ease of processing should lead to activation in the M. zygomaticus major region and relaxation in the M. corrugator supercilii region.


Measuring the skin conductance (SC) allows capturing the autonomic arousal (Boucsein, 2012). The skin conductance response (SCR) to a stimulus occurs around 1–3 s after stimulus onset (Morris, Cleary, & Still, 2008). The effect of ease of processing on indices of autonomic arousal has, to our knowledge, not yet been studied. According to the perceptual fluency/attributional account (Bornstein & D’Agostino, 1994), suggesting that higher ease of processing results in a valence-independent unspecific activation, we should find higher SCRs with higher ease of processing. Research on feelings of familiarity, an adjacent concept to feelings of fluency (Whittlesea & Williams, 2000, 2001a, 2001b), showed mixed findings on the association between skin conductance and familiarity. Typically, higher SCRs are associated with stimulus novelty (see Dawson, Schell, & Filion, 2000, for an overview). However, for example in faces, higher SCRs were also associated with higher familiarity (Tranel, Fowles, & Damasio, 1985, but see Ellis, Quayle, & Young, 1999). According to Morris et al. (2008), different task demands might be responsible for these inconsistent findings. They argue that when the task demands attention to novelty, novel items are more salient and show higher SCRs. However, when the task demands attention to familiarity, familiar items are more salient and show higher SCRs (see also Öhman, 1979). Studying the feelings of familiarity Morris et al. (2008) could furthermore show that familiar (and thus easier to process) stimuli compared to unfamiliar stimuli were associated with longer SCR latencies. This indicates that reactions to familiar items are slower. Taken together, previous research allows no clear predictions regarding the effect of ease of processing on the SCR. 
Nonetheless, following the notion that feelings of fluency could be a valence-independent unspecific activation and the fact that higher ease of processing makes a stimulus more salient, we expected higher SCRs at higher ease of processing.

The Present Research

In this article, we present two experiments: First, a behavioral experiment in which we tested whether variations in ease of processing are reflected in valence or arousal ratings of both the stimulus and/or the perceiver; second, a physiological experiment in which we measured how variations in ease of processing influence physiological indices of facial EMG (reflecting valence) and of skin conductance (reflecting arousal). In both experiments we additionally tested whether the variations in ease of processing indeed influence self-reported feelings of fluency. Taken together, both experiments should provide insight into the nature and effect of feelings of fluency.

Experiment 1

Method

We studied the effects of an ease of processing manipulation on ratings of valence and arousal of both the stimulus and the perceiver. We manipulated the ease of processing by varying the presentation duration of the images between 100 and 400 ms. In previous research, manipulations of image duration led to reliable effects on a (usually not measured and only assumed) subjective experience of ease of processing – that is, the feeling of fluency (Forster et al., 2013) – and on (usually measured) liking evaluations (Forster et al., 2013; Reber, Winkielman, & Schwarz, 1998; Winkielman & Cacioppo, 2001). Previous research suggests that awareness of the manipulation is detrimental to effects of fluency on liking (so-called discounting, see Bornstein & D’Agostino, 1994; but see Newell & Shanks, 2007). Although a manipulation of feelings by stimulus duration may be apparent for the participants in hindsight, it seems that the participants do not always recognize the duration manipulation as a source of the ease of processing and therefore the effects of the manipulation seem to be resistant against discounting (see, e.g., Forster et al., 2013; or Reber et al., 1998).

Participants

A total of 192 German-speaking volunteers (130 female, Mage = 23.17 years, SDage = 5.25) participated. The sample consisted of students recruited over the departments’ participant system and of volunteers recruited by the experimenters around the University premises. In postquestionnaires we ensured that all participants were naïve regarding the purpose of the experiment. Prior to the experiment, all participants signed a consent form informing them that they could withdraw at any time during the experiment without any further consequences. Furthermore, we ensured that participants had sufficient visual acuity (close reading test by Nieden, Oculus©, Wetzlar, Germany) and color vision (plates 1, 11, and 13 of Ishihara, 1917).

Stimuli

As stimuli, we selected 100 black-and-white images (size: 3.2 × 2.3 inch, or 6.79° × 4.75° of visual angle at a viewing distance of 27.56 inches) from the picture set of Rossion and Pourtois (2004). This set comprises 260 simple line drawings of objects (e.g., a tree, an anchor, a dog, etc.). As too high/low valences might override the effect of ease of processing, prior to the experiment, all stimuli were prerated on valence (N = 7 participants). Nine stimuli with very high and six stimuli with very low valence were excluded from the initial set. For the remaining stimuli the mean


Figure 1. Sequence of a trial in Experiment 1. FF = felt fluency.

was between 3.5 and 5.8 on a 7-point scale ranging from 1 to 7. From the remaining stimuli we selected two sets of 50 images each. In these sets complexity, familiarity (both ratings provided by Rossion & Pourtois, 2004), and valence were roughly equal. Because all drawings were initially easy to perceive, difficulty was increased by adding 60% Gaussian noise to the images using Adobe Photoshop CS4. This manipulation avoids floor effects in felt fluency ratings. Furthermore, noise increases perceptual uncertainty, a favorable precondition for measuring felt fluency (see also Forster et al., 2013).

Design and Procedure

The feeling dimensions (valence, arousal) and the rating dimensions (stimulus evaluations, participants’ feeling) were crossed, resulting in four different conditions that varied between participants (n = 48 per condition). Thus the four conditions differed from each other in the evaluations participants made – that is, evaluations about how arousing (Condition 1) or positive (Condition 2) the stimulus was, or how aroused (Condition 3) or positive (Condition 4) the participant felt. Upon arrival, each participant was pseudorandomly assigned to one condition. In each condition, we also tested whether the ease of processing manipulation affected felt fluency (Forster et al., 2013). For this purpose, in a separate block we asked our participants to evaluate their felt fluency. The longer the stimulus has been presented, the higher the feeling of fluency should be.

Thus, the participants rated each image twice; in one block on the feeling dimension – valence or arousal – and in the other block on felt fluency. Block order was balanced across participants. Ease of processing was manipulated through different presentation durations. For that purpose, we divided the pool of stimuli into four sets of 25 images each. Next, we assigned one of four presentation durations (100, 200, 300, or 400 ms) to each of the four sets. Across participants, we varied the assignment of stimulus sets to presentation durations according to a Latin square design. Thus, for each condition there were eight versions: 4 (Set × Presentation duration) × 2 Block orders. This assured that across all participants each stimulus was presented at all four different presentation durations and that effects of block order were testable. The experiment started with an instruction in which participants were told that they should rate line drawings in two separate experiments. This instruction was intended to conceal the link between felt fluency evaluations and affective evaluations. After the general instruction, four practice trials were administered to familiarize the participants with task and stimuli. Each experimental trial started with a fixation cross for 2 s, followed by the target image for 100, 200, 300, or 400 ms. Next, to limit visual recognition, a random noise mask (60% Gaussian noise) was presented for 500 ms. Finally, the response was prompted (see Figure 1). Depending on the condition, in one block participants rated either stimulus valence or arousal, or felt valence or arousal² on a 9-point self-assessment manikin (SAM) scale (Bradley & Lang, 1994; modified by Irtel, 2007). Following Forster et al. (2013), participants rated their felt fluency of perception (“Wie leicht ist es Ihnen gefallen das Bild wahrzunehmen?” [“How easy was the perception of the image?”]) on a Likert-type scale from 1 (= very hard) to 7 (= very easy) in a separate block. Because the stimuli in all trials were only mildly emotionally toned – being simple schematic line drawings of objects – we asked the participants to evaluate felt fluency and valence/arousal in comparison with the felt fluency and the valence/arousal elicited in other trials. For the first trials, we instructed participants to compare felt fluency and valence/arousal to the practice trials. In other words, the reference for the subsequent ratings was established during the practice trials. Within the blocks, presentation order of the stimuli and different durations were randomized. After completing the experiment, the participants were debriefed and thanked. The experiment was run using E-Prime® 2.0 (Schneider, Eschman, & Zuccolotto, 2002) and was presented on a 19″ display at a resolution of 1,280 × 1,024 pixels and a refresh rate of 60 Hz. To help distinguish the participants’ valence/arousal evaluations of the stimulus from the participants’ valence/arousal evaluations of their feelings, throughout the Results section we used the terms stimulus valence/arousal and felt valence/arousal, respectively.

Results and Discussion

To investigate whether ease of processing influenced the feeling dimensions and felt fluency, we performed two separate 4 × 2 × 2 mixed-design analyses of variance (ANOVAs): one for the feeling dimensions valence and arousal, and one for felt fluency. Ease of processing (presentation duration: 100, 200, 300, or 400 ms) was a within-participants variable. The feeling dimension (valence, arousal) and the rating target (stimulus, participants’ feeling) were between-participants variables.

Table 1. Means and standard deviations (in parentheses) separated by presentation duration in all four conditions

                                    Presentation duration, M (SD)
Feeling dimension   Rating target   100 ms        200 ms        300 ms        400 ms
Valence             Stimulus        5.16 (0.79)   5.39 (0.80)   5.49 (0.77)   5.57 (0.89)
Valence             Felt            5.09 (0.63)   5.20 (0.58)   5.20 (0.53)   5.27 (0.48)
Arousal             Stimulus        3.71 (1.35)   3.75 (1.23)   3.71 (1.19)   3.63 (1.11)
Arousal             Felt            3.74 (1.39)   3.85 (1.44)   3.90 (1.56)   4.06 (1.56)

The respective ratings of valence, arousal, and felt fluency served as dependent variables. In all analyses, the level of statistical significance was p < .05. When the sphericity assumption was violated, a Greenhouse-Geisser correction was applied and the corrected degrees of freedom are reported. For multiple pairwise comparisons, the level of alpha was Bonferroni-corrected. Raw data (ESM 1-4), aggregated data (ESM 5), and analysis routines (ESM 10) from Experiment 1 can be found online. For the feeling dimensions (valence and arousal), the ANOVA showed a significant main effect of ease of processing, F(2.5, 464.36) = 8.74, p < .001, ηp² = .04, and of feeling dimension, F(1, 188) = 103.68, p < .001, ηp² = .36. As indicated by the means (see Table 1), ratings of both valence and arousal in the 100 ms condition were significantly lower than in the 200, 300, and 400 ms conditions, respectively (ps < .016), and valence ratings were higher than arousal ratings. The main effect of rating target was not significant, F(1, 188) = 0.04, p = .949, ηp² < .001. The significant main effects, however, were qualified by a significant three-way interaction between ease of processing, feeling dimension, and rating target, F(2.47, 464.36) = 5.07, p = .002, ηp² = .03. Thus, we analyzed the pattern of the mean differences with pairwise comparisons (Bonferroni-corrected). For stimuli, the valence ratings (100 ms vs. 200, 300, and 400 ms, respectively, ps < .027), but not the arousal ratings (ps > .849), significantly increased with increasing presentation duration. For ratings of the perceivers’ feeling, the arousal ratings (400 ms vs. 100 and 200 ms, respectively, ps < .052), but not the valence ratings (ps > .557), significantly increased with increasing presentation duration (see Figure 2). This dissociation indicates that when rating a stimulus, a higher feeling of fluency is attributed to a higher valence, as predicted by the theory of processing fluency.
However, when rating the personal feeling, a higher feeling of fluency is instead attributed to a higher arousal. This latter finding indicates that positive stimulus evaluations are not due to a general positive feeling on the side of the perceiver which is then attributed to the stimulus, but rather

² The respective questions in the experiment were: for stimulus valence (“Wie positiv/negativ ist dieses Bild?” [“How positive/negative is the image?”]), for stimulus arousal (“Wie stark aktiviert dieses Bild?” [“How arousing is the image?”]), for felt valence (“Welche Stimmung löst die Wahrnehmung des Bilds aus?” [“In which mood does the perception of the image set you?”]), and for felt arousal (“Wie stark aktiviert Sie die Wahrnehmung dieses Bilds?” [“How strongly do you get aroused by the perception of this image?”]).


Figure 2. Mean scores for ratings of stimulus (dotted line) and felt (dashed line) valence (circles) and arousal (triangles). Significant effects are indicated by p values (from Bonferroni-corrected pairwise comparisons). Error bars represent standard errors of the mean.

Figure 3. Mean ratings for felt fluency in all four conditions of Experiment 1 and in Experiment 2 (all pairwise comparisons, ps < .007, except 300 ms vs. 400 ms). Error bars represent standard errors of the mean. Cond. = condition; FF = felt fluency.

that higher ease of processing directly influences the liking of the object.

For felt fluency, the mixed-design ANOVA only showed a main effect of ease of processing, F(2.6, 482.44) = 116.80, p < .001, ηp² = .38 (see Figure 3). None of the other effects approached significance (ps > .321). Pairwise comparisons showed that the longer the stimuli were presented, the higher the felt fluency (all pairwise comparisons, p < .001, except 300 vs. 400 ms, p = .055). This finding is in line with previous findings showing that differences in ease of processing can be explicitly reported (Forster et al., 2013; Reber, Wurtz, et al., 2004). Crucially, it also shows that differences in felt fluency were present in all the conditions and that the four conditions did not differ in the ratings of felt fluency. Thus, the above-stated differences in the influence of ease of processing on the feeling dimensions are unlikely to be caused by differences in the feeling of fluency. It is rather likely that fluency is used more as a source for stimulus valence and felt arousal, and less so for stimulus arousal and felt valence.

Ratings of felt fluency and ratings of arousal and valence are based on the same stimuli. Showing the same images in the felt fluency block and the feeling dimension block allows stronger conclusions regarding the relationship of felt fluency and evaluated feeling dimensions. This procedure, however, might lead to confounding effects of block order and especially of additional fluency due to stimulus repetition. Therefore, in an additional analysis we tested for possible effects of block order. For the sake of parsimony only significant effects including block order will be reported. For the feeling dimensions, a 4 × 2 × 2 × 2 mixed-design ANOVA (all factors from above plus block order) yielded a significant interaction between rating target and block order, F(1, 184) = 6.32, p = .013, ηp² = .03. For stimuli, but not for feelings, ratings were higher in the second than in the first block (p = .019). This indicates a mere exposure effect (Zajonc, 2001) of repeated presentation on stimulus ratings. As the effect is the same for arousal and valence – indicated by the absence of any interactions between block order and the feeling dimensions – this effect does not challenge our conclusions. All other effects including block order were not significant (ps > .078). For felt fluency, a 4 × 2 × 2 × 2 mixed-design ANOVA (all factors from above plus block order) yielded a significant main effect of block order, F(1, 184) = 10.47, p = .001, ηp² = .05. Felt fluency ratings were significantly higher when tested first. This finding is not compatible with a mere exposure explanation, which would predict higher felt fluency ratings when stimuli are repeated. Maybe participants recognized the repetition and, as a consequence, fluency effects were discounted (Bornstein & D’Agostino, 1994). Be that as it may, fluency effects as a consequence of the duration manipulation remained intact. This was indicated by the absence of an interaction between duration effects and block order. In summary, block order does not challenge the conclusions from our main analyses.

Though the behavioral results indicate that participants consciously experienced fluency, in Experiment 2 we assessed the affective components of fluency experiences by physiological measures. By measuring fEMG and skin conductance responses we might capture some affective effects of ease of processing manipulations that are not consciously accessible or reportable.
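As a reading aid for the effect sizes reported in this section, partial eta squared can be recovered from an F ratio and its degrees of freedom as ηp² = (F × df1) / (F × df1 + df2). The following is a minimal sketch of that arithmetic only. It is not the authors' analysis routine (that is provided as ESM 10), the function name `partial_eta_squared` is hypothetical, and the inputs are the rounded values printed in the text, so small rounding differences are possible for other effects.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


# Values copied from the Results text of Experiment 1 (label: F, df1, df2).
reported = {
    "feeling dimension (main effect)": (103.68, 1, 188),
    "rating target x block order": (6.32, 1, 184),
    "block order on felt fluency": (10.47, 1, 184),
}

for label, (f_value, df1, df2) in reported.items():
    print(f"{label}: eta_p^2 = {partial_eta_squared(f_value, df1, df2):.2f}")
```

Under these inputs the three values round to .36, .03, and .05, in line with the effect sizes stated in the text.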

Experiment 2

Method

Participants

A total of 26 university students (18 female, Mage = 22.6 years, SDage = 5.7) were recruited via the departments’ participant system and participated in return for partial course credit. As in Experiment 1, all participants had sufficient visual acuity and color vision and were naïve to the purpose of the experiment. Prior to the experiment, all participants signed a consent form informing them that they could withdraw at any time during the experiment without any further consequences.

Stimuli

The same stimuli as in Experiment 1 were used.

Design and Procedure

Due to the nature of the physiological measurement our experimental design had to be slightly adapted. Testing all four conditions, both feeling dimensions and both rating targets, in a physiological setting was not feasible due to the intricacies of the physiological recording and due to availability of only few participants. We thus resorted to testing felt fluency, the block that was common to all conditions. Thus, we measured fEMG and skin conductance responses while participants rated the felt fluency of the processing of the stimuli. Both the manipulation of ease of processing through presentation duration (100, 200, 300, 400 ms) and the stimuli were the same as in Experiment 1. Upon arrival participants filled out the consent form and were informed about the procedure. Then physiological measures were prepared. For measuring fEMG, two Ag/AgCl electrodes each (4 mm diameter, 7 mm housing) were placed over the left M. corrugator supercilii and left M. zygomaticus major region resulting in a bipolar measurement (following the guidelines by Fridlund & Cacioppo, 1986). As ground, a single Ag/AgCl electrode was placed on the right mastoid (behind the ear). To reduce electrode impedance prior to electrode placement the skin was cleaned with alcohol (70% isopropyl alcohol) and abrasive gel was applied (Nu Prep, Weaver, USA). For measuring skin conductance, Velcro® strip electrodes were attached to the palmar middle phalanges of the index and middle finger of the nondominant hand (see Boucsein, 2012). To assure optimal measuring conditions participants were told to wash their hands in tepid water prior to electrode attachment. For better conductivity EC 33 conductive gel (Grass Technologies, West Warwick, RI) was applied between the skin and the electrodes. Then all electrodes were connected to the amplifier (TMS International Refa8, 32 channel amplifier, TMSi, Twente, The Netherlands) and impedance was checked (to be below 10 kΩ).
As physiological measures require both a certain period in which the signal can be acquired and an intertrial interval during which the activation can return to a baseline level, the trial structure had to be adapted as well. Furthermore, presenting the fixation cross for 2 s in Experiment 1 might have been too long.³ This might have resulted in participants not looking exactly at the middle of the screen and thus in detrimental effects on the perception of the target image. We thus changed the duration of the fixation cross to 200 ms. Similar to Experiment 1, the fixation cross was followed by the target stimulus for either 100, 200, 300, or 400 ms. The target was followed by the same random noise mask as in Experiment 1. The mask was, however, presented for 5 s. Although activation of both fEMG and SC was measured continuously, the activation following the target (i.e., during the presentation of the mask) was of main interest (see below for further details). After the mask, participants rated their felt fluency (“Wie leicht ist es Ihnen gefallen das Bild wahrzunehmen?” [“How easy was the perception of the stimulus?”]) on a Likert-type scale from 1 (= very easy) to 7 (= very hard).⁴ To allow the activation to return to baseline, an intertrial interval ranging between 6 and 6.5 s was included (see Figure 4). Presentation order of the stimuli and different durations were randomized. After completing the experiment, the participants were thoroughly debriefed about the purpose of the experiment and thanked. The experiment was run using E-Prime® 2.0 (Schneider et al., 2002) and was presented on a 31″ display at a resolution of 2,400 × 1,200 pixels and a refresh rate of 60 Hz.

Figure 4. Sequence of a trial in Experiment 2. FF = felt fluency; ITI = Intertrial interval.

³ We thank the reviewer for pointing that out.

© 2016 Hogrefe Publishing

Experimental Psychology 2016; Vol. 63(1):45–58

M. Forster et al., Fluency and Affective Processing

Physiological Data Preparation and Analysis

Both fEMG and SC were recorded at 2,048 Hz and filtered online with a 500 Hz low-pass filter. The data were then stored on hard drives and later preprocessed offline. All data preprocessing was performed in Matlab 7.14 (MathWorks Inc., USA) using the EEGLAB toolbox (Version 13.0.0, Delorme & Makeig, 2004) for fEMG and the Ledalab toolbox (Version 3.4.7, Benedek & Kaernbach, 2010a, 2010b) for SC. For further analysis the fEMG and the SC data were split up into two separate files to facilitate processing.

The fEMG signal was filtered with a 20 Hz high-pass filter to reduce blink artifacts (van Boxtel, 2001) and a 45–55 Hz notch filter to reduce power line artifacts. Both filters were applied with an updated filtering function in EEGLAB (pop_eegfiltnew(), see Widmann & Schröger, 2012). Then the signal was rectified and smoothed with a moving average filter (size: 125 ms, see also Topolinski et al., 2009). Next, the signal was cut into trials and baseline corrected. For the baseline we chose the 1 s prior to the fixation cross (1,200 ms to 200 ms prior to the stimulus). Thus the signal now reflected the change in activation from the baseline. This signal was then inspected for artifacts by comparing the activation in the data with video recordings. All instances where activation was due to movement unrelated to the actual task (such as chewing, scratching, or touching the electrodes) were removed from the data. After artifact coding, data from one participant had to be removed from further analysis, because fewer than five trials remained in three of the four conditions (the maximum per condition was 25). In a final step the data were z-transformed per muscle region and participant (Bush, Hess, & Wolford, 1993; Gerger, Leder, & Kremer, 2014; Winkielman & Cacioppo, 2001). To test how the fEMG signal developed over time, the data were averaged separately for each of the 5 s following the stimulus, for each condition, muscle region, and participant.
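The steps that follow the filtering (rectification, smoothing, baseline correction, z-transformation, and 1-s binning) can be sketched as below. This is a minimal illustration in plain Python and not the authors' Matlab/EEGLAB code: all function names are hypothetical, the trailing (causal) form of the moving average and the use of the sample standard deviation are assumptions, and the high-pass and notch filters are omitted.

```python
from statistics import mean, stdev

FS = 2048                 # sampling rate in Hz, as reported in the text
WINDOW = int(0.125 * FS)  # 125 ms moving-average window in samples


def rectify(signal):
    """Full-wave rectification: absolute value of every sample."""
    return [abs(x) for x in signal]


def moving_average(signal, window):
    """Trailing moving average (assumed form); shorter at the start."""
    out = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def baseline_correct(epoch, baseline):
    """Subtract the mean of the baseline samples from every epoch sample."""
    level = sum(baseline) / len(baseline)
    return [x - level for x in epoch]


def zscore(values):
    """z-transform values of one participant and muscle region.

    Uses the sample SD, which is an assumption; the text does not say
    which estimator was used.
    """
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def bin_means(epoch, fs, n_bins):
    """Mean of each consecutive segment of `fs` samples (1 s each)."""
    return [sum(epoch[i * fs:(i + 1) * fs]) / fs for i in range(n_bins)]
```

With the reported sampling rate, the 125 ms window corresponds to 256 samples, and `bin_means(epoch, FS, 5)` would yield the five 1-s segment means of the 5 s following the stimulus.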

⁴ Note that the scale anchoring is reversed relative to Experiment 1. Informal questioning of participants after Experiment 1 indicated that the anchoring from 1 (= very easy) to 7 (= very hard) is more intuitive. For data analysis the felt fluency rating scale was inverted to simplify the interpretation; higher ratings consequently represent higher felt fluency. As can be seen in the Results section, the change in scale anchoring did not affect the results.


The SC signal was first filtered with a 45–55 Hz notch filter to reduce power line artifacts. As the SC response is slower than the fEMG response (Dawson et al., 2000), the amount of data was reduced by first applying a 10 Hz low-pass filter and then downsampling the signal to 32 Hz. Next, the signal of each participant was visually inspected for artifacts. Data from one participant had to be removed due to a high number of artifacts and signal losses. Data from the remaining participants were then further processed using the Ledalab toolbox (Version 3.4.7, Benedek & Kaernbach, 2010a, 2010b). Here, the signal was first smoothed with a Gaussian filter (16 Hz). To separate the tonic SC activation, which slowly fluctuates over time, from the stimulus-dependent phasic activation, continuous decomposition analysis (CDA; Benedek & Kaernbach, 2010a) was performed. As the SC signal is comparatively slow, the sum of all activations exceeding an amplitude of at least 0.01 μS from the end of the first second to the fifth second (thus, 4 s in total) after stimulus offset was computed separately for each trial (following the guidelines of the Society for Psychophysiological Research Ad Hoc Committee on Electrodermal Measures, 2012). The data were then z-transformed within participant. Cursory inspection of the data revealed a few trials in which the z-transformed conductance was much higher than in all other trials. To correct for such outliers, we removed all trials in which the z-transformed conductance fell outside the mean conductance ± 3 standard deviations (calculated separately for each participant). After removing these outliers (2.4% of all cases), the remaining trials were averaged per condition and participant in IBM SPSS Statistics 20.
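The last three steps (z-transformation within participant, removal of trials beyond ± 3 standard deviations, and averaging per condition) can be illustrated with a short sketch. It is written in plain Python under stated assumptions and is not the authors' SPSS routine: the function names and the data layout (one list of condition/value pairs per participant) are hypothetical, the sample standard deviation is assumed, and the z-scores are not re-standardized after outlier removal.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore(values):
    """z-transform the trial values of one participant (sample SD assumed)."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def condition_means(trials, cutoff=3.0):
    """Average z-scores per condition after dropping outlier trials.

    trials: list of (condition, summed_sc) pairs for ONE participant,
            where condition is the presentation duration in ms.
    cutoff: trials with |z| above this value are discarded.
    """
    z = zscore([value for _, value in trials])
    kept = defaultdict(list)
    for (condition, _), z_value in zip(trials, z):
        if abs(z_value) <= cutoff:
            kept[condition].append(z_value)
    return {condition: mean(vals) for condition, vals in kept.items()}
```

Because the z-scores of one participant have a mean of 0 and a standard deviation of 1 by construction, the criterion of mean ± 3 standard deviations reduces to |z| > 3 in this sketch.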

Results and Discussion

Behavioral Data

To analyze how variations in ease of processing influence the reported feeling of fluency, we performed a one-way repeated-measures ANOVA with ease of processing (presentation duration of 100, 200, 300, or 400 ms) as the factor and the ratings of felt fluency (inverted, see above) as the dependent variable. Replicating Experiment 1, we found a significant main effect of ease of processing, F(1.87, 46.63) = 23.48, p < .001, ηp² = .48 (see also Figure 3). Post hoc pairwise comparisons showed significant differences among all presentation durations (ps < .007), except 300 versus 400 ms (p = .99).⁵ These results show that slightly adapting the trial procedure and measuring physiological data did not harm the effect of ease of processing on feelings of fluency. At least from 100 to 300 ms felt fluency
increased with longer presentation durations. The data can be found online (ESM 6).

Facial Electromyography (fEMG) Data

To analyze whether higher ease of processing leads to more positive affective reactions, we performed two repeated-measures ANOVAs, one for the M. corrugator supercilii region and one for the M. zygomaticus major region. As we were interested in how the effects change over time, the 5 s after the stimulus presentation and prior to the response were binned into five 1-s segments. This resulted in two 4 (ease of processing: 100, 200, 300, or 400 ms) × 5 (time segment: 1–5) ANOVAs with the mean z-transformed activation over the muscle region as the dependent variable. The original unit of fEMG activation is millivolt (mV). Due to z-transformation the resulting values are not interpretable as activation in mV; the direction of effects, however, remains the same: the higher the value the higher the activation, and vice versa. For the M. corrugator supercilii we found a significant main effect of ease of processing, F(3, 72) = 3.25, p = .027, ηp² = .12, a significant main effect of time segment, F(2.02, 48.46) = 4.35, p = .018, ηp² = .15, but no interaction, F(12, 288) = 1.61, p = .09, ηp² = .06. Post hoc pairwise comparisons indicated an inhibition of the corrugator activation at higher compared to lower ease of processing. The comparison between 100 and 400 ms showed a trend (p = .074; all other comparisons, ps > .148). For the time segments we found a lower activation in the first segment (M = 0.34, SD = 0.4) than in the fourth segment (M = 0.10, SD = 0.4, p = .005). For the M. zygomaticus major we found a significant main effect of time segment, F(2.16, 51.89) = 38.42, p < .001, ηp² = .62, but no effects of ease of processing, F(3, 72) = 1.20, p = .317, ηp² = .05, and no interaction, F(3.70, 88.85) = 0.69, p = .591, ηp² = .03.
Post hoc pairwise comparisons showed that zygomaticus activations in seconds 1 and 2 were significantly higher than in seconds 3, 4, and 5 (ps < .002). The pattern of fEMG results indicated that in our experiment higher ease of processing was only reflected in an inhibition of the M. corrugator supercilii region that hints at a slightly positive affect (see also Topolinski et al., 2009). Nonetheless, the activation of the M. zygomaticus major region, which should also have indicated this positive affect, was not influenced. Regarding the time segments, the data pattern showed that corrugator activation increased while zygomaticus activation decreased. Over the duration of a trial, participants exhibited increased negative affective reactions. This could be due to presenting mildly affective stimuli for very short durations in combination with rather long intervals between the stimulus and the response and between trials. However, the recording of the fEMG and SC signal necessitated these intervals. If the participants reacted with negative affect to the increased intervals, this effect could have built up across the segments and might have overshadowed the subtle effects of feelings of fluency. To tackle this issue we also analyzed only the first time segment, that is, the first second of the recordings. For the corrugator a one-way ANOVA (ease of processing: 100, 200, 300, or 400 ms) with the mean z-transformed activation over the muscle region in the first second as the dependent variable showed a main effect of ease of processing, F(3, 72) = 3.16, p = .030, ηp² = .12. As before, activation tended to decrease over time (see Figure 5). For the zygomaticus the ANOVA again showed no main effect of ease of processing, F(3, 72) = 0.56, p = .644, ηp² = .02. All data can be found online (ESM 7). To sum up, the only indication for an increased positive affect due to higher ease of processing (Gerger et al., 2011; Topolinski et al., 2009; Winkielman & Cacioppo, 2001) was a slight inhibition of the M. corrugator region with longer presentation duration of the stimuli, and consequently higher ease of processing. The activation of the M. zygomaticus region remained largely uninfluenced by our manipulation.

⁵ In the preprocessing of the fEMG and SC signals, data from two participants (one in each signal) had to be excluded from further analysis. We thus ran the analysis of the felt fluency ratings also without those two participants. The results did not change. We report the analysis of the full sample of participants.

Figure 5. Mean z-transformed activation of the first second over the M. zygomaticus major region (solid line) and the M. corrugator supercilii region (dashed line) in Experiment 2. Error bars represent standard errors of the mean.

Skin Conductance (SC) Data

To analyze whether higher ease of processing leads to a higher unspecific activation, we performed a repeated-measures ANOVA with ease of processing (presentation duration of 100, 200, 300, or 400 ms) as a factor and the summed z-transformed SC activation poststimulus
(seconds 1–5) as the dependent variable. The original unit of SC activation is microsiemens (μS); siemens is the SI unit of electrical conductance. Due to z-transformation the resulting values are not interpretable as conductance in μS; the direction of effects, however, remains the same: the higher the conductance the higher the activation, and vice versa. The ANOVA showed a main effect of ease of processing, F(3, 72) = 8.55, p < .001, ηp² = .26. Contrary to our prediction, the conductance at higher ease of processing was significantly lower compared to lower ease of processing. The conductance at 100 ms (M = 0.01, SD = 0.14) and at 300 ms (M = 0.09, SD = 0.11) was significantly higher than at 400 ms (M = 0.21, SD = 0.12, ps < .038, see Figure 6). Thus, in our case higher ease of processing did not come along with higher, but lower, unspecific activation.

Figure 6. Mean z-transformed skin conductance during seconds 1–4 in Experiment 2. Error bars represent standard errors of the mean.

Morris et al. (2008) found that feelings of familiarity were reflected in longer SCR latencies. We thus also analyzed how the latency of the skin conductance was affected by ease of processing in an ANOVA, with ease of processing (presentation duration of 100, 200, 300, or 400 ms) as a factor and the latency of the first SCR (exceeding the threshold of 0.01 μS within the window of seconds 1–5 poststimulus) as the dependent variable. The ANOVA showed no effect of ease of processing on the latencies, F(1.98, 47.44) = 0.66, p = .522, ηp² = .03. For the sake of completeness and parallel to the fEMG analysis, we also analyzed the SCRs in the first second. Due to the slow response of skin conductance (Society for Psychophysiological Research Ad Hoc Committee on Electrodermal Measures, 2012) we chose Second 2 poststimulus as the equivalent of Second 1 in the fEMG analysis. The results were similar to the analyses above. We still found a significant, albeit weaker, main effect of ease of processing on the summed z-transformed SC activation, F(3, 72) = 2.98, p = .037, ηp² = .11. This is in line with the fact that it takes some time for the SCR to unfold and with the suggestion to analyze a response window of 4 s poststimulus starting at the first second (Society for Psychophysiological Research Ad Hoc Committee on Electrodermal Measures, 2012). Also the latencies of the first SCR exceeding the threshold did not significantly vary with ease of processing, F(2.38, 57.16) = 1.24, p = .301, ηp² = .05. All data can be found online (ESM 8 and 9).
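The effect sizes and degrees of freedom reported in the Results can be cross-checked from the F values alone. The sketch below uses the standard identity ηp² = F·df1 / (F·df1 + df2) and the error degrees of freedom of a fully within-subject design, (n − 1) times the product of (levels − 1). The function names are hypothetical. The reading that 25 participants remained in each physiological analysis follows from the exclusions described above, and the interpretation of the fractional degrees of freedom as a sphericity correction is an assumption, since this section does not name the correction used.

```python
def error_df(n_participants, *factor_levels):
    """Error df of a within-subject effect: (n - 1) * prod(levels - 1)."""
    df = n_participants - 1
    for levels in factor_levels:
        df *= levels - 1
    return df


def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its df."""
    return f_value * df_effect / (f_value * df_effect + df_error)


def implied_epsilon(df_effect_reported, df_effect_uncorrected):
    """Ratio of reported to uncorrected df (assumed sphericity epsilon)."""
    return df_effect_reported / df_effect_uncorrected


# Examples taken from the reported statistics
corrugator_main = partial_eta_squared(3.25, 3, 72)      # reported as .12
sc_main = partial_eta_squared(8.55, 3, 72)              # reported as .26
rating_main = partial_eta_squared(23.48, 1.87, 46.63)   # reported as .48
```

For instance, `error_df(25, 4)` gives the 72 error degrees of freedom of the one-way physiological ANOVAs, and `error_df(25, 4, 5)` gives the 288 of the 4 × 5 interaction.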

General Discussion

In two experiments we tested effects of variations in ease of processing on feelings of fluency and on affective responses. Theories on processing fluency suggest that the feeling of fluency is either affectively positive (Winkielman et al., 2003) or an unspecific activation (Bornstein & D’Agostino, 1994). Our results on a behavioral level show that higher ease of processing indeed leads to higher feelings of fluency and a more positive evaluation of the stimulus. On the one hand, this was indicated by higher stimulus valence ratings, but not higher stimulus arousal ratings. On the other hand, the effects of ease of processing on the perceivers’ feelings showed a different pattern. Although higher feelings of fluency were reported, participants only gave higher felt arousal ratings, but not more positive valence ratings. On a physiological level, tested in Experiment 2, we found only little evidence for physiological correlates of the effects. In line with Experiment 1, reported feelings of fluency increased with longer presentation durations. The notion that the feeling of fluency is affectively positive (Winkielman et al., 2003) could be shown in a slight inhibition of the frowning muscle region. A decrease in SC responses with higher ease of processing, however, was even in contrast to the notion that feelings of fluency led to an unspecific activation (Bornstein & D’Agostino, 1994).

The behavioral findings supported both theories. Experiment 1 extends previous research by showing that the validity of the different theories depends on the target of the ratings. When evaluating a stimulus, felt fluency elicited by that stimulus leads to a more positive evaluation of that stimulus. This finding is in line with the hedonic marking of fluency.
However, when a more object-detached personal feeling of fluency has to be evaluated, felt fluency, elicited by a stimulus, is not evaluated as something positive in valence, but results in an unspecific arousal. It could thus be argued that the feeling of fluency per se is an unspecific activation which, in combination with a stimulus, is of
positive valence. This might be due to the fact that the unspecific activation is misinterpreted as a positive affective response toward the stimulus (Dutton & Aron, 1974). Together with the absence of an effect on stimulus arousal, this finding supports the hypothesis that, at least when rating a stimulus, processing fluency is mainly characterized by positive feeling dimensions (see Exp. 2 in Reber et al., 1998). The findings of Mandler et al. (1987), showing that even evaluations of brightness and darkness are influenced by higher ease of processing, are, however, inconsistent with this hypothesis. At this point we can only speculate about the underlying causes. One possible explanation might be the context of the experiment. Previous research has shown that the context also influences the interpretation of fluency (Unkelbach, 2007). In the experiments of Mandler et al. (1987), stimuli were presented very briefly (1 or 2 ms). Therefore, participants might have been in a state of high uncertainty or even bad mood due to not seeing the stimuli. This could have produced the nonspecific effects in the evaluations of the stimuli. However, Albrecht and Carbon (2014, Exp. 2) showed that nonspecific effects occurred even when the stimuli were clearly visible. Also, in Experiment 2 of Reber et al. (1998) as well as in our experiment, stimuli were always visible. Here, participants may have had no reason to relate the feeling of fluency to a “non-positive” stimulus evaluation. This explanation, however, remains to be further tested.

Our manipulation of processing fluency (based on varying the presentation durations) is different from mere exposure manipulations (for a review see Bornstein, 1989) in which the higher ease of processing originates from stimulus repetition. In Experiment 1 ratings of stimuli on the feeling dimension were higher when tested in the second block, indicating a mere exposure effect.
However, this effect did not interact with the effects due to other variations in ease of processing. Thus, the reported effects of higher ease of processing on evaluations cannot simply be explained by mere exposure or higher familiarity.

In Experiment 2, we replicated the effects of ease of processing on felt fluency ratings from Experiment 1. Higher ease of processing was reflected in reports of higher felt fluency. With regard to the conflicting accounts of the nature of felt fluency, the physiological responses in terms of facial electromyography (fEMG) and skin conductance (SC) tended to favor the hedonic marking of fluency. Responses of fEMG, which capture positive or negative affect, showed a slight positive reaction as indicated by inhibitions of the M. corrugator supercilii region (Topolinski et al., 2009). An activation of the M. zygomaticus major indicative of positive affect (Winkielman & Cacioppo, 2001) was not found. Possibly, the effects were too subtle to be reflected in the activation of this muscle region.

The notion that fluency is experienced as an unspecific activation in our Experiment 2 could not be confirmed. Measures of skin conductance indicated that higher ease of processing came along with lower skin conductance, and thus, if anything, with lower activation. This effect is in line with previous research showing that skin conductance responses are especially sensitive to novel, and thus disfluent, material (see Dawson et al., 2000, for an overview). This post hoc interpretation makes it clear that in comparison to the verbal reports, the physiological measurements could be too unspecific to allow any firm conclusions. Together the findings from both experiments showed that stimulus evaluations became more positive with higher ease of processing, as indicated by higher ratings and, less strongly, by the inhibition of the frowning muscle. This is in line with the notion that processing fluency is hedonically marked (Winkielman et al., 2003). In contrast to the notion that processing fluency is an unspecific activation we could not find effects on stimulus arousal ratings or physiological indices of arousal. Regarding the effects on the personal feelings of the participants, effects of higher ease of processing found in Experiment 1 were not confirmed by higher SCRs in Experiment 2. These subtle effects of ease of processing on positive physiological responses did thus not translate into self-reports of ease of processing, too. The latter finding might be due to the fact that the physiological responses were reflecting processes that were entirely unrelated to the required judgments (of ease of processing). This finding would also show a drawback of the measure of physiological responses without accompanying self-reports: If the two dissociate, it is difficult to relate them to one another because of a lack of emotion specificity of the physiological responses. 
However, this leaves open the question of where the effects of ease of processing on reported feelings of arousal in Experiment 1 come from. As the term “cognitive feelings” already suggests (Greifeneder, Bless, & Pham, 2011), the arousal judgments might have also reflected “cold cognitions” without much physiological underpinning, that is, a cognitive bias toward arousal judgments. In light of dual-process models of behavior in general (for an overview see, e.g., Smith & DeCoster, 2000) and the 2-systems model of Strack and Deutsch (2004), not only the impulsive, more affective system but also the reflective, more cognitive system might have its share in the formation of the arousal ratings. Thus, feelings of arousal might also be the result of a deliberate attribution of the higher ease of processing to higher ratings of arousal. It is, thus, possible that experiences of ease of processing are not themselves felt as affective responses, but rather as cognitive cues for evaluations. Testing this possibility, however, requires studies specifically designed for pitting
outcomes due to the reflective system and the impulsive system against each other. We can only speculate why variations in ease of processing in Experiment 2 were only weakly reflected in physiological responses. Of course, measuring physiological indices parallel to self-reports of arousal and valence would have allowed more valid conclusions on the relationship between felt fluency and affect. Also, in contrast to previous studies applying fEMG in research on processing fluency (Gerger et al., 2011; Topolinski et al., 2009; Winkielman & Cacioppo, 2001), we did not explicitly ask for liking but for felt fluency. One could argue that when participants evaluate liking or affective responses to an object then stronger physiological responses might occur. This follows from the fact that the participants’ task-specific foci of attention on nonevaluative aspects of the stimuli could have distracted the participants from their affective responses. In other words, implicit measures in general and physiological responses in particular are subject to context effects, such as the task at hand (Wittenbrink & Schwarz, 2007). One limitation of our findings concerns the stimuli. To elicit stronger affective responses, stimulus material having stronger emotional content might be warranted. We refrained from using this material in the present study for two main reasons. First, strong emotional content might override relatively subtle effects of ease of processing. And second, using similar material to previous studies (Forster et al., 2013; Reber et al., 1998; Winkielman & Cacioppo, 2001) helps in better understanding the effects from the previous studies by avoiding the confound of different stimulus material. To sum up, our findings show that the feeling of fluency contributes to a positive evaluation of the stimulus, as reflected in more positive evaluations and a subtle physiological response. This response, however, did not lead to more positive feelings in the perceiver. 
Thus, in line with the feelings-as-information account (Schwarz, 2011), feelings of fluency can be a powerful source of positive evaluations of objects, but not of one's own feelings.

Electronic Supplementary Materials

The electronic supplementary material is available with the online version of the article at http://dx.doi.org/10.1027/1618-3169/a000311

ESM 1. Data (sav file). Raw data of felt arousal in Experiment 1.
ESM 2. Data (sav file). Raw data of felt valence in Experiment 1.
ESM 3. Data (sav file). Raw data of stimulus arousal in Experiment 1.
ESM 4. Data (sav file). Raw data of stimulus valence in Experiment 1.
ESM 5. Data (sav file). Aggregated data in Experiment 1.
ESM 6. Data (sav file). Aggregated rating data in Experiment 2.
ESM 7. Data (sav file). Aggregated fEMG data in Experiment 2.
ESM 8. Data (sav file). Aggregated SC data (four seconds) in Experiment 2.
ESM 9. Data (sav file). Aggregated SC data (first second) in Experiment 2.
ESM 10. Analysis routines (SPS file). SPSS syntax routines for all analyses.

References

Albrecht, S., & Carbon, C.-C. (2014). The fluency amplification model: Fluent stimuli show more intense but not evidently more positive evaluations. Acta Psychologica, 148, 195–203. doi: 10.1016/j.actpsy.2014.02.002
Arnold, M. B. (1960). Emotion and personality. New York, NY: Columbia University Press.
Benedek, M., & Kaernbach, C. (2010a). A continuous measure of phasic electrodermal activity. Journal of Neuroscience Methods, 190, 80–91. doi: 10.1016/j.jneumeth.2010.04.028
Benedek, M., & Kaernbach, C. (2010b). Decomposition of skin conductance data by means of nonnegative deconvolution. Psychophysiology, 47, 647–658. doi: 10.1111/j.1469-8986.2009.00972.x
Bornstein, R. F. (1989). Exposure and affect: Overview and meta-analysis of research, 1968–1987. Psychological Bulletin, 106, 265–289. doi: 10.1037/0033-2909.106.2.265
Bornstein, R. F., & D’Agostino, P. R. (1994). The attribution and discounting of perceptual fluency: Preliminary tests of a perceptual fluency attributional model of the mere exposure effect. Social Cognition, 12, 103–128. doi: 10.1521/soco.1994.12.2.103
Boucsein, W. (2012). Electrodermal activity. New York, NY: Springer.
Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: The self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry, 25, 49–59. doi: 10.1016/0005-7916(94)90063-9
Bush, L. K., Hess, U., & Wolford, G. (1993). Transformations for within-subject designs: A Monte Carlo investigation. Psychological Bulletin, 113, 566–579. doi: 10.1037/0033-2909.113.3.566
Dawson, M. E., Schell, A. M., & Filion, D. L. (2000). The electrodermal system. In J. T. Cacioppo, L. G. Tassinary, & D. K. Dawson (Eds.), Handbook of Psychophysiology (2nd ed., pp. 200–223). Cambridge, UK: Cambridge University Press.
Delorme, A., & Makeig, S. (2004). EEGLAB: An open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. Journal of Neuroscience Methods, 134, 9–21. doi: 10.1016/j.jneumeth.2003.10.009
Dutton, D. G., & Aron, A. P. (1974). Some evidence for heightened sexual attraction under conditions of high anxiety. Journal of Personality and Social Psychology, 30, 510–517. doi: 10.1037/h0037031
Ellis, H. D., Quayle, A. H., & Young, A. W. (1999). The emotional impact of faces (but not names): Face specific changes in skin conductance responses to familiar and unfamiliar people. Current Psychology, 18, 88–97. doi: 10.1007/s12144-999-1018-y
Forster, M., Leder, H., & Ansorge, U. (2013). It felt fluent, and I liked it: Subjective feeling of fluency rather than objective fluency determines liking. Emotion, 13, 280–289. doi: 10.1037/a0030115
Fridlund, A. J., & Cacioppo, J. T. (1986). Guidelines for human electromyographic research. Psychophysiology, 23, 567–589. doi: 10.1111/j.1469-8986.1986.tb00676.x
Gerger, G., Leder, H., & Kremer, A. (2014). Context effects on emotional and aesthetic evaluations of artworks and IAPS pictures. Acta Psychologica, 151, 174–183. doi: 10.1016/j.actpsy.2014.06.008
Gerger, G., Leder, H., Tinio, P. P. L., & Schacht, A. (2011). Faces versus patterns: Exploring aesthetic reactions using facial EMG. Psychology of Aesthetics, Creativity, and the Arts, 5, 241–250. doi: 10.1037/a0024154
Greifeneder, R., Bless, H., & Pham, M. T. (2011). When do people rely on affective and cognitive feelings in judgment? A review. Personality and Social Psychology Review, 15, 107–141. doi: 10.1177/1088868310367640
Irtel, H. (2007). PXLab: The Psychological Experiments Laboratory [computer software] (Version 2.1.11). Mannheim, Germany: University of Mannheim. Retrieved from http://www.pxlab.de
Ishihara, S. (1917). Tests for color-blindness. Tokyo, Japan: Hongo Harukicho.
Jacoby, L. L., Allan, L. G., Collins, J. C., & Larwill, L. K. (1988). Memory influences subjective experience: Noise judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 240–247. doi: 10.1037/0278-7393.14.2.240
Larsen, J. T., Berntson, G. G., Poehlmann, K. M., Ito, T. A., & Cacioppo, J. T. (2008). The psychophysiology of emotion. In R. Lewis, J. M. Haviland-Jones, & L. F. Barrett (Eds.), The handbook of emotions (3rd ed., pp. 180–195). New York, NY: Guilford.
Larsen, J. T., Norris, C. J., & Cacioppo, J. T. (2003). Effects of positive and negative affect on electromyographic activity over zygomaticus major and corrugator supercilii. Psychophysiology, 40, 776–785.
Mandler, G., Nakamura, Y., & Van Zandt, B. J. (1987). Nonspecific effects of exposure on stimuli that cannot be recognized. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 646–648. doi: 10.1037/0278-7393.13.4.646
Morris, A. L., Cleary, A. M., & Still, M. L. (2008). The role of autonomic arousal in feelings of familiarity. Consciousness and Cognition, 17, 1378–1385. doi: 10.1016/j.concog.2008.04.005
Newell, B. R., & Shanks, D. R. (2007). Recognising what you like: Examining the relation between the mere-exposure effect and recognition. European Journal of Cognitive Psychology, 19, 103–118. doi: 10.1080/09541440500487454
Öhman, A. (1979). The orienting response, attention, and learning: An information-processing perspective. In H. D. Kimmel, E. H. V. Olst, & J. F. Orlebeke (Eds.), The orienting reflex in humans (pp. 443–471). Hillsdale, NJ: Erlbaum.
Reber, R., Fazendeiro, T. A., & Winkielman, P. (2002). Processing fluency as the source of experiences at the fringe of consciousness. Psyche: An Interdisciplinary Journal of Research on Consciousness, 8, 175–188.
Reber, R., Schwarz, N., & Winkielman, P. (2004). Processing fluency and aesthetic pleasure: Is beauty in the perceiver’s processing experience? Personality and Social Psychology Review, 8, 364–382. doi: 10.1207/s15327957pspr0804_3
Reber, R., Winkielman, P., & Schwarz, N. (1998). Effects of perceptual fluency on affective judgments. Psychological Science, 9, 45–48. doi: 10.1111/1467-9280.00008
Reber, R., Wurtz, P., & Zimmermann, T. D. (2004). Exploring “fringe” consciousness: The subjective experience of perceptual fluency and its objective bases. Consciousness and Cognition, 13, 47–60. doi: 10.1016/s1053-8100(03)00049-7
Regenberg, N. F. E., Häfner, M., & Semin, G. R. (2012). The Groove Move: Action affordances produce fluency and positive affect. Experimental Psychology, 59, 30–37. doi: 10.1027/1618-3169/a000122
Rossion, B., & Pourtois, G. (2004). Revisiting Snodgrass and Vanderwart’s object set: The role of surface detail in basic-level object recognition. Perception, 33, 217–236. doi: 10.1068/p5117
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39, 1161–1178. doi: 10.1037/h0077714
Russell, J. A. (2003). Core affect and the psychological construction of emotion. Psychological Review, 110, 145–172. doi: 10.1037/0033-295x.110.1.145
Schneider, W., Eschman, A., & Zuccolotto, A. (2002). E-Prime user’s guide. Pittsburgh, PA: Psychology Software Tools.
Schwarz, N. (2011). Feelings-as-information theory. In P. Van Lange, A. Kruglanski, & E. T. Higgins (Eds.), Handbook of Theories of Social Psychology (Vol. 1, pp. 289–308). Thousand Oaks, CA: Sage.
Smith, E. R., & DeCoster, J. (2000). Dual-process models in social and cognitive psychology: Conceptual integration and links to underlying memory systems. Personality and Social Psychology Review, 4, 108–131.
Society for Psychophysiological Research Ad Hoc Committee on Electrodermal Measures. (2012). Publication recommendations for electrodermal measurements. Psychophysiology, 49, 1017–1034. doi: 10.1111/j.1469-8986.2012.01384.x
Strack, F., & Deutsch, R. (2004). Reflective and impulsive determinants of social behavior. Personality and Social Psychology Review, 8, 220–247. doi: 10.1207/s15327957pspr0803_1
Topolinski, S., Likowski, K., Weyers, P., & Strack, F. (2009). The face of fluency: Semantic coherence automatically elicits a specific pattern of facial muscle reactions. Cognition & Emotion, 23, 260–271. doi: 10.1080/02699930801994112
Topolinski, S., & Strack, F. (2009). Scanning the “Fringe” of consciousness: What is felt and what is not felt in intuitions about semantic coherence. Consciousness and Cognition, 18, 608–618. doi: 10.1016/j.concog.2008.06.002
Tranel, D., Fowles, D. C., & Damasio, A. R. (1985). Electrodermal discrimination of familiar and unfamiliar faces: A methodology. Psychophysiology, 22, 403–408. doi: 10.1111/j.1469-8986.1985.tb01623.x
Unkelbach, C. (2007). Reversing the truth effect: Learning the interpretation of processing fluency in judgments of truth. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33, 219–230. doi: 10.1037/0278-7393.33.1.219

Experimental Psychology 2016; Vol. 63(1):45–58

M. Forster et al., Fluency and Affective Processing

van Boxtel, A. (2001). Optimal signal bandwidth for the recording of surface EMG activity of facial, jaw, oral, and neck muscles. Psychophysiology, 38, 22–34. doi: 10.1111/1469-8986.3810022 Whittlesea, B. W. A., & Williams, L. D. (2000). The source of feelings of familiarity: The discrepancy-attribution hypothesis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 547–565. doi: 10.1037/0278-7393.26.3.547 Whittlesea, B. W. A., & Williams, L. D. (2001a). The discrepancyattribution hypothesis: I. The heuristic basis of feelings of familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 3–13. doi: 10.1037/0278-7393.27.1.3 Whittlesea, B. W. A., & Williams, L. D. (2001b). The discrepancyattribution hypothesis: II. Expectation, uncertainty, surprise, and feelings of familiarity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 14–33. doi: 10.1037/ 0278-7393.27.1.14 Widmann, A., & Schröger, E. (2012). Filter effects and filter artifacts in the analysis of electrophysiological data. Frontiers in Psychology, 3, 233. doi: 10.3389/fpsyg.2012.00233 Winkielman, P., & Cacioppo, J. T. (2001). Mind at ease puts a smile on the face: Psychophysiological evidence that processing facilitation elicits positive affect. Journal of Personality and Social Psychology, 81, 989–1000. doi: 10.1037/00223514.81.6.989 Winkielman, P., Schwarz, N., Fazendeiro, T., & Reber, R. (2003). The hedonic marking of processing fluency: Implications for evaluative judgment. In J. Musch & K. C. Klauer (Eds.), The psychology of evaluation: Affective processes in cognition and emotion (pp. 189–217). Mahwah, NJ: Erlbaum. Wittenbrink, B., & Schwarz, N. (2007). Implicit measures of attitudes. New York, NY: Guilford Press. Zajonc, R. B. (2001). Mere exposure: A gateway to the subliminal. Current Directions in Psychological Science, 10, 224–228. 
doi: 10.1111/1467-8721.00154 Received April 11, 2014 Revised September 1, 2015 Accepted September 1, 2015 Published online March 29, 2016 Michael Forster Department of Basic Psychological Research and Research Methods Faculty of Psychology University of Vienna Liebiggasse 5 1010 Vienna Austria Tel. +43 1 4277-47161 Fax +43 1 4277-9471 E-mail michael.forster@univie.ac.at

© 2016 Hogrefe Publishing


Research Article

The Influence of Presentation Order on Category Transfer

Fabien Mathy¹ and Jacob Feldman²

¹ Université Nice Sophia Antipolis, France
² Rutgers University, New Brunswick, NJ, USA

Abstract: This study of supervised categorization shows how different kinds of category representations are influenced by the order in which training examples are presented. We used the well-studied 5-4 category structure of Medin and Schaffer (1978), which allows transfer of category learning to new stimuli to be discriminated as a function of rule-based or similarity-based category knowledge. In the rule-based training condition (thought to facilitate the learning of abstract logical rules and hypothesized to produce rule-based classification), items were grouped by subcategories and randomized within each subcategory. In the similarity-based training condition (thought to facilitate associative learning and hypothesized to produce exemplar classification), transitions between items within the same category were determined by their featural similarity and subcategories were ignored. We found that transfer patterns depended on whether the presentation order was similarity-based or rule-based, with the participants particularly capitalizing on the rule-based order.

Keywords: categorization, presentation order, similarity rule

A few studies have shown that aspects of stimulus presentation can influence category formation (Carvalho & Goldstone, 2011, 2014; Clapper & Bower, 1994; Elio & Anderson, 1981, 1984; Gagné, 1950; Goldstone, 1996; Kang & Pashler, 2012; Kornell & Bjork, 2008; Mathy & Feldman, 2009; Medin & Bettger, 1994). One way to test this is to evaluate the transfer of learning to novel instances, in order to see whether different orders can, in fact, alter the way training examples are generalized (Elio & Anderson, 1981, 1984). In this paper we pursue this strategy, using the popular 5-4 category structure of Medin and Schaffer (1978), for which generalization patterns have been scrutinized in the past (Johansen & Palmeri, 2002), to test whether rules or similarity can alter the representation of categories. Rule-based versus similarity-based processing is often induced exclusively by the characteristics of the categorization tasks (Ashby & Ell, 2001), but sometimes both types of processes can be observed within a single task (e.g., Allen & Brooks, 1991; Shanks & Darby, 1998), for instance depending on the presence of a concurrent load during training (Wills, Graham, Koh, McLaren, & Rolland, 2011). The present study adopts a different strategy and seeks to manipulate these two types of processing as factors within a single task. Although the rule-based versus similarity-based distinction is subject to theoretical difficulties, there is some advantage in having these two explanatory constructs under different experimental conditions (see the special issue of Cognition in Vol. 65, e.g., Hahn & Chater, 1998; Sloman & Rips, 1998; E. E. Smith, Patalano, & Jonides, 1998).

Mathy and Feldman (2009), for instance, tested three types of presentation orders to study category discovery (not category generalization, as in the present study). The rule-based order, in which stimuli obeying different rules are separated to facilitate a rule-abstraction process, led to the most effective learning overall. Two other types of presentation orders were found to be less beneficial to category learning, although as discussed below they may have subtle benefits for other kinds of learning. The similarity-based order (Elio & Anderson, 1981), in which the stimuli are ordered by proximity in the stimulus space, facilitates exemplar memorization. (The temporal contiguity of the most similar stimuli could reinforce the local associations of exemplars, but it could also induce overly specific rules.) The dissimilarity-based order, in which the stimuli are ordered so as to maximize distance between successive items, tends to disrupt almost any type of learning mechanism (Gagné, 1950, found no difference from a random presentation in this latter case). The present study employs the first two types of orders, because both can easily be matched to a specific learning mechanism (respectively, rule-based and exemplar-based) that we sought to track during the generalization of category learning to new stimuli. We hypothesized that participants in a rule-based training condition would show generalization patterns consistent with rule-based retrieval and that participants in a similarity-based training condition would show generalization patterns consistent with exemplar retrieval. We provide strong evidence, from the analysis of the individual generalization patterns, that participants capitalized differently on the two types of stimulus orders. To anticipate further, the generalization patterns seem more typical of a rule-based classification following a rule-based training, and the connection is weaker for the similarity-based condition.

Experimental Psychology 2016; Vol. 63(1):59–69 DOI: 10.1027/1618-3169/a000312

F. Mathy & J. Feldman, Category Generalization

Method

The present experiment used the 5-4 categories (see Figure 1 and Table 1). The 5-4 category set was first studied by Medin and Schaffer (1978), reanalyzed by J. D. Smith and Minda (2000), and examined in many subsequent studies (Cohen & Nosofsky, 2003; Johansen & Kruschke, 2005; Johansen & Palmeri, 2002; Lafond, Lacouture, & Mineau, 2007; Lamberts, 2000; Minda & Smith, 2002; Rehder & Hoffman, 2005; Zaki, Nosofsky, Stanton, & Cohen, 2003). This structure makes it possible to study the way in which seven unclassified items are categorized during a transfer phase, after the learning of 5 + 4 = 9 items. The 5-4 ill-defined category structure has been evaluated many times without diminishing its current importance. For instance, Raijmakers, Schmittmann, and Visser (2014) have recently suggested that prolonged learning of this category can be modeled using latent Markov models, with the aim of detecting the underlying strategies used by participants and the transitions between those strategies. Our objective here was different: our manipulation was designed to induce a particular strategy (rule-based or exemplar-based) at the beginning of the learning process, and to find out whether different generalization patterns could be observed, depending on whether training was similarity- or rule-based, right after a short training process (i.e., an amount of training sufficient to acquire the correct categories). We based our analysis on the list of the rule-based and exemplar-based patterns of generalization provided by Johansen and Palmeri (2002) for the 5-4 categories (but see Raijmakers et al., 2014, for a more recent and more complete exploration of the possible fluctuation of the strategies used by participants during transfer, including guessing). Two types of presentation orders were used (rule-based or similarity-based) for the training phase, and this manipulation was a between-participants factor.
In both conditions, a single fully blocked (i.e., separating the two categories into two distinct subblocks; see Clapper & Bower, 2002) ordering of the stimuli was generated for each participant and then used for all training blocks.¹ Blocking was used to reinforce the effect of ordering the stimuli within category. After the category was learned (up to a chosen criterion explained below), the participants were administered a transfer phase in which old and new stimuli were shown randomly.

¹ Although many studies have shown that induction benefits from interleaving the categories (e.g., Kornell & Bjork, 2008), blocked presentations are sometimes more beneficial (Goldstone, 1996), for instance with low-similarity category structures such as the one used in the present study (Carvalho & Goldstone, 2014). In this case, blocking tends to promote the identification of the shared features (a process that seems neutral regarding the rule-based vs. similarity-based processes). See also Clapper and Bower (1994, 2002) for a positive effect of blocking. Another strategy for presenting the stimuli (such as presenting first the most different stimuli close together in temporal sequences to highlight the discriminative features of the categories) would argue for the use of interleaving instead.

Figure 1. Four-dimensional Hasse diagram, training sample, and the 5-4 categories. Note. The figure illustrates two hypercubes, one per row. The first hypercube exemplifies how the dimensions were implemented in the experiment (the objects are green on the top and red on the bottom; the colors appear in the online version of this article). The second hypercube shows the structure of the 5-4 categories. The examples of Category A are indicated by black circles, whereas the examples of Category B are represented by empty vertices. The white circles represent the transfer items. The 5-4 notation refers to the presence of five examples in Category A versus four examples in Category B (Medin & Schaffer, 1978; Smith & Minda, 2000). For this concept, following Smith and Minda (2000, p. 4), the objects were numbered by indexing each of the examples using letters (A, for the examples belonging to Category A; B, for the second category; and T, for the examples that were presented during the transfer phase, which the participants could freely classify as A's or B's). The numbers following the letters A, B, and T are arbitrary numbers that distinguish the individual items. Note that T1, ..., T7 in Johansen and Palmeri (2002), and in the present study, are, respectively, indexed T10, ..., T16 in Smith and Minda (2000). Among the examples of Category A, there are two clusters. The red set indicates the objects of Category A belonging to the biggest cluster (T4, T6, and T3 are virtually included in the cluster under the hypothesis that the participant in the rule-based order is induced to form an "All red except the small hatched circle" rule). The blue set indicates the objects of Category A from the second cluster. Category-B clusters (described in the text) are not represented in order to lighten the presentation. The clusters are circled by discontinuous curves to acknowledge the fuzziness of their boundaries. Indeed, the representation of the clusters in the participant's mind is unclear because seven examples (the transfer items) are not associated with clear categories during the training phase.

Table 1. Category structure of the 5-4 categories and observed generalization patterns by order type

Category A      Category B      Transfer
A1  1110        B1  1100        *T1  1001
A2  1010        B2  0110        *T2  1000
A3  1011        B3  0001         T3  1111
A4  1101        B4  0000        *T4  0010
A5  0111                        *T5  0101
                                *T6  0011
                                 T7  0100

Observed generalization patterns (one column of A/B responses per pattern) + frequency (bottom): the column layout of the patterns did not survive extraction. Frequency values in the order they appear in the source (rows labeled Order: Sim. and Order: Rule): 0 1, 7 14, 0 1, 1 1, 1 0, 4 2, 2 0, 1 0, 2 1, 1 0, 1 1, 1 0.

Note. *Highlights the critical transfer stimuli. A and B stand for "Category A" and "Category B," respectively. The feature vectors such as 1110 in the second column represent, respectively, the Size-Shape-Color-Filling dimension values. The values 1111 represent, respectively, the features Large-Circle-Red-Plain. The values 0000 represent, respectively, the features Small-Square-Green-Hatched. Among the 32 possible generalization patterns, we observed only 13 different ones in our experiment. The patterns are reported in columns in the right part of the table.

Participants

The participants were 44 freshman or sophomore students from the University of Franche-Comté, who received course credits in exchange for their participation.

Choice of Categories Studied

Each participant was administered a single 5-4 category set. The 5-4 category structure is shown in the bottom hypercube of Figure 1. In the hypercube, made of 2^4 = 16 stimuli, the five examples of Category A are indicated by black circles and the four examples of Category B are indicated by empty vertices. The hypercube, also known as a Hasse diagram, is extremely useful for quickly inspecting the category structure, which is not easily seen in the corresponding truth table such as the one provided in Table 1.
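As a rough illustration of how the 16 vertices split into training and transfer items, the following sketch enumerates the hypercube and removes the nine training items. The feature vectors are copied from Table 1; the function names are hypothetical and are not taken from the authors' materials.

```python
from itertools import product

# Feature vectors copied from Table 1 (Size-Shape-Color-Filling coding).
CATEGORY_A = {"A1": "1110", "A2": "1010", "A3": "1011", "A4": "1101", "A5": "0111"}
CATEGORY_B = {"B1": "1100", "B2": "0110", "B3": "0001", "B4": "0000"}


def all_stimuli(n_dims=4):
    """Enumerate every vertex of the Boolean hypercube as a bit string."""
    return ["".join(bits) for bits in product("01", repeat=n_dims)]


def transfer_items():
    """Return the vertices that belong to neither training category."""
    trained = set(CATEGORY_A.values()) | set(CATEGORY_B.values())
    return [s for s in all_stimuli() if s not in trained]


print(len(all_stimuli()), len(transfer_items()))  # prints "16 7"
```

The seven remaining vertices should match the items labeled T1 to T7 in Table 1.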

Stimuli

Stimulus objects varied across four Boolean dimensions (Color, Shape, Size, and Filling texture). Each dimension was instantiated by the same physical dimension for all participants. This choice was made to homogenize attention to the dimensions across participants and thereby limit the spread of the generalization patterns within presentation orders. A color dimension differentiated the objects at the top of the hypercube from those at the bottom (green vs. red, respectively); a shape dimension differentiated the objects at the front from those at the back (square vs. circle); a size dimension distinguished the objects in the left cube from those in the right cube (small vs. large); and finally, the left and right objects within the cubes were hatched versus plain. Overall, the combination of these four separable dimensions (Garner, 1974) formed 16 single unified objects (e.g., a small hatched red square, a large plain green circle, etc.).

Clusters

The 5-4 notation refers to the presence of one 5-member category and another 4-member category (Medin & Schaffer, 1978; J. D. Smith & Minda, 2000). For this concept, the objects of each category were numbered by virtually indexing each of the training stimuli using A (first category), B (second category), and T (transfer items). This notation refers to previous research (e.g., J. D. Smith & Minda, 2000, p. 4). To identify the items in a rule-based fashion, we rely on the simplest one-dimensional Color rule plus exception, which groups the objects of Category A into two mutually exclusive subcategories or clusters.² Category A was organized into two clusters: the largest cluster represented the rule for the Red feature and the smallest cluster represented the exception for the Green feature, that is, Cluster 1 = (A5, A1, A2, A3), and Cluster 2 = A4. The two combined clusters thus represented the rule "All red, except the small hatched circle" and the "large plain green circle" exception. These two clusters are circled by discontinuous red and blue curves in Figure 1. Similarly, the objects of the second category were B3, B4, and B1 (Cluster 1) and B2 (Cluster 2), representing the rule "All green, except the large plain circle" and the "small hatched red circle" exception.³

² There are two dimensions in these stimuli that admit a rule-plus-exception strategy (i.e., Size and Color), but Color was chosen to facilitate the task.
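The cluster assignment can be made concrete with a small sketch. It assumes the Size-Shape-Color-Filling coding of Table 1 with 1 = Large, Circle, Red, Plain (inferred from the item descriptions in the text), and the helper names are hypothetical.

```python
# Feature vectors copied from Table 1; digit order is Size-Shape-Color-Filling.
CATEGORY_A = {"A1": "1110", "A2": "1010", "A3": "1011", "A4": "1101", "A5": "0111"}
CATEGORY_B = {"B1": "1100", "B2": "0110", "B3": "0001", "B4": "0000"}

# Assumed value labels (0 first, 1 second) for each dimension.
FEATURES = [("Small", "Large"), ("Square", "Circle"), ("Green", "Red"), ("Hatched", "Plain")]
COLOR_INDEX = 2  # third digit carries the Color dimension


def decode(bits):
    """Translate a bit string such as '1101' into feature labels."""
    return " ".join(labels[int(b)] for labels, b in zip(FEATURES, bits))


def split_by_color_rule(category):
    """Return (rule_cluster, exception_cluster) for one category.

    The rule cluster holds the items with the majority color; the
    remaining items are treated as exceptions to the Color rule.
    """
    red = [name for name, bits in category.items() if bits[COLOR_INDEX] == "1"]
    green = [name for name, bits in category.items() if bits[COLOR_INDEX] == "0"]
    return (red, green) if len(red) >= len(green) else (green, red)


print(decode(CATEGORY_A["A4"]))  # prints "Large Circle Green Plain"
print(split_by_color_rule(CATEGORY_A))
print(split_by_color_rule(CATEGORY_B))
```

Under these assumptions the split reproduces the clusters named in the text: A4 is the single exception in Category A and B2 the single exception in Category B.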

Ordering of Stimuli

For both conditions (Rule-based and Similarity-based), the stimuli were fully separated/blocked according to category (Category A and Category B). The presentation of Category A always preceded the presentation of Category B in two successive subblocks.

In the rule-based order, the first objects of the first category (i.e., Category A) were randomly drawn from Cluster 1 until all had been presented. This was followed by the object in Cluster 2. Thus, in the rule-based order, all the members of the biggest cluster were presented first in a random order, but separated from the exceptional member, in order to encourage participants to abstract the simplest rule. The presentation within Cluster 1 was random, in accordance with a rule-abstraction process that is supposed to impede stimulus singularity. The ordering procedure was similar for the second category (i.e., Category B): the objects of Cluster 1 (randomly ordered) preceded the object of Cluster 2.

In the similarity-based order, the first object of Category A was chosen at random, and subsequent objects of the same category were chosen randomly from those with maximal similarity to the previous object, and so forth until the set of examples was exhausted. Ties were resolved randomly. The same algorithm was applied to the second category (i.e., Category B). From an exemplar point of view (Elio & Anderson, 1981, 1984), this type of order strictly followed the similarity structure between the stimuli with the objective of reinforcing exemplar memorization. Dissimilarity between two stimuli i and j was computed using the city-block distance

$$d_{ij} = \sum_{a=1}^{4} \lvert x_{ia} - x_{ja} \rvert \tag{1}$$

where $x_{ia}$ is the value of stimulus i along dimension a. Similarity was simply computed using $s_{ij} = 4 - d_{ij}$. The most important aspect of this procedure was that the ordering did not necessarily respect the cluster boundaries targeted in the rule-based order, as similarity steps could cross in and out of clusters (for instance, A5, A4, A1, etc.). Another important aspect of the similarity-based order is that some orders could induce hypotheses more specific than those targeted in the rule-based order: for instance, "large red squares" after the presentation of A3 and A2; "large red squares OR large red hatched," or "red large, except plain circle," after the presentation of A3, A2, and A1; "large red squares OR large circles" after the presentation of A3, A2, A1, and A4, etc.

One sample of a rule-based presentation for Category A followed by Category B could be: A1, A5, A3, A2, and A4, followed by B3, B4, B1, and B2. One sample of a similarity-based presentation for Category A followed by Category B could be: A2, A1, A5, A4, and A3, followed by B1, B2, B3, and B4.
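The similarity-based ordering procedure can be sketched as a greedy chain over city-block distances. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the feature vectors come from Table 1, and random tie-breaking through Python's `random.Random` is an assumption about how "ties were resolved randomly" might be realized.

```python
import random

# Category A feature vectors copied from Table 1.
CATEGORY_A = {"A1": "1110", "A2": "1010", "A3": "1011", "A4": "1101", "A5": "0111"}


def city_block(x, y):
    """City-block distance between two bit strings (Equation 1)."""
    return sum(abs(int(a) - int(b)) for a, b in zip(x, y))


def similarity(x, y):
    """Similarity as defined in the text: s = 4 - d for four dimensions."""
    return len(x) - city_block(x, y)


def similarity_order(category, rng=None):
    """Greedy similarity chain within one category.

    Start from a random item, then repeatedly move to a remaining item
    with maximal similarity (minimal distance) to the previous one,
    breaking ties at random.
    """
    rng = rng or random.Random()
    remaining = dict(category)
    current = rng.choice(sorted(remaining))
    order = [current]
    current_bits = remaining.pop(current)
    while remaining:
        best = min(city_block(current_bits, bits) for bits in remaining.values())
        ties = sorted(name for name, bits in remaining.items()
                      if city_block(current_bits, bits) == best)
        current = rng.choice(ties)
        order.append(current)
        current_bits = remaining.pop(current)
    return order


print(similarity_order(CATEGORY_A, random.Random(1)))
```

Passing a seeded `random.Random` makes a given order reproducible, which would be one way to fix a single order per participant and reuse it across training blocks.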

Justification for the Types of Presentation Orders Chosen

These two types of presentation orders match two extreme ways of learning: a complex inductive process based on abstraction and an elementary process with underlying associative mechanisms (Sloman, 1996).

1. The rule-based condition uses a set of clusters which are presented to the participants in an order depending on their magnitude (since in many domains, exceptions are learned last), with no distinction within clusters (since the abstraction process supposedly cancels out any effects of the nondiagnostic features on learning). Because the objects are supposed to involve common abstract properties within clusters, they are drawn randomly. The goal is to make participants form a logical rule. Overall, the rule-based order is thought to help the learner separate different clusters in order to abstract a simple logic describing the stimuli. The identification of the clusters is also facilitated by the randomization of the steps within clusters, which prevents the learner from being misled by an overly specific hypothesis.
2. The similarity-based condition follows a simpler hypothetical associative process that uses the temporal contiguity of the stimuli to reinforce the memory traces of two stimuli locally and, by extension through a chaining process, the entire similarity structure. Because exemplar models assume that participants classify objects according to their relative summed similarity to exemplars of the two categories, it is hypothesized that exemplar memorization can capitalize on a reinforced similarity structure.

To compare the two conditions, imagine the learner has already encountered a small and green object (e.g., B3), which gave the learner some information about the different dimension values. In this context, the presentation of a large plain red square A3 followed by a large hatched red circle A1 directly rules out the overly specific "large red square" rule, in favor of the more general "large and red" rule. By contrast, a large plain red square A3 followed by a large hatched red square A2 (a sequence typically favored by the similarity-based order) tends to temporarily mislead the learner about the possible "large red square" rule, until the large hatched red circle A1 is presented. A similarity-based order operates otherwise to favor an exemplar-based learning mechanism: when a large plain red square A3 is followed by a large hatched red square A2, the two exemplars are not only memorized as single entities, but the association of the two can be thought to subsequently reinforce the entire similarity structure (see Sloman, 1996, who develops an explanation of why associative learning uses similarity and temporal relations rather than symbolic structures).

³ Note that the choice of clusters is hypothetical. We refer here to the idea that these clusters can result from the participants' conceptualization of the induced simplest one-dimensional rule plus exception during the training phase, and can be applied to the transfer phase. However, other conceptualizations by participants are possible. For instance, a more complicated alternative representation for Cluster 1 could be the set A5, T6, A1, A2, T3, and A3 (i.e., all red, except small hatched). This rule is simpler if we refer to its length, but it is more elaborate in that it supposes that the participant makes the abstraction that B2 is not the only exception in play. The only effective constraint on the ordering of stimuli was the separation between the two clusters within categories, in order to help the abstraction of the one-dimensional rule plus exception, allowing one to reach the learning criterion faster according to our hypothesis.
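For comparison with the similarity-based procedure, the rule-based order described under Ordering of Stimuli can be sketched in a few lines: shuffle the members of the larger cluster, then append the exception, first for Category A and then for Category B. The cluster memberships are those stated in the Clusters section; the function names and the use of `random.Random` are illustrative assumptions.

```python
import random

# Cluster memberships as stated in the Clusters section:
# (members of the rule cluster, exception members).
A_CLUSTERS = (["A5", "A1", "A2", "A3"], ["A4"])
B_CLUSTERS = (["B3", "B4", "B1"], ["B2"])


def rule_based_order(clusters, rng=None):
    """Randomize the rule cluster, then present the exception last."""
    rng = rng or random.Random()
    main, exceptions = clusters
    shuffled = list(main)  # copy so the module-level lists stay untouched
    rng.shuffle(shuffled)
    return shuffled + list(exceptions)


def training_block(rng=None):
    """One fully blocked training sequence: Category A, then Category B."""
    rng = rng or random.Random()
    return rule_based_order(A_CLUSTERS, rng) + rule_based_order(B_CLUSTERS, rng)


print(training_block(random.Random(0)))
```

In this sketch the exceptions A4 and B2 always occupy the last position of their subblock, whatever the seed, which is the only constraint the rule-based order imposes beyond blocking by category.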

Procedure Participants were individually tested. The participants sat approximately 60 cm from the computer screen and were given a tutorial before the task began. Each participant was then asked to learn a single 5-4 category set and was administered a series of training blocks until the learning criterion (detailed later) was met. Presentation order (rule-based or similarity-based) was a between-subject manipulation during the training phase. The participants were randomly assigned to one of these two conditions. Because there were several possible variations within each presentation order type (depending for instance on which stimulus was first presented within blocks), one presentation order was randomly chosen for a given participant before the experiment started and the chosen presentation order then applied across the training blocks until the learning criterion was met. Independently of the two types of orders, the training phase was arranged to be as simple as possible. Each participant was administered a series of identical blocks in which, within each block, the five examples of the first category of the to-be-learned 5-4 category structure preceded the four

examples of the second category (i.e., learning was massed; see Birnbaum, Kornell, Bjork, & Bjork, 2013; Carvalho & Goldstone, 2014; Kang & Pashler, 2012; Kornell & Bjork, 2008; Kornell, Castel, Eich, & Bjork, 2010; Wahlheim, Dunlosky, & Jacoby, 2011). Although not specifically tested on the 5-4 category structure, Mathy and Feldman (2011) previously showed that massing (relative to interleaving) the stimuli by category (such that the stimuli of each category appeared consecutively) facilitated both a rule-based and a similarity-based binary-valued artificial classification learning task such as the one studied here. The repetition of identical orders across blocks within participants was not the procedure used by Mathy and Feldman (2009) either. However, by removing the variations between blocks during training, our goal was to have the participants form a consistent representation of the category structure. During a training block, the stimuli were displayed one at a time on the top half of the computer screen for 1 s. The A and B categories corresponded, respectively, to the up and down keys, and to two category pictures on the right-hand side of the screen. In the category frame, a school bag was located at the top, and a trash can4 at the bottom (to match the response keys). When the stimulus was presented, the correct category label was displayed below the stimulus (i.e., “schoolbag” or “trash can”) for 1 s while the corresponding category picture was displayed (for instance, the school bag was shown for 1 s, while the trash can was hidden for a positive category). This instruction was followed by a confirmation phase during which the participant had to press the response key corresponding to the category picture that had just been shown to them. After the key was pressed, feedback indicating a correct or incorrect classification was given at the bottom of the screen for 2 s. 
This feedback was useful despite presenting the correct answer prior to the participant’s response because participants in categorization tasks might systematically invert the response keys at one point or another of the experiment. Our pretests showed that in this condition, the participants could not fail to correctly give any of the instructed responses. Therefore, the participants were expected to get 100% correct feedback during the training blocks. This confirmation phase was used to make sure that the participants were following the learning phase actively and that they did not miss any of the instructed categories. Because 100% correct feedback did not guarantee learning during the training blocks (e.g., the participants could repeat the instructed responses without paying attention to the stimuli), each training block was followed by a categorization block to test the participants along the

⁴ The school bag and the trash can have been used in many other similar experiments on artificial categorization in our laboratory, and have proved useful for children, especially when giving the instructions. We chose to use the same experimental setup across populations to facilitate the comparison of the results.

© 2016 Hogrefe Publishing

Experimental Psychology 2016; Vol. 63(1):59–69

training process. In the categorization blocks, the nine stimuli were randomly permuted to test the participants. Category learning was also supervised (i.e., with feedback) in the categorization blocks. To summarize, the main difference between the training blocks and the categorization blocks is that order was manipulated in the training blocks (which were observational in nature and did not allow testing the participants) whereas order was random during the categorization blocks to test the participants. Although the categorization blocks could generate noise in the manipulated order, the participants indicated in our pretests that the learning process was clearly facilitated by the recurring training blocks. To further encourage learning during the categorization blocks, the participants scored one point on a progress bar for each correct response. The number of empty boxes in the progress bar was 4 × (5 + 4). One empty box was filled whenever a correct response was given, but the progress bar was reset if an incorrect response was given. This criterion was identical to the one used by Shepard, Hovland, and Jenkins (1961) in their first experiment and by Mathy and Feldman (2009). Consequently, the participants had to correctly classify stimuli in four consecutive categorization blocks of nine stimuli to complete the training phase of the experiment. This setting required them to correctly classify all the stimuli, including those considered as exceptions. This intentionally limited the number of strategies that could provide partial solutions, such as being able to classify stimuli on the basis of a limited number of features with less than 100% accuracy. 
Although this criterion seems difficult to reach (in Johansen & Palmeri, 2002, only 75% of the participants reached a criterion of 80% accuracy in the final blocks), our procedure combining the observation of the category labels and the fully blocked procedure was expected to outperform the usual trial-and-error learning procedure. Our participants were not trained to categorize the stimuli during 32 training blocks as in some previous studies (Johansen & Palmeri, 2002; Raijmakers et al., 2014), because we were interested in observing the participants’ representations after they had acquired the correct category. Because the training blocks and the categorization blocks were interspersed, the progress bar was hidden during the training blocks. The number of points that were accumulated on the progress bar was restored whenever a categorization phase began. Once the participants reached the learning criterion (at this point, the progress bar was equal to 4 × 9), we conducted a transfer phase during which both the training and transfer stimuli were presented (each once in a block). The transfer phase was composed of 5 blocks of 16 stimuli. The order of all stimuli was randomized within blocks for the transfer phase.
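The progress-bar criterion described above can be illustrated with a minimal sketch. It assumes that only responses given in the categorization blocks feed the bar and that any error empties it; the function and constant names are hypothetical and are not taken from the experimental software.

```python
from typing import Iterable, Optional

BLOCK_SIZE = 5 + 4            # nine training stimuli per categorization block
CRITERION = 4 * BLOCK_SIZE    # 36 boxes: four error-free blocks in a row


def trials_to_criterion(responses: Iterable[bool],
                        criterion: int = CRITERION) -> Optional[int]:
    """Return the 1-based categorization trial on which the bar fills.

    `responses` holds one boolean per categorization trial (True = correct).
    Returns None if the criterion is never reached.
    """
    progress = 0
    for trial, correct in enumerate(responses, start=1):
        # One box is filled per correct response; an error resets the bar.
        progress = progress + 1 if correct else 0
        if progress == criterion:
            return trial
    return None


# A hypothetical learner who errs once on the tenth categorization trial.
history = [True] * 9 + [False] + [True] * 36
print(trials_to_criterion(history))  # 46
```

Because a single error resets the count, a learner who relies on a partial rule and misclassifies the exception items cannot fill the bar, which is the property the criterion is meant to enforce.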

F. Mathy & J. Feldman, Category Generalization

Results The data files ESM 1 (Learning Phase) and ESM 2 (Transfer Phase) are enclosed in the electronic supplementary materials.

Learning Phase One participant (in the similarity-based order) who was not able to meet the learning criterion in the time allowed by the experiment did not complete the transfer phase and was removed from the analysis. To make sure that the similarity-based order differed from the rule-based order, inter-item similarity was computed for each type of order and for all contiguous pairs of examples within the manipulated blocks before being averaged by participant. The mean inter-item similarity per block was significantly lower in the rule-based condition (2.05, sd = .19) than in the similarity-based condition (2.2, sd = .13), t(41) = 2.9, p = .006. Learning was faster in the rule-based order than in the similarity-based order (6.4, sd = 2.6, vs. 9.0, sd = 5.7, number of training blocks to criterion on average), and this difference was significant (t(41) = 2.39, p = .051, but Wilcoxon Z = 2.11, p = .035; the nonparametric test was preferred because both distributions significantly deviated from a normal distribution using the Shapiro-Wilk test, due to a positive skewness in both groups). However, the correlation between inter-item similarity per block and the number of training blocks was not significant in either of the order conditions.

Transfer Phase The following analysis of the transfer phase follows that of Johansen and Palmeri (2002) (from p. 495), which we take here for granted. Johansen and Palmeri developed a precise analysis of the patterns of categorization during the transfer phase, with some of the patterns reflecting rule-based category representations. These patterns only vary for a subset of stimuli that are labeled the critical stimuli. As in Johansen and Palmeri (2002), our analysis does not focus on the categorization probabilities for the nine examples encountered by the participants in the training phases, as these stimuli were globally categorized in the transfer phase as they were learned in the training phase. Figure 2 shows the average categorization probabilities corresponding to the critical transfer stimuli (those that are diagnostic of a rule-based or an exemplar-based generalization pattern: T1, T2, T4, T5, T6). A generalization pattern for a given participant is defined by how these five critical transfer stimuli were categorized across the transfer blocks. A participant who would classify the five critical transfer stimuli in the A category (on average across the five transfer blocks) would exhibit an AAAAA pattern. Another participant who would


Figure 2. Average categorization probabilities of the critical transfer items (T1, T2, T4, T5, T6) in the 5-4 category structure during the transfer phase (amounting to five blocks). Note. p(A) is the observed proportion that each of the stimuli labeled under the abscissa was categorized as A during the transfer phase; p(A) was first computed for each of the participants before being averaged across participants. The proportions are broken down by presentation order conditions (rule- vs. similarity-based). The graph does not include the categorization probabilities for the nine examples that the participants encountered in the training phases. Error bars show ±1 SE.

classify only T1 in the A category on average would exhibit an ABBBB pattern. As indicated by Johansen and Palmeri on p. 490, participants who would apply a rule along the Size and Color dimensions would exhibit the AABBB and BBABA patterns, respectively. Using more complex computation (which we cannot detail in the present short paper), Johansen and Palmeri identified ABBBA as a pattern typical of an exemplar-based categorization. The categorization probability p(A) was the observed proportion of a stimulus categorized as A during the transfer phase across the five blocks (this calculation was made for each participant, before being averaged across participants in Figure 2). For instance, a stimulus categorized five times out of five as belonging to Category A simply corresponds to p(A) = 1. Participants were classified as using a particular transfer pattern based on whichever response was given more often out of the five transfer responses. A within-subjects Stimulus Type × between-subjects Presentation Order ANOVA on the proportion of A responses restricted to the five critical stimuli showed a significant interaction, F(4, 164) = 3.2, p = .014, ηp² = .07. Figure 2 shows that following the rule-based presentation order, the average pattern (across participants) is BBABA

(a typical rule-based generalization pattern), as opposed to ABABA for the similarity-based presentation order (a nontypical pattern, although similar to ABBBA, a typical exemplar-based pattern). We therefore found a prominent rule-based generalization pattern (BBABA) following the rule-based presentation order, which corresponds to a one-dimensional rule (plus exceptions) based on the Color dimension, that is “all red objects except B2 versus all green objects except A4.” Because averaging the individual patterns can be misleading, Table 1 reports the 13 different patterns that we observed in our experiment when we focused on the distribution of the generalization patterns at the individual participant level (N = 43). The patterns are reported in columns in the right part of the table. The respective distributions of the number of patterns observed for the Similarity and Rule conditions are reported below the patterns. Because it makes sense that an “AAAAA” pattern is less different from an “AABAA” pattern than from a “BBBBB” pattern, we considered that the 13 patterns represented an ordinal variable, since each of the binary patterns can be transformed into a decimal. The two-sample Kolmogorov-Smirnov test was computed using the Matlab kstest2 function, which is based on the exact theoretical distribution function of D. We found p = .028 when the directional parameter was set to “larger” (to test the alternative hypothesis that the cumulative distribution function for Sim is larger), showing that the frequency distributions for Sim and Rule are different. A simple Gamma test for the Frequency × Order Type crosstab was also significant (p = .017). The distribution of the generalization patterns shows in particular that the BBABA pattern was less common in the similarity-based presentation order than in the rule-based order (7 vs. 14 participants, respectively). 
Note that the AABBB pattern (another prominent rule-based pattern, based on a one-dimensional rule using Size instead of Color as the main dimension) is represented twice by participants who were given a similarity-based order and once by a participant in the rule-based presentation order condition. Note that in the present study, 93% of the participants who applied a one-dimensional rule still classified the exceptional items of Categories A and B in their correct category during transfer, regardless of order condition. We conclude that the participants tended to use Color to separate the categories. A total of 24 participants eventually categorized the transfer objects in a way that suggests that they applied a rule-based strategy.⁵ Overall, our result clearly indicates a distortion in the generalization patterns according to presentation order, and this distortion is mainly visible in the frequency associated with the BBABA pattern and the result of the Kolmogorov-Smirnov test.
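The steps from raw transfer responses to an ordinal pattern score can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the coding A = 0 and B = 1 for the binary-to-decimal conversion is assumed (the text only says that each pattern can be transformed into a decimal), the function names are hypothetical, and only the two-sample statistic D is computed, not the exact p value returned by Matlab's kstest2.

```python
from typing import Dict, Sequence

CRITICAL = ["T1", "T2", "T4", "T5", "T6"]   # critical transfer stimuli


def p_a(responses: Sequence[str]) -> float:
    """Proportion of 'A' responses given to one stimulus across blocks."""
    return sum(r == "A" for r in responses) / len(responses)


def pattern(by_item: Dict[str, Sequence[str]]) -> str:
    """Majority response per critical item, e.g. 'BBABA'.

    With five transfer blocks there are no ties, so p(A) > .5 suffices.
    """
    return "".join("A" if p_a(by_item[item]) > 0.5 else "B"
                   for item in CRITICAL)


def pattern_rank(pat: str) -> int:
    """Read a pattern as a binary number (assumed coding: A = 0, B = 1)."""
    return int(pat.replace("A", "0").replace("B", "1"), 2)


def ks_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two ECDFs."""
    def ecdf(sample: Sequence[float], value: float) -> float:
        return sum(s <= value for s in sample) / len(sample)
    return max(abs(ecdf(x, v) - ecdf(y, v)) for v in set(x) | set(y))


# Hypothetical participant: five responses per critical item.
participant = {"T1": "BBBBA", "T2": "BBBBB", "T4": "AAABA",
               "T5": "BBABB", "T6": "AAAAA"}
pat = pattern(participant)
print(pat, pattern_rank(pat))   # BBABA 26
```

Under this coding, AAAAA maps to 0, AABAA to 4, and BBBBB to 31, which preserves the intuition that AAAAA is closer to AABAA than to BBBBB. Applying `ks_statistic` to the ranks of the two order conditions would give the D statistic on which the reported test is based.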

⁵ Twenty-four (7 + 14 + 2 + 1 = 24) is quite a high value in relation to the fairly erratic distribution of patterns that has previously been observed (see Johansen & Palmeri, 2002, p. 491).


A simpler test of whether rule-based generalizations were statistically more prominent after rule-based training fell short of significance. A total of 15 participants in the rule-based condition categorized the transfer objects in a way that reflected one of the two prominent BBABA or AABBB rule-based patterns, against seven participants for all other patterns (expressed henceforth as 15 vs. 7). Conversely, the distribution was 9 versus 12 in the similarity-based order. This allowed us to compute a χ² for this 2 × 2 crosstab, which fell short of significance (p = .09, without Yates’ correction; p = .07 using the Fisher exact test). A 16 versus 6 distribution (instead of the 15 vs. 7 observed) in the rule-based condition would have reached significance. Note that the 16 versus 6 simulation is quite informative since two of the seven participants only showed a single different response to the BBABA pattern in the rule-based condition (i.e., BABAA and BBBAA). Another potential issue of the preceding analyses is that a participant who responded for instance ABABA on all five transfer trials (i.e., five A responses for the first critical transfer item, five B responses for the second critical transfer item, etc.) was treated the same as a participant who responded A on four trials and B on one trial for the first critical transfer item, and so on. The latter participant would still be classified as an ABABA responder since a particular transfer pattern was based on whichever response was given more often out of the five transfer responses. Latent class analysis would be appropriate for determining whether any strategies were dominant in either condition, but this method unfortunately requires either large data sets consisting of long time series of transfer responses from each subject, or else many more subjects (Raijmakers et al., 2014). 
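The uncorrected χ² for the 2 × 2 crosstab above can be checked with a short sketch using only the counts given in the text. The function name is hypothetical; the p value uses the identity that, for one degree of freedom, the upper tail of the χ² distribution equals erfc(√(χ²/2)). The Fisher exact test is not reproduced here.

```python
import math
from typing import Tuple


def chi2_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no Yates correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability for df = 1.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))   # exact survival function for df = 1
    return chi2, p


# Rows: rule-based vs. similarity-based order.
# Columns: prominent rule-based pattern (BBABA or AABBB) vs. any other pattern.
observed = chi2_2x2(15, 7, 9, 12)
print(round(observed[0], 2), round(observed[1], 2))          # 2.79 0.09

# The hypothetical 16 vs. 6 split mentioned in the text.
hypothetical = chi2_2x2(16, 6, 9, 12)
print(round(hypothetical[0], 2), round(hypothetical[1], 3))  # 3.94 0.047
```

The observed table gives p ≈ .09, matching the reported value, and moving a single participant in the rule-based condition brings p just under .05, which is why the 16 versus 6 comparison is described as informative.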
In order to see whether performance in the similarity-order condition was more random than in the rule-based order, we considered the proportion of A responses to transfer trials for each of the transfer stimuli, which the previous analysis had ignored. One possibility is that the participants were simply worse at applying one of the rules to the transfer items in the similarity-order condition. We therefore classified the participants as 100% consistent when they gave at least four similar responses to each transfer item; they were rated as 80% consistent when only one item was classified inconsistently (i.e., two or three times in a given category), 60% consistent when two items were classified inconsistently, and so on. The mean proportion of this consistency measure across participants was slightly lower in the rule-based condition (.85, sd = .20) than in the similarity-based condition (.94, sd = .11), but this difference did not reach significance (t(41) = 1.9, p = .06). A second test replaced the preceding consistency criterion with a more severe one by rating a transfer response as consistent only when the participant gave either five A responses or five B responses


for a critical item. In that case, the mean proportion reduced to .65 (sd = .31) in the rule-based condition and .70 (sd = .25) in the similarity-based condition, and this smaller difference did not reach significance either (t(41) = .57, p = .57). This result contradicts the idea that the participants in the similarity-based condition were behaving more randomly when categorizing the new items.
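The two consistency measures can be sketched as below. The sketch assumes that the measure is taken over the five critical transfer items (the 20% steps in the text suggest five items) and that each item receives five responses; the function names and the example data are hypothetical.

```python
from typing import Dict, Sequence


def item_is_consistent(responses: Sequence[str], min_same: int) -> bool:
    """True if at least `min_same` responses to one item agree."""
    n_a = sum(r == "A" for r in responses)
    return max(n_a, len(responses) - n_a) >= min_same


def consistency(by_item: Dict[str, Sequence[str]], min_same: int = 4) -> float:
    """Proportion of items classified consistently by one participant.

    min_same=4 gives the lenient criterion (at least four similar responses);
    min_same=5 gives the stricter criterion (five A or five B responses).
    """
    flags = [item_is_consistent(r, min_same) for r in by_item.values()]
    return sum(flags) / len(flags)


# Hypothetical participant: T4 is split 3 vs. 2, T2 and T6 are 4 vs. 1.
participant = {"T1": "AAAAA", "T2": "BBBBA", "T4": "AABBA",
               "T5": "BBBBB", "T6": "AAAAB"}
print(consistency(participant, min_same=4))  # 0.8
print(consistency(participant, min_same=5))  # 0.4
```

Averaging these per-participant proportions within each order condition yields the group means that are compared with the t tests reported above.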

General Discussion The current study corroborates our earlier finding that a rule-based presentation order aids learning (Mathy & Feldman, 2009), this time with the 5-4 categories. More importantly, we hypothesized that participants in a rulebased training condition would show generalization patterns consistent with rule-based retrieval, and that participants in a similarity-based training condition would show generalization patterns consistent with exemplar retrieval. Our results show that participants particularly capitalized on the rule-based order. When our participants in the rule-based condition were asked to transfer their learning to unlearned instances, they mostly showed generalization patterns consistent with rule-based retrieval; and, likewise, participants in the similarity-based condition showed fewer generalization patterns consistent with rule-based retrieval, although they did not generally exhibit generalization patterns consistent with exemplar-based retrieval. The distribution of the generalization pattern seems more random in the similarity-based condition (except for the larger group which adopted one rule-based generalization). Two thirds of these participants may have represented the categories in a heterogeneous way, almost as if the presentation had been random. However, our consistency measure of the transfer responses in the similarity-based condition did not provide support for the hypothesis that the participants behaved more randomly when categorizing the new items. One remaining interpretation is that the similarity-based condition only generated more noise during the training process. 
These analyses lead to our conclusion that the participants particularly capitalize on the rule-based order during learning because they can in fact abstract rules that are both consistent and simple, while participants in the similarity-based order struggle to abstract the simplest rules and memorize the stimuli without relying on a typical exemplar-based categorization process either. These findings are important because they suggest that the manipulation of order within categories can define a context which may affect not only the speed of learning but also the nature of learning, in much the same way as copresented items can have powerful effects on same-category comparisons (Andrews, Livingston, & Kurtz, 2011;


Hammer, Diesendruck, Weinshall, & Hochstein, 2009). The different representations that were observed during the transfer phase suggest a potential connection with the idea that a mere manipulation of order inside a category can determine the meaning that is inferred from a set of objects, like a disease from a checklist of symptoms for instance (Kwan, Wojcik, Miron-Shatz, Votruba, & Olivola, 2012).

Limitations One limitation of the present study is that the total amount of category learning was not equated in the two experimental conditions. Because we used a learning criterion, the order conditions were confounded with the number of blocks of training; on average, participants in the rule-based condition reached the learning criterion in fewer blocks. In effect, using a fixed learning criterion meant that rule-based participants received on average fewer training trials, an inevitable result of the superior learning entailed by the rule-based order. Note though that alternative procedures would have created even more serious design problems. For example, equating the number of blocks in each condition would have led either to extremely poor learning in some conditions (if the number were low, and thus to transfer performance reflecting more noise than learning) or (if the number were high) to ceiling effects and mostly exemplar-based memorization. Effectively, early phases of learning tend to be more rule-based, and later phases more exemplar-based (Johansen & Palmeri, 2002), potentially introducing a further bias if learning is prolonged. Of course, our results might in part reflect the specific learning criterion we chose, and only future experiments can reveal whether the pattern of results might subtly alter with alternative criteria. However, the essential point is that because two presentation orders lead to such different rates of learning, it seems necessary that when comparing them we equate the amount of actual learning rather than the mere exposure to training stimuli. A similar limitation had been noted by Casale, Roeder, and Ashby (2012), who also questioned whether the specific transfer observed for their rule-based condition might have been due to the faster learning of their rule-based task. 
The authors also rejected the difficulty account, arguing for instance that transfer should occur in any task in which there is learning, and concluded that rule-based category learning is functionally distinct from other categorization tasks during transfer. To confirm our interpretation of the results, we reanalyzed our results by matching the participants of both

groups by the number of blocks that each of them used to reach the learning criterion. The participants who could not be matched given their performance during training were left out of the new analysis (e.g., because a single participant reached the learning criterion using 12 blocks, in the rule-based order, this participant was omitted from the new analysis). As a result, presentation order and amount of learning were equated a posteriori. We then reanalyzed the data, without the stimulus-exposure confound. The revised results were virtually identical to the original, with for instance a r = .99 Pearson’s correlation between the mean probabilities shown in Figure 2 and the corresponding means in the new analysis. We conclude that the differing patterns of transfer between the two presentation orders cannot be attributed to the different amounts of stimulus exposure in the two conditions. A second limitation is due to a choice we made in the design of the experiment that may hamper its generalizability. For instance, only one kind of clustering (based on Color) was used to determine the clusters of the rule-based condition, and a similar experiment could have been conducted using the Size and the Shape dimensions as well. Still, because two-thirds of our participants followed the one-dimensional Color rule plus exception that was induced in the Rule condition, we felt that tripling our sample to use for instance all of the Shape/Color/Size canonical dimensions to confirm this trend would have been too costly in the short-term.
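The a posteriori matching by blocks to criterion described above can be illustrated with a minimal sketch. The exact matching rule is not spelled out in the text, so the sketch assumes a one-to-one pairing of participants who needed the same number of blocks; the function name and the example values are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def matched_pairs(rule_blocks: Sequence[int],
                  sim_blocks: Sequence[int]) -> Dict[int, int]:
    """Number of one-to-one matches available for each blocks-to-criterion value.

    The multiset intersection keeps, for every block count, the smaller of the
    two group frequencies; participants beyond that number are left unmatched.
    """
    return dict(Counter(rule_blocks) & Counter(sim_blocks))


# Hypothetical blocks-to-criterion values for a few participants per group.
rule = [4, 4, 6, 12]
sim = [4, 6, 6, 9]
print(matched_pairs(rule, sim))  # {4: 1, 6: 1}
```

In this toy example the rule-based participant who needed 12 blocks has no counterpart and is dropped, as is the similarity-based participant who needed 9, so that both retained groups have identical distributions of training exposure.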

Conclusion Our main results tend to agree with previous theories implying that different learning processes mediate category representations (Ashby & Ell, 2001; Sloman, 1996) as we found that different types of presentation orders change the generalization patterns of a single category structure. Our previous work (Mathy & Feldman, 2009) already showed that rule-based orders and similarity-based orders affect the speed of category learning. The present result further documents that different category representations can be induced by such orders and that the distortion of these representations is extended to several stimuli among the categories induced. It is not clear whether the similarity-based training leads to exemplar-based generalizations because there is no clear evidence for similarity-based responding in our result, but the patterns of generalization that the learners show change in comparison to the rule-based condition.⁶ The rule-based condition however more clearly

⁶ One possibility is that showing our participants the correct category label during training tended to discourage associative learning (Ashby, Maddox, & Bohil, 2002). However, a subsequent study has shown that, under more controlled conditions, the presence of feedback impacts rule-based and similarity-based processes to a similar degree (Edmunds, Milton, & Wills, 2015).


produced rule-based classification during transfer and a clear peak for one of the typical generalization patterns. A reasonable question is whether our participants are truly using rule-based representations within a putative dual process of reasoning. Our results are also compatible with other approaches which tend to see the two systems of reasoning as the two endpoints of a continuum (Briscoe & Feldman, 2011). For instance, selective attention in an exemplar-based process can produce patterns of responses similar to single-dimension rule use (indeed, this is how Johansen & Palmeri, 2002, model rules; see also Kruschke, 1992), and as such, responding on the basis of a single dimension does not in itself indicate rule use (likewise, participants can also apply multidimensional rules which can account for exemplar effects; see Nosofsky, Palmeri, & McKinley, 1994). However, we believe that our procedure involved no pressure to increase the likelihood of single-dimension responding as predicted for instance by the Extended General Context Model (Lamberts, 1998) or Combination Theory (Wills, Inkster, & Milton, 2015). The rule-based effect is particularly interesting in light of the fact that the 5-4 categories are known to predominantly elicit memorization of individual objects (Blair & Homa, 2003). Thus it cannot be easily argued that the category structure itself generally favors a rule-based learning mechanism, although the learning process was not prolonged and might have mostly induced a rule-based process. 
Overall, these results might be crucial for testing rule- and exemplar-based models of human categorization behavior (Rouder & Ratcliff, 2004) and sequential hypothesis-testing models (Markant & Gureckis, 2014), as well as better identifying rule learners versus exemplar learners (McDaniel, Cahill, Robbins, & Wiener, 2014), especially for testing models that are known to implement an incremental architecture (Love, Medin, & Gureckis, 2004; Sakamoto, Jones, & Love, 2008; Stewart, Brown, & Chater, 2002) to account for such sequence effects.

Acknowledgments This research was supported by the Agence Nationale pour la Recherche (ANR) Grant No. ANR-09-JCJC-0131-01 awarded to Fabien Mathy. We are grateful to Azizedine Elmahdi and Nicolas Heller for their assistance in data analysis during their engineering internship in the laboratory.

Electronic Supplementary Materials The electronic supplementary material is available with the online version of the article at http://dx.doi.org/10.1027/1618-3169/a000312


ESM 1. Raw data (Excel file). Raw data of the presentation order experiment. ESM 2. Transfer data (sav file). Data of the repeated experiment.

References Allen, S. W., & Brooks, L. R. (1991). Specializing the operation of an explicit rule. Journal of Experimental Psychology: General, 120, 3–19. Andrews, J. K., Livingston, K. R., & Kurtz, K. J. (2011). Category learning in the context of co-presented items. Cognitive Processing, 12, 161–175. Ashby, F. G., & Ell, S. W. (2001). The neurobiology of human category learning. Trends in Cognitive Sciences, 5, 204–210. Ashby, F. G., Maddox, W. T., & Bohil, C. J. (2002). Observational versus feedback training in rule-based and information-integration category learning. Memory & Cognition, 30, 666–677. Birnbaum, M. S., Kornell, N., Bjork, E. L., & Bjork, R. A. (2013). Why interleaving enhances inductive learning: The roles of discrimination and retrieval. Memory & Cognition, 41, 392–402. Blair, M., & Homa, D. (2003). As easy to memorize as they are to classify: The 5-4 categories and the category advantage. Memory & Cognition, 31, 1293–1301. Briscoe, E., & Feldman, J. (2011). Conceptual complexity and the bias/variance tradeoff. Cognition, 118, 2–16. Carvalho, P. F., & Goldstone, R. L. (2011). Stimulus similarity relations modulate benefits of blocking and interleaving during category learning. Abstracts of the 52nd annual meeting of the psychonomic society, Seattle, WA. Carvalho, P. F., & Goldstone, R. L. (2014). Putting category learning in order: Category structure and temporal arrangement affect the benefit of interleaved over blocked study. Memory & Cognition, 42, 481–495. Casale, M. B., Roeder, J. L., & Ashby, F. G. (2012). Analogical transfer in perceptual categorization. Memory & Cognition, 40, 434–449. Clapper, J. P., & Bower, G. H. (1994). Category invention in unsupervised learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 443–460. Clapper, J. P., & Bower, G. H. (2002). Adaptive categorization in unsupervised learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 908–923. Cohen, A. 
L., & Nosofsky, R. M. (2003). An extension of the exemplar-based random-walk model to separable-dimension stimuli. Journal of Mathematical Psychology, 47, 150–165. Edmunds, C. E. R., Milton, F., & Wills, A. J. (2015). Feedback can be superior to observational training for both rule-based and information-integration category structures. The Quarterly Journal of Experimental Psychology, 68, 1203–1222. Elio, R., & Anderson, J. R. (1981). Effects of category generalizations and instance similarity on schema abstraction. Journal of Experimental Psychology: Human Learning and Memory, 7, 397–417. Elio, R., & Anderson, J. R. (1984). The effects of information order and learning mode on schema abstraction. Memory & Cognition, 12, 20–30. Gagné, R. M. (1950). The effect of sequence of presentation of similar items on the learning of paired associates. Journal of Experimental Psychology, 40, 61–73. Garner, W. (1974). The processing of information and structure. Potomac, MD: Erlbaum.


Received May 13, 2015
Revised October 6, 2015
Accepted October 14, 2015
Published online March 29, 2016

Fabien Mathy
Université Nice Sophia Antipolis
Laboratoire BCL: Bases, Corps, Langage
24, avenue des diables bleus
06357 Nice Cedex 4
France
Tel. +33 489881445
E-mail fabien.mathy@unice.fr

© 2016 Hogrefe Publishing

Experimental Psychology 2016; Vol. 63(1):59–69

Registered Report

Evaluative Priming in the Pronunciation Task: A Preregistered Replication and Extension

Karl Christoph Klauer,¹ Manuel Becker,¹ and Adriaan Spruyt²

¹ Institut für Psychologie, Albert-Ludwigs-Universität Freiburg, Germany
² Department of Experimental-Clinical and Health Psychology, Ghent University, Belgium

Abstract: We replicated and extended a study by Spruyt and Hermans (2008) in which picture primes engendered an evaluative-priming effect on the pronunciation of target words. As preliminary steps, we assessed data reproducibility of the original study, conducted Pilot Study I to identify highly semantically related prime-target pairs, reanalyzed the original data excluding such pairs, conducted Pilot Study II to demonstrate that we can replicate traditional associative priming effects in the pronunciation task, and conducted Pilot Study III to generate relatively unrelated sets of prime pictures and target words. The main study comprised three between-participants conditions: (1) a close replication of the original study, (2) the same condition excluding highly related prime-target pairs, and (3) a condition based on the relatively unrelated sets of prime pictures and target words developed in Pilot Study III. There was little evidence for an evaluative priming effect independent of semantic relatedness.

Keywords: evaluative priming, affective priming, pronunciation task, replicability

Experimental Psychology 2016; Vol. 63(1):70–78
DOI: 10.1027/1618-3169/a000286

The evaluative-priming paradigm is a sequential priming paradigm introduced by Fazio, Sanbonmatsu, Powell, and Kardes (1986). It allows one to gauge the impact of the valence of briefly shown prime stimuli on responses to subsequently presented target stimuli. An evaluative-priming effect is observed when responses to target stimuli (e.g., sunshine) occur faster and more accurately following evaluatively congruent prime stimuli (e.g., love) than following evaluatively incongruent prime stimuli (e.g., hate). The paradigm continues to attract a considerable amount of research interest, perhaps due to its role in studying the spontaneous activation of attitudes (e.g., Fazio et al., 1986), as one of the first implicit measures of attitudes (De Houwer, Teige-Mocigemba, Spruyt, & Moors, 2009), and as a model of behavioral priming effects (Fiedler, Bluemke, & Unkelbach, 2011).

One line of research has looked at evaluative-priming effects in the pronunciation task. In that task, the targets are words and the participants’ task is to read the target word aloud as fast as possible. The pronunciation task has gained some prominence because the occurrence of evaluative-priming effects in it would suggest relatively strongly (1) that prime valence affects the encoding of valenced targets (Wentura & Frings, 2008) and (2) that primes engender effects independently of a goal to evaluate (De Houwer, Hermans, & Spruyt, 2001). Both conclusions (1) and (2) would imply evaluative priming to be surprisingly general in scope, and they would strongly constrain theoretical accounts of evaluative priming.

Early attempts to find such effects yielded mixed results (see Klauer & Musch, 2003, for a summary). In particular, it seems difficult to replicate seminal findings by Bargh, Chaiken, Raymond, and Hymes (1996), who observed evaluative-priming effects in the traditional paradigm in which primes and targets are words (Klauer & Musch, 2001; Spruyt, Hermans, Pandelaere, De Houwer, & Eelen, 2004). But in recent years, evaluative-priming effects were repeatedly reported in the pronunciation task when picture primes and/or modified priming procedures were employed. In particular, four studies have used picture primes and reported evaluative-priming effects in the pronunciation task (Duckworth, Bargh, Garcia, & Chaiken, 2002, Exp. 2; Giner-Sorolla, Garcia, & Bargh, 1999, Exp. 2; Spruyt & Hermans, 2008; Spruyt, Hermans, De Houwer, & Eelen, 2002, Exp. 3). A fifth such study (Everaert, Spruyt, & De Houwer, 2011) found a significant effect in one of two conditions. Other studies have used modified paradigms employing a variety of measures presumably affecting the semantic or evaluative processing of primes and/or targets (e.g., De Houwer, Hermans, & Spruyt, 2001; De Houwer & Randell, 2004; Everaert et al., 2011; Pecchinenda, Ganteaume, & Banse, 2006; Spruyt, De Houwer, & Hermans, 2009). A related line of research has used the picture-naming task, in which targets are pictures and the task is to name the object depicted in the picture (Spruyt, Hermans, De Houwer, Vandromme, & Eelen, 2007; Spruyt et al., 2002; Wentura & Frings, 2008). Collecting these different paradigms and tasks under the common categorization “pronunciation task/naming task,” a recent meta-analysis (Herring et al., 2013) concluded that there was a significant evaluative-priming effect across these studies. The present paper is focused on effects reported for picture primes and target words in the non-modified evaluative-priming paradigm with the pronunciation task.¹

The just-mentioned meta-analysis also adduced evidence for publication bias in the evaluative-priming literature, indicating that the published results may present a distorted picture. Given that the effect size estimated for the pronunciation/naming studies (d = 0.29) was substantially smaller than that for studies with the frequently used evaluative-decision task in which targets have to be classified as positive or negative (d = 0.45), it may be speculated that the pronunciation studies are especially likely to have produced null results and to have disappeared in file drawers. In fact, the first author has conducted multiple experiments using the pronunciation task that produced null results and either did not attempt to publish them or did not succeed in doing so. In the light of the relatively small number of published studies that have used the pronunciation task and considering the low power of methods for detecting publication bias, it is, however, difficult to assess whether publication bias is more or less pronounced for pronunciation studies than for studies using other tasks. Given the evidence for publication bias, preregistered replications are desirable to enhance one’s confidence in the reliability of the relevant literature despite publication bias.
Another limitation of the literature is that priming effects in the pronunciation task using picture primes were reported by only two workgroups, making it desirable to assess the generalizability of effects beyond these laboratories (more than two workgroups were involved in the above-discussed set of studies sometimes categorized as pronunciation/naming-task studies). Furthermore, we engage in an adversarial collaboration, a widely recommended constructive method to resolve scientific conflict (e.g., Kahneman, 2003). For the present replication, we chose the study by Spruyt and Hermans (2008) on the basis of its methodological strengths. In particular, unlike almost all other studies, it avoided the use of target repetition, at least for a first block of trials. Klauer and Musch (2001) argued that target repetition jeopardizes the intended interpretation of priming effects as demonstrating facilitated target encoding.

In what follows, we first describe the study by Spruyt and Hermans (2008) and then assess its reproducibility, a necessary condition for replicability. Data reproducibility “means that Researcher B (e.g., the reviewer of a paper) obtains exactly the same results (e.g., statistics and parameter estimates) that were originally reported by Researcher A (e.g., the author of that paper) from A’s data when following the same methodology” (Asendorpf et al., 2013, p. 109). In the course of assessing reproducibility, a potential alternative hypothesis in terms of a semantic-relationship confound for Spruyt and Hermans’ (2008) study was identified. This possibility was followed up in a reanalysis of Spruyt and Hermans’ (2008) data based on ratings obtained in Pilot Study I. The reanalysis strongly motivates including additional conditions over and above a replication condition, leading to an extended design. Pilot Study II assesses the suitability of our procedures and instruments for securing priming effects in the pronunciation task, another necessary condition for replicability. In Pilot Study III, new materials were generated for an additional condition of the main study.

¹ The first two authors believe that the above-mentioned different paradigms are vulnerable to different alternative explanations and confounds and that it is therefore misleading to lump them together. We intend to spell out and examine these alternative explanations in the course of a larger project of which the present manuscript constitutes the first step targeting specifically and only evaluative priming in the pronunciation task using picture primes.

Spruyt and Hermans (2008) Study

Procedure

Spruyt and Hermans (2008) used 20 positive and 20 negative words as targets along with 30 positive and 30 negative pictures as primes. Participants completed four blocks of 40 trials each. In each block, the 40 target words were randomly paired with 40 randomly selected pictures, the only restriction being that each trial type (positive-positive, positive-negative, negative-positive, negative-negative) occurred equally often. The experimental blocks were preceded by 10 practice trials using neutral prime pictures and neutral target words. Participants were to pronounce the target word as quickly as possible while ignoring the pictures. The procedures were further described as follows: “Each trial started with a 500 ms presentation of a fixation cross in the centre of the screen. Five hundred milliseconds after the offset of the fixation cross, the primes were presented for 200 ms. Finally, after an inter stimulus interval of 50 ms, the target stimuli were presented until the participant gave a response or 2,000 ms elapsed. By pressing one of three keys of the computer keyboard, the experimenter coded whether the microphone was triggered accurately and whether the participant’s response was correct. After the experimenter entered the code, the next trial was initiated after a time interval that varied randomly between 500 ms and 1,500 ms.” (Spruyt & Hermans, 2008, p. 239).

For the analyses, trials in which the voice key was not appropriately activated or an incorrect response was given were excluded. Responses past the 2,000 ms deadline were also excluded. Response latencies that deviated more than 2.5 SDs from a participant’s conditional mean latency were also discarded, where “conditional” refers to the cells of the design spanned by the factors block and evaluative congruency. An analysis of variance with these factors revealed a significant main effect of evaluative congruency with estimated effect size d = 0.44. In addition, there was a significant evaluative priming effect for the first block of trials with estimated effect size d = 0.35.
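The trimming rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the original analysis script: the function name `trim_latencies` is hypothetical, the sample SD (n − 1 denominator) is assumed, and the mean and SD are assumed to be computed after removing responses at or past the deadline. The text does not specify either detail.

```python
from statistics import mean, stdev

def trim_latencies(latencies, deadline_ms=2000, sd_cutoff=2.5):
    """Trim the latencies of ONE design cell (one participant,
    one block x congruency combination); hypothetical helper."""
    # Drop responses at or past the response deadline first (assumed order).
    in_time = [rt for rt in latencies if rt < deadline_ms]
    if len(in_time) < 2:
        return in_time  # SD is undefined for fewer than two values
    cell_mean = mean(in_time)
    cell_sd = stdev(in_time)  # sample SD; an assumption, not stated in the text
    # Keep latencies within sd_cutoff SDs of the cell mean.
    return [rt for rt in in_time if abs(rt - cell_mean) <= sd_cutoff * cell_sd]

# Made-up latencies in ms for a single cell.
cell = [480, 495, 500, 505, 510, 490, 515, 485, 500, 1200, 2000]
kept = trim_latencies(cell)
```

Because the criterion is applied per cell, the function would be called once for every participant × block × congruency combination before cell means are computed.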

Reproducibility

Based on the files with the raw data, we reproduced all reported statistics precisely with the exception of the number of excluded outliers which amounted to 2.04% of all trials rather than 1.70% as stated in the article. In total, 5.30% of trials were excluded for different reasons.² The raw data suggested a potentially more important discrepancy between the article and the data, however. The set of targets listed in the appendix of Spruyt and Hermans (2008) differs from the set of targets written into the raw data files. Discussion with Adriaan Spruyt revealed that the set reported in the article is not the one that was actually used in the experiment although there is considerable overlap between the two. This discrepancy led to the publication of an erratum, “Correction to Spruyt and Hermans (2008)” (2014), in which the correct target words are reported. The discrepancy does not affect any of the statistical analyses or conclusions drawn from the original paper.

A Potential Confound, Pilot Study I, and Reanalysis

Inspection of the target words and of the prime pictures reveals a potential confound: Quite often, the target words are strongly descriptive of the prime pictures. For example, the target “tumor” can be seen as a description of one of the prime pictures depicting a huge tumor; the target “haat” [hate] is exemplified by several prime pictures depicting threatening, angry men; the target “romantiek” [romance] is allegorically depicted by pictures showing happy couples, a bride, a rose, and so forth. Such close semantic relationships raise the possibility that part of the evaluative-priming effects were in fact due to uncontrolled strong associative-semantic links between primes and targets (Neely, 1991).

Pilot Study I

To assess this possibility, we had six raters rate all 2,400 possible prime-target pairs on a 4-point rating scale with respect to how well they fit each other. The instructions explained that a picture and a word fit each other if, for example, the word describes the picture well, the picture is an allegorical depiction of the word, the picture gives an example of the word, or if the picture makes one think spontaneously of the word. The four points of the rating scale were labeled from 1 to 4, in order, as “do not fit at all,” “fit rather not,” “fit somewhat,” “fit very well.” One rater only used the cautious middle categories with few exceptions, and we did not use her data further (see Electronic Supplementary Material, ESM 1). Not surprisingly, evaluatively congruent stimuli (M = 2.75, SD = 0.72) fit each other better than evaluatively incongruent stimuli (M = 1.21, SD = 0.24); in particular, none of the raters gave the highest rating “4” for any evaluatively incongruent pair.

² In the original article, degrees of freedom are Greenhouse-Geisser corrected for the F tests involving the factor “block,” but this is not the case in the computation of the reported MSE values as some readers might perhaps have expected. Finally, in Footnote 3, the degrees of freedom reported for the main effect of block are wrongly specified; the correct values are 1.75 and 78.52.

Reanalysis

For a reanalysis of Spruyt and Hermans’ (2008) data, we excluded all pairs for which a majority of the five raters had selected the highest rating, thereby excluding 233 of the 2,400 pairs. This reduced the mean fit rating for evaluatively congruent stimuli slightly to M = 2.52, SD = 0.60. The total percentage of excluded trials was thereby elevated to 14.19% (from 5.30%). An analysis of variance with within-participants factors block and evaluative congruency showed that the main effect of block reported by Spruyt and Hermans (2008) was as strong as in the original analysis, F(2.39, 107.67) = 8.99, p < .01, previously F(2.50, 112.44) = 8.42, whereas the main effect of congruency was erased: F(1, 45) = 1.49, p = .23, from F(1, 45) = 8.83, p < .005 (F < 1 for the interaction of both factors). The mean priming effect was 1.8 ms (SD = 9.7), down from M = 4.0 ms (SD = 9.2) in the original study. The evaluative-priming effect was significant neither in the first block of trials, t(45) = 1.17, p = .25, M = 3.6 ms, SD = 21, previously t(45) = 2.40, p < .05, M = 6.9 ms, SD = 20, nor in an analysis of variance with the first block of trials eliminated (all Fs involving congruency smaller than 1).

This suggests that the evaluative-priming effects reported by Spruyt and Hermans (2008) may partly or completely have been based on a few strongly semantically related prime-target pairs. Of course, this evidence is only suggestive, because (1) eliminating trials reduces the statistical test power for detecting effects (note, however, that the block effect remained, if anything, as strong as in the original analysis), and because (2) the fit ratings were based on target words translated into German and on German rather than Belgian participants so that there may be cultural differences in the degree of fit as perceived by Spruyt and Hermans’s (2008) participants and the present raters. Nevertheless, the results of the pilot study and reanalysis strongly motivate extending the replication with additional conditions in which at least the most strongly semantically related prime-target pairs are excluded a priori.

Pilot Study II

The purpose of Pilot Study II was to demonstrate that the procedures and instruments (such as our voice key) are capable of documenting priming effects if they exist. Like data reproducibility, this is a necessary condition for replicability, because sloppy and noisy data collection would decrease effect sizes and can thereby prevent a true effect from emerging reliably. In Pilot Study II, we implemented a sequential priming paradigm using the pronunciation task and exactly the same procedures for the timing of primes and targets and the recording of responses as in Spruyt and Hermans (2008). Primes and targets were, however, both words that were either strongly associated (e.g., mother-father) or not related. The related prime-target pairs were the same 50 pairs already used by Klauer and Musch (2001). The 50 unrelated prime-target pairs were formed by randomly re-pairing primes and targets. The 100 prime-target pairs were presented in a random order that was determined anew for each participant with the restriction that the same target word did not appear in two consecutive trials. Associative priming in the pronunciation task is a well-documented effect (Neely, 1991) and therefore an appropriate benchmark for testing our procedures.

The 20 participants were mostly students from the University of Freiburg with different majors (mean age 23.5 years, SD = 3.4, 13 female). They received course credit or a monetary compensation of €2.00 for participating. Primes and targets were presented in the center of a 58 cm LCD monitor (LG FLATRON W2363D) with a resolution of 1,920 by 1,080 pixels and a seating distance of approximately 60 cm. Words were presented in Times New Roman font and subtended 54 pixels vertically. Data (see ESM 2) were preprocessed exactly as in Spruyt and Hermans (2008; see above), leading to the exclusion of 4.10% of the trials. An associative priming effect of 10 ms (SD = 12) and estimated effect size of d = .88 emerged that was significantly different from zero: t(19) = 3.93, p < .01.
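For a within-participants priming effect tested with a one-sample t-test on difference scores, the effect size and the test statistic are linked by d = t / √N. The following sketch checks that the reported values are mutually consistent. The function names are hypothetical, and the assumption that the authors computed d this way is ours, since the text does not state the formula.

```python
import math

def cohens_d_from_t(t_value, n):
    """Effect size of a one-sample (or paired) t-test: d = t / sqrt(n)."""
    return t_value / math.sqrt(n)

def one_sample_t(mean_diff, sd_diff, n):
    """t statistic for a mean difference score tested against zero."""
    return mean_diff / (sd_diff / math.sqrt(n))

# Reported: t(19) = 3.93 with N = 20 participants.
d_reported = round(cohens_d_from_t(3.93, 20), 2)   # 0.88

# Recomputing from the rounded summary values (M = 10 ms, SD = 12 ms)
# gives a slightly smaller t, as expected when inputs are rounded.
t_from_rounded = one_sample_t(10, 12, 20)
```

The small gap between the t value recomputed from M = 10 and SD = 12 and the reported t(19) = 3.93 is what one would expect from rounding of the published mean and SD.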

Main Study

The main study comprised three conditions implemented as a between-participants factor. A replication condition was a close replication of the study by Spruyt and Hermans (2008). In a second condition, all stimuli from the original study were retained, but the 233 strongly related prime-target combinations identified in Pilot Study I were never presented. A third condition used different sets of prime pictures and target words for which highly related prime-target pairs were a priori unlikely to occur. A sample size of N = 58 is required for a power of .95 to detect an effect of evaluative congruency of the size observed in the original study (d = 0.44) by means of a one-sided t-test, and we planned to collect N = 60 participants per condition. An anonymous reviewer suggested prior to data collection to use the above-mentioned effect size of d = 0.29 estimated for pronunciation/naming studies and to obtain enough participants such that the two relatively unconfounded replication attempts (second and third condition) taken together achieve a test power of .95 in a one-sided t-test. This requires N = 131 participants, and following the reviewer’s suggestion, we sampled N = 66 participants for each of the three conditions.
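The two sample sizes quoted above can be approximated without specialized software. The sketch below uses a normal approximation with a small-sample correction term for a one-sided, one-sample t-test at α = .05. This is our assumption for illustration only: the authors do not say how they computed the required N, and an exact calculation would use the noncentral t distribution. The function name `required_n` is hypothetical.

```python
import math
from statistics import NormalDist

def required_n(d, alpha=0.05, power=0.95):
    """Approximate N for a one-sided, one-sample t-test.

    Normal approximation plus the correction term z_alpha**2 / 2;
    an approximation, not an exact noncentral-t calculation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value, one-sided
    z_beta = NormalDist().inv_cdf(power)       # quantile for the target power
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2
    return math.ceil(n)

n_original = required_n(0.44)   # effect size of the original study
n_meta = required_n(0.29)       # meta-analytic estimate for pronunciation/naming
per_condition = math.ceil(n_meta / 2)  # split over Conditions 2 and 3
```

Under these assumptions the approximation returns 58 and 131, matching the values in the text, and splitting 131 across the two relatively unconfounded conditions gives 66 per condition.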

Pilot Study III

For the third condition, we selected pictures of ugly and beautiful landscapes as negative and positive primes, respectively, all of them of a size of 512 by 384 pixels, like the pictures used by Spruyt and Hermans (2008). Target words denoted positive and negative traits excluding those that might be readily applied to landscapes (such as beautiful, attractive, etc.). We obtained valence ratings on a 7-point scale for 60 such pictures and 60 such words from N = 10 raters (see ESM 3). We selected two sets of 40 pictures and 40 words for use in the experiment on the basis of these valence ratings and so that positive and negative words were matched in word frequency and in word length. Table 1 shows descriptive statistics for the final picture and word sets. The target words are listed in the appendix; the prime pictures can be obtained from the authors upon request.

Table 1. Descriptive statistics for the materials in the condition with unrelated stimuli (SDs in parentheses)

                        Valence rating                 Word frequency
Stimuli    Valence      Mean           Polarization    Written        Spoken          Word length
Pictures   Negative     1.52 (0.23)    2.49 (0.23)
Pictures   Positive     6.51 (0.16)    2.51 (0.16)
Words      Negative     2.42 (0.54)    1.59 (0.54)     2.02 (0.82)    14.80 (1.89)    7.80 (0.59)
Words      Positive     5.69 (0.48)    1.69 (0.48)     1.94 (0.68)    14.25 (2.07)    8.75 (0.53)

Notes. The norms for spoken language stem from Brysbaert et al. (2011); for written language from the German reference corpus (Kupietz, Belica, Keibel, & Witt, 2010). One negative word was not part of the database for spoken words. For both pictures and words, valence ratings depart significantly from the midpoint for both positive and negative stimuli (all four t > 13.0, p < .001) and positive and negative stimuli do not differ in this degree of polarization (both t < 1).

Method

The procedures closely followed those of Spruyt and Hermans (2008) unless otherwise mentioned. A replication condition provided a close replication of the original study. A condition with reduced prime-target relatedness sampled from the same primes and targets as the original study, but excluded the 233 highly related prime-target pairs identified in Pilot Study I from being sampled. A condition with unrelated stimuli was based on the landscape primes and trait words just described. The stimuli for the 10 practice trials were sampled from the sets of primes and targets not in use for the participant’s condition (excluding highly related prime-target pairs). Departing from the proposal submitted prior to data collection, however, we used the neutral stimuli (with neutral words translated into German) employed in the original experiment for the practice trials of the replication condition. This change was cleared with the action editor prior to data collection (Christian Frings, personal communication, October 23rd, 2014).

Participants

Participants were mostly University-of-Freiburg students with different majors. They received either partial course credit or €2 for participating. A total of 204 participants were sampled (see ESM 4).³ Five of these were excluded as extreme outliers according to Tukey’s criterion (more than three times the interquartile range above the upper or below the lower quartile) in the sample’s distribution of the number of correctly pronounced words (M = 157, SD = 4.67); each had fewer than 141 correct responses. One was excluded as an extreme outlier in the sample’s distribution of correct-response latencies (M = 496 ms, SD = 58) with an average response latency of 728 ms. After these exclusions, there were N = 66 persons per condition for the analyses. Of these 198 participants, demographic data were lost for one participant due to computer problems. Of the remaining 197 participants, 75 were male, 121 female, and one person chose not to respond to the gender question. Participants’ mean age was 22.62 years (SD = 3.93).
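Tukey’s criterion for extreme outliers, as described above, can be sketched as follows. The function name `extreme_outliers` and the example scores are hypothetical, and the quartile definition (the "inclusive" method of Python’s `statistics.quantiles`) is an assumption: the text does not say which quartile estimator was used, and different estimators can shift the fences slightly.

```python
from statistics import quantiles

def extreme_outliers(values, k=3.0):
    """Return values lying more than k IQRs beyond the quartiles.

    k = 3 corresponds to Tukey's 'extreme' (far-out) outliers.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")  # assumed estimator
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Made-up numbers of correct responses (out of 160) for twelve participants.
n_correct = [157, 158, 156, 159, 160, 155, 157, 158, 140, 159, 156, 160]
flagged = extreme_outliers(n_correct)
```

In this made-up sample the quartiles are 156 and 159, so the lower fence is 147 and only the participant with 140 correct responses is flagged.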

³ A number of additional participants were excluded a priori because technical problems occurred during the experimental session (two participants), because they did not complete the experiment (four participants), because they had already participated once in the same experiment (two participants), or because German was not their first language (two participants). One of the last participants to take part in the experiment was excluded because we had already sampled the planned N of 66 per condition for the participant's condition.

Results

Preregistered Analyses

The main analyses (see ESM 5) closely followed Spruyt and Hermans (2008) in terms of outlier criteria and statistical tests. As in Spruyt and Hermans (2008), a few responses coded as correct occurred past the response deadline of 2 s, with response latency set to 2 s by the program, and in the present data, a few such responses occurred with unrealistically short latencies (such as smaller than 50 ms). Comparing the codings of the Spruyt and Hermans (2008) data and the present data, coders were generally more liberal for the Spruyt and Hermans (2008) data in accepting responses that were too late, and coders of the present data were more liberal in accepting responses with short latencies. This suggests that there is some leeway for subjective coder criteria in the coding of such responses. This in turn suggests that it is probably wise to have coders blind to the experimental hypotheses in future studies.

Responses past the response deadline were excluded as in Spruyt and Hermans (2008), and the three authors also agreed to take out responses that occurred 150 ms or sooner after target onset for the present analyses. Note, however, that the results pattern of the preregistered analysis is robust against not taking out such responses or using different cut-off values such as 50 ms or 100 ms. Response latencies that deviated more than 2.5 SDs from a participant’s conditional mean latency were excluded as in Spruyt and Hermans (2008). This led to an exclusion of 1.75% of the trials due to voice-key failures or wrong or missing responses and to an additional exclusion of 1.75% of the remaining correct-response trials as outliers. For the analyses of variance reported below, degrees of freedom are Greenhouse-Geisser corrected.

Table 2. Means and SDs of correct-response latencies and priming effects in ms

                     Condition 1              Condition 2              Condition 3
Block   Statistic    Inc.    Con.    PE       Inc.    Con.    PE       Inc.    Con.    PE
1       M            514.6   513.7    0.9     504.3   503.6    0.8     521.9   516.8    5.1
1       SD            75.3    80.1   23.2      58.2    54.2   24.0      71.1    64.2   30.6
2       M            492.8   498.3   -5.5     483.6   483.2    0.5     486.8   487.4   -0.6
2       SD            64.9    66.1   22.2      46.5    48.0   16.6      50.6    51.8   21.8
3       M            495.5   492.6    2.9     485.2   485.0    0.2     486.4   484.6    1.8
3       SD            61.4    62.2   19.7      50.1    49.6   19.1      57.1    53.14  19.8
4       M            499.5   495.7    3.8     486.3   486.0    0.3     483.4   482.7    0.7
4       SD            65.3    64.8   19.0      50.2    53.3   19.5      48.5    47.9   16.5

Notes. Con. = evaluatively congruent; Inc. = evaluatively incongruent; PE = priming effect. Condition 1 is the replication condition; in Condition 2, highly related prime-target pairs are never presented; in Condition 3, prime-target pairs are less related a priori.

Table 2 presents means and standard deviations of the correct-response latencies as a function of the between-participants factor condition, and within-participants factors block and evaluative congruency. An analysis of variance with these factors revealed a significant main effect of block, F(1.87, 364.81) = 80.09, p < .01, and a significant interaction of block and condition, F(3.74, 364.81) = 4.28, p < .01. There was no main effect of evaluative congruency, F(1, 195) = 1.24, p = .27, nor an interaction of block and evaluative congruency, F(2.87, 560.03) = 1.60, p = .19; all other F ≤ 1.01. The average priming effect was 0.9 ms (SD = 11.4, d = 0.08). A second preregistered analysis of variance with condition and evaluative congruency was conducted just for the first block of trials in which there is no stimulus repetition. Again, the main effect of evaluative congruency was not significant, F(1, 195) = 1.47, p = .23; all other F < 1.
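The fractional degrees of freedom reported above follow from multiplying the uncorrected degrees of freedom by the Greenhouse-Geisser estimate ε. A minimal sketch, with a hypothetical function name and with ε ≈ 0.6236 back-computed by us from the reported values (the article does not report ε itself):

```python
def gg_corrected_df(epsilon, n_levels, n_participants, n_groups):
    """Greenhouse-Geisser corrected df for a within-participants main
    effect in a mixed design with one between-participants factor."""
    df_effect = n_levels - 1                              # uncorrected numerator df
    df_error = df_effect * (n_participants - n_groups)    # uncorrected error df
    return epsilon * df_effect, epsilon * df_error

# Main effect of block: 4 blocks, 198 participants, 3 conditions.
# epsilon is an assumption, recovered as 364.81 / (3 * 195).
df1, df2 = gg_corrected_df(epsilon=0.6236, n_levels=4,
                           n_participants=198, n_groups=3)
```

With these inputs the sketch returns approximately 1.87 and 364.81, the values reported for the main effect of block. The numerator of the block × condition interaction is the same ε multiplied by 3 × 2 = 6 uncorrected degrees of freedom, which gives the reported 3.74.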

Exploratory Analyses

Both the first two authors and the third author, independently of each other, explored the effects of various analytic choices on the results. These analytic choices include the exclusion/inclusion of the block factor, the exclusion/inclusion of different subsets of trials or participants on various grounds, the use of a logarithmic transformation of the raw data, and combinations of these settings. The first two authors also looked at mixed models that include random intercepts and slopes for participants and/or items. Both the first two authors and the third author found that some (combinations of) analytic choices resulted in significant results for the evaluative-priming effect, whereas others did not.

For the sake of brevity, we report only a small subset of these analyses. In particular, we performed a joint analysis (total N = 285) of the present data, the original data (N = 46) collected by Spruyt and Hermans (2008), and the data of a recent study (N = 41; see ESM 6), run at Ghent University, in which the second condition of the present experiment was replicated. This second replication study revealed no reliable evidence for an evaluative priming effect, either across blocks of trials, F < 1, or in the first block of trials, F(1, 40) = 1.10, p = .30.⁴ Across all studies, however, a small but reliable evaluative priming effect emerged.⁵ As can be seen in Table 3, this evaluative priming effect was not significantly moderated by the semantic relatedness of prime-target pairs, the language/lab (location) in which the experiment was run, or the precise study. This data pattern also replicated irrespective of whether (1) the block variable was included in the design or (2) the analyses were restricted to the data of the first block only. Still, in each of these analyses, the overall evaluative priming effect was numerically very small, as were the respective effect sizes (see Table 3). Moreover, when the trials with the most strongly related prime-target pairs identified in Pilot Study I were removed, just as we did for the reanalysis of Spruyt and Hermans (2008), these effects dropped to non-significance, all F < 2.21, all p > .14.

Nevertheless, the overall evaluative priming effect did correlate significantly (r = .36, p < .05) with the extent to which participants deemed the experiment to have important implications (i.e., as captured by a post-experimental questionnaire that was included for exploratory purposes). This observation suggests that the evaluative priming effect in the pronunciation task may depend on person-specific beliefs and expectations, which might be an interesting avenue for further research, not only for researchers working on the evaluative priming effect but also for replication attempts in general. The joint analysis also revealed that the number of coded voicekey failures was significantly smaller in the present study than in the original study by Spruyt and Hermans (2008) and the recent replication attempt run at Ghent University, F(1, 283) = 116.95, p < .001, Δ = 2.84%. This observation suggests that the coding accuracy, coding criteria, and/or voicekey sensitivity settings differed between the two (sets of) studies.

Table 3. Overview of the effects observed in a joint analysis of the present study, the original study reported by Spruyt and Hermans (2008), and a recent replication attempt by the third author (total N = 285)

                         Evaluative congruence         Interaction effects
Analysis                 F        PE        η²p        Study      Location   Relatedness
Block factor included    5.18*    1.44 ms   .018       F < 1.60   F < 1.65   F < 1
Block factor excluded    7.48**   1.68 ms   .026       F < 1      F < 1      F < 1
First block only         4.80*    3.16 ms   .017       F < 1      F < 1      F < 1

Notes. PE = priming effect. *p < .05. **p < .01.
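The effect sizes in Table 3 can be related to the F ratios through the standard identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below is an illustration only. Table 3 does not report the error degrees of freedom, so the value of 283 used here is an assumption, borrowed from the F(1, 283) test reported for the voicekey failures in the same joint analysis. Under that assumption, the three F values for evaluative congruence give η²p values that round to .018, .026, and .017, matching the table.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Assumption: error df taken from the F(1, 283) test reported in the text;
# the table itself does not state the degrees of freedom.
ASSUMED_DF_ERROR = 283

table3_f_values = {
    "Block factor included": 5.18,
    "Block factor excluded": 7.48,
    "First block only": 4.80,
}

for analysis, f_value in table3_f_values.items():
    eta = partial_eta_squared(f_value, 1, ASSUMED_DF_ERROR)
    print(f"{analysis}: eta_p^2 = {eta:.3f}")
```

If the actual error degrees of freedom differ slightly (for example, because additional between-participants factors were included), the recovered values shift only in the third decimal.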

Discussion

There was little evidence for an evaluative-priming effect in pronunciation latencies in the preregistered analyses. This was in particular true for the data from the first block, in which there was no stimulus repetition and in which an evaluative-priming effect of the size reported by Spruyt and Hermans (2008) would therefore have been especially conclusive in its theoretical implications had it occurred. The evidence for small evaluative-priming effects in the exploratory analyses of the joint data appears to hinge on the inclusion of highly semantically related prime-target pairs in the analyses, that is, on the semantic-relatedness confound already discussed.

It is possible that the present small 0.9 ms priming effect (d = 0.08) would turn out significant if an even larger sample of participants were obtained. Note, however, that such a small priming effect would be difficult to interpret theoretically, given that residual confounds of evaluative congruency with associative and semantic relatedness cannot be ruled out even when the most highly semantically related prime-target pairs are taken out of the analyses. For example, ratings of semantic relatedness were still substantially higher for evaluatively congruent prime-target pairs than for incongruent pairs even after the most highly related pairs were taken out (see Pilot Study I). Future work might make an effort to contrast evaluatively congruent and incongruent stimulus pairs equated on ratings of semantic relatedness or might select pairs for which evaluative congruency and semantic relatedness vary orthogonally.

Priming results may differ as a function of the language in which the study is run. For example, one criticism of the null findings reported by Klauer and Musch (2001), and in particular of their failure to replicate the original Bargh et al. (1996) priming effects in the pronunciation task, was that the German language may be orthographically less deep than the English language. Orthographic depth increases with the complexity of the print-to-speech correspondences, and languages differ systematically in orthographic depth. The impact of lexical and semantic variables on pronunciation latencies may be reduced in shallow orthographies, because a simple direct route from print to speech is available (e.g., Frost, Katz, & Bentin, 1987; but see Pagliuca, Arduino, Barca, & Burani, 2008). To address this criticism, Klauer and Musch (2001) also ran a study with English-speaking participants and English stimuli, which did not change the results. Similarly, Spruyt et al. (2004) also failed to replicate the result using native English speakers. One reviewer of the present replication proposal analogously raised the issue that the German language may be less deep orthographically than the Dutch language in which Spruyt and Hermans’ (2008) study was conducted. According to the reviewer, this might make it more difficult to find an effect.

A number of objective quantifications of orthographic depth have been proposed (Schmalz, Marinus, Coltheart, & Castles, 2015). According to a quantification based on the dual route cascaded model of reading (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001), “English is a ‘deep’ orthography, in that it has many rules, and a particularly high percentage of irregular words, while Dutch and German are ‘shallow,’ in that they have few rules and a small proportion of irregular words.” (Schmalz et al., 2015, p. 1620). In this quantification, German and Dutch do not differ substantially in orthographic depth. According to an alternative quantification proposed by Borgwaldt, Hellwig, and de Groot (2005), the German language is even substantially deeper orthographically than the Dutch language. This is consistent with the fact that our participants were about 50 ms slower on average than Spruyt and Hermans’ (2008) participants in naming the targets (see also Footnote 5). Note finally that the failed replication presented in the section on exploratory analyses was run in Dutch.

In closing, let us emphasize once more that the present failures to replicate do not question the evaluative-priming literature on the pronunciation task and the picture-naming task as a whole. Although they raise a concern for the many studies that did not control for semantic relatedness of primes and targets, the present results directly speak only to the paradigm and study by Spruyt and Hermans (2008). Each of the other paradigms reviewed in the introduction needs to be scrutinized separately, and a few attempts to do so are underway (e.g., Becker, Klauer, & Spruyt, 2016).
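The Discussion's remark that the 0.9 ms effect (d = 0.08) might reach significance in a larger sample can be made concrete with a rough sample-size calculation. The sketch below uses the normal approximation for a one-sample (paired) test, n ≈ ((z for 1 − α/2 plus z for the target power) / d)². The two-sided α of .05 and the target power of .80 are illustrative assumptions, not values taken from the article, and the function name is hypothetical. With these inputs the approximation calls for a little over 1,200 participants.

```python
from math import ceil
from statistics import NormalDist


def required_n(d, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample test of effect size d.

    Normal approximation; alpha and power defaults are illustrative
    assumptions rather than values from the article.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(((z_alpha + z_power) / d) ** 2)


# Effect size reported for the preregistered analysis.
print(required_n(0.08))
```

An exact calculation based on the noncentral t distribution would give a marginally larger number, but the order of magnitude is the point here.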

Acknowledgments

Preparation of this article was supported by Methusalem Grant BOF09/01M00209 of Ghent University.

Electronic Supplementary Materials

The electronic supplementary material is available with the online version of the article at http://dx.doi.org/10.1027/1618-3169/a000286

ESM 1. Data (text file). Raw data of pilot study 1.
ESM 2. Data (text file). Raw data of pilot study 2.
ESM 3. Data (text file). Raw data of pilot study 3.
ESM 4. Data (text file). Raw data of the main study.
ESM 5. Script (R file). Analysis script of the main study.
ESM 6. Data (text file). Raw data of the replication study in Ghent.

References

Asendorpf, J. B., Conner, M., De Fruyt, F., De Houwer, J., Denissen, J. J. A., Fiedler, K., . . . Wicherts, J. M. (2013). Recommendations for increasing replicability in psychology. European Journal of Personality, 27, 108–119. doi: 10.1002/per.1919
Bargh, J. A., Chaiken, S., Raymond, P., & Hymes, C. (1996). The automatic evaluation effect: Unconditional automatic attitude activation with a pronunciation task. Journal of Experimental Social Psychology, 32, 104–128. doi: 10.1006/jesp.1996.0005
Becker, M., Klauer, K. C., & Spruyt, A. (2016). Is attention enough? A re-examination of the impact of feature-specific attention allocation on semantic priming effects in the pronunciation task. Attention, Perception, & Psychophysics, 78, 396–402.
Borgwaldt, S. R., Hellwig, F. M., & de Groot, A. M. B. (2005). Onset entropy matters: Letter-to-phoneme mappings in seven languages. Reading and Writing, 18, 211–229. doi: 10.1007/s11145-005-3001-9
Brysbaert, M., Buchmeier, M., Conrad, M., Jacobs, A. M., Bölte, J., & Böhl, A. (2011). The word frequency effect. Experimental Psychology, 58, 412–424. doi: 10.1027/1618-3169/a000123
Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204–256. doi: 10.1037/0033-295X.108.1.204
“Correction to Spruyt, Hermans (2008)”. (2014). Correction to Spruyt and Hermans (2008). Canadian Journal of Experimental Psychology, 68, 132–132. doi: 10.1037/cep0000022
De Houwer, J., Hermans, D., & Spruyt, A. (2001). Affective priming of pronunciation responses: Effects of target degradation. Journal of Experimental Social Psychology, 37, 85–91. doi: 10.1006/jesp.2000.1437
De Houwer, J., & Randell, T. (2004). Robust affective priming effects in a conditional pronunciation task: Evidence for the semantic representation of evaluative information. Cognition & Emotion, 18, 251–264. doi: 10.1080/02699930341000022
De Houwer, J., Teige-Mocigemba, S., Spruyt, A., & Moors, A. (2009). Implicit measures: A normative analysis and review. Psychological Bulletin, 135, 347–368. doi: 10.1037/a0014211
Duckworth, K. L., Bargh, J. A., Garcia, M., & Chaiken, S. (2002). The automatic evaluation of novel stimuli. Psychological Science, 13, 513–519. doi: 10.1111/1467-9280.00490
Everaert, T., Spruyt, A., & De Houwer, J. (2011). On the (un)conditionality of automatic attitude activation: The valence proportion effect. Canadian Journal of Experimental Psychology/Revue canadienne de psychologie expérimentale, 65, 125–132. doi: 10.1037/a0022316
Fazio, R. H., Sanbonmatsu, D. M., Powell, M. C., & Kardes, F. R. (1986). On the automatic activation of attitudes. Journal of Personality and Social Psychology, 50, 229–238. doi: 10.1037/0022-3514.50.2.229
Fiedler, K., Bluemke, M., & Unkelbach, C. (2011). On the adaptive flexibility of evaluative priming. Memory & Cognition, 39, 557–572. doi: 10.3758/s13421-010-0056-x
Frost, R., Katz, L., & Bentin, S. (1987). Strategies for visual word recognition and orthographical depth: A multilingual comparison. Journal of Experimental Psychology: Human Perception and Performance, 13, 104–115. doi: 10.1037/0096-1523.13.1.104
Giner-Sorolla, R., Garcia, M. T., & Bargh, J. A. (1999). The automatic evaluation of pictures. Social Cognition, 17, 76–96. doi: 10.1521/soco.1999.17.1.76
Herring, D. R., White, K. R., Jabeen, L. N., Hinojos, M., Terrazas, G., Reyes, S. M., . . . Crites, S. L. J. (2013). On the automatic activation of attitudes: A quarter century of evaluative priming research. Psychological Bulletin, 139, 1062–1089. doi: 10.1037/a0031309
Kahneman, D. (2003). Experiences of collaborative research. American Psychologist, 58, 723–730. doi: 10.1037/0003-066X.58.9.723
Klauer, K. C., & Musch, J. (2001). Does sunshine prime loyal? Affective priming in the naming task. The Quarterly Journal of Experimental Psychology, 54, 727–751. doi: 10.1080/713755986
Klauer, K. C., & Musch, J. (2003). Affective priming: Findings and theories. In J. Musch & K. C. Klauer (Eds.), The psychology of evaluation: Affective processes in cognition and emotion (pp. 7–49). Mahwah, NJ: Erlbaum.
Kupietz, M., Belica, C., Keibel, H., & Witt, A. (2010). The German Reference Corpus DeReKo: A primordial sample for linguistic research. In N. Calzolari, et al. (Eds.), Proceedings of the seventh international conference on language resources and evaluation (pp. 1848–1854). Valletta, Malta: European Language Resources Association (ELRA).
Neely, J. H. (1991). Semantic priming effects in visual word recognition: A selective review of current findings and theories. In D. Besner & G. W. Humphreys (Eds.), Basic processes in reading: Visual word recognition (pp. 264–336). Hillsdale, NJ: Erlbaum.
Pagliuca, G., Arduino, L. S., Barca, L., & Burani, C. (2008). Fully transparent orthography, yet lexical reading aloud: The lexicality effect in Italian. Language and Cognitive Processes, 23, 422–433. doi: 10.1080/01690960701626036
Pecchinenda, A., Ganteaume, C., & Banse, R. (2006). Investigating the mechanisms underlying affective priming effects using a conditional pronunciation task. Experimental Psychology, 53, 268–274. doi: 10.1027/1618-3169.53.4.268
Schmalz, X., Marinus, E., Coltheart, M., & Castles, A. (2015). Getting to the bottom of orthographic depth. Psychonomic Bulletin & Review, 22, 1614–1629. doi: 10.3758/s13423-015-0835-2
Spruyt, A., De Houwer, J., & Hermans, D. (2009). Modulation of automatic semantic priming by feature-specific attention allocation. Journal of Memory & Language, 61, 37–54. doi: 10.1016/j.jml.2009.03.004
Spruyt, A., & Hermans, D. (2008). Affective priming of naming responses does not depend on stimulus repetition. Canadian Journal of Experimental Psychology, 62, 237–241. doi: 10.1037/1196-1961.62.4.237
Spruyt, A., Hermans, D., De Houwer, J., Vandromme, H., & Eelen, P. (2007). On the nature of the affective priming effect: Effects of stimulus onset asynchrony and congruency proportion in naming and evaluative categorization. Memory and Cognition, 35, 95–106. doi: 10.3758/BF03195946
Spruyt, A., Hermans, D., De Houwer, J., & Eelen, P. (2002). On the nature of the affective priming effect: Affective priming of naming responses. Social Cognition, 20, 227–256. doi: 10.1521/soco.20.3.227.21106
Spruyt, A., Hermans, D., Pandelaere, M., De Houwer, J., & Eelen, P. (2004). On the replicability of the affective priming effect in the pronunciation task. Experimental Psychology (Formerly Zeitschrift für Experimentelle Psychologie), 51, 109–115. doi: 10.1027/1618-3169.51.2.109
Wentura, D., & Frings, C. (2008). Response-bound primes diminish affective priming in the naming task. Cognition & Emotion, 22, 374–384. doi: 10.1080/02699930701446064


Appendix

Target Words in the Condition With Unrelated Stimuli

Negative Words
Müde [tired], langsam [slow], eitel [vain], hämisch [gleeful], bissig [snappy], naiv [naive], berechnend [scheming], gelangweilt [bored], dumm [stupid], launisch [fickle], eifersüchtig [jealous], eingebildet [conceited], unfähig [unable], besessen [obsessed], bestechlich [bribable], süchtig [addicted], neidisch [envious], arrogant [arrogant], gierig [greedy], betrügerisch [fraudulent]

Positive Words
Altruistisch [altruistic], diskret [discreet], raffiniert [refined], sensibel [sensitive], behutsam [gentle], ausdauernd [enduring], lässig [cool], pünktlich [punctual], barmherzig [compassionate], diplomatisch [diplomatic], zuvorkommend [courteous], ausgelassen [exuberant], geschickt [skillful], gütig [benign], belesen [well-read], fleißig [assiduous], aktiv [active], selbstsicher [self-assured], humorvoll [humorous], witzig [funny]

Received August 14, 2014
Revised September 8, 2015
Accepted October 7, 2015
Published online March 29, 2016

Karl Christoph Klauer
Institut für Psychologie
Albert-Ludwigs-Universität Freiburg
79085 Freiburg
Germany
Tel. +49 761 2032469
E-mail christoph.klauer@psychologie.uni-freiburg.de


Instructions to Authors

Experimental Psychology publishes innovative, original, high-quality experimental research. The scope of the journal is defined by experimental methodology and thus papers based on experiments from all areas of psychology are welcome. To name just a few fields and domains of research, Experimental Psychology considers manuscripts reporting experimental work on human learning, memory, perception, action, language, thinking, problem-solving, judgment and decision making, social cognition, and neuropsychological aspects of these topics. Apart from the use of experimental methodology, a primary criterion for publication is that research papers make a substantial contribution to theoretical research questions. For experimental papers that have a mainly applied focus, Experimental Psychology is not the appropriate outlet. A major goal of Experimental Psychology is to provide a particularly fast outlet for such research. Authors usually receive an editorial decision within 6 weeks of manuscript submission. Experimental Psychology publishes the following types of article: Research Articles, Short Research Articles, Theoretical Articles, and Registered Reports. Replication studies should be submitted as a Registered Report.

Manuscript Submission: All manuscripts should in the first instance be submitted electronically at http://www.editorialmanager.com/exppsy. Detailed instructions to authors are provided at http://www.hogrefe.com/periodicals/experimentalpsychology/advice-for-authors/

Copyright Agreement. By submitting an article, the author confirms and guarantees on behalf of him-/herself and any coauthors that he or she holds all copyright in and titles to the submitted contribution, including any figures, photographs, line drawings, plans, maps, sketches, tables, raw data, and other electronic supplementary material, and that the article and its contents do not infringe in any way on the rights of third parties.
ESM and raw data files will be published online as received from the author(s) without any conversion, testing, or reformatting. They will not be checked for typographical errors or functionality. The author indemnifies and holds harmless the publisher from any third party claims. The author agrees, upon acceptance of the article for publication, to transfer to the publisher the exclusive right to reproduce and distribute the article and its contents, both physically and in nonphysical, electronic, and other form, in the journal to which it has been submitted and in other independent publications, with no limits on the number of copies or on the form or the extent of the distribution. These rights are transferred for the duration of copyright as defined by international law. Furthermore, the author transfers to the publisher the following exclusive rights to the article and its contents:

1. The rights to produce advance copies, reprints, or offprints of the article, in full or in part, to undertake or allow translations into other languages, to distribute other forms or modified versions of the article, and to produce and distribute summaries or abstracts.
2. The rights to microfilm and microfiche editions or similar, to the use of the article and its contents in videotext, teletext, and similar systems, to recordings or reproduction using other media, digital or analogue, including electronic, magnetic, and optical media, and in multimedia form, as well as for public broadcasting in radio, television, or other forms of broadcast.
3. The rights to store the article and its content in machine-readable or electronic form on all media (such as computer disks, compact disks, magnetic tape), to store the article and its contents in online databases belonging to the publisher or third parties for viewing or downloading by third parties, and to present or reproduce the article or its contents on visual display screens, monitors, and similar devices, either directly or via data transmission.
4. The rights to reproduce and distribute the article and its contents by all other means, including photomechanical and similar processes (such as photocopying or facsimile), and as part of so-called document delivery services.
5. The right to transfer any or all rights mentioned in this agreement, as well as rights retained by the relevant copyright clearing centers, including royalty rights to third parties.

Online Rights for Journal Articles

Hogrefe will send the corresponding author of each accepted paper free of charge an e-offprint (PDF) of the published version of the paper when it is first released online. This e-offprint is provided exclusively for the author’s personal use, including for sharing with coauthors. Other uses of the e-offprint/published version of record, including but not limited to the following, are not permitted except with the express written permission of the publisher: posting the e-offprint/published version of record to a personal or institutional website or to an institutional or disciplinary repository; changing or modifying the digital file; reproducing, distributing, or licensing the article in whole or in part for commercial use.

February 2016


European Psychologist


Official Organ of the European Federation of Psychologists’ Associations (EFPA) Editor-in-Chief Peter Frensch Humboldt-University of Berlin, Germany

Associate Editors Rainer Banse, Bonn, Germany Ulrike Ehlert, Zurich, Switzerland Katariina Salmela-Aro, Helsinki, Finland

Managing Editor Kristen Lavallee

ISSN-Print 1016-9040 ISSN-Online 1878-531X ISSN-L 1016-9040 4 issues per annum (= 1 volume)

Subscription rates (2016) Libraries / Institutions US $254.00 / € 196.00 Individuals US $125.00 / € 89.00 Postage / Handling US $16.00 / € 12.00

www.hogrefe.com

About the Journal

The European Psychologist is a multidisciplinary journal that serves as the voice of psychology in Europe, seeking to integrate across all specializations in psychology and to provide a general platform for communication and cooperation among psychologists throughout Europe and worldwide. The journal accepts two kinds of contributions:

Original Articles and Reviews: Integrative articles and reviews constitute the core material published in the journal. These state-of-the-art papers cover research trends and developments within psychology, with possible reference to European perceptions or fields of specialization. Empirical articles will be considered only in rare circumstances when they present findings from major multinational, multidisciplinary or longitudinal studies, or present results with markedly wide relevance.

EFPA News and Views are a central source of information on important legal, regulatory, ethical, and administrative matters of interest to members of the European Federation of Psychologists’ Associations (EFPA) and other psychologists working throughout Europe. Such items include: News, reports from congresses or EFPA task forces and member country organizations, policy statements, keynote and award addresses, archival or historical documents with relevance for all European psychologists, and calendars of forthcoming meetings.

Manuscript Submissions
All manuscripts should be submitted online at www.editorialmanager.com/ep, where full instructions to authors are also available.

Electronic Full Text
The full text of the journal – current and past issues (from 1996 onward) – is available online at econtent.hogrefe.com/loi/epp (included in subscription price). A free sample issue is also available there.

Abstracting Services
The journal is abstracted / indexed in Current Contents / Social and Behavioral Sciences (CC / S&BS), Social Sciences Citation Index (SSCI), ISI Alerting Services, Social SciSearch, PsycINFO, PASCAL, PSYNDEX, ERIH, and Scopus.

Impact Factor (Journal Citation Reports®, Thomson Reuters): 2015 = 3.372

Hogrefe OpenMind Open Access Publishing? It’s Your Choice! Your Road to Open Access Authors of papers accepted for publication in any Hogrefe journal can now choose to have their paper published as an open access article as part of the Hogrefe OpenMind program. This means that anyone, anywhere in the world will – without charge – be able to read, search, link, send, and use the article for noncommercial purposes, in accordance with the internationally recognized Creative Commons licensing standards.

The Choice Is Yours 1. Open Access Publication: The final “version of record” of the article is published online with full open access. It is freely available online to anyone in electronic form. (It will also be published in the print version of the journal.) 2. Traditional Publishing Model: Your article is published in the traditional manner, available worldwide to journal subscribers online and in print and to anyone by “pay per view.” Whichever you choose, your article will be peer-reviewed, professionally produced, and published both in print and in electronic versions of the journal. Every article will be given a DOI and registered with CrossRef.

www.hogrefe.com

How Does Hogrefe’s Open Access Program Work? After submission to the journal, your article will undergo exactly the same steps, no matter which publishing option you choose: peer-review, copy-editing, typesetting, data preparation, online reference linking, printing, hosting, and archiving. In the traditional publishing model, the publication process (including all the services that ensure the scientific and formal quality of your paper) is financed via subscriptions to the journal. Open access publication, by contrast, is financed by means of a one-time article fee (€ 2,500 or US $3,000) payable by you the author, or by your research institute or funding body. Once the article has been accepted for publication, it’s your choice – open access publication or the traditional model. We have an open mind!

An innovative and highly effective brief therapy for suicidal patients “ASSIP is perhaps the most significant innovation we have seen in the assessment and treatment of suicidal risk...” David A. Jobes, PhD, Professor of Psychology, The Catholic University of America, Washington, DC, USA Past President, American Association of Suicidology

Konrad Michel / Anja Gysin-Maillart

ASSIP – Attempted Suicide Short Intervention Program A Manual for Clinicians

2015, x + 114 pp. US $59.00 / € 41.95 ISBN 978-0-88937-476-8 Also available as eBook

Attempted suicide is the main risk factor for suicide. The Attempted Suicide Short Intervention Program (ASSIP) described in this manual is an innovative brief therapy that has proven in published clinical trials to be highly effective in reducing the risk of further attempts. ASSIP is the result of the authors’ extensive practical experience in the treatment of suicidal individuals. The emphasis is on the therapeutic alliance with the suicidal patient, based on an initial patient-oriented narrative interview.

www.hogrefe.com

The four therapy sessions are followed by continuing contact with patients by means of regular letters. This clearly structured manual starts with an overview of suicide and suicide prevention, followed by a practical, step-by-step description of this highly structured treatment. It includes numerous checklists, handouts, and standardized letters for use by health professionals in various clinical settings.