

“Music is the art of thinking with sounds.” - Jules Combarieu

Though the experience of emotion in music may feel little different from the experience of emotion in everyday life, the two differ in one fundamental way. Whereas everyday emotions are directed toward real objects, circumstances, and occurrences in our environments, emotion in music arises from mere sequences of abstract sounds. Although these musical emotional experiences have no physical correlates in the environment, their presence is undeniable. The task at hand, then, is to understand how our minds derive and experience significance from inherently meaningless stimuli. Because of their quick and reactive nature, emotions appear at first glance to have little to do with cognition. In reality, cognitive variables are responsible for our ability to perceive emotion in music.

This paper describes the various ways in which cognitive factors create the perception of emotion in music. First, it explains how emotion can be elicited from the structure of a piece through the creation, confirmation, and violation of expectations, and how contextual effects can modulate these expectations during listening. Second, it describes ways in which individual components of a piece can develop, or innately possess, affective meaning, and comments on the role of familiarity in this process. Third, it presents findings on individual differences and uses them to support the assertion that implicit learning is ultimately responsible for the perception of emotion in music. Finally, the paper concludes with the suggestion that affective coherence and embodied affect may account for the expressive power of music as a whole.

One way in which emotion is perceived in music is through structure, specifically through the confirmation or disconfirmation of expectations. In his article “Why Emotions Vary in Intensity,” Clore (1994) suggests that the main cognitive variable influencing how intensely an emotion is experienced is cognitive restructuring. He explains how a decrease in cognitive restructuring could be responsible for one’s adaptation to the natural beauty of a vista: “It may be difficult to maintain the feeling of awe or amazement to the extent that one gets used to the scene and comes to have a clear mental model of it,” he explains, “so that the mere sight of the scene no longer restructures anything” (Clore, 1994, p. 391). Huron (2006) likewise suggests that highly predictable stimuli can lead to reduced attention and lowered arousal. There is empirical evidence that the intensity with which one experiences auditory stimuli subsides with repetition. Putnam and Roth (1990) recorded event-related potentials and measured eye-blink responses to intense auditory stimuli and found that startle responses weakened with increased stimulus repetition. Habituation to such stimuli reflects an ability to anticipate future events and can be likened to a decrease in cognitive restructuring. This offers a convincing explanation of why songs can lose their emotional power.
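
The link between predictability and diminished response can be made concrete with a toy sketch. The code below is purely illustrative and is not drawn from Huron (2006) or any study cited here; the function names, the add-alpha smoothing, and the twelve-pitch-class vocabulary are all assumptions made for the example. A simulated listener estimates note probabilities from running counts, and the Shannon surprisal of a repeated passage falls with each hearing, a rough analogue of habituation.

```python
import math
from collections import Counter

def surprisal(counts: Counter, note: str, alpha: float = 1.0, vocab: int = 12) -> float:
    """Shannon surprisal (-log2 p) of a note under add-alpha smoothed counts."""
    total = sum(counts.values())
    p = (counts[note] + alpha) / (total + alpha * vocab)
    return -math.log2(p)

def mean_surprisal_over_repeats(passage, repeats=4):
    """Average surprisal of each pass through a passage as the listener's counts grow."""
    counts = Counter()
    means = []
    for _ in range(repeats):
        per_note = [surprisal(counts, n) for n in passage]
        means.append(sum(per_note) / len(per_note))
        counts.update(passage)  # exposure: the passage is absorbed into expectations
    return means

passage = ["C", "E", "G", "E", "C"]
m = mean_surprisal_over_repeats(passage)
# Mean surprisal falls with every repetition of the passage.
assert all(m[i] > m[i + 1] for i in range(len(m) - 1))
```

On this toy view, a well-learned passage carries little information and so restructures little, which parallels Clore's vista example.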

In contrast, unanticipated musical events are responsible for intensely experienced emotions in music. Expectation deviation can modulate emotional intensity in everyday situations (Clore, 1994). For example, one would likely feel more admiration for an elderly woman who dives into a river to save a small boy than for a lifeguard who does the same (Clore, 1994). Similarly, in music, stimuli are experienced more intensely when they are unexpected (Huron, 2006). Sloboda (1991) examined self-reports of emotion and their accompanying physical reactions in response to emotional music. He found that the musical structures rated as highly emotional were accompanied by “shivers” and were reliably provoked by sudden changes in harmony. Low-level autonomic reactions can initiate emotional reactions, and subsequent “cognitive appraisal processes act like a sculptor, shaping general affective reactions into specific emotions” (Clore & Ortony, in press, p. 2). In music, these low-level reactions occur in response to violations of expectations. According to Huron (2006), these automatic “reaction responses” are subjected to slower, more controlled “appraisal responses” that can facilitate or inhibit the initial reaction response. Arousal does not always initiate an emotion; however, the subjective experience of arousal has been shown to be a reliable predictor of emotion (Ritossa & Rickard, 2004). Thus, it is likely that arousal is at least sometimes involved in the initiation of an emotion.

Just as the OCC theory suggests that emotions such as “satisfaction” and “relief” can arise from the confirmation of the prospect of an event (Ortony, Clore, & Collins, 1988), so can emotions arise from the confirmation of expectations in music. When an individual accurately predicts a musical stimulus, the stimulus itself is experienced as pleasant due to misattribution effects (Huron, 2006). This resembles the misattribution effect described by Clore (1994), in which misattributed affective feelings have the power to influence judgments. Thus, tones that should otherwise sound neutral become more pleasant when they are expected, relative to neutral tones that are not. Expectations exist in music to prepare individuals to act in appropriate ways: they “allow us to adopt a state of arousal that is better suited to what is likely to happen next…and so increase the speed and accuracy of future perceptions” (Huron, 2006, p. 109).

The idea of affective coherence suggests that agreement among multiple sources of evaluation facilitates cognitive processing and can make an experience more emotionally powerful (Clore & Schnall, in press). Witvliet and Vrana (2007) found that repeated exposure to positive music increased liking of the music and that participants smiled the most during positive, arousing music. Perhaps a theory combining Huron’s and Clore’s ideas of expectations, facilitated processing, and affective coherence can account for these results. When various sources of information agree with each other, “affective certainty” arises (Clore & Schnall, in press). Evidence for cross-modal perception of emotion in music supports this idea. Thompson, Russo, and Quinto (2008) presented participants with happy or sad facial expressions while playing happy or sad chords, and then collected emotional ratings of the chords. When the emotion of the facial expression matched the emotion of the chords, participants rated the chords as much more emotional than when there was no agreement of audio-visual
information. Thus, the agreement of various sources of affective information resulted in the music being perceived as more emotional.

The dynamic context of a musical piece can change an individual’s expectations for the piece in real time. For example, Bigand and Pineau (1997) found that a listener’s expectation of the eighth chord in a sequence depended on the six chords played before it. The establishment of key is important in establishing expectations. Tones and chords from the same key are more likely to co-occur in a Western musical piece, and key changes are likely to occur between certain related keys (Tillmann, Bharucha, & Bigand, 2000). Thus, the same chord may elicit different feelings depending on whether it appears in its own key (and is expected) or in another key (and is unexpected). Themes and motives function by modulating the listener’s expectations in real time. In cross-cultural research with Joy Ollen, Huron found that songs tend to contain patterns of repeating musical passages (Huron, 2006). Repetition of passages allows the listener to experience familiarity and to anticipate future events, while passages that are slightly modified and built upon one another prevent the listener from becoming bored (Huron, 2006). According to Clore, “rhythmic and melodic transformations are only satisfying when a representation of the untransformed theme is cognitively available to be restated and embellished” (1994, p. 392). Therefore, the arrangement of repeated, modified, and new segments into patterns serves as another source of emotion in music.

Conscious knowledge of the social context of a piece may affect emotions as well. Even the timbre of a specific instrument can make it more or less likely that certain events will follow (Huron, 2006). Motives for listening to music, attitudes toward a band or genre, and the environment in which the music is experienced are other examples of how context effects can shape a listener’s experience. Judgment about an emotion can also be modulated in real time through experiencing effects that serve as feedback on one’s own cognitive processes (Clore, 1992). The actual experiences of understanding, feeling uncertain, or developing expectations are examples of information about an individual’s own state that can influence subsequent judgment (Clore, 1992). Wilson, Lisle, Kraft, and Wetzel (1989) found that merely holding expectations of funniness changed how funny a cartoon was judged to be: participants showed more facial features typical of laughter and amusement when they expected a cartoon to be funny, and these same expectations made unfunny cartoons appear funnier (Wilson et al., 1989). Perhaps this feedback of what Clore calls “cognitive feelings” is one of many ways in which expectations influence the judgment of emotion in music.

Inherent liking or disliking of individual components of a piece may also change the overall emotional experience. One reason a stimulus can be liked is mere exposure to it: it is well established that familiar stimuli are responded to more positively than unfamiliar ones. Zajonc (2000) has found that mere exposure to a stimulus can change one’s affective reactions toward that stimulus. Research on music suggests the same. Szpunar, Schellenberg, and Pliner (2004) played sequences of “unfamiliar,” “mechanical,” and “relatively nonmusical” tones for participants and measured their preferences for the tones. With incidental listening, liking ratings increased with exposure. The researchers explained this as misattribution of perceptual fluency to liking of the stimulus (Szpunar et
al., 2004). In other words, the stimulus itself was experienced as enjoyable because of the pleasant feelings arising from the increased ease of processing it.

There is a general consensus that consonant tones sound pleasant and dissonant tones unpleasant. Consonant intervals are easier to process than dissonant ones, are more aesthetically pleasing, and are made up of tones with related harmonics (Trainor & Heinmiller, 1998). In contrast, dissonant intervals have a rough, displeasing sound that arises from two or more simultaneous tones with nonidentical harmonics. Trainor and Heinmiller (1998) found that six-month-old infants and adults share a preference for consonance over dissonance. Consonance predominates in the music of many different cultures (Trainor & Heinmiller, 1998), which may suggest that familiarity effects are responsible for the liking of consonance; alternatively, dissonance may be unpleasant because of interference in the cochlea (Huron, 2006). Further research is needed to determine where preferences for consonance originate.

With regard to harmony perception in Western music, certain chords are found to evoke distinct emotional qualia. Huron (2006) presented various chord progressions to musician and non-musician listeners and asked them to supply adjectives describing the progressions. He categorized the results into words relating to tendency, words relating to valence, and words fitting neither category. He concluded that certain chord progressions sound especially emotional because they occur less frequently in music (Huron, 2006). In addition, most major chords were associated with a tendency to move forward, while most minor chords were associated with a more restful state, and valence was the factor that differentiated major from minor chords. Thus, Huron’s work suggests that frequency of occurrence, tendency, and valence are three factors affecting the emotionality of music. It is interesting to note that Huron’s factors are not conceptually distant from the concepts of liking, likelihood, and familiarity that Clore (1994) believes modulate emotional intensity.

Emotional responses to music are thought to be only weakly affected by individual differences related to culture and musical expertise, a fact that supports the role of implicit learning in the perception of emotion in music. The definition of an emotion as an affective “state” implies that it is elicited in response to situations. Thus, if these situations are similar across cultures, we should expect to see the same emotions arising from them. Oishi, Diener, Scollon, and Biswas-Diener (2004) found that affective experiences show a significant degree of cross-situational consistency within various cultures. The same study also found that cultural differences exist as a result of the effects of specific contexts in various cultures (Oishi et al., 2004). Thus, while the same situations elicited the same emotions across cultures, the extent to which those emotions were experienced varied. Krumhansl et al. (1999) examined the notion that melodic expectations are universal by presenting music from various cultures and comparing the responses of a musical expert from each culture with those of a Western musician unfamiliar with the music. The results showed cross-cultural similarities in the use of expectation in music, but differences in how each culture shapes expectations; the similarities are hypothesized to have developed out of implicit learning. Thus, in both studies, the same emotion-eliciting conditions were found across cultures, but cultural differences produced slight differences in the relative intensities of those experiences. These cultural differences were explicit and therefore did not affect the
development of expectations.

Musical expertise likewise makes only a small difference in how well one is able to perceive emotion in music. Bigand, Vieillard, Madurell, Marozeau, and Dacquet (2005) demonstrated only a slight advantage in processing emotional stimuli for musical experts over non-experts; in general, there was strong consistency in the emotions that experts and non-experts experienced, and, importantly, emotional responses to music did not depend on the amount of musical training. Both the studies on culture and those on expertise support the role of implicit learning in the perception of emotion in music. Perception of emotion cannot develop without some past exposure to the experience, because “the intensity of the experience would presumably be tied to some implicit comparison of that view with something else, such as what one expected, the scene just beforehand, other [instances], and so on” (Clore, 1994, p. 391).

Individuals are thought to develop their ability to perceive emotion through implicit learning and mere exposure to music. Implicit learning is the process by which expectations are formed, statistical regularities are learned, and exposure effects develop. Peretz, Gagnon, and Bouchard (1998) studied the case of a woman, I.R., who was able to interpret the emotional content of songs accurately despite a complete loss of the ability to recognize music. I.R. provides support for the idea that implicit learning is responsible for the development of the mechanisms that allow us to enjoy music. A study by Cross, Halcomb, and Matter (1967) demonstrates that mere exposure to music can affect subsequent affective reactions toward similar music: rats exposed to a Mozart composition for four weeks subsequently preferred a different Mozart composition over never-before-heard music. The first interesting point about this experiment is that the exposure effects persisted without any explicit awareness of them. The second is that the exposure effects generalized to a separate Mozart piece, most likely by virtue of its similarity to the first.

Tillmann et al. (2000) developed a model of the process by which tonal music is acquired. Western tonal music is a highly structured system to which individuals are exposed on a daily basis, often without explicit awareness of it (Tillmann et al., 2000). Individuals come to learn implicitly which events co-occur and with what frequency. More frequently occurring tones are established as more stable; when the more stable tones are activated, the tones linked with them are activated as well and in turn become more stable themselves (Tillmann et al., 2000). In this way, tonal relations emerge continually through spreading activation and are characteristic of implicit learning rather than stored explicit knowledge. Indeed, implicit learning is often able to take place only during incidental listening. For example, Szpunar et al. (2004) showed that focused listening prevents the mere exposure effect from taking place, and that subsequent liking ratings are then independent of exposure. Lastly, the process of statistical learning by which qualia come to be associated with chords is thought to be mediated by implicit learning as well (Huron, 2006).

While it is clear that a number of different cognitive factors are involved in the perception of expression in music, perhaps the notion of embodied affect suggested by Clore and Schnall (in press) explains why music can be emotionally powerful across situations, contexts, and individuals. The notion of embodied affect would suggest that the various components of music, such as tempo, pitch, harmony, rhythm, dynamics, and timbre, serve as multiple sources of evaluative information that
mesh together to create one emotionally powerful experience. In fact, these various properties are not experienced as separate entities by the listener: “The brain’s disposition is to assemble an integrated ‘package’…[and music] is experienced as a unified entity rather than a smorgasbord of disparate properties” (Huron, 2006, p. 124). Similarly, it is not possible to divorce the emotional component of a sound from the sound itself, because the emotion emerges from the combination of individual components. On its own, each feature is a poor expresser of emotion, “but the larger the number of cues used, the more reliable the communication” (Juslin & Laukka, 2004, p. 220). The beauty of music lies in its continued ability to communicate such emotion. In the end, even a thorough understanding of the cognitive mechanisms that allow us to perceive musical emotion is not enough to keep us from continuing to surrender to its power.
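
The spreading-activation account of tonal learning described above can be sketched in code. This is a minimal illustration of the general idea only, not Tillmann, Bharucha, and Bigand's (2000) actual self-organizing network; the toy corpus, function names, and the simple decay parameter are all assumptions made for the example. Tones that co-occur within pieces acquire weighted links, and activating one tone spreads activation to its frequent companions, which on the model's account become more stable.

```python
from collections import Counter
from itertools import combinations

def learn_cooccurrence(pieces):
    """Count tone frequencies and within-piece tone co-occurrences."""
    tone_counts = Counter()
    link_counts = Counter()
    for piece in pieces:
        tone_counts.update(piece)
        for pair in combinations(sorted(set(piece)), 2):
            link_counts[pair] += 1
    return tone_counts, link_counts

def spread_activation(seed, tone_counts, link_counts, decay=0.5):
    """Activate a seed tone fully, then spread a decayed share along learned links."""
    total_links = sum(link_counts.values())
    activation = {tone: 0.0 for tone in tone_counts}
    activation[seed] = 1.0
    for (a, b), n in link_counts.items():
        weight = n / total_links
        if a == seed:
            activation[b] += decay * weight
        elif b == seed:
            activation[a] += decay * weight
    return activation

# Hypothetical toy corpus: C and G co-occur in three of the four "pieces".
corpus = [["C", "E", "G"], ["C", "G", "A"], ["C", "F", "G"], ["D", "G", "B"]]
tones, links = learn_cooccurrence(corpus)
activation = spread_activation("C", tones, links)
# G, a frequent companion of C, receives more spread activation than B does.
assert activation["G"] > activation["B"]
```

Nothing here is stored as explicit knowledge of key; the graded activations simply fall out of accumulated co-occurrence counts, which is the sense in which the learning is implicit.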

Works Cited

Bigand, E., & Pineau, M. (1997). Global context effects on musical expectancy. Perception & Psychophysics, 59(7), 1098-1107.

Bigand, E., Vieillard, S., Madurell, F., Marozeau, J., & Dacquet, A. (2005). Multidimensional scaling of emotional responses to music: The effect of musical expertise and of the duration of the excerpts. Cognition and Emotion, 19(8), 1113-1139.

Clore, G. L. (1992). Cognitive phenomenology: Feelings and the construction of judgment. In L. L. Martin & A. Tesser (Eds.), The Construction of Social Judgments (pp. 133-163). Hillsdale, NJ: Lawrence Erlbaum Associates.

Clore, G. L. (1994). Why emotions vary in intensity. In P. Ekman & R. J. Davidson (Eds.), The Nature of Emotion: Fundamental Questions (pp. 386-393). New York: Oxford University Press.

Clore, G. L., & Ortony, A. (2000). Cognition in emotion: Always, sometimes, or never? In R. D. Lane & L. Nadel (Eds.), Cognitive Neuroscience of Emotion (pp. 24-61). New York: Oxford University Press.

Clore, G. L., & Ortony, A. (in press). Appraisal theories: How cognition shapes affect into emotion. In M. Lewis (Ed.), Handbook of Emotions (3rd ed.). New York: Guilford Press.

Clore, G. L., & Schnall, S. (in press). Affective coherence: Affect as embodied evidence in attitude, advertising, and art. In G. R. Semin & E. R. Smith (Eds.), Embodied Grounding: Social, Cognitive, Affective, and Neuroscientific Approaches. New York: Cambridge University Press.

Cross, H. A., Halcomb, C. G., & Matter, W. W. (1967). Imprinting or exposure learning in rats given early auditory stimulation. Psychonomic Science, 7, 233-234.

Huron, D. B. (2006). Sweet Anticipation: Music and the Psychology of Expectation. Cambridge, MA: MIT Press.

Juslin, P. N., & Laukka, P. (2004). Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening. Journal of New Music Research, 33(3), 217-238.

Krumhansl, C. L., Louhivuori, J., Toiviainen, P., Järvinen, T., & Eerola, T. (1999). Melodic expectation in Finnish spiritual folk hymns: Convergence of statistical, behavioral, and computational approaches. Music Perception, 17, 151-195.

Oishi, S., Diener, E., Scollon, C. N., & Biswas-Diener, R. (2004). Cross-situational consistency of affective experiences across cultures. Journal of Personality and Social Psychology, 86(3), 460-472.

Ortony, A., Clore, G. L., & Collins, A. (1988). The Cognitive Structure of Emotions. New York: Cambridge University Press.

Peretz, I., Gagnon, L., & Bouchard, B. (1998). Music and emotion: Perceptual determinants, immediacy, and isolation after brain damage. Cognition, 68, 111-141.

Putnam, L. E., & Roth, W. T. (1990). Effects of stimulus repetition, duration, and rise time on startle blink and automatically elicited P300. Psychophysiology, 27(3), 275-297.

Ritossa, D. A., & Rickard, N. S. (2004). The relative utility of ‘pleasantness’ and ‘liking’ dimensions in predicting the emotions expressed by music. Psychology of Music, 32(1), 5-22.

Schwarz, N., & Clore, G. L. (1983). Mood, misattribution, and judgments of well-being: Informative and directive functions of affective states. Journal of Personality and Social Psychology, 45(3), 513-523.

Sloboda, J. A. (1991). Music structure and emotional response: Some empirical findings. Psychology of Music, 19, 110-120.

Szpunar, K. K., Schellenberg, E. G., & Pliner, P. (2004). Liking and memory for musical stimuli as a function of exposure. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(2), 370-381.

Thompson, W. F., Russo, F. A., & Quinto, L. (2008). Audio-visual integration of emotional cues in song. Cognition and Emotion, 1-14.

Tillmann, B., Bharucha, J. J., & Bigand, E. (2000). Implicit learning of regularities in Western tonal music by self-organization. Psychological Review, 107(4), 885-913.

Trainor, L. J., & Heinmiller, B. M. (1998). The development of evaluative responses to music: Infants prefer to listen to consonance over dissonance. Infant Behavior and Development, 21(1), 77-88.

Wilson, T. D., Lisle, D. J., Kraft, D., & Wetzel, C. G. (1989). Preferences as expectation-driven inferences: Effects of affective expectations on affective experience. Journal of Personality and Social Psychology, 56(4), 519-530.

Witvliet, C. V. O., & Vrana, S. R. (2007). Play it again Sam: Repeated exposure to emotionally evocative music polarizes liking and smiling responses, and influences other affective reports, facial EMG, and heart rate. Cognition and Emotion, 21(1), 3-25.

Zajonc, R. B. (2000). Feeling and thinking: Closing the debate over the independence of affect. In J. P. Forgas (Ed.), Feeling and Thinking: The Role of Affect in Social Cognition (pp. 31-58). New York: Cambridge University Press.

The Role of Cognition in the Perception of Emotion in Music  
