Toward Affective Brain-Computer Interfaces
Exploring the Neurophysiology of Affect during Human Media Interaction

Christian Mühl

PhD dissertation committee:

Chairman and Secretary:
Prof. dr. ir. A. J. Mouthaan, Universiteit Twente, NL

Promotors:
Prof. dr. ir. A. Nijholt, Universiteit Twente, NL
Prof. dr. D. K. J. Heylen, Universiteit Twente, NL

Members:
Prof. dr. ing. W. B. Verwey, Universiteit Twente, NL
Prof. dr. F. van der Velde, Universiteit Twente, NL
Prof. dr. ir. P. W. M. Desain, Radboud Universiteit Nijmegen, NL
Prof. dr. T. Pun, University of Geneva, CH

Paranymphs:
Dr. A. Niculescu
H. Gürkök

This book is part of the CTIT Dissertation Series No. 12-2210, Center for Telematics and Information Technology (CTIT), P.O. Box 217, 7500 AE Enschede, the Netherlands. ISSN: 1381-3617.

The authors gratefully acknowledge the support of the BrainGain Smart Mix Programme of the Netherlands Ministry of Economic Affairs and the Netherlands Ministry of Education, Culture, and Science. The research reported in this thesis has been carried out at the Human Media Interaction research group of the University of Twente.

This book is part of the SIKS Dissertation Series No. 2012-23. The research reported in this thesis has been carried out under the auspices of SIKS, the Dutch Research School for Information and Knowledge Systems.

© 2012 Christian Mühl, Enschede, The Netherlands
Cover image by Lisa Schlosser, Münster, Germany
ISBN: 978-90365-3362-1
ISSN: 1381-3617, No. 12-2210

DISSERTATION

to obtain the degree of doctor at the University of Twente, on the authority of the rector magnificus, prof. dr. H. Brinksma, on account of the decision of the graduation committee, to be publicly defended on Friday, June 1st, 2012 at 16.45 by Christian Mühl, born on January 26, 1979 in Cottbus, Germany.

Promotors: Prof. dr. ir. A. Nijholt and Prof. dr. D. K. J. Heylen

Acknowledgements

Completing this book concludes a journey from east to west. More exactly, it concludes a part of my life that more or less started in early August 2007 at a small coffee bar overlooking the Bosphorus Bridge connecting Europe and Asia. In a spontaneous escape from the work on my master's thesis, I participated in a month-long summer workshop on creative and scientific approaches to using neuroscientific data. For most of the time we worked in a dark and cool basement, deciphering the output of a functioning, but rather old EEG system that we were allowed to use by courtesy of Boğaziçi University. But at times we would walk to the coffee bar on the bank of the Bosphorus, which would be one of these great 'academic' experiences: sharing tea, laughter, and sun, talking about the prospects of our work, and dreaming of possible applications. In the final hours of our final day, we finally finalized my first brain-computer interface: a ball controlled on a 2D plane with the neurophysiological signals acquired from the old EEG system. We shouted, joked, and there might have been one or two hot tears of relief shed on the cool laboratory floor. After so much time spent with the system, our feelings might have been comparable to those of proud parents witnessing the first steps of their child. The working system and our small final experiment led to a part in the workshop report and a workshop poster, which were nice but did not really have an exceptional impact on my life. What did have that impact, though, was the time overlooking the Bosphorus Bridge and the experiences I shared during that time: I decided to continue working on brain-computer interfaces. Therefore, my first thanks go to the initiators of the eNTERFACE workshop series, who made this extraordinary and motivating encounter with a group of great people and dedicated researchers possible.
Next, everything happened at once (at least that is how I remember it): I came back to Germany, heard about the Ph.D. position in Enschede, applied, and was almost immediately notified that I could start as soon as possible. Thus, I finished my master's thesis and started to prepare for the new job. I was welcomed warmly by my future colleagues of the HMI group, and though I never really became a Dutchman, they treated me nicely and dealt with my German-accented Dutch. Especially the charming secretaries Charlotte Bijron and Alice Vissers-Schotmeijer were always patient with me and helped regardless of how awfully I formulated my concerns and wishes in Dutch. I believe that they are responsible for the fact that I am today proudly able to speak with so many different people (cashiers, nurses, craftsmen, bus drivers, and train conductors) in Dutch, no matter how wrongly (or how quickly they start to reply in English or even German). Similarly, I am thankful for the technical support I got from Hendri Hondorp, who put up with my wishes and requests (RAM, beamers, notebooks, data and code repositories ... you name it!), thereby saving some lectures and experiments, and especially preventing hours of futile work. I owe special thanks to Lynn Packwood for her dietary, orthographical, and grammatical support. Not only did she feed me and all of HMI delicious cakes and cookies for dutifully filling in the TAS records each and every month, but she also corrected the whole of this thesis in a pedagogically humorous and collegially patient manner. Thus my life in the Netherlands took its start. I got to know my colleagues and had great experiences, learning above all that there is, in the end, no huge difference between German and Dutch culture, except for the more liberal stance the Dutch mostly adopt toward others. Conveniently, that is also true of professional life, which gave me a lot of freedom to explore the scientific topics roughly associated with my Ph.D.
assignment. That I should, only a year later, almost despair of that freedom is another story, which, as this book shows, had a good ending. That I made it through the desperate times as well as through the very good times of my thesis is to a great part the work of my supervisor Dirk Heylen and of my promotor Anton Nijholt. Both kept cool when I was close to a nervous breakdown and handed me the end of a new thread when I was lost. Their insight and experience helped me to see some of the bigger questions while I was fretting over the details of my statistical analysis. And their efforts allowed me to meet and cooperate with great people from all around the world. Their gracious support allowed me to have the experiences that I find most rewarding in the realm of science: exchanging ideas and thoughts with others and collaborating on common projects. In that regard, I am also thankful to the initiators of the BrainGain project and all those involved in getting this huge project started. It was not only a great opportunity to meet people who work on closely related topics or with similar methods, it was also the reason why I was able to share most of my four years of research experience with a group of like-minded individuals in the cozy ivory tower of BCI, instead of isolated in a den in those gray mountains surrounding the tower. I would like to thank Boris Reuderink, Danny Plass-Oude Bos, Hayrettin Gürkök, Bram van de Laar, Femke Nijboer, Egon van den Broek, and Mannes Poel for the many great ideas they shared with me, for all the complaints and hearty criticism they endured, and for the endurance they showed in our common projects. I want to emphasize my thanks for Danny's, Hayrettin's, and Boris' support in the Bacteria Hunt project and the ensuing studies. Not only was the cooperation – especially that hard-core month in Genova – a great experience, maybe the best of my work at HMI, but it is also an important part of my thesis.
In a similar vein, I would like to thank those colleagues who were not part of HMI, but with whom I had the pleasure to collaborate during my time there, especially Anne-Marie Brouwer, Mohammad Soleymani, Lasse Scherffig, and Sander Koelstra. It was really a pleasure, and I was lucky to be able to discuss my ideas and doubts with people interested in the same subject. Of course, there would be no such thing as HMI without the many colleagues who devote their lives to the advancement of human-computer interaction studies and practices. There is not enough room here to mention each and every one of the small and big encounters that constitute the pleasantries of life and work at HMI. However, I would like to mention a few of those colleagues with whom I developed a special rapport over the last four years. Firstly, there are Dennis Reidsma and Boris Reuderink, with whom I had the pleasure of sharing an office. Both were not only good-humored and supportive room-mates, but also always open to discussions, musings, and coffee breaks. From both I learned a lot about life, work, and myself. For both I harbor a deep respect. Secondly, there are Hayrettin Gürkök and Andreea Niculescu, who were both ready to take on the responsibility of being my paranymphs. My choice for them was well-informed, based on many shared experiences, both private and professional. Andreea put up with me as her flat-mate for almost three years, and I can't express how grateful I am for the social and safe home she gave me – a place that I always liked to return to, even in the darkest evenings of my soul. Thirdly, and consistent with the "rule of three", I would like to thank all the other colleagues who made my time at HMI an emotionally rich and unforgettable experience. Finally, I am deeply indebted to my family. Their invaluable dedication and support raised me into the many-layered, but mostly hedonic and harmony-seeking person I am.
Some day I will figure out how to deal with that and work. I also want to mention my sister Sarah, who is, despite the distance in age and residence, a very important and loved part of my life. Oh, I almost forgot Lisa! Who's Lisa? Lisa is not only a great, up-and-coming artist responsible for the cover of this thesis, but also a loving partner, a supportive friend, and simply fun to be with. There are not enough words of thanks and gratitude in the world to show my gratefulness for her steady, unwavering support and the great socio-psychological care during the last four years. I am looking forward to many more travels with you. Let's hope they are less stressful...

Christian Mühl, May 2012, Münster

Contents

Abstract

1 Introduction
  1.1 Motivation
  1.2 Affect - Terms and Models
    1.2.1 Basic Emotion Models
    1.2.2 Dimensional Feeling Models
    1.2.3 Appraisal Models
  1.3 Affect, Human-Media Interaction, and Affect Sensing
  1.4 Affective Brain-Computer Interfaces
    1.4.1 Parts of the Affective BCI
    1.4.2 The Different aBCI Approaches
  1.5 Physiological and Neurophysiological Measurements of Affect
    1.5.1 The Peripheral Nervous System (PNS)
    1.5.2 The Central Nervous System (CNS)
  1.6 Research Questions
    1.6.1 Evaluating Affective Interaction
    1.6.2 Inducing Affect by Music Videos
    1.6.3 Dissecting Affective Responses
  1.7 Thesis Structure
2 Evaluating Affective Interaction
  2.1 Introduction
    2.1.1 From Proof-of-Concept to hybrid BCI Games
    2.1.2 Relaxation in the Biocybernetic Loop
    2.1.3 Alpha Activity as Indicator of Relaxation
    2.1.4 Hypotheses
  2.2 Methods: System, Study design, and Analysis
    2.2.1 Game Design and In-game EEG Processing Pipeline
    2.2.2 EEG Processing
    2.2.3 Participants
    2.2.4 Data acquisition and material
    2.2.5 Experiment design
    2.2.6 Procedure
    2.2.7 Data analysis
  2.3 Results: Ratings, Physiology, and EEG
    2.3.1 Subjective Game Experience
    2.3.2 Physiological Measures
    2.3.3 EEG Alpha Power
  2.4 Discussion: Issues for Affective Interaction
    2.4.1 Indicator-suitability
    2.4.2 Self-Induction of Relaxation
    2.4.3 Feedback Saliency
  2.5 Conclusion
3 Inducing Affect by Music Videos
  3.1 Introduction
    3.1.1 Eliciting Emotions by Musical Stimulation
    3.1.2 Enhancing Musical Affect Induction
    3.1.3 Hypotheses
  3.2 Methods: Stimuli, Study design, and Analysis
    3.2.1 Participants
    3.2.2 Stimuli Selection and Preparation
    3.2.3 Apparatus
    3.2.4 Design and Procedure
    3.2.5 Data Processing
    3.2.6 Statistical Analysis
  3.3 Results: Ratings, Physiology, and EEG
    3.3.1 Analysis of subjective ratings
    3.3.2 Physiological measures
    3.3.3 Correlates of EEG and Ratings
  3.4 Discussion: Validation, Correlates, and Limitations
    3.4.1 Validity of the Affect Induction Protocol
    3.4.2 Classification of Affective States
    3.4.3 Neurophysiological Correlates of Affect
    3.4.4 Emotion Induction with Music Clips: Limitations
  3.5 Conclusion

4 Dissecting Affective Responses
  4.1 Introduction
    4.1.1 Cognitive Processes as Part of Affective Responses
    4.1.2 Modality-specific Processing in the EEG
    4.1.3 Hypotheses
  4.2 Methods: Stimuli, Study Design, and Analysis
    4.2.1 Participants
    4.2.2 Apparatus
    4.2.3 Stimuli
    4.2.4 Design and Procedure
    4.2.5 Data Processing and Analysis
  4.3 Results: Ratings, Physiology, and EEG
    4.3.1 Subjective Ratings
    4.3.2 Heart Rate
    4.3.3 Modality-specific Correlates of Affect
    4.3.4 General Correlates of Valence, Arousal, and Modality
  4.4 Discussion: Context-specific and General Correlates
    4.4.1 Validity of the Affect Induction Protocol
    4.4.2 Modality-specific Affective Responses
    4.4.3 General Correlates of Valence, Arousal, and Modality
    4.4.4 Relevance to Affective Brain-Computer Interfaces
    4.4.5 Limitations and Future Directions
  4.5 Conclusion

5 Conclusion
  5.1 Synopsis
    5.1.1 Evaluating Affective Interaction
    5.1.2 Inducing Affect by Music Videos
    5.1.3 Dissecting Affective Responses
  5.2 Remaining Issues and Future Directions
    5.2.1 Mechanisms and Correlates of Affective Responses
    5.2.2 Context-specific correlates of affect
    5.2.3 Affect Induction and Validation in the Context of HMI
  5.3 Contributions

A Supplementary Material Chapter 3
B Supplementary Material Chapter 4
Bibliography
SIKS Dissertation Series

Abstract

Affective Brain-Computer Interfaces (aBCI), the sensing of emotions from brain activity, seem like a fantasy from the realm of science fiction. But unlike faster-than-light travel or teleportation, aBCI seems almost within reach due to novel sensor technologies, the advancement of neuroscience, and the refinement of machine learning techniques. However, as with so many novel technologies before it, the challenges for aBCI become more obvious as we get closer to the seemingly tangible goal. One of the primary challenges on the road toward aBCI is the identification of neurophysiological signals that can reliably differentiate affective states in the complex, multimodal environments in which aBCIs are supposed to work. Specifically, we are concerned with the nature of affect during the interaction with media, such as computer games, music videos, pictures, and sounds.
In this thesis we present three studies, in which we employ a variety of methods to shed light on the neurophysiology of affect in the context of human media interaction as measured by electroencephalography (EEG): we evaluate active affective human-computer interaction (HCI) using a neurophysiological indicator identified from the literature, we explore correlates of affect induced by natural, multimodal media stimuli, and we study the separability of complex affective responses into context-specific and general components. In the first study, we aimed at the implementation of a hybrid aBCI game, allowing the user to play a game via conventional keyboard control and, in addition, to use relaxation to influence a game parameter, opponent speed, that is directly related to game difficulty. We selected the neurophysiological indicator, parietal alpha power, based on literature suggesting an inverse relationship between alpha power and arousal. To evaluate whether users could control their state of relaxation during multimodal, natural computer gaming, we manipulated the feedback of the indicator, replacing it with sham feedback. We expected the feedback manipulation to yield differences in the subjective gaming experience, in physiological indicators of arousal, and in the neurophysiological indicator itself. However, none of the indicators, subjective or objective, showed effects of the feedback manipulation, indicating the inability of the users to control the game by their affective state. We discuss problematic issues for the application of neurophysiological indicators during multimodal, natural HCI, such as the transferability of neurophysiological findings from rather restricted laboratory studies to an applied context, the capability of users to learn and apply the skill of relaxation in a complex and challenging environment, and the provision of salient feedback on the user's state during active and artifact-prone HCI.
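The biocybernetic loop of such a game, extracting parietal alpha power from an EEG epoch and mapping it inversely to opponent speed, can be sketched as follows. This is a minimal illustration under assumed parameters, not the implementation used in the study: the function names (`bandpower`, `opponent_speed`), the 8-12 Hz band edges, the baseline normalization, and the clipped linear speed mapping are all assumptions for the example.

```python
import numpy as np

def bandpower(signal, fs, lo, hi):
    """Average power in the [lo, hi] Hz band, estimated from the FFT magnitude."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    band = (freqs >= lo) & (freqs <= hi)
    return psd[band].mean()

def opponent_speed(alpha_power, baseline, v_min=0.2, v_max=1.0):
    """Map relative alpha power to opponent speed: more alpha (relaxation)
    means slower opponents. Clipped linear mapping; the gain is arbitrary."""
    relax = np.clip(alpha_power / baseline - 1.0, 0.0, 1.0)
    return v_max - relax * (v_max - v_min)

# Synthetic 2-second 'parietal' epoch at 256 Hz with a strong 10 Hz component,
# standing in for real EEG during a relaxed game episode.
fs = 256
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)
epoch = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

alpha = bandpower(epoch, fs, 8, 12)   # dominated by the 10 Hz rhythm
beta = bandpower(epoch, fs, 13, 30)   # essentially noise power
```

In a real system the epoch would come from parietal electrodes and the baseline from a resting-state recording; the sham-feedback condition in the study corresponds to feeding `opponent_speed` a signal unrelated to the player's actual alpha power.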
For the identification of reliable neurophysiological indicators of affect, strong affect induction protocols are a necessary prerequisite. In the second study, we investigated the use of multimodal, natural stimuli for the elicitation of affective responses in the context of human media interaction. We used excerpts from music videos, partially selected based on their affective tags in a web-based music-streaming service, to manipulate the participants' affective experience along the dimensions of valence and arousal. Effects on the participants' subjective experience and physiological responses validated the thus devised affect induction protocol. We explored the neurophysiological correlates of the valence and arousal experience. The valence manipulation yielded strong responses in the theta, beta, and gamma bands. The arousal manipulation showed effects on single electrodes in the theta, alpha, and beta bands, which, however, did not reach overall significance. Furthermore, we found that general subjective preference for the music videos was related to the power in the beta and gamma bands. These frequency bands are partially consistent with the literature on affective responses, and hence seem reliable features for aBCIs. The correlates of subjective preference make implicit tagging of personal media preference from neurophysiological activity a potential aBCI application. Multimodal and natural media, such as music videos, are viable stimuli for affect induction protocols. The neurophysiological correlates observed can be a basis for the training of aBCIs. The neurophysiological correlates of affective responses toward media stimuli suggest the involvement of diverse mechanisms of affect. Due to its poor spatial resolution, EEG is limited in separating the parts constituting the affective response. However, a separation into general and context-specific correlates might be of interest for aBCI.
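The search for correlates described above amounts to correlating per-trial band power with subjective ratings. The toy sketch below shows one way to do this with a Spearman rank correlation (here implemented without tie correction); it assumes band power values have already been extracted per trial, and the data are deterministic, purely illustrative numbers, not results from the study.

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation (no tie correction): Pearson on the ranks."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Toy per-trial data: 9 valence ratings (1-9 scale) and a gamma band power
# value constructed to grow monotonically with the rating.
ratings = np.array([2.0, 5.0, 3.0, 8.0, 6.0, 1.0, 9.0, 4.0, 7.0])
gamma_power = 0.4 * ratings + 0.05 * np.sin(ratings)

rho = spearman(gamma_power, ratings)  # perfectly monotone relation
```

With many trials per participant, such per-electrode, per-band correlations (with appropriate multiple-comparison correction) are one common way to screen for candidate aBCI features.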
In the third study, we aimed at the creation of a multimodal affect induction protocol that would enable the separation of modality-specific and general parts of the affective response. We induced affect with auditory, visual, and combined stimuli and validated the protocol's efficacy by subjective ratings and physiological measurements. We found systematic variations of alpha power for the affective responses induced by visual and auditory stimuli, supposedly indicating the activity of the underlying visual and auditory cortices. These observations are in line with theories of general sensory processing, suggesting that modality-specific affective responses are of a rather cognitive nature, reflecting the sensory orientation toward emotionally salient stimuli. Such context-dependent correlates have consequences for the reliability of EEG-based affect detection. On the other hand, they carry information about the context in which the affective response occurred, that is, about the event it originated from. Moreover, we revealed more general responses to the valence and arousal manipulations. For valence we could replicate the previously found beta and gamma band effects. For arousal the delta and theta bands showed effects. Summarizing, the transfer of neurophysiological indicators of affect from restricted laboratory studies to complex real-life environments, such as those found during computer gaming or multimodal media consumption, is limited. We have suggested and validated methods that enable the identification of neurophysiological correlates of affect during the consumption of multimodal media, a prerequisite for the development of aBCIs. We have shown that, given adequate affect induction procedures, context-specific and general components of the affective response can be separated. The complexity of the brain is the challenge, but also the promise of aBCI, potentially enabling a detection of both the nature and the context of the affective state.
Chapter 1 Introduction

1.1 Motivation

Emotions are a vital part of human existence. They influence our behavior and play a considerable role in our interactions with our environment. Being central in human-to-human interaction, affect is supposed to be relevant in human-computer interaction (HCI) as well. Information about the users' affective states can enrich the interaction with applications and devices otherwise blind to the affective context of their use. In general, such affect-sensitive computers can respond more adequately to the user, enabling a more natural and intelligent interaction. For example, an affect-sensitive e-learning application could instantaneously adapt the teaching schedule by raising or lowering the difficulty when the learner exhibits pronounced episodes of boredom or frustration, respectively. Enabling such affect-sensitive HCI requires reliable affect detection. Indications of the affective state of a user can be derived from several sources: behavior, physiology, and neurophysiology. Affective brain-computer interfaces (aBCI) aim at the continuous and unobtrusive detection of users' affective states from their neurophysiological recordings. Information on affect is difficult to assess by conventional input means, which would make such technologies attractive for a broad range of users. However, research on aBCIs is still in its infancy – the possibilities and limitations of brain-based affect sensing have yet to be explored. One of the great challenges for aBCI arises from the complexity of affective responses and their neurophysiology, both still poorly understood. Compared to control-type brain-computer interface paradigms – which rely on well-known neural correlates, such as the P300 or mu-rhythms – clear and reliable neurophysiological correlates of affect are hitherto lacking.
A reason might be that affect is a multi-faceted phenomenon, constituted by a number of different processes which partially differ according to the context of the affective response. For example, affect-eliciting information is received through different sensory modalities or generated from internal events like memories or thoughts. The accompanying cognitive processes, relevant for coping with the situation, and the ensuing requirements on behavior differ according to the specific situation an affective state is occurring in. Consequently, an affective response consists of multiple synchronized subprocesses in different (neural) response systems, each with its own neural correlates. As their involvement might differ depending on the requirements of the specific situation in which correlates of affect are to be measured, classifiers to be trained, and aBCIs to be used, it is important to search for and identify reliable indicators of affect. This is especially relevant considering the complexity of the real-world settings in which aBCIs are supposed to work. To determine reliable neurophysiological indicators that are suited for the classification of affect and for application in HCI, researchers can draw on a plethora of literature from psychophysiology and affective neuroscience. However, these studies have been carried out under specific and well-controlled conditions, which potentially limits the reliability and validity of their findings for use in complex, real-world scenarios. In this thesis, we investigate several ways to identify reliable neurophysiological indicators for aBCIs used in complex, multimodal environments. Firstly, we used and evaluated, in an application, a neurophysiological indicator of affect identified from the literature. Secondly, we tested an approach for the reliable induction of strong affective states with ecologically valid, multimodal stimuli.
Thirdly, we devised and validated a controlled, multimodal affect induction protocol to disentangle context-specific and general neurophysiological responses to affective stimulation. Below we will introduce the reader to the relevant concepts used in this thesis. We will start with a short overview of the most important theories of emotion. Then we will outline the goals of affective computing research in general and the available sources of information on users' affective states. We will discuss affective brain-computer interfaces and their relation to other brain-computer interface approaches, and give an overview of the neurophysiology of affect, the relevant structures, and the most prominent EEG correlates of affective states and processes. Finally, we will describe the research topics that the present work tackles.

1.2 Affect - Terms and Models

The term "affect" might seem best defined by opposing it to the term "cognition". While affective phenomena are subjective, intuitive, or based on a certain emotional feeling, cognitive phenomena are objective, often explicable, and normally not associated with any emotional feeling (e.g., attention, remembering, language, problem-solving). Despite these apparent contrasts, it is becoming ever clearer that affective and cognitive processes not only interact with each other, but are tightly intertwined [Picard, 1997; Damasio, 2000; Barrett et al., 2007]. The term affect is an umbrella term for a number of phenomena encountered in daily life. According to Scherer [2005], we can distinguish between several different concepts that constitute affect, most importantly emotions and moods. Emotions are defined as relatively short-lived states (seconds or minutes) that are evoked by a specific event, such as a thought, a memory, or a perceived object.
This distinguishes them from longer-lasting (hours or days) mood states that do not directly relate to a specific event, though they might build up from one or more events over time. In the present work we will be most interested in the relatively short (seconds to a few minutes) affective states that are either self-induced or induced by (audiovisual) stimuli, that is, in emotions. Below we will introduce the concept of affective or emotional responses and the main theories about the nature of emotions. In this work, we will use the terms affect and emotion interchangeably for the phenomena explored. A precise definition of the phenomenon "emotion" is seemingly difficult, as there are multiple aspects to emotions. From almost 100 definitions, Kleinginna and Kleinginna [1981] created a working definition covering several of these aspects of emotions: "Emotion is a complex set of interactions among subjective and objective factors, mediated by neural/hormonal systems, which can (a) give rise to affective experiences such as feelings of arousal, pleasure/displeasure; (b) generate cognitive processes such as emotionally relevant perceptual effects, appraisals, labeling processes; (c) activate widespread physiological adjustments to the arousing conditions; and (d) lead to behavior that is often, but not always, expressive, goal-directed, and adaptive." There is an ongoing debate about the structure of affective responses and their eliciting and co-occurring mechanisms, in which three main approaches to dealing with emotion can be distinguished: basic emotion models, dimensional feeling models, and appraisal models. We will shortly introduce all three approaches.

1.2.1 Basic Emotion Models

Basic emotion models assume that emotional responses can be described by a small number of universal, discrete emotions or "emotion families" of related states.
These families of emotions have developed during the course of evolution, but might vary to a certain degree within themselves due to social learning. They are assumed to be universal in the sense that they can be found to a certain extent in all cultures, are partially inborn, and are to a degree shared with other primates. Basic emotions can be differentiated in terms of response patterns that were adapted in a species-dependent manner during the course of evolution. At the core of these claims lies the observation of physiological responses and behavioral expressions that are specific to most of the suggested basic emotions, enabling the individual to deal with threats and urges in a manner that increases the chance of survival. Different physiological responses prepare the organism for the required actions, while behavioral expressions of the emotional state in body, face, and voice communicate the state and behavioral intent to other individuals. These responses are initiated by an appraisal of external or internal stimuli, which can be fast and automatic, but may also take the form of an “extended appraisal”. Ekman [1992], for example, suggested happiness, anger, surprise, fear, disgust, and sadness as basic emotions. In [Ekman, 1999], he extended the list to also incorporate amusement, contempt, contentment, embarrassment, excitement, guilt, pride in achievement, relief, satisfaction, sensory pleasure, and shame. Other basic emotion models have proposed different sets of emotions (see [Ortony et al., 1988], page 27, for a comprehensive overview). This inconsistency in the proposed sets of basic emotions is one of the main criticisms raised by opponents of this model, who deem the definition of basic emotions too vague.
1.2.2 Dimensional Feeling Models

Dimensional feeling models aim at an abstraction of the basic emotional concepts by postulating several dimensions on which specific emotional feelings, “core affect”, are definable. One of the most popular dimensional models is the circumplex model of Russell [1980]. It assumes that any emotional feeling can be localized on a two-dimensional plane, spanned by the axes of valence, ranging from negative to positive feelings, and arousal, ranging from calm to excited. This type of model has the advantage that it inherently accommodates the possibility that affective states are not always clearly assignable to specific basic emotions. In this model, such mixed or complex emotions can be represented by a location between two or more clearly ascribed emotions. The origin of dimensional models and the circumplex structure can be found in statistical approaches, such as factor analyses or multidimensional scaling methods, used to study the underlying structure or relatedness of different emotion categories assessed by self-reports, judgements on emotion words, facial expressions, and other instantiations of emotions. While factor analyses found two dimensions sufficient to explain most of the variance of self-reports and judgments of emotions, scaling methods showed a circular arrangement of discrete emotion categories on the space spanned by these dimensions. As for basic emotion models, however, different theories about the precise nature of these fundamental dimensions exist (see [Russell and Barrett, 1999]). In addition, further dimensions (e.g., dominance/control, potency, or tension) have been found to explain additional variance, though less coherently, and are therefore seldom addressed. However, as with basic emotion models, dimensional models focus on the structure of emotional responses, but are rather vague about the eliciting processes that lead from an event to a specific emotion.
1.2.3 Appraisal Models

Appraisal models are a functional attempt to disentangle the complexity of emotional responses in brain and body and the specific contexts in which they appear. In general, appraisal models postulate a number of checks (appraisals) that a stimulus event undergoes, which in consequence determine the nature of the (emotional) response that is best suited to deal with the event. For example, Scherer’s Component Process Model [Scherer, 2005; Sander et al., 2005] postulates that during an emotional episode several subsystems (associated with cognitive, motivational, neurophysiological, motor expression, and subjective feeling components) change synchronously and in an interrelated manner in response to a stimulus event that is deemed relevant for the organism. The relevance is determined by complex appraisal mechanisms, including a number of sequential event checks on different analysis levels: relevance, implications for current goals, coping potential, and normative significance. These checks are informed by a number of cognitive and motivational mechanisms, including attention, memory, motivation, reasoning, and self-concept. It is the outcome of this evaluation process that defines a specific response patterning of physiological reactions, motor expression, and action preparation. Appraisal models and the notions of basic emotion or dimensional models are not necessarily incompatible. They can be viewed as treating different aspects of the same object, namely affect, in more detail, specifically the mechanisms that lead to certain affective states. Consequently, appraisal theories have also incorporated the notion of valence (or positive/negative appraisal) and arousal dimensions as a structure underlying the emotional responses [Ortony et al., 1988; Barrett et al., 2007].
In the current work, we will not directly address the different models of emotion, as our research does not aim at the study of the underlying structure of emotions. However, as we aim at the exploration of neurophysiological correlates of emotions, which presumes to a certain degree an assumption about their nature, we adopt the view of the dimensional model, specifically that of Russell’s circumplex model. We assume that emotions can be differentiated in terms of their physiology and behavioral indications on the basis of their location in the valence and arousal space. Furthermore, we will repeatedly return to the appraisal models of emotion, as they allow speculations about the mental processes involved in affective responses. We will continue now with a short introduction to the field of affective computing, which seeks to integrate affect into the domain of human-computer, or more generally human-media, interaction.

1.3 Affect, Human-Media Interaction, and Affect Sensing

Emotions are ubiquitous in our daily lives. They motivate and rule most of our interpersonal interactions, and they are of importance for our so-called rational responses to the events we are confronted with. In short, our affective states and emotional responses greatly influence the twists and turns of the story of our life. These emotional responses evolved with us through the course of evolution to enable the organism to quickly adapt to the demands arising out of an extremely competitive environment (e.g., avoiding danger, satisfying survival-relevant needs) and hence to secure the organism’s survival. They are equally fundamental to us in our modern environment as they were to our cave-dwelling, spear-casting, and berry-gathering ancestors.
While those hunters and gatherers might have depended on the fast responses only possible after quick decisions based on evolutionarily inbred gut feelings, success in today’s society still often depends on fast and intuitive reactions, especially in the absence of perfect information. Emotions allow such reactions, and a lack thereof, either due to congenital or acquired conditions, has immense consequences for one’s life (see Damasio [2000], Chapter 2, for illustrative examples). Being so fundamental, emotional responses are not only observed toward real-life events, but extend also into the domain of mediated experiences. The reader will have no difficulties remembering instances where he or she responded utterly emotionally to a piece of media, for example during the interaction with computers for work or recreation, during the consumption of movies or music pieces, or the reading of a good book. While it might seem counter-intuitive that we respond to media, and the experiences mediated by them, in similar ways as to real-life events, Reeves and Nass [1996] showed in their book “The Media Equation” that we generally respond in rather natural and social ways to media. This also includes emotional responses, for which the effects of affect-laden media exposure seem similar to those of the corresponding real-world events. Regarding the arousal as well as the valence induced by mediated experiences, such as pictures or videos, it was shown that highly arousing (e.g., sexual or threatening images) and negative content (e.g., pictures of mutilations and gore) not only influenced the self-ratings of affective experience, but also influenced which of the media stimuli were actually remembered. This is a well-known effect also observed for real-life emotional events, which shows the principal equivalence of the emotional effects elicited by real-world and mediated events.
Moreover, physiological measures, even neurophysiological indicators of affect, support this claim of equivalence. In fact, contemporary research often makes use of media to study affective phenomena (e.g., see Table 1.1 for the media stimuli used to induce emotions in affective BCI studies). The recognition of the importance of emotions for the interaction with media spawned research on the integration of affect into the field of human-computer or human-media interaction, leading to the creation of the research domain of affective computing. Rosalind Picard defined affective computing in her seminal work as “computing that relates to, arises from, or deliberately influences emotions” [Picard, 1997]. The rationale for placing emotions on the map of human-computer interaction was the observation of the fundamental role emotions play in our daily lives: emotions guide our perceptions and actions. They have an undeniable impact on our cognitive processes, and hence are central to what we call intelligence. Therefore, Picard noted that computers would need the capability to sense and express emotions to become genuinely intelligent, a requirement for a truly natural human-computer interaction. On a related note, only affect-sensitive computers could acknowledge the fact that users already treat media in natural and social ways, and hence be able to respond adequately. Hence, in a time when computers have moved away from their original role as “work horses”, exclusively serving number crunching and information organization, and become ever more pervasive companions in our daily lives, new ways of interacting with them have to be found: interaction methods that are more natural and better suited to integrate the growing number of functions computers serve in our everyday environment, and interaction methods that acknowledge the irrational, emotional side of human-computer interaction.
Since the mid-1990s, when the term affective computing was coined, a lively and highly diverse research community, dedicated to the sensing, processing, and use of emotions for human-computer interaction, has formed. It has led to ideas for applications that sense the affective state of a learner or gamer and adapt to it, that monitor and feed the affective state back to a user, or that enrich communication between remote participants with affective information. All these potential applications rely on a common assumption: as they require knowledge about the affective state of their users to function, they presuppose the possibility of an automatic detection of the emotional state from observation. And indeed, there is a multitude of possible sensor modalities and measurement methods that are known to carry information about the affective state of a person. Easily accessible behavioral observations, for example video recordings of posture or facial expression or recordings of voice and speech, can already inform about the emotion a person is experiencing. However, these sensor modalities assess behavior that might be controlled and manipulated by the person to deceive the environment about their emotional state [Zeng et al., 2009]. More difficult to control are physiological responses, measured from the central, somatic, or autonomic nervous system, which respond more directly to affective stimulation. The present work is concerned with the neurophysiological, and especially electrophysiological, correlates of affect, as measured by non-invasive electroencephalography (EEG) sensors. The brain is involved in several stages of the affective response, which makes neurophysiological measures an interesting modality for the detection of affective states.
According to appraisal theories [Scherer, 2005; Barrett et al., 2007], neural structures are involved in several ways during affect, for example in core affective processes, associated processes of event evaluation, and response planning. There are several non-invasive measures of neurophysiological activity: those based on the metabolic activity of the brain, such as positron emission tomography (PET, see Phelps [2006]), functional magnetic resonance imaging (fMRI, see Raichle [2003]), or functional near-infrared spectroscopy (fNIRS, see Hoshi [2005]), and those directly based on the activity of populations of neurons, such as magnetoencephalography (MEG, see Hansen et al. [2010]) and electroencephalography (EEG, see Schomer and Silva [2010]). Of these techniques, EEG has several crucial advantages for the assessment of neurophysiological changes during human-media interaction. Compared to fMRI and MEG systems, its low cost, simple application, and convenient wearability make EEG a sensor modality suitable for a widespread and diverse population of healthy and disabled users. Furthermore, while indicators based on blood oxygenation, such as fNIRS and fMRI, respond with delays of several seconds, electroencephalography has a high temporal resolution, delivering quasi-immediate reflections of neural activity. Finally, a plethora of research has associated EEG with affective and cognitive processes in the time and frequency domain (see Section 1.5.2), which enables researchers to base applications on well-known indicators of the corresponding mental processes. On the other hand, there are also disadvantages to EEG measurements, which limit their applicability to a certain degree. The several layers of tissue, cerebrospinal fluid, and bone that separate the brain from the scalp electrodes limit the spatial resolution of EEG signals and the resolution in the frequency domain.
Furthermore, EEG measurements are susceptible to non-neural artifacts, for example stemming from eye movements and blinks, muscle activity in the face and neck, and sensor displacements. Nevertheless, several measures of the EEG have been shown to be retrievable and classifiable under applied conditions, making brain-computer interfaces possible. In the following section, we will introduce the concept of affective brain-computer interfaces and give examples of such applications.

1.4 Affective Brain-Computer Interfaces

The term affective brain-computer interface (aBCI) is a direct result of the nomenclature of the field that motivates their existence: affective computing. aBCI research and affective computing aim at the same ends with different means: the detection of the user’s emotional state for the enrichment of human-computer interaction. While affective computing tries to integrate all the disciplines involved in this endeavor, from the sensing of affect to its integration into human-computer interaction processes, affective brain-computer interfaces are mainly concerned with the detection of the affective state from neurophysiological measurements. In this respect, aBCI can be more specifically situated within the field of affective signal processing (ASP), which seeks to map affective states to their co-occurring physiological manifestations [van den Broek, 2011]. Originally, the term brain-computer interface was defined as “a communication system in which messages or commands that an individual sends to the external world do not pass through the brain’s normal output pathways of peripheral nerves and muscles” [Wolpaw et al., 2002]. The notion of an individual (volitionally) sending commands directly from the brain to a computer, circumventing standard means of communication, is of great importance considering the original target population of patients with severe neuromuscular disorders.
More recently, however, the human-computer interaction community has developed great interest in the application of BCI approaches for larger groups of users who are not dependent on BCIs as their sole means of communication. This interest holds great potential for the further development of devices, algorithms, and approaches for BCI, also necessary for its advancement for patient populations. Numerous special sessions and workshops at renowned international HCI conferences have since witnessed this increasing and broad interest in the application of BCI to standard and novel HCI scenarios. Affective BCI workshops, for example, have been held in 2009 and 2011 at the international conference on Affective Computing and Intelligent Interaction [Mühl et al., 2009b, 2011c], leading in turn to an increased awareness in the HCI community of research on neurophysiology-based affect sensing. In parallel with the development of this broad interest in BCI, parts of the BCI community slowly started to incorporate new BCI approaches, such as aBCI, into their research portfolio, softening the confinement of BCI to interfaces serving purely volitional means of control [Nijboer et al., 2011]. Below, we will briefly introduce the parts of the affective BCI: signal acquisition, signal processing (feature extraction and translation algorithm), feedback, and protocol. Then we will give an overview of the various existing and possible approaches to affective BCI, based on a general taxonomy of BCI approaches.

1.4.1 Parts of the Affective BCI

Being an instance of general BCI systems [Wolpaw et al., 2002], the affective BCI is defined by a sequence of procedures that transform neurophysiological signals into control signals.
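This sequence of procedures can be sketched as a minimal loop. The function names and the simple threshold rule below are hypothetical illustrations, not part of any system described in this thesis; they only mirror the four stages of the generic BCI pipeline.

```python
import random

def acquire_signal(n_samples=128):
    """Stand-in for an EEG amplifier: returns one window of raw samples."""
    return [random.gauss(0.0, 1.0) for _ in range(n_samples)]

def extract_features(window):
    """Reduce the raw window to informative features (here: mean power)."""
    return [sum(x * x for x in window) / len(window)]

def translate(features, threshold=1.5):
    """Map the features onto a discrete control signal for the application."""
    return "high_arousal" if features[0] > threshold else "low_arousal"

def give_feedback(state):
    """The application responds to the detected state, closing the loop."""
    return f"application adapts to detected state: {state}"

# One pass through the pipeline: acquisition -> features -> translation -> feedback.
print(give_feedback(translate(extract_features(acquire_signal()))))
```

In a real system, each stand-in would be replaced by the corresponding component discussed below: an acquisition device, a feature extractor, a trained translation algorithm, and an application-specific feedback channel.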
We will briefly outline the successive processing steps that a signal has to undergo in a BCI (see Figure 1.1), starting with the acquisition of the signal from the user, and finishing with the application feedback given back to the user.

Figure 1.1: The schematic of a general BCI system as defined by Wolpaw et al. [2002]. The neurophysiological signal is recorded from the user and the relevant features, those that are informative about user intent or state, are extracted. They are then translated into the control parameters that are used by the application to respond adequately to the user’s state or intent.

Signal Acquisition

BCIs can make use of several sensor modalities that measure brain activity. Roughly, we can distinguish between invasive and non-invasive measures. While invasive measures, implanted electrodes or electrode grids, enable a more direct recording of neurophysiological activity from the cortex, and therefore have a better signal-to-noise ratio, they are currently reserved for patient populations. Non-invasive measures, on the other hand, as recorded with EEG, fNIRS, or fMRI, are also available to the healthy population. Furthermore, some of the non-invasive signal acquisition devices, especially EEG, are already available to consumers in the form of easy-to-handle and affordable headsets. The present work focuses on EEG as a neurophysiological measurement tool, for which we will detail the following processing steps in the BCI pipeline. A further distinction in terms of the acquired signals can be made between those signals that are partially dependent on the standard output pathways of the brain (e.g., moving the eyes to direct the gaze toward a specific stimulus), and those that are independent of these output pathways, merely registering user intention or state. These varieties of BCI are referred to as dependent and independent BCIs, respectively.
Affective BCIs, measuring the affective state of the user, are usually a variety of the latter sort of BCIs.

[Table 1.1: Overview of studies using EEG and physiological sensors to classify affective states (continued in Table 1.2). Columns: Study; Sensor modalities; Participants (m/f); Induction method; Emotions induced; Trial number; Stimulation length (sec); Accuracy; Classifier. Studies covered: Takahashi [2004], Chanel et al. [2005], Horlings et al. [2008], Lin et al. [2009], Chanel et al. [2009], Li and Lu [2009], Khosrowabadi et al. [2009], Zhang and Lee [2009], Murugappan [2010], Koelstra et al. [2010], Schuster et al. [2010].]

[Table 1.2: Continuation of Table 1.1 - Overview of studies using EEG and physiological sensors to classify affective states. Studies covered: Frantzidis et al. [2010], Petrantonakis et al. [2010], Winkler et al. [2010], Chanel et al. [2011], Makeig et al. [2011], George et al. [2011], Soleymani et al. [2011b], Koelstra et al. [2012]. Sensors: A audio recordings, BVP blood volume pulse, ECG electrocardiogram, EEG(#) electroencephalogram (number of electrodes), EMG electromyogram, fEMG facial electromyogram, GSR galvanic skin response, R respiration, T temperature, V video recordings; Emotions: A anger, Am amusement, F fear, Fr frustration, H hate, J joy, N neutral, R relaxed, S sadness, Su surprise, St stress; Trial number: per subject (different emotions * repetitions); Classifiers: CSP common spatial patterns, FDA Fisher discriminant analysis, kNN k nearest neighbor, LDA linear discriminant analysis, MD Mahalanobis distance, NF neurofeedback-like mapping, NN neural network, QDA quadratic discriminant analysis, SDA stepwise discriminatory analysis, SVM support vector machine.]

Signal Processing - Feature Extraction

From the signals that are captured from the scalp, several signal features can be computed.
We can differentiate between features in the time and in the frequency domain. An example of features in the time domain are the amplitudes of stimulus-evoked potentials occurring at well-known time points after a stimulus event was observed. One of the event-related potentials used in BCI is the P300, occurring in the interval between 300 and 500 ms after an attended stimulus event. An example of signal features in the frequency domain is the power of a certain frequency band. A well-known frequency band used in BCI paradigms is the alpha band, which comprises the frequencies between 8 and 13 Hz. Both time- and frequency-domain features of the EEG have been found to respond to the manipulation of affective states and are therefore in principle interesting for the detection of affective states (see Section 1.5.2). However, the aBCI studies listed in Table 1.1 almost exclusively use features from the frequency domain. Conveniently, however, frequency-domain features, such as the power in the lower frequency bands (< 13 Hz), are correlated with the amplitude of event-related potentials, especially the P300, and hence partially include information about time-domain features. While the standard control-type BCI approaches focus on very specific features, for example the mu-rhythm over central scalp regions or the mean signal amplitude between 200 and 500 ms after each P300 stimulus, affective BCI to date lacks such a priori information. Most of the current aBCI approaches make use of a wide spectrum of frequency bands, resulting in a large number of potential features. However, such large numbers of features also require a large number of trials to train a classifier (the “curse of dimensionality”, see Lotte et al. [2007]), which are seldom available due to the limitations of affect induction (e.g., the habituation of the responses toward affective stimulation over time).
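To make the frequency-domain feature extraction and classifier training described here concrete, the following sketch computes alpha-band (8-13 Hz) power per channel via Welch’s method and trains a linear discriminant classifier, one of the classifiers listed in Table 1.1. The synthetic data, sampling rate, channel count, and two-class structure are illustrative assumptions, not taken from any cited study.

```python
import numpy as np
from scipy.signal import welch
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 128  # assumed sampling rate in Hz

def alpha_power(trial):
    """Mean power in the 8-13 Hz alpha band, one value per EEG channel."""
    freqs, psd = welch(trial, fs=FS, nperseg=FS, axis=-1)
    band = (freqs >= 8) & (freqs <= 13)
    return psd[:, band].mean(axis=-1)

rng = np.random.default_rng(42)
n_trials, n_channels, n_samples = 40, 4, 4 * FS
t = np.arange(n_samples) / FS

X, y = [], []
for label in (0, 1):  # e.g., two affective states to be discriminated
    amp = 1.0 if label == 0 else 3.0  # class 1 carries stronger alpha activity
    alpha = amp * np.sin(2 * np.pi * 10 * t)  # 10 Hz component in the alpha band
    for _ in range(n_trials):
        noise = rng.standard_normal((n_channels, n_samples))
        X.append(alpha_power(noise + alpha))  # one feature vector per trial
        y.append(label)
X, y = np.array(X), np.array(y)

clf = LinearDiscriminantAnalysis().fit(X[::2], y[::2])  # train on half the trials
accuracy = clf.score(X[1::2], y[1::2])                  # evaluate on the rest
print(f"held-out accuracy: {accuracy:.2f}")
```

With only four features and clearly separated classes the classifier performs well; with the wide feature spectra of real aBCI studies, the same procedure runs into the curse of dimensionality discussed above, which is why feature evaluation and trial-efficient induction procedures matter.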
Therefore, one of the tasks on the road toward affective BCI is the evaluation and identification of reliable signal features that carry information about the affective state, especially in the complexity of real-world environments. Another important task is the development of potent affect induction procedures, for example using naturally affect-inducing stimuli that increase the likelihood of inducing the affective responses of interest.

Signal Processing - Translation Algorithms

The core of the BCI is the translation of the selected signal features into the command for the application or device. A simple one-to-one mapping between feature and command requires a feature that conveniently mirrors the state in such a manner. Because such ideal features are rare in the neurophysiological signal domain, most BCI studies use machine learning approaches that are trained to find a mapping between a number of signal features and the labels of two or more classes (see Lotte et al. [2007] for an overview of BCI classifiers). These approaches transform independent variables, the signal features, into dependent variables, the control commands. The classifiers have to adapt to the signal characteristics of the particular user, adapt to changes over time and changing contexts of interaction, and deal with the changes in brain activity due to the user trying to learn and adapt to the system. Classifiers used for affective BCI include linear discriminant analysis, support vector machines, and neural networks (see Table 1.1).

The Output Device / Feedback

Depending on the application the affective BCI is serving, the output can assume different forms. For BCI in general, the most prominent output devices are monitors and speakers, providing visual and auditory feedback about the user and BCI performance. In a few cases robots (a wheelchair, a car) have been controlled [Leeb et al., 2007; Hongtao et al., 2010].
An exceptional example of BCI output, however, is the control of one’s own hand by BCI-informed functional electrical stimulation of a paralyzed hand [Pfurtscheller et al., 2005]. In the case of control-type BCIs, the output has a major function, relating to the adaptation of the user to the BCI mentioned above. As BCI control can be considered a skill, learning it necessitates the provision of feedback about successful and unsuccessful performance. In the specific case of aBCI the same is possible, but the relative lack of control-type applications and the dominance of passive paradigms (see Section 1.4.2) make explicit performance-based feedback an option rather than mandatory. Again, the output device and the type of feedback given to the user are closely associated with the function of the application. For example, for implicit tagging or affect monitoring (for later evaluation), the feedback is not immediate. No clear relation between state and feedback is perceivable for the user, making the notion of feedback in these aBCI applications almost obsolete. However, in many other applications the feedback is still present and relevant, since the affective data is used to produce a system response in the reasonably near future. Examples are applications that reflect the current affective state (e.g., in a game like “alpha World of Warcraft” [Plass-Oude Bos et al., 2010]), any neurofeedback-like application (e.g., warning of unhealthy states or rewarding healthy states), the active self-induction of affective states, or the adaptation of games or e-learning applications to the state of the user.

The Operating Protocol

The operating protocol guides the operation of the BCI system, defining, for example, how and when it is switched on and off, whether actions are triggered by the system (synchronous) or by the user (asynchronous), and when and in which manner feedback is given to the user.
Other characteristics of the interaction that are defined by the protocol are whether the input is generated actively or passively, and whether the information is gathered dependent on a specific stimulus event (stimulus dependent/independent). These two characteristics of BCI, voluntariness and stimulus dependency, are also the basis for the characterization of different BCI approaches in the next section. Below, we will outline the different existing applications of and approaches to affective BCI, and try to locate affective BCI within the general landscape of BCI.

1.4.2 The Different aBCI Approaches

There are several possible applications of affect sensing that can be categorized in terms of their dependence on stimuli and user volition. In the following, a two-dimensional classification of some of these BCI paradigms will be given. It is derived from the general three-category classification of BCI approaches (active, reactive, and passive BCI) presented in [Zander et al., 2011]. The dimensions of this classification are defined by (i) the dependence on external stimuli and (ii) the dependence on an intention to create a neural activity pattern, as illustrated in Figure 1.2.

Figure 1.2: A classification of BCI paradigms, spanning voluntariness (passive vs. active) and stimulus dependency (user self-induced vs. stimulus-evoked).

Axis (i) stretches from exogenous (or evoked) to endogenous (or induced) input. The former covers all forms of BCI which necessarily presuppose an external stimulus. Steady-state visually-evoked potentials [Farwell and Donchin, 1988] as neural correlates of (target) stimulus frequencies, for instance, may be detected if and only if evoked by a stimulus. They are therefore a clear example of exogenous input.
Endogenous input, on the other hand, does not presuppose an external stimulus, but is generated by the user, either volitionally, as seen in motor-imagery-based BCIs [Pfurtscheller and Neuper, 2001], or involuntarily, as during the monitoring of affective or cognitive states. In the case of involuntary endogenous input, the distinction between stimulus-dependent and stimulus-independent input might not always be possible, as affective responses are often induced by external stimulus events, though these might not always be obvious.

Axis (ii) stretches from active to passive input. Active input presupposes an intention to control brain activity, while passive input does not. Imagined movements, for instance, can only be detected when users intend to perform them, making the paradigm a prototypical application of active or control-type BCI. The user's mental state, on the other hand, can also be measured when the user does not exhibit an intention to produce any specific brain activity. Affective BCI approaches can be located in several of the four quadrants (categories) spanned by the two dimensions, as quite different approaches to affective BCI have been suggested and implemented.

Q I. Induced-active BCIs This category is well known from neurofeedback systems, which encourage the user to attain a certain goal state. While neurofeedback approaches do not necessarily focus on affective states, a long line of this research is concerned with the reduction of anxiety or depression by making users more aware of their bodily and mental states [Hammond, 2005]. Neurophysiological features that have been associated with a certain favorable state (e.g., relaxed wakefulness) are visualized or sonified, enabling the users of such feedback systems to learn to self-induce them.
More recently, it has been shown that affective self-induction techniques, such as relaxation, are a viable control modality in gaming applications [Hjelm, 2003; George et al., 2011]. Furthermore, induced-passive approaches might also turn into active approaches, for example when players realize that their affective state influences the game parameters, and therefore begin to self-induce states to manipulate the gaming environment according to their preferences (see below).

Q II. Induced-passive BCIs This category includes the typical affect-sensing method for HCI scenarios in which a response of the application to the user's state is critical. Information that identifies the affective state of a user can be used to adapt the behavior of an application to keep the user satisfied or engaged. For example, studies found neurophysiological responses that differentiate between episodes of frustrating and normal game play [Reuderink et al., 2013]. Applications could respond to the user's frustration with helpful advice or clarifying information. Alternatively, parameters of computer games or e-learning applications could be adjusted to keep users engaged in the interaction, for example by decreasing or increasing difficulty to counteract detected episodes of frustration or boredom, respectively [Chanel et al., 2011]. Another approach is the manipulation of the game world in response to the player's affective state, as demonstrated in “alpha World of Warcraft” [Plass-Oude Bos et al., 2010], where the avatar shifts its shape according to the degree of relaxation the user experiences. Such reactive games could strengthen the players' association with their avatars, leading to stronger immersion and an increased sense of presence in the game world.

Q III. Evoked-passive BCIs BCI research has suggested that evoked responses can be informative about the state of the user.
Allison and Polich [2008] have used evoked responses to simple auditory stimuli to probe the workload of a user during a computer game. Similarly, neurophysiology-based lie detection, assessing neurophysiological orientation responses to compromising stimuli, has been shown to be feasible [Abootalebi et al., 2009]. A similar approach is the detection of error potentials in response to errors in human-machine interaction. It was shown that such errors evoke specific neurophysiological responses that can be detected and used to trigger system adaptation [Zander and Jatzev, 2009]. Given that goal conduciveness is a determining factor of affective responses, such error-related potentials could be understood as being affective in nature [Sander et al., 2005]. More directly related to affect, however, are the responses observed to media, such as songs, music videos, or films. Assuming the genuine affective nature of the response to mediated experiences delivered by such stimuli, it might be possible to detect the affective user states that are associated with them. A possible use for such approaches is in media recommendation systems, which monitor the user's response to media exposure and label or tag the media with the affective state it produced. Later on, such systems could selectively offer or automatically play back media items that are known to induce a certain affective state in the user. Research toward such neurophysiology-based implicit tagging of multimedia content has suggested its feasibility [Soleymani et al., 2011a; Koelstra et al., 2012]. Furthermore, assuming that general indicators of affect can be identified using music excerpts or film clips in affect-induction protocols, such multimodal and natural media seem well suited for collecting data to train aBCIs that detect affective states occurring in rather uncontrolled, real-life environments.

Q IV.
Evoked-(re-)active BCIs This category seems less likely to be used for aBCI approaches, as the volitional control of affect in response to presented stimuli is as yet unexplored. However, standard BCI paradigms that use evoked brain activity to enable users to select from several choices were the first approaches to BCI and have been thoroughly explored. Prominent examples are the P300 speller [Farwell and Donchin, 1988] and BCI control via steady-state evoked potentials [Vidal, 1973].

Summarizing, there is a multitude of possible applications for aBCI that can be categorized along the axes of induced/evoked and active/passive control. Our work will focus on induced-active (Chapter 2) and evoked-passive BCIs (Chapters 3 and 4), though for the latter type of affective BCIs, we do not attempt the classification of affective states, but rather explore the neurophysiological responses associated with affect. In the next section, we will briefly review the physiological signals that have been found informative regarding the affective state. We will introduce the signals measuring the activity of the peripheral nervous system (the somatic and autonomic nervous systems), and discuss the neural structures of the central nervous system and their neurophysiological EEG correlates of affect.

1.5 Physiological and Neurophysiological Measurements of Affect

As mentioned in Section 1.3, there is a multitude of possible sources of information about the affective state of users during human-computer interaction. A great part of these sources lies in the physiological activity of the users, originating directly from their nervous systems. The human nervous system can be divided into the peripheral nervous system and the central nervous system.
The peripheral nervous system and measures of its activity have been shown in the past to differentiate affective states, and in the current work they are used to validate the different affect-induction procedures. We will briefly describe the relevant subsystems of the peripheral nervous system, the somatic and autonomic nervous systems, and the measures we have used. The central nervous system and its affect-related structures, which are the sources of neurophysiological measures of affect, will be discussed in more detail to give the reader the necessary background for the hypothesis-guided and exploratory studies in the current work.

1.5.1 The Peripheral Nervous System (PNS)

The peripheral nervous system comprises the somatic nervous system, the autonomic nervous system, and the endocrine system. Below, we will only discuss the first two systems and the measures we use in the following chapters. A more thorough discussion of the components of the peripheral nervous system and the sensor modalities used to assess their activity can be found in [Stern et al., 2001].

The Somatic Nervous System The somatic nervous system manages the voluntary control of body movements through the manipulation of skeletal muscles. Furthermore, it also realizes reflexive and other unconscious muscle responses to stimuli. Efferent neurons project from the central nervous system to the skeletal musculature. Measures of muscle activity from various body parts, recorded with an electromyogram (EMG), assess the activity of the somatic nervous system. Darwin [2005] already wrote extensively about the importance of facial and postural expressions of emotion.

Facial Electromyography (fEMG) The measurement of the activity of the facial muscles, those involved in the facial expression of emotion, is known as facial EMG. While the relation between muscle tension and emotional arousal – higher tension for higher arousal [Hoehn-Saric et al., 1997] – seems straightforward, Cacioppo et al.
[1986] showed that muscle tensions assessed by facial EMG differentiate valence and arousal. Two of the most popular facial muscles to measure in an emotion context are the zygomaticus major, involved in smiling, and the corrugator supercilii, involved in frowning. However, a more refined analysis, including more facial muscles, might differentiate more emotional states [Wolf et al., 2005]. Magnée et al. [2007] showed that the facial muscle activity that differentiates between emotions can be observed over several emotion-eliciting stimulus materials. This was held as evidence that the observed activity is the result of a genuine emotional response and not of facial mimicry.

Upper Trapezius Muscle The response of muscles to states of mental stress (i.e., highly arousing stimulation) has been reported by several studies [Lundberg et al., 1994; Larsson et al., 1995; Hoehn-Saric et al., 1997; Rissén et al., 2000; Laursen et al., 2002; Krantz et al., 2004; Schleifer et al., 2008]. In the context of HCI, the upper trapezius muscle was found responsive to the induction of states of high mental arousal [Wijsman et al., 2010]. Most prominently, the Stroop color-word test and diverse arithmetic tasks have been used to induce mental stress. Mental stress leads to higher amplitudes and fewer gaps (i.e., periods of relaxation) in muscular activity measured over the trapezius muscle. Wahlström et al. [2003] found higher trapezius activity also for emotional stress.

The Autonomic Nervous System (ANS) The autonomic nervous system consists of two major branches: the sympathetic and the parasympathetic branch. These branches run between structures of the central nervous system and various internal organs and glands.
While the sympathetic branch activates and prepares the body for expected mental or bodily effort (“fight or flight”), the parasympathetic branch puts the body into a relaxed state (“rest and digest”) and thus reestablishes the homeostasis of the system. Most inner organs are “dually innervated”, that is, they receive input from both branches of the autonomic nervous system. The different functions of the two branches are expressed in the different effects of their activation. The heart, for example, increases its beat rate when the activation of the sympathetic branch increases, and decelerates when sympathetic activation decreases. The relationship between heart rate and the parasympathetic system behaves in the opposite manner: the heart rate increases with parasympathetic deactivation and decreases with activation. While this pattern suggests a reciprocal relationship between the two constituents of the ANS, results from psychophysiological experiments indicate that both branches can also vary co-actively or independently of each other [Berntson et al., 1993]. Many descriptions of emotions involve bodily states, peripheral physiological processes in particular, which indicates a strong connection between affective and physiological states. There is a multitude of physiological signals that assess the workings of the ANS. In the following, we will limit our overview to the galvanic skin response and the electrocardiogram, both used in the subsequent chapters.

Galvanic Skin Response (GSR) The galvanic skin response was used as early as 1907 by Jung to study emotional responses in word-association tasks [Jung, 1907]. The most common method to measure the galvanic skin response is the exosomatic measure, where a tiny electric current is introduced at one location on the palm and measured at another.
This quantifies the conductivity of the skin, which is known to change with the activity of the sweat glands (i.e., the eccrine glands). This type of gland, located in the palms and the soles of the feet, responds to psychological factors rather than to temperature. Even very subtle changes of conductivity, not perceived as sweatiness, are captured by the GSR. The glands are innervated by the sympathetic branch of the ANS, and GSR is therefore considered a trustworthy measure of sympathetic activation. Emotional arousal, for example, was shown to express itself robustly in GSR responses to affective pictures [Lang et al., 1993; Codispoti et al., 2001], film clips [Frazier et al., 2004; Codispoti et al., 2008], environmental sounds [Bradley and Lang, 2000], and musical pieces [Zimny and Weidenfeller, 1963; Khalfa et al., 2002; Gomez and Danuser, 2004; Lundqvist et al., 2008].

Electrocardiography (ECG) Heart rate, blood pressure, blood volume, and blood flow are cardiovascular measures that tell us about heart function and vasomotor activity. The heart is a muscle that contracts about 60 to 75 times a minute. With each beat, blood is moved through the three circulatory systems: the pulmonary circulation (going to the lungs for the O2/CO2 exchange), the coronary circulation (going to the heart itself), and the systemic circulation (supplying the rest of the body with oxygenated blood). The heart thus responds to a number of factors that (might) influence the body's need for oxygen. Such preparatory responses are also observed after exposure to mental stressors that imply a potential need for bodily reaction. The heart responds to a stressor with an increase of its activity, an effect mediated by the activation of the sympathetic branch of the ANS. During periods of relaxation the heart rate decreases again. However, a stressor does not always lead to an increase of heart rate.
While an arousing stimulus might well activate the sympathetic branch of the ANS, as evident from increasing levels of GSR, it might at the same time lead to a decrease of heart rate compared to neutral stimulation. Such effects of “directional fractionation” of the ANS are evidence for the independent innervation of different organs by the sympathetic and parasympathetic branches. In the case of heart rate, such effects have been found especially for stimulation with mild stressors, such as arousing pictures [Codispoti et al., 2006; Aftanas et al., 2004], sounds [Bradley and Lang, 2000], and videos [Codispoti et al., 2008]. Stimulus valence has also been found to influence heart rate. Decreases for negative compared to positive stimuli were observed for pictures [Codispoti et al., 2006], sounds [Bradley and Lang, 2000], and music pieces [Sammler et al., 2007].

1.5.2 The Central Nervous System (CNS)

The Neural Structures of Affect The brain comprises a number of structures that have been associated with affective responses by different types of evidence. Much of the early evidence for the function of certain brain regions comes from observations of the detrimental effects of lesions in animals and humans. More recently, functional imaging approaches, such as PET or fMRI, have yielded insights into the processes occurring during affective responses in normal functioning (for reviews see [Phillips, 2003; Barrett et al., 2007; Kober et al., 2008; Lindquist et al., 2011]). Here we will only briefly discuss the most prominent structures that have been identified as central to the evaluation of the emotional significance of stimulus events and to the processes that lead to the emergence of the emotional experience. The interested reader can refer to the work of Barrett et al. [2007] for a more in-depth description of the structures and processes involved.
The core of the system involved in the translation of external and internal events into the affective state is built by neural structures in the ventral portion of the brain: the medial temporal lobe (including the amygdala, insula, and striatum), the orbitofrontal cortex (OFC), and the ventromedial prefrontal cortex (VMPFC). These structures compose two related functional circuits, which represent the sensory information about the stimulus event and its somatovisceral impact, as remembered or predicted from previous experience. The first circuit, composed of the basolateral complex of the amygdala, the ventral and lateral aspects of the OFC, and the anterior insula, is mainly involved in the gathering and binding of information from external and internal sensory sources. Both the amygdala and the OFC possess connections to the sensory cortices, enabling an interchange of information about the perceived events and objects. While the amygdala codes the original value of the stimulus, the OFC supposedly creates a flexible, experience- and context-dependent representation of the object's value. The insula represents interoceptive information from the inner organs and skin, playing a role in forming awareness of the state of the body. Through the integration of sensory information and information about the body's state, a value-based representation of the event or object is created. The second circuit, composed of the VMPFC (including the anterior cingulate cortex (ACC)) and the amygdala, is involved in the modulation of parts of this value-based representation via its control over visceromotor responses, such as autonomic, chemical, and behavioral responses. The VMPFC links the sensory information about the event, as integrated by the first circuit, to visceromotor outcomes.
It can be considered an affective working memory, which informs judgments and choices, and is active during decisions based on intuitions and feelings. Both circuits project directly and indirectly, for example via the ventral striatum, to the hypothalamus and brainstem, which are involved in a fast and efficient computation of object values and influence autonomic, chemical, and behavioral responses. The outcome of the complex interplay of ventral cortical structures, hypothalamus, and brainstem establishes the “core affective” state that the event induced: an event-specific perturbation of the internal milieu of the body that directs the body to prepare for the behavioral responses necessary to deal with the event. These responses include attentional orienting to the source of the stimulation and the enhancement of sensory processes. The perturbation of the visceromotor state is also the basis of the conscious experience of the pleasantness and of the physical and cortical arousal that accompany affective responses. However, as stated by Barrett et al. [2007], the emotional experience is unlikely to be the outcome of any single structure involved in establishing the “core affect”, but rather emerges at the system level, as the result of the activity of many or all of the involved structures1.

Correlates of Affect in the EEG

Time-domain Correlates A great part of the electroencephalographic emotion research focusing on the time domain explores the consequences of emotional stimulation on event-related potentials (ERPs). Event-related potentials are prototypical deflections of the

1 This constructivist position, readily compatible with functional appraisal models of emotion and with evidence collected by neuroimaging meta-analyses [Kober et al., 2008; Lindquist et al., 2011], is opposed by the localist position, which is defended by the proponents of basic emotion models.
For a neuroimaging meta-analysis supporting the localist position, see [Vytal and Hamann, 2010].

recorded EEG trace in response to a specific stimulus event, for example a picture stimulus. They are computed by sample-wise averaging of the traces following multiple stimulation events of the same condition, which attenuates those parts of the EEG trace that are not associated with the processes functionally involved in the response to the stimulus, such as artifacts or background EEG. Event-related potentials that have been shown to be responsive to affective manipulations can be found in the form of early and late potentials. Early potentials, for example the P1, N1, or N2, loosely defined as those deflections of the EEG trace occurring within 300 ms after stimulus onset, indicate processes involved in the initial perception and automatic evaluation of the presented stimuli. They have been found to be affected by the emotional value of a stimulus, differentiating negative from positive or low- from high-arousal stimuli. However, the evidence is far from parsimonious, as the variety of findings shows [Olofsson et al., 2008]. Late event-related potentials, those deflections observed after 300 ms, are supposed to reflect higher-level processes, which are already more amenable to the conscious evaluation of the stimulus. The two most prominent potentials that have been found susceptible to affective manipulation are the P300 and the late positive potential (LPP). The P300 has been associated with attentional mechanisms involved in the orientation toward an especially salient stimulus, for example very rare (deviant) or expected stimuli [Polich, 2007]. Coherently, P300 components show a greater amplitude in response to highly salient emotional stimuli, especially aversive ones [Briggs and Martin, 2009].
The LPP has been observed after emotionally arousing visual stimuli [Schupp et al., 2000; Cuthbert et al., 2000], and has been associated with a stronger perceptual evaluation of emotionally salient stimuli, as evidenced by increased activity of posterior visual cortices [Sabatinelli et al., 2006]. As the averaging of several epochs of EEG traces with respect to the onset of a repeatedly presented stimulus is not feasible in real-world applications, the use of such time-domain analysis techniques for affective BCIs is limited. An alternative to event-related potentials that is more feasible in a context without known stimulus onsets or repetitive stimulation are the effects on brain rhythms observed in the frequency domain.

Frequency-domain Correlates The frequency domain can be investigated with two simple but fundamentally different power-extraction methods, yielding evoked and induced oscillatory responses to a stimulus event [Tallon-Baudry et al., 1999]. Evoked frequency responses are computed by a frequency transform applied to the averaged EEG trace, and thus yield a frequency-domain representation of the ERP components. Induced frequency responses, on the other hand, are computed by applying the frequency transform to the single EEG traces and then averaging the resulting frequency responses. They therefore also capture oscillatory characteristics of the EEG traces that are not phase-locked to the stimulus onset, and hence averaged out in the evoked oscillatory response. In a context where the mental states or processes of interest are not elicited by repetitive stimulation with a known stimulus onset and short stimulus duration, the use of evoked oscillatory responses is as limited as the use of ERPs. Therefore, the induced oscillatory responses are of specific interest to us.
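The difference between the two orders of operations can be made concrete in a few lines of Python. The following is a toy illustration, not taken from any analysis package: it uses a pure-Python discrete Fourier transform on synthetic trials, and all function names are ours. Two trials carry the same oscillation with opposite phase, i.e., an oscillatory response that is not phase-locked to stimulus onset; it cancels in the evoked measure but survives in the induced one.

```python
import cmath
import math

def power_at(trace, k):
    """Power of `trace` at DFT frequency bin k."""
    n = len(trace)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(trace))
    return abs(coeff) ** 2 / n

def evoked_power(trials, k):
    """Evoked response: transform the averaged trace.
    Only activity phase-locked to stimulus onset survives the averaging."""
    n = len(trials)
    avg = [sum(samples) / n for samples in zip(*trials)]
    return power_at(avg, k)

def induced_power(trials, k):
    """Induced response: transform each trial, then average the powers.
    Non-phase-locked oscillations are retained as well."""
    return sum(power_at(trial, k) for trial in trials) / len(trials)

# Two synthetic trials with the same oscillation in opposite phase,
# i.e., an oscillation that is not phase-locked to stimulus onset.
n, k = 64, 4
trial_a = [math.sin(2 * math.pi * k * t / n) for t in range(n)]
trial_b = [-x for x in trial_a]

print(evoked_power([trial_a, trial_b], k))   # ~0: the oscillation averages out
print(induced_power([trial_a, trial_b], k))  # >0: the oscillation is retained
```

Real analyses would of course use windowed spectral estimates on multi-channel EEG rather than a single-bin DFT, but the ordering of averaging and transformation is exactly the distinction described above.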
The analysis of oscillatory activity in the EEG has a tradition reaching back almost 90 years, to the 1920s, when Hans Berger reported the existence of certain oscillatory characteristics in the EEG, now referred to as the alpha and beta rhythms [Berger, 1929]. The decades of research since then have led to the discovery of a multitude of cognitive and affective functions that influence the oscillatory activity in different frequency ranges. Below we will briefly review the conventional broad frequency bands, namely delta, theta, alpha, beta, and gamma, their supposed main functions, and their association with affect. A more thorough overview of the different frequency bands, their neural generators, and their functional associations can be found in Niedermeyer [2005] and, for the low frequency bands (delta, theta, alpha), in Knyazev [2007].

The delta frequency band comprises the frequencies between 0.5 and 4 Hz. Delta oscillations are especially prominent during the late stages of sleep [Steriade et al., 1993]. During waking, however, they have been associated with motivational states as observed during hunger and drug craving. In such states, they are supposed to reflect the workings of the brain reward system, some structures of which are believed to be generators of delta oscillations [Knyazev, 2012]. Delta activity has also been identified as a correlate of the P300 potential, which is seen in response to salient stimuli. This has led to the belief that delta oscillations play a role in the detection of emotionally salient stimuli. Congruously, increases of delta band power have been reported in response to arousing stimuli [Aftanas et al., 2002; Balconi and Lucchiari, 2006; Klados et al., 2009].

The theta rhythm comprises the frequencies between 4 and 8 Hz.
Theta activity has been observed in a number of cognitive processes, and its most prominent form, fronto-medial theta, is believed to originate from limbic and associated structures (i.e., the ACC) [Başar et al., 2001]. It is a hallmark of working memory processes (see [Klimesch, 1996; Kahana et al., 2001; Klimesch et al., 2008] for reviews), and has been found to increase with higher memory demands in various experimental paradigms [Jensen and Tesche, 2002; Osipova et al., 2006; Sato and Yamaguchi, 2007; Gruber et al., 2008; Sato et al., 2010]. Specifically, theta oscillations are believed to subserve central executive functions, integrating different sources of information, as necessary in working memory tasks [Mizuhara and Yamaguchi, 2007; Kawasaki et al., 2010]. Concerning affect, early reports mention a “hedonic theta” occurring upon the interruption of pleasurable stimulation. However, studies in children between 6 months and 6 years of age showed increases in theta activity upon exposure to pleasurable stimuli [Niedermeyer, 2005]. Recent studies on musically induced feelings of pleasure and displeasure found an increase of fronto-medial theta activity, originating from ventral structures, with higher valence [Sammler et al., 2007; Lin et al., 2010]. For emotionally arousing stimuli, increases in theta band power have been reported over frontal [Balconi and Lucchiari, 2006; Balconi and Pozzoli, 2009] and over frontal and parietal regions [Aftanas et al., 2002]. Congruously, a theta increase was also reported during anxious personal compared to object rumination [Andersen et al., 2009].

The alpha rhythm comprises the frequencies between 8 and 13 Hz.
It is most prominent over parietal and occipital regions, especially when the eyes are closed, and decreases in response to sensory stimulation (most strongly during visual, more weakly during auditory and tactile stimulation) or during mental tasks. More anterior alpha rhythms have been specifically associated with sensorimotor activity (the central mu rhythm [Pfurtscheller et al., 2006]) and with auditory processing (the tau rhythm [Lehtelä et al., 1997]). The observed decrease of the alpha rhythm in response to (visual) stimulation, the event-related desynchronization in the alpha band, is believed to index increased sensory processing, and has hence been associated with an activation of task-relevant sensory cortical regions. The opposite phenomenon, an event-related synchronization in the alpha band, has been reported in a variety of studies on mental activities, such as working memory tasks, and is believed to support an active process of cortical inhibition of task-irrelevant regions [Klimesch et al., 2007]. The most prominent association between the alpha rhythm and affective states has been reported in the form of frontal alpha asymmetries [Coan and Allen, 2004], which vary as a function of valence [Silberman, 1986] or motivational direction [Davidson, 1992]. The different lateralization of frontal alpha power during positive or approach-related emotions compared to negative or withdrawal-related emotions is believed to originate from the differential activation of left and right prefrontal structures involved in affective processes. To date, evidence for alpha asymmetry has been found in response to a variety of different induction procedures, using pictures [Huster et al., 2009; Balconi and Mazza, 2010], music pieces [Schmidt and Trainor, 2001; Altenmüller et al., 2002; Tsang et al., 2006], or film excerpts [Jones and Fox, 1992].
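A common operationalization of this asymmetry is the difference in log-transformed alpha power between homologous right and left frontal electrodes (e.g., F4 and F3). The following minimal Python sketch illustrates the computation; the function names, the spectra, and the 8–13 Hz band limits are ours and purely illustrative, not taken from any toolbox.

```python
import math

def band_power(freqs, psd, lo, hi):
    """Mean power within [lo, hi] Hz of a power spectral density estimate."""
    vals = [p for f, p in zip(freqs, psd) if lo <= f <= hi]
    return sum(vals) / len(vals)

def frontal_alpha_asymmetry(freqs, psd_left, psd_right, band=(8.0, 13.0)):
    """ln(alpha power right) - ln(alpha power left), e.g., at F4 vs. F3.

    Because alpha power is inversely related to cortical activation,
    positive values are commonly read as relatively greater left-frontal
    activation, associated with positive or approach-related affect.
    """
    lo, hi = band
    return (math.log(band_power(freqs, psd_right, lo, hi))
            - math.log(band_power(freqs, psd_left, lo, hi)))

# Hypothetical spectra (1-30 Hz, 1 Hz resolution): more alpha on the right.
freqs = list(range(1, 31))
psd_left = [2.0] * 30
psd_right = [4.0 if 8 <= f <= 13 else 2.0 for f in freqs]
print(frontal_alpha_asymmetry(freqs, psd_left, psd_right))  # > 0
```

The log transform is typically used because raw band power is strongly skewed across individuals; the difference score additionally cancels individual differences in overall alpha power common to both hemispheres.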
Despite the large body of evidence for frontal alpha asymmetries, several studies show that the phenomenon is not always observed in response to affective stimulation [Schutter et al., 2001; Sarlo et al., 2005; Winkler et al., 2010; Fairclough and Roberts, 2011; Kop et al., 2011]. The alpha rhythm has also been associated with a relaxed and wakeful state of mind [Niedermeyer, 2005]. Coherently, increases of alpha power are observed during states of relaxation, as indexed by physiological measures [Barry et al., 2007, 2009] and subjective self-report [Nowlis and Kamiya, 1970; Teplan and Krakovska, 2009].

The beta rhythm comprises the frequencies between 13 and 30 Hz. Central beta activity is associated with the somatosensory mu rhythm; it decreases during motor activity or tactile stimulation and increases afterwards. This has led to the view that the beta rhythm is a sign of an “idling” motor cortex [Pfurtscheller, 1996]. A recent proposal for a general theory of the function of the beta rhythm suggests that beta oscillations impose the maintenance of the sensorimotor set for the upcoming time interval (or “signal the status quo”; see [Engel and Fries, 2010] for a review). Concerning affect, increases of beta band activity have been observed over temporal regions in response to visually induced and self-induced positive, compared to negative, emotions [Cole and Ray, 1985; Onton and Makeig, 2009]. A general decrease of beta band power has been reported for stimuli that had an emotional impact on the subjective experience, compared with those that were not experienced as emotional [Dan Glauser and Scherer, 2008].

The gamma rhythm comprises the frequencies above 30 Hz. Gamma band oscillations are supposed to be a key mechanism in the integration of information represented in different sensory and non-sensory cortical networks [Fries, 2009].
Accordingly, they have been observed in association with a number of cognitive processes, such as attention [Gruber et al., 1999; Müller and Gruber, 2001; Gruber et al., 2008], multi-sensory integration [Senkowski et al., 2008, 2009], and memory [Jensen et al., 2007]. Concerning valence, temporal gamma rhythms have been found to increase with increasing valence [Müller et al., 1999; Onton and Makeig, 2009]. For arousal, posterior increases of gamma band power have been associated with the processing of high- versus low-arousing visual stimuli [Keil et al., 2001; Aftanas et al., 2004; Balconi and Pozzoli, 2009]. Similarly, increases of gamma activity over somatosensory cortices have also been linked to the awareness of painful stimuli [Gross et al., 2007; Hauck et al., 2007; Senkowski et al., 2011; Schulz et al., 2012]. However, Dan Glauser and Scherer [2008] found lower gamma power for stimuli with an emotional impact on the subjective experience compared to those without.

Summarizing, the different frequency bands of the EEG have been associated with a multitude of cognitive functions. Consequently, it is rather unlikely that simple one-to-one mappings between any oscillatory activity and a given cognitive function will be found. Nevertheless, there is an abundance of studies evidencing the association of brain rhythms with affective responses. Affective brain-computer interfaces can thus make use of the frequency domain as a source of information about their users' affective states. This thesis aims to explore the relationship between oscillatory activity and affect in the context of human-media interaction. Below we introduce the studies that we have conducted in this regard, each exploring a different aspect of the way toward affective brain-computer interfaces.

1.6 Research Questions

The main focus of this work is the question whether, and by means of which signals, neurophysiological activity can inform about the affective state of a user.
Specifically, we investigate whether neurophysiological activity measures carry affective information in the context of human-media interaction. We apply several approaches to answer this question: the implementation of a simple literature-informed affect indicator in a computer game; the induction and study of affective states with normal, multimodal stimuli, namely music clips; and the induction and study of affective states with a controlled auditory and visual affect induction procedure, to identify those affective responses specific to the sensory modality and to explore more general correlates. Below, we will briefly outline the rationale and hypotheses for the three approaches. 1.6.1 Evaluating Affective Interaction Most experiments studying the correlates of affective responses are conducted in well-controlled and restricted environments. Most affect induction studies use a passive presentation paradigm, in which the participants are asked to keep still, to avoid movements, and to concentrate on the stimulation. These studies have produced a rich literature on the neurophysiological correlates of affect, which seems a solid basis for the use of neurophysiological indicators of affect in human-computer interaction. However, the restrictive context of the studies exploring affect is different from the situation encountered in human-computer interaction scenarios, for example during gaming, e-learning, or music listening. The participants (users) cannot be asked to avoid movement or to concentrate on only one task. On the contrary: especially during normal gaming they will move and have one or multiple taxing tasks. This raises the question of whether an indicator of affect identified from the literature can be used during such normal human-computer interaction as an auxiliary input modality.
We implemented a simple computer game that uses a neurophysiological indicator of affect, associated by several studies with the relaxation/arousal state, to control an aspect of the game dynamics, while the user controls the game mainly with a conventional input modality, the keyboard. We used a double-blind study design, manipulating the neurophysiology-based control, and employed a subjective game experience questionnaire as well as physiological and neurophysiological measures to validate the workings of the neurofeedback control. Summarizing, we explored the use of a neurophysiological indicator of affect, identified from the literature, as an auxiliary control modality in a computer game. Can the literature be used as a heuristic to identify possible features for affective BCIs? What are potential issues that complicate such transfers of knowledge from basic to applied science? 1.6.2 Inducing Affect by Music Videos As mentioned above, there is a multitude of studies of the neurophysiological correlates of affect. These studies used a variety of affect induction protocols to induce the affective states. The protocols can be roughly distinguished in terms of their stimulation method: they either use self-induction (asking participants to remember or imagine certain affective states) or external stimuli (pictures, sounds, music pieces). Those studies applying external stimulation mostly used unimodal stimuli, mainly briefly presented pictures, to elicit emotional responses. While these approaches yielded a number of interesting findings, enabling the identification of certain mechanisms involved in emotional responses, such unimodal stimuli differ from the usually multimodal nature of stimulation in our environment.
Similar to the concerns voiced above about the generalization of effects to a computer game, it is interesting to see whether neurophysiological correlates of affect are also observed using multimodal stimuli as encountered in daily life, namely music videos. Furthermore, we believe that multimodal stimuli, especially rather realistic and natural ones, are a strong means to efficiently induce affective responses – a prerequisite for the (subject-dependent) training of affect classifiers. To induce affective responses in all four quadrants of the valence-arousal space, we selected 40 one-minute excerpts of music videos, partially based on their associated affective tags on the web-based music platform last.fm, and validated them via an online study. We used subjective self-assessments and physiological sensors to validate the manipulation of valence and arousal. We explored the neurophysiological correlates of valence, arousal, and subjective preference/liking. Summarizing, we explored the capability of music videos to induce affective states varying in valence and arousal. Moreover, we explored the neurophysiological correlates of valence and arousal induced by such natural and complex stimuli. 1.6.3 Dissecting Affective Responses Theories of affect suggest that affective responses are compound responses, consisting of the orchestrated responses of several subsystems or components. For example, according to Scherer [2005], the brain is directly involved in several of these components: information processing, feeling-monitoring, and behavior planning. Depending on the specific situation in which the response occurs, and on the affordances associated with that situation, the involvement of specific subcomponents might differ. Furthermore, each situation might have its specific co-activations, not directly related to the affective response, but nevertheless present in the neurophysiological signals.
Depending on the context in which an affective response is elicited, different correlates of affective responses might be observed. For a reliable and valid recognition of affective states, we have to know about the nature of the neurophysiological correlates involved in affective responses. For an affective BCI, contextual co-activations might be a vital part of the neurophysiological signature that discerns different user states, and hence will be part of the pattern used to differentiate them. Such context-specific correlates of affective responses, however, could change as the situation changes. An example is the affective response associated with sensory processing, such as orienting toward an emotionally salient stimulus, which is largely specific to the different sensory inputs. These modality-specific co-activations of affect, hence, might impede the reliability of affect classifiers in complex, multimodal environments in which the sensory origin of the affective stimulation can change, thereby changing the modality-specific part of the neurophysiological signature of affect. Therefore, it is of interest to the field of affective BCI to find methods that are able to discern context-specific and general correlates of affect. We devised an audio-visual affect induction protocol, using normed and well-established affective stimulus sets, to independently vary the modality and value of the induced affect. Specifically, we manipulated valence and arousal either with pictures, with sounds, or with bi-modal stimulation. We validated the induction protocol with subjective self-assessments and physiological measurements. We evaluated the effect of affective stimulation (vs neutral) against the background of the literature suggesting modality-specific responses in alpha band power. We further explored general neurophysiological correlates of valence and arousal, occurring for all stimulation modalities.
Summarizing, we asked whether we can experimentally separate modality-specific and general correlates of affective responses. We explored both specific and general effects and related them to the neuroscientific literature on affective and cognitive processes. 1.7 Thesis Structure In this thesis, we explore the suitability of neurophysiological signals, recorded with electroencephalography, for the differentiation of affective states in complex, multimodal contexts. We study the neurophysiology of affective responses in different settings, from complex real-life environments to carefully controlled laboratory experiments. We explore frequency-domain EEG correlates of affect in active and passive HCI settings, and test hypotheses about the context-dependency of affective responses. In Chapter 2, we will describe a study in which we employ a brain-based indicator of the affective state of self-induced relaxation to enable the user to control part of the game dynamics in a simple computer game. We evaluate the capability of the user to control his relaxation state, and thereby its indicator, via the manipulation of the indicator feedback (real vs sham). Assuming the necessity of rewarding feedback for the acquisition of the skill of mental control, we hypothesize an effect of the feedback manipulation on the user experience, on physiological indicators of relaxation, and on the brain-based indicator (alpha power) itself. We will discuss the absence of effects of the feedback manipulation with regard to the reliability of simple neurophysiological indicators in the complexity of real-world settings, participant training, and feedback saliency. In Chapter 3, we use natural multimodal stimuli, namely commercial music clips, to enable a strong emotion induction in a passive task. We selected clips on the basis of their user tags on the web 2.0 music platform last.fm, so that we would have 10 clips for each quadrant of the valence-arousal emotion space.
We will discuss the observed correlates of valence, arousal, and subjective preference in the different broad frequency bands we investigated, with the most significant effects found in the higher frequencies over temporal regions. Possible ways to improve classification accuracies might be found through an understanding of the complexity of affective responses to audio-visual stimuli. Hence, in Chapter 4, we investigate the context-dependency of affective responses, specifically their dependency on the sensory modality of emotion induction. To that end, we induced affective states via visual, auditory, and audio-visual stimuli. We will discuss the modality-specific and general correlates of affective responses that we found. The ensuing consequences for the construction and use of aBCIs in complex, multimodal settings will be discussed, as well as promising future directions for aBCI research. Chapter 5 will discuss the results and conclusions obtained in all three studies from a more general perspective. We will reflect on the answers to the above formulated research questions, and discuss their consequences for the central question that the present work focuses on: Are neurophysiological measurements informative regarding the affective state of users in the context of human media interaction? Chapter 2 Evaluating Affective Interaction 2.1 Introduction1 Affective BCI, as one instance of affective computing, offers new and exciting means for HCI, as the continuous and unobtrusive sensing of the affective state has hitherto been very difficult, if not impossible. A potential area of application for affect sensing in HCI are computer games, which offer several ways of integrating information about the user state [Gilleade et al., 2005; Plass-Oude Bos et al., 2010; Chanel et al., 2011]. For example, the game can counteract states of boredom or frustration by adjusting game difficulty, thereby keeping the player motivated and engaged.
The user's affective state can also be merely reflected in the appearance or behavior of the avatar or non-player characters, giving the player a deeper feeling of presence in the virtual world of the game. Finally, the players can be enabled to consciously control aspects of the game with self-induced affective states, offering an additional input modality that in turn might be used to encourage a desirable and healthy emotional state, for example relaxation, in the same way as physical exertion games encourage physical well-being. In this chapter, we explore the use of neurophysiological activity as an additional, auxiliary control modality in a computer game, enabling the player to control game difficulty by relaxation. We implemented a simple computer game that the players controlled mainly with the keyboard. A parameter of the game, opponent speed, was controlled by the player's neurophysiological activity measured by EEG, supposedly indicating the level of relaxation. Our experiment aimed at studying the effects of this auxiliary BCI control. Specifically, we employed a user experience questionnaire and physiological measures in a double-blind study design to validate the multimodal BCI gaming approach. Below, we will shortly motivate our approach to the integration of an affective BCI as an additional modality in a rather conventional computer game. Then we will outline the system characteristics and the evidence for the chosen neurophysiological indicator of relaxation. 1 The work presented here is based on a pilot study in which the game prototype was evaluated [Mühl et al., 2009a]. Based on the results of this work, the experiment methodology was elaborated, including two levels of game difficulty and a double-blind design, leading to the present study, which was published in the Journal of Multimodal User Interfaces [Mühl et al., 2010].
Finally, we will explicate the hypotheses underlying the experiment. 2.1.1 From Proof-of-Concept to Hybrid BCI Games In addition to the potential applications made possible by affect sensing technology, there are several good reasons to study BCI in the context of computer games. Firstly, computer games can be simulations of the real world, without the limitations or dangers that (disabled) people have to face in the real world. In such environments people can safely learn to use BCI, or BCIs that are supposed to work in the real world can be trained under realistic, but safer circumstances [Lecuyer et al., 2008; Nijholt et al., 2009]. Secondly, computer games allow the exploration of neurophysiological correlates of affective states that are difficult to measure in the real world, due to the restrictions that current signal acquisition devices impose, mainly the susceptibility to movement artifacts, or due to the rare occurrence and difficult validation of episodes of such states in complex, real-world environments (for example, the induction and exploration of frustration during human-computer interaction [Reuderink et al., 2009]). Finally, gamers are early adopters of novel technologies. They are willing to accept a certain degree of suboptimal performance [van de Laar et al., 2011], they welcome the challenge of the skill acquisition necessary to use novel interfaces, such as BCIs [Wolpaw et al., 2002], and they seek novel experiences that BCI applications are able to offer. Most of all, they are a target group that is willing to spend a large amount of money on new and hedonistic experiences, which makes the gaming industry keen on providing these kinds of experiences.
Making BCIs part of the novel input modalities that will enter the mainstream market in the next years - a process that has already begun with companies developing consumer EEG headsets, such as Emotiv® or NeuroSky® - therefore holds promise for the development of hardware and software with the financial means of the entertainment industry, from which patient populations will also profit. There are already a number of games that use BCI as an input modality. They can be roughly distinguished on the basis of the type of brain activity, or the BCI paradigms, they use [Nijholt et al., 2009]: internally generated (induced) versus externally triggered (evoked) neurophysiological activity. A popular example of internally generated activity is motor imagery [Lotze and Halsband, 2006], in which the user imagines movement of certain body parts, leading to a desynchronization of the mu-rhythm over the respective motor regions. Most often left or right hand imagined movements are used, which consequently lead to the desynchronization of the alpha band over the right or left central cortices, respectively. An example of a simple game-like application using motor imagery is "Brain Basher", in which the user is asked to produce specific imageries in response to visual cues, earning points for the right responses [Bos and Reuderink, 2008; Nijholt et al., 2009]. An extraordinary approach using motor imagery is the brain pinball machine, in which left and right flippers are worked via imagery [Tangermann et al., 2009]. Another interesting approach to BCI gaming - and probably the first affective BCI game - is the two-player game "Brainball", in which the player is instructed to self-induce a state of relaxation [Hjelm and Browall, 2000; Hjelm, 2003]. If successful, and not having fallen asleep, the more relaxed player will witness his triumph as a small silver ball slowly rolls over the goal line of his opponent.
More recently, a variety of commercial games using a one-electrode headset (NeuroSky®) were brought onto the mass market. These games attempt to reflect the user's current state of concentration (attention) or relaxation (meditation) in the form of an air flow lifting a ball, which has to be held at a certain level (Force Trainer, Uncle Milton®) or guided through a moving ring of obstacles (Mindflex, Mattel®). Examples of externally triggered activity are visual P300 and SSVEP (steady-state visually evoked potential) approaches, which use visual focus and attention to an external stimulation to generate the neurophysiological activity. The user selects a target by shifting the gaze to it, or simply by attending to it covertly, without moving the eyes. The first published BCI application, by Vidal [1973], is an example of an SSVEP-based BCI game. The user controls the movement of his avatar through a maze by directing the gaze at checkerboard stimuli flickering at different frequencies at the different edges of the labyrinth. More recent examples include the control of a toy car [Hongtao et al., 2010] or the balancing of a figure on a pole [Lalor et al., 2005]. A P300-based BCI game was created by Finke et al. [2009], enabling the user to control the movement of an avatar via his attention to flickering target locations in a three-dimensional environment. These games can be considered proofs of concept, showing that the use of neurophysiological activity for playful and entertaining HCI is possible. However, current BCIs suffer from a number of limitations. Neurophysiological measurements have in general a low signal-to-noise ratio (SNR), making approaches such as averaging over multiple trials or relatively long (seconds) measurements necessary for an acceptable accuracy of control. The SNR is further decreased in situations where excessive noise due to eye movement or muscle activity is present.
Consequently, BCIs have rather low accuracies, especially when the measurement is conducted under natural conditions that do not suppress movement. In its most extreme form, the low SNR leads to BCI "illiteracy": people unable to use a BCI device, or at least a certain BCI paradigm [Blankertz et al., 2009; Vidaurre and Blankertz, 2010]. Furthermore, trial averaging, prolonged recording durations, slow generation of mental activity, signal preprocessing, and classification might accumulate to relatively long system response times. In addition, most BCI paradigms currently need a number of training examples to train a subject-specific classifier, adding further to the temporal commitment that comes with BCI use. Finally, the production of mental activities recognizable by a BCI system can be fatiguing for the user. These limitations of BCI pose challenges for its use in applications for healthy users. Although patients for whom BCI is the only interaction modality left may be willing to accept these limitations, healthy subjects will be less forgiving. For them, many other input modalities are available. Although the novelty and challenge offered by BCI make playing such games worthwhile for a time, the above-mentioned limitations lead to rather restrictive game designs with low-dimensional control, slow pace, and high error rates. An alternative to purely BCI-controlled games are multimodally controlled games in which the player makes use of more traditional input modalities, enabling a wide variety of game dynamics, while BCI offers an additional level of control that has to be mastered either continuously or at specific points during the game. Such hybrid BCIs, the combination of a BCI and another control modality [Pfurtscheller et al., 2010], allow the incorporation of the positive characteristics of BCI, while limiting the impact of its negative characteristics.
Examples of such hybrid BCIs in games are still rare. In "αWorld-of-Warcraft" a neurophysiological measure of the player's state of relaxation is used to change the shape of the player's avatar in accordance with the affective state, while the avatar's actions in the massively multiplayer online role-playing environment of World-of-Warcraft are controlled by the conventional controls [Nijholt et al., 2009]. Gürkök et al. [2011] developed a game that combined SSVEP-based selection with mouse pointing for the control of several avatars, which were selected by focusing on their associated SSVEP stimuli and guided by mouse to herd sheep to a specific goal location. Another recent game, using eye gaze and motor imagery to control a racing car, was developed by Finke et al. [2010]. Commercially, several companies have enabled the combination of BCI devices with other input modalities, for example to trigger a gun by producing a certain target frequency power while navigating via conventional control modalities2, or to manipulate virtual objects by previously associated and trained mental activity3. The present work suggests and evaluates a hybrid BCI for the continuous manipulation of an aspect of the game dynamics during conventional game playing. Specifically, a version of internally generated brain activity, the self-induced affective user state of relaxed wakefulness, is used. Below, the working principle of the affect feedback is described, motivating our choice of the neurophysiological indicator. 2.1.2 Relaxation in the Biocybernetic Loop In the present study, the aim was the incorporation of a self-induced affective state, in addition to conventional keyboard control, as an auxiliary control modality that is supposed to have a positive effect on the state of the user. We chose relaxation as the psychophysiological state to foster, due to its relevance for psychological and bodily well-being.
The user controlled a game element, opponent speed, by means of his relaxation state. During the game, relaxation should be a desirable state, as it is indirectly rewarded with advantages in game play (i.e., lower opponent speed and hence easier game play). A similar interface was implemented earlier in the simple game "Brainball" [Hjelm and Browall, 2000; Hjelm, 2003] and by George et al. [2011]. However, Hjelm's player used EEG activity exclusively in order to compete with another player, with the goal of achieving a greater state of relaxation than the opponent. George's player, on the other hand, had no other task than controlling an object with self-induced relaxation. In the present work, the goal of the player was to chase and eat opponents using the keyboard, while a state of greater relaxation would simplify the task. 2 http://www.ocztechnology.com/nia-game-controller.html 3 http://www.emotiv.com/ This specific aBCI approach, making use of immediate feedback of neurophysiological activity that is associated with an affective state, can be described with the concept of the biocybernetic feedback loop [Fairclough, 2010]. A physiological measurement that is related to the psychophysiological state, such as relaxation, is taken. The application reacts to the observed physiological indicator of relaxation in terms of a previously defined mapping between indicator and system parameters. This reaction results in sensory feedback of the measured state to the user. This feedback functions as a reminder and guide to the user to maintain or deepen the state of relaxation. Two functionally different feedback mechanisms can be applied: negative and positive feedback [Fairclough, 2009]. Negative feedback aims at the minimization of the difference between the defined goal state and the current user state, keeping the user in a desired goal zone (i.e., reaching and staying in a state of high relaxation).
Positive feedback aims at the constant increase of the desired psychophysiological state, by adjusting the goal constantly upwards (i.e., setting a new goal). While the former feedback mode is optimal for safety systems, the latter method might be more interesting to gamers, requiring constant striving for improvement, and was thus implemented in the present study. The identification of a valid and reliable measure of the user state is a critical aspect of such a system. As laid out by Fairclough [2010], there are multiple possibilities for how physiological measures can map to psychological concepts, such as the affective state. The easiest and most straightforward relationship between measurement and concept is the one-to-one mapping, where a specific measurement informs about a specific concept. Other mappings are the one-to-many, many-to-one, and many-to-many mappings, which reflect the complexity of the interrelations of different measurements with different psychophysiological concepts. For the purpose of this study, we identified parietal alpha activity as a potential indicator of relaxation. Parietal alpha is associated with affective processes, but also with cognitive functions. It is thus interesting whether this activity, constituting a one-to-many mapping, can be used for the volitional control of a parameter in a computer game. 2.1.3 Alpha Activity as Indicator of Relaxation A conceptually similar approach to the implemented feedback mechanism can be found in neurofeedback methods [Hammond, 2007]. Neurofeedback is a variant of biofeedback that enables users to monitor and regulate body functions that are normally not accessible to them. Specifically, indicators of brain activity are made observable via visualisation or sonification. Examples of such indicators are the alpha band power (8-12 Hz), the sensorimotor rhythm (8-15 Hz), or slow cortical potentials.
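The two feedback modes distinguished above can be sketched in a few lines. The function names, the goal increment, and the threshold rule below are illustrative assumptions, not part of the cited feedback frameworks:

```python
# Negative feedback: steer the user toward a fixed goal state.
def negative_feedback(measure, goal):
    """Return an error signal to be minimized (distance to the goal state)."""
    return goal - measure

# Positive feedback: raise the goal each time the user reaches it.
def positive_feedback(measure, goal, step=0.1):
    """Return an updated goal, raised once the current state reaches it."""
    return goal + step if measure >= goal else goal

goal = 1.0
for alpha in (0.8, 1.0, 1.2):  # successive relaxation readings
    goal = positive_feedback(alpha, goal)
print(round(goal, 2))
```

In the negative mode the system rewards proximity to a fixed target; in the positive mode, implemented in the present study, the target itself moves upward as the user improves.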
The potential areas of application of neurofeedback techniques are manifold, and can be found in therapeutic contexts as well as for purposes of cognitive enhancement. Prominent current applications involve the treatment of epilepsy [Sterman and Egner, 2006], of attention-deficit hyperactivity disorder (ADHD) [Arns et al., 2009], and of substance abuse [Sokhadze et al., 2008]. In recent years, studies have also revealed performance increases in healthy subjects as a result of certain neurofeedback protocols [Gruzelier et al., 2006]. The rationale for these procedures is the assumption that training subjects to increase activation in relevant brain areas will improve the performance aspects of associated functions. In general, a specific brain activity, identified on the basis of patient versus norm populations or on correlates of desired states or processes, is fed back to the user, rewarding its (increased) occurrence. We chose the feedback of alpha activity over parietal regions for its supposed relation to a state of relaxed wakefulness, as opposed to a state of stress or mental workload. As Niedermeyer [2005] states, the "relaxed waking state is the optimal condition for the posterior alpha rhythm". On the other hand, the alpha rhythm is "blocked or attenuated by attention, especially visual, and mental effort" [Shaw, 1996]. This relationship between alpha and relaxation was also observed in recent studies, which found increased alpha to correlate with a decreasing physiological indicator of arousal (skin conductance level) [Barry et al., 2007, 2009]. Teplan and Krakovska [2009] have shown the correlation of posterior alpha power with subjective ratings of the state of relaxation. Nowlis and Kamiya [1970] found that self-induced alpha, in response to a short neurofeedback training, led participants to report a feeling of "relaxation, letting go, and pleasant affect".
In that regard, it is interesting that alpha power is considered to be inversely related to neuronal activity [Laufs et al., 2003]. This has been interpreted as cortical idling [Pfurtscheller and Lopes da Silva, 1999]. However, the view that alpha power is only a consequence of inactivity is starting to shift, as alpha oscillations are now considered active mechanisms and functional correlates in perception, movement, and memory processes [Klimesch, 1999; Schürmann, 2001; Palva et al., 2007]. Furthermore, the feedback of alpha activity is considered an important tool for the treatment of anxiety disorders, or of general anxiety in non-patient populations, which are associated with an unusually strong activation of the sympathetic branch of the autonomic nervous system [Moore, 2000, 2005; Hammond, 2005]. 2.1.4 Hypotheses The purpose of the experiment was to examine the effect that α-feedback-based BCI has on game play. To assess the effect of the feedback of alpha power, a control condition with sham feedback was devised. The effects of this feedback manipulation were investigated via subjective and objective indicators. As subjective indicators of user experience, we compared the responses of the participants on the Game Experience Questionnaire (GEQ; [IJsselsteijn et al., 2007]) scales (competence, flow, tension, negative affect, positive affect) and on an additional α-control item. We expected the participants to be better able to control their relaxation state in the condition where they actually received feedback of it in terms of alpha power. This should be reflected in their game experience ratings as a higher feeling of α-control, less tension, and a higher competence rating, and consequently lead to lower negative and higher positive affect. In addition, we manipulated game difficulty to avoid a ceiling effect of relaxation, which would prevent users from self-inducing further relaxation.
We expected the difficulty manipulation to result in higher tension and a lower feeling of control and competence for difficult compared to easy gaming. H1a: α-feedback conditions lead to a greater feeling of α-control, less tension, lower negative and higher positive affect, and a higher feeling of competence, compared to the sham-feedback condition. H1b: The difficult game conditions lead to higher tension and a lower feeling of competence, compared to the easy game condition. As objective indicators of the effects of α-feedback, we recorded and analyzed heart rate and parietal alpha band power during the different feedback conditions. We expected heart rate decreases4 to reflect higher relaxation in the α-feedback conditions. For alpha band power, increases in the α-feedback conditions were assumed to reflect the efficacy of its feedback, rewarding the user's attempts at relaxation (see Section 2.1.3). H2: The heart rate decreases for the α-feedback compared to the sham-feedback conditions. H3: Alpha power increases for the α-feedback compared to the sham-feedback conditions. 2.2 Methods: System, Study Design, and Analysis In this section we will first describe the details of the game design and the mechanics of the system design. Then we will outline the methods, measures, and analysis we use to study the application of neurophysiology as an input modality in multimodal gaming. 2.2.1 Game Design and In-game EEG Processing Pipeline Bacteria Hunt is a hybrid affective BCI game, or a multimodal game using neurophysiological indicators of affective and cognitive states as an auxiliary control modality.
The neurophysiological control modality is based on parietal alpha activity, related to the relaxation/arousal state of the user, and on the steady-state visual evoked potential (SSVEP), assessing the visual and mental focus on an SSVEP stimulus presented in certain parts of the game (see Section 2.2.2 for a description of SSVEP stimulation). The SSVEP condition is of no direct relevance for the current study, but will be described for the sake of completeness. The game is composed of a game component and an EEG processing component. The game component manages the experimental stimulation (i.e., the conditions), while the EEG component analyzes the EEG signals and feeds control signals back to the game. This section provides the details of the implementation of these components and also proposes and evaluates the performance of a new SSVEP detection algorithm.

Game design

The game was built using the Game Maker™ development platform5. It is based on the prototype described in Mühl et al. [2009a]. In the following, those features that are of importance for this study will be described. Following the terminology of Aarseth [2003], the game can be decomposed into the game world (consisting of the objects that comprise the game) and its rules (defining the possible interactions of these objects).

4 The decrease of heart rate is supposed to be part of the "relaxation response" [Benson et al., 1974]; heart rate has been found to correlate positively with arousal, indicated by higher skin conductivity and subjectively perceived tension [Drachen et al., 2010] and by game difficulty manipulations [Chanel et al., 2011] in computer games.
5 http://www.yoyogames.com/gamemaker
Game world: The game world, as depicted in Figure 2.1, consists of several entities: the player avatar (the amoeba), targets (bacteria), a numeric representation of the points obtained so far, a graph depicting the recent history of alpha band power and SSVEP classification, and an SSVEP stimulus (which is always associated with one of the target items) when it is triggered. The numeric representation of points functions as a high-level indicator of progress in terms of the game rules, while the graph depicting alpha band power and SSVEP classification functions as a low-level indicator of the neurophysiological coupling of the player and the game.

Figure 2.1: The game world. Nine targets (orange "bacteria") and one player avatar (blue "amoeba") are present. The player's score is shown at the top left. The histogram above the line depicts the recent alpha band power; below the line the SSVEP classification results are marked.

Game rules: For experimental reasons, three levels of the game were created, as summarised in Table 2.1. Each level had two difficulty conditions: easy and difficult. Some general rules were the same for all levels of the game, others varied with the specific level.

Table 2.1: The game levels played during the study, and the manual or neurophysiological control mechanisms for the game difficulty (bacteria flight speed) and scoring (bacteria eating).

Game Level                    Flight speed of bacteria   Eating of the amoeba
Keyboard only (K)             Constant                   Key press
Keyboard+Alpha (KA)           α-based                    Key press
Keyboard+Alpha+SSVEP (KAS)    α-based                    SSVEP-based

General rules: Movement of the player avatar is performed using the arrow keys, resulting in direct feedback through the changed avatar position. The avatar position also jitters with some random noise in order to create the dynamic feel of a moving "amoeba". The game world always contains nine non-overlapping targets.
If the distance between the center of the avatar and that of a target is below a threshold radius, the target can be "eaten". Targets disappear when eaten and are replaced by a new target, randomly placed on a free space in the game world. Successful eating results in positive points, while eating failures result in negative points. The ultimate goal of the game is to obtain as many points as possible.

Rules for the flight of targets: Similar to the avatar, each target moves randomly within a short range, creating the feel of a dynamic game world. In addition, as the avatar approaches a target, the target flees. This flight behavior depends on several factors. Levels KA and KAS employ alpha neurofeedback, which inversely affects target flight: targets flee further if the scaled alpha power (Psca(fα), see Section 2.2.2) is low and less far if it is high, making the player's mental state control the difficulty of the hunt. Moreover, in an easy game targets flee less far than in a difficult game, where players have to chase their targets with great effort. Therefore, in an easy game target movements are generally slower and less frequent than in a difficult game.

Rules for points for eating targets: In all game levels, an unsuccessful eating attempt is penalized with -50 points. In levels K and KA, eating is triggered by pressing the Ctrl key. A successful eating attempt rewards the player with points proportional to the distance between the centres of the avatar and the closest target. In level KAS, approaching a target closer than the threshold radius triggers an SSVEP stimulus appearing next to the target. The association of the stimulus and the target is visualized by a line connecting both. The stimulus consists of a circle with a diameter of 128 pixels that flickers at a frequency of 7.5 Hz, changing from black to white. It is displayed for 6 seconds on the screen.
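The flight rule can be illustrated with a minimal sketch. The linear mapping and the base distances (`base_easy`, `base_hard`) are hypothetical placeholders: the thesis reports only that targets flee further with low scaled alpha power and in the difficult condition, not the actual game constants.

```python
def flee_distance(p_sca, difficult, base_easy=40.0, base_hard=80.0):
    """Hypothetical flight rule: targets flee further when the scaled
    alpha power p_sca (in [0, 1], high = relaxed) is low, and further
    still in the difficult condition. Base distances (pixels) and the
    linear mapping are illustrative assumptions, not the game's
    actual constants."""
    base = base_hard if difficult else base_easy
    return base * (1.0 - p_sca)
```

Under such a mapping, a fully relaxed player (p_sca = 1) would bring the targets to a halt, while an aroused player in the difficult condition would face the fastest-fleeing targets.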
This frequency-stimulus combination was shown to be highly efficient in the elicitation of an SSVEP response [Mühl et al., 2009a]. The SSVEP classification results (R(fH2,H3), see Section 2.2.2) are recorded within the time interval of 3-6 s after stimulus onset (this specific interval depends on the window size of 3 seconds used in the EEG processing). If the mean output of the SSVEP classifier during this interval is above 0.5, the target is eaten. In this case the player receives points proportional to his mean alpha power measured in the same interval. Thus, for players, it is of benefit to control both alpha band power and the SSVEP simultaneously.

2.2.2 EEG Processing

The EEG signals were continuously read at 512 Hz and processed in 3-second sliding windows, with an interval of 1 second between window onsets. We decided on this specific window length, as it gave superior classification results compared to shorter window lengths in conjunction with 7.5 Hz SSVEP stimulation [Mühl et al., 2009a]. Signal processing and machine learning steps were carried out in Python, using the Golem6 and Psychic7 libraries. The steps for processing a window are depicted in Figure 2.2. A window was first re-referenced using common average referencing (CAR), since no reference electrode was employed during the recordings. Then the total alpha power, P(fα), was obtained by averaging the individual amplitude values for each integer frequency in 8-12 Hz at channel Pz, computed by a 512-point fast Fourier transform (FFT). For game level K, all observed alpha power values P(fα) were collected in a vector, AK, for use in the next levels, and no further processing was done on the window.
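The per-window alpha extraction described above can be sketched as follows. The thesis does not state how the 3-second window maps onto the 512-point transform, so this sketch simply transforms the first 512 samples (one second at 512 Hz, giving 1 Hz bins) of the CAR-referenced window; treat that choice as an assumption.

```python
import numpy as np

FS = 512  # EEG sample rate (Hz)

def alpha_power(window, pz_index):
    """P(f_alpha): mean FFT amplitude over the integer frequencies
    8-12 Hz at channel Pz, after common average referencing (CAR).
    window: channels x samples array for one 3-second epoch."""
    car = window - window.mean(axis=0, keepdims=True)   # CAR
    # 512-point FFT -> 1 Hz bins at 512 Hz sampling (first second only)
    amp = np.abs(np.fft.rfft(car[pz_index], n=512)) / 512
    freqs = np.fft.rfftfreq(512, d=1.0 / FS)
    mask = (freqs >= 8) & (freqs <= 12)                 # bins 8..12 Hz
    return amp[mask].mean()
```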
In games KA and KAS, local minima and maxima (αmin and αmax) were defined for normalization as follows:

αmin = µ(AK) − 2σ(AK)  and  αmax = µ(AK) + 2σ(AK)    (2.1)

where µ(AK) and σ(AK) are the mean and standard deviation of the set AK. This way, assuming a normal distribution, the interval [αmin, αmax] would contain 95% of the cases and eliminate possible outliers. To obtain the final alpha power, P(fα) was mapped to a value in [0,1] as:

Psca(fα) = (P(fα) − αmin) / (αmax − αmin)    (2.2)

where Psca(fα) is the scaled alpha power value. In cases where Psca(fα) < 0 or Psca(fα) > 1, it was set to 0 or 1, respectively. For game KA, window processing ended here.

For game KAS, processing continued with SSVEP detection. An adapted version of the detection algorithm by Cheng et al. [2002] was implemented for this purpose (see Section 2.2.2 for implementation details and evaluation). The powers at the second (15 Hz) and third (22.5 Hz) harmonics of the stimulus flickering frequency (7.5 Hz) at channel Oz were extracted using a 1024-point FFT and averaged. Let P(fH2,H3) denote this average SSVEP response. P(fH2,H3) was standardised as follows:

z(fH2,H3) = (P(fH2,H3) − µ(f10−35)) / σ(f10−35)    (2.3)

6 https://github.com/breuderink/golem
7 https://github.com/breuderink/psychic

Figure 2.2: The pipeline depicting the processing steps each 3-second EEG epoch is submitted to. The signal is measured from the 32-channel EEG cap and referenced to the common average. Then the alpha band power (8-12 Hz) and the power for the SSVEP algorithm (15, 22.5 Hz) are extracted via FFT from the parietal electrode Pz and the occipital electrode Oz, respectively.
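Equations 2.1 and 2.2 amount to the following short sketch; the clipping to [0, 1] implements the out-of-range assignment described in the text.

```python
import numpy as np

def alpha_bounds(a_k):
    """Eq. 2.1: alpha_min / alpha_max from the alpha power values
    collected during level K (mean +/- 2 standard deviations)."""
    mu, sigma = np.mean(a_k), np.std(a_k)
    return mu - 2 * sigma, mu + 2 * sigma

def scale_alpha(p, a_min, a_max):
    """Eq. 2.2: map P(f_alpha) linearly to [0, 1], clipping values
    that fall outside [alpha_min, alpha_max]."""
    return float(np.clip((p - a_min) / (a_max - a_min), 0.0, 1.0))
```

Because the bounds are symmetric around the level-K mean, a window with exactly mean-level alpha power always scales to 0.5.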
While the alpha power (relaxation state) is directly translated into opponent speed, the power at the SSVEP frequencies is checked against a simultaneously computed threshold to decide whether an SSVEP response (focus on the flicker) was detected.

where z(fH2,H3) is the z-score, and µ(f10−35) and σ(f10−35) are the mean and standard deviation of the average power in the 10-35 Hz band. The SSVEP response, R(fH2,H3), was detected with respect to the threshold value of 0.4, determined as described in the next subsection, as:

R(fH2,H3) = 1 if z(fH2,H3) > 0.4, and 0 otherwise.    (2.4)

Once the values necessary for a game version were computed from the window, they were passed to the game via a TCP connection (see Table 2.2).

Table 2.2: Values sent through TCP for each game version.

Level   Alpha power   SSVEP response
K       Constant      Constant
KA      Psca(fα)      Constant
KAS     Psca(fα)      R(fH2,H3)

The Steady-State Visually-Evoked Potentials Control

Steady-state visually-evoked potentials (SSVEP) can be induced when a person is attending to a flickering visual stimulus, such as an LED. The frequency of the attended (and fixated) stimulus [Regan, 1989], as well as its harmonics [Müller-Putz et al., 2005], can then be traced in the EEG over the visual cortex, that is, at electrode locations O1, Oz, and O2 according to the 10-20 system. If several stimuli flicker at different frequencies, the attended frequency will dominate over the other presented frequencies in the observer's EEG [Skidmore and Hill, 1991]. A commonly used method to detect SSVEPs is to apply a fast Fourier transform to the EEG and compare the amplitudes in the frequency bins corresponding to the frequencies of the stimuli provided. If only one stimulus is used, the amplitude of the corresponding frequency bin is compared to a set threshold. Stimulus frequencies between 5-20 Hz elicit the strongest SSVEPs [Herrmann and Knight, 2001].
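Given a per-window power spectrum at Oz, equations 2.3 and 2.4 reduce to a few lines. This is a sketch: `spectrum` and `freqs` are assumed to come from the 1024-point FFT described in Section 2.2.2 (0.5 Hz bins).

```python
import numpy as np

THRESHOLD = 0.4  # estimated from the pilot data (see Section 2.2.2)

def ssvep_detected(spectrum, freqs):
    """Eq. 2.3 / 2.4: z-score the mean power at the 2nd and 3rd
    harmonics (15 and 22.5 Hz for 7.5 Hz stimulation) against the
    10-35 Hz band, and threshold the result."""
    p_h = spectrum[np.isin(freqs, (15.0, 22.5))].mean()
    band = spectrum[(freqs >= 10) & (freqs <= 35)]
    z = (p_h - band.mean()) / band.std()
    return 1 if z > THRESHOLD else 0
```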
SSVEPs depend on gaze, which means that users have to move their eyes or head to control an SSVEP-BCI with maximal efficiency. Covert attention to SSVEP stimuli has been shown to still allow the elicitation of SSVEPs [Morgan et al., 1996; Müller et al., 1998; Kelly et al., 2005; Allison and Polich, 2008]. However, when the stimulus is presented at an angle larger than three degrees from fixation, the detection of an SSVEP (and therefore also a covert-attention SSVEP-BCI) becomes unlikely [Thurlings et al., 2010].

The SSVEP detection algorithm used in our system is based on the algorithm described by Cheng et al. [2002]. Their algorithm works as follows: the SSVEP response is defined as the sum of the power at the fundamental frequency and the power at the second harmonic. This is compared to twice the power of a baseline; in other words, the average of the fundamental frequency and second harmonic has to be higher than the baseline to give a positive classification. The baseline is the mean power in the frequency spectrum between 4 and 35 Hz, determined empirically before the session with real-time feedback. A selection in their system is made when a particular option is selected for four consecutive windows. The window size is 512 samples, with a 60-sample inter-window interval, at a sample frequency of 200 Hz.

This algorithm has been adjusted in a few key ways, in order to make it more robust to spontaneous alpha activity and to personalize it without the need for prerecorded data. In time-frequency plots created from the data of the pilot experiments, clear power increases showed for the second and third harmonics for most of the subjects, but not always for the fundamental frequency. Based on this information, we decided not to look at the fundamental frequency and second harmonic, but at the second and third harmonics.
This means that there is no longer a need to rely on information from the alpha range. As activity in the alpha band is also used as a representation of relaxation, which the user can actively try to control or passively perceive, this is a way to separate the two paradigms somewhat. The lower bound of the baseline range has been increased to 10 Hz, in order to make it less susceptible to low-frequency artifacts, such as those caused by eye movements. Whereas Cheng et al. use a fixed baseline, the Bacteria Hunt algorithm first applies z-score normalization. This is a way to set a subject-independent threshold on previously subject-dependent data, without requiring recordings of the subject before starting the session with real-time feedback. It might also make the algorithm more robust to changes over time. Finally, Cheng et al. used electrodes O1 and O2, but did not describe how they merged the recordings from these two electrode sites, so we decided to rely for both algorithms on brain activity at the center of the occipital lobe (electrode Oz), where the SSVEP response was expected to be the strongest.

In short, in the Bacteria Hunt algorithm the SSVEP response is defined as the average of the power at the second and third harmonics. In the case of 7.5 Hz stimulation, this means the average of the power at 15 Hz and 22.5 Hz. To determine this power, a 1024-point fast Fourier transform is applied to the 3-second windows recorded at a 512 Hz sample frequency. This results in a frequency bin size of 0.5 Hz. The SSVEP response is normalized according to the data in the same window, from 10 Hz to 35 Hz: the mean and standard deviation of power over this frequency range are computed and used for the z-score normalization. If the normalized response is higher than the fixed threshold, an SSVEP has been detected. This threshold determines the detection accuracy, and therefore has to be estimated carefully.
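The spectral step can be sketched as below. The power definition (squared FFT magnitude) and the use of the first 1024 samples of the 3-second window are assumptions, as the thesis specifies only the transform length and the resulting 0.5 Hz bins.

```python
import numpy as np

FS = 512  # sample rate (Hz)

def oz_spectrum(oz_window):
    """1024-point FFT of a 3-second Oz window recorded at 512 Hz,
    yielding 0.5 Hz frequency bins. Power is taken as the squared
    magnitude, and only the first 1024 samples are transformed
    (both are assumptions; the thesis leaves these unspecified)."""
    n = 1024
    amp = np.abs(np.fft.rfft(oz_window[:n], n=n))
    freqs = np.fft.rfftfreq(n, d=1.0 / FS)  # 0.0, 0.5, 1.0, ... Hz
    return freqs, amp ** 2
```

The returned frequency/power pair feeds directly into the harmonic averaging and z-score normalization of equations 2.3 and 2.4.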
Threshold estimation and pilot evaluation

The threshold estimation was based on eight datasets gathered during preliminary experiments with six subjects. During these pilot studies, subjects played the Bacteria Hunt game, but were forced to have sufficient periods without SSVEP stimulation by having to wait at least six seconds before eating a new bacterium. This way, data was obtained with and without SSVEP stimulation within the environment later used in the game setting with real-time feedback. Based on this data, the threshold that would result, on average, in the optimal accuracy for SSVEP detection is 0.4 (Figure 2.3). This optimal threshold was used in the main experiment.

Figure 2.3: Mean accuracy per threshold, averaged over the eight datasets from the pilot experiment. The highest accuracy is yielded with a threshold of 0.4.

In order to decide which of the two classification algorithms, the original or the modified version, should be implemented for the game, the area under the curve (AUC) was used as a performance measure. AUC values are a measure of the discriminability of the two classes (SSVEP versus no SSVEP) based on the given values. To be exact: "The AUC of a classifier is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance. This is equivalent to the Wilcoxon test of ranks" [Fawcett, 2006]. The AUC is based on the receiver operating characteristic (ROC) curve, which is created by varying the threshold and plotting the resulting true positive rates (hits relative to the total number of windows during which the SSVEP stimulus was shown) and false positive rates (false alarms relative to the total number of windows during which no stimulus was provided). This means, on the one hand, that the class ratio of SSVEP versus no SSVEP has no influence on the resulting ROC or AUC value.
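The threshold search over the pilot windows can be reproduced with a simple grid sweep. This is a sketch; the candidate grid is an illustrative assumption, as the actual search grid is not reported.

```python
import numpy as np

def best_threshold(z_scores, labels, grid=None):
    """Sweep candidate thresholds and return the one maximizing mean
    detection accuracy over the pilot windows (labels: 1 = window
    recorded during SSVEP stimulation, 0 = without). The candidate
    grid is an illustrative assumption."""
    if grid is None:
        grid = np.arange(-1.0, 2.0, 0.05)
    accs = [np.mean((z_scores > t).astype(int) == labels) for t in grid]
    best = int(np.argmax(accs))
    return float(grid[best]), float(accs[best])
```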
Moreover, this performance measure is independent of the actually chosen threshold, focusing purely on the ability to discriminate based on this feature. These are two advantages this performance measure has over simpler measures such as accuracy (the number of hits and correct rejections relative to the total number of samples). An AUC value of 0.5 corresponds to random performance for a two-class problem; an AUC value of 1.0 indicates perfect discrimination. From the pilot studies, an average AUC value of 0.64 was computed for the original algorithm of Cheng et al., and of 0.74 for the Bacteria Hunt variant. Although this result is based on only eight datasets, the adapted algorithm is expected to perform better than the original; thus the new algorithm was selected for use with the game for real-time feedback.

Figure 2.4: ROC curves of the SSVEP detection algorithm by Cheng et al. (dashed) and the Bacteria Hunt algorithm. The Bacteria Hunt SSVEP algorithm performs better overall, though not significantly so.

Evaluation of the SSVEP detection algorithm

To compare the SSVEP detection algorithm used by Bacteria Hunt with the original developed by Cheng et al., the ROC curve was computed on a superset containing samples from the four game blocks of all nineteen subjects (refer to Section 2.2 for the experiment design). For the samples positive for SSVEP stimulation, the brain signal data recorded during the KAS level was epoched starting from one second after the SSVEP stimulation onset, with a duration of three seconds. The samples with no SSVEP stimulation were taken from the KA levels, which were simply epoched into subsequent 3-second windows. The result is a dataset with 1569 windows during SSVEP stimulation and 4560 windows without. See Figure 2.4 for the resulting ROC curves. The AUC value seems slightly larger for the Bacteria Hunt algorithm.
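The rank-statistic definition of the AUC quoted from Fawcett [2006] can be computed directly, without constructing the ROC curve:

```python
import numpy as np

def auc(pos_scores, neg_scores):
    """Probability that a randomly chosen positive (SSVEP) window
    receives a higher score than a randomly chosen negative one;
    ties count as half (the Wilcoxon rank statistic)."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

An AUC of 0.5 then corresponds to chance-level ranking and 1.0 to perfect separation, independent of the 1569/4560 class ratio.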
However, based on two ROC curves alone it is not possible to determine whether the difference is significant. Therefore the AUC values per block per subject were examined as well. A paired-samples t-test was used to compare the AUC values for the algorithm of Cheng et al. to the Bacteria Hunt version. There was a trend toward a higher performance for Bacteria Hunt (M=0.65, SD=0.14) than for Cheng et al. (M=0.62, SD=0.18), with t(75)=1.775, p=0.08. When comparing the performance for real-alpha versus sham-alpha blocks, or easy versus difficult games, there was no difference.

A possible problem for the estimation of an optimal threshold is the co-occurrence of SSVEP-unrelated physiological differences during the non-SSVEP trials during gaming. Normal gaming, in contrast to the concentrated fixation during SSVEP stimulation, might be contaminated by additional activity due to muscle activity, eye movements, and game-related cognitive activity. Such activity might influence the power in the relevant frequencies, that is, between 10 and 35 Hz, and hence bias the SSVEP recognition algorithm, which relies on a threshold estimated with relatively clean non-SSVEP trials. Therefore, the conditions during threshold estimation should match the conditions of application, to keep the bias as small as possible. Summarising, the modified version of Cheng et al. performed better overall than the original version. An optimal threshold estimation should take the context of use into account, specifically additional activities when no SSVEP stimulus is shown.

2.2.3 Participants

Nineteen participants (three female) took part in the experiment. The mean age was 29 years, ranging from 16 to 47 years. Eighteen participants were right-handed, one was left-handed. Informed consent was obtained from all subjects.
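The paired-samples t statistic used for the per-block AUC comparison can be sketched with plain NumPy (degrees of freedom df = n - 1; the p-value lookup via the t distribution is omitted here):

```python
import numpy as np

def paired_t(a, b):
    """Paired-samples t-test statistic for two matched sets of AUC
    values (e.g., per block per subject, yielding df = 75 in the
    study). Returns the t statistic and the degrees of freedom."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    n = d.size
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return t, n - 1
```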
To motivate participants to perform at their best, and thus to make active use of the advantages a high state of relaxation would bring, two cinema tickets were promised to the participant with the highest score. Six euros per hour or course credit were given as reimbursement to the participants.

2.2.4 Data acquisition and material

The game was run on a separate presentation computer (1.8 GHz Pentium M). The EEG signals were recorded at a 512 Hz sample rate via a BioSemi ActiveTwo EEG system and thirty-four Ag/AgCl active electrodes. Additionally, electrocardiogram (ECG), galvanic skin response (GSR), and blood volume pulse (BVP) were recorded. The data was processed and saved on a separate data recording and processing computer (2.53 GHz quad-core) running BioSemi ActiView software.

To acquire subjective responses from the participants, the Game Experience Questionnaire (GEQ) devised by IJsselsteijn et al. [2007] was used. The questionnaire comprises items that are accumulated into 7 scales, each reflecting a specific facet of game experience: competence, immersion, tension, flow, challenge, positive affect, and negative affect. The immersion scale provided by the GEQ was left out of the questionnaire due to its irrelevance for the current study. In addition to the items of the GEQ, we asked the participants directly to rate their perceived ability to control the alpha power by relaxing (α-control item). For all items, 5-point Likert scales ranging from 1 (not at all) to 5 (very) were used.

2.2.5 Experiment design

Effect of alpha neurofeedback

In order to test the hypothesis about the influence of neurofeedback, a two-factorial (2 (feedback) × 2 (difficulty)) experimental protocol was devised, resulting in four conditions: easy and real feedback (A), difficult and real feedback (B), easy and sham feedback (C), and difficult and sham feedback (D), as depicted in Table 2.3. For the first factor (feedback), the alpha feedback application was manipulated.
In the real feedback version, an increase of alpha power yielded a decrease in the mobility of the bacteria. That is, the successful player would be able to bring the bacteria to a halt, thus scoring more. In the sham feedback version, the mobility of the bacteria was controlled by the alpha values from the first of the three games played (level K). Thus there was no systematic relation between the player's instantaneous state and the feedback in the levels KA and KAS.

Table 2.3: The structure of the experiment design, illustrating the combination of the two factors of feedback (α versus sham) and difficulty (easy versus difficult) into four conditions, each implemented as a series of the 3 game levels (keys only (K), keys + α (KA), and keys + α + SSVEP (KAS)).

                 Easy   Difficult
α-feedback       A      B
Sham-feedback    C      D

The second factor (difficulty) controlled the strength of the feedback. A high difficulty meant that the bacteria moved faster and that it was therefore harder to catch them. This manipulation was introduced to ensure that the participants' relaxation was not at a ceiling level, obstructing possible increases of relaxation.

Participants were asked to play four game blocks (i.e., the four conditions listed in Table 2.3). The order of blocks was pseudo-randomized to avoid order effects. Each block contained the 3 games in the following order: K, KA, and KAS (see Section 2.2.1). Each game lasted 3 minutes, making a total duration of 9 minutes per block, excluding small breaks between the games. After each block the participants were given the GEQ questionnaire to evaluate their game experience. To avoid an "observer-expectancy effect", that is, a bias of the participant by the experimenter's knowledge and expectation, the experimenters were not aware of the specific experimental condition a given game block was associated with.
2.2.6 Procedure

The participants were seated in front of the computer running the game, at a distance of about 80 cm from the screen. They read and signed an informed consent form and the instructions, and filled in a questionnaire assessing the amount of computer usage, prior drug consumption, amount of sleep, and so forth. After that, the electrode cap was placed and the electrodes were connected. The thirty-two electrodes were placed according to the 10-20 system [Jasper, 1958] (see Figure 2.5). Furthermore, two electrodes were attached to the back of the left and right lower arms of the participant to record the ECG (Einthoven Lead I standard). Two GSR electrodes and a plethysmograph, measuring the BVP, were fastened to the intermediate phalanges of the little and ring fingers, and to the distal phalange of the middle finger of the left hand, respectively. Immediately before the experiment, the participants were again informed about the differences in control of the 3 games, and specifically about the role of alpha power and the possibility to influence it through relaxation.

Figure 2.5: Electrode locations according to the 10-20 system, with Pz (location of α-band power extraction) and Oz (location of SSVEP-band power extraction) highlighted.

2.2.7 Data analysis

The answers to the items regarding the participants' experiences during each block of games were accumulated into the 6 scales of the GEQ [IJsselsteijn et al., 2007]. To investigate the effect of the feedback and difficulty manipulations on the subjective game experience ratings, we computed 2 (feedback) × 2 (difficulty) repeated-measures analyses of variance (ANOVA). For the analysis of the EEG data, the recordings were processed with EEGlab [Delorme and Makeig, 2004]. After application of a common average reference, the data was downsampled to 256 Hz and separated according to the levels and conditions.
Then the power of the frequencies of interest was extracted using Welch's method [Welch, 1967], with sliding windows of 512 samples and 256 samples overlap. To approximate a normal distribution of the frequency power, the natural logarithm was applied to the extracted power values. For the overall contrasts of conditions, the data of level K was used as a baseline for levels KA and KAS, to remove variance due to frequency differences over time (i.e., blocks) and between participants.

To extract the heart rate, the ECG signal was filtered with a 2-200 Hz band-pass two-sided Butterworth filter, and peak latencies were extracted with the BioSig toolbox [Schlögl and Brunner, 2008]. The results of the peak detection algorithm were visually screened, and artifactual cases (i.e., the data of three participants) were excluded from the analysis. Then the heart rate was computed for each 3-minute game in terms of the average number of detected beats per minute. The GSR, BVP, respiration, and temperature recordings were not analyzed due to a high number of movement artifacts.

To investigate the effect of the feedback and difficulty manipulations on the heart rate and alpha power, we computed 2 (feedback) × 2 (difficulty) repeated-measures analyses of variance (ANOVA) for each of the neurophysiologically controllable levels (KA and KAS). To attenuate the influence of practice and time, we used the measurements of the first level (K) as a baseline, computing the changes of heart rate and power during levels KA and KAS relative to the respective measurements during level K of the same game series. To ensure a normal distribution, we removed cases (i.e., participants) that contained outliers, according to the mild outlier criterion (3 × inter-quartile range). Where appropriate, Greenhouse-Geisser corrected results are reported. As an effect size measure we report the partial eta-squared ηp².
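The offline power extraction can be sketched with SciPy's implementation of Welch's method, matching the parameters above (512-sample segments with 256 samples overlap on the 256 Hz downsampled data); the band edges passed in are the caller's choice.

```python
import numpy as np
from scipy.signal import welch

FS = 256  # analysis sample rate after downsampling (Hz)

def log_band_power(signal, f_lo, f_hi):
    """Log-transformed band power via Welch's method, using
    512-sample sliding windows with 256 samples overlap, with the
    natural logarithm applied to approximate normality."""
    freqs, psd = welch(signal, fs=FS, nperseg=512, noverlap=256)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(np.log(psd[band].mean()))
```

For the alpha analysis at Pz, this would be called with the 8-12 Hz band on the baseline-corrected level data.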
2.3 Results: Ratings, Physiology, and EEG

2.3.1 Subjective Game Experience

Table 2.4: Main effects of the feedback manipulation (α vs. sham) and of the difficulty manipulation (easy vs. difficult) for the GEQ scales and the α-control item. Mean (x̄) and standard deviation (s) for the respective conditions, and the statistical indicators F-value, p-value, and ηp².

             α-Feedback       Sham-Feedback    ANOVA
GEQ Scale    x̄      s        x̄      s        F        p-value   ηp²
Competence   2.82   0.86     2.90   0.86     .378     n.s.      –
Flow         3.02   0.70     3.00   0.76     .008     n.s.      –
Tension      2.43   0.60     2.54   0.62     1.345    n.s.      –
Challenge    2.84   0.51     2.83   0.53     .015     n.s.      –
Negative     2.42   0.73     2.40   0.75     .041     n.s.      –
Positive     3.28   0.76     3.32   0.71     .085     n.s.      –
α control    2.86   1.30     2.68   1.18     .959     n.s.      –

             Easy Level       Difficult Level  ANOVA
GEQ Scale    x̄      s        x̄      s        F        p-value   ηp²
Competence   3.31   0.87     2.41   0.88     45.080   < 0.001   .714
Flow         2.97   0.73     3.05   0.76     .563     n.s.      –
Tension      2.29   0.65     2.68   0.59     14.379   0.001     .444
Challenge    2.67   0.47     3.00   0.50     22.287   < 0.001   .553
Negative     2.42   0.67     2.39   0.77     .121     n.s.      –
Positive     3.46   0.71     3.14   0.73     11.909   0.003     .398
α control    3.00   1.20     2.55   1.26     6.881    0.017     .276

From the post-game questionnaires we extracted the values of the GEQ scales and the α-control item to examine the effect that the feedback and difficulty manipulations had on the game experience (see Table 2.4). Real alpha feedback, compared to sham feedback, was expected to lead to less tension, less negative and more positive affect, and a higher feeling of competence. However, none of the game experience scales showed significant differences in the feedback contrasts for the easy and difficult conditions. The difficulty manipulation, on the other hand, had the expected effects on game experience. In the difficult conditions compared to the easy conditions, the users felt less competent in their capability to control the game (p < 0.001) and felt less in control of the α input modality (p = 0.017).
Consequently, they felt higher tension (p = 0.001) and more challenged (p < 0.001) in the difficult conditions. Furthermore, the users felt less positive affect in the difficult conditions (p = 0.003). No interaction effects between the neurofeedback and difficulty manipulations were found.

2.3.2 Physiological Measures

The repeated-measures ANOVA of heart rate responses to the feedback and difficulty manipulations did not yield significant main effects (see Table 2.5), nor an interaction. The development of the heart rate over the course of the game levels KA and KAS relative to the first level K is depicted in the left panel of Figure 2.6.

Table 2.5: Main effects of the feedback manipulation (α vs. sham) and of the difficulty manipulation (easy vs. difficult) on the heart rate change relative to game K of the respective condition (no-feedback baseline condition). Mean (x̄) and standard deviation (s) for the respective conditions, and the statistical indicators F-value, p-value, and ηp².

         α-Feedback        Sham-Feedback     ANOVA
Level    x̄       s        x̄       s        F       p-value   ηp²
KA       -1.940  2.012    -1.506  2.262    0.453   n.s.      –
KAS      -5.137  3.718    -3.514  4.057    2.580   n.s.      –

         Easy Level        Difficult Level   ANOVA
Level    x̄       s        x̄       s        F       p-value   ηp²
KA       -0.974  2.433    -2.472  2.010    4.297   n.s.      –
KAS      -3.809  3.986    -4.842  4.058    .833    n.s.      –

2.3.3 EEG Alpha Power

The repeated-measures ANOVA comparing the alpha power values measured at electrode Pz showed no significant main effects of the feedback conditions or of the difficulty manipulation (see Table 2.6). We also did not find an interaction effect. The development of the alpha power over the game levels KA and KAS relative to the first level K is depicted in the right panel of Figure 2.6.

Table 2.6: Main effects of the feedback manipulation (α vs. sham) and of the difficulty manipulation (easy vs. difficult) on the alpha power change relative to game K of the respective condition (no-feedback baseline condition).
Mean (x̄) and standard deviation (s) for the respective conditions, and the statistical indicators F-value, p-value, and ηp².

         α-Feedback         Sham-Feedback       ANOVA
Level    x̄        s         x̄        s         F       p-value   ηp²
KA       -0.004   0.103     -0.002   0.105     .004    n.s.      –
KAS      0.179    0.283     0.235    0.248     3.556   n.s.      –

         Easy Level         Difficult Level     ANOVA
Level    x̄        s         x̄        s         F       p-value   ηp²
KA       -0.027   0.090     0.021    0.099     3.923   n.s.      –
KAS      0.184    0.237     0.230    0.303     1.333   n.s.      –

Figure 2.6: Left: The development of the heart rate over the games KA and KAS, relative to the initial game K, for all four conditions. Right: The development of the parietal alpha power over the games KA and KAS, relative to the initial game K, for all four conditions. The data show a trend toward a lower heart rate and higher alpha power with increasing game level, but no significant effects of the feedback or difficulty manipulations.

2.4 Discussion: Issues for Affective Interaction

We attempted to build a computer game that used an affective BCI as an auxiliary control modality, in addition to conventional keyboard control. Users could control a game parameter (opponent speed and thereby game difficulty) through their relaxation/arousal state, indexed by parietal alpha band power. To evaluate the system, we manipulated the control the users had over this auxiliary, neurophysiology-based "relaxation" modality by replacing the mapping of instantaneous parietal alpha power with that of apparently random alpha power (a playback of the power recorded during the first game of each block, game level K). We expected the neurofeedback-informed control over the relaxation state to be reflected in differences in the subjective game experience, in heart rate, and in the relaxation indicator itself, alpha activity. The comparison of the real and sham feedback conditions did not provide evidence for the expected effects of the α-feedback.
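The real versus sham feedback design just described can be sketched in code. The mapping from alpha power to opponent speed and the names used here (opponent_speed, FeedbackSource, gain) are illustrative assumptions, not the study's actual implementation; the key point is that the sham condition replays a previously recorded alpha trace while ignoring the live signal.

```python
from collections import deque

def opponent_speed(alpha_power, base_speed=1.0, gain=0.5):
    # Illustrative mapping (hypothetical parameters): higher parietal
    # alpha power (more relaxation) slows the opponents down.
    return max(0.1, base_speed - gain * alpha_power)

class FeedbackSource:
    """Yields either the live alpha power or a sham playback of a
    trace recorded earlier (as in game level K)."""

    def __init__(self, recorded_trace, sham=False):
        self.playback = deque(recorded_trace)
        self.sham = sham

    def next_power(self, live_alpha):
        if self.sham:
            # Sham condition: replay the recorded trace, ignoring live EEG.
            value = self.playback.popleft()
            self.playback.append(value)  # loop if the game runs longer
            return value
        return live_alpha
```

In the real-feedback condition the game loop would pass the instantaneous alpha estimate through unchanged; in the sham condition the same interface silently substitutes the recorded values, so the game looks identical to the player.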
The feedback manipulation did not affect game experience, heart rate, or alpha band power. The failure of the feedback manipulation to affect subjective game experience and physiological measures could be due to several neurophysiological, methodological, and technical issues:

1. Alpha activity is not a reliable indicator of states of relaxation/arousal in a multimodal gaming environment.
2. The participants were not able to self-induce states of relaxation during the games.
3. The feedback was not salient enough to make the relationship between the user's affective state and the game dynamic perceivable.

In the following sections we discuss these issues and their implications for future efforts toward such hybrid affective BCI approaches.

2.4.1 Indicator Suitability

The reliability of the chosen indicator for a certain affective or cognitive state is a core component of any system that implements a biocybernetic feedback loop [Fairclough, 2010]. We selected alpha activity as an indicator of arousal, as it was found to correlate with the subjective feeling of relaxation and has been successfully employed in neurofeedback treatments of diverse anxiety disorders. However, alpha power might vary with several factors other than the relaxation/arousal dimension. It is, for example, sensitive to the opening and closing of the eyelids, and it is also implicated in several cognitive processes, such as working memory and sensory processing [Pfurtscheller and Lopes da Silva, 1999; Schürmann, 2001; Klimesch et al., 2007; Palva et al., 2007; Jensen and Mazaheri, 2010].8

8 In Chapter 4, we will explore the role of parietal alpha power in indicating the emotional salience of visual compared to auditory affective stimuli, evidence for the modality-specific cognitive function of alpha during sensory processing.
Such covariation of alpha power with mental functions besides relaxation/arousal might introduce an additional source of variance in the power of the alpha band. This might especially be the case in complex and natural environments, where the task at hand involves several neural processes, for example sensory processing and orientation. As the studies that suggest alpha to be an indicator of a relaxed state use only relaxation tasks [Barry et al., 2007, 2009; Teplan and Krakovska, 2009], no other processes that potentially covary with alpha power add to the variance in the alpha band there. To test whether a neurophysiological indicator of affect reliably reflects the affective state in a specific applied and complex environment, additional experiments could be done in this very environment. For example, a dedicated experiment could validate the role of alpha power as a predictor of relaxation in a complex, natural HCI scenario (as done here). Alternatively, an exploratory analysis could be conducted with the aim of finding the measure that best discriminates between the relevant affective states. The latter approach has the advantage of identifying potential indicators other than the one supported by the literature, but it also bears the challenge of inducing the affective states of interest. Here, users might be affectively primed before a certain condition by affect-eliciting material or instructions. In the next two chapters, we will introduce possible affect induction procedures that use multimodal stimuli and that might be applied for such emotional priming approaches. Summarizing, neurophysiological indicators, such as the alpha rhythm, are unlikely to map one-to-one onto a certain affective state.
Especially in complex environments that require a multitude of mental processes for coping, additional influences of other processes on the chosen indicator can be expected. Therefore it is necessary to critically evaluate the reliability of the chosen indicator under conditions similar to those existing in the application environment.

2.4.2 Self-Induction of Relaxation

The physiological effects were also expected on the basis of clinical neurofeedback studies, which show that people are able to influence their brain activity when it is made perceivable [Hammond, 2005; Moore, 2005]. Studies on (generalised) anxiety in patients and non-patients show that alpha neurofeedback (i.e., alpha enhancement training) can decrease the level of stress and anxiety [Hardt and Kamiya, 1978; Hare et al., 1982; Rice et al., 1993]. Although the present study's rationale (interaction modality versus therapeutic, anxiety-reducing effect) and methodology (clearly instructing participants to relax in order to influence alpha power) differ from traditional neurofeedback research, some important lessons might be learned from neurofeedback approaches. One of the relevant factors for the efficacy of neurofeedback seems to be the amount of training participants receive [Moore, 2005]. Neurofeedback therapy is mostly done over several sessions [Ancoli and Kamiya, 1978; Hammond, 2007], though subjective or physiological effects of neurofeedback can already be observed after one session [Nowlis and Kamiya, 1970; Mulholland et al., 1983; Fell et al., 2002]. Consequently, a training protocol that incorporates multiple sessions and gives users the chance to train the brain activity that is used for the interface might yield effects.
However, it is also important to note that we chose alpha because of its association with an affective state, which should require less training than an abstract neurophysiological feature not associated with such an intuitively controllable state. Another problematic issue is the high baseline level of relaxation in the subject population (students). Hardt and Kamiya [1978] observed an effect of alpha neurofeedback on students with high levels of anxiety, but not on those with low levels. It is therefore possible that the original level of relaxation could not be further increased by a neurofeedback procedure, leading to a ceiling effect. A careful selection of a highly stressed test population would be advised to evaluate the efficacy of a neurofeedback system. In the current experiment we tried to induce a higher level of stress via the more difficult game level. This did not result in an effect of neurofeedback, despite the apparently successful increase of the subjectively perceived stress level (i.e., on the tension scale of the GEQ). Finally, and not unrelated to the above discussion of the reliability of alpha band power as an indicator of relaxation in a multimodally controlled computer game, it is unclear whether participants are immediately able to enter a state of relaxation when faced with a challenging task. Simultaneously moving the avatar with the keyboard and attempting to maintain or enter a state of relaxation is clearly different from most studies, which ask the participant to do only one thing: relax, or enter a state that satisfies the goal of the neurofeedback procedure (i.e., increase or decrease a certain neural feature). In the gaming scenario, however, the two tasks might impose a higher workload, a general issue in applications requiring the use of multimodal control [Fairclough, 2010]. Mulholland et al.
[1983] indeed found a decreased effect of neurofeedback when participants were also instructed to perform a counting task during the session, though the effect was still superior to a sham-feedback condition. Summarizing, whether the control of neurophysiological indicators of self-induced affect is a viable input modality during multimodal human-computer interaction has to be evaluated with respect to the time users need to acquire the skill of self-inducing affect, and with respect to the workload imposed by simultaneous tasks.

2.4.3 Feedback Saliency

The technical implementation of a feedback system and the choice of parameters for the extraction of the physiological features are further relevant issues, which we discuss here with a special focus on the saliency of the feedback method. A first factor that influences the sensory saliency of the feedback is the manner in which the feedback is given: explicitly or implicitly. We visualized the alpha power in two ways: explicitly, as the height of a bar at the top of the screen, and implicitly, through the speed of the opponents. Nevertheless, it is possible that the participants did not use the explicit feedback, as they were concentrating on the game. Sole reliance on the implicit feedback would constitute a difference from other neurofeedback studies, in which the feedback was given as a discrete signal (for example [Nowlis and Kamiya, 1970; Mulholland et al., 1983; Cho et al., 2008]). This might attenuate the saliency of the feedback and consequently of the reward. Increasing the game difficulty made this relationship more salient, but without success. Another technical aspect determining the feedback saliency is the temporal fidelity of the feedback: the length of time over which the neurophysiological measure is integrated.
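The integration of alpha band power over a window of EEG samples can be made concrete with a minimal band-power estimate. The naive DFT sketch below assumes the conventional 8–12 Hz alpha band and is illustrative, not the study's actual signal processing pipeline; its point is that the window length (number of samples) determines how strongly moment-to-moment fluctuations are averaged out.

```python
import math

def alpha_band_power(eeg, fs, band=(8.0, 12.0)):
    """Mean spectral power in the alpha band over one window of samples.

    A short window (e.g. 3 s) tracks the waxing and waning of the alpha
    rhythm closely; a longer window smooths those fluctuations out.
    """
    n = len(eeg)
    mean = sum(eeg) / n
    samples = [x - mean for x in eeg]  # remove DC offset
    power, count = 0.0, 0
    # Naive DFT restricted to frequency bins inside the alpha band
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if band[0] <= freq <= band[1]:
            re = sum(samples[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(samples[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            power += (re * re + im * im) / n
            count += 1
    return power / max(count, 1)
```

For a three-second window at a 256 Hz sampling rate (an assumed rate for illustration), the function would be called on 768 consecutive samples; doubling the window halves the frequency-bin spacing and averages over twice as much data.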
The three-second window used for the analysis of alpha power is susceptible to fast changes in alpha power that are, for example, related to the closing of the eyelids [Pfurtscheller and Lopes da Silva, 1999]. Furthermore, such short-term feedback also reflects the constant waxing and waning that is characteristic of the alpha rhythm [Niedermeyer, 2005]. A longer EEG window would have attenuated these relaxation-unrelated fluctuations in the feedback and could thereby have strengthened its relation to relaxation. Such a strengthening of the relationship of the feedback with the actual affective state, rather than with unrelated physiological changes, might outweigh the advantage of the chosen short-term alpha power integration: a more instantaneous feeling of control. Although our choice of measurement location, electrode Pz, is not notorious for eye movement artifacts, which are mostly observed at frontal electrodes, such artifacts can in principle cause additional variance in the alpha frequency band. To attenuate their influence, one option would be to detect and remove eye movements from the data automatically [Schlögl et al., 2007], or to block the α-feedback change during episodes of eye movement. Similarly, muscle artifacts might have some influence on alpha band power, although they are more frequently observed at temporal and frontal electrodes [Goncharova et al., 2003], and might be filtered out by suitable approaches [Onton and Makeig, 2009]. Summarizing, the feedback saliency is subject to a number of implementation choices, such as explicit versus implicit feedback, the length of the integration period, and the removal of non-neural influences (e.g., muscle or eye movement artifacts). To increase the saliency of the feedback, such factors should be critically taken into account when designing a game or application that uses neurophysiological feedback for control.

2.5 Conclusion

In this study a game was used to investigate the interaction with a hybrid affective BCI.
Our aim was to study the efficacy of self-induced relaxation, sensed from its neurophysiological correlate, for the control of a part of the game dynamic. This approach is similar to alpha-based neurofeedback, though we explicitly asked our participants to relax in order to control the speed of the opponents and hence gain an advantage in the game. A sham-feedback condition was used to examine the effect that the perceived control over the BCI had on the players' game experience. It was expected that real feedback, giving the user greater control over the relaxation-controlled game feature, would lead to a subjectively more relaxed state and a greater feeling of competence in the game. Furthermore, differences in the objective indicators of arousal, heart rate and alpha power, were expected. The analysis showed neither differences in subjective game experience indicators nor in objective indicators of relaxation. Several possible factors for the inefficacy of the neurophysiological control were identified: the reliability of the indicator in a complex gaming environment, the capability of the participants to enter a state of relaxation during challenging multimodal interaction, and the saliency of the feedback method. Future work should evaluate their role to enable a successful neurophysiology-based interaction via affective states. In the following chapters, we will further explore the relationship of the alpha rhythm and other frequency bands with emotional arousal. Specifically, in the next chapter, we will induce arousal with complex, multimodal stimuli, namely music videos. In Chapter 4 we will focus on the role of alpha as a modality-specific correlate of emotional arousal, indicating the activation, or inhibition, of sensory cortical regions that are relevant, or irrelevant, for the processing of the respective stimulus event.
Chapter 3

Inducing Affect by Music Videos

3.1 Introduction1

A fundamental challenge in the study of emotions and their behavioral, physiological, and neurophysiological correlates is the induction of strong and reliable affective states. Especially for the development of affect classification methods, the robust induction of the relevant affective states is an important prerequisite. The training of affect classifiers requires high-quality training samples. The affective state should be induced reliably to increase the probability of marked neurophysiological responses in every trial. This is especially important as the number of trials is limited, since responsiveness to affective stimulation might decrease as the participant habituates. Another desirable characteristic of an affect induction procedure is its ecological validity. Picard et al. [2001] stress the importance of a real-world setting as one of five factors ensuring high-quality, ecologically valid affective signals. Especially for the measurement of physiological signals, which still necessitates bulky and obtrusive equipment in direct contact with the user (i.e., electrode caps and the various physiological sensors), this ecological validity and the naturalness of the recording are limited. It is therefore all the more important to find ways of affective stimulation that are natural to the participants. To explore novel ways of strong, reliable, and ecologically valid affective stimulation and their effects on neurophysiological measurements, we implemented an affect induction protocol using music clips. We conducted an experiment to evaluate the affective responses during the presentation of music clips. The clips were partially selected according to their affect-related tags on a social media website, which proved to be a convenient way to gather stimulus material with reasonable expenditure.
We refined the selection further according to the affective impact of the clips in an online study. We present subjective ratings and physiological responses supporting our claim that the administration of music clips is an efficient means to induce certain target affective states. Finally, we discuss the neural correlates associated with the induced valence and arousal responses, their classifiability, and potential sources of confounds related to musical affect induction. Below, we will motivate the use of musical stimuli for the elicitation of affective experiences and the induction and study of neurophysiological affective responses. Furthermore, we will argue that a stronger affect induction is possible with multimodal musical stimuli, music videos, than with purely musical stimuli. Finally, we will outline our expectations.

1 The work presented here is based on a small-scale pilot study [Koelstra et al., 2010], and significant parts of the neurophysiological data have been published in [Koelstra et al., 2012]. The data, including ratings, physiological and neurophysiological measurements of 32 participants and face video for 22 participants, are publicly available at http://www.eecs.qmul.ac.uk/mmv/datasets/deap/ to foster research on affective BCI.

3.1.1 Eliciting Emotions by Musical Stimulation

In the past, emotion research in psychology, psychophysiology, and affective neuroscience has used a variety of affect induction procedures2. Gerrards-Hesse and Spies [1994] classified these into five categories: (1) free mental generation of emotional states (e.g. by hypnosis or imagination), (2) guided mental generation of emotional states (e.g. by film, stories, or music and additional instructions to enter the target state), (3) presentation of emotion-inducing material (e.g. pictures, film, music, or gifts), (4) presentation of need-related emotional situations (e.g.
manipulating the feeling of success/failure in certain situations), and (5) the generation of emotionally relevant physiological states (e.g. by drugs or by flexing facial muscles associated with the expression of certain affective states). Music plays a prominent role among these affect induction procedures. Emotional experiences are central to music production and consumption. The regulation of the listener's emotional state has been identified as a major reason for listening to music [Juslin and Västfjäll, 2008]. Musical pieces are known to lead to strong affective responses, and hence many studies have made use of music pieces to induce emotions [Gerrards-Hesse and Spies, 1994; Westermann et al., 1996]. An important question in the context of developing affect induction procedures for affective brain-computer interfaces is whether music is able to evoke genuine emotional responses (the emotivist position) or merely expresses emotions that are recognized by the listener (the cognitivist position). Cognitivists (e.g., [Konecni, 2008]) argue that, for example, sad music triggers the recognition of sadness rather than the processes associated with really feeling it. According to the cognitivist view, musical emotion induction would be impossible, or would merely supplement other (genuine) emotion induction procedures. Emotivists (e.g., Juslin and Västfjäll [2008]), on the other hand, argue that sad music, for example, indeed has the potential to induce the genuine emotion of sadness, accompanied by feeling sad. Only if music is capable of inducing genuine emotions are musical emotion induction protocols suited for the exploration of neurophysiological emotional responses, assuming at least a generalization of the correlates of feelings.
2 Alternatives to the induction of affective states are quasi-experimental approaches that divide participants into groups according to (1) their previously evaluated affective state, (2) the occurrence of affect-influencing natural events (e.g. the amount of sunny or rainy days), or (3) the prevalent affective state associated with a specific patient population [Gerrards-Hesse and Spies, 1994]. These approaches, however, are not experimental manipulations and might suffer from confounds associated with the cause of the affective state, for example affect-unrelated effects of the illness in patient populations.

Support for the genuine nature of emotional responses comes from studies that observed physiological responses that have also been observed with emotional experiences in response to non-musical affective stimuli [Zimny and Weidenfeller, 1963; Krumhansl, 1997; Khalfa et al., 2002; Sammler et al., 2007; Lundqvist et al., 2008]. It was found that music inducing sadness and peacefulness leads to a decrease of galvanic skin responses (GSR) compared to musically induced fear or happiness [Zimny and Weidenfeller, 1963; Khalfa et al., 2002; Gomez and Danuser, 2004; Lundqvist et al., 2008]. Krumhansl [1997] also found higher GSR for happy compared to fearful music. For heart rate as well, several studies observed typical emotional responses to musical emotion induction (but see [Zimny and Weidenfeller, 1963; Gomez and Danuser, 2004; Etzel et al., 2006; Lundqvist et al., 2008]). Heart rate increased for arousing fearful and happy music compared to less arousing sad music [Krumhansl, 1997]. Sammler et al. [2007] found heart rate increases for pleasant (not contorted) versus unpleasant (contorted) musical pieces. Similarly, skin temperature was found to differentiate between happy and sad music. However, while Krumhansl [1997] found a temperature increase for happy compared to fearful and sad music, Lundqvist et al.
[2008] observed a temperature decrease for happy compared to sad music. The facial electromyographic response associated with smiling, an increase of zygomaticus major activity, was found to be stronger for happy compared to sad music [Lundqvist et al., 2008]. Finally, the respiratory rhythm was also observed to vary with the emotional content of music. Gomez and Danuser [2004] found that respiration rate and depth correlate positively with valence and arousal. Etzel et al. [2006] showed a lower respiratory frequency for sadness compared to fear, and for fear compared to happiness. Neuroimaging studies have likewise shown the involvement of affect-related limbic and para-limbic brain regions in response to musical emotion induction [Koelsch, 2010]. Furthermore, several EEG studies observed neurophysiological responses to affective manipulations with music [Schmidt and Trainor, 2001; Altenmüller et al., 2002; Baumgartner et al., 2006a,b; Tsang et al., 2006; Sammler et al., 2007; Lin et al., 2010], and were able to interpret them in the context of conventional affective responses. In general, the physiological and neurophysiological responses to musical emotion induction partially parallel those observed during non-musical emotion induction. This equivalence of responses suggests the activation of similar underlying mechanisms as observed during genuine emotions and, hence, the suitability of musical stimulation for the study of the neurophysiology of affect.

3.1.2 Enhancing Musical Affect Induction

Despite its frequent use for affect induction, meta-analyses of the efficacy of the most prominent affect induction approaches have shown that purely musical emotion induction achieved only medium reliability [Gerrards-Hesse and Spies, 1994; Westermann et al., 1996].
Gerrards-Hesse and Spies [1994] found a reliable musical induction of the targeted affective states of depression and elation in only 75% of the reviewed studies, whereas the presentation of movies led to success in 95% of the studies. To increase the reliability of the affect induction, we chose to present music clips, which might be more comparable to films, as they carry both auditory and visual affective information. The motion picture industry uses music to guide and increase the emotional impact of movies by pairing emotionally intense scenes with adequate music [Cohen, 2001]. Similarly, it might be argued that in music clips film sequences are added to music, conveying the same affective message, to increase the impact of the affective message of the music. This would make music videos an interesting means for affective stimulation, as the combination of congruent affective messages is supposed to deliver a clearer emotional response by removing affective ambiguity. Below, we present evidence for (1) the disambiguation of affective stimulation, and (2) the enhancement of the affect induction that can be achieved by affectively congruent multimodal compared to unimodal stimulation. A number of studies investigated the effect of multimodal affective stimulation on the perception of affect. They showed strong interactions between the affective content of different stimulus modalities. Pairing emotional facial expressions either with emotionally pronounced words [Massaro and Egan, 1996; De Gelder and Vroomen, 2000; Ethofer et al., 2006] or with emotional music [Logeswaran and Bhattacharya, 2009] biases the judgement of the emotion expressed by the face in the direction of the emotion expressed by the auditory stimulus.
Similarly, the pairing of neutral films, portraying everyday actions, with emotional music biased the emotional perception of the films according to their auditory counterparts [Van den Stock et al., 2009]. The same crossmodal transfer also worked in the opposite direction, for example when emotionally spoken sentences had to be evaluated while non-relevant emotional expressions were shown [De Gelder and Vroomen, 2000]. Consequently, congruently paired affective visual (pictures) and auditory (music) stimuli led to more consistent judgements of the emotion expressed by music compared to incongruent pairings [Esposito et al., 2009]. Furthermore, perceptual accuracy increased for emotions expressed with multimodal stimuli, compared to visual (facial expression) and auditory stimuli (emotionally spoken words) alone [Kreifelts et al., 2007]. These studies suggest that multimodal affective stimulation is able to bias and, more importantly, to disambiguate the affective message conveyed by one of the stimulus modalities alone. Further studies suggest that besides, or through, this disambiguation, multimodal stimulation also leads to a clearer and stronger affective experience for congruently paired audiovisual stimuli, compared to auditory or visual stimuli alone. Baumgartner et al. [2006a] found an enhanced clarity of the subjective experience when pictures inducing sadness, happiness, and fear were combined with sounds inducing the same emotions. This enhanced clarity was also indicated by increased skin conductance responses to audiovisual stimuli compared to unimodal stimuli. For clarinet performances, Vines et al. [2006] found a complex relationship between the auditory (sound) and visual (video) modalities, where the combined audiovisual stimulation led partially to enhancements but also to a dampening of the experience of (emotional) tension.
For the same performances, skin conductance responses were similarly complex [Chapados and Levitin, 2008], but showed a general increase in electrodermal responsiveness for the audiovisual performance compared to the separate visual and auditory stimulation. Corroborating evidence for the increased efficacy of affective stimulation by multimodal stimuli comes from neuroimaging studies that found a greater activation of affect-related subcortical regions for congruently combined multimodal affective stimuli [Dolan et al., 2001; Pourtois et al., 2005; Baumgartner et al., 2006b; Eldar et al., 2007]. Summarizing, the evidence for a crossmodal transfer of affective information, the increased effectiveness of audiovisually congruent affective information in terms of the clarity and intensity of the affective experience, and the increased (neuro-)physiological responsiveness indicate an increased reliability of multimodal affect induction. Hence, we suggest the use of music videos instead of pure music for reliable emotion induction.

3.1.3 Hypotheses

To validate the affect induction protocol, we collect ratings of the subjective experience of valence and arousal. As we manipulate valence and arousal, we expect this to be reflected in differences of the valence ratings between negative and positive valence, and of the arousal ratings between low and high arousal. Furthermore, we collect subjective ratings of preference, dominance, and familiarity to explore the relationship between these and the valence/arousal manipulations3.

H1a: Valence and arousal manipulation affect the subjective ratings of valence and arousal accordingly.

Additionally, we record physiological signals for an objective validation of the affective manipulations. For the valence manipulation we expect higher activity of the zygomaticus major muscle, involved in smiling, for high compared to low valenced stimuli.
For the arousal manipulation, we expect higher activity of the trapezius muscle for more compared to less arousing stimuli, indicating higher tension of the neck musculature during arousing stimuli. Furthermore, we expect higher electrodermal activity during highly arousing, compared to less arousing, stimuli.

H2a: The zygomaticus (smile) muscle activity increases for positive compared to negative stimulation.
H2b: The trapezius (neck) muscle activity increases for high relative to low arousing stimulation.
H2c: The skin conductance level increases for high relative to low arousing stimulation.

Concerning the neurophysiological measurements, we explore the correlates of the subjective ratings (valence, arousal, and liking) and the power in the standard frequency bands of theta, alpha, beta, gamma, and high gamma. Specifically, we expect changes of frontal alpha asymmetry [Allen et al., 2004] with valence, as has been shown for musical emotion induction before [Schmidt and Trainor, 2001; Tsang et al., 2006].

H3a: Valence is positively correlated with frontal alpha band power asymmetry (rightward lateralization of alpha band power for positive versus negative affect).
H3b: Valence has effects on the broad frequency bands.
H3c: Arousal has effects on the broad frequency bands.
H3d: Subjective preference has effects on the broad frequency bands.

3 Preference, the general liking a user exhibits for a music video or the song, refers to the intrinsic pleasantness that it carries. It is a measure of the aesthetic value that the stimulus has for the user, a basic object variable in appraisal models [Ortony et al., 1988; Scherer, 2005], possibly determining the effect of the stimulus on the valence and arousal experience.

3.2 Methods: Stimuli, Study Design, and Analysis

3.2.1 Participants

Thirty-two participants (sixteen male, sixteen female), aged between 19 and 37 (mean age 26.9), took part in the experiment.
All participants but one were right-handed. Twenty participants were recruited from a subject pool of the Human Media Interaction group at the University of Twente and via advertisements on the university campus in Twente. These participants received a reimbursement of 6 euros per hour or, alternatively, course credits. The remaining participants were recruited on the university campus in Geneva by the Computer Vision and Multimedia Laboratory. These participants were reimbursed with 30 CHF per hour. All participants gave their written informed consent and received the same treatment and information prior to the experiments.

3.2.2 Stimuli Selection and Preparation

To elicit strong emotional responses with a high probability in a given participant, the process of stimulus selection or construction is of prime importance. We used a social-web and crowd-sourcing supported semi-automatic stimulus selection approach that also limited collection time and experimenter bias during stimulus selection. The stimuli used in the experiment were selected in a three-step procedure. Firstly, we selected 120 initial stimuli, half of which were chosen semi-automatically via their tags in an online music streaming service, and the rest manually. Secondly, a one-minute segment was determined for each stimulus using a highlight detection algorithm. Finally, we selected 40 stimuli for use in the main experiment according to the subjective assessments in a web-based experiment. Each of these steps is explained below.

Initial Stimuli Selection

We selected 60 of the 120 initial stimuli using the Last.fm4 music streaming service. Last.fm allows users to track their music listening habits and receive recommendations for new music and events. Additionally, it allows users to assign tags to individual songs, thus creating a folksonomy of tags. Many of the tags carry emotional meanings, such as "depressing" or "aggressive".
Last.fm offers an API that allows one to retrieve tags and tagged songs. A list of emotional keywords was taken from Parrott [2001], a rather exhaustive collection of nouns naming emotions (see Table 3.1), and expanded to include inflections and synonyms, yielding 304 keywords. Next, for each keyword, the correspondingly tagged songs were found in the Last.fm database.

4 http://www.last.fm/

Table 3.1: The list of emotion concepts and words, taken from Parrott [2001], used as keywords to search for adequately tagged songs that induce the respective emotion.

Love
  Secondary emotions/feelings: Affection, Lust/Sexual desire, Longing
  Tertiary feelings/emotions: Adoration, Fondness, Liking, Attractiveness, Caring, Tenderness, Compassion, Sentimentality, Arousal, Desire, Passion, Infatuation, Longing

Joy
  Secondary: Cheerfulness, Zest, Contentment, Pride, Optimism, Enthrallment, Relief
  Tertiary: Amusement, Bliss, Gaiety, Glee, Jolliness, Joviality, Joy, Delight, Enjoyment, Gladness, Happiness, Jubilation, Elation, Satisfaction, Ecstasy, Euphoria, Enthusiasm, Zeal, Excitement, Thrill, Exhilaration, Pleasure, Triumph, Eagerness, Hope, Enthrallment, Rapture, Relief

Surprise
  Secondary: Surprise
  Tertiary: Amazement, Astonishment

Anger
  Secondary: Irritability, Exasperation, Rage, Disgust, Envy, Torment
  Tertiary: Aggravation, Agitation, Annoyance, Grouchy, Grumpy, Crosspatch, Frustration, Anger, Outrage, Fury, Wrath, Hostility, Ferocity, Bitter, Hatred, Scorn, Spite, Vengefulness, Dislike, Resentment, Revulsion, Contempt, Loathing, Jealousy, Torment

Sadness
  Secondary: Suffering, Sadness, Disappointment, Shame, Neglect, Sympathy
  Tertiary: Agony, Anguish, Hurt, Depression, Despair, Gloom, Glumness, Unhappy, Grief, Sorrow, Woe, Misery, Melancholy, Dismay, Displeasure, Guilt, Regret, Remorse, Alienation, Defeatism, Dejection, Embarrassment, Homesickness, Humiliation, Insecurity, Insult, Isolation, Loneliness, Rejection, Pity

Fear
  Secondary: Horror, Nervousness
  Tertiary: Alarm, Shock, Fear, Fright, Horror, Terror, Panic, Hysteria, Mortification, Anxiety, Suspense, Uneasiness, Apprehension (fear), Worry, Distress, Dread
For each affective tag, the ten songs most often labeled with this tag were selected. This resulted in a total of 1084 songs. The valence-arousal space can be subdivided into 4 quadrants, namely low arousal/low valence (LALV), low arousal/high valence (LAHV), high arousal/low valence (HALV), and high arousal/high valence (HAHV). In order to ensure diversity of the induced emotions, 15 of the 1084 songs were selected manually for each quadrant according to the following criteria.

Does the tag accurately reflect the emotional content? Examples of songs subjectively rejected according to this criterion included songs that were tagged merely because the song title or artist name corresponded to the tag. Also, in some cases the lyrics may have corresponded to the tag, but the actual emotional content of the song was entirely different (e.g., happy songs about sad topics).

Is a music video available for the song? Music videos for the songs were automatically retrieved from YouTube and corrected manually where necessary. However, many songs did not have a music video.

Is the song appropriate for use in the experiment? Since our test participants were mostly European students, we selected those songs most likely to elicit emotions in this target demographic. Therefore, mainly European or North American artists were selected.

In addition to the songs selected using the method described above, 60 stimulus videos were selected manually, with 15 videos selected for each of the quadrants in the arousal/valence space. The goal here was to select those videos expected to induce the clearest emotional reactions for each of the quadrants. The combination of manual selection and selection using affective tags produced a list of 120 candidate stimulus videos. Next, for each video an affective highlight segment of one-minute length was estimated.
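For illustration, the quadrant assignment used throughout this chapter can be expressed as a small helper. This is a sketch: the function name is ours, and treating 5 as the midpoint of the 9-point scales is our assumption.

```python
def quadrant(valence: float, arousal: float, midpoint: float = 5.0) -> str:
    """Map a (valence, arousal) rating on 1-9 scales to one of the four
    quadrant labels (LALV, LAHV, HALV, HAHV) used in the text."""
    v = "HV" if valence >= midpoint else "LV"  # high vs. low valence
    a = "HA" if arousal >= midpoint else "LA"  # high vs. low arousal
    return a + v

# e.g. a calm, pleasant song:
print(quadrant(7.0, 3.0))  # LAHV
```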
Detection of One-minute Highlights

To limit the experiment duration and to increase the number of trials (i.e., data points), we extracted only a one-minute excerpt of each clip for further use in the experiment. In order to extract a segment with maximum emotional content, an affective highlighting algorithm was used. The detailed procedure and the features used are described in Koelstra et al. [2012]. For training, 21 movie fragments were used for which the features and affective ratings (valence, arousal) were available [Soleymani et al., 2009]. The music clips were cut into 1-minute fragments, each with an overlap of 55 seconds with the next fragment. The trained regressor weights were then used on the content features of each fragment i to compute its emotional energy e_i from the segment's valence v_i and arousal a_i:

e_i = sqrt(a_i^2 + v_i^2)    (3.1)

Finally, the segment with the highest emotional energy was chosen for each clip. For a few clips, the automatic selection of the one-minute highlight was manually overridden to choose a particularly characteristic segment of the specific music clip instead. Such distinctive segments were assumed to have stronger emotion-evoking power. To refine the collection of music clips further, an online study was conducted.

Online Subjective Annotation

From the initial collection of 120 stimulus videos, the final 40 test video clips were chosen using a web-based subjective emotion assessment interface. Participants watched music videos and rated them on the SAM scales [Bradley and Lang, 1994]. The SAM scales are discrete 9-point scales for valence, arousal, and dominance with pictograms that symbolize each scale's meaning and range. A screenshot of the interface is shown in Figure 3.1.
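Returning to the highlight-detection step, the fragment-selection rule of Equation (3.1) can be sketched as follows. This is a sketch: `best_fragment` is a hypothetical helper name, and the trained regressor that predicts each fragment's valence and arousal from its content features is omitted.

```python
import numpy as np

def best_fragment(valence: np.ndarray, arousal: np.ndarray) -> int:
    """Given predicted valence v_i and arousal a_i for each overlapping
    one-minute fragment, return the index of the fragment with the
    highest emotional energy e_i = sqrt(a_i^2 + v_i^2) (Eq. 3.1)."""
    energy = np.sqrt(np.asarray(arousal) ** 2 + np.asarray(valence) ** 2)
    return int(np.argmax(energy))

# Fragments start every 5 s (60 s windows with 55 s overlap).
v = np.array([0.2, 1.5, 0.8])
a = np.array([0.5, 1.2, 2.0])
i = best_fragment(v, a)   # fragment 2 has the largest energy
start_seconds = i * 5     # onset of the selected one-minute excerpt
```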
Figure 3.1: A screenshot of the web interface for subjective emotion assessment, showing the video and, below, the valence and arousal scales for the rating of the experience during viewing.

Each participant watched as many videos as he/she wanted and was able to end the rating at any time. The order of the clips was randomized, but preference was given to the clips rated by the fewest participants. This ensured a similar number of ratings for each video (14-16 assessments per video were collected). It was ensured that participants never saw the same video twice. After all of the 120 videos had been rated by at least 14 volunteers each, the final 40 videos for use in the experiment were selected. To maximize the strength of the elicited emotions, we selected those videos that had the strongest volunteer ratings and, at the same time, a small variation. To this end, for each video x we calculated a normalized arousal and valence score by dividing the mean rating by its standard deviation (μx/σx). Then, for each quadrant in the normalized valence-arousal space, we selected the 10 videos that lay closest to the extreme corner of the quadrant (see Table 3.2 for the list of all 40 selected videos). Figure 3.2 shows the score for the ratings of each video, with the selected videos highlighted in green. The video whose rating was closest to the extreme corner of each quadrant is mentioned explicitly. Of the 40 videos, 17 were selected via Last.fm affective tags, indicating that useful stimuli can be selected via this method.

Table 3.2: The list of songs that were selected as the most appropriate stimuli according to their effects on the subjective experience of valence and arousal in the online study.
The conditions with which the stimuli were associated (from the top ten to the bottom ten stimuli) were high arousal - high valence (HAHV), low arousal - high valence (LAHV), low arousal - low valence (LALV), and high arousal - low valence (HALV). Next to each stimulus, the mean (μ) and standard deviation (s) of the valence, arousal, and dominance ratings obtained in the online study are given.

HAHV
Artist – Title | Valence μ (s) | Arousal μ (s) | Dominance μ (s)
Emiliana Torrini – Jungle Drum | 6.9 (1.3) | 5.9 (2.2) | 6.0 (1.6)
Lustra – Scotty Doesn't Know | 5.9 (2.1) | 6.9 (2.0) | 5.5 (2.4)
Jackson 5 – Blame It On The Boogie | 6.9 (2.3) | 6.5 (1.9) | 5.8 (2.0)
The B-52's – Love Shack | 7.0 (1.7) | 5.9 (2.0) | 6.1 (2.0)
Blur – Song 2 | 7.2 (1.8) | 7.3 (1.6) | 6.5 (2.1)
Blink 182 – First Date | 6.1 (1.5) | 6.2 (1.6) | 5.6 (2.0)
Benny Benassi – Satisfaction | 6.7 (1.9) | 6.5 (2.1) | 5.9 (2.3)
Lily Allen – Fuck You | 7.3 (1.3) | 6.1 (1.8) | 6.6 (1.6)
Queen – I Want To Break Free | 7.1 (1.8) | 6.4 (2.0) | 6.8 (2.2)
Rage Against The Machine – Bombtrack | 5.9 (2.2) | 7.1 (1.4) | 7.1 (1.6)

LAHV
Michael Franti & Spearhead – Say Hey (I Love You) | 7.1 (1.2) | 4.9 (1.5) | 5.2 (1.3)
Grand Archives – Miniature Birds | 5.9 (1.8) | 3.4 (1.3) | 4.9 (1.8)
Bright Eyes – First Day Of My Life | 6.6 (1.4) | 4.2 (2.5) | 5.4 (2.1)
Jason Mraz – I'm Yours | 7.1 (1.4) | 4.7 (2.1) | 5.3 (2.1)
Bishop Allen – Butterfly Nets | 6.5 (1.4) | 4.0 (1.8) | 4.9 (1.9)
The Submarines – Darkest Things | 5.1 (1.1) | 2.4 (1.6) | 4.5 (2.1)
Air – Moon Safari | 6.1 (1.6) | 3.0 (1.5) | 4.8 (2.1)
Louis Armstrong – What A Wonderful World | 7.1 (1.4) | 3.9 (1.9) | 5.0 (2.2)
Manu Chao – Me Gustas Tu | 7.5 (1.3) | 4.5 (1.6) | 5.7 (1.5)
Taylor Swift – Love Story | 6.3 (1.2) | 4.1 (1.7) | 4.7 (1.8)

LALV
Porcupine Tree – Normal | 4.2 (1.4) | 3.7 (1.8) | 3.9 (1.9)
Wilco – How To Fight Loneliness | 3.3 (1.2) | 4.5 (2.0) | 3.2 (1.4)
James Blunt – Goodbye My Lover | 3.3 (1.4) | 2.9 (1.7) | 4.7 (2.2)
A Fine Frenzy – Almost Lover | 4.2 (1.6) | 3.6 (1.2) | 4.6 (1.8)
Kings Of Convenience – The Weight Of My Words | 4.2 (1.8) | 3.0 (1.5) | 3.3 (1.8)
Madonna – Rain | 4.3 (2.0) | 3.1 (1.8) | 4.8 (2.0)
Sia – Breathe Me | 3.3 (1.3) | 2.8 (1.4) | 2.9 (1.4)
Christina Aguilera – Hurt | 3.4 (1.3) | 3.6 (1.8) | 3.9 (2.3)
Enya – May It Be (Saving Private Ryan) | 3.2 (1.8) | 3.7 (1.9) | 3.3 (1.9)
Diamanda Galas – Gloomy Sunday | 4.1 (1.8) | 4.2 (1.5) | 4.0 (1.8)

HALV
Mortemia – The One I Once Was | 3.7 (1.5) | 5.5 (2.1) | 4.6 (1.7)
Marilyn Manson – The Beautiful People | 4.7 (1.5) | 6.4 (1.9) | 4.9 (1.6)
Dead To Fall – Bastard Set Of Dreams | 3.9 (2.1) | 6.1 (2.1) | 5.5 (1.9)
DJ Paul Elstak – A Hardcore State Of Mind | 4.8 (1.9) | 6.4 (2.0) | 4.9 (2.2)
Napalm Death – Procrastination On The Empty Vessel | 3.5 (1.9) | 6.3 (2.1) | 4.9 (2.4)
Sepultura – Refuse Resist | 4.9 (2.3) | 7.3 (1.3) | 7.1 (2.1)
Cradle Of Filth – Scorched Earth Erotica | 3.3 (1.6) | 5.9 (2.2) | 5.5 (2.3)
Gorgoroth – Carving A Giant | 3.3 (2.1) | 5.3 (2.2) | 5.7 (2.4)
Dark Funeral – My Funeral | 3.5 (2.3) | 5.3 (2.4) | 5.7 (2.9)
Arch Enemy – My Apocalypse | 3.7 (2.6) | 5.7 (2.2) | 5.5 (2.4)

Figure 3.2: μx/σx value for the ratings of each video in the online assessment. Videos selected for use in the experiment are highlighted in green. For each quadrant, the most extreme video is detailed with the song title and a screenshot from the video. The graphic was taken from Koelstra et al. [2012] by courtesy of Sander Koelstra.

3.2.3 Apparatus

The experiments were performed in two laboratory environments with controlled illumination. EEG and peripheral physiological signals were recorded using a Biosemi ActiveTwo system5 on a dedicated recording PC. Stimuli were presented using a dedicated stimulation PC that sent synchronization markers directly to the recording PC. For the presentation of the stimuli and the recording of the users' ratings, the "Presentation" software by Neurobehavioral Systems6 was used. The music videos were presented on a 17-inch screen (1280 × 1024, 60 Hz). In order to minimize eye movements, all video stimuli were displayed at 800 × 600 resolution, filling approximately two thirds of the screen. Subjects were seated approximately 1 meter from the screen. Philips stereo speakers were used and the music volume was set at a relatively loud level; however, each participant was asked before the experiment whether the volume was comfortable, and it was adjusted if necessary. Physiological and neurophysiological signals were recorded at a sampling rate of 512 Hz.
The EEG was recorded using 32 active AgCl electrodes, which were placed according to the international 10-20 system [Jasper, 1958]. Additionally, 4 electrodes were applied to the outer canthi of the eyes and above and below the left eye to derive the horizontal and vertical electrooculogram, respectively. For the electrodermal measurements, two metal plates were applied to the distal phalanges of the middle and ring finger of the left hand. For the facial electromyogram recording, two electrodes were placed on the zygomaticus major muscle according to the guidelines of Fridlund and Cacioppo [1986]. For the neck electromyogram recording, two electrodes were placed on the upper trapezius muscle, in the middle of the line from the acromion to the spine of vertebra C7, according to the SENIAM guidelines7. Furthermore, we applied a temperature sensor to the distal phalange of the left little finger, a plethysmograph to the index finger, and a respiration belt. Figure 3.3 illustrates the electrode placement for the acquisition of the peripheral physiological signals. Additionally, for the first 22 of the 32 participants, frontal face video was recorded in DV quality using a Sony DCR-HC27E consumer-grade camcorder. The temperature, blood pulse, and respiration measurements and the face video were not used in the current study, but were made publicly available along with the rest of the data.

5 http://www.biosemi.com
6 http://www.neurobs.com

3.2.4 Design and Procedure

Figure 3.3: Placement of the peripheral physiological sensors on the participants. Four electrodes were used to record the EOG and four to record the EMG (zygomaticus major and trapezius muscles). In addition, GSR, blood volume pressure (BVP), temperature, and respiration were measured. The graphic was taken from Koelstra et al. [2012] by courtesy of Sander Koelstra.

Before the experiment, each participant signed a consent form and filled out a questionnaire.
Next, they were given a set of instructions informing them of the experiment procedure and the meaning of the different scales used for self-assessment. An experimenter was present to answer any questions. After that, participants were led into the experiment room and the experimenter started to attach the sensors. After the sensors were placed and their signals checked, the participants performed a practice trial to familiarize themselves with the system. In this unrecorded trial, a short video was shown, followed by a self-assessment by the participant. Finally, the experimenter started the recording and left the room, so that the participant could start the experiment by pressing a key on the keyboard. The experiment started with a 2-minute baseline recording, during which a fixation cross was displayed to the participant (who was asked to relax during this period). Then the 40 videos were presented in 40 trials, each consisting of the following steps:

1. A 2-second screen displaying the current trial number to inform the participants of their progress.
2. A 5-second baseline recording (fixation cross).
3. The 1-minute display of the music video.
4. Self-assessment for arousal, valence, preference, and dominance.

7 http://www.seniam.org/

Figure 3.4: The scales used for self-assessment (from top): valence (SAM), arousal (SAM), dominance (SAM), and preference.

For the assessment of the participants' levels of arousal, valence, and dominance, self-assessment manikins (SAM) on 9-point Likert scales [Bradley and Lang, 1994] were used (see Figure 3.4). For the preference scale, thumbs-down/thumbs-up symbols were used. The manikins were displayed in the middle of the screen with the numbers 1-9 printed below. Participants moved the mouse strictly horizontally just below the numbers and clicked to indicate their self-assessment level.
Participants were informed that they could click anywhere directly below or in between the numbers, making the self-assessment a continuous scale. The valence scale ranged from unhappy or sad to happy or joyful. The arousal scale ranged from calm or bored to stimulated or excited. The dominance scale ranged from submissive (or "without control") to dominant (or "in control, empowered"). A fourth scale asked for the participants' personal preference for the video. This last scale should not be confused with the valence scale: it inquires about the participants' taste, their subjective aesthetic preference, not their current feelings. For example, it is possible to prefer videos that make one feel sad or angry. After 20 trials, the participants took a short break. During the break, they were offered some cookies and non-caffeinated, non-alcoholic beverages. The experimenter then checked the quality of the signals and the electrode placement, and the participants were asked to continue with the second half of the test. Finally, after the experiment, participants were asked to rate their familiarity with each of the songs on a scale of 1 ("Never heard it before the experiment") to 5 ("Knew the song very well").

3.2.5 Data Processing

For the investigation of the correlates of the subjective ratings with the EEG signals, the EEG data was common average referenced, down-sampled to 256 Hz, and high-pass filtered with a 2 Hz cutoff frequency using the EEGlab8 toolbox. Then, we removed eye artifacts with a blind source separation technique9. The signals from the last 30 seconds of each trial (video) were extracted for further analysis. To correct for stimulus-unrelated variations in power over time, the EEG signal from the five seconds before each video was extracted as a baseline for the later "order-corrected" analysis.
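As an illustration, the referencing, down-sampling, filtering, and window-extraction steps just described might be sketched as follows. This is a simplified sketch using SciPy rather than the EEGlab toolbox named in the text; the blind-source-separation eye-artifact removal is omitted, and all function names are ours.

```python
import numpy as np
from scipy.signal import butter, decimate, sosfiltfilt

FS_RAW, FS = 512, 256  # recording and analysis sampling rates

def preprocess(eeg: np.ndarray) -> np.ndarray:
    """Common average reference, down-sampling to 256 Hz, and 2 Hz
    high-pass filtering. `eeg` has shape (n_channels, n_samples) at 512 Hz."""
    eeg = eeg - eeg.mean(axis=0, keepdims=True)        # common average reference
    eeg = decimate(eeg, FS_RAW // FS, axis=1)          # 512 Hz -> 256 Hz
    sos = butter(4, 2.0, btype="highpass", fs=FS, output="sos")
    return sosfiltfilt(sos, eeg, axis=1)               # zero-phase high-pass

def extract_windows(eeg: np.ndarray, onset: int):
    """Return the 5 s pre-stimulus baseline and the last 30 s of a 60 s
    trial whose video starts at sample index `onset` (at 256 Hz)."""
    baseline = eeg[:, onset - 5 * FS:onset]
    trial = eeg[:, onset + 30 * FS:onset + 60 * FS]
    return baseline, trial
```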
The frequency power of trials and baselines between 3 and 90 Hz was extracted with Welch's method [Welch, 1967], using a sliding window of 256 samples length with 128 samples overlap, leading to power values for 1 Hz-sized bins. The power was averaged over the frequency bands of theta (3-7 Hz), alpha (8-13 Hz), beta (14-29 Hz), gamma (30-48 Hz), and high gamma (52-90 Hz). For the baseline approach, the power in the pre-stimulus period was subtracted from the trial power, yielding the change of power relative to the pre-stimulus period. The facial and neck EMG responses were extracted according to Fridlund and Cacioppo [1986]: after deriving each muscle's EMG signal as the difference between its two electrodes, we band-pass filtered the signals between 15 and 128 Hz and extracted their root mean square. The galvanic skin response was detrended and low-pass filtered with an upper cutoff at 30 Hz. The skin conductance levels were computed by averaging the conductivity over the whole of each trial. To attenuate the variance from stimulus-unrelated long-term changes in conductivity, we subtracted the skin conductance level of the five-second period before each stimulus.

8 http://sccn.ucsd.edu/eeglab/
9 http://www.cs.tut.fi/~gomezher/projects/eeg/aar.htm

All physiological signals were visually checked for each trial and pre-trial period to identify corrupted recordings. For skin conductance, several trials were excluded from the analysis. Subjects that, after trial removal, lacked more than half of the 40 trials for a specific sensor were excluded from the analysis for this sensor.
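The band-power extraction and baseline correction described above can be sketched as follows. This is a simplified sketch: `band_power` and `corrected_power` are hypothetical helper names, and the 256-sample Welch window at 256 Hz yields the 1 Hz bins mentioned in the text.

```python
import numpy as np
from scipy.signal import welch

FS = 256
BANDS = {"theta": (3, 7), "alpha": (8, 13), "beta": (14, 29),
         "gamma": (30, 48), "high_gamma": (52, 90)}

def band_power(segment: np.ndarray) -> dict:
    """Welch power spectrum (window 256 samples, overlap 128 -> 1 Hz bins),
    averaged within each broad band. `segment`: (n_channels, n_samples)."""
    freqs, psd = welch(segment, fs=FS, nperseg=256, noverlap=128, axis=1)
    return {name: psd[:, (freqs >= lo) & (freqs <= hi)].mean(axis=1)
            for name, (lo, hi) in BANDS.items()}

def corrected_power(trial: np.ndarray, baseline: np.ndarray) -> dict:
    """Baseline approach: subtract pre-stimulus power from trial power."""
    p_trial, p_base = band_power(trial), band_power(baseline)
    return {band: p_trial[band] - p_base[band] for band in BANDS}
```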
For the physiological sensors (GSR, EMG), repeated measures ANOVAs (rmANOVA) were conducted. To ensure normality, cases (participants) that showed excessive outliers (values more than 3 times the inter-quartile range above the upper or below the lower quartile) were removed from the analysis. Equality of variances was tested via Bartlett's test of homogeneity of variance (homoscedasticity). Where appropriate, Greenhouse-Geisser corrected results are reported. As effect size measure we report the partial eta-squared ηp².

To explore the relationship between the broad frequency bands of the EEG (i.e., theta, alpha, beta, gamma, and high gamma)10 and the subjective ratings of valence, arousal, and preference, we calculated Spearman correlations between the frequency power of each electrode and the respective rating. The analysis was performed over all participants, concatenating their respective ratings on the one hand and frequency measures on the other. To attenuate the influence of differences in rating scale use and of different prevalences of power in specific frequency bands, we normalized the ratings and power measurements for each participant before concatenation, computing the standard score (z-score). To limit the number of tests and to restrict the probability of false rejections of the null hypothesis of no correlation between a given electrode and a subjective rating, we applied a two-step testing procedure. Firstly, we calculated for each broad frequency band and each trial a general power value according to the leave-one-out regression method suggested by Grosse-Wentrup et al. [2011]. We then correlated this general frequency band power with the ratings. A permutation test procedure was used to create a distribution of 10,000 correlation values on the basis of permuted subjective ratings. The original correlation was then tested against these random correlations.
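The permutation test for a single general-power/rating correlation can be sketched as follows. This is a sketch: the leave-one-out computation of the general band power itself [Grosse-Wentrup et al., 2011] is omitted, and `permutation_pvalue` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import spearmanr

def permutation_pvalue(power: np.ndarray, ratings: np.ndarray,
                       n_perm: int = 10_000, seed: int = 0):
    """Two-sided permutation test for a Spearman correlation between a
    general band power value and a subjective rating, using n_perm
    random permutations of the ratings (10,000 in the text)."""
    rng = np.random.default_rng(seed)
    rho = spearmanr(power, ratings).correlation          # observed correlation
    null = np.array([spearmanr(power, rng.permutation(ratings)).correlation
                     for _ in range(n_perm)])            # null distribution
    p = (np.abs(null) >= abs(rho)).mean()                # permutation p-value
    return rho, p
```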
Secondly, the electrode-wise correlation analysis was performed for a frequency band when the respective general correlation reached a significance level of p < .05. For this, we computed the electrode-wise Spearman correlations and their significances according to a permutation test with 10,000 permutations. We report and discuss only those correlations that reached a significance level of p < .01.

10 As we had to high-pass filter the EEG signals with a cutoff frequency of 2 Hz to obtain a stable performance of the eye-artifact removal algorithm used, we will not analyze the frequency responses in the delta band.

3.3 Results: Ratings, Physiology, and EEG

3.3.1 Analysis of subjective ratings

In this section we analyze the effect of the affective stimulation on the subjective ratings obtained from the participants. Firstly, we contrast the ratings between conditions to validate the induction protocol. Secondly, we analyze the covariation of the different rating scales of affective experience with each other, with familiarity, and with time, to uncover possible confounds.

Figure 3.5: The mean locations of the stimuli on the arousal-valence plane for the 4 conditions (LALV, HALV, LAHV, HAHV). Preference is encoded by color: dark red is low preference and bright yellow is high preference. Dominance is encoded by symbol size: small symbols stand for low dominance and big symbols for high dominance.

Stimuli were selected to induce emotions in the four quadrants of the valence-arousal space (LALV, HALV, LAHV, HAHV). The statistical analysis of the ratings between the affect conditions (see Table 3.4 for descriptive and test statistics) shows a marked difference in arousal between the low and high arousal conditions, but no difference in valence. Similarly, the analysis shows a difference in valence between the negative and positive valence conditions, but no difference in arousal.
Hence, the stimuli from these four affect elicitation conditions generally resulted in the elicitation of the target emotion aimed for when the stimuli were selected, ensuring that large parts of the arousal-valence plane (AV plane) are covered (see Figure 3.5). Figure 3.5 shows that the valence manipulation worked particularly well for the high arousing stimuli, yielding relatively extreme valence ratings for the respective stimuli. The stimuli in the low arousing conditions were less successful in eliciting strong valence responses. This lack of efficacy of low arousing stimuli to induce strong valence responses results in a C-shape of the stimuli on the valence-arousal plane. This particular shape is also observed for the well-validated ratings of the International Affective Picture System (IAPS) [Lang et al., 1999] and the International Affective Digitized Sounds system (IADS) [Bradley et al., 2007], and indicates the general difficulty of inducing emotions with strong valence but low arousal. The distribution of the individual ratings per condition (see Figure 3.6) shows a large variance within conditions, resulting from between-stimulus and between-participant variations, possibly associated with stimulus characteristics or inter-individual differences in music taste, general mood, or scale interpretation. However, the significant differences between the conditions in terms of the ratings of valence and arousal reflect the successful elicitation of the targeted affective states (see Table 3.3).

Table 3.3: The mean values (and standard deviations) of the ratings of preference (1-9), valence (1-9), arousal (1-9), dominance (1-9), and familiarity (1-5) for each affect elicitation condition.
Condition | Valence | Arousal | Preference | Dominance | Familiarity
LALV | 4.2 (0.9) | 4.3 (1.1) | 5.7 (1.0) | 4.5 (1.4) | 2.4 (0.4)
HALV | 3.7 (1.0) | 5.7 (1.5) | 3.6 (1.3) | 5.0 (1.6) | 1.4 (0.6)
LAHV | 6.6 (0.8) | 4.7 (1.0) | 6.4 (0.9) | 5.7 (1.3) | 2.4 (0.4)
HAHV | 6.6 (0.6) | 5.9 (0.9) | 6.4 (0.9) | 6.3 (1.0) | 3.1 (0.4)

Table 3.4: The Wilcoxon signed-rank analysis of the ratings between the valence and arousal conditions: mean values (and standard deviations) of the conditions, the test statistic W+, and the p-value.

Rating | Negative | Positive | W+ | p | Low | High | W+ | p
Valence | 3.9 (1.0) | 6.6 (0.7) | 0 | < .001 | 5.4 (1.5) | 5.1 (1.7) | 791 | n.s.
Arousal | 5.0 (1.5) | 5.3 (1.1) | 885 | n.s. | 4.5 (1.0) | 5.8 (1.2) | 223 | < .001
Dominance | 4.8 (1.5) | 6.0 (1.2) | 350 | < .001 | 5.1 (1.5) | 5.7 (1.5) | 557 | .001
Preference | 4.6 (1.6) | 6.4 (0.9) | 160 | < .001 | 6.1 (1.0) | 5.0 (1.8) | 401 | < .001
Familiarity | 1.9 (0.8) | 2.8 (0.8) | 140 | < .001 | 2.4 (0.6) | 2.3 (1.1) | 672.5 | n.s.

The distribution of ratings for the different scales and conditions suggests a complex relationship between the ratings.

Figure 3.6: The distribution of the participants' subjective ratings per scale (L - preference, V - valence, A - arousal, D - dominance, F - familiarity) for the 4 affect elicitation conditions (LALV, HALV, LAHV, HAHV).

We explored the mean inter-correlations of the different scales over participants (see Table 3.5), as they might be indicative of possible confounds or of unwanted effects of habituation or fatigue. We observed high positive correlations between preference and valence, and between dominance and valence. Seemingly, without implying any causality, people liked music that gave them a positive feeling and/or a feeling of empowerment. Medium positive correlations were observed between arousal and dominance, and between arousal and preference. Familiarity correlated moderately positively with preference and valence.
As already observed above, the scales of valence and arousal are not independent, but their positive correlation is rather low, suggesting that participants were able to differentiate between these two important concepts. Stimulus order had only a small effect on the preference and dominance ratings, and no significant relationship with the other ratings, suggesting that effects of habituation and fatigue were kept to an acceptable minimum.

Table 3.5: The means of the subject-wise inter-correlations between the scales of valence, arousal, preference, dominance, familiarity, and the order of the presentation (i.e., time) for all 40 stimuli. Significant correlations (p < .05) according to Fisher's method are indicated by asterisks.

 | Preference | Valence | Arousal | Dom. | Fam. | Order
Preference | 1 | | | | |
Valence | 0.62* | 1 | | | |
Arousal | 0.29* | 0.18* | 1 | | |
Dom. | 0.31* | 0.51* | 0.28* | 1 | |
Fam. | 0.30* | 0.25* | 0.06* | 0.09* | 1 |
Order | 0.03* | 0.02 | 0.00 | 0.04* | – | 1

In summary, the affect elicitation was in general successful, though the low valence conditions were partially biased by moderate valence responses and higher arousal. The high scale inter-correlations observed are limited to those of valence with preference and dominance, and might be expected in the context of musical emotions. The remaining scale inter-correlations are small or medium in strength, indicating that the scale concepts were well distinguished by the participants.

3.3.2 Physiological measures

To validate the affect induction protocol via objective measures, we analyzed the physiological responses to the affective stimulation. Following the psychophysiological literature on emotion induction, and specifically on musical emotion induction, we tested our hypotheses for the electrodermal and electromyographical measurements. An overview of the rmANOVAs computed for each sensor is presented in Table 3.6.
Table 3.6: The effects of the repeated measures ANOVAs for the physiological sensors for the valence and arousal manipulations: mean (standard deviation), F-value, p-value, and effect size ηp².

Valence | Negative | Positive | F | p | ηp²
trapezius (μV) | 5.63 (3.20) | 6.18 (5.79) | 0.111 | n.s. | 0.004
zygomaticus (μV) | 3.19 (1.54) | 4.47 (3.42) | 17.917 | < .001 | 0.408
SCL (μS) | -0.09 (0.15) | – (0.09) | 0.195 | n.s. | 0.008

Arousal | Low | High | F | p | ηp²
trapezius (μV) | 5.84 (3.83) | 5.97 (4.87) | 4.059 | .054 | 0.127
zygomaticus (μV) | 3.50 (2.19) | 4.17 (2.80) | 11.457 | .002 | 0.306
SCL (μS) | -0.09 (0.12) | -0.05 (0.10) | 3.686 | .066 | 0.128

Zygomaticus major muscle

The electromyographical activity recorded from the zygomaticus major (smile muscle) differentiated positive and negative valence. A main effect of valence indicated the expected higher activity for positive stimulation (F = 17.917, p < .001, ηp² = 0.408). Unexpectedly, we also found a main effect of arousal, indicating higher activity for arousing stimuli (F = 11.457, p = .002, ηp² = 0.306). No interaction between valence and arousal was found.

Electrodermal activity

For the electrodermal activity, a trend for a main effect of arousal indicated the expected increase of the skin conductance level for higher arousing stimuli (F = 3.686, p = .066, ηp² = 0.128). No main effect of valence and no interaction effect were found.

Trapezius muscle

The electromyographical activity recorded from the trapezius (neck muscle) revealed a trend for a main effect of arousal, indicating the expected higher activity for high arousing stimulation (F = 4.059, p = .054, ηp² = 0.127). No main effect of valence or interaction effect was found.

In general, the differences found for the physiological measurements corroborate the subjective ratings, and hence support the efficacy of the affect induction protocol.
For valence, the higher activity of the zygomaticus major during positive stimuli suggests that positively valenced stimuli led to more and stronger smiling. For arousal, the higher activity of the trapezius and the higher skin conductance levels for more compared to less arousing stimuli indicate stronger tension of the neck musculature and increased electrodermal activity during musically induced arousal states. However, we also observed higher zygomaticus major activity during high arousing stimuli, compared to low arousing stimuli. This unexpected finding might be explained by a generally stronger activity of the facial musculature during arousing stimuli. Unfortunately, we did not record the activity of the corrugator muscle, which should have behaved in a manner opposite to the zygomaticus, being involved in frowning and therefore more active during negative affect. In the case of a generally higher activity of the facial muscles during arousing affect, the corrugator's activity would instead increase in parallel with that of the zygomaticus. Alternatively, there might be a difference between less and more arousing positive stimuli in terms of smiling behavior: more arousing positive stimuli would lead to strong smiles or even laughter, whereas less arousing, relaxing positive stimuli might lead to smiles as well, but less marked ones.

3.3.3 Correlates of EEG and Ratings

The correlations between general band power, computed according to Grosse-Wentrup et al. [2011], and the subjective ratings yielded highly significant correlations for valence, but only marginally significant correlations for arousal. Moreover, we observed correlations for preference and stimulus order, the latter indicating order effects and hence possible confounds in the analysis. Table 3.7 summarizes the results of the correlations of general frequency band power and subjective ratings/stimulus order.
Below we give an overview of the electrode-wise correlates between the subjective ratings and the frequency bands (see Appendix A for tables of the correlations and their significances). Furthermore, we evaluate the outcome of a baselining approach, a method to attenuate potential confounds due to order effects.

Table 3.7: The correlations ρ (and their significances) between general frequency band power and ratings/order for the six broad frequency bands.

             Valence          Arousal         Preference      Order
f            ρ     p          ρ     p         ρ     p         ρ     p
Theta        0.14  < .001     0.05  .07       0.04  n.s.      0.11  < .001
Alpha        0.03  n.s.       0.05  n.s.      0.02  n.s.      0.07  .012
Beta         0.10  < .001     0.06  .078      0.07  .024      0.15  < .001
Gamma        0.12  < .001     0.01  n.s.      0.09  .008      0.13  < .001
High Gamma   0.11  < .001     0.01  n.s.      0.09  .002      0.10  .008

Valence

The correlations between the subjective ratings and the general frequency band power showed effects of the valence manipulation in the theta, beta, and gamma frequency bands. Figure 3.7 shows the correlations between valence and the frequency power in the different broad bands. (The distribution of the high gamma correlations is similar to that seen for low gamma and hence not depicted, but supplied in Table A.1 in the appendix.)

Figure 3.7: The scalp distribution of the correlation coefficients (computed over all participants) of the valence ratings with the power in the broad frequency bands of theta, beta, gamma and high gamma. The highlighted sensors correlate significantly (p < .01) with the ratings. Nose is up.

For the theta band, the electrode-wise correlations showed a highly significant positive correlation over fronto-central electrodes, and significant correlations over the posterior region. For the beta band, we found significant positive correlations over left and right temporal regions. For the gamma bands, we observed highly significant increases of power at left and right temporal electrodes.
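Assuming the coefficient ρ reported here denotes Spearman's rank correlation (an assumption; the thesis does not restate the formula at this point), the electrode-wise correlation between ratings and band power can be sketched with a small stdlib implementation:

```python
def ranks(xs):
    """1-based ranks, with ties sharing their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-trial valence ratings and theta power at one electrode:
valence = [3, 7, 5, 8, 2, 6]
theta = [0.9, 1.4, 1.1, 1.6, 1.0, 1.2]
print(round(spearman(valence, theta), 2))
```

Applied per electrode and per band, this yields the correlation maps shown in the figures; the significance thresholding would require an additional test of rho against zero.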
To explore the presence of a frontal alpha lateralization, observed in general and musical emotions as a correlate of approach and withdrawal motivation, we computed alpha lateralization indices for the frontal electrode pairs of AF3 and AF4, F3 and F4, and F7 and F8, and correlated these with the valence ratings. We expected a positive correlation between index and valence, since the index increases with left-sided activation and decreases with right-sided activation. However, none of the correlations between the indices and the valence ratings reached significance (see Table 3.8).

Table 3.8: The correlations ρ between the frontal alpha asymmetry measures and the valence ratings.

Electrode pair   ρ      p-value
AF3 - AF4        0.04   n.s.
F3 - F4          0.01   n.s.
F7 - F8          0.02   n.s.

Arousal

For arousal, we observed no significant general correlations. However, we found trends toward significant effects (p < 0.1) in the theta and beta frequency bands (see Figure 3.8 below and Table A.2 in the appendix). Based on the literature, we expected a (negative) correlation of arousal with the alpha band. The correlation of general band power and arousal, however, did not reach significance. Nevertheless, it is interesting to note that the electrode-wise analysis showed some highly significant correlations over central and posterior regions.

Figure 3.8: The scalp distribution of the correlation coefficients (computed over all participants) of the arousal ratings with the power in the broad frequency bands of theta, alpha, and beta. The highlighted sensors correlate significantly (p < .01) with the ratings. Nose is up.

Preference

The neurophysiological correlates between general frequency band power and preference ratings were found in the beta and gamma frequency bands (see Figure 3.9 below and Table A.3 in the appendix).
The bilateral temporal increases found in the beta and gamma bands are similar to those observed for valence: the valence and preference correlations seem very similar in terms of sign, size and location, which might be a result of the high inter-correlation of the two ratings.

Figure 3.9: The scalp distribution of the correlation coefficients (computed over all participants) of the preference ratings with the power in the broad frequency bands of beta, gamma and high gamma. The highlighted sensors correlate significantly (p < .01) with the ratings. Nose is up.

Order effects

Correlations with stimulus order manifested themselves in all frequency bands. For the correlation analysis used to explore correlates of affect, such linear trends over time are worrisome for two reasons. Firstly, long-term changes (drifts) in frequency power increase the stimulation-unrelated variance in the data, and hence might conceal smaller stimulus-related effects. Secondly, these changes might introduce confounds in the analysis, especially in a correlation analysis based on ratings of experience, which are also subject to order effects, such as habituation.

The standard procedure to attenuate the influence of order effects is to compute the effects of stimulation relative to the measurements of the preceding pre-stimulation interval. We recalculated the electrode-wise correlations between general band power and the subjective ratings according to such a "baselining" approach. Table 3.9 summarizes the correlations between general band power and ratings/order computed on the basis of the baselined data.

The results of the order-corrected analysis deviate from those of the non-corrected analysis. The most prominent effects are the reductions of the correlation sizes in the theta and beta bands. This reduction leads to the disappearance of the effects for valence and arousal in both bands. Similarly, the beta band correlation with preference loses its significance.
Table 3.9: The correlations ρ (and their significances) between general frequency band power, based on baselined power measures, and ratings/order for the six broad frequency bands.

             Valence          Arousal         Preference      Order
f            ρ     p          ρ     p         ρ     p         ρ     p
Theta        0.04  n.s.       0.00  n.s.      0.05  .096      0.08  .002
Alpha        0.03  n.s.       0.02  n.s.      0.02  n.s.      0.11  < .001
Beta         0.04  n.s.       0.01  n.s.      0.04  n.s.      0.10  .002
Gamma        0.10  < .001     0.03  n.s.      0.06  .074      0.07  .028
High Gamma   0.09  .004       0.01  n.s.      0.05  n.s.      0.09  .004

These changes of the general correlations are evidence for the presence of order effects in the analysis. We will discuss their implications in Section 3.4.3.

3.4 Discussion: Validation, Correlates, and Limitations

The goal of the study was the implementation and validation of an affect induction protocol using music clips, and the exploration of neurophysiological correlates of the affective states induced by it. The analysis revealed several limitations that will also be discussed here.

3.4.1 Validity of the Affect Induction Protocol

The validity of the implemented affect induction protocol using music clips was suggested by evidence from the participants' ratings and their physiological responses to the stimuli. Specifically, the clear and strong effects of the stimulation on the valence and arousal ratings indicate the reliability of the induction method. The arousal manipulation had no significant effect on the valence ratings, and vice versa, suggesting an even distribution of the induced experiences over the valence-arousal plane. Nevertheless, we saw correlations between the rating scales. This covariation of valence and arousal might reflect a natural characteristic of musical affect, with higher arousal often accompanying higher valence.
In this regard, it is challenging to find highly arousing negative music videos, though certainly some death metal or hardcore clips do feature gruesome and arousing scenes. Concerning the physiological responses to the affective stimulation, we found the expected difference in zygomaticus muscle activity for the valence manipulation, and marginally significant differences in trapezius muscle and electrodermal activity for the arousal manipulation. Unexpectedly, however, zygomaticus activity also differed with arousal. With the current data, we are not able to check whether this increase in activity is due to the covariation between valence and arousal, or merely an effect of a higher general activation of the (facial) musculature during arousing stimulation. In general, the observed differences justify the notion that different affective states were induced by the music clips. Hence, we explored the neurophysiology of these affective states.

3.4.2 Classification of Affective States

Further evidence for the success of the manipulation of valence and arousal is given by approaches to classify the affective states by their neurophysiological correlates. Specifically, Soleymani et al. [2011a] were able to classify valence, arousal, and preference, each formulated as a two-class problem (distinguishing between low and high ratings, respectively). As the number of trials per class differed, the significance of the classification was assessed by comparison against a random classification with biased class sizes comparable to the class distributions of the valence, arousal, and preference scales, respectively. The classification approach, a linear ridge regressor, performed significantly better than this biased random classifier.
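The significance assessment described above, comparing the observed accuracy against a classifier that guesses with the empirical class priors, can be simulated directly; the labels and the accuracy of 0.70 below are hypothetical placeholders, not values from Soleymani et al. [2011a]:

```python
import random

def random_classifier_null(labels, n_sim=10000, seed=0):
    """Accuracies of a classifier guessing with the empirical class
    priors: a null distribution for the 'biased random classifier'."""
    rng = random.Random(seed)
    n = len(labels)
    p_high = sum(labels) / n  # proportion of 'high' labels
    accs = []
    for _ in range(n_sim):
        guesses = [1 if rng.random() < p_high else 0 for _ in range(n)]
        accs.append(sum(g == y for g, y in zip(guesses, labels)) / n)
    return accs

def p_value(observed_acc, null_accs):
    """One-sided p: fraction of null accuracies >= the observed accuracy."""
    return sum(a >= observed_acc for a in null_accs) / len(null_accs)

# Hypothetical: 40 trials, 24 labeled 'high', observed accuracy 0.70
labels = [1] * 24 + [0] * 16
null = random_classifier_null(labels)
print(p_value(0.70, null))
```

With a class prior of 0.6, the expected chance accuracy is 0.6² + 0.4² = 0.52, so an observed accuracy well above that yields a small p-value.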
Together with the subjective ratings, physiological indicators, and neurophysiological effects, we may claim (1) that we induced a variety of affective states differing in valence and arousal, and (2) that these states can indeed be differentiated by their neurophysiological correlates. The next section discusses the neurophysiological correlates that are the common basis of the classifier performance.

3.4.3 Neurophysiological Correlates of Affect

The primary motivation for the implementation of the affect induction protocol was the induction of affect-related neurophysiological responses that would form a basis for the training of affective BCIs. We found strong neurophysiological correlates of valence and of preference ratings. Correlates of arousal ratings were only present at the electrode level, but not strong enough for a correlation with general frequency band power, the indicator we used to guide the electrode-wise testing.

Valence

General correlations with valence ratings were present in several frequency bands: theta, beta, and gamma. Positive correlations in the theta band were previously reported in the context of musical emotions by Sammler et al. [2007] and Lin et al. [2010]. Both studies associated the theta effect with the anterior cingulate cortex (ACC), a neural structure involved in affective and cognitive processes [Bush et al., 2000]. Başar et al. [2001] speak of a selectively distributed theta system that has its strongest generators within the limbic system and controls responses to environmental stimuli. Similarly, Yordanova et al. [2002] associate theta responses with modality-independent stimulus evaluation processes.
Koelsch [2010] noted that the association of the ACC with changes in autonomic activity [Critchley et al., 2003], with cognitive processes, for example stimulus evaluation and performance monitoring, and with affective responses, suggests a role of the ACC in the synchronization of biological subsystems. Consequently, the different components (and the respective biological systems) involved in the emotional response [Scherer, 2005] might be synchronized via the ACC, yielding the conscious percept of the feeling.

The positive correlations found in the high-frequency beta and gamma bands have also been reported previously for different induction approaches [Cole and Ray, 1985; Aftanas et al., 2004; Lutz et al., 2004; Onton and Makeig, 2009], though some studies reported mixed [Müller et al., 1999; Keil et al., 2001] or opposite effects of the valence manipulation [Gross et al., 2007]. Such high-frequency band responses, especially increases over visual cortices during visual stimulation, have often been reported in the context of increased sensory processing of attended, known or meaningful stimuli [Herrmann et al., 2004]. In that regard, gamma band oscillations have been interpreted as part of a cerebral mechanism supporting the binding of different object features into one perceived object, yielding a holistic percept. In the context of emotion, the amygdala has been identified as a source of increased gamma band activity in response to emotional stimuli [Oya et al., 2002; Luo et al., 2009]. Consequently, it is tempting to identify the increased temporal high-frequency activity as a direct consequence of the workings of the limbic system or of associated structures, which reside in the medial temporal lobes.
However, despite evidence for a cortical origin of affect-related high-frequency oscillations also in the EEG, localized in the anterior temporal lobes [Onton and Makeig, 2009], the high frequency bands are notorious for containing electromyographic activity [Goncharova et al., 2003]. Taking the high covariation of the zygomaticus major with valence into account, it is quite possible that the high-frequency correlates mainly result from the muscle tension associated with smiling. The electromyographical activity of the involved facial muscles might propagate to the scalp. Without a careful separation of EMG and EEG sources, as attempted by Onton and colleagues by use of independent component analysis, a definite attribution of the effects observed in the present study is not possible.

Frontal Alpha Asymmetry and Valence

We also tested the hypothesis of a positive correlation of frontal alpha asymmetry indicators with valence, but found no effects for any of the frontal electrode pairs. Other studies manipulating valence have likewise failed to find alpha asymmetries (e.g., see [Schutter et al., 2001; Sarlo et al., 2005; Winkler et al., 2010; Fairclough and Roberts, 2011; Kop et al., 2011]). There are methodological differences between our study and those that reported musically induced, valence-related alpha asymmetries [Schmidt and Trainor, 2001; Tsang et al., 2006]. The latter used a different reference scheme, referencing all electrodes to the vertex, whereas we used a common average reference. Despite observations of comparable results yielded with average and vertex reference schemes (see [Allen et al., 2004]), Hagemann [2004] discourages the use of the average reference to investigate frontal alpha asymmetries, as it might introduce additional variance in the frontal alpha band.
On the other hand, frontal alpha asymmetry is often interpreted in relation to motivational direction (approach/withdrawal) [Harmon-Jones et al., 2010] rather than to induced valence (positive/negative). While the concepts of valence and motivational direction overlap in general, negatively valenced emotions being withdrawal-related and positively valenced emotions being approach-related, some of our musical stimuli might not mirror that motivational division. Negative songs, especially, might induce anger, a negative but approach-related emotion. Furthermore, the motivational component might not have been strong enough in the manipulation. While the dimensions of valence and motivation are supposed to be congruent for most emotions, only differing for (negative, but approach-related) anger, Harmon-Jones et al. [2008] found that differences in the motivational strength of positive emotions are reflected in the strength of the frontal alpha asymmetry.

Preference

General correlations with preference in the beta and gamma bands followed the effects of valence in location and size. This can be expected, given the high inter-correlation between both ratings. Cinzia and Vittorio [2009] review several studies of aesthetic experience, which might be comparable to the experience associated with preference, in that the aesthetic features of a certain object determine in part the emotional response. They find that many studies report activation of structures that are also active during emotional experience. Another interesting point about the neurophysiological correlates of general preference, closing the circle to our stimulus selection method, is that those correlations might be exploited in applications such as media recommendation services like last.fm, which could use implicit tagging approaches based on neurophysiology to capture information about user preferences.

Arousal

General correlations with arousal ratings did not reach the criterion for significance.
However, they approached significance for the theta and beta bands. Specifically, we observed significant electrode-wise correlations between these frequency bands and the arousal ratings: a positive correlation with posterior theta and a negative correlation with frontal beta.

As alpha decreases are a prominent correlate of the activation of brain regions and an (inverse) correlate of arousal, the lack of effects in the alpha band was surprising. We expected to see a negative correlation between arousal and alpha power over auditory fronto-central and visual parieto-occipital regions, indicating stronger processing of higher arousing stimuli. On the electrode level, such a negative correlation with alpha was indeed observed. That the effects were too weak to lead to a general correlation between alpha and arousal might indicate a weak covariation between emotional arousal and cognitive (sensory) arousal: also the low arousing songs, for example sad or relaxing pieces, led to intense listening and watching. This corroborates our observation from Chapter 2 that the relation between alpha band power and emotional arousal is rather weak. In Chapter 4, we will explore this relationship further.

Effects of Order

Finally, we investigated the possibility of order effects in the general correlations. This seemed relevant to us, as linear trends in power and ratings, found for the preference rating, could theoretically lead to confounds in the correlation analysis. The general correlations between frequency power and the time of stimulus presentation showed strong order effects, that is, linear changes of frequency power over the duration of the experiment. These might be due to several factors, such as increasing fatigue, tension, emotional habituation, or changes in the connectivity between electrodes and scalp (e.g., drying gel).
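One common way to implement such a baseline correction, an assumption here since the thesis describes the procedure only verbally, is to express each trial's band power relative to the power of the preceding resting interval, e.g. as a log-ratio:

```python
import math

def baseline_correct(trial_power, rest_power):
    """Band power per trial relative to the preceding resting interval,
    expressed as a log-ratio; slow drifts that are common to trial and
    baseline cancel out."""
    return [math.log(t / b) for t, b in zip(trial_power, rest_power)]

# Hypothetical theta power (µV²) with a slow drift over the session:
trials = [4.0, 4.4, 5.1, 5.9]  # power during the trials
rests = [3.8, 4.2, 4.9, 5.6]   # power during the pre-trial rest periods
print([round(v, 3) for v in baseline_correct(trials, rests)])
```

In the illustrative values, the raw trial powers drift upward over the session, while the baselined values stay nearly constant, which is exactly the property that removes linear order effects (and, as discussed below, possibly genuine pre-trial correlates with them).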
We found that baselining the frequency band power of a given trial by the power of the preceding resting period affected the correlations. Specifically, the sizes of the general theta and beta band correlations with valence and arousal, and of the general beta band correlation with preference, were reduced by the baselining approach, and consequently dropped below the significance level. Two possible reasons can be given for the changes of the general correlations after baselining. As laid out above, affect-unrelated order effects might have led to a correlation of frequency power and ratings. Alternatively, it is possible that the effects observed prior to the baseline correction, as for example the theta band correlation with valence, are still linked to affect. While no assumption about causality can be made with the correlation approach used, it is not unlikely that pre-trial brain activity partially determines the subjective experience in response to an affective stimulus (cf. the mediator/moderator distinction for alpha asymmetries by Coan and Allen [2004]). Studies that probe behavior based on the activation of certain structures or the instantaneous presence of brain rhythms, for example by applying stimulation contingent on that activity or by directly manipulating its occurrence (e.g., by transcranial magnetic stimulation (TMS)), have found evidence that such activity partially determines cognitive functions and experience. In that sense, the baseline correction could have removed such response-determining pre-trial activity, thereby removing a genuine correlate of affect. However, such correlates should rather be related to the determinants of the affective response than to the response as such. From these observations it follows that the use of subjective ratings for the ground truth determination has advantages and disadvantages.
They allow a better estimate of the actual affective state that is induced by a given stimulus, especially when using musical stimulation, which naturally depends on personal music listening preferences. On the other hand, they might introduce order effects into the correlations of frequency power and subjective ratings. A simple removal with baselining approaches, as done here, might also remove correlates that carry information about the affective state of a participant, as the pre-trial activity could be of an affective nature (e.g., due to response-defining changes in mood, and thus in brain activity).

3.4.4 Emotion Induction with Music Clips: Limitations

We set out to explore the viability of music clips for a reliable and ecologically valid multimodal affect induction. Despite encouraging results in terms of subjective ratings, physiological indicators of affect, and neurophysiological correlates, we encountered several issues that limit the reliability of the affect induction procedure. We address these issues and suggest, where possible, changes to our approach that might remedy the problems.

Coverage of the Stimulus Space

Despite our attempts to select the most effective stimuli from each quadrant of the valence-arousal plane, that is, those that had the most extreme values during the online ratings, the ratings in the final experiment revealed less extreme valence values for low arousal emotions. The resulting C-shape of the stimuli on the valence-arousal plane can also be observed with the often used IAPS and IADS stimulus sets, containing affective pictures and sounds, respectively. One could argue that low arousal emotions are in general attenuated with respect to the experienced valence. However, it might also be a limitation of the affect induction possible in a laboratory.
For example, grave sadness or depression, a highly negative and low arousing state, is difficult to induce in the laboratory due to practical and ethical concerns. Lying at the extreme ends of the valence and arousal dimensions, such affective states might, however, be interpolated from the less extreme observed responses. A possible improvement of the impact of low arousing emotions could be achieved by combining the musical stimulation with an additional affect induction procedure, for example the instruction to imagine the emotion that is to be induced by the clip. Such a combination with a second affect induction procedure generally increases the efficacy of the affect induction [Westermann et al., 1996].

Confounds of Familiarity

We observed medium correlations of valence, arousal, and preference with the familiarity ratings. Specifically, the familiarity ratings showed differences between the negative high arousal condition and the other emotion conditions. This is a consequence of the selection of video clips: the negative high arousal condition consisted largely of heavy-metal music clips, which were less popular than the clips chosen for the other conditions. Other studies exploring musical emotions chose pieces that were rather unknown, some even composing music for the study. With video clips, composing and filming new material is not an option. The selection is further limited, as the selected music should be available as a video clip and popular enough to enable a selection via tags from social media sites. Alternatively, the selection of the music clips could be subject-specific, taking the music listening preferences of the participant into account. An alternative to the self-selection of music clips by the participants is to select music clips that fit the preference profile of a given participant in social music networks, like last.fm.
Confounds by Stimulus Features

The current study did not control for the perceptual features of the stimuli used for affect induction, but selected stimuli only on the basis of their effects on the affective experience in the online study. As mentioned by Juslin and Västfjäll [2008], certain perceptual aspects of the musical stimuli, and likewise of the visual stimulation with video clips, for example pitch, loudness, color, or luminance, might influence the affective experience. For physiological measurements, a strong association between physiological changes and content features can be observed. Gomez and Danuser [2007] found strong correlations between physiological measurements, such as respiration, heart rate, and skin conductance, and musical content features, such as tempo, accentuation, and rhythmic articulation: "Music that induced faster breathing and higher minute ventilation, skin conductance, and heart rate was fast, accentuated, and staccato. This finding corroborates the contention that rhythmic aspects are the major determinants of physiological responses to music." Similarly, Van Der Zwaag et al. [2011] hold that musical characteristics, such as tempo, mode, and percussiveness, partially cause the emotional experience, and that therefore their effects on physiology cannot be separated from those effects stemming from the emotional experience per se. On the other hand, Etzel et al. [2006] stressed the potentially detrimental effects that physiological entrainment has on correlations with affect. The physiological variance that is induced by purely perceptual characteristics of the music might conceal the relationship between physiology and musical emotion. Koelstra et al. [2012] found the information in the content features of the stimulus sound and film material sufficient for a classification of the affective experience.
The content features were even more informative than the EEG frequency-domain features. Further studies are needed to test the dependence of the frequency features on content features, and to ensure the generalization of the emotion effects to other emotion induction methods.

3.5 Conclusion

We proposed and evaluated an affect induction protocol which uses multimodal, musical stimuli to induce different affective states. The validity of the induction procedure, its capability to manipulate experience along the dimensions of valence and arousal, was supported by the expected differences we found in the subjective experience and the physiological responses of the participants. The exploratory correlation analysis of the neurophysiological responses to the affective stimulation delivered further evidence for the validity of the induction procedure. In particular, the strong responses for valence in the theta, beta and gamma range are in accordance with the literature. The arousal manipulation, however, did not deliver significant overall effects, though tendencies toward significance were found for the theta and beta bands, and the expected negative correlation with the alpha band was present for some electrodes. The participants' general preference for specific music pieces was, as expected from the high correlations with the valence ratings, correlated with partially the same frequency bands as the valence ratings, namely beta and gamma. Furthermore, we found order effects in all frequency bands and explored the influence of a baselining approach. The computation of responses relative to a pre-trial baseline had an influence on the correlations, leading to a different pattern of results for all ratings. Therefore, we conclude that order effects are an issue for experiments that induce affective states, and consequently for affective BCIs, especially when subjective ratings are used as a ground truth.
However, baselining approaches might also remove genuine correlates of affect, namely such pre-stimulus activity that partially determines the emotional response. The suggested and validated affect induction procedure offers a way to extract large numbers of stimuli from the internet, using the rich repository of tagged multimedia platform content. Thereby a large number of multimodal stimuli, specifically produced to induce affect, can be identified and acquired in a relatively short time. These stimuli are consumed on a daily basis by large parts of the normal population and therefore guarantee the ecological validity of the stimulation. Physiological studies suggest that the emotional responses are comparable to non-musically induced emotional responses. However, how general or context-specific the musically induced neurophysiological responses are has yet to be explored. Are the observed correlates also valid in contexts other than passive music consumption? We pointed out some limitations of the current study, specifically of the affect induction by music videos, which have to be resolved in future studies: only modest valence experiences for low arousing emotions, a covariation of familiarity with the subjective ratings of affect, and a potential influence of stimulus features on the neurophysiological correlates. We propose a further investigation of the advantages of a subject-specific selection of clips, to yield more extreme valence experiences for low arousing emotions and to avoid confounds by familiarity. Also the combination of the musical with an additional affect induction procedure, for example the instruction to imagine the emotion that is to be induced by the clip, might help to produce stronger affective responses.
Chapter 4

Dissecting Affective Responses

4.1 Introduction¹

Affective BCIs are supposed to recognize emotions in the complexity of the real world, which is different from the strictly controlled laboratory environments in which emotional responses are normally studied. To build affective BCIs that also function reliably in the real world, the identification of reliable affect correlates is a basic requirement. However, according to Scherer [2005], emotional responses are complex compounds of core affective processes (conscious feeling, self-monitoring) and other, cognitive processes (information processing, behavior planning). Consequently, in standard affect induction protocols, for example using music clips as described in Chapter 3, the co-occurrence of activations associated with rather cognitive processes in addition to the relevant purely affective processes cannot be excluded. For example, assuming that an emotional stimulation is not only followed by a conscious perception of the successfully induced emotion (Scherer's self-monitoring component), but also by an additional cognitive stimulus evaluation (Scherer's information processing component), the measured neurophysiological activations will contain a mix of affective and cognitive processes. This is relevant for affective BCI because, consequently, the training data contain not only information about the affective state, but also information about cognitive processes. Depending on the strength and reliability of the cognitive correlates, the classifier might learn these in addition to, or worse, instead of the affective correlates. As such correlates do not occur exclusively during emotions, but also during non-emotional cognitive processes, for example during attention to a certain object or event, such "affective/cognitive" classifiers would also respond during non-emotional processes in complex real-world environments.
To avoid such confounds, potentially hampering the robustness of affective BCIs, methods for the separation of affective from cognitive parts of the emotional response have to be investigated. Here we suggest and evaluate an affect induction protocol that might, to a certain degree, enable the distinction between affective and cognitive correlates of emotional responses. Specifically, we aimed at the separation of modality-specific cognitive processes and general, potentially affective processes, by inducing affect via stimulation from different stimulus modalities, namely visual (pictures) and auditory (sounds). Below, we will briefly review evidence for modality-specific cognitive processes during emotional responses and their reflection in the EEG, and we will outline our hypotheses.

1 The work presented here is based on a pilot study [Mühl and Heylen, 2009] that led to significant methodological changes in our approach. The affective efficacy of a subset of the stimuli and the experimental design were tested in a collaboration with TNO [Brouwer et al., 2012], and parts of the analysis of the ratings, physiological, and neurophysiological data presented here have been published in [Mühl et al., 2011b] and [Mühl et al., 2011a].

4.1.1 Cognitive Processes as Part of Affective Responses

Appraisal theories, as in the Component Process Model of Scherer [2005], postulate the importance of the analysis of the stimulus for the subsequent emotional response. In Scherer's model, this analysis is a complex interaction of several cognitive functions, involving attention, memory, and other higher-level evaluations, scrutinizing stimulus features such as novelty, pleasantness, goal conduciveness, coping potential, and self-compatibility [Sander et al., 2005]. While these processes happen very fast and partially subconsciously, their outcome determines the conscious feeling.
There is considerable evidence for the existence of cognitive stimulus evaluation processes as part of the emotional response. Functional neuroimaging studies provide evidence for stimulus-specific cognitive responses to affective stimulation. They show that correlates of affective responses can be found in core affective areas in the limbic system and associated frontal brain regions, but also in modality-specific areas associated with stimulus processing in general [Kober et al., 2008]. Emotional postures [Hadjikhani and de Gelder, 2003] and facial expressions [Pourtois and Vuilleumier, 2006] led to a stronger activation of areas known to be involved in (visual) posture and face processing than their neutral counterparts. Similarly, affect-inducing sounds [Grandjean et al., 2005] activated auditory cortical regions more strongly than neutral sounds, and auditorily induced affective states could be classified on the basis of the activations measured within these regions [Ethofer et al., 2009]. It should be noted that these responses are similar in nature to purely cognitive responses, as observed during attentional orienting to a specific modality [Vuilleumier, 2005].

4.1.2 Modality-specific Processing in the EEG

For the EEG, modality-specific affective responses (e.g., to visually induced, auditorily induced, or self-induced affect) have seldom been investigated. Considering the limited spatial resolution that EEG measures can achieve, the study of modality-specific affective responses with EEG needs to directly compare two different affect induction modalities. The few studies that did compare the effects of different stimulus modalities reported neurophysiological differences due to affect that varied between the modalities. Spreckelmeyer et al. [2006] noted different ERP loci in response to visually (facial expressions) compared to auditorily (sung notes) attended affective stimuli.
Visual affective responses were prominent over posterior electrodes, while auditory affective responses were seen over more anterior electrodes. Cole and Ray [1985] compared affective responses either induced by a self-induction procedure (imagination) or by visual stimuli (pictures), and found that right posterior alpha power differed between the induction modalities, whereas temporal beta power differed between the induced affective states. The posterior alpha effect was taken to reflect an internal versus external attention focus during self-induced versus visual affect, respectively. Finally, Baumgartner et al. [2006a] found an activation of posterior cortices occurring mainly for visual affective stimuli (pictures), while it was almost absent for auditory stimuli. The absence of a comparable activation of anterior cortices in response to auditory affective stimuli might be explained by the lack of electrodes over fronto-central sites. In the following, we present evidence for a posterior/anterior division of the modality-specific responsiveness of alpha activity to auditory and visual stimulation to motivate our hypotheses.

Most research on EEG responses to affective stimulation uses visual affect induction by pictures [Olofsson et al., 2008], which is why most is known about the neurophysiological correlates of this stimulus modality. The late positive potential, for example, is a hallmark of visual affective correlates that is strongest over posterior cortical sites [Hajcak et al., 2010] and can be traced back to the activation of a network of visual cortices [Sabatinelli et al., 2006]. In the frequency domain, posterior alpha band power is known as a strong indicator of visual processing, for example in response to the manipulation of visual attention [Pfurtscheller and Lopes da Silva, 1999; Jensen and Mazaheri, 2010; Foxe and Snyder, 2011].
Correspondingly, combined fMRI and EEG measurements found an inverse relationship between alpha band power and the activation of underlying cortices [Laufs et al., 2003]. The behavior of posterior alpha band power thus indicates the intensity of processing in visual cortices: decreasing alpha indicates more intense visual stimulus processing, whereas increasing alpha indicates less intense visual stimulus processing. Conveniently, the power of the alpha band, especially over posterior regions, was also shown to respond to visual affective manipulation [Aftanas et al., 2004; Baumgartner et al., 2006a; Güntekin et al., 2007]. Therefore, alpha band power is of special interest here, as it responds strongly to sensory processing in a modality-specific way and could be used to mark cognitive processes in response to affective manipulation.

Less is known about EEG correlates of auditory affective stimulation, as the activation of auditory areas is less easy to assess by EEG than that of visual areas [Pfurtscheller and Lopes da Silva, 1999]. However, magnetoencephalographic and EEG measurements have shown that alpha activity in the superior-temporal cortex correlates with auditory stimulus processing [Weisz et al., 2011], and is supposed to be reflected in fronto-central EEG leads [Hari et al., 1997; Mayhew et al., 2010]. Consistent with this expectation, auditory-related cognitive processes were associated with differences in the amplitude of anterior and temporal alpha oscillations in the EEG [Krause, 2006; Weisz et al., 2011]. A link between affective auditory manipulation and alpha band power was suggested by [Panksepp and Bernatzky, 2002; Baumgartner et al., 2006a; Schmidt and Trainor, 2001; Tsang et al., 2006], making fronto-central alpha power a potential correlate of cognitive processes in response to affective manipulation.
Summarizing, posterior alpha power has been associated with visual processing and with visual affective stimulation, whereas anterior alpha power has been linked to auditory processing and might be associated with affect. In general, alpha decreases during active processing and increases otherwise [Pfurtscheller and Lopes da Silva, 1999]. To study the differentiation of posterior and anterior alpha power according to visual and auditory affective stimulation, respectively, we induced affective states differing in valence and arousal via visual, auditory, and audio-visual stimuli. To enable the delineation of modality-specific and general affective correlates, we manipulated the affective content (neutral, positive calm, positive arousing, negative calm, negative arousing) and the modality (visual, auditory, audio-visual) of the stimulation independently. Below, we outline our expectations for the devised induction protocol.

4.1.3 Hypotheses

To verify whether we manipulated emotional states as expected, we also recorded subjective ratings of the participants' emotional states and electrocardiographic (ECG) signals. We expect a decrease of heart rate during negative and arousing emotions, as this has been reported as a sign of successful affect manipulation with the used visual [Codispoti et al., 2006] and auditory stimuli [Bradley and Lang, 2000].

H1a: Valence and arousal manipulation affect the subjective ratings of valence and arousal accordingly.
H1b: Negative affective stimulation leads to a stronger decrease of heart rate than neutral or positive affective stimulation.
H1c: High arousing affective stimulation leads to a stronger decrease of heart rate than low arousing affective stimulation.

In accordance with the literature, we expect affect-related responses to pictures in the alpha band power over posterior cerebral regions.
We define a region of interest (ROI) pa, for parietal, that comprises the electrodes P3, Pz, and P4, and formulate our hypothesis for the expected response to the modality-specific affective stimulation: in the case of visual affect induction, alpha band power at pa decreases, as expected for increased visual processing. During auditory affect induction, alpha power at pa will increase, as expected for the inhibition of visual processing. During the audio-visual induction, however, a strong decrease of alpha at pa is expected.

H2a: Visual affective stimulation leads to a stronger decrease of parietal alpha power compared to auditory stimulation.

Main effects of auditory affective stimulation might be anticipated in the activity over anterior cerebral regions. We define an ROI fc comprising the fronto-central electrodes FC1, Cz, and FC2, and formulate the following hypotheses: in the case of visual affect induction, alpha band power at fc increases, as expected for decreased auditory processing. During auditory affect induction, alpha power at fc will decrease, as expected during increased auditory processing. During the audio-visual induction, a strong decrease of alpha at fc is expected.

H2b: Auditory affective stimulation leads to a stronger decrease of fronto-central alpha power compared to visual stimulation.

As a general effect of valence, we expect a relative right shift of alpha band power with increasing valence (activation of left frontal ventrolateral regions), reported for various affect induction modalities [Schmidt and Trainor, 2001; Tsang et al., 2006; Huster et al., 2009] in terms of the frontal alpha asymmetry [Allen et al., 2004]. In addition to the hypothesis-guided analyses, we conducted an exploratory analysis of the differences induced by valence and arousal in the frequency bands of delta, theta, alpha, beta, and gamma.
H3a: Positive relative to negative affect leads to a relative right shift of alpha band power (higher frontal alpha asymmetry index).
H3b: Valence manipulation has general effects on the broad frequency bands in the EEG.
H3c: Arousal manipulation has general effects on the broad frequency bands in the EEG.

4.2 Methods: Stimuli, Study Design, and Analysis

4.2.1 Participants

Twenty-four participants (twelve female and twelve male) with a mean age of 27 years (standard deviation of 3.8 years) were recruited from a subject pool of the research group and via advertisement on the university campus. All participants but one were right-handed. Participants received a reimbursement of 6 Euro per hour or, alternatively, course credits.

4.2.2 Apparatus

The stimuli were presented with the "Presentation" software (Neurobehavioral Systems) using a dedicated stimulus PC, which sent markers according to stimulus onset and offset to the EEG system (Biosemi ActiveTwo Mk II). The visual stimuli were presented on a 19" monitor (Samsung SyncMaster 940T). The auditory stimuli were presented via a pair of custom computer speakers (Philips) located at the left and right sides of the monitor. The distance between participants and monitor/speakers was about 90 cm. To assess the neurophysiological responses, 32 active silver-chloride electrodes were placed according to the 10-20 system. Additionally, 4 electrodes were applied to the outer canthi of the eyes and above and below the left eye to derive the horizontal and vertical electrooculogram, respectively. To record the electrocardiogram (ECG), active electrodes were attached with adhesive disks at the left fifth intercostal space and 5 to 8 cm below. Additionally, participants' skin conductance, blood volume pulse, temperature, and respiration were recorded for later analysis. Physiological and neurophysiological signals were sampled at 512 Hz.
4.2.3 Stimuli

To ensure a comparably strong affective stimulation with visual, auditory, and audio-visual stimuli, we constructed a multi-modal stimulus set from normed visual and auditory uni-modal affective stimulus sets.

Figure 4.1: The location of the selected visual (left) and auditory (right) stimuli with respect to the whole stimulus set (grey points) in the valence-arousal plane.

Figure 4.2: The distributions of the selected visual (IAPS) and auditory (IADS) stimuli for each of the 5 conditions along the valence (left) and arousal (right) axes. Note the differences in scale and condition order between both plots.

We selected 50 pictures and 50 sounds from the affective stimulus databases International Affective Picture Set (IAPS; Lang et al. [1999]) and International Affective Digital Sounds Set (IADS2; Bradley et al. [2007]). The advantage of these stimulus databases is the evaluation of each stimulus regarding its effect on the subjective experience of arousal and valence. The picture set consists of stock and press photographs from the 80s and 90s, showing a great variety of affect-inducing motifs, for example mutilations, attacks, pollution, ill and dying people, smiling children, harmonic family scenes, sport scenes, and beautiful landscapes. The sounds are retrieved from movies and music pieces. For both stimulus modalities we selected 10 stimuli for each of 4 categories varying in valence (pleasant and unpleasant) and arousal (high and low). Additionally, a neutral class (i.e., low arousal, neutral valence) with 10 stimuli for each stimulus modality was constructed. First, the arousal-valence space containing the IADS stimuli was divided into 3 sections along the valence axis (negative, neutral, positive), and 2 sections along the arousal axis (low and high).
This was done with the goal of having a similar number of stimuli in each of the 6 sections (low arousal negative, high arousal negative, neutral, low arousal positive, high arousal positive). Second, the thresholds thus created for the IADS stimuli were applied to the arousal-valence space containing the IAPS stimuli, resulting in 6 sections. In a third step, we selected 10 stimuli from each section from the IADS and IAPS sets, reiterating the selection until (1) the stimuli within the stimulus sets and within one valence or arousal condition were as close as possible according to their mean valence or arousal, respectively, and (2) the stimuli within each section, between the stimulus sets, were as close as possible in terms of mean valence and arousal. The resulting selections are depicted in Figure 4.1, with unused stimuli indicated as grey dots to visualize the relation of the chosen subsets to the rest of the stimuli.

Table 4.1: The stimulus IDs of visual [Lang et al., 1999] and auditory [Bradley et al., 2007] stimuli according to the conditions they appeared in and in the order in which they were paired for the audio-visual stimuli (e.g., 1st auditory with 1st visual stimulus of LALV to create the 1st LALV multi-modal stimulus).

Condition  Visual                                                  Audio
LALV       2141 2205 2278 3216 3230 3261 3300 9120 9253 8230       280 250 296 703 241 242 730 699 295 283
HALV       2352.2 2730 3030 6360 3068 6250 8485 9050 9910 9921     600 255 719 284 106 289 501 625 713 244
LAHV       1811 2070 2208 2340 2550 4623 4676 5910 8120 8496       226 110 813 221 721 820 816 601 220 351
HAHV       4660 5629 8030 8470 8180 8185 8186 8200 8400 8501       202 817 353 355 311 815 415 352 360 367
Neutral    2220 2635 7560 2780 2810 3210 7620 7640 8211 9913       724 114 320 364 410 729 358 361 500 425

Table 4.2 presents the mean arousal and valence values as reported in [Lang et al., 1999; Bradley et al., 2007] for each condition.
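The partitioning step of the selection procedure described above can be sketched as follows. The threshold values here are hypothetical placeholders; in the study, the thresholds were derived from the IADS norm ratings so as to balance the section sizes, and were then applied unchanged to the IAPS ratings:

```python
import numpy as np

def va_sections(valence, arousal, v_thresholds=(4.0, 6.0), a_threshold=5.0):
    """Assign stimuli to sections of the valence-arousal plane:
    3 valence bins (0 = negative, 1 = neutral, 2 = positive) and
    2 arousal bins (0 = low, 1 = high).
    The default thresholds are illustrative assumptions only."""
    v_bin = np.digitize(valence, v_thresholds)
    a_bin = np.digitize(arousal, [a_threshold])
    return v_bin, a_bin
```

From each of the resulting sections, 10 stimuli per modality would then be drawn and iteratively refined to match mean valence and arousal within and between the stimulus sets.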
They were matched as well as possible between the emotion conditions, and between modalities (see Figure 4.2). Table 4.1 lists the IDs of the specific IAPS and IADS stimuli used. For the audio-visual conditions, visual and auditory stimuli of the same emotion conditions were paired with special attention to match the content of picture and sound (e.g., pairing of an "aimed gun" picture and a "gun shot" sound). The pictures and sounds of the unimodal conditions were paired in the order they are listed in Table 4.1.

Table 4.2: Mean (std) valence and arousal ratings for the stimuli of each emotion condition computed from the norm ratings of the IAPS [Lang et al., 1999] and IADS [Bradley et al., 2007] stimulus sets.

                 Visual (IAPS)               Auditory (IADS)
Condition        Valence      Arousal        Valence      Arousal
(1) LALV         2.58 (0.60)  5.24 (0.54)    3.05 (0.51)  5.81 (0.43)
(2) HALV         2.26 (0.34)  6.50 (0.22)    2.70 (0.51)  6.79 (0.31)
(3) LAHV         7.53 (0.44)  5.26 (0.52)    7.09 (0.43)  5.59 (0.39)
(4) HAHV         7.37 (0.31)  6.67 (0.38)    7.19 (0.44)  6.85 (0.39)
(5) Neutral      4.92 (0.54)  5.00 (0.51)    4.82 (0.44)  5.42 (0.42)

4.2.4 Design and Procedure

Before the start of the experiment, participants signed an informed consent form. Next, the sensors were placed. Before the start of the recording, the participants were shown the online view of their EEG to make them conscious of the influence of movement artifacts. They were instructed to restrict their movements to the breaks, and to watch and listen to the stimuli while fixating the fixation cross at the centre of the screen. Stimuli were presented in three separate modality blocks in a balanced order over all participants (Latin square design). Each modality block started with a baseline recording in which the participants looked at a black screen with a white fixation cross for 60 seconds. Between the modality blocks there were breaks of approximately 2 minutes in which the signal quality was checked.
Each modality block consisted of the five emotion conditions (see Table 4.2), presented in a pseudo-randomized order, ensuring an approximate balancing of the order of emotion conditions within each modality condition over all subjects. Between the emotion blocks there were breaks of 20 seconds. Each emotion condition consisted of the presentation of the respective 10 stimuli in a randomized order. Each stimulus was presented for about 6 seconds. Stimuli were separated by 2-second blank screens. Finally, all stimuli were presented again to the participants, who rated their affective experience on 9-point SAM scales of arousal, valence, and dominance as used in [Bradley et al., 2007; Lang et al., 1999].

4.2.5 Data Processing and Analysis

After referencing to the common average, the EEG data was downsampled to 256 Hz, high-pass FIR filtered with a cut-off of 1 Hz, and underwent a three-step artifact removal procedure: (1) it was visually screened to identify and remove segments of excessive EMG and bad channels, (2) eye artifacts were removed via the AAR toolbox in EEGLAB, and finally (3) the data was checked for residuals of the EOG. For the estimation of the power within the different frequency bands, Welch's method [Welch, 1967] was used with a sliding window of 256 samples length and 128 samples overlap. It was applied to each of the 15 blocks and their preceding baseline intervals for every participant separately. The resulting power values with a resolution of 1 Hz were averaged according to the different frequency ranges to derive delta (1-3 Hz), theta (3-7 Hz), alpha (8-13 Hz), beta (14-29 Hz), and gamma band power (30-48 Hz). To derive the alpha power measures for the regions of interest, the alpha power for the parietal and fronto-central electrodes was averaged.
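The band power estimation described above can be sketched as follows; the band edges, sampling rate, and window parameters are taken from the text, while function and variable names are our own (the study itself used Matlab):

```python
import numpy as np
from scipy.signal import welch

FS = 256  # sampling rate after downsampling (Hz)
BANDS = {"delta": (1, 3), "theta": (3, 7), "alpha": (8, 13),
         "beta": (14, 29), "gamma": (30, 48)}

def band_power(eeg, fs=FS):
    """eeg: array of shape (n_channels, n_samples).
    Returns a dict mapping band name -> (n_channels,) mean power.
    Welch's method with a 256-sample window and 128-sample overlap
    yields the 1 Hz frequency resolution mentioned in the text."""
    freqs, psd = welch(eeg, fs=fs, nperseg=256, noverlap=128, axis=-1)
    return {name: psd[:, (freqs >= lo) & (freqs <= hi)].mean(axis=1)
            for name, (lo, hi) in BANDS.items()}
```

The ROI alpha measures would then be the mean of `band_power(...)["alpha"]` over the parietal (P3, Pz, P4) or fronto-central (FC1, Cz, FC2) channel indices.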
To approach a normal distribution for later parametric statistical analysis, the natural logarithm was computed, and baselining was performed by subtracting the power of the preceding resting period. To compute heart rate (HR), the ECG signal was filtered by a 2-200 Hz band-pass two-sided Butterworth filter and peak latencies were extracted with the BIOSIG toolbox [Schlögl and Brunner, 2008] in Matlab. Next, the interbeat intervals were converted to HR and averaged for each participant and each of the 15 stimulus blocks (3 modality × 5 emotion blocks).

For the analysis of the effects of the affect induction protocol on ratings, HR, and neurophysiological activity, repeated measures ANOVAs (rmANOVA) were conducted. Subjective ratings and heart rate were first submitted to a 3 (modality) × 5 (emotion) rmANOVA to show the general impact of emotional stimulation, after which a 3 (modality) × 3 (valence) rmANOVA for valence (negative, positive, neutral) and a 3 (modality) × 2 (arousal) rmANOVA for arousal (low, high) were conducted.

For the analysis of the modality-specific effects in the alpha band, 3 (modality) × 2 (emotion) rmANOVAs were conducted for the parietal and fronto-central regions of interest, with emotion being neutral versus the aggregated positive and negative conditions. To determine the nature of the expected interaction of emotion and modality, we computed second-level contrasts of the modality-specific affective responses: visual affect (affective minus neutral visual stimulation), auditory affect (affective minus neutral auditory stimulation), and audio-visual affect (affective minus neutral audio-visual stimulation). We then contrasted the three different modality-specific responses with one-sided, pairwise t-tests. For the exploratory analysis of the effects of valence and arousal on the delta, theta, alpha, beta, and gamma bands, separate 3 (modality) × 2 rmANOVAs were conducted for modality × valence and modality × arousal for each electrode.
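The log-transform, baseline subtraction, and heart-rate conversion described above can be sketched as follows; function names are ours, and the R-peak detection itself is delegated to BIOSIG in the study, so we assume peak sample indices are already available:

```python
import numpy as np

def log_baseline(block_power, baseline_power):
    """Natural logarithm followed by subtraction of the preceding
    resting period, yielding baseline-corrected log power."""
    return np.log(block_power) - np.log(baseline_power)

def mean_heart_rate(peak_samples, fs=512.0):
    """Convert R-peak sample indices to mean heart rate (beats/min)
    via the inter-beat intervals, at the 512 Hz sampling rate."""
    ibi = np.diff(peak_samples) / fs   # inter-beat intervals (s)
    return float(np.mean(60.0 / ibi))
```

Averaging `mean_heart_rate` per participant and per stimulus block reproduces the 15 block-wise HR values entering the rmANOVAs.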
For the valence analysis, the low and high valence conditions each consisted of the aggregated low/high arousal conditions. For the arousal analysis, the low and high arousal conditions each consisted of the aggregated positive/negative valence conditions. For the analysis of the alpha asymmetry indices, we conducted a 3 (modality) × 2 (valence) rmANOVA. To ensure a normal distribution, we removed cases that contained outliers according to the mild outlier criterion (3×inter-quartile range). Where appropriate, Greenhouse-Geisser corrected results are reported. As effect size measure we report the partial eta-squared ηp2.

4.3 Results: Ratings, Physiology, and EEG

4.3.1 Subjective Ratings

The affective manipulations yielded the expected differences in subjective ratings (see Table 4.3 and Figure 4.3). A 3 (modality) × 5 (emotion) rmANOVA on the valence ratings showed a main effect of both emotion (F(4,92) = 246.100, p < 0.001, ηp2 = 0.915) and modality (F(2,46) = 6.057, p = 0.005, ηp2 = 0.208).

Figure 4.3: The mean valence and arousal ratings for all 5 emotion conditions for visual, auditory, and audio-visual modality of affect induction (error bars: standard error of mean).

Table 4.3: Mean (std) of the valence and arousal ratings for the stimuli of each emotion condition computed from the participants' ratings.
                 Visual (IAPS)               Auditory (IADS)             Audio-visual
Condition        Valence      Arousal        Valence      Arousal        Valence      Arousal
(1) LALV         1.99 (0.79)  4.98 (1.60)    2.84 (0.81)  4.55 (1.73)    2.28 (0.78)  5.08 (1.56)
(2) HALV         1.97 (0.83)  5.73 (1.84)    2.55 (0.77)  5.32 (1.64)    2.02 (0.90)  5.82 (1.61)
(3) LAHV         6.88 (0.70)  5.24 (1.45)    6.17 (0.71)  4.97 (1.50)    6.69 (0.81)  5.37 (1.50)
(4) HAHV         6.29 (0.93)  5.50 (1.60)    6.28 (0.71)  5.67 (1.62)    6.40 (0.69)  5.92 (1.65)
(5) Neutral      4.52 (0.64)  4.41 (1.24)    4.86 (0.60)  4.10 (1.29)    4.54 (0.60)  4.38 (1.38)

A 3 (modality) × 3 (valence) rmANOVA on the mean ratings of the negative, neutral, and positive conditions showed a main effect of valence (F(2,46) = 264.100, p < 0.001, ηp2 = 0.920). Participants rated the positive (t(23) = -19.096, p < 0.001) and neutral (t(23) = -14.920, p < 0.001) conditions higher on valence compared to the negative condition. The positive condition was rated higher than the neutral one (t(23) = 11.946, p < 0.001). Furthermore, a main effect of modality (F(2,46) = 7.078, p = 0.002, ηp2 = 0.235) reflected more positive ratings for states induced via the auditory modality compared to the visual (t(23) = 4.853, p < 0.001) and audio-visual (t(23) = 3.541, p = 0.002) modalities. An interaction effect (F(4,92) = 14.813, p < 0.001, ηp2 = 0.392) indicates less extreme ratings for auditory stimuli and, hence, a weaker efficacy.

A similar pattern was found in a 3 (modality) × 5 (emotion) rmANOVA on the arousal ratings, showing a main effect of emotion (F(4,92) = 12.588, p < 0.001, ηp2 = 0.354), and of modality (F(2,46) = 9.177, p < 0.001, ηp2 = 0.285). A 3 (modality) × 2 (arousal) rmANOVA on the mean ratings of the low and high arousing conditions showed a main effect of arousal (F(1,23) = 27.180, p < 0.001, ηp2 = 0.542). A main effect of modality (F(2,46) = 9.344, p < 0.001, ηp2 = 0.289) indicated that auditorily induced affect was rated less arousing than visually (t(23) = 2.495, p = 0.02) and audio-visually induced affect (t(23) = 4.032, p = 0.001).
No interaction effect was found between modality and arousal.

4.3.2 Heart Rate

The data of 3 participants was excluded from analysis, as ECG artifacts resulted in difficulties for adequate QRS-peak detection. The analysis of HR change relative to the resting baseline indicated differences resulting from the valence and arousal manipulations. A 3 (modality) × 5 (emotion) rmANOVA showed main effects of modality (F(2,40) = 6.063, p = 0.005, ηp2 = 0.233) and emotion (F(4,80) = 4.878, p = 0.001, ηp2 = 0.196).

Table 4.4: Main effects of the modality and valence manipulation for the heart rate. No interaction effect was found. Mean and standard deviation for the respective conditions and the statistical indicators F-value, p-value, and ηp2 are given.

            Negative        Neutral         Positive        ANOVA
            x̄      s        x̄     s         x̄      s        F      p-value  ηp2
Valence     -0.87  3.92     0.31  3.59      -0.29  3.66     5.641  0.007    0.220

            Visual          Auditory        Audio-visual    ANOVA
            x̄      s        x̄     s         x̄      s        F      p-value  ηp2
Modality    -0.87  4.03     0.96  3.21      -0.92  4.60     5.829  0.006    0.228

A 3 (modality) × 3 (valence) rmANOVA (see Table 4.4) on the low arousing negative, neutral, and positive conditions showed a main effect of modality (F(2,40) = 5.829, p = 0.006, ηp2 = 0.228), with significantly higher HR for the auditory conditions compared to the visual (t(20) = -3.359, p = 0.003) and the audio-visual (t(20) = -2.906, p = 0.009) conditions. The analysis also showed a main effect of valence (F(2,40) = 5.641, p = 0.007, ηp2 = 0.220): as expected, HR decreased more strongly for negative compared to neutral stimulation (t(20) = -3.128, p = 0.06). A trend for a stronger HR decrease for negative compared to positive stimulation was also present (t(20) = -1.629, p = 0.06). No interaction of modality and valence was found.
For arousal, a 3 (modality) × 2 (arousal) rmANOVA on the averaged low versus high arousing emotion conditions showed a main effect of modality (F(2,40) = 7.029, p = 0.002, ηp2 = 0.260), with higher HR for the auditory compared to the visual (t(20) = -3.134, p = 0.005) and audio-visual conditions (t(20) = -3.133, p = 0.005); and a main effect of arousal (F(1,20) = 5.360, p = 0.031, ηp2 = 0.211), with a HR decrease for higher arousing stimuli (see Table 4.5). No interaction of modality and arousal was found.

Table 4.5: Main effects of modality and arousal manipulation for the heart rate. No interaction effect was found. Mean and standard deviation for the respective conditions and the statistical indicators F-value, p-value, and ηp2 are given.

            Low Arousal     High Arousal                    ANOVA
            x̄      s        x̄      s                        F      p-value  ηp2
Arousal     -0.36  3.79     -0.79  3.70                     5.360  0.031    0.211

            Visual          Auditory        Audio-visual    ANOVA
            x̄      s        x̄      s        x̄      s        F      p-value  ηp2
Modality    -0.92  4.10     0.85   3.27     -1.35   4.51    7.029  0.002    0.260

4.3.3 Modality-specific Correlates of Affect

Figure 4.4: The parietal and fronto-central alpha power (error bars: standard error of mean) during visual, auditory, and audio-visual affect (emotion – neutral), averaged over all subjects that entered the respective analyses (A); and the 2nd level contrasts of visual – auditory affect (B), visual – audio-visual affect (C), and audio-visual – auditory affect (D), showing modality-specific affective responses averaged over all subjects.

For each ROI, the data of 4 cases was excluded according to the outlier criterion (1.5×inter-quartile range) to guarantee normality, which was potentially violated by residual artifacts. To test the effect of modality-specific affective stimulation, we contrasted, for all 3 modality conditions, the neutral condition against the mean emotional response with a 3 (modality) × 2 (emotion) rmANOVA. We did so separately for the alpha power over the "visual" parietal ROI pa and the "auditory" fronto-central ROI fc (see Figure 4.4).
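The second-level contrasts and their pairwise one-sided comparison, as described in the analysis section, can be sketched as follows (a minimal sketch with our own names, assuming SciPy >= 1.6 for the `alternative` argument of `ttest_rel`; the study itself used Matlab):

```python
import numpy as np
from scipy.stats import ttest_rel

def affect_contrast(emotion_power, neutral_power):
    """2nd-level contrast per subject: mean emotional minus neutral
    alpha power within one modality (arrays of shape (n_subjects,))."""
    return np.asarray(emotion_power) - np.asarray(neutral_power)

def one_sided_paired_t(contrast_a, contrast_b):
    """One-sided paired t-test of two modality-specific contrasts,
    testing whether contrast_a is smaller than contrast_b."""
    return ttest_rel(contrast_a, contrast_b, alternative="less")
```

For example, the visual-versus-auditory comparison would pass the per-subject visual and auditory contrasts of ROI alpha power to `one_sided_paired_t`.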
In accordance with our prediction for pa, we found an interaction effect between modality and emotion (F(2,38) = 3.373, p = 0.045, ηp2 = 0.151). To specify the nature of the interaction, we computed the second-level emotion contrasts (emotion – neutral) for the visual, auditory, and audio-visual conditions. Pairwise one-sided t-tests confirmed the expected difference between audio-visual versus auditory affect (t = -2.651, p = 0.008, Figure 4.4D), and for visual versus auditory affect (t = -2.093, p = 0.026, Figure 4.4B). Alpha power at pa decreases for visual and audio-visual affect, but increases for auditory affect. No main effect of emotion was observed.

For fc, we found an interaction between modality and emotion (F(2,38) = 4.891, p = 0.013, ηp2 = 0.205). As for pa, we computed the second-level emotion contrasts for all conditions over fc. Pairwise one-sided t-tests showed the expected significant differences between audio-visual versus visual affect (t = -3.155, p = 0.003, Figure 4.4C), and for visual versus auditory affect (t = -1.817, p = 0.043, Figure 4.4B). Alpha power at fc decreased for auditory and audio-visual affect, but increased during visual affect induction. No main effect of emotion was observed.

Figure 4.5: The parietal and fronto-central alpha power (error bars: standard error of mean) during visual, auditory, and audio-visual stimulation (over all emotion conditions), averaged over all subjects that entered the respective analyses (A); and the 2nd level contrasts of visual – auditory (B), visual – audio-visual (C), and audio-visual – auditory stimulation (D), showing modality-specific responses averaged over all subjects.

For both regions of interest, we found a main effect of modality, independent of emotion conditions (see Figure 4.5).
Over pa, we found a main effect of modality (F(2,38) = 3.821, p = 0.043, ηp2 = 0.167), resulting from lower alpha power for visual (t = -2.059, p = 0.053) and audio-visual (t = -2.291, p = 0.034) stimulation compared to auditory stimulation. Similarly, we found a main effect of modality over fc (F(2,38) = 6.110, p = 0.005, ηp2 = 0.243), resulting from lower alpha power for visual (t = -2.978, p = 0.008) and audio-visual (t = -2.389, p = 0.027) stimulation compared to auditory stimulation. No significant differences were seen between visual and audio-visual stimulation.

4.3.4 General Correlates of Valence, Arousal, and Modality

To explore the general correlates of affect further, we conducted repeated measures ANOVAs for indices of frontal alpha (power) lateralization between valence conditions, and for the contrasts of valence and arousal conditions in the power of the delta, theta, alpha, beta, and gamma bands. Modality was included as an additional factor in all three analyses, and will be discussed separately in Section 4.3.4, as it is the same for valence and arousal.

Frontal Alpha Lateralization and Valence

To test the effect of valence on the frontal alpha asymmetry, we computed the index according to Allen et al. [2004] for the positive and negative conditions and for all modalities. We tested these in a 3 (modality) × 2 (valence) rmANOVA. We expected a main effect of valence, specifically a smaller index for negative emotions compared to positive emotions, indicating a higher activation of left frontal regions for positive emotions and a higher activation of right frontal regions for negative emotions. However, while the expected pattern of alpha asymmetry was observed, it did not reach significance (see Table 4.6). There were also no effects of modality or an interaction.

Table 4.6: Main effects of alpha asymmetry for the frontal electrode pairs E in the alpha frequency band.
Mean and standard deviation of the respective frequency band power for the negative and positive valence conditions, and the statistical indicators F-value, p-value, and ηp2.

              Negative         Positive         ANOVA
  E           x̄       s       x̄       s       F       p-value  ηp2
  AF3 – AF4   0.015   0.095   0.026   0.081   0.649   n.s.     -
  F7 – F8     0.030   0.137   0.046   0.144   1.136   n.s.     -
  F3 – F4     0.004   0.105   0.014   0.097   0.480   n.s.     -

Effects of Valence For the analysis of correlates of valence, we contrasted the (all negative versus all positive) valence conditions for visual, auditory, and audio-visual stimulation via a 3 (modality) × 2 (valence) rmANOVA. Main effects of valence were observed for the beta and gamma frequency bands (see Table 4.7). Figure 4.6: The main effect of valence manipulation, showing the contrasts of the responses to negative compared to positive affective stimulation (negative minus positive) in the beta (A) and gamma (B) frequency bands, averaged over all subjects. Electrodes that showed a significant main effect of valence in the rmANOVA are marked with circles. Nose is up. Table 4.7: Main effects of valence for the significant (p < 0.05) electrodes E in the specific frequency bands f. Mean and standard deviation of the respective frequency band power for the negative and positive valence conditions, and the statistical indicators F-value, p-value, and ηp2.

               Negative         Positive         ANOVA
  f      E     x̄       s       x̄       s       F        p-value  ηp2
  Beta   FC5   -0.059  0.094    0.036  0.118   11.742   0.002    0.338
         T7    -0.079  0.151    0.065  0.204    5.632   0.026    0.196
         T8    -0.030  0.130    0.086  0.193    4.838   0.038    0.174
  Gamma  FC1   -0.019  0.066    0.025  0.080    4.326   0.048    0.158
         FC5   -0.051  0.163    0.096  0.193    8.546   0.007    0.271
         T7    -0.090  0.198    0.102  0.315    5.183   0.032    0.184
         C3    -0.023  0.092    0.050  0.125    5.576   0.027    0.195
         T8    -0.066  0.172    0.127  0.334    5.796   0.024    0.201
         FC6   -0.087  0.165    0.069  0.309    4.550   0.043    0.165

For the beta band, main effects of valence were observed over left and right temporal regions (see left panel in Figure 4.6).
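The asymmetry index of Allen et al. [2004] used above is commonly computed as the difference of log-transformed alpha power between homologous right and left frontal electrodes. A minimal sketch with hypothetical power values:

```python
import numpy as np

def alpha_asymmetry(power_left, power_right):
    """Frontal alpha asymmetry index: ln(right) - ln(left) alpha power.

    Positive values indicate relatively more right-hemispheric alpha,
    i.e., relatively greater *left* cortical activation, since alpha
    power is inversely related to cortical activation.
    """
    return np.log(power_right) - np.log(power_left)

# Hypothetical alpha power values for the electrode pairs of Table 4.6.
pairs = {"AF3-AF4": (4.1, 4.3), "F7-F8": (3.8, 4.2), "F3-F4": (4.0, 4.0)}
for name, (left, right) in pairs.items():
    print(name, round(alpha_asymmetry(left, right), 3))
```

Equal power over both hemispheres yields an index of zero; under the valence hypothesis, positive conditions should yield a larger index than negative ones.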
Beta band power increased for positive relative to negative affective stimulation over these regions. These effects were observed bilaterally, but were stronger over left temporal regions. Similarly, we observed main effects of valence in the gamma band over left and right temporal electrodes (see right panel in Figure 4.6). As for the beta band, gamma band power increased for positive relative to negative affective stimulation. The effects on the gamma band included more electrodes than those observed in the beta band. Correlates of Arousal In an exploratory analysis, we contrasted the positive and negative high arousal versus the positive and negative low arousal conditions for visual, auditory, and audio-visual stimulation via a 3 (modality) × 2 (arousal) rmANOVA on the power of the delta, theta, alpha, beta, and gamma bands for each electrode. We found main effects of arousal in the delta and theta frequency bands (see Table 4.8). The modality effects, being the same as for the valence analysis, are presented separately below. No interaction was found between modality and arousal. For the delta band, the analysis revealed main effects of arousal over the right parietal region (see left panel of Figure 4.7). Delta power decreased for high compared to low arousal conditions. For the theta band, we observed a general decrease of power over central and temporal regions for high arousal affective stimulation compared to low arousal conditions. We found significant main effects of arousal over left temporal and central electrodes (see right panel of Figure 4.7). Table 4.8: Main effects of arousal for the significant (p < 0.05) electrodes E in the specific frequency bands f.
Mean and standard deviation of the respective frequency band power for the low and high arousal conditions, and the statistical indicators F-value, p-value, and ηp2.

               Low Arousal      High Arousal     ANOVA
  f      E     x̄       s       x̄       s       F        p-value  ηp2
  Delta  P8     0.126  0.070    0.061  0.088   10.028   0.004    0.304
         CP6    0.055  0.083    0.005  0.057    7.939   0.009    0.257
  Theta  FC5    0.016  0.094   -0.039  0.125    5.688   0.025    0.198
         C3     0.062  0.126   -0.009  0.152    9.588   0.005    0.294
         CP2    0.034  0.127   -0.014  0.141    5.516   0.027    0.193

Figure 4.7: The general effect of arousal manipulation, showing the contrasts of the responses to high compared to low arousing stimulation (high minus low) in the delta (A) and theta (B) frequency bands, averaged over all subjects. Electrodes that showed a significant main effect of arousal in the rmANOVA are marked with circles. Nose is up. Effects of Modality For both the valence and the arousal rmANOVA, we found main effects of modality for the delta, alpha, and beta frequency bands (for tables refer to Section B in the appendix). We report here the effects of the modality × arousal analysis, as they are virtually identical to those observed for the modality × valence analysis. For the delta band, the exploratory analysis yielded main effects of modality for occipital and right parietal electrodes (see Figure 4.8 and Table B.1 in the appendix). Delta power was higher for visual and audio-visual compared to auditory stimulation (see Table B.2 in the appendix). As for the rmANOVA on the parietal and fronto-central regions of interest, we found pronounced main effects of modality in the alpha band (Figure 4.9 and Table B.3 in the appendix). Alpha power decreased over the whole scalp in response to visual and audio-visual stimulation compared to purely auditory stimulation (see Table B.4 in the appendix).
Figure 4.8: The modality contrasts of visual – auditory (A), visual – audio-visual (B), and audio-visual – auditory stimulation (C), showing the effects of sensory stimulation (e.g., visual minus auditory) on the delta band, averaged over all subjects. Electrodes that showed a significant main effect of modality in the rmANOVA and subsequent post-hoc t-tests between the different modality pairs are marked with circles on the corresponding contrasts. Nose is up. Figure 4.9: The modality contrasts of visual – auditory (A), visual – audio-visual (B), and audio-visual – auditory stimulation (C), showing the effects of sensory stimulation (e.g., visual minus auditory) on the alpha band, averaged over all subjects. Electrodes that showed a significant main effect of modality in the rmANOVA and subsequent post-hoc t-tests between the different modality pairs are marked with circles on the corresponding contrasts. Nose is up. For the beta band, main effects of modality similar to those in the alpha band were observed (Figure 4.10 and Table B.5 in the appendix). Beta band power decreased for visual and audio-visual compared to purely auditory stimulation over central and parietal regions (see Table B.6 in the appendix). In contrast to the alpha band, the effects in the beta band were smaller in effect size and did not include frontal, temporal, and occipital electrodes. Figure 4.10: The modality contrasts of visual – auditory (A), visual – audio-visual (B), and audio-visual – auditory stimulation (C), showing the effects of sensory stimulation (e.g., visual minus auditory) on the beta band, averaged over all subjects. Electrodes that showed a significant main effect of modality in the rmANOVA and subsequent post-hoc t-tests between the different modality pairs are marked with circles on the corresponding contrasts. Nose is up.
Summary of the Exploratory Analysis Despite a large body of literature showing different lateralization patterns of frontal alpha power for positive versus negative valence, our analysis did not show effects of frontal alpha lateralization. For valence, the comparison of frequency band power showed consistent decreases of power over temporal regions in the beta and gamma bands for negative compared to positive affective stimulation. These can be related to other studies of valence correlates in the frequency domain. However, power in high frequency bands over temporal regions is also prone to contamination by EMG activity. For arousal, high arousal compared to low arousal stimulation led to a general decrease of right parietal delta power and of central theta power. The main effects of modality were the same for the valence and arousal rmANOVAs, as the two analyses did not differ with regard to the data entering the modality contrasts. We observed increases of occipito-parietal delta band power, decreases of alpha band power over the whole scalp, and decreases of (lower) beta band power over parietal and fronto-central regions for visual and audio-visual compared to auditory stimulation. No interactions with modality were found for the valence or arousal contrasts. 4.4 Discussion: Context-specific and General Correlates In the present study, we explored neurophysiological responses to visual, auditory, and combined affective stimulation. We predicted that parietal and fronto-central alpha power would respond differently during visually and auditorily induced emotion. Furthermore, we conducted an exploratory analysis of the main effects of valence and arousal on the broad frequency bands of delta, theta, alpha, beta, and gamma.
Below we will discuss the evidence for the validity of the induction protocol, the observed modality-specific and general affective responses, their relevance for affective BCI, and the limitations of the present study. 4.4.1 Validity of the Affect Induction Protocol The effects of valence and arousal on ratings and heart rate suggest that, in general, the affect induction was successful. The affective stimulation had strong effects on the subjective experience of the participants. We found effects of the valence manipulation on valence ratings, and of the arousal manipulation on arousal ratings. The successful manipulation of affect was further confirmed by the participants' heart rates. The contrasts of low versus high arousing and positive versus negative valence conditions showed the expected decreases of heart rate for arousing and negative stimuli, replicating results of other studies using IADS and IAPS stimuli [Bradley and Lang, 2000; Codispoti et al., 2006]. The deceleration of the heart rate is believed to index the activity of the parasympathetic nervous system in response to arousing and unpleasant stimuli that lead to an increased orienting response [Bradley, 2009]. Despite our efforts to keep the valence and arousal ratings of the conditions comparable over modalities, the ratings showed a slightly weaker efficacy of the auditory affective stimuli. This pattern was mirrored in the heart rate changes in response to stimulation, which were smallest for auditory, larger for visual, and strongest for audio-visual stimulation. Summarizing, we found evidence for the successful manipulation of valence and arousal in the form of subjective and objective indicators of affect. 4.4.2 Modality-specific Affective Responses The observed responses of posterior (paα) and anterior (fcα) alpha power (see Figure 4.4) match our expectations as formulated in the introduction.
We found a lower paα for visual compared to auditory affective stimulation, especially when visual stimuli were paired with auditory stimuli. Complementary to paα, fcα was lower for auditory compared to visual affective stimulation, especially when auditory stimuli were paired with visual stimuli. With regard to the literature on sensory processing [Foxe and Snyder, 2011; Jensen and Mazaheri, 2010; Pfurtscheller and Lopes da Silva, 1999], the effects might result from a decrease of alpha band power during "appropriate" stimulation (activation), and an increase during "inappropriate" stimulation (inhibition). The "appropriateness" is defined by the modality-specificity of the underlying cortices - the posterior visual cortices for ROI pa [Pfurtscheller and Lopes da Silva, 1999], and the temporal auditory cortices for ROI fc [Hari et al., 1997]. It should be noted that the alpha increase to "inappropriate" stimulation seems more marked than the decrease during "appropriate" stimulation, stressing the inhibitory nature of the potential sensory gating process [Foxe and Snyder, 2011; Jensen and Mazaheri, 2010]. The relatively strong alpha decrease during audio-visual affective stimulation might result from extra activation due to more intense processing of bimodal stimuli. Alternatively, it might be an additive effect resulting from an overlap of paα and fcα due to volume conduction. Please note that these modality-specific effects were obtained after subtracting neutral from emotion conditions for each modality, removing responses that are common to any visual or auditory stimulation and leaving the correlates of the affective response. Without such a neutral "baseline", the purely perceptual effects of modality-specific stimulation dominate the affective responses and hence might conceal any affect-related differences.
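The baseline logic just described - band power per trial, then an emotion-minus-neutral contrast within each modality - can be sketched as follows. The sampling rate, band edges, and signals are hypothetical, and SciPy's Welch PSD merely stands in for the spectral estimation used in the study:

```python
import numpy as np
from scipy.signal import welch

FS = 256                 # hypothetical sampling rate (Hz)
ALPHA_BAND = (8.0, 12.0)  # hypothetical alpha band edges (Hz)

def alpha_power(eeg, fs=FS, band=ALPHA_BAND):
    """Mean alpha-band power of a 1-D EEG trace via Welch's PSD."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].mean()

rng = np.random.default_rng(1)
t = np.arange(FS * 4) / FS  # 4 s of data

# Hypothetical single trials: the "neutral" trial carries a stronger 10 Hz
# component, mimicking higher posterior alpha without affective engagement.
neutral = 2.0 * np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1, t.size)
emotion = 0.5 * np.sin(2 * np.pi * 10 * t) + rng.normal(0, 1, t.size)

# Second-level contrast: emotion minus neutral removes what is common to
# any stimulation in this modality, leaving the affect-related alpha change.
contrast = alpha_power(emotion) - alpha_power(neutral)
print(f"alpha contrast (emotion - neutral): {contrast:.3f}")
```

Without the subtraction, the strong perceptual alpha response to the modality itself would dominate the measure.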
This becomes evident when looking at the main effect of modality, which showed a general decrease of alpha with visual and audio-visual stimulation, and an increase with auditory stimulation. Surprisingly, no significant differences (effects of modality) between visual and audio-visual stimulation were observed, although the additional auditory processing should lead to a desynchronization over auditory regions relative to purely visual stimulation. This lack of effects suggests that the alpha power over pa and fc responds strongly to visual stimulation, whereas the response to auditory stimulation is weak and not always visible. Difficulties in observing auditory alpha with EEG have been reported before [Pfurtscheller and Lopes da Silva, 1999; Weisz et al., 2011]. These results suggest that affective stimulation via different modalities has different, even opposing effects which can be observed in the EEG. These adhere to theories of modality-specific activation and inhibition of the respective, functionally specialized brain regions also involved in non-affective cognitive processes. Therefore, it is conceivable that these correlates are stimulus-specific cognitive consequences of affective stimulation, comparable to the enhanced processing during attentional selection [Vuilleumier, 2005], rather than primary correlates of affect. This is also in line with appraisal theories of emotion (e.g., the Component Process Theory [Sander et al., 2005]). These theories postulate the involvement of core affective processes, such as affective self-monitoring (feeling), and of cognitive processes related to stimulus evaluation and behavior planning. In Section 4.4.4 we will discuss the consequences of the existence of modality-specific correlates of affective responses in the EEG for affective BCIs. After the presentation of the modality-specific effects of the affective stimulation, we will now recapitulate the general effects of the valence and arousal manipulations on the different frequency bands.
4.4.3 General Correlates of Valence, Arousal, and Modality The exploratory analysis of differences between valence and arousal correlates, yielded by repeated-measures ANOVA with the factors of emotion (the different valence and arousal levels, respectively) and modality, indicated that information about the affective state can also be found in several frequency bands other than alpha. Due to the multi-comparison approach adopted for the exploratory analysis, the reliability of the observed effects is limited.² For the discussion, we take the findings at face value, but note that the effects need validation in further studies. The observed main effects of valence and arousal support the suitability of these frequency ranges as indicators for the discrimination of positive and negative valence and of low and high arousal, respectively. Below, we will discuss the correlates with regard to the possible underlying mechanisms and functions. Furthermore, the main effects of modality found for some frequency bands are discussed, as they indicate a high variability within these bands that might be problematic for their use in affective BCIs.

² Uni- and multivariate statistics are possible feature selection strategies applied in the context of affect detection from physiological signals [Kim and André, 2008; Chanel et al., 2011], and in that sense it is interesting to see which frequency-electrode pairs would be selected as possible features and, consequently, which potential processes would be involved in the detection of affect. The exploratory analysis is necessary because, in contrast to the clear hypotheses presented for the modality-specific effects of affective stimulation, the wealth of different findings for the general effects of affect induction in the frequency domain precludes concise expectations in terms of specific frequency bands and locations.
Interestingly, no interaction effects between modality and the respective affective dimensions were observed. This suggests that modality-specific aspects of affective responses occur mainly during affective compared to neutral stimulation, potentially reflecting the increased evaluation of emotionally salient stimuli. For the contrasts of different emotion conditions, that is, between negative and positive valence or between low and high arousal, such modality-specific differences might be rather weak or absent, as they occur in all conditions. Frontal Alpha Asymmetry and Valence An effect of valence on the asymmetry of frontal alpha power, often reported to differ between positive and negative affect or approach versus withdrawal motivation, was not observed in the current study. However, recent investigations have questioned the association of this indicator with the valence dimension, and strengthened its association with motivational processes [Harmon-Jones et al., 2010]. While the dimensions of valence and motivation were long supposed to be congruent for most emotions, differing only for (negative, but approach-related) anger, Harmon-Jones et al. [2008] found that differences in the motivational strength of positive emotions were reflected in the strength of the frontal alpha asymmetry. The imagination of positive emotions related to motivating goals led to a stronger asymmetry compared to positive imaginations without an attached goal. Accordingly, the emotional stimulation of the present study might not have induced an alpha asymmetry, as the motivational component was missing. High Frequency Oscillations and Valence The exploratory analysis revealed main effects of valence in the high frequency bands: left and right temporal beta and gamma power increased for positive relative to negative affect.
Increases of high frequency power were related to increasing valence in several studies [Cole and Ray, 1985; Müller et al., 1999; Keil et al., 2001; Aftanas et al., 2004; Onton and Makeig, 2009], including our own (see Chapter 3). As laid out in Section 3.4.3, these effects might be directly associated with limbic structures (e.g., the amygdala) or, indirectly, reflect their downstream consequences on temporal cortices. In this regard, the modality-independent nature of the beta and gamma band responses observed here suggests the involvement of processes that are general to auditory and visual emotion induction procedures. However, also in the present study, the attribution of the effects to cortical sources is limited due to the potential contamination of high frequency power by electromyographic activity from facial expressions, such as smiling in response to positive stimuli [Goncharova et al., 2003]. Though we instructed our participants to limit their movement, especially of head and face, and removed excessive EMG artifacts by visual inspection of the signals, some EMG activity due to involuntary facial expressions or microexpressions might have remained in the EEG traces.

² (continued) The testing of every electrode within a given frequency band for effects of affective stimulation leads to an increased probability of false rejections of the null hypothesis. Consequently, a correction for the multiple tests conducted within the frequency bands would lead to a very conservative significance threshold [Groppe et al., 2011], which would result in no rejections of the null hypotheses. To limit the number of false positives, we only discuss frequency bands that show effects with a significance of p < 0.01.
Without a careful separation of EMG and EEG components in the signal, for example by independent component analysis [Onton and Makeig, 2009], the possibility of non-EEG correlates of affect in the EEG spectrum cannot be excluded. Low Frequency Oscillations and Arousal High and low arousal were differentiated by oscillations in the delta and theta bands. Delta band power decreased over the right parietal regions in response to high arousing compared to low arousing stimulation. Delta oscillations have been associated with the workings of the motivational system, as some of their subcortical generators are part of the brain reward system [Knyazev, 2007]. Delta power responses have also been observed during the P300 [Schürmann et al., 2001; Başar-Eroglu et al., 2001; Polich, 2007], which is a modality-independent response [Comerchero et al., 1999; Polich, 2007] affected by emotional stimulation, suggesting a function of delta in salience tagging. Other studies, using visual affect induction, have reported an increase of delta power in response to higher versus lower arousing stimuli [Aftanas et al., 2002; Balconi and Lucchiari, 2006; Klados et al., 2009]. Consequently, its decrease with arousal, observed here, might be an indicator of the higher emotional salience of the low arousing stimuli. While this seems counterintuitive at first, it is possible that sad hospital or peaceful nature scenes, and their auditory equivalents, have a stronger emotional impact than exciting sport scenes or cruel violent scenes. The former, low arousing stimuli might seem more "real", or more relevant, to the perceiver than the latter scenes, thus leading to a greater involvement. Regarding the rather subtle effect of the arousal manipulation on subjective and physiological variables, effects of emotional involvement could have dominated the arousal manipulation.
For an evaluation of the emotional impact of the different arousal levels, an additional assessment of the emotional involvement of the participants, as done by Baumgartner et al. [2006a], could be performed. Theta band power decreased over central regions in response to high arousing compared to low arousing stimulation. Several studies have reported effects of emotional arousal on the theta band. Aftanas et al. [2002] found left anterior and bilateral posterior increases of theta band activity in response to high compared to low arousing pictures. Balconi and Lucchiari [2006] observed increases of frontal theta band power in response to pictures showing emotional versus neutral faces. As theta band power has been associated with the workings of limbic (e.g., amygdala, hippocampus) and associated structures (e.g., the anterior cingulate gyrus [Bush et al., 2000]), the theta effect suggests a difference in the emotional percept and response (see Section 3.4.3). Similar to the observed effects in the delta band, the theta increase for low versus high arousal indicates a stronger response of limbic and paralimbic structures to low arousing stimuli. Alpha Band Oscillations in Cognition and Affect Despite studies reporting alpha band responses to valence manipulations [Aftanas et al., 2004; Baumgartner et al., 2006a; Güntekin et al., 2007], no main effects of valence were observed. Similarly, no main effect of arousal was observed, though the alpha band is often viewed as an inverse measure of cortical activity [Laufs et al., 2003], which is increased by arousing stimulation [LeDoux, 1996] and reflected in a heightened sensory processing of attended salient stimuli [Pfurtscheller and Lopes da Silva, 1999; Jensen and Mazaheri, 2010; Weisz et al., 2011]. The alpha power effects discussed above in response to general emotional salience (versus neutral stimulation) corroborate this role of alpha during affective stimulation.
The lack of comparable alpha effects for high versus low arousing stimulation might be due to the relatively weak arousal manipulation possible with the IAPS and IADS stimulus subsets used. Furthermore, differences in emotional arousal during stimulation might not per se affect the intensity of the sensory processing: low-arousing sad stimuli can carry the same or even higher relevance for the viewer or listener as high-arousing threatening stimuli (see the previous section for an elaboration of that idea). This might especially be the case when the emotional stimulation is carried out in a passive setting, where the stimuli have no immediate significance for the well-being of the participants. The lack of general alpha band effects for the valence and arousal manipulations further supports the strong relation of the alpha band to the cognitive correlates of affective stimulation. Other studies that observed (posterior) alpha band effects in response to valence manipulations [Baumgartner et al., 2006a; Güntekin et al., 2007] did not control for the effects of arousal. It is possible that higher levels of arousal during negative stimulation might have led to a comparison of emotional states equivalent to the analysis of neutral versus emotional stimulation presented here, hence showing sensory responses to emotionally salient stimuli. Correlates of Modality in the Frequency Domain The exploratory analysis yielded several strong main effects of modality in diverse frequency bands. Besides the already observed influence of modality on power in the alpha band, discussed in Section 4.4.2, we also found marked influences of the modality of stimulation on delta and beta band power. Delta power was lower for auditory compared to visual and audio-visual stimulation.
A higher delta power over occipital and parietal regions could be caused by visual potentials (e.g., early potentials or the LPP), which are observed in response to visual stimulation [Pfurtscheller and Lopes da Silva, 1999; Schürmann et al., 2001]. Such posterior potentials would be absent or attenuated in response to purely auditory stimulation. In response to affective visual stimulation, these potentials are often even increased in amplitude compared to non-affective stimulation [Olofsson et al., 2008]. Beta band power was higher for auditory compared to visual and audio-visual stimulation. The modality effect in the beta band over posterior and central electrodes resembled the alpha band responses in location and direction (an increase for auditory stimulation). The functional significance of beta oscillations, however, is to date largely unknown. One attempt at an overarching theory of beta [Engel and Fries, 2010] suggests that beta oscillations signal the status quo. Such signaling of non-change is difficult to align with the manipulation of stimulation modality. Alternatively, as suggested by Mazaheri and Picton [2005], beta band effects occurring in the presence of strong alpha band effects might be harmonics of this prominent (visual) alpha. In our study, the comparable topography of alpha and the weaker beta band response are compatible with such harmonic oscillations. Summarizing, while the general correlates of valence and arousal discussed above encourage the consideration of frequency band features as neurophysiological indicators of affect, they also hint at the intricacies of such approaches. They suggest a complex and distributed multi-process nature of affective responses. In the following section we will turn to the potential consequences of modality-specific and general correlates of affective processes.
4.4.4 Relevance to Affective Brain-Computer Interfaces We have shown that, while several frequency bands respond in a modality-unspecific manner to valence and arousal manipulation, the alpha band responds differently to the affective manipulation depending on its sensory origin. In line with theories of sensory processing within modality-specific regions of the brain, we believe these modality-specific alpha band responses to be an indicator of cognitive processes (i.e., attention to emotionally salient stimuli) during the affective stimulation. This complex nature of neurophysiological responses to emotional stimulation has consequences for those trying to detect affective user states from neurophysiological activity. Threats for Generalizability and Specificity of Affect Sensing The existence of modality-specific, or more generally context-specific, correlates of affective responses can have detrimental effects on the reliable detection of affective states. Firstly, classifiers based on stimulus-specific neurophysiological responses will be limited in their capability to generalize to affect evoked by other stimuli. Secondly, if these responses are of a general cognitive nature, also occurring in non-affective contexts, such classifiers are prone to confuse purely cognitive and affective events. The nature of the neurophysiological correlates (i.e., affective or cognitive) enabling the successful detection of affective states should be taken into account when considering the application of such classifiers in real-world scenarios. Otherwise, the use of context-dependent or general cognitive correlates of affect in aBCIs might lead to lacking generalization or specificity of the classifier, respectively. Figure 4.11 schematically shows between which situations generalization and specificity are compromised.
Figure 4.11: Context-specific correlates of affective responses can lead to problems for the generalization and specificity of affective classifiers, especially if the more general correlates of affective responses are not learned by the classifier. The arrows show which components of the auditory affective response (center bar) generalize to the visual affective response (upper bar) and to the auditory (and purely cognitive) attentional response (lower bar). Generalization is compromised when the classifier learns the context-specific affective responses (e.g., fronto-central correlates of auditory affective responses), and hence is not able to recognize affective responses that occur in a different context (e.g., visually induced affect). Specificity is compromised when the classifier detects modality-specific correlates of auditory attentional orientation, and confuses such a cognitive state with an indication of an affective response. To deal with such context-specificity, classifiers could either be trained with protocols that take the specific context of the application into account, resulting in a restriction of the classifier to that context. Alternatively, the applicability of complex induction protocols, such as the one used here, for the training of "context-independent" classifiers could be explored. The use of separate classifiers for different sensory contexts might be a way to deal with the context-specificity of affective correlates in aBCI. This approach was used by Frantzidis et al. [2010] for the context of user gender, for which differences in the neurophysiological correlates of affect were also found. However, as opposed to user gender, the sensory context might not be known beforehand, or it might change during an application. When information about the modality of affect induction is lacking, it has to be inferred from brain activity itself.
Then adequate context-specific affect classifiers, trained specifically on data from visually, auditorily, or audio-visually induced affective states, can be put to use. For example, Kim and André [2008] used a multi-level dichotomous classification scheme to infer the affective state (a quadrant on the valence-arousal plane) during music listening from physiological sensors. First classifying the (low or high) arousal state, and then positive or negative valence within each arousal class, helped to increase the classification performance. Similarly, first classifying the induction modality and then the affective state might help overcome the outlined threats to classifier reliability, at least if a robust classifiability of the modality can be assumed. Alternatively, one might restrict the feature selection to general, context-independent features. The invariant common spatial patterns algorithm [Blankertz et al., 2008] enables the consideration of prior neurophysiological knowledge (in this case the increase of the parietal alpha rhythm with fatigue) in the feature selection process. Similarly, parietal and fronto-central alpha power could be excluded when classifying affective states in real-world contexts. However, there is a trade-off between discarding potentially useful and potentially detrimental information.

A Chance for Origin-aware Affect Sensing

The observed modality-specific responses might be of interest in their own right: assuming that a part of the response to affective manipulation can be attributed to the way (i.e., the type of stimulation) emotions are evoked, it might be possible to distinguish the origin of the affective response via neurophysiological activity. Complex induction protocols, such as the one devised here, might be used to extract correlates that indicate specific contextual factors and to train "context-aware" classification schemes.
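A two-stage scheme of this kind (first a coarse dichotomous decision, then affect classification within the resulting branch) can be sketched as follows. The 2-D features, synthetic data, and nearest-centroid classifier are purely illustrative assumptions, not the actual method or data of Kim and André [2008]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D features: dimension 0 separates low/high arousal,
# dimension 1 separates negative/positive valence (hypothetical data).
def make_samples(arousal, valence, n=50):
    return rng.normal([2.0 * arousal, 2.0 * valence], 0.3, size=(n, 2))

X = np.vstack([make_samples(a, v) for a in (0, 1) for v in (0, 1)])
y_arousal = np.repeat([0, 1], 100)             # 0 = low, 1 = high
y_valence = np.tile(np.repeat([0, 1], 50), 2)  # 0 = negative, 1 = positive

class NearestCentroid:
    """Minimal classifier: assign the label of the closest class mean."""
    def fit(self, X, y):
        self.labels = np.unique(y)
        self.means = np.stack([X[y == c].mean(axis=0) for c in self.labels])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.means[None, :, :], axis=2)
        return self.labels[d.argmin(axis=1)]

# Level 1: arousal on all trials; level 2: one valence classifier per arousal class.
arousal_clf = NearestCentroid().fit(X, y_arousal)
valence_clfs = {a: NearestCentroid().fit(X[y_arousal == a], y_valence[y_arousal == a])
                for a in (0, 1)}

def predict_quadrant(x):
    """Return the (arousal, valence) quadrant of a single feature vector."""
    a = arousal_clf.predict(x[None, :])[0]
    return a, valence_clfs[a].predict(x[None, :])[0]
```

Whether such a hierarchy actually improves over a flat four-class classifier depends on how reliably the first-level decision can be made, mirroring the caveat above about the classifiability of the induction modality.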
Another interesting consequence of the presumed cognitive nature of the responses is the possibility to discriminate the attended modality in a general context, not restricted to affect. However, it has to be noted that the current study made use of an elaborate preprocessing procedure, including manual steps during artifact rejection, subtraction of pre-trial baselines to reduce variability, and the computation of relevant contrasts from different conditions. These procedures might be difficult to reproduce in an online system. The computation of the effects of affective stimulation from the neutral and the emotional contrasts of a given modality seems necessary to arrive at the modality-specific effects of affect that are the basis for the recognition of the source of the affective response. As has been shown, the physical stimulation (without any consideration of the affective connotation) has a strong effect on the alpha band. While visual stimulation leads to a general decrease of alpha power, auditory stimulation leads to a marked increase of alpha power. Any approach to identify the attended modality might need to remove the additional variance resulting from such differences in physical, especially visual, stimulation.

Diverse Mechanisms of Affect at Work

The general correlates of valence and arousal were observed in the beta and gamma, and the delta and theta frequency bands, respectively. Due to the exploratory nature of the analysis, possibly rejecting a number of null hypotheses falsely, not all effects may be taken at face value.
However, the comparison of the frequency domain effects with the neuroscientific literature on affect and cognition suggested the involvement of several processes in the affective responses we measured: attention-like fostering of (modality-specific) sensory processes (parietal and fronto-central alpha), relevance detection processes (delta and theta), response integration and coordination (fronto-medial theta), and affect-related processes in limbic or associated regions (temporal gamma). Such a multi-stage affective response, involving the prerequisites of the emotional response and the actual responses, where each stage has its respective neural correlates (i.e., scalp locations and frequency responses), is also suggested by Scherer's Component Process Model [Scherer, 2005] and other emotion theories [LeDoux, 1996; Barrett et al., 2007]. The consequence of the distributed nature of the correlates of affect is that it is difficult to focus on one specific frequency band for the detection of the affective state. Rather, multiple broad frequency bands respond to the manipulation of the affective state and might hence be used for its recognition. However, as the specific functions of the involved oscillatory responses are not fully understood, and their involvement often depends on contextual factors that are not known, it seems currently impossible to give guidelines for the selection of specific frequency bands for the detection of valence or arousal, respectively. Furthermore, the frequency bands involved have also been reported to respond during cognitive processes (see Section 1.5.2), and thus the existence of purely affective correlates seems questionable. Consequently, any classifier of affect based on such frequency domain features might be susceptible to the misclassification of non-affective, purely cognitive states.
To avoid this, further potential features, for example inter-regional synchronizations, have to be explored, and the underlying mechanisms of affective responses have to be further illuminated.

4.4.5 Limitations and Future Directions

The presented evidence for modality-specific effects of visual, auditory, and audio-visual affect induction has been observed in response to isolated and rather artificially combined auditory and visual stimuli. This poses potential limitations for the ecological validity of the results, which we will discuss here. These limitations have consequences for the design of future studies aiming at the validation and further inquiry of the presented findings.

Limited Ecological Validity

One limitation lies in the nature of the stimuli used here, that is, the use of static images and dynamic sounds. While pictures offer all their (emotional) content at once, leaving their detailed exploration to the viewer, sounds extend over a certain time, and identification of the (emotional) content might require the accrual of information over this time. Although physiological responses to affect seem overall unaffected by such differences in the temporal stimulus dynamics [Bradley and Lang, 2000], this is not necessarily the case for neurophysiological indicators. Furthermore, the combination of static images and dynamic sounds is not a natural stimulation method in the sense that in real-world environments visual and auditory information are normally both of a dynamic nature. Therefore, future studies should investigate modality-specific effects with dynamic auditory and dynamic visual stimuli. This was done by Rutkowski et al. [2011] using a database of emotional utterances and faces, created to test children for symptoms of autism, but without directly comparing the two modality conditions. Alternatively, music videos might be separated into their auditory and visual counterparts.
If an equal contribution of music and video to the affective reaction can be shown via ratings and physiological responses, they might be used in the same way as pictures and sounds were used in the current study. An additional advantage would be the higher ecological validity of music videos, though emotions evoked by music might be somewhat different from utilitarian emotions evoked by pictures of mutilations, attacks, or food.

Confounds due to Affective (In-)Congruency

Another limitation is the creation of artificial multimodal combinations of IADS and IAPS stimuli. The restricted choice of stimuli led to partial mismatches of the content, which might decrease protocol efficacy and induce additional processes and correlates associated with content mismatches [Koelsch et al., 2004; Orgs et al., 2007] or multi-sensory integration [Brefczynski-Lewis et al., 2009; Schneider et al., 2008]. Regarding the processes of multi-sensory integration, mostly enhanced behavioral and physiological responses to congruently combined stimuli compared to the unimodal stimuli have been observed. Similar effects were shown in response to congruently combined affective stimuli (see Section 3.1.2 for a review of cross-modal and multimodal affect induction). Conveniently, the results show that such effects are not likely to be responsible for the effects observed in the present study, as for both modalities similar unimodal and multimodal effects on the EEG signals were found. In particular, the direct contrasts between visual and auditory affect support our interpretation in terms of modality-specificity. There might nevertheless be effects of multi-sensory integration, especially when considering the differences between the unimodal appropriate and the respective multimodal affective responses (see Figure 4.4).
For example, the decrease of parietal alpha power during visually induced affect is smaller than during multimodally induced affect. However, since we did not see a significant difference between appropriate unimodal and multimodal affect at paα or fcα, these effects of multisensory integration seem rather weak. Nevertheless, to completely exclude confounds due to artificial and hence incongruent stimuli, the visual and auditory parts of a naturally occurring multimodal stimulus can be used. Faces with accompanying vocal stimulation are one possibility (e.g., [Spreckelmeyer et al., 2006; Rutkowski et al., 2011]). However, this limits the results of such studies to multimodal affective processing. Alternatively, the visual and auditory parts of music videos or film sequences that carry the same affective information might be separated to study their respective effects.

Cross-modal Spreading of Attention

A further limitation of the present study that warrants validation of the modality-specific effects for real-world applications is the potential spreading of attention from one object feature to another. Such cross-modal transfers of attention have often been observed for spatial attention: auditorily attended locations also lead to attention-related increases for the visual modality and vice versa [Driver and Spence, 1998]. There is evidence that spreading of attention can also happen independently of the spatial location, depending instead on the attended feature in one modality being part of an object. Consequently, all features of the object receive increased processing [Busse et al., 2005], potentially leading to a spread of alpha desynchronization to other sensory cortices. To examine the persistence of the observed modality-specific effects of affective manipulation during simultaneous audiovisual stimulation, multimodal stimuli should be constructed that allow the manipulation of affect in one modality.
In Mühl and Heylen [2009], we conducted a small-sample pilot study using the IAPS and IADS stimulus sets to combine neutral stimuli in one modality with affective stimuli in the other modality. However, the analysis of the subjective ratings indicated the limitation of such a modality-specific manipulation of affect: many stimuli yielded inconsistent effects, and about 20% of the affective stimuli failed to induce emotional responses. Besides using multimodal stimuli that naturally carry the affective content in one of the two modalities, whose collection and evaluation might be a great challenge, the approach just described might be elaborated, for example by restricting processing to the affect-carrying modality. In addition to the bottom-up manipulation of the modality-specific affect origin, a top-down manipulation of attention, for example by instructions to evaluate only one modality, could be used. This approach was chosen by Spreckelmeyer et al. [2006] and indeed yielded different processing loci for visually and auditorily attended affective stimuli.

4.5 Conclusion

The present study aimed at the creation and validation of an affect induction protocol that allows the separation of affective responses induced by different stimulus modalities into modality-specific and general neurophysiological correlates. Subjective ratings and electrocardiographic data validated the protocol's efficacy, and neurophysiological measurements delivered evidence for the hypothesized modality-specific correlates of affective responses. Taken together, we have shown that neurophysiological correlates of affective responses measured by EEG alpha band power are partially modality-specific: they differ in a systematic way between visual, auditory, and audio-visual stimulation. Visual processes are mainly reflected over parietal regions, while auditory processes are reflected over fronto-central regions.
Furthermore, we observed general effects of valence in the beta and gamma bands, and general effects of arousal in the delta and theta bands. The occurrence of modality-unspecific correlates of affect in several frequency bands indicates the involvement of multiple processes (e.g., significance detection, response synchronization, attention) and structures in the affective response. On the one hand, the general correlates of affect are a promising basis for neurophysiology-based affect detection. On the other hand, they also hint at the complexity of affective responses, which, also in the case of general correlates, are not necessarily of an exclusively affective nature. The observed modality-specific and potentially cognitive responses to affective stimulation pose problems for the generalization and specificity of affect classifiers relying on such correlates instead of more general correlates of affect. However, they also imply the possibility to detect the modality, visual, auditory, or combined, through which the affective response was elicited. This would make affective BCIs a unique sensor modality for the detection of emotional responses and the context in which they appear. Further studies on context-specific correlates of affect are necessary to work toward robust aBCIs, and toward the exploitation of the context-specificity of neurophysiological responses.

Chapter 5

Conclusion

The present work addressed three different, but interrelated, issues that are relevant for the work toward affect sensing in the context of human-media interaction, and that can be summarized in the following questions:

• Study 1 (Chapter 2): Can a literature-based neurophysiological indicator of relaxation, parietal alpha power, be used as an additional control modality in playful human-computer interaction? What are the critical factors of such a transfer from literature to application?
• Study 2 (Chapter 3): Can music clips be used to induce affect? What are the neurophysiological correlates of such multimodally induced affective responses?

• Study 3 (Chapter 4): Can a multimodal affect induction protocol be used to separate modality-specific and general correlates of affect? What are the consequences of context-specific affective responses?

In this chapter we will briefly review the results obtained from the conducted studies, discuss the consequences and remaining issues that research toward affective BCI has to address in the future, and outline the contributions, or take-home messages, of the present work for the field of aBCI.

5.1 Synopsis

5.1.1 Evaluating Affective Interaction

In Chapter 2 we built a game that enabled the user to control an aspect of the game, opponent speed (i.e., game difficulty), with their affective state as measured by its neurophysiological indicator, parietal alpha band power. High parietal alpha power has been associated with a state of relaxation in several studies and has been used for aBCI game control before. Different from existing approaches, the main part of our game was controlled using conventional keyboard input for movement and scoring, as we were interested in exploring the use of a neurophysiological indicator of affect in a normal gaming scenario. To evaluate this hybrid aBCI game, we manipulated the alpha feedback, and thereby the user's control over the neurophysiological indicator of affect, by replacing it with quasi-random feedback. We predicted differences in users' game experience (e.g., control, tension) and physiological indicators (heart rate and alpha power). Unexpectedly, the feedback manipulation did not cause any effects on subjective or objective indicators. Despite the discouraging result, the study yielded insight into issues that might have led to the inefficacy of the affective control, and that are, hence, important when implementing aBCI applications.
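As an illustration of the kind of signal the game relied on, the following sketch computes relative alpha band power (8-12 Hz) from a single EEG epoch with a plain periodogram. The sampling rate, band limits, and synthetic signal are assumptions for demonstration, not the parameters of the actual system:

```python
import numpy as np

FS = 256              # assumed sampling rate in Hz
ALPHA = (8.0, 12.0)   # assumed alpha band limits in Hz

def relative_alpha_power(epoch, fs=FS, band=ALPHA):
    """Fraction of total (non-DC) power falling into the alpha band."""
    sig = epoch - epoch.mean()                  # remove DC offset
    power = np.abs(np.fft.rfft(sig)) ** 2       # simple periodogram
    freqs = np.fft.rfftfreq(sig.size, d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return power[in_band].sum() / power[1:].sum()

# Synthetic 2-second "parietal" epoch: strong 10 Hz alpha plus weak 20 Hz beta.
t = np.arange(0, 2.0, 1.0 / FS)
epoch = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)
rel_alpha = relative_alpha_power(epoch)         # ≈ 0.94 for this signal
```

In a game like the one described, such a value, averaged over parietal electrodes and a sliding window, would then be mapped onto an aspect of the game such as opponent speed.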
The failure to find effects of the manipulation of aBCI control on subjective and objective indicators of affect can be attributed to three issues:

1. The reliability of the affect indicator we identified from the literature did not suffice for application in a complex scenario. The affective indicator used in the game, posterior alpha activity, was associated with a state of relaxed wakefulness by several studies conducted in a neurofeedback or neuroscience context. The context of these experiments and the chosen application scenario are markedly different. In the original studies, participants were only given the task to relax; they were not engaged in any activity. In our study, participants had to play a visually intense computer game, which is an active task that requires not only mental activity, but also movement of the eyes to spot and follow opponents and of the hands to control the avatar. The additional processes involved due to movement, visual perception, and focusing of attention, all activities that have been found to influence alpha band power over parietal and central regions [Pfurtscheller and Lopes da Silva, 1999], might have had a strong effect on the neurophysiological indicator used. The additional variance in the alpha band would have led to a decrease of the variance explained by arousal, attenuating the relationship between relaxation and its indicator.

2. To make use of the additional modality of "relaxation", the participants had to self-induce a state of relaxation, which might be difficult if they were already highly relaxed. We tried to counteract such a possible floor effect by the introduction of a highly difficult game level. The increase of difficulty did indeed lead to higher tension beyond that generally induced by a stressor such as a computer game. However, the higher difficulty levels also did not lead to an effect of affective control.
Neurofeedback studies, which have followed the same principle of enabling control over a neural indicator (mostly not directly associated with a specific mental state, but an abnormal neural quantity that is to be changed to normal), make use of multiple sessions to enable the participants to learn control over the indicator. Similarly, training the users to achieve better control over their affective state could be helpful in the acquisition of the skill of mental and bodily relaxation during a complex computer game.

3. The feedback saliency, the clarity of the effect that relaxation has on the game dynamics, is another possible factor that led to the lack of effects of the feedback manipulation. The feedback saliency is directly associated with the mapping of affect to game, and crucial for the assessment of the efficacy of the aBCI control via feedback, and hence control, manipulation. We visualized the relaxation state of the user implicitly, by opponent speed, and explicitly, as bar heights at the top of the screen. The temporal fidelity, that is, the integration time of the neurophysiological indicator of affect, is another factor that might influence the strength of the perceived relation between relaxation and game. Finally, the artifacts introduced by movements in the context of game play, muscle tension, and hand and eye movements are a further source of influence on the EEG.

Summarizing, the use of the literature to identify indicators of affect for application in real-world scenarios or computer games is limited by the difference between the environments of a controlled and restrictive laboratory study and the application scenario. For example, co-activations that are specific to the environment of application might have negative effects on the indicator reliability. Similarly, self-induced affective states might be difficult to produce in a complex environment.
These issues make a validation of the approach necessary, but several factors have to be taken into account, such as feedback saliency and skill acquisition time.

5.1.2 Inducing Affect by Music Videos

Effective affect induction methods are a prerequisite for the exploration of the neurophysiological correlates of affective responses. To study affective responses in the context of human-media interaction, we proposed and evaluated an affect induction protocol which uses multimodal musical stimuli, music videos, to induce different affective states (Chapter 3). We chose ten out of forty stimuli for each quadrant of the valence-arousal plane, based on their efficacy in inducing strong subjective responses in an online study. Half of the chosen stimuli were identified via their affective tags on the social music platform last.fm. The validity of the affect induction procedure, its capability to manipulate the affective state of the participants along the dimensions of valence and arousal, was supported by the expected differences we found in the subjective experience and the physiological responses of the participants. The exploratory correlation analysis of the neurophysiological responses to the affective stimulation delivered further evidence for the validity of the induction procedure. In particular, strong responses for valence in the theta, beta, and gamma range were in accordance with the literature. The arousal manipulation, however, did not deliver significant overall effects, though trends toward significance were found for the theta and beta bands, and an expected negative correlation with the alpha band was present for some electrodes. The participants' general preference for specific music pieces was, as expected from high correlations with the valence ratings, correlated with partially the same frequency bands as the valence ratings, namely beta and gamma. Furthermore, we found order effects on all frequency bands and explored the influence of a baselining approach.
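The essence of such an exploratory correlation analysis, correlating per-trial band power at an electrode with subjective ratings, can be sketched with synthetic data; the rating scale, the simulated dependency, and the sample size are hypothetical, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(42)
n_trials = 40

# Hypothetical per-trial valence ratings (9-point SAM-style scale).
valence = rng.uniform(1, 9, size=n_trials)

# Hypothetical beta band power at one electrode, loosely tracking valence.
beta_power = 0.5 * valence + rng.normal(0, 0.5, size=n_trials)

# Pearson correlation between ratings and band power at this "electrode".
r = np.corrcoef(valence, beta_power)[0, 1]
```

In practice this is repeated over many electrodes and frequency bands, which is why the multiple-comparison risk mentioned earlier calls for cautious interpretation of exploratory results.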
The computation of responses relative to a pre-trial baseline had an influence on the correlations, leading to a different pattern of results for all ratings. Therefore, we conclude that order effects are an issue for experiments that induce affective states, and consequently for affective BCIs, especially when subjective ratings are used as a ground truth. However, baselining approaches might also remove genuine correlates of affect, namely such pre-stimulus activity that partially determines the emotional response. The suggested and validated affect induction procedure offers a way to extract a large number of stimuli from the internet, using the rich repository of tagged multimedia platform content. Thereby a large number of multimodal stimuli, specifically produced to induce affect, can be identified and acquired in a relatively short time. These stimuli are consumed on a daily basis by large parts of the general population, which helps guarantee the ecological validity of the stimulation. Physiological studies suggest that the emotional responses are comparable to non-musically induced emotional responses. However, how general or context-specific the musically induced neurophysiological responses are has yet to be explored. Are the observed correlates also valid in contexts other than passive music consumption? We pointed out some limitations of the current study, specifically of the affect induction by music videos, which have to be resolved in future studies: only modest valence experiences for low-arousing emotions, a covariation of familiarity with the subjective ratings of affect, and a potential influence of stimulus features on the neurophysiological correlates. We propose a further investigation of the advantages of a subject-specific selection of clips, to yield more extreme valence experiences for low-arousing emotions and to avoid confounds by familiarity.
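A pre-trial baselining step of the kind referred to above can be sketched as follows; the array shapes and the log-ratio convention are assumptions for illustration, not the exact procedure used in the study:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical band power: 40 trials x 32 electrodes, with per-trial
# pre-stimulus baseline power of the same shape.
trial_power = rng.uniform(1.0, 3.0, size=(40, 32))
baseline_power = rng.uniform(1.0, 3.0, size=(40, 32))

def baseline_correct(power, baseline, mode="log-ratio"):
    """Express band power relative to a pre-trial baseline."""
    if mode == "subtract":
        return power - baseline
    if mode == "log-ratio":  # dB-like scale, common for EEG band power
        return 10.0 * np.log10(power / baseline)
    raise ValueError(f"unknown mode: {mode}")

corrected = baseline_correct(trial_power, baseline_power)
```

The caveat in the text applies here: if pre-stimulus activity itself carries affect-related information, dividing it out removes that information along with the nuisance variance.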
Also, combining the musical stimulation with an additional affect induction procedure, for example the instruction to imagine the emotion that is to be induced by the clip, might lead to stronger emotional responses. Summarizing, music videos are a viable method to manipulate the affective state of a user along the dimensions of valence and arousal. Furthermore, the large repository of affectively annotated content on social media websites offers a convenient method to preselect stimuli for the creation of novel stimulus sets using rich and natural media.

5.1.3 Dissecting Affective Responses

Appraisal theories of emotion suggest the co-occurrence of a number of processes involved in the processing of emotional stimuli. This leads to a compound response, consisting of the responses of different affective and affect-related neural subsystems, that is observed during affect induction procedures. Specifically, in addition to general core affective processes, sensory processes that are related to the stimulus evaluation can be expected. In Chapter 4, we aimed at the creation and validation of an affect induction protocol that would allow the separation of affective responses induced by different stimulus modalities into modality-specific and more general neurophysiological correlates of the affective response. Subjective ratings and electrocardiographic responses validated the protocol's efficacy in inducing affective states differing along the dimensions of valence and arousal. The neurophysiological measurements delivered evidence for the hypothesized modality-specific correlates of affective responses. Specifically, alpha band power over "visual" parietal regions decreased in response to visual affective stimulation, and increased in response to auditory affective stimulation. The opposite response pattern was observed for alpha band power over "auditory" fronto-central regions.
For combined, audiovisual affective stimulation, a decrease over both regions was found. These responses are in line with theories postulating a role of alpha oscillations in sensory processes, such as attentional orientation, and are therefore likely to be of an affect-related, but general cognitive nature. Exploring the more general correlates of the affective response, we found several frequency bands that responded to the valence and arousal manipulations. Low and high valence were differentiated by the beta and gamma frequency bands. Low and high arousal were differentiated by the delta and theta frequency bands. The literature on affective responses in the frequency domain relates these to different processes and structures that are involved in affective responses (e.g., significance detection, attention, response synchronization). These general correlates might therefore represent a promising basis for the detection of affective states. However, they also indicate that affective responses are of a complex nature, involving processes and mechanisms that are not necessarily of an exclusively affective nature. The complexity of affective responses, involving a number of processes which are partially specific to the circumstances of the affect elicitation or of a general cognitive nature, has implications for the implementation of aBCIs. The observed modality-specific, potentially cognitive responses to affective stimulation pose problems for the generalization and specificity of affect classifiers relying on such correlates instead of more general correlates of affect. However, they also imply the possibility to detect the modality, visual, auditory, or combined, through which the affective response was elicited. This would make affective BCIs a unique sensor modality for the detection of emotional responses and the context in which they appear.
Further studies on context-specific correlates of affect are necessary to work toward robust aBCIs, and toward the exploitation of the context-specificity of neurophysiological responses. Summarizing, the devised affect induction protocol is able to manipulate the affective state with auditory, visual, and audiovisual stimuli. The observed neurophysiological correlates of affective responses measured by EEG alpha band power are partially modality-specific: they differ in a systematic way between visual, auditory, and audio-visual stimulation. Visual processes are mainly reflected over parietal regions, while auditory processes are reflected over fronto-central regions. In addition, more general correlates of the affective response were found. Such affect induction protocols can be used to collect data for the training of context-independent or even context-aware aBCIs.

5.2 Remaining Issues and Future Directions

We found that affective responses are accompanied by complex patterns of neurophysiological activity in the frequency domain. The literature on the neurophysiology of affect is already characterized by a multitude of non-overlapping or even contradictory findings, which hints at the many factors that influence the characteristics of affective responses and their respective neurophysiological correlates. This is also suggested by the differences in neurophysiological responses observed between the two multimodal affect induction studies we conducted, and more directly by the observed modality-specificity of affective responses to visual and auditory stimuli. Furthermore, the application of an indicator of affect to control parts of the game dynamics in a computer game failed, despite being based on a number of studies showing the efficacy of the used indicator.
We identified several possible sources of the failure of our approach, one being the compromised reliability of an indicator of the affective state in a complex HCI environment. The generalization from well-controlled and restricted laboratory studies to environments that require many complex mental processes is a key point for the successful transfer of literature knowledge to practice. This has consequences for the successful implementation of aBCIs, and for the development of future generations of more powerful BCIs. Firstly, the mechanisms and correlates of affect, and specifically the contextual factors that partially determine the correlates of affective states, have to be understood to develop reliable aBCIs. Secondly, the rich information that the human brain holds about the context of a given affective response might be exploited for affect recognition. Thirdly, the affect induction protocol used to train affective BCIs should be oriented toward the specific application context that the aBCI will work in, so that confounds due to differing affective mechanisms and cognitive co-activations can be avoided. Below we will develop these notions further.

5.2.1 Mechanisms and Correlates of Affective Responses

As formulated by Phelps [2006], it is "apparent that much like the study of cognition divided functions into different domains, such as memory, attention, and reasoning, the concept of emotion has a structural architecture that may be similarly diverse and complex". This complexity of affective responses is formulated and predicted by appraisal theories of emotion [Sander et al., 2005; Barrett et al., 2007]. Consequently, what is observed in response to an emotional stimulus event can be described as a compound response, consisting of the combined responses of the relevant neural subsystems directly or indirectly involved in the processing of the emotional event.
This is especially relevant for studies that make use of EEG measures, which are notorious for their low spatial resolution, making the identification of specific subprocesses difficult. Our own studies have shown that no single frequency band and location alone differentiates between affective states. Rather, we found different frequency ranges responding to valence, for example the theta, beta, and gamma bands, and to arousal, for example the delta and theta bands (see Chapters 3 and 4). The literature relating these oscillations to affective or cognitive processes is vast and inconclusive. In particular, the difficulty of associating EEG frequency responses with their sources (subcortical, cortical, or muscular) impedes the clear attribution of neural structures and a better interpretation with regard to the underlying mental processes responsible for them. Furthermore, depending on the specific context in which the affective response is elicited and observed, specific subsystems might respond differently. For example, affective responses to visual stimulus events can be differentiated from affective responses to auditory stimulus events, presumably based on the activation of the respective sensory cortices (see Chapter 4). This complexity of the neural processes associated with affective responses can be considered a curse and a blessing at the same time. While it makes the identification of the core affective components of the neural responses extremely difficult, it also opens up new possibilities for the identification of the circumstances of affective responses based on their neural pattern. Consequently, it is important to explore and identify the "atoms" of affective responses, and their respective neurophysiological correlates. Appraisal theories, for example the Component Process Model [Sander et al., 2005], can give guidance on which mechanisms and subprocesses are involved in the overall affective response.
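Since no single band or location suffices, analyses of this kind typically operate on per-channel power in several bands at once. A minimal sketch of such a multi-band feature extraction follows; the band boundaries are conventional values, not necessarily those used in this thesis:

```python
import numpy as np
from scipy.signal import welch

# Conventional EEG frequency bands, delta through gamma (illustrative).
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (13, 30), "gamma": (30, 45)}

def band_feature_vector(eeg, fs):
    """Flatten per-channel power in each band into one feature vector."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs, axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs <= hi)
        feats.append(psd[:, mask].mean(axis=-1))
    return np.concatenate(feats)  # length = n_bands * n_channels

fs = 128
rng = np.random.default_rng(1)
eeg = rng.normal(size=(32, 10 * fs))  # 32 channels, 10 s of noise
x = band_feature_vector(eeg, fs)
assert x.shape == (5 * 32,)
```

Feature vectors of this shape, one per trial, are what correlation analyses and classifiers then operate on.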
Researchers who aim at the identification of affective states might profit from a dissection of the expected responses in terms of the mechanisms involved (e.g., core affective processes, sensory orientation, behavior planning) and their respective neural correlates. While this endeavor may not seem to be the core interest of aBCI research, we argue that it is of prime importance for the development of reliable affect classification, especially in the context of the complex processes that occur during human-media interaction. In that sense, the goals of more traditional emotion research and the aBCI community overlap, and a joint effort to understand the mechanisms and correlates of affective responses seems imperative. In Chapter 4, we devised a multimodal affect induction protocol to identify context-specific correlates of the affective response. This is a first step in delineating the mechanisms of affect that occur in complex multimodal environments. We were able to associate emotional responses in the alpha band with modality-specific auditory and visual processes that are potentially of a general cognitive nature, similar to attentional responses. Such affect induction procedures not only separate context-specific from more general correlates of the affective response, they can also be used for the selection of features and the training of affect classifiers that are either "blind" to contextual factors or that even allow a context-sensitive classification of affective responses. Classifiers that are able to discriminate between core and context-specific correlates would improve the performance of affect detection approaches. We will elaborate this argument further in Section 5.2.2. Another way to increase the reliability of aBCI approaches is the exploration of further methods and measurements to decipher the complex processes that establish and accompany affective responses.
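The notion of features that are "blind" to contextual factors can be illustrated with a simple selection heuristic: keep only features whose class difference points the same way under visual and auditory induction. This is purely illustrative; the thesis itself did not attempt such a selection or classification:

```python
import numpy as np

def context_stable_features(X_vis, y_vis, X_aud, y_aud):
    """Indices of features whose high-vs-low class difference has the same
    sign under visual and auditory induction (illustrative heuristic)."""
    def class_diff(X, y):
        return X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return np.flatnonzero(
        np.sign(class_diff(X_vis, y_vis)) == np.sign(class_diff(X_aud, y_aud))
    )

rng = np.random.default_rng(2)
y = np.repeat([0, 1], 50)               # e.g. low vs. high valence trials
X_vis = rng.normal(size=(100, 3))
X_aud = rng.normal(size=(100, 3))
X_vis[y == 1, 0] += 1.0                 # feature 0: consistent across contexts
X_aud[y == 1, 0] += 1.0
X_vis[y == 1, 1] += 1.0                 # feature 1: flips sign between contexts,
X_aud[y == 1, 1] -= 1.0                 # like the parietal alpha effect in Ch. 4

stable = context_stable_features(X_vis, y, X_aud, y)
assert (stable == 0).any() and not (stable == 1).any()
```

A context-sensitive classifier would do the opposite: deliberately retain the sign-flipping features as evidence about the stimulus modality.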
As an alternative or in addition to the frequency domain features studied here, neural source localization of different components of the affective response [Onton and Makeig, 2009], connectivity changes between brain regions accompanying affective responses [Miskovic and Schmidt, 2010], or the development of neural signatures of affective processes over time can be explored. Concerning the latter temporal unfolding of affective processes, Scherer's Component Process Model postulates a development of the affect-related processes over time [Sander et al., 2005]. An emotional stimulus undergoes a series of "checks" that successively evaluate its relevance according to factors such as novelty, intrinsic pleasantness, and goal conduciveness. Grandjean et al. [2008] have found evidence for this temporal structure in the frequency bands, though the existence of a clear (and applicable) chronometry of their occurrence still has to be determined. Rudrauf et al. [2009] also showed a complex development of neurophysiological responses to affective pictures. Similarly, for the processing of emotional facial expressions, evidence for several stages of information elaboration was found in the EEG [Balconi and Pozzoli, 2009]. Future research can explore the neural sources of affective and affect-related cognitive processes, the cross-talk between them, and their development over time to further refine the classification of affect, and to disambiguate affective and cognitive processes. Alternatively, other physiological measures can be used to increase the classification accuracy. Electrophysiological measurements obtained with EEG devices are limited in spatial resolution and in the range of structures whose activity can be assessed.
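Of the directions listed above, connectivity is straightforward to prototype: magnitude-squared coherence between channel pairs is one common estimator of inter-regional coupling. A toy sketch with simulated channels (illustrative only, not the estimator used in the cited studies):

```python
import numpy as np
from scipy.signal import coherence

fs = 256
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(3)
shared = np.sin(2 * np.pi * 10 * t)             # common 10 Hz rhythm
ch_a = shared + rng.normal(scale=0.5, size=t.size)
ch_b = shared + rng.normal(scale=0.5, size=t.size)
ch_c = rng.normal(scale=0.5, size=t.size)       # independent channel

# Coherence is near 1 where two channels share an oscillation, near 0 otherwise.
f, coh_ab = coherence(ch_a, ch_b, fs=fs, nperseg=fs * 2)
f, coh_ac = coherence(ch_a, ch_c, fs=fs, nperseg=fs * 2)
alpha = (f >= 8) & (f <= 12)
assert coh_ab[alpha].mean() > coh_ac[alpha].mean()
```

Repeating this over all channel pairs and bands yields a connectivity matrix per trial, which can serve as an alternative or complementary feature set.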
Besides exploring the possibilities that EEG offers for the detection of affect and its contextual factors, the fusion of EEG with measurements obtained from other sensor modalities, for example activations of the autonomic or somatic nervous systems, has to be explored. Our own studies have shown that physiological sensors measuring heart rate, muscle tension, and electrodermal activity respond to affective stimulation in the context of human-media interaction. Chanel et al. [2005] reported an increase in the performance of arousal detection when incorporating physiological sensors in addition to EEG frequency domain features. Though they might be similarly susceptible to contextual factors (see for example [Stemmler et al., 2001]), in unison with EEG, physiological sensors could lead to a more reliable affect recognition. Such "hybrid BCI" systems [Pfurtscheller et al., 2010] might be the next step on the way toward the automatic detection of affect. Thereby, the complexity of neurophysiological activity could prove to be a valuable asset for the identification of the circumstances of the affective response.

5.2.2 Context-specific correlates of affect

We have observed context-specific correlates of affect in response to different affect induction modalities. These effects are likely to be of a rather general cognitive nature, as observed during any episode of increased sensory processing, for example when attending to a stimulus in a specific modality. We have discussed theoretical threats to the generalization and specificity of affect classifiers that are trained on data collected during unimodal affective stimulation (see Chapter 4). However, as we did not attempt the classification of affective responses based on their neurophysiological correlates, the practical impact of such context-specific correlates remains to be determined.
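How such a classification, including the feature-level fusion of EEG and peripheral measures discussed above, could be prototyped is sketched below. Everything here (feature dimensionalities, effect sizes, the nearest-centroid classifier) is an illustrative assumption, not the method of any of the cited studies:

```python
import numpy as np

def fused_features(eeg_feats, phys_feats):
    """Feature-level fusion: z-score each modality separately so that
    neither dominates the distance metric, then concatenate."""
    def zscore(X):
        return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    return np.hstack([zscore(eeg_feats), zscore(phys_feats)])

def nearest_centroid_fit(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def nearest_centroid_predict(centroids, X):
    classes = list(centroids)
    d = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes])
    return np.array(classes)[d.argmin(axis=0)]

rng = np.random.default_rng(4)
y = np.repeat([0, 1], 100)                     # e.g. low vs. high arousal
eeg = rng.normal(size=(200, 10))
eeg[y == 1, :2] += 0.8                         # weak EEG cue
phys = rng.normal(size=(200, 3))
phys[y == 1, 0] += 0.8                         # weak peripheral cue

X = fused_features(eeg, phys)
pred = nearest_centroid_predict(nearest_centroid_fit(X, y), X)
acc = (pred == y).mean()                       # resubstitution accuracy only
assert acc > 0.6
```

In practice one would use cross-validation and a stronger classifier; the point of the sketch is only that two weak, noisy modalities can be combined at the feature level.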
More generally, and in the sense of the previous section, the dependence of affective responses on contextual factors should be determined, including, for example, correlates of behavior planning in more active aBCI scenarios. On a more positive note, future studies should evaluate the viability of the classification of context-dependent correlates, as they may provide a rich description of the circumstances a given affective response was observed in. An example is, of course, the determination of the origin of affective events in multimodal settings, where such knowledge can be used for implicit tagging approaches. These try to predict the affective responses of consumers to the content of media that they are exposed to, for example music videos or films. Knowing whether the user response was elicited by a particular auditory event (for example a song fragment) or a visual event (an aesthetically beautiful view or a well-photographed scene) can enrich the informativeness and quality of the affective tags attached to the multimedia content. Later on, when the tags are used by a recommendation system to suggest media that fits a certain affective criterion, additional information about the visual or auditory nature of the media can help to extract media better suited to serve the wishes of the user. Due to the specific design of our affect induction procedure, however, it is not possible to determine whether such modality-specific differences in the affective response can also be observed in real-world settings. There is a defining difference between our study and potential application scenarios. Whereas we used isolated unimodal visual and auditory affective stimulation to study the different affective responses, real-world settings are mostly characterized by the simultaneous experience of visual and auditory stimuli.
Due to the cross-modal transfer of attentional effects [Driver and Spence, 1998], the modality-specific responses might be less distinct. This could be tested using simultaneous visual and auditory stimulation, manipulating only the origin of the affective information, as attempted by Spreckelmeyer et al. [2006]. Below, we will further discuss this issue of the ecological validity of affect induction protocols, that is, the capability to generalize effects from well-controlled and restricted experiments to their application in complex environments of human-media interaction.

5.2.3 Affect Induction and Validation in the Context of HMI

A further consequence of the complexity of affective responses is the difficulty of simply implementing indicators of affective user states that were identified from the literature. In our first study, we evaluated the use of an indicator of the affective user state of relaxation, posterior alpha power, for control in a computer game. The manipulation of the controllability of the game by user-induced states did not affect user experience or physiological measures, despite several studies reporting the association of posterior alpha power with a relaxed state of mind (see Chapter 2). A major issue for the application of affect indicators in human-computer interaction is the reliability of the used indicator of affect under the complex conditions that are present during normal human-computer interaction. Most research in the fields of psychophysiology, and especially in the neurosciences, is carried out in the form of well-controlled and restricted laboratory experiments. The experiment protocols are optimized to observe the effects that are of relevance for the research questions at hand. Participants are given a simple task, instructed not to move, and the influence of additional factors is minimized. These conditions are very different from what can be observed during human-computer interaction.
Here the users carry out quite complex tasks, maybe even several tasks simultaneously. They are allowed, and often required, to move, while they perceive a constantly changing environment, as is for example the case in a computer game. These differences between more traditional laboratory research and research carried out in complex human-computer interaction environments might be a problem for the generalization of observed effects from one environment to the other. Picard et al. [2001] refer to one of the factors that should be taken into account when designing experiments for the study of the relationship between physiological measurements and affective states as "Lab setting versus real-world". In the face of the complexity of emotional responses, especially those that are measured in terms of neural activity, this factor seems to us of prime importance. For aBCI applications, it directly pertains to the ecological validity of the indicator: its generalization from the study that associated it with a certain affective state to a context in which it is applied. The changed circumstances, including additional mental processes and different contextual factors, might lead to severe changes in the correspondence between indicator and state. An example of such "unreliable" correlates are the modality-specific affective responses reported in Chapter 4. While parietal alpha power decreases with visual emotional stimulation, it increases with auditory emotional stimulation. On the other hand, effects that are found to be statistically significant in traditional laboratory studies, for example the relation of posterior alpha power with relaxation, might not be strong enough to remain significantly related to a state of relaxation in applied aBCI scenarios. In particular, additional variance due to the complex visual stimulation (as found in Chapter 4) might conceal the relationship between alpha activity and relaxation.
Consequently, researchers who aim at the development of aBCI approaches have to take into account the context in which a given indicator of affect is supposed to work. That entails two implications for affect induction and indicator validation in applied and active (using self-induced affect) aBCI settings. Firstly, the reliability of literature-based indicators of affect for aBCI has to be critically evaluated in the application environment. We devised one possible validation scheme, a double-blind study manipulating the controllability of the affective control, and formulated expectations for subjective and objective measures of the user experience. Other validation schemes are possible. Especially when the user has to self-induce different affective states to control the application, these can be directly compared to each other, also in terms of consequent changes in subjective and objective measures. The advantage of such validation methodologies is that not only the reliability of the indicators of affect can be assessed, but also the users' capacity to manipulate their affective states during human-computer interaction. Furthermore, the adequacy of the feedback conditions can be evaluated and adjusted (see Chapter 2 for some considerations pertaining to feedback saliency). The disadvantage, however, is the laborious evaluation of alternative indicators and of the factors that might influence their reliability. Alternatively, correlates of the relevant affective states can be induced and identified in the environment where they will later be applied. For example, in the case of relaxation as an auxiliary control modality during normal human-computer interaction, participants can be asked to relax in one condition, while playing normally in the other. To encourage relaxation, other affect induction procedures might be used prior to the relaxation condition, for example relaxing music videos or nature scenes.
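The condition comparison sketched above (instructed relaxation versus normal play) could, for instance, be evaluated with a within-subject test of parietal alpha power per participant. The numbers below are simulated purely to illustrate the analysis, not real data:

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical per-participant mean parietal alpha power in two in-game
# conditions: instructed relaxation vs. normal play (simulated values).
rng = np.random.default_rng(5)
n_subjects = 20
baseline = rng.normal(loc=1.0, scale=0.3, size=n_subjects)
alpha_play = baseline + rng.normal(scale=0.1, size=n_subjects)
alpha_relax = baseline + 0.25 + rng.normal(scale=0.1, size=n_subjects)

# Paired test: each participant serves as their own control.
t, p = ttest_rel(alpha_relax, alpha_play)
assert p < 0.05 and t > 0   # alpha reliably higher in the relax condition
```

A null result in such a test, in the actual application environment, would flag the indicator as unreliable before it is wired into the interaction loop.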
Care should be given to the more specific effects of such priming methods. It is important that the correlates are not specific to the prior affect induction, and that the affective state can also be attained by the user without such stimulation procedures. On a more general note, any affect induction protocol that aims at the collection of data for the training of affective classifiers, used in passive and active aBCI paradigms, has to be designed to ensure the ecological validity of the inferences, that is, their generalization to the context of use. In that regard, the induction of affect with music videos explored in Chapter 3 is one possible approach to identify the correlates of emotional responses in complex, multimodal environments. Other scenarios of aBCI use require different affect induction procedures that specifically reflect the situation at the time of intended use. Recent examples of such protocols from the human-computer interaction domain include the induction of frustration during game play [Reuderink et al., 2009, 2013], and the induction of different levels of engagement during game play [Chanel et al., 2011]. Both studies evaluated the correlates of the respective affective states, and were partially able to relate them to the literature on affective and cognitive processes expected in these specific contexts.

5.3 Contributions

Our studies covered a number of different, but interrelated, questions. Below, we briefly summarize the contributions we have made to the emerging field of affective brain-computer interfaces.

• The application of neurophysiological correlates of affect for active human-computer interaction requires validation in the context of the application.
We outlined three factors that seem essential to us for the successful transfer of literature-based indicators to practice: the reliability of the indicator in the HCI scenario, the capability of users to learn to induce the affective state in that context, and the saliency of the indicator feedback.

• We devised and validated two multimodal affect induction procedures for the exploration of correlates of affect and for the collection of training material for aBCI in the context of human-media interaction.

• We found that neurophysiological activity is informative about the affective state of the user, as determined by valence and arousal experience, in such multimodal stimulation contexts. Valence and arousal manipulations were associated with diverse frequency bands. Similarly, user preference, a personal aesthetic judgement for a given stimulus, is accompanied by responses in the frequency bands that might be used for implicit tagging of media content.

• We showed that context-specific cognitive co-activations in response to affective stimulation can be separated from more general correlates of affect. The devised induction protocol can be used to collect data for aBCIs that focus on general correlates to avoid problems of specificity and generalization of context-specific correlates. On the other hand, the data can be used to train context-sensitive classifiers that are able to identify the source of the affective response.

• The observed responses in the different frequency bands, and their presumably associated processes as identified from the literature, indicate the complexity of affective responses, as predicted by appraisal theories. For reliable affect estimation from brain activity, an understanding of the different mechanisms and processes involved in affective responses and their neural correlates is necessary.
• We argued that, due to the complexity and context-specificity of affective responses, affect induction paradigms that aim at the collection of data for the training of classifiers for aBCI applications should be as similar as possible to the intended context of the later application scenario. This would minimize the risk of detrimental effects due to differing context-dependent affective responses.

• We published a database [Koelstra et al., 2012] consisting of behavioral, physiological, and neurophysiological measurements from 32 participants during music video-based affect induction. This data can be used to explore the suitability of alternative methods and features, for example neural source localization or inter-regional connectivity.

Concluding, affective brain-computer interfaces are a fascinating mixture of diverse fields of research. They combine the intriguing and complex nature of human emotions, the exploration of their neurophysiological basis, and the endeavor to decipher them algorithmically. In the present work, we focused on the correlates of affective states, exploring the conditions and limitations under which they occur and might be used. Clearly, this is just a small, though important, aspect of the work necessary toward affective brain-computer interfaces. The brain offers a great source of information on the user state in general, and future research on the border between psychology, affective neuroscience, and computer science will not only elucidate the use of the brain for user interfaces, but also holds the promise to shed light on the workings of the human mind.

Appendix A Supplementary Material Chapter 3

Correlations between Frequency Bands and Valence

Table A.1: The correlation coefficients (ρ) and significances (p-value) of the correlations between the single-electrode (E) frequency band power and the valence ratings of all participants.
Significant correlations (p < 0.05) are highlighted for all frequency bands (also those not significant in the general correlation analysis). E Fp1 AF3 F7 F3 FC1 FC5 T7 C3 CP1 CP5 P7 P3 Pz PO3 O1 Oz O2 PO4 P4 P8 CP6 CP2 C4 T8 FC6 FC2 F4 F8 AF4 Fp2 Fz Cz θ α low γ β high γ ρ p-value ρ p-value ρ p-value ρ p-value ρ p-value 0.10 0.08 0.08 0.09 0.06 0.04 0.04 0.06 0.07 0.03 0.09 0.08 0.10 0.11 0.14 0.19 0.17 0.12 0.14 0.13 0.05 0.07 0.04 0.07 0.06 0.08 0.11 0.08 0.06 0.12 0.10 0.09 0.002 0.006 0.004 0.003 0.039 0.174 0.123 0.049 0.014 0.322 0.003 0.007 < 0.001 < 0.001 < 0.001 < 0.001 < 0.001 < 0.001 < 0.001 < 0.001 0.117 0.011 0.158 0.028 0.029 0.007 < 0.001 0.009 0.049 < 0.001 < 0.001 0.004 0.06 0.05 0.05 0.03 0.02 0.015 0.016 0.01 0.01 0.022 0.01 0.02 0.02 0.03 0.03 0.06 0.06 0.06 0.07 0.04 0.01 0.01 0.02 0.02 0.02 0.03 0.06 0.04 0.04 0.05 0.04 0.04 0.034 0.07 0.101 0.294 0.468 0.6 0.577 0.734 0.764 0.439 0.669 0.578 0.591 0.337 0.259 0.03 0.042 0.046 0.01 0.205 0.713 0.822 0.611 0.575 0.585 0.384 0.033 0.18 0.158 0.074 0.129 0.128 0.01 -0.03 0.04 -0.03 -0.05 0.09 0.12 0.04 0.03 0.06 0.02 0.02 0.01 0.02 0.03 0.06 0.03 0.01 0.02 0.04 0.03 0.04 0.07 0.09 0.08 -0.02 -0.02 0.11 -0.08 -0.03 -0.05 -0.01 0.713 0.279 0.201 0.327 0.085 0.002 < 0.001 0.138 0.389 0.031 0.482 0.433 0.745 0.622 0.263 0.068 0.265 0.875 0.585 0.203 0.271 0.233 0.024 0.003 0.01 0.598 0.543 < 0.001 0.007 0.309 0.091 0.748 0.01 -0.02 0.07 0.01 0.06 0.09 0.10 0.09 0.10 0.10 0.08 0.09 0.09 0.05 0.06 0.07 0.03 0.06 0.09 0.10 0.09 0.08 0.11 0.08 0.13 0.12 0.04 0.11 0.01 -0.02 -0.03 0.08 0.639 0.55 0.017 0.874 0.044 0.001 0.001 0.002 0.001 0.001 0.008 0.004 0.002 0.089 0.053 0.028 0.306 0.03 0.003 < 0.001 0.002 0.011 < 0.001 0.006 < 0.001 < 0.001 0.218 < 0.001 0.759 0.589 0.299 0.005 -0.01 -0.01 0.07 0.03 0.08 0.11 0.13 0.11 0.08 0.11 0.08 0.08 0.09 0.05 0.03 0.06 0.05 0.06 0.09 0.08 0.08 0.10 0.13 0.12 0.14 0.09 0.07 0.11 0.01 -0.03 0.02 0.09 0.673 0.851 0.015 0.263 0.009 < 0.001 < 0.001 < 0.001 0.006 < 
0.001 0.008 0.006 0.002 0.106 0.389 0.025 0.066 0.037 0.003 0.008 0.007 0.001 < 0.001 < 0.001 < 0.001 0.002 0.018 < 0.001 0.855 0.296 0.555 0.003 Chapter A – Suplementary Material Chapter 3 | Correlations between Frequency Bands and Arousal Table A.2: The correlation coefficients (ρ) and significances (p-value) of the correlations between the singleelectrode (E) frequency band power and the arousal ratings of all participants. Significant correlations (p < 0.05) are highlighted for all frequency bands (also those not significant in the general correlation analysis). E Fp1 AF3 F7 F3 FC1 FC5 T7 C3 CP1 CP5 P7 P3 Pz PO3 O1 Oz O2 PO4 P4 P8 CP6 CP2 C4 T8 FC6 FC2 F4 F8 AF4 Fp2 Fz Cz θ α low γ β high γ ρ p-value ρ p-value ρ p-value ρ p-value ρ p-value 0.01 -0.01 -0.02 -0.02 -0.04 -0.04 -0.02 -0.03 0.01 -0.04 -0.02 0.04 0.07 0.08 0.04 0.07 0.03 0.03 0.08 -0.01 -0.03 0.02 0.01 0.03 0.01 -0.02 0.01 0.03 0.01 0.05 -0.01 0.01 0.553 0.849 0.45 0.412 0.203 0.201 0.464 0.377 0.77 0.177 0.517 0.241 0.02 0.01 0.219 0.017 0.342 0.395 0.007 0.743 0.384 0.481 0.818 0.349 0.814 0.556 0.874 0.24 0.968 0.089 0.63 0.84 -0.03 -0.04 -0.07 -0.07 -0.08 -0.07 -0.04 -0.02 -0.02 -0.04 -0.08 -0.03 -0.04 -0.05 -0.07 -0.06 -0.07 -0.06 -0.03 -0.05 -0.04 -0.03 -0.03 -0.03 -0.01 -0.09 -0.01 -0.01 -0.02 -0.02 -0.03 -0.06 0.288 0.12 0.021 0.011 0.005 0.018 0.142 0.542 0.583 0.194 0.009 0.239 0.133 0.065 0.01 0.049 0.015 0.043 0.339 0.068 0.173 0.236 0.229 0.359 0.702 0.001 0.654 0.802 0.462 0.559 0.258 0.042 -0.01 -0.05 -0.02 -0.09 -0.11 -0.02 -0.01 -0.06 -0.02 -0.03 -0.04 -0.03 -0.02 -0.03 -0.02 -0.01 -0.04 -0.05 0 -0.03 -0.07 -0.01 0.03 0.01 -0.01 -0.08 -0.04 -0.02 -0.08 -0.04 -0.03 -0.01 0.784 0.098 0.598 0.002 0.001 0.611 0.96 0.03 0.407 0.369 0.211 0.376 0.577 0.346 0.429 0.783 0.208 0.079 0.998 0.248 0.02 0.842 0.315 0.785 0.692 0.009 0.125 0.476 0.006 0.135 0.278 0.71 0.01 -0.05 0.01 -0.08 -0.05 0.02 0.03 -0.03 -0.01 0.01 -0.01 -0.01 0.01 0.02 0.02 0.01 0.01 -0.01 0.02 0.02 0.02 -0.01 0.04 0.05 
0.02 -0.01 -0.02 -0.01 -0.07 -0.02 -0.06 0.02 0.962 0.103 0.659 0.008 0.069 0.527 0.411 0.273 0.675 0.896 0.867 0.873 0.88 0.435 0.434 0.956 0.821 0.802 0.499 0.373 0.55 0.635 0.17 0.116 0.582 0.826 0.476 0.714 0.023 0.442 0.039 0.618 0.01 -0.02 0.02 -0.04 0.02 0.02 0.02 0.04 0.02 0.03 0.01 0.01 0.03 0.03 0.03 0.01 0.02 0.03 0.05 0.06 0.04 0.04 0.04 0.05 0.06 0.01 -0.01 -0.01 -0.01 -0.01 -0.01 0.03 0.86 0.449 0.511 0.212 0.526 0.6 0.532 0.245 0.419 0.301 0.978 0.742 0.37 0.405 0.369 0.861 0.557 0.356 0.09 0.048 0.178 0.164 0.233 0.07 0.052 0.714 0.619 0.834 0.628 0.848 0.953 0.324 131 132 | Chapter A – Suplementary Material Chapter 3 Correlations between Frequency Bands and Preference Table A.3: The correlation coefficients (ρ) and significances (p-value) of the correlations between the singleelectrode (E) frequency band power and the preference ratings of all participants. Significant correlations (p < 0.05) are highlighted for all frequency bands (also those not significant in the general correlation analysis). 
E Fp1 AF3 F7 F3 FC1 FC5 T7 C3 CP1 CP5 P7 P3 Pz PO3 O1 Oz O2 PO4 P4 P8 CP6 CP2 C4 T8 FC6 FC2 F4 F8 AF4 Fp2 Fz Cz θ α low γ β high γ ρ p-value ρ p-value ρ p-value ρ p-value ρ p-value 0.05 0.04 0.02 0.02 -0.02 -0.01 0.01 0.01 -0.01 0.01 0 0.01 0.03 0.05 0.04 0.09 0.06 0.02 0.06 0.05 -0.01 0.01 0.01 0.03 0.02 -0.02 0.03 0.03 0.01 0.06 0.03 0.02 0.118 0.165 0.577 0.531 0.575 0.734 0.615 0.959 0.942 0.83 0.986 0.643 0.24 0.072 0.207 0.002 0.049 0.421 0.051 0.084 0.607 0.863 0.635 0.327 0.391 0.446 0.302 0.25 0.654 0.053 0.305 0.553 0.03 0.02 0 0.01 0.01 -0.02 -0.02 -0.02 -0.03 -0.03 -0.03 -0.04 -0.02 -0.02 -0.03 0 -0.01 0 0.03 0.03 -0.02 -0.01 -0.01 0 0.01 -0.01 0.03 0.02 0.02 0.03 0.02 0.03 0.283 0.435 0.923 0.819 0.735 0.554 0.554 0.404 0.353 0.343 0.244 0.146 0.407 0.556 0.251 0.941 0.871 0.985 0.242 0.396 0.524 0.741 0.868 0.898 0.863 0.789 0.305 0.6 0.559 0.328 0.598 0.346 0.07 0.03 0.03 0 -0.05 0.10 0.09 0 0 0.03 0.02 -0.01 -0.01 -0.02 0 0.04 0.02 -0.03 0.02 0.05 0.02 0.03 0.05 0.06 0.06 -0.02 0.02 0.09 -0.01 0.05 -0.01 0 0.013 0.377 0.339 0.965 0.116 < 0.001 0.001 0.93 0.884 0.376 0.612 0.862 0.769 0.54 0.911 0.214 0.424 0.375 0.504 0.128 0.507 0.274 0.07 0.058 0.029 0.478 0.443 0.003 0.704 0.072 0.634 0.959 0.07 0.02 0.05 0.02 0.06 0.10 0.07 0.05 0.06 0.05 0.06 0.05 0.07 0.04 0.04 0.07 0.06 0.06 0.08 0.09 0.09 0.06 0.08 0.07 0.10 0.08 0.04 0.09 0.05 0.04 -0.01 0.08 0.012 0.52 0.085 0.445 0.036 0.001 0.025 0.092 0.051 0.077 0.064 0.076 0.027 0.206 0.205 0.015 0.039 0.065 0.009 0.002 0.003 0.031 0.006 0.021 0.001 0.005 0.159 0.003 0.108 0.197 0.651 0.009 0.06 0.05 0.04 0.03 0.08 0.10 0.08 0.05 0.05 0.05 0.05 0.04 0.04 0.04 0.02 0.05 0.07 0.06 0.07 0.10 0.08 0.09 0.09 0.10 0.11 0.09 0.06 0.08 0.08 0.04 0.07 0.10 0.032 0.073 0.201 0.246 0.003 0.001 0.008 0.068 0.069 0.088 0.11 0.201 0.135 0.161 0.511 0.104 0.014 0.058 0.013 0.002 0.008 0.003 0.003 < 0.001 < 0.001 0.004 0.065 0.009 0.011 0.186 0.013 0.001 Appendix B Suplementary Material Chapter 4 134 | Chapter B – 
Supplementary Material Chapter 4

Effects of Stimulation Modality in the Delta Band

ANOVA Main Effects of Stimulation Modality

Table B.1: Main effects of modality for the significant (p < 0.05) electrodes E in the delta frequency band. Mean and standard deviation for the visual, auditory, and audiovisual conditions, and the statistical indicators F-value, p-value, and ηp².

         Visual          Auditory        Audiovisual     ANOVA
E        x̄       s       x̄       s       x̄       s       F       p-value  ηp²
O1       0.132   0.109   0.050   0.074   0.129   0.146   5.563   0.006    0.194
Oz       0.131   0.131   0.059   0.072   0.134   0.145   4.613   0.014    0.167
O2       0.136   0.125   0.065   0.084   0.112   0.116   3.329   0.044    0.126
P8       0.147   0.128   0.039   0.088   0.096   0.140   4.263   0.020    0.156

Post-hoc T-tests between Pairs of Stimulation Modalities

Table B.2: The effects of modality in the delta frequency band. T-values and significances (p-value) for the comparisons of the effects of the different modalities: visual versus auditory (V vs A), visual versus audiovisual (V vs AV), and audiovisual versus auditory (AV vs A).

         V vs A             V vs AV            AV vs A
E        t-value  p-value   t-value  p-value   t-value  p-value
O1       2.774    0.011     0.214    0.832     2.961    0.007
Oz       2.078    0.048     0.013    0.989     2.953    0.007
O2       2.596    0.016     0.635    0.530     3.020    0.006
P8       4.447    < 0.001   1.730    0.096     3.674    0.001

Effects of Stimulation Modality in the Alpha Band

ANOVA Main Effects of Stimulation Modality

Table B.3: Main effects of modality for the significant (p < 0.05) electrodes E in the alpha frequency band. Mean and standard deviation for the visual, auditory, and audiovisual conditions, and the statistical indicators F-value, p-value, and ηp².
E AF3 F7 F3 FC1 FC5 T7 C3 CP1 CP5 P7 P3 Pz PO3 O1 O2 PO4 P4 P8 CP6 CP2 C4 T8 FC6 FC2 F4 F8 AF4 Fp2 Fz Cz Visual Auditory Audiovisual ANOVA x̄ s x̄ s x̄ s F p-value ηp 2 -0.069 -0.046 -0.068 -0.054 -0.051 -0.015 -0.043 -0.070 -0.076 -0.123 -0.144 -0.086 -0.178 -0.095 -0.029 -0.129 -0.135 -0.121 -0.083 -0.081 -0.053 -0.024 -0.030 -0.060 -0.048 -0.029 -0.046 -0.067 -0.055 -0.019 0.192 0.148 0.190 0.204 0.146 0.139 0.154 0.186 0.155 0.237 0.238 0.249 0.295 0.276 0.306 0.294 0.248 0.290 0.202 0.207 0.158 0.143 0.166 0.189 0.184 0.156 0.178 0.181 0.196 0.161 0.078 0.052 0.087 0.104 0.046 0.012 0.111 0.123 0.086 0.062 0.172 0.142 0.135 0.088 0.098 0.172 0.193 0.116 0.129 0.145 0.124 0.051 0.074 0.103 0.115 0.054 0.103 0.085 0.133 0.081 0.152 0.184 0.156 0.131 0.179 0.177 0.169 0.148 0.174 0.169 0.197 0.190 0.180 0.158 0.169 0.265 0.244 0.188 0.184 0.188 0.172 0.161 0.164 0.141 0.141 0.150 0.149 0.139 0.138 0.133 -0.043 -0.088 -0.081 -0.106 -0.115 -0.104 -0.076 -0.101 -0.102 -0.195 -0.121 -0.105 -0.192 -0.133 -0.091 -0.142 -0.140 -0.189 -0.101 -0.110 -0.044 -0.065 -0.053 -0.130 -0.087 -0.059 -0.089 -0.044 -0.096 -0.109 0.189 0.171 0.205 0.246 0.192 0.169 0.247 0.261 0.223 0.288 0.254 0.238 0.332 0.325 0.285 0.261 0.265 0.310 0.227 0.199 0.179 0.178 0.167 0.202 0.205 0.196 0.197 0.181 0.229 0.222 4.303 4.489 5.721 8.116 5.063 3.688 6.532 8.537 8.294 7.842 15.644 8.847 11.502 5.824 3.287 8.112 11.193 8.928 8.701 10.567 7.363 3.394 4.382 10.899 8.599 3.385 7.731 5.660 11.549 6.163 0.019 0.016 0.006 0.001 0.010 0.032 0.003 0.001 0.001 0.001 < 0.001 0.001 0.001 0.005 0.046 0.001 0.001 0.001 0.001 0.001 0.001 0.042 0.018 0.001 0.001 0.042 0.001 0.006 0.001 0.004 0.157 0.163 0.199 0.260 0.180 0.138 0.221 0.270 0.265 0.254 0.404 0.277 0.333 0.202 0.125 0.260 0.327 0.279 0.274 0.314 0.242 0.128 0.160 0.321 0.272 0.128 0.251 0.197 0.334 0.211 135 136 | Chapter B – Suplementary Material Chapter 4 Post-hoc Ttests between Pairs of Stimulation Modalities Table B.4: The effects of 
modality in the alpha frequency band. T-values and significances (p-value) for the comparisons of the effects of the different modalities: visual versus auditory (VvsA), visual versus audiovisual (VvsAV), and audiovisual vs auditory (AVvsA). E AF3 F7 F3 FC1 FC5 T7 C3 CP1 CP5 P7 P3 Pz PO3 O1 O2 PO4 P4 P8 CP6 CP2 C4 T8 FC6 FC2 F4 F8 AF4 Fp2 Fz Cz V vs A V vs AV AV vs A t-value p-value t-value p-value t-value p-value -2.735 -2.781 -2.858 -2.971 -2.703 -1.642 -3.175 -3.147 -2.618 -2.198 -3.539 -2.458 -2.814 -2.139 -1.830 -2.615 -3.175 -2.945 -3.377 -2.820 -2.939 -2.440 -2.450 -2.746 -2.761 -2.531 -2.647 -2.694 -3.104 -2.052 0.011 0.010 0.008 0.006 0.012 0.113 0.004 0.004 0.015 0.037 0.001 0.021 0.009 0.042 0.079 0.015 0.004 0.007 0.002 0.009 0.007 0.022 0.022 0.011 0.010 0.018 0.014 0.012 0.004 0.051 0.526 0.331 0.598 1.487 0.929 2.897 1.032 0.447 1.393 2.199 -0.108 0.381 0.710 0.797 1.148 0.969 1.013 2.039 1.130 1.266 0.499 1.103 0.374 1.744 1.062 0.578 1.046 0.594 1.520 1.972 0.603 0.743 0.555 0.149 0.362 0.007 0.312 0.658 0.176 0.037 0.914 0.706 0.484 0.432 0.262 0.341 0.321 0.052 0.269 0.217 0.621 0.280 0.711 0.093 0.298 0.568 0.305 0.557 0.141 0.060 -3.218 -2.874 -3.356 -3.987 -3.147 -2.891 -3.214 -3.423 -3.098 -3.335 -3.803 -3.019 -3.303 -2.277 -2.426 -3.312 -3.770 -4.368 -3.597 -3.657 -2.711 -3.355 -2.618 -4.182 -3.497 -3.044 -3.334 -3.131 -4.359 -3.530 0.003 0.008 0.002 0.001 0.004 0.008 0.003 0.002 0.004 0.002 0.001 0.005 0.003 0.032 0.023 0.002 0.001 0.001 0.001 0.001 0.012 0.002 0.015 0.001 0.001 0.005 0.002 0.004 0.001 0.001 Chapter B – Suplementary Material Chapter 4 | Effects of Stimulation Modality in the Beta Band ANOVA Main Effects of Stimulation Modality Table B.5: Main effects of modality for the significant (p < 0.05) electrodes E in the beta frequency band. Mean and standard deviation for visual, auditory, and audiovisual conditions, and the statistical indicators F-value, p-value, and η p 2 . 
E     Visual x̄   Visual s   Auditory x̄   Auditory s   Audiovisual x̄   Audiovisual s   F        p        ηp²
FC1   -0.021     0.071      0.033        0.085        -0.046          0.087           6.813    0.002    0.228
C3    -0.013     0.081      0.052        0.094        -0.034          0.093           6.160    0.004    0.211
CP1   -0.048     0.086      0.056        0.080        -0.045          0.105           12.439   <0.001   0.351
P7    -0.043     0.105      0.017        0.091        -0.067          0.094           6.055    0.004    0.208
P3    -0.047     0.082      0.048        0.083        -0.035          0.092           9.404    0.001    0.290
Pz    -0.019     0.078      0.055        0.081        -0.014          0.100           7.606    0.001    0.248
PO3   -0.043     0.097      0.029        0.071        -0.050          0.107           5.767    0.005    0.200
PO4   -0.029     0.093      0.064        0.106        -0.017          0.112           5.528    0.007    0.193
P4    -0.032     0.086      0.064        0.086        -0.026          0.093           8.423    0.001    0.268
P8    -0.049     0.091      0.048        0.089        -0.048          0.124           6.151    0.004    0.211
CP6   -0.028     0.100      0.060        0.081        -0.021          0.088           5.910    0.005    0.204
CP2   -0.038     0.074      0.045        0.072        -0.042          0.096           9.890    0.001    0.300
C4    0.004      0.077      0.049        0.099        -0.016          0.101           3.417    0.041    0.129
FC2   -0.023     0.073      0.014        0.068        -0.043          0.082           4.322    0.019    0.158
Fz    -0.017     0.062      0.031        0.073        -0.017          0.089           3.316    0.045    0.126
Cz    -0.039     0.086      0.029        0.103        -0.066          0.099           7.781    0.001    0.252

Post-hoc T-tests between Pairs of Stimulation Modalities

Table B.6: The effects of modality in the beta frequency band. T-values and significance levels (p-values) for the pairwise comparisons of the modality effects: visual versus auditory (VvsA), visual versus audiovisual (VvsAV), and audiovisual versus auditory (AVvsA).
E     VvsA t   VvsA p   VvsAV t   VvsAV p   AVvsA t   AVvsA p
FC1   -1.730   0.096    1.575     0.128     -3.169    0.004
C3    -2.386   0.025    1.240     0.226     -2.914    0.007
CP1   -4.537   0.001    0.126     0.900     -3.482    0.001
P7    -2.574   0.016    0.673     0.507     -2.107    0.045
P3    -3.666   0.001    -0.183    0.856     -3.076    0.005
Pz    -2.337   0.028    -0.018    0.985     -2.169    0.040
PO3   -2.640   0.014    0.219     0.828     -2.581    0.016
PO4   -2.063   0.050    0.166     0.869     -2.332    0.028
P4    -3.338   0.002    0.647     0.523     -3.654    0.001
P8    -2.840   0.009    0.018     0.985     -2.764    0.010
CP6   -4.035   0.001    -0.097    0.923     -3.085    0.005
CP2   -3.226   0.003    1.047     0.305     -3.341    0.002
C4    -2.147   0.042    1.543     0.135     -2.884    0.008
FC2   -2.450   0.021    0.876     0.389     -2.930    0.007
Fz    -2.770   0.010    0.856     0.400     -2.433    0.022
Cz    -2.554   0.017    0.658     0.516     -2.777    0.010

List of Figures

1.1  The BCI processing pipeline ..... 9
1.2  A taxonomy of affective BCI applications ..... 14
2.1  Bacteria Hunt gameworld screenshot ..... 36
2.2  The BCI pipeline used for the extraction of alpha power and SSVEPs ..... 39
2.3  Accuracies according to the threshold parameter of the SSVEP detection ..... 42
2.4  ROC curves of the original and used SSVEP algorithm ..... 43
2.5  The electrode montage with the sites relevant for signal extraction ..... 46
2.6  The development of heart rate and alpha power over the game levels ..... 49
3.1  The web-interface of the pilot study ..... 63
3.2  The stimuli of the pilot study in the valence-arousal space ..... 65
3.3  Sensor placement scheme and a participant example ..... 66
3.4  The affect questionnaire used in the main study ..... 67
3.5  The stimuli of the main study in the valence-arousal space ..... 70
3.6  Boxplots of subjective ratings for the stimulation conditions ..... 72
3.7  Correlations between theta, beta, and gamma bands and valence ..... 75
3.8  Correlations between theta, alpha, and beta bands and arousal ..... 76
3.9  Correlations between beta and gamma bands and preference ..... 77
4.1  The selected IAPS/IADS stimuli subsets on the valence-arousal plane ..... 92
4.2  The distribution of norm valence and arousal ratings of the stimuli subsets ..... 92
4.3  Participant valence and arousal ratings for each condition ..... 96
4.4  Modality-specific affective correlates for frontal and parietal alpha power ..... 98
4.5  Main effect of modality for frontal and parietal alpha power ..... 99
4.6  Main effects of valence over all modalities ..... 100
4.7  Main effects of arousal over all modalities ..... 102
4.8  Main effect of modality in delta band ..... 103
4.9  Main effect of modality in alpha band ..... 103
4.10 Main effect of modality in beta band ..... 104
4.11 Consequences of context-specific correlates for aBCI ..... 111

List of Tables

1.1  Overview of studies on affective BCI (Part 1) ..... 10
1.2  Overview of studies on affective BCI (Part 2) ..... 11
2.1  The manual and physiological controls used for each level ..... 37
2.2  The output of the BCI to the application ..... 40
2.3  The experiment design with the factors of feedback and difficulty ..... 45
2.4  Repeated-measures ANOVA Game Experience Questionnaire (feedback/difficulty) ..... 47
2.5  Repeated-measures ANOVA heart rate (feedback/difficulty) ..... 48
2.6  Repeated-measures ANOVA EEG alpha power (feedback/difficulty) ..... 49
3.1  The list of affective words used for stimulus selection ..... 61
3.2  The list of music videos that were selected for the main experiment ..... 64
3.3  Mean subjective ratings of the stimulation conditions ..... 71
3.4  Wilcoxon Signed Rank analysis of the subjective ratings ..... 71
3.5  Correlations between the rating scales ..... 72
3.6  Repeated-measures ANOVA for the physiological signals ..... 73
3.7  General correlations between ratings and frequency band power ..... 75
3.8  Correlations between frontal alpha asymmetries and valence ..... 76
3.9  General correlations between ratings and baseline-corrected frequency band power ..... 78
4.1  Stimulus IDs of the IAPS and IADS used in the experimental conditions ..... 93
4.2  Mean (std) for norm valence and arousal ratings for each condition ..... 94
4.3  Mean (std) of participant valence and arousal ratings for each condition ..... 96
4.4  Repeated-measures ANOVA heart rate (valence/modality) ..... 97
4.5  Repeated-measures ANOVA heart rate (arousal/modality) ..... 98
4.6  Repeated-measures ANOVA frontal alpha asymmetry (valence) ..... 100
4.7  Repeated-measures ANOVA EEG frequency band power (valence) ..... 101
4.8  Repeated-measures ANOVA EEG frequency band power (arousal) ..... 102
A.1  Correlations between frequency band power and valence ..... 130
A.2  Correlations between frequency band power and arousal ..... 131
A.3  Correlations between frequency band power and preference ..... 132
B.1  Repeated-measures ANOVA delta power for modality ..... 134
B.2  Post-hoc T-tests delta band effects of modality ..... 134
B.3  Repeated-measures ANOVA alpha power for modality ..... 135
B.4  Post-hoc T-tests alpha band effects of modality ..... 136
B.5  Repeated-measures ANOVA beta power for modality ..... 137
B.6  Post-hoc T-tests beta band effects of modality ..... 137
Consistency in physiological stress responses and electromyographic activity during induced stress exposure in women and men. Integrative Physiological and Behavioral Science, 39(2):105–118, 2004. K RAUSE , C. M. Cognition- and memory-related ERD/ERS responses in the auditory stimulus modality. Progress in Brain Research, 159:197–207, 2006. K REIFELTS , B., T. E THOFER, W. G RODD, M. E RB, and D. W ILDGRUBER. Audiovisual integration of emotional signals in voice and face: an event-related fMRI study. NeuroImage, 37(4):1445–56, 2007. K RUMHANSL , C. L. An exploratory study of musical emotions and psychophysiology. Canadian Journal of Experimental Psychology, 51(4):336–53, 1997. VAN DE L AAR , B., D. P LASS -O UDE B OS , B. R EUDERINK , M. P OEL , and A. N IJHOLT . How much control is enough? Optimizing fun with unreliable input. Technical report, Centre for Telematics and Information Technology, University of Twente, 2011. L ALOR , E. C., S. P. K ELLY, C. F INUCANE, R. B URKE, R. S MITH, R. B. R EILLY, and G. M C D ARBY. Steady-State VEP-Based Brain-Computer Interface Control in an Immersive 3D Gaming Environment. EURASIP Journal on Advances in Signal Processing, 2005(19):3156– 3164, 2005. L ANG, P. J., M. M. B RADLEY, and B. N. C UTHBERT. International Affective Picture System (IAPS): Technical manual and affective ratings. Technical report, University of Florida, Center for Research in Psychophysiology, Gainesville, Fl, USA., 1999. BIBLIOGRAPHY | L ANG , P. J., M. K. G REENWALD, M. M. B RADLEY, and A. O. H AMM. Looking at pictures: Affective, facial, visceral, and behavioral reactions. Psychophysiology, 30(3):261–273, 1993. L ARSSON , S. E., R. L ARSSON, Q. Z HANG, H. C AI, and P. A. O BERG. Effects of psychophysiological stress on trapezius muscles blood flow and electromyography during static load. European Journal of Applied Physiology and Occupational Physiology, 71(6):493–8, 1995. L AUFS , H., A. K LEINSCHMIDT, A. B EYERLE, E. E GER, A. S ALEK-H ADDADI, C. 
P REIBISCH, and K. K RAKOW. EEG-correlated fMRI of human alpha activity. NeuroImage, 19(4):1463– 76, 2003. L AURSEN , B., B. R. J ENSEN, A. H. G ARDE, and A. H. J Ø RGENSEN. Effect of mental and physical demands on muscular activity during the use of a computer mouse and a keyboard. Scandinavian Journal of Work, Environment & Health, 28(4):215–21, 2002. L ECUYER , A., F. L OTTE, R. R EILLY, R. L EEB, M. H IROSE, and M. S LATER. Brain-Computer Interfaces, Virtual Reality, and Videogames. Computer, 41(10):66–72, 2008. L E D OUX , J. The emotional brain. Number 4. Simon & Schuster New York, 1996. L EEB , R., D. F RIEDMAN, G. R. M ÜLLER -P UTZ, R. S CHERER, M. S LATER, and G. P FURTSCHELLER. Self-paced (asynchronous) BCI control of a wheelchair in virtual environments: a case study with a tetraplegic. Computational Intelligence and Neuroscience, 2007:79642, 2007. L EHTEL Ä , L., R. S ALMELIN, and R. H ARI. Evidence for reactive magnetic 10-Hz rhythm in the human auditory cortex. Neuroscience Letters, 222(2):111–114, 1997. L I , M. and B.-L. L U. Emotion classification based on gamma-band EEG. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society., pages 1223–1226, Minneapolis, MN, USA, 2009. IEEE Computer Society Press. L IN , Y.-P., J.-R. D UANN, J.-H. C HEN, and T.-P. J UNG. Electroencephalographic dynamics of musical emotion perception revealed by independent spectral components. NeuroReport, 21(6):410–415, 2010. L IN , Y. P., C. H. WANG, T. L. W U, S. K. J ENG, and J. H. C HEN. EEG-based emotion recognition in music listening: A comparison of schemes for multiclass support vector machine. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 489–492. IEEE Computer Society, 2009. L INDQUIST, K. A., T. D. WAGER, H. KOBER, E. B LISS - MOREAU, and L. F. B ARRETT. The brain basis of emotion : A meta-analytic review. 
Behavioral and Brain Sciences, 173(4): 1–86, 2011. L OGESWARAN , N. and J. B HATTACHARYA. Crossmodal transfer of emotion by music. Neuroscience Letters, 455(2):129–33, 2009. 155 156 | BIBLIOGRAPHY L OTTE , F., M. C ONGEDO, A. L ÉCUYER, F. L AMARCHE, and B. A RNALDI. A review of classification algorithms for EEG-based brain-computer interfaces. Journal of Neural Engineering, 4(2):R1–R13, 2007. L OTZE , M. and U. H ALSBAND. Motor imagery. Journal of Physiology, 99(4-6):386–95, 2006. L UNDBERG, U., R. KADEFORS, B. M ELIN, G. PALMERUD, P. H ASSMEN, M. E NGSTROM, and I. E. D OHNS. Psychophysiological stress and EMG activity of the trapezius muscle. International Journal of Behavioral Medicine, 1(4):354–370, 1994. L UNDQVIST, L.-O., F. C ARLSSON, P. H ILMERSSON, and P. N. J USLIN. Emotional responses to music: experience, expression, and physiology. Psychology of Music, 37(1):61–90, 2008. L UO, Q., D. M ITCHELL, X. C HENG, K. M ONDILLO, D. M CCAFFREY, T. H OLROYD, F. C ARVER, R. C OPPOLA, and J. B LAIR. Visual awareness, emotion, and gamma band synchronization. Cerebral cortex, 19(8):1896–904, 2009. L UTZ , A., L. L. G REISCHAR, N. B. R AWLINGS, M. R ICARD, and R. J. D AVIDSON. Long-term meditators self-induce high-amplitude gamma synchrony during mental practice. Proceedings of the National Academy of Sciences of the United States of America, 101(46):16369–73, 2004. M AGN ÉE , M., J. S TEKELENBURG, C. K EMNER, and B. DE G ELDER. Similar facial electromyographic responses to faces, voices, and body expressions. Neuroreport, 18(4):369–372, 2007. M AKEIG, S., G. L ESLIE, T. M ULLEN, D. S ARMA, N. B IGDELY-S HAMLO, and C. KOTHE. First demonstration of a musical emotion BCI. In Proceedings of the 4th International Conference on Affective Computing and Intelligent Interaction and Workshops, 2011 (ACII 2011), 2nd Workshop on Affective Brain-Computer Interfaces, volume 6975 of Lecture Notes in Computer Science, pages 487–496, Memphis, TN, USA, 2011. Springer. M ASSARO, D. W. 
and P. B. E GAN. Perceiving affect from the voice and the face. Psychonomic Bulletin & Review, 3(2):215–221, 1996. M AYHEW, S. D., S. G. D IRCKX, R. K. N IAZY, G. D. I ANNETTI, and R. G. W ISE. EEG signatures of auditory activity correlate with simultaneously recorded fMRI responses in humans. NeuroImage, 49(1):849–64, 2010. M AZAHERI , A. and T. W. P ICTON. EEG spectral dynamics during discrimination of auditory and visual targets. Brain Research, 24(1):81–96, 2005. M ISKOVIC , V. and L. A . S CHMIDT. Cross-regional cortical synchronization during affective image viewing. Brain Research, 1362:102–11, 2010. M IZUHARA , H. and Y. YAMAGUCHI. Human cortical circuits for central executive function emerge by theta phase synchronization. NeuroImage, 36(1):232–44, 2007. BIBLIOGRAPHY | M OORE , N. The Neurotherapy of Anxiety Disorders. Journal of Adult Development, 12(2): 147–154, 2005. M OORE , N. C. A review of EEG biofeedback treatment of anxiety disorders. Clinical Electroencephalography, 31(1):1–6, 2000. M ORGAN , S. T., J. C. H ANSEN, and S. A. H ILLYARD. Selective attention to stimulus location modulates the steady-state visual evoked potential. Proceedings of the National Academy of Sciences of the United States of America, 93(10):4770–4774, 1996. M ÜHL , C., E. VAN DEN B ROEK, A.-M. B ROUWER, F. N IJBOER, N. VAN W OUWE, and D. H EYLEN. Multi-modal affect induction for affective brain-computer interfaces. In DM ELLO, S., A. G RAESSER, B. S CHULLER, and J.-C. M ARTIN, editors, Proceedings of the 4th International Conference on Affective Computing and Intelligent Interaction and Workshops, 2011 (ACII 2011), volume 6974 of Lecture Notes in Computer Science, pages 235–245, Memphis, TN, USA, 2011a. Springer Berlin Heidelberg. M ÜHL , C., A.-M. B ROUWER, N. VAN W OUWE, E. L. VAN DEN B ROEK, F. N IJBOER, and D. K. J. H EYLEN. Modality-specific Affective Responses and their Implications for Affective BCI. In M ÜLLER -P UTZ , G. R., R. S CHERER, M. B ILLINGER, A. 
K REILINGER, V. KAISER, and C. N EUPER, editors, Proceedings of the Fifth International Brain-Computer Interface Conference 2011, Graz, Austria, pages 120–123, Graz, Austria, 2011b. Verlag der Technischen Universität. M ÜHL , C., H. G ÜRK ÖK, D. P LASS -O UDE B OS, M. T HURLINGS, L. S CHERFFIG, M. D UVINAGE, A. E LBAKYAN, S. KANG, M. P OEL, and D. H EYLEN. Bacteria Hunt: A multimodal, multiparadigm BCI game. In Proceedings of the 5th International Summer Workshop on Multimodal Interfaces (enterface2009), number 1, Genua, Italy, 2009a. University of Genua. M ÜHL , C., H. G ÜRK ÖK, D. P LASS -O UDE B OS, M. E. T HURLINGS, L. S CHERFFIG, M. D UVINAGE, A. A. E LBAKYAN, S. KANG, M. P OEL, and D. H EYLEN. Bacteria Hunt Evaluating multi-paradigm BCI interaction. Journal on Multimodal User Interfaces, 4(1): 11–25, 2010. M ÜHL , C. and D. H EYLEN. Cross-modal elicitation of affective experience. In C OHN , J., A. N IJHOLT, and M. PANTIC, editors, Proceedings of the 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, 2009 (ACII 2009), Amsterdam, The Netherlands, 2009. IEEE Computer Society Press. M ÜHL , C., D. K. J. H EYLEN, and A. N IJHOLT. Workshop on Affective Brain-Computer Interfaces 2009 (aBCI 2009). In C OHN , J., A. N IJHOLT, and M. PANTIC, editors, Proceedings of the 3rd International Conference on Affective Computing and Intelligent Interaction (ACII2009), volume 2, Amsterdam, The Netherlands, 2009b. IEEE Computer Society. M ÜHL , C., A. N IJHOLT, B. Z. A LLISON, S. D UNNE, and D. K. J. H EYLEN. Workshop on Affective Brain-Computer Interfaces 2011 (aBCI 2011). In M ELLO, S. D., A. G RAESSER, 157 158 | BIBLIOGRAPHY B. S CHULLER, and J.-C. M ARTIN, editors, Proceedings 4th International Conference on Affective Computing and Intelligent Interaction and Workshops, 2011 (ACII 2011), volume 6975 of Lecture Notes in Computer Science, page 435, Memphis, TN, USA, 2011c. Springer. M ULHOLLAND , T., D. G OODMAN, and R. B OUDROT. 
Attention and regulation of EEG alphaattenuation responses. Biofeedback and Self-regulation, 8(4):585–600, 1983. M ÜLLER , M., T. P ICTON, P. VALDES -S OSA, J. R IERA, W. T EDER -S ÄLEJ ÄRVI, and S. H ILL YARD . Effects of spatial selective attention on the steady-state visual evoked potential in the 20-28 Hz range. Brain research. Cognitive Brain Research, 6(4):249–261, 1998. M ÜLLER , M. M. and T. G RUBER. Induced gamma-band responses in the human EEG are related to attentional information processing. Visual Cognition, 8(3-5):579–592, 2001. M ÜLLER , M. M., A. K EIL, T. G RUBER, and T. E LBERT. Processing of affective pictures modulates right-hemispheric gamma band EEG activity. Clinical Neurophysiology, 110 (11):1913–1920, 1999. M ÜLLER -P UTZ , G. R., R. S CHERER, C. B RAUNEIS, and G. P FURTSCHELLER. Steady-state visual evoked potential (SSVEP)-based communication: impact of harmonic frequency components. Journal of Neural Engineering, 2(4):123–130, 2005. M URUGAPPAN , M. Inferring of Human Emotional States using Multichannel EEG. European Journal of Scientific Research, 48(2):281–299, 2010. N IEDERMEYER , E. The normal EEG of the waking adult. In N IEDERMEYER , E. and F. L OPES D A S ILVA, editors, Electroencephalography Basic Principles Clinical Applications and Related Fields, chapter 9, pages 167–192. Williams & Wilkins, 2005. N IJBOER , F., J. C LAUSEN, B. Z. A LLISON, and P. H ASELAGER. The Asilomar Survey: Stakeholders Opinions on Ethical Issues Related to Brain-Computer Interfacing. Neuroethics, 2011. N IJHOLT, A., B. R EUDERINK, and D. O UDE B OS. Turning Shortcomings into Challenges: Brain-Computer Interfaces for Games. In Intelligent Technologies for Interactive Entertainment, pages 153–168. 2009. N OWLIS , D. P. and J. KAMIYA. The control of electroencephalographic alpha rhythms through auditory feedback and the associated mental activity. Psychophysiology, 6(4): 476–484, 1970. O LOFSSON , J. K., S. N ORDIN, H. S EQUEIRA, and J. P OLICH. 
Affective picture processing: an integrative review of ERP findings. Biological Psychology, 77(3):247–265, 2008. O NTON , J. and S. M AKEIG. High-frequency Broadband Modulations of Electroencephalographic Spectra. Frontiers in Human Neuroscience, 3, 2009. BIBLIOGRAPHY | O RGS , G., K. L ANGE, J.-H. D OMBROWSKI, and M. H EIL. Is conceptual priming for environmental sounds obligatory? International Journal of Psychophysiology, 65(2):162–166, 2007. O RTONY, A., G. L. C LORE, and A. C OLLINS. The cognitive structure of emotions, volume 54. Cambridge University Press, Cambridge, USA, 1988. O SIPOVA , D., A. TAKASHIMA, R. O OSTENVELD, G. F ERN ÁNDEZ, E. M ARIS, and O. J ENSEN. Theta and gamma oscillations predict encoding and retrieval of declarative memory. The Journal of Neuroscience, 26(28):7523–31, 2006. O YA , H., H. KAWASAKI, M. A. H OWARD, and R. A DOLPHS. Electrophysiological responses in the human amygdala discriminate emotion categories of complex visual stimuli. Journal of Neuroscience, 22(21):9502–9512, 2002. PALVA , S., M. M. PALVA, and J. M. PALVA. New vistas for alpha-frequency band oscillations. Trends in Neurosciences, 30(4):150–158, 2007. PANKSEPP, J. and G. B ERNATZKY. Emotional sounds and the brain: the neuro-affective foundations of musical appreciation. Behavioral Processes, 60(2):133–155, 2002. PARROTT, W. G. Emotions in Social Psychology: Essential Readings. Psychology Press, Philadelphia, USA, 2001. P ETRANTONAKIS , P. C. and L. J. H ADJILEONTIADIS. Emotion Recognition from Brain Signals Using Hybrid Adaptive Filtering and Higher Order Crossings Analysis. IEEE Transactions on Affective Computing, 1(2):81–97, 2010. P FURTSCHELLER , G. Post-movement beta synchronization. A correlate of an idling motor area? Electroencephalography and Clinical Neurophysiology, 98(4):281–293, 1996. P FURTSCHELLER , G., B. Z. A LLISON, C. B RUNNER, G. B AUERNFEIND, T. S OLIS -E SCALANTE, R. S CHERER, T. O. Z ANDER, G. M UELLER -P UTZ, C. N EUPER, and N. B IRBAUMER. 
The hybrid BCI. Frontiers in Neuroscience, 4:30, 2010. P FURTSCHELLER , G., C. B RUNNER, A. S CHL ÖGL, and F. H. L OPES DA S ILVA. Mu rhythm (de)synchronization and EEG single-trial classification of different motor imagery tasks. NeuroImage, 31(1):153–9, 2006. P FURTSCHELLER , G. and F. L OPES DA S ILVA. Event-related EEG/MEG synchronization and desynchronization: basic principles. Clinical Neurophysiology, 110(11):1842–1857, 1999. P FURTSCHELLER , G., G. R. M ÜLLER -P UTZ, J. P FURTSCHELLER, and R. R UPP. EEG-Based Asynchronous BCI Controls Functional Electrical Stimulation in a Tetraplegic Patient. EURASIP Journal on Advances in Signal Processing, 2005(19):3152–3155, 2005. P FURTSCHELLER , G. and C. N EUPER. Motor imagery and direct brain-computer communication. Proceedings of the IEEE, 89(7):1123–1134, 2001. 159 160 | BIBLIOGRAPHY P HELPS , M. E, editor. PET. Springer New York, New York, NY, 2006. P HILLIPS , M. Neurobiology of emotion perception I: the neural basis of normal emotion perception. Biological Psychiatry, 54(5):504–514, 2003. P ICARD , R. W. Affective Computing. The MIT Press, Massachusetts, USA, 1997. P ICARD , R. W. R., E. V YZAS, and J. H EALEY. Toward Machine Emotional Intelligence: Analysis of Affective Physiological State. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10):1175–1191, 2001. P LASS -O UDE B OS , D., B. R EUDERINK, B. L AAR, H. G ÜRK ÖK, C. M ÜHL, M. P OEL, A. N I JHOLT , and D. H EYLEN . Brain-Computer Interfacing and Games. In TAN , D. S. and A. N I JHOLT , editors, Brain-Computer Interfaces: Applying Our Minds to Human-Computer Interaction, chapter 10, pages 149–178. Springer London, 2010. P OLICH , J. Updating P300: an integrative theory of P3a and P3b. Clinical Neurophysiology, 118(10):2128–2148, 2007. P OURTOIS , G., B. DE G ELDER, A. B OL, and M. C ROMMELINCK. Perception of facial expressions and voices and of their combination in the human brain. Cortex, 41(1):49–59, 2005. P OURTOIS , G. and P. 
V UILLEUMIER. Dynamics of emotional effects on spatial attention in the human visual cortex. Progress in Brain Research, 156:67–91, 2006. R AICHLE , M. E. Functional Brain Imaging and Human Brain Function. The Journal of Neuroscience, 23(10):3959–3962, 2003. R EEVES , B. and C. N ASS. The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places. Cambridge University Press, NY, USA, 1996. R EGAN , D. Human brain electrophysiology: Evoked potentials and evoked magnetic fields in science and medicine. Appleton & Lange, New York, NY, USA, 1989. R EUDERINK , B., C. M ÜHL, and M. P OEL. Valence, arousal and dominance in the EEG during game play. International Journal of Autonomous and Adaptive Communication Systems, 6(1), 2013. R EUDERINK , B., A. N IJHOLT, and M. P OEL. Affective Pacman: A Frustrating Game for Brain-Computer Interface Experiments. In Proceedings of the 3rd International Conference on Intelligent Technologies for Interactive Entertainment (INTETAIN 2009), pages 221–227, Amsterdam, The Netherlands, 2009. R ICE , K. M., E. B. B LANCHARD, and M. P URCELL. Biofeedback treatments of generalized anxiety disorder: Preliminary results. Applied Psychophysiology and Biofeedback, 18(2): 93–105, 1993. BIBLIOGRAPHY | R ISS ÉN , D., B. M ELIN, L. S ANDSJ Ö, I. D OHNS, and U. L UNDBERG. Surface EMG and psychophysiological stress reactions in women during repetitive work. European Journal of Applied Physiology, 83(2-3):215–22, 2000. R UDRAUF, D., J.-P. L ACHAUX, A. D AMASIO, S. B AILLET, L. H UGUEVILLE, J. M ARTINERIE, H. D AMASIO, and B. R ENAULT. Enter feelings: somatosensory responses following early stages of visual induction of emotion. International Journal of Psychophysiology, 72(1): 13–23, 2009. R USSEL , J. A. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161–1178, 1980. R USSEL , J. A. and L. F. B ARRETT. 
Core affect, prototypical emotions, and other things called emotion: Disecting the elephant. Journal of Personality and Social Psychology, 76 (5):805–819, 1999. R UTKOWSKI , T. M., T. TANAKA, A. C ICHOCKI, D. E RICKSON, J. C AO, and D. P. M ANDIC. Interactive component extraction from fEEG, fNIRS and peripheral biosignals for affective brainmachine interfacing paradigms. Computers in Human Behavior, 27(5):1512–1518, 2011. S ABATINELLI , D., P. J. J. L ANG, A. K EIL, and M. M. M. B RADLEY. Emotional Perception: Correlation of Functional MRI and Event-Related Potentials. Cerebral Cortex, 17(5):1085– 1091, 2006. S AMMLER , D., M. G RIGUTSCH, T. F RITZ, and S. KOELSCH. Music and emotion: Electrophysiological correlates of the processing of pleasant and unpleasant music. Psychophysiology, 44(2):293–304, 2007. S ANDER , D., D. G RANDJEAN, and K. R. S CHERER. A systems approach to appraisal mechanisms in emotion. Neural Networks, 18(4):317–352, 2005. S ARLO, M., G. B UODO, S. P OLI, and D. PALOMBA. Changes in EEG alpha power to different disgust elicitors: the specificity of mutilations. Neuroscience Letters, 382(3):291–6, 2005. S ATO, N., T. J. O ZAKI, Y. S OMEYA, K. A NAMI, S. O GAWA, H. M IZUHARA, and Y. YAM Subsequent memory-dependent EEG theta correlates to parahippocampal blood oxygenation level-dependent response. Neuroreport, 21(3):168–72, 2010. AGUCHI . S ATO, N. and Y. YAMAGUCHI. Theta synchronization networks emerge during human object-place memory encoding. Neuroreport, 18(5):419–24, 2007. S CHERER , K. R. What are emotions? And how can they be measured? Social Science Information, 44(4):695–729, 2005. S CHLEIFER , L. M., T. W. S PALDING, S. E. K ERICK, J. R. C RAM, R. L EY, and B. D. H ATFIELD. Mental stress and trapezius muscle activation under psychomotor challenge: a focus on EMG gaps during computer work. Psychophysiology, 45(3):356–65, 2008. 161 162 | BIBLIOGRAPHY S CHL ÖGL , A. and C. B RUNNER. 
BioSig: A Free and Open Source Software Library for BCI Research. Computer, 41(10):44–50, 2008. S CHL ÖGL , A., C. K EINRATH, D. Z IMMERMANN, R. S CHERER, R. L EEB, and G. P FURTSCHELLER. A fully automated correction method of EOG artifacts in EEG recordings. Clinical Neurophysiology, 118(1):98–104, 2007. S CHMIDT, L. A. and L. J. T RAINOR. Frontal brain electrical activity (EEG) distinguishes valence and intensity of musical emotions. Cognition and Emotion, pages 487–500, 2001. S CHNEIDER , T. R., S. D EBENER, R. O OSTENVELD, and A. K. E NGEL. Enhanced EEG gammaband activity reflects multisensory semantic matching in visual-to-auditory object priming. NeuroImage, 42(3):1244–1254, 2008. S CHOMER , D. L. and F. L. D. S ILVA. Niedermeyer’s Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. Lippincott Williams & Wilkins, 2010. S CHULZ , E., A. Z HERDIN, L. T IEMANN, C. P LANT, and M. P LONER. Decoding an Individual’s Sensitivity to Pain from the Multivariate Analysis of EEG Data. Cerebral Cortex, 22(5): 1118–1123, 2012. S CHUPP, H. T., B. N. C UTHBERT, M. M. B RADLEY, J. T. C ACIOPPO, T. I TO, and P. J. L ANG. Affective picture processing: the late positive potential is modulated by motivational relevance. Psychophysiology, 37(2):257–261, 2000. S CH ÜRMANN , M. Functional aspects of alpha oscillations in the EEG. International Journal of Psychophysiology, 39(2-3):151–158, 2001. S CH ÜRMANN , M., C. B AAR -E ROGLU, V. KOLEV, and E. B AAR. Delta responses and cognitive processing: single-trial evaluations of human visual P300. International Journal of Psychophysiology, 39(2-3):229–239, 2001. S CHUSTER , T., S. G RUSS, H. K ESSLER, A. S CHECK, H. H OFFMANN, and H. T RAUE. EEG: pattern classification during emotional picture processing. In Proceedings of the 3rd International Conference on PErvasive Technologies Related to Assistive Environments (PETRA ’10), Samos, Greece, 2010. ACM Press. S CHUTTER , D. J., P. P UTMAN, E. 
H ERMANS, and J. VAN H ONK. Parietal electroencephalogram beta asymmetry and selective attention to angry facial expressions in healthy human subjects. Neuroscience Letters, 314(1-2):13–16, 2001. S ENKOWSKI , D., J. KAUTZ, M. H AUCK, R. Z IMMERMANN, and A. K. E NGEL. Emotional Facial Expressions Modulate Pain-Induced Beta and Gamma Oscillations in Sensorimotor Cortex. Journal of Neuroscience, 31(41):14542–14550, 2011. S ENKOWSKI , D., T. R. S CHNEIDER, J. J. F OXE, and A. K. E NGEL. Crossmodal binding through neural coherence: implications for multisensory processing. Trends in Neurosciences, 31(8):401–9, 2008. BIBLIOGRAPHY | S ENKOWSKI , D., T. R. S CHNEIDER, F. TANDLER, and A. K. E NGEL. Gamma-band activity reflects multisensory matching in working memory. Experimental Brain Research, 198(23):363–372, 2009. S HAW, J. C. Intention as a component of the alpha-rhythm response to mental activity. International Journal of Psychophysiology, 24(1-2):7–23, 1996. S ILBERMAN , E. Hemispheric lateralization of functions related to emotion. Brain and Cognition, 5(3):322–353, 1986. S KIDMORE , T. A. and H. W. H ILL. The evoked potential human-computer interface. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 13(1):407–408, 1991. S OKHADZE , T. M., R. L. C ANNON, and D. L. T RUDEAU. EEG Biofeedback as a Treatment for Substance Use Disorders: Review, Rating of Efficacy and Recommendations for Further Research. Journal of Neurotherapy: Investigations in Neuromodulation, Neurofeedback and Applied Neuroscience, 12(1):5–43, 2008. S OLEYMANI , M., J. J. K IERKELS, G. C HANEL, and T. P UN. A Bayesian framework for video affective representation. In C OHN , J., A. N IJHOLT, and M. PANTIC, editors, Proceedings of the 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII2009), Amsterdam, The Netherlands, 2009. IEEE Computer Society Press. S OLEYMANI , M., S. KOELSTRA, I. 
PATRAS, and T. P UN. Continuous emotion detection in response to music videos. In Proceedings of the 9th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2011), 1st International Workshop on Emotion Synthesis, rePresentation, and Analysis in Continuous spacE (EmoSPACE), pages 803–808, Santa Barbara, CA, USA, 2011a. IEEE. S OLEYMANI , M., J. L ICHTENAUER, T. P UN, and M. PANTIC. A Multi-Modal Affective Database for Affect Recognition and Implicit Tagging. IEEE Transactions on Affective Computing, 3(99):1–1, 2011b. S PRECKELMEYER , K. N., M. KUTAS, T. P. U RBACH, E. A LTENM ÜLLER, and T. F. M ÜNTE. Combined perception of emotion in pictures and musical sounds. Brain Research, 1070 (1):160–170, 2006. S TEMMLER , G., M. H ELDMANN, C. A. PAULS, and T. S CHERER. Constraints for emotion specificity in fear and anger: the context counts. Psychophysiology, 38(2):275–291, 2001. S TERIADE , M., D. A. M C C ORMICK, and T. J. S EJNOWSKI. Thalamocortical oscillations in the sleeping and aroused brain. Science, 262(5134):679–685, 1993. S TERMAN , M. and T. E GNER. Foundation and Practice of Neurofeedback for the Treatment of Epilepsy. Applied Psychophysiology and Biofeedback, 31(1):21–35, 2006. S TERN , R. M., W. J. R AY, and K. S. Q UIGLEY. Psychophysiological Recording. Oxford University Press, Inc., second edition, 2001. 163 164 | BIBLIOGRAPHY TAKAHASHI , K. Remarks on Emotion Recognition from BioPotential Signals. In M UKHOPAD HYAY, S. and S. G. G UPTA , editors, Proceedings of the 2nd International Conference on Autonomous Robots and Agents, pages 186–191, Palmerston North, New Zealand, 2004. TALLON -B AUDRY, C., O. B ERTRAND, T. B AUDRY, and B ERTRAND. Oscillatory gamma activity in humans and its role in object representation. Trends in Cognitive Sciences, 3(4):151–162, 1999. TANGERMANN , M., DAURRE , and K.-R. M. K RAULEDAT, K. G RZESKA, M. S AGEBAUM, B. B LANKERTZ, C. V I M ÜLLER. Playing pinball with non-invasive BCI. 
Advances in Neural Information Processing Systems, 21:1641–1648, 2009.
Teplan, M. and A. Krakovska. EEG features of psycho-physiological relaxation. In Proceedings of the 2nd International Symposium on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2009), Bratislava, Slovakia, 2009. IEEE.
Thurlings, M. E., J. B. F. Erp, A.-M. Brouwer, and P. J. Werkhoven. EEG-Based Navigation from a Human Factors Perspective. In Tan, D. S. and A. Nijholt, editors, Brain-Computer Interfaces: Applying Our Minds to Human-Computer Interaction, chapter 5, pages 71–86. Springer London, 2010.
Tsang, C. D., L. J. Trainor, D. L. Santesso, S. L. Tasker, and L. A. Schmidt. Frontal EEG Responses as a Function of Affective Musical Features. Annals of the New York Academy of Sciences, 930(1):439–442, 2006.
Van den Stock, J., I. Peretz, J. Grèzes, and B. de Gelder. Instrumental music influences recognition of emotional body language. Brain Topography, 21(3-4):216–220, 2009.
Van der Zwaag, M. D., J. H. D. M. Westerink, and E. L. van den Broek. Emotional and psychophysiological responses to tempo, mode, and percussiveness. Musicae Scientiae, 15(2):250–269, 2011.
Vidal, J. J. Toward direct brain-computer communication. Annual Review of Biophysics and Bioengineering, 2:157–180, 1973.
Vidaurre, C. and B. Blankertz. Towards a Cure for BCI Illiteracy. Brain Topography, 23(2):194–198, 2010.
Vines, B. W., C. L. Krumhansl, M. M. Wanderley, and D. J. Levitin. Cross-modal interactions in the perception of musical performance. Cognition, 101(1):80–113, 2006.
Vuilleumier, P. How brains beware: neural mechanisms of emotional attention. Trends in Cognitive Sciences, 9(12):585–594, 2005.
Vytal, K. and S. Hamann. Neuroimaging support for discrete neural correlates of basic emotions: a voxel-based meta-analysis. Journal of Cognitive Neuroscience, 22(12):2864–2885, 2010.
Wahlström, J., A. Lindegård, G. Ahlborg, A. Ekman, and M. Hagberg. Perceived muscular tension, emotional stress, psychological demands and physical load during VDU work. International Archives of Occupational and Environmental Health, 76(8):584–590, 2003.
Weisz, N., T. Hartmann, N. Müller, I. Lorenz, and J. Obleser. Alpha rhythms in audition: cognitive and clinical perspectives. Frontiers in Psychology, 2:73, 2011.
Welch, P. The use of fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms. IEEE Transactions on Audio and Electroacoustics, 15(2):70–73, 1967.
Westermann, R., K. Spies, G. Stahl, and F. W. Hesse. Relative effectiveness and validity of mood induction procedures: a meta-analysis. European Journal of Social Psychology, 26(4):557–580, 1996.
Wijsman, J., B. Grundlehner, J. Penders, and H. Hermens. Trapezius muscle EMG as predictor of mental stress. In Wireless Health 2010 (WH '10), page 155, San Diego, CA, USA, 2010. ACM Press.
Winkler, I., M. Jäger, V. Mihajlovic, and T. Tsoneva. Frontal EEG Asymmetry Based Classification of Emotional Valence using Common Spatial Patterns. World Academy of Science, Engineering and Technology, 69:373–378, 2010.
Wolf, K., R. Mass, T. Ingenbleek, F. Kiefer, D. Naber, and K. Wiedemann. The facial pattern of disgust, appetence, excited joy and relaxed joy: An improved facial EMG study. Scandinavian Journal of Psychology, 46(5):403–409, 2005.
Wolpaw, J. R., N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan. Brain-computer interfaces for communication and control. Clinical Neurophysiology, 113(6):767–791, 2002.
Yordanova, J., V. Kolev, O. A. Rosso, M. Schürmann, O. W. Sakowitz, M. Ozgören, and E. Basar. Wavelet entropy analysis of event-related potentials indicates modality-independent theta dominance. Journal of Neuroscience Methods, 117(1):99–109, 2002.
Zander, T. O. and S. Jatzev. Detecting affective covert user states with passive Brain-Computer Interfaces. In Cohn, J., A. Nijholt, and M. Pantic, editors, Proceedings of the 3rd International Conference on Affective Computing and Intelligent Interaction (ACII 2009), Part II, Amsterdam, The Netherlands, 2009. IEEE Computer Society Press.
Zander, T. O., M. D. Klippel, and R. Scherer. Towards multimodal error responses. In Proceedings of the 13th International Conference on Multimodal Interfaces (ICMI '11), page 53, New York, New York, USA, 2011. ACM Press.
Zeng, Z., M. Pantic, G. Roisman, and T. Huang. A Survey of Affect Recognition Methods: Audio, Visual, and Spontaneous Expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1):39–58, 2009.
Zhang, Q. and M. Lee. Analysis of positive and negative emotions in natural scene using brain activity and GIST. Neurocomputing, 72(4-6):1302–1306, 2009.
Zimny, G. H. and E. W. Weidenfeller. Effects of music upon GSR and heart-rate. The American Journal of Psychology, 76:311–314, 1963.
SIKS Dissertation Series

2009
2009-01 Rasa Jurgelenaite (RUN) Symmetric Causal Independence Models
2009-02 Willem Robert van Hage (VU) Evaluating Ontology-Alignment Techniques
2009-03 Hans Stol (UvT) A Framework for Evidence-based Policy Making Using IT
2009-04 Josephine Nabukenya (RUN) Improving the Quality of Organisational Policy Making using Collaboration Engineering
2009-05 Sietse Overbeek (RUN) Bridging Supply and Demand for Knowledge Intensive Tasks Based on Knowledge, Cognition, and Quality
2009-06 Muhammad Subianto (UU) Understanding Classification
2009-07 Ronald Poppe (UT) Discriminative Vision-Based Recovery and Recognition of Human Motion
2009-08 Volker Nannen (VU) Evolutionary Agent-Based Policy Analysis in Dynamic Environments
2009-09 Benjamin Kanagwa (RUN) Design, Discovery and Construction of Service-oriented Systems
2009-10 Jan Wielemaker (UVA) Logic programming for knowledge-intensive interactive applications
2009-11 Alexander Boer (UVA) Legal Theory, Sources of Law & the Semantic Web
2009-12 Peter Massuthe (TUE, HU Berlin) Operating Guidelines for Services
2009-13 Steven de Jong (UM) Fairness in Multi-Agent Systems
2009-14 Maksym Korotkiy (VU) From ontology-enabled services to service-enabled ontologies (making ontologies work in e-science with ONTO-SOA)
2009-15 Rinke Hoekstra (UVA) Ontology Representation - Design Patterns and Ontologies that Make Sense
2009-16 Fritz Reul (UvT) New Architectures in Computer Chess
2009-17 Laurens van der Maaten (UvT) Feature Extraction from Visual Data
2009-18 Fabian Groffen (CWI) Armada, An Evolving Database System
2009-19 Valentin Robu (CWI) Modeling Preferences, Strategic Reasoning and Collaboration in Agent-Mediated Electronic Markets
2009-20 Bob van der Vecht (UU) Adjustable Autonomy: Controling Influences on Decision Making
2009-21 Stijn Vanderlooy (UM) Ranking and Reliable Classification
2009-22 Pavel Serdyukov (UT) Search For Expertise: Going beyond direct evidence
2009-23 Peter Hofgesang (VU) Modelling Web Usage in a Changing Environment
2009-24 Annerieke Heuvelink (VUA) Cognitive Models for Training Simulations
2009-25 Alex van Ballegooij (CWI) "RAM: Array Database Management through Relational Mapping"
2009-26 Fernando Koch (UU) An Agent-Based Model for the Development of Intelligent Mobile Services
2009-27 Christian Glahn (OU) Contextual Support of social Engagement and Reflection on the Web
2009-28 Sander Evers (UT) Sensor Data Management with Probabilistic Models
2009-29 Stanislav Pokraev (UT) Model-Driven Semantic Integration of Service-Oriented Applications
2009-30 Marcin Zukowski (CWI) Balancing vectorized query execution with bandwidth-optimized storage
2009-31 Sofiya Katrenko (UVA) A Closer Look at Learning Relations from Text
2009-32 Rik Farenhorst (VU) and Remco de Boer (VU) Architectural Knowledge Management: Supporting Architects and Auditors
2009-33 Khiet Truong (UT) How Does Real Affect Affect Affect Recognition In Speech?
2009-34 Inge van de Weerd (UU) Advancing in Software Product Management: An Incremental Method Engineering Approach
2009-35 Wouter Koelewijn (UL) Privacy en Politiegegevens; Over geautomatiseerde normatieve informatie-uitwisseling
2009-36 Marco Kalz (OUN) Placement Support for Learners in Learning Networks
2009-37 Hendrik Drachsler (OUN) Navigation Support for Learners in Informal Learning Networks
2009-38 Riina Vuorikari (OU) Tags and self-organisation: a metadata ecology for learning resources in a multilingual context
2009-39 Christian Stahl (TUE, HU Berlin) Service Substitution – A Behavioral Approach Based on Petri Nets
2009-40 Stephan Raaijmakers (UvT) Multinomial Language Learning: Investigations into the Geometry of Language
2009-41 Igor Berezhnyy (UvT) Digital Analysis of Paintings
2009-42 Toine Bogers (UvT) Recommender Systems for Social Bookmarking
2009-43 Virginia Nunes Leal Franqueira (UT) Finding Multi-step Attacks in Computer Networks using Heuristic Search and Mobile Ambients
2009-44 Roberto Santana Tapia (UT) Assessing Business-IT Alignment in Networked Organizations
2009-45 Jilles Vreeken (UU) Making Pattern Mining Useful
2009-46 Loredana Afanasiev (UvA) Querying XML: Benchmarks and Recursion

2010
2010-01 Matthijs van Leeuwen (UU) Patterns that Matter
2010-02 Ingo Wassink (UT) Work flows in Life Science
2010-03 Joost Geurts (CWI) A Document Engineering Model and Processing Framework for Multimedia documents
2010-04 Olga Kulyk (UT) Do You Know What I Know? Situational Awareness of Co-located Teams in Multidisplay Environments
2010-05 Claudia Hauff (UT) Predicting the Effectiveness of Queries and Retrieval Systems
2010-06 Sander Bakkes (UvT) Rapid Adaptation of Video Game AI
2010-07 Wim Fikkert (UT) Gesture interaction at a Distance
2010-08 Krzysztof Siewicz (UL) Towards an Improved Regulatory Framework of Free Software. Protecting user freedoms in a world of software communities and eGovernments
2010-09 Hugo Kielman (UL) Politiele gegevensverwerking en Privacy, Naar een effectieve waarborging
2010-10 Rebecca Ong (UL) Mobile Communication and Protection of Children
2010-11 Adriaan Ter Mors (TUD) The world according to MARP: Multi-Agent Route Planning
2010-12 Susan van den Braak (UU) Sensemaking software for crime analysis
2010-13 Gianluigi Folino (RUN) High Performance Data Mining using Bio-inspired techniques
2010-14 Sander van Splunter (VU) Automated Web Service Reconfiguration
2010-15 Lianne Bodenstaff (UT) Managing Dependency Relations in Inter-Organizational Models
2010-16 Sicco Verwer (TUD) Efficient Identification of Timed Automata, theory and practice
2010-17 Spyros Kotoulas (VU) Scalable Discovery of Networked Resources: Algorithms, Infrastructure, Applications
2010-18 Charlotte Gerritsen (VU) Caught in the Act: Investigating Crime by Agent-Based Simulation
2010-19 Henriette Cramer (UvA) People's Responses to Autonomous and Adaptive Systems
2010-20 Ivo Swartjes (UT) Whose Story Is It Anyway? How Improv Informs Agency and Authorship of Emergent Narrative
2010-21 Harold van Heerde (UT) Privacy-aware data management by means of data degradation
2010-22 Michiel Hildebrand (CWI) End-user Support for Access to Heterogeneous Linked Data
2010-23 Bas Steunebrink (UU) The Logical Structure of Emotions
2010-24 Dmytro Tykhonov (TUD) Designing Generic and Efficient Negotiation Strategies
2010-25 Zulfiqar Ali Memon (VU) Modelling Human-Awareness for Ambient Agents: A Human Mindreading Perspective
2010-26 Ying Zhang (CWI) XRPC: Efficient Distributed Query Processing on Heterogeneous XQuery Engines
2010-27 Marten Voulon (UL) Automatisch contracteren
2010-28 Arne Koopman (UU) Characteristic Relational Patterns
2010-29 Stratos Idreos (CWI) Database Cracking: Towards Auto-tuning Database Kernels
2010-30 Marieke van Erp (UvT) Accessing Natural History - Discoveries in data cleaning, structuring, and retrieval
2010-31 Victor de Boer (UVA) Ontology Enrichment from Heterogeneous Sources on the Web
2010-32 Marcel Hiel (UvT) An Adaptive Service Oriented Architecture: Automatically solving Interoperability Problems
2010-33 Robin Aly (UT) Modeling Representation Uncertainty in Concept-Based Multimedia Retrieval
2010-34 Teduh Dirgahayu (UT) Interaction Design in Service Compositions
2010-35 Dolf Trieschnigg (UT) Proof of Concept: Concept-based Biomedical Information Retrieval
2010-36 Jose Janssen (OU) Paving the Way for Lifelong Learning; Facilitating competence development through a learning path specification
2010-37 Niels Lohmann (TUE) Correctness of services and their composition
2010-38 Dirk Fahland (TUE) From Scenarios to components
2010-39 Ghazanfar Farooq Siddiqui (VU) Integrative modeling of emotions in virtual agents
2010-40 Mark van Assem (VU) Converting and Integrating Vocabularies for the Semantic Web
2010-41 Guillaume Chaslot (UM) Monte-Carlo Tree Search
2010-42 Sybren de Kinderen (VU) Needs-driven service bundling in a multi-supplier setting - the computational e3-service approach
2010-43 Peter van Kranenburg (UU) A Computational Approach to Content-Based Retrieval of Folk Song Melodies
2010-44 Pieter Bellekens (TUE) An Approach towards Context-sensitive and User-adapted Access to Heterogeneous Data Sources, Illustrated in the Television Domain
2010-45 Vasilios Andrikopoulos (UvT) A theory and model for the evolution of software services
2010-46 Vincent Pijpers (VU) e3alignment: Exploring Inter-Organizational Business-ICT Alignment
2010-47 Chen Li (UT) Mining Process Model Variants: Challenges, Techniques, Examples
2010-48 Milan Lovric (EUR) Behavioral Finance and Agent-Based Artificial Markets
2010-49 Jahn-Takeshi Saito (UM) Solving difficult game positions
2010-50 Bouke Huurnink (UVA) Search in Audiovisual Broadcast Archives
2010-51 Alia Khairia Amin (CWI) Understanding and supporting information seeking tasks in multiple sources
2010-52 Peter-Paul van Maanen (VU) Adaptive Support for Human-Computer Teams: Exploring the Use of Cognitive Models of Trust and Attention
2010-53 Edgar Meij (UVA) Combining Concepts and Language Models for Information Access

2011
2011-01 Botond Cseke (RUN) Variational Algorithms for Bayesian Inference in Latent Gaussian Models
2011-02 Nick Tinnemeier (UU) Organizing Agent Organizations. Syntax and Operational Semantics of an Organization-Oriented Programming Language
2011-03 Jan Martijn van der Werf (TUE) Compositional Design and Verification of Component-Based Information Systems
2011-04 Hado van Hasselt (UU) Insights in Reinforcement Learning; Formal analysis and empirical evaluation of temporal-difference learning algorithms
2011-05 Bas van der Raadt (VU) Enterprise Architecture Coming of Age - Increasing the Performance of an Emerging Discipline.
2011-06 Yiwen Wang (TUE) Semantically-Enhanced Recommendations in Cultural Heritage
2011-07 Yujia Cao (UT) Multimodal Information Presentation for High Load Human Computer Interaction
2011-08 Nieske Vergunst (UU) BDI-based Generation of Robust Task-Oriented Dialogues
2011-09 Tim de Jong (OU) Contextualised Mobile Media for Learning
2011-10 Bart Bogaert (UvT) Cloud Content Contention
2011-11 Dhaval Vyas (UT) Designing for Awareness: An Experience-focused HCI Perspective
2011-12 Carmen Bratosin (TUE) Grid Architecture for Distributed Process Mining
2011-13 Xiaoyu Mao (UvT) Airport under Control. Multiagent Scheduling for Airport Ground Handling
2011-14 Milan Lovric (EUR) Behavioral Finance and Agent-Based Artificial Markets
2011-15 Marijn Koolen (UvA) The Meaning of Structure: the Value of Link Evidence for Information Retrieval
2011-16 Maarten Schadd (UM) Selective Search in Games of Different Complexity
2011-17 Jiyin He (UVA) Exploring Topic Structure: Coherence, Diversity and Relatedness
2011-18 Mark Ponsen (UM) Strategic Decision-Making in complex games
2011-19 Ellen Rusman (OU) The Mind's Eye on Personal Profiles
2011-20 Qing Gu (VU) Guiding service-oriented software engineering - A view-based approach
2011-21 Linda Terlouw (TUD) Modularization and Specification of Service-Oriented Systems
2011-22 Junte Zhang (UVA) System Evaluation of Archival Description and Access
2011-23 Wouter Weerkamp (UVA) Finding People and their Utterances in Social Media
2011-24 Herwin van Welbergen (UT) Behavior Generation for Interpersonal Coordination with Virtual Humans: On Specifying, Scheduling and Realizing Multimodal Virtual Human Behavior
2011-25 Syed Waqar ul Qounain Jaffry (VU) Analysis and Validation of Models for Trust Dynamics
2011-26 Matthijs Aart Pontier (VU) Virtual Agents for Human Communication: Emotion Regulation and Involvement-Distance Trade-Offs in Embodied Conversational Agents and Robots
2011-27 Aniel Bhulai (VU) Dynamic website optimization through autonomous management of design patterns
2011-28 Rianne Kaptein (UVA) Effective Focused Retrieval by Exploiting Query Context and Document Structure
2011-29 Faisal Kamiran (TUE) Discrimination-aware Classification
2011-30 Egon van den Broek (UT) Affective Signal Processing (ASP): Unraveling the mystery of emotions
2011-31 Ludo Waltman (EUR) Computational and Game-Theoretic Approaches for Modeling Bounded Rationality
2011-32 Nees-Jan van Eck (EUR) Methodological Advances in Bibliometric Mapping of Science
2011-33 Tom van der Weide (UU) Arguing to Motivate Decisions
2011-34 Paolo Turrini (UU) Strategic Reasoning in Interdependence: Logical and Game-theoretical Investigations
2011-35 Maaike Harbers (UU) Explaining Agent Behavior in Virtual Training
2011-36 Erik van der Spek (UU) Experiments in serious game design: a cognitive approach
2011-37 Adriana Burlutiu (RUN) Machine Learning for Pairwise Data, Applications for Preference Learning and Supervised Network Inference
2011-38 Nyree Lemmens (UM) Bee-inspired Distributed Optimization
2011-39 Joost Westra (UU) Organizing Adaptation using Agents in Serious Games
2011-40 Viktor Clerc (VU) Architectural Knowledge Management in Global Software Development
2011-41 Luan Ibraimi (UT) Cryptographically Enforced Distributed Data Access Control
2011-42 Michal Sindlar (UU) Explaining Behavior through Mental State Attribution
2011-43 Henk van der Schuur (UU) Process Improvement through Software Operation Knowledge
2011-44 Boris Reuderink (UT) Robust Brain-Computer Interfaces
2011-45 Herman Stehouwer (UvT) Statistical Language Models for Alternative Sequence Selection
2011-46 Beibei Hu (TUD) Towards Contextualized Information Delivery: A Rule-based Architecture for the Domain of Mobile Police Work
2011-47 Azizi Bin Ab Aziz (VU) Exploring Computational Models for Intelligent Support of Persons with Depression
2011-48 Mark Ter Maat (UT) Response Selection and Turn-taking for a Sensitive Artificial Listening Agent
2011-49 Andreea Niculescu (UT) Conversational interfaces for task-oriented spoken dialogues: design aspects influencing interaction quality

2012
2012-01 Terry Kakeeto (UvT) Relationship Marketing for SMEs in Uganda
2012-02 Muhammad Umair (VU) Adaptivity, emotion, and Rationality in Human and Ambient Agent Models
2012-03 Adam Vanya (VU) Supporting Architecture Evolution by Mining Software Repositories
2012-04 Jurriaan Souer (UU) Development of Content Management System-based Web Applications
2012-05 Marijn Plomp (UU) Maturing Interorganisational Information Systems
2012-06 Wolfgang Reinhardt (OU) Awareness Support for Knowledge Workers in Research Networks
2012-07 Rianne van Lambalgen (VU) When the Going Gets Tough: Exploring Agent-based Models of Human Performance under Demanding Conditions
2012-08 Gerben de Vries (UVA) Kernel Methods for Vessel Trajectories
2012-09 Ricardo Neisse (UT) Trust and Privacy Management Support for Context-Aware Service Platforms
2012-10 David Smits (TUE) Towards a Generic Distributed Adaptive Hypermedia Environment
2012-11 J.C.B. Rantham Prabhakara (TUE) Process Mining in the Large: Preprocessing, Discovery, and Diagnostics
2012-12 Kees van der Sluijs (TUE) Model Driven Design and Data Integration in Semantic Web Information Systems
2012-13 Suleman Shahid (UvT) Fun and Face: Exploring non-verbal expressions of emotion during playful interactions
2012-14 Evgeny Knutov (TUE) Generic Adaptation Framework for Unifying Adaptive Web-based Systems
2012-15 Natalie van der Wal (VU) Social Agents. Agent-Based Modelling of Integrated Internal and Social Dynamics of Cognitive and Affective Processes
2012-16 Fiemke Both (VU) Helping people by understanding them - Ambient Agents supporting task execution and depression treatment
2012-17 Amal Elgammal (UvT) Towards a Comprehensive Framework for Business Process Compliance
2012-18 Eltjo Poort (VU) Improving Solution Architecting Practices
2012-19 Helen Schonenberg (TUE) What's Next? Operational Support for Business Process Execution
2012-20 Ali Bahramisharif (RUN) Covert Visual Spatial Attention, a Robust Paradigm for Brain-Computer Interfacing
2012-21 Roberto Cornacchia (TUD) Querying Sparse Matrices for Information Retrieval
2012-22 Thijs Vis (UvT) Intelligence, politie en veiligheidsdienst: verenigbare grootheden?