Game Sound from Behind the Sofa:
An Exploration into the Fear Potential
of Sound & Psychophysiological
Approaches to Audio-centric,
Adaptive Gameplay
Tom A. Garner
University of Aalborg 2012
Acknowledgements:
Thank you to my supervisor and mentor Mark Grimshaw: for
continuing to share your expertise, providing essential editorial
support and broadening my horizons. Additional thanks go to
Laura Petrini and Stephen Manning for keeping my numbers and
technical details correct. Thanks are also due to the staff at the
University of Aalborg, Manchester School of Sound Recording and
the University of Bolton for providing much needed financial
support and resources vital to the thesis’ completion.
Preface:
My foray into the study of computer video games, in many ways, feels much like a
predetermined event. Although I cannot make claim to being present at the birth of
first generation home entertainment gaming systems or the arcade machines that
predate them, I can be grateful for a personal gaming timeline that has almost
traversed the complete history, or at least an abridged version. When I was a child, games
consoles were prohibited items within my family home and, whilst my friends enjoyed
the spoils of Commodore 64s, Sega Master Systems and Nintendo Entertainment
Systems (some even had retro Atari systems, giving me an addiction-laden
appreciation for Pong), I was frustratingly restricted to playing them during fleeting
visits to friends’ homes. Then fortune favoured me and Amiga were kind enough to
release the 500, and better yet, market it as a multi-function personal computer with
educational benefits. So, one Christmas, my parents presented me with a word
processor and animation/art designer and I certainly remember learning a
significant amount from Bart Simpson vs. the Space Mutants and Lemmings,
although my educational odyssey was unlikely to have corresponded to my parents’
intentions. From then on I resolved to explore games further and somehow
managed to buy my own Gameboy and Nintendo 64 by saving up with a two-pound-per-week allowance, something I’m still rather proud of.
Games technology is a testament to the way in which humans have transcended
the pace of evolutionary development to accelerate progress in a truly dramatic
way. As computing power doubles roughly every 18 months the dreams we have
standing by, ready to implement, are soon to be realised. Much as in physics we are
getting closer to manipulating our world at a sub-atomic level, in computer
technology we move ever nearer to creating artificial life, extending our own
(perhaps indefinitely), and recreating our existence as a virtual construct. To have
the opportunity to be a part of the process is a great honour and unmistakeably
exciting.
Sound has always been an essential part of the computer video game experience,
drawing the player deeper into the virtual world and creating iconic and instantly
recognisable games moments, from the wonderfully relieving sound Sonic the
Hedgehog makes when he takes a breath from an underwater bubble, to the
levelling up sound of World of Warcraft that haunts the dreams of many a raider.
Nevertheless, audio still remains secondary to vision, despite being a significantly
more heavily utilised sense. Sound supports our immersion and sense of presence
within both virtuality and reality yet is arguably taken for granted in many
circumstances. I therefore hope that this work (and continuing associated research)
will support efforts to elucidate the value of sound (both within and beyond
computer games) and garner appreciation for sound as an invaluable aspect of
existence.
Contents:
1. Introduction & Thesis Structure
Thesis Overview
Definitions of Relevant Terminology
The ‘Fear Problem’ with Computer Video Games & Other Relevant Areas of Development
Fear & Sound from a First-Person Perspective
Outline of Thesis Structure
2. Emotion & the Nature of Fear in Games
The Importance of Human Emotion
Emotion: Origins, Definitions & Perspectives
Emotion Classification Systems
The Neuroscience of Emotion
Emotions & Computer Gameplay
An Outline of Fear: Definitions & Terminology
The Value of Fear
3. Understanding Fear & Game Sound: Definitions, Processes & Variables
Perspectives & Associated Theory
Understanding Fear
Processes & Variables Within Fear
Fear & Computer Video Games
The Potential of Acoustic/Psycho Acoustic Sound Parameters to Create & Intensify Fear
Emotional Properties of Sound
Potentially Fear Evoking Acoustic Parameters
4. Embodied Cognition & Sonic Virtuality
Defining Virtuality
Concepts of Virtuality
The Virtuality of Sound
Virtual Acoustic Ecology & Embodied Cognition
Concepts of Embodied Cognition
Acoustic Ecology
Embodied Cognition & Virtuality
Reconciling Acoustic Ecology with Sonic Virtuality
Sound Functionality & Modes of Listening
5. Psychophysiology & Biometric Feedback Systems in Computer Gameplay
Psychophysiology: Definitions & Approaches
Electrodermal Activity
Electromyography
Applications & Limitations
Biometrics in Context: Emotion
Biometrics in Context: Sound
Biometrics in Context: Computer Video Games
Electroencephalography
Advantage, Limitations & Applications of EEG
Perspectives on EEG Acquisition & Filtration
EEG Feature Extraction & Emotion Classification
6. Methodology Designs
Experiment 1: Web Mediated Assessment of Affective Game Sound
Experiment 1: The possibilities of E-Research
Experiment 1: The Methodological Perils of E-Research
Experiment 1: Ethical Concerns
Experiment 1: Methodology Introduction
Experiment 1: Website Design
Experiment 1: The Horror Game Sound Designer
Experiment 1: The Sounds of Fear
Experiment 1: Sound Design
Experiment 1: Pretesting & Participant Recruitment
Experiment 1: Offline Testing
Experiment 2: Real-time Fear Value of Preselected Sound Parameters during Gameplay
Experiment 2: Preliminary Testing
Experiment 2: Preparation of Sounds
Experiment 2: Game Level Design
Experiment 2: Environment & Game Equipment
Experiment 2: Participants
Experiment 2: Procedure
Experiment 2: Data Collection
Experiment 3: Real Time Biometric Fear Assessment of Game Sound
Experiment 3: Bespoke Game Design
Experiment 3: Sound Design
Experiment 3: Pilot Study
Experiment 3: Testing Environment & Equipment
Experiment 3: Participants, Procedure & Ethics
Experiment 3: Data Collection
7. Experiment Results & Discussions
Experiment 1: Horror Game Sound Designer Results
Experiment 1: Sound of Fear Results
Experiment 1: Discussion
Experiment 1: Conclusions & Experiment Summary
Experiment 2: Results
Experiment 2: Discussion
Experiment 2: Conclusions & Experiment Summary
Experiment 3: Results
Experiment 3: Discussion
Experiment 3: Conclusions & Experiment Summary
With Reference to the Academic Review
8. Hypothetical Frameworks
Interaction & Processes within an Ecology of Fear
Integrating Audio Classification into a Fear Framework
An Embodied Virtual Acoustic Ecology
9. Conclusions & Future Work
Summary of PhD Programme
Conclusions and Retrospective Evaluations: Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Future Study: Consumer Grade Acquisition Devices
Future Study: Outline
10. Appendix: Future Work, References and Complete Datasets
Xpresence: Purpose & Functionality
Xpresence: Design Document
Build & Implementation Strategy
References
Complete Raw Datasets
Thesis Abstract:
The central concern of this thesis is the processes by which human beings
perceive sound and experience emotions within a computer video gameplay
context. The potential of quantitative sound parameters to evoke and modulate
emotional experience is explored, working towards the development of structured
hypothetical frameworks of auditory processing and emotional experience.
Research relevant to computer game theory, embodied cognition,
psychophysiology, emotion studies, fear processing and acoustics/psychoacoustics
is reviewed in detail and several primary experimental trials are presented that
provide additional support for the hypothetical frameworks: an ecological process of
fear, a fear-related model of virtual and real acoustic ecologies, and an embodied
virtual acoustic ecology framework.
It is intended that this thesis will clearly support more effective and efficient sound
design practices and also improve awareness of the capacity of sound to generate
significant emotional experiences during computer video gameplay. It is further
hoped that this thesis will elucidate the potential of biometrics/psychophysiology to
allow game designers to better understand the player and to move closer towards
the development of an automated computer system that is capable of interpreting
player-emotion and adapting the game environment in response, to create a
continuously evolving and unique, player-centred game experience.
Glossary & Regularly Abbreviated terms:
Affective value – Abstract differentiation between stimuli in terms of both their
potential to evoke emotional responses and the intensity of said response.
Affective realism – Approach to creating more immersive and ‘realistic’
virtual environments that focuses upon emulation of the emotional
interactive processes that exist in reality.
Biometrics – Characteristically refers to systems that utilise physiological data
to identify an individual for security applications. Within the thesis, biometrics
refers simply to the physiological data, regardless of application.
Biofeedback – Presentation of personal biometric data to the individual,
typically as visualised representations.
Biofeedback loop – A circular system in which stimuli influence physiological data
and, in turn, physiological data influences stimuli in a continuous loop.
Computer Video Games (CVG) – Digital interactive media encompassing
home entertainment, mobile, hand-held, internet-based and arcade-based
systems. For the purposes of the thesis, the focus is upon home entertainment
(consoles, PC/MAC) systems.
Embodied Cognition (EC) – Concept of human cognition, asserting that
thought cannot be detached from the immediate environment or the
physiology/memory of the individual.
Electrodermal Activity (EDA) – Psychophysiological measure of electrical
activity conducted via the skin and related to sweat secretion. The term is
commonly used throughout to refer to skin conductance response.
Electroencephalography (EEG) – Psychophysiological measure of electrical
activity released during communication between neurons in the brain.
Electromyography (EMG) – Psychophysiological measure of electrical activity
during muscular activity. The term is used throughout the thesis to refer to
surface EMG techniques collecting signal data from facial muscles.
First-person Shooter (FPS) – Computer video game genre utilising a first-person
perspective as game display. Gameplay activity characteristically revolves
around projectile weapon combat.
Geographical Distance – The physical distance (space) that exists between
an individual and a specific entity/event.
Hypotheticality – A measure of whether an entity/event is
concrete/inevitable or abstract/hypothetical/unlikely to occur.
Physical realism – Simulations of physical processes (gravity, collision,
photorealism, etc.) that support the creation of virtual environments that are
indistinguishable from reality.
Psychological Distance (PD) – Concept asserting that perception of entities
and situations is highly susceptible to the influences of geographical distance,
temporal distance, hypotheticality and social distance.
Psychophysiology – Studying the relationships between psychological
manipulations and resulting physiological activity (measured in living
organisms) to understand mental and bodily processes and their relation to
each other.
Quantitative Acoustic Parameters – Characteristics of sound that can be
differentiated via quantitative measures (hertz, seconds, decibels, etc.).
Social distance – A psychological distance concept asserting that the
closeness of social relationships may influence an individual’s perception
towards a relevant entity or event.
Survival-horror – A classification of computer video game that most
commonly refers to aesthetic and mood as opposed to gameplay
mechanics. Games characteristically incorporate horror-theme mythologies,
characters and scenarios. Their aesthetic is commonly dark, uncertain and
threatening in efforts to imply danger and evoke fear.
Temporal distance – The distance (measured in time) that separates an
individual from an entity/event.
Web experimentation – An online task that requires participants to interact
with web-based materials and provide real-time and/or debrief responses.
Dedicated to my wife and son; Hayley and James Garner.
Thank you to the former for tireless support and encouragement
and to the latter for (almost!) waiting until I’d finished before being
born.
Tom Garner
2012
University of Aalborg
Chapter 1
Introduction and Thesis Structure
Garner, Tom A.
University of Aalborg
2012
Chapter 1: Introduction
The central focus of this thesis is upon the methods by which human beings perceive sound
and process emotions during computer video gameplay. The overarching aim is to address the
following questions: ‘can quantitative acoustic parameters modulate an emotional
experience?’, ‘can the processes of sound perception and emotion be realised as mechanically
structured frameworks comparable to computer programming code?’ and, as an extension of
the previous question, ‘can psychophysiological data be processed in a way that would enable
a computer system to accurately determine a player’s emotional state during play?’ In concise
terms, the intended contributions to knowledge of this thesis are: to support more effective
and efficient sound design practices; to position sound as a critical element in generating
emotion and therefore as a central element of computer video games; to develop game
experience measures that allow the game designer to better understand the player; and, finally,
to progress the development of an automated computer system capable of interpreting
emotional status (by way of psychophysiological and contextual data) and manipulating the
game environment in response, creating a continuously adaptive game experience and
making games capable of incorporating many innovative, emotion-centred gameplay
mechanics. This opening chapter outlines the principal aspects of study with a view to
elucidating the key perspectives relevant to the thesis. It commences with an outline of
relevant terminology and a statement of the projected contribution of the
thesis. Current problems in game sound design, emotioneering, fear assessment and emotion
classification are then discussed, alongside ways in which this study could potentially support
developments across these fields. The chapter concludes with an outline of the thesis
structure.
THESIS OVERVIEW
Understanding emotionality is a crucial aspect of human-computer interaction and sound is a
critical component to consider when developing emotionality as it is directly associated with
the user’s experience of emotions (Alves & Roque, 2009). This PhD thesis documents
theoretical research and associated experimentation within the study of acoustics and fear.
The work produced is continuously framed within the context of computer video games. The
aims of the thesis are to assess literature from a range of disciplines to develop a framework
of virtual acoustic ecology within the context of fear, to develop our understanding of the role
sounds (excluding musical and vocal) play in eliciting fear during computer video gameplay
and to provide quantifiable evidence in support of the hypothesis that manipulation of
acoustic properties can affect the nature of a fearful experience.
The primary overarching hypothesis of the thesis states that game sound, biometric data and
qualitative game experience descriptors have the potential to operate as a
psychophysiological feedback loop within which affective data, in response to crafted game
sound events/soundscapes, can be encoded into adaptive game parameters by way of an
automated emotionally intelligent system.
The above hypothesis is founded upon two more general hypotheses that are currently being
explored in academia. The first is that human emotion (specifically fear) can be understood as
arrangements of quantitative variables that exist within the brain and body. It is further
postulated that developing biofeedback technology enables researchers to observe and
record many of the parameters of these variables to the degree that they can not only
distinguish between discrete emotional states, but also identify variations within individual
emotions. Within the context of fear in a survival horror game, this translates to the capacity
of an automated system to recognise whether a player is feeling fear and to establish the nature of that
experience. It is asserted that, although psychophysiological equipment is progressing
rapidly towards this goal, biometric technology is not yet capable of meeting such
aspirations. Nonetheless, the hypothesis of this work asserts that automated affect assessment
can be attainable within a computer video game application, by way of qualitative descriptors
synchronised to game events and scenes that enable biometric data to be contextualised,
thereby supporting greater accuracy of recognition.
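The contextualisation idea described above can be sketched in code. The following Python fragment is a minimal illustration only and is not the system developed in this thesis: it blends a normalised biometric arousal reading with a qualitative descriptor attached to the current game event, then nudges an adaptive sound parameter in response. All names, descriptor labels, weights and thresholds are hypothetical assumptions introduced for the example.

```python
from dataclasses import dataclass

# Hypothetical qualitative descriptors attached to game events by a designer.
# The prior values are illustrative assumptions, not empirical results.
FEAR_PRIOR = {
    "safe_room": 0.1,
    "dark_corridor": 0.6,
    "enemy_ambush": 0.9,
}


@dataclass
class GameEvent:
    name: str
    descriptor: str  # key into FEAR_PRIOR


def normalise(raw: float, baseline: float, ceiling: float) -> float:
    """Scale a raw biometric reading (e.g. skin conductance) to the range 0..1."""
    if ceiling <= baseline:
        raise ValueError("ceiling must exceed baseline")
    scaled = (raw - baseline) / (ceiling - baseline)
    return max(0.0, min(1.0, scaled))


def estimate_fear(arousal: float, event: GameEvent, context_weight: float = 0.5) -> float:
    """Blend physiological arousal with the contextual prior for the event.

    Arousal alone cannot separate fear from, say, excitement; the descriptor
    supplies the context that the biometric signal lacks.
    """
    prior = FEAR_PRIOR.get(event.descriptor, 0.5)  # unknown context: neutral prior
    return (1.0 - context_weight) * arousal + context_weight * prior


def adapt_ambience_gain(fear: float, target: float = 0.7,
                        gain: float = 1.0, step: float = 0.1) -> float:
    """Nudge a hypothetical ambience gain parameter towards a target fear level."""
    if fear < target:
        gain += step
    elif fear > target:
        gain -= step
    return max(0.0, min(2.0, gain))
```

In use, each biometric sample would be normalised, combined with the descriptor of the scene in which it was recorded, and the resulting estimate fed back into the sound parameters, closing the loop. The simple weighted blend stands in for whatever classification method a real system would employ.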
The second foundational hypothesis states that quantitative acoustic parameters
(characteristics of sound that can be measured in established units such as hertz, seconds and
decibels) have the potential to modulate the affective value of a sound without contextual
support. For example, a loud sound with an immediate attack, presented in a previously quiet
or silent environment, will always evoke a comparable shock response irrespective of
semantic properties, scenario or differences between individual listeners. Furthermore, it is
theorised that a comprehensive understanding of the association between sound parameters
and player affect has the potential to give sound designers effective control over a player’s
emotions during play. If the primary hypothesis of this thesis could be conclusively
supported, such concepts could be applied to technological developments. Computer video
game engines could automatically recognise and learn effective fear-evoking strategies,
developing an awareness of the individual player’s personal fear profile that could improve
gameplay immersion and enjoyment, contribute significantly to replay value and ultimately
(if embedded into the platform architecture rather than an individual game engine) could
translate across multiple games to consistently maximise the affective potential of any
experience undertaken utilising that hardware system.
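One way to picture a personal fear profile is as a running record of how strongly a given player has responded to each sound strategy. The sketch below is a hedged illustration under that assumption; the class, its methods and the strategy labels are hypothetical and do not describe an existing engine feature or the implementation proposed in this thesis.

```python
from collections import defaultdict
from typing import Optional


class FearProfile:
    """Running per-player record of responses to each sound strategy."""

    def __init__(self) -> None:
        self._totals = defaultdict(float)  # summed fear responses per strategy
        self._counts = defaultdict(int)    # number of observations per strategy

    def record(self, strategy: str, fear_response: float) -> None:
        """Store one observed fear response (assumed 0..1) for a strategy."""
        self._totals[strategy] += fear_response
        self._counts[strategy] += 1

    def mean_response(self, strategy: str) -> float:
        """Average response for a strategy, or 0.0 if it was never observed."""
        count = self._counts.get(strategy, 0)
        return self._totals[strategy] / count if count else 0.0

    def most_effective(self) -> Optional[str]:
        """Strategy with the highest mean response, or None if nothing is recorded."""
        if not self._counts:
            return None
        return max(self._counts, key=self.mean_response)


# Example usage with hypothetical strategy labels.
profile = FearProfile()
profile.record("sudden_onset", 0.8)
profile.record("sudden_onset", 0.6)
profile.record("low_drone", 0.4)
best = profile.most_effective()
```

Because the profile holds only strategy labels and response values, it could in principle be stored at platform level and shared between games, as the paragraph above suggests. A practical system would also need to account for habituation, which this sketch ignores.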
The academic review within this thesis brings together core concepts of embodied cognition
(see Wilson, 2002), acoustic ecology (see Truax, 1978), virtual acoustic ecology (see
Grimshaw & Schott, 2008), computer video game experience (see Grimshaw, 2007) and fear
processing theory (see Massumi, 2005) to construct an acoustic ecology of fear within a
virtuality framework. Beginning with an overview of emotion theory and fear
conceptualisation/processing from contrasting perspectives, the thesis examines perception,
specifically: the six main concepts of embodied cognition (Wilson, 2002), thrownness,
construal level theory and psychological distance (Heidegger, 1927; Lieberman & Trope,
2008; Winograd & Flores, 1986). These concepts are strongly advocated within the thesis and
influence the conclusions that inform both the preliminary experimentation and hypothetical
frameworks documented later within the thesis. Existing empirical and conceptual research
concerning acoustic parameters, sound classes and modes of listening are also amalgamated
and refined via the survival-horror game context. This research also includes a consolidation
of literature relevant to internet-mediated experimentation and a rationale is put forward,
advocating the internet as a reliable and powerful resource for empirical investigation within
this field. Empirical investigation includes several experiments measuring players’
experience of fear by way of both innovative qualitative analysis (real-time intensity
vocalisation) and quantitative biometrics. Obtained data reveals that changes in the acoustic
parameters of game sound can have a significant impact upon the player’s emotional (fear)
experience. Both empirical data and secondary research are consolidated to produce a
hypothetical process of fear that is re-contextualised into a gameplay-relevant acoustic
ecology. The framework of sound-emotionality presented within the thesis leaves room for
further developments. Future work could explore the concepts presented in greater
detail, in particular the variances in fear elicitation that can be observed in
response to a comprehensive range of parameters within individual acoustic effects. Increased
specification will address the impact of parameter settings, for example: high-pass filtering
within reverberation, degree angle within localisation, and individual frequency bands within
equalisation. Such detail would enable the development of a comprehensive framework of
subjective emotional experience and quantitative acoustic manipulation within a computer
video game context.
DEFINITIONS OF RELEVANT TERMINOLOGY
The term computer video game (CVG) is applied throughout in reference to the electronic
medium that embodies and contextualises the acoustic, emotion and psychophysiology
research explored within the thesis. This term is favoured above alternative descriptors as it is
a comprehensive term that accommodates the two most common terms presented within
recent relevant literature: computer game (for example, Grimshaw & Schott, 2008; Parker &
Heerema, 2008; van Reekum et al., 2004) and video game (Perron, 2004; Ravaja et al., 2008).
For the purposes of the thesis, CVG refers to commercially available electronic software
and incorporates most modern gaming platforms (including handheld/portable devices,
home consoles, personal computers, mobile devices/phones and internet gaming). Although
arcade machines, gambling/public house machines and early-years children’s toy machines are not explicitly
excluded from the term, they are not directly relevant to this study.
The first-person shooter (FPS), like several other genres (and indeed entertainment
mediums), cannot be fully encapsulated within a general framework, notably in terms of the
finer details. As discussed in more detail later in the thesis, the FPS genre is evolving
substantially and this study itself aims to further support that progression. The
foundations of the FPS, however, remain relatively constant, integrating a 360°/3-axis first-person
perspective, visible avatar arms/hands/weapons, a real-time rendered 3D virtual
environment and gameplay mechanics centred primarily upon exploration, shooting and
survival. Some recent, innovative titles do oppose these characteristics, including Amnesia:
the Dark Descent (Frictional, 2010) and realMyst (Sunsoft, 2000), which involve no weapons or
combat despite the former being largely considered a survival horror game.
The distinction between CVG genre classifications is discernibly vague and it has been
asserted that the survival-horror genre is itself a subgenre of action-adventure (Boyce, 2011).
Fahs (2009) posits that survival-horror games are ‘one of the only genres not defined by
gameplay mechanics, but by theme, atmosphere, subject matter, and design philosophy’.
These aspects are more abstract than gameplay mechanics, and variations in them are more
difficult to differentiate. Consequently, relatively few titles share a common
framework. Common variations between horror-themed games include perspective (first or
third-person) and combat system (firearms, melee weapons or unarmed) but direct
comparison of most modern titles frequently reveals significant differences between
approaches to the more subtle aspect of game parameter settings (health, weapon damage,
ammunition availability, enemy frequency, enemy health, etc.). Within this study, survival
horror therefore refers to any computer video game that crafts atmosphere, environment and
circumstance to evoke fear-related affect during gameplay irrespective of gameplay
mechanics or perspective.
With regards to sounds within a CVG context, the term game sound is favoured over game
audio only in response to the greater commonality of the former within analogous research
(see Collins, 2008; Grimshaw & Schott, 2008; Wu, Li & Rao, 2008, etc.). Collins (2008, p.
3) separates game sound into four categories: dialogue/speech, music, sound effects (typically
representative of an individual entity/event) and ambient sounds (characteristically
background sound, slowly evolving and used to establish a scene). This research is concerned only
with these latter two classes and to support efficiency and transparency throughout the text,
the idiom sound effects is employed as a blanket-term incorporating both ambience and sound
effect classes. Both sound effects and soundscape ambiences are themselves sub-categorised
into numerous divisions of game sound throughout the thesis to accommodate the various
functions and traits that distinguish sounds within a computer video game.
THE ‘FEAR PROBLEM’ WITH COMPUTER VIDEO
GAMES AND OTHER RELEVANT AREAS OF DEVELOPMENT
At the time of writing, the AAA games industry (AAA referring to premium commercial game titles)
appears divided with regards to how horror-themed games should approach the
concept of frightening their audience, the most notable divide being between action-orientated
and exploration/puzzle types. It is not possible to accurately determine a game’s
approach to horror from its broader design choices. For example, whilst Amnesia: the Dark
Descent (Frictional Games, 2010) and F.E.A.R 2: Project Origin (Monolith, 2009) both share
first-person perspectives, the former precludes a combat system altogether whilst the latter
presents the player with an array of standard FPS weapons and plentiful ammunition supplies.
One notable trend, observable when considering the development of modern horror game titles,
is the tendency amongst franchises to gradually move the focus away from generating fear
and towards increased action quotas. This progression can clearly be observed within the
Dead Space (Visceral Games), F.E.A.R (Monolith) and Resident Evil (Capcom) series, all
three of which have significantly increased the frequency of action set pieces, ammunition/health
supplies and/or enemy spawns and have also introduced local cooperative play options. The
original Resident Evil (1996) title utilised many gameplay mechanics that epitomised the
survival-horror genre, including awkward camera angles, latency-prone controls,
inaccessible combat systems and overwhelming enemies that often required the player to flee
and regroup (Boyce, 2011). However, later incarnations of the series have strived to remove
such characteristics in favour of an over-the-shoulder perspective (interestingly popularised
by the 2005 sequel, Resident Evil 4 [Giant Bomb.com, 2012]) plus smoother and more intuitive
combat. Although such changes are, in some ways, notable improvements (Resident Evil 4
received a 96/100 Metacritic score [Metacritic.com, 2005] in addition to several games
awards) they may also be responsible for attenuating the fear-factor of survival-horror by
removing aspects of the game design that reduced player coping ability and raised tension
(Boyce, 2011). Increased accessibility of modern games arguably increases appeal in many
circumstances but has unfortunate consequences in horror-themed gaming. Howell (2011)
describes schematic game design as a consistency in mechanics between games that supports
access, navigation and interaction. Howell asserts that whilst such an approach has some
merit, there remains a risk of players feeling a lack of challenge in response to essentially
being escorted through game levels and, consequently, they are less inclined to explore any
innovative aspects of a game. Instead, their experience is limited by their assumptions and
‘they will often attempt to play other similar games based on those expectations without
exploring the nuances of the individual title’. These concerns relate to modern
gaming in general; the consequences for games that wish to elicit horror are
arguably more severe, with schematic game design causing once innovative approaches to
fear elicitation to become predictable and even comical. Whilst the above issue is a genuine
concern for game designers who wish to evoke fear, it could be argued that such
characteristics were destined to be challenged and removed, leaving a requirement for
immersion-based diegetic sources to supersede game mechanics as more continuously
effective sources of gameplay fright.
Modern horror-themed games can also be classified into two sub-groups, action horror and
psychological horror. Boyce (2011) asserts that the former is a largely western horror
tradition, based primarily around shock and gore, whilst the latter is characteristically eastern
and focuses upon narrative, atmosphere and unease. This certainly has been an apt
differentiation (e.g. the Silent Hill series [Konami, Japan] in comparison to the Doom series
[id Software, USA]). However, recent game developments are beginning to blur the
distinctions as some eastern games take increasingly action-orientated approaches (e.g.
Resident Evil 6 [Capcom, 2012]) whilst some western-developed games have employed a
distinctly eastern approach to atmosphere and pacing (e.g. Alan Wake, [Remedy, 2010]).
From a game sound perspective, the above differentiation between design approaches is
crucial to sound design that intends to evoke a fearful response in the player. Whilst
psychological horror dictates steady pacing and therefore slowly evolving, unnerving
atmospheres, action horror requires shocks, disgust-laden horrific revelation and against-the-clock tension devices. This points to significant differences in sound design, with the former
approach more likely to employ dissonance, uneven rhythmic structures, uncomfortable
ambiences, slow-building rises in pitch and tempo, distorted and sharp equalisation, etc. In
contrast, an action-horror theme would be expected to require short attack bursts of sound,
high contrast volume changes and aggressively quick tempos. Whilst psychological horror
deals in sounds that connote the unknown or the uncanny, action horror sounds signify
immediate, characterised danger and disgusting imagery.
To better understand the potential for sound to elicit discrete emotional states it is crucial that
the evidence collected be obtained from a multi-faceted approach and consequent inferences
made in response to concurrent patterns of data. Qualitative psychological responses provide
detailed and contextualised information but retrospective analysis lacks temporal accuracy,
cannot reliably differentiate more minute changes in affective valence or intensity, and is
susceptible to false response, suppression or accentuation based upon participant agenda. The
opposing approach of quantitative physiological analysis cannot accurately reflect upon
circumstantial nuances, and there is currently no relevant system whose translation
process from physical measure to emotional experience is beyond question.
Quantitative psychophysiological analysis does, however, present a solution to subjective
approaches, circumventing participant agenda and obtaining precise temporal and signal
resolutions, allowing researchers to accurately observe when a change occurred and calculate
the size and statistical character of that change. Psychophysiological approaches further
provide opportunity for the development of automated, artificial emotion recognition systems
in which physiological data can be contextualised and emotional states ‘understood’ by a
software intelligence. Such progress reaches beyond our understanding of the affective
potential of game sound to the concept of emotion-biofeedback loops: automated systems
capable of amalgamating physiological responses with logged game data (from
individual events to overarching situations and surrounding virtual environments) to
accurately and reliably infer emotional states during computer gameplay and feed that
information back into the system. The game engine can then respond with changes to,
potentially, any conceivable parameter of the game, from generating a sunrise in response to
a player’s happy state to increasing the avatar’s physical action statistics (run faster, jump
higher) in response to a player’s aggressive state. If such a system were utilised within
modern games titles, increased concentration could enable the bullet-time function in F.E.A.R
(Monolith, 2006) or Max Payne (Remedy, 2001), elevated relaxation could increase your
speed and chance of success in defusing a bomb in Rainbow Six: Vegas (Ubisoft, 2006), and
an angry emotional response could unlock additional ‘renegade’ conversation options in Mass
Effect (Bioware, 2007).
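As a purely illustrative aid, the emotion-biofeedback loop described above might be sketched as follows. This is a minimal sketch under stated assumptions and not a description of any existing system: the names GameParameters, infer_state and apply_feedback, the choice of heart rate and skin conductance as inputs, and every threshold value are hypothetical placeholders introduced only to show how physiological data and logged game context could jointly drive a game parameter.

```python
from dataclasses import dataclass


@dataclass
class GameParameters:
    """Hypothetical subset of parameters a game engine might expose."""
    defuse_speed: float = 1.0        # multiplier on bomb-defusal progress
    renegade_options: bool = False   # extra conversation choices unlocked


def infer_state(heart_rate_bpm: float, skin_conductance_us: float,
                in_combat: bool) -> str:
    """Toy rule-based inference combining physiology with logged game context.

    The thresholds are illustrative placeholders, not empirical values.
    """
    aroused = heart_rate_bpm > 110.0 and skin_conductance_us > 8.0
    calm = heart_rate_bpm < 70.0 and skin_conductance_us < 3.0
    if aroused:
        # The same arousal is read differently depending on game context.
        return "angry" if in_combat else "afraid"
    if calm:
        return "relaxed"
    return "neutral"


def apply_feedback(state: str, params: GameParameters) -> GameParameters:
    """Feed the inferred state back into the (hypothetical) game parameters."""
    if state == "angry":
        params.renegade_options = True
    elif state == "relaxed":
        params.defuse_speed = 1.5
    return params


# One pass around the loop: measure, infer, adapt.
params = apply_feedback(infer_state(62.0, 2.1, in_combat=False), GameParameters())
print(params.defuse_speed)
```

The point of the sketch is the role of context: identical physiological arousal is labelled differently depending on the logged game situation, which is why the loop amalgamates both sources of data before adjusting the game.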
In addition to such game-specific mechanics, biofeedback systems facilitate the potential for
radically improved artificial intelligence systems that could empower non-player characters
(NPCs) with emotional intelligence. NPCs could react appropriately in real time to player-emotion states, enabling the player to interact with characters like never before and
simultaneously opening up a world of possibilities for new game mechanics. Biofeedback has
the potential to allow a player to: intimidate or calm a suspect during an interrogation, barter
with a passing traveller over the cost of a new plasma rifle, or convince a friendly character to
believe in and join your crusade, all by way of feeding physiological information into the
game engine that is then translated during gameplay into an appropriate NPC response.
Biofeedback emotion recognition systems also present a valuable contribution to the
development of serious games projects. Emotional intelligence in NPCs could
support person-to-person communication training (e.g. sales and marketing industries), stress
management training for high-risk and cognitively demanding tasks (emergency services,
military, etc.) or emotion training (relevant to psychotherapists, social workers, teachers, etc.)
to name a few.
Biometrics within computer video games are currently at a developmental stage, but it is
anticipated that within ten years the technology will become mainstream, the central
application being to create an adaptive system, capable of learning the preferences,
motivations, and emotional temperaments of individual players and, with that information,
creating unique and evolving gameplay experiences (McAllister, 2011). Whilst this particular
application is far from an established technology within contemporary gaming, games
production companies are currently utilising biometrics for usability and user experience
(quality control) testing (Tychsen & Canossa, 2008). Developers of current commercial-grade
biometric headsets (www.emotiv.com, www.neurosky.com) advertise gaming as a key
application of their hardware. On May 9th, 2012, several websites published rumours that
Microsoft was preparing to patent a pressure-sensitive game controller design, capable of
recognising an individual user from their unique hand pressure patterns (Greene, 2012). Patel
(2009) discusses the Wii Vitality, an adaptation of the current Wii motion controller that
detects heart rate, and Sony has been reported to have recently patented a bespoke biometric
controller capable of measuring muscular movement, heart rate and sweat secretion
(Humphries, 2011). With the three primary powerhouses of games development all revealing
openness to the technology but yet to take the proverbial plunge, there is clearly a sense that
biometrics is not currently perceived to be commercially viable. However, the production of
such devices as concepts suggests that the future potential of biometrics is great and that these
manufacturers are all very keen to be the forerunners in marketing biometric games
technology successfully.
Cultivating a better understanding of sound within a computer gameplay context is not
without substantial merit. Until recently, sound has been perceived as being of secondary
importance in virtual reality and computer video games systems (Murphy & Neff, 2011).
Alves and Roque (2011) concisely state the chief concern regarding the development of
sound in games: that it ‘remains the craft of a talented minority and the unavailability
of a public body of knowledge […] leads to a mix of alienation and best-judgement
improvisation in the broader development community’. Approaches to game sound design
vary significantly and this fragmentation, evident even within genres, is
ultimately slowing development. Alves and Roque (2011) suggest a holistic and multidisciplinary approach to game sound that could support the eventual creation of guidelines to
unify progression between developers and researchers. Murphy and Neff (2011) posit that
game sound is still commonly treated as window-dressing, accentuating the immersive
capacity of the visuals, and is not yet fully appreciated as being of equal importance in the creation of
truly immersive virtual worlds. Although specifically referring to spatial sounds, Murphy and
Neff also assert that one of the key issues with game sound is that it ‘remain[s] focussed on a
generalized listening experience’ and does not attempt to recreate the individualised listening
that we experience every day. Hug (2011) suggests that game sound, as a growing entity,
lacks independence and ‘in many ways still seems to live with its parents, Mrs Film Sound
and Mr Realism’. Hug also raises an important question regarding the ultimate aspirations of
VR and game sound, noting that a conflict exists between desires to develop a filmic, hyper-real aesthetic and intentions to emulate reality, creating a virtual soundscape indistinguishable
from the real. Hug places the representative sound design of the survival horror genre within
the former category, as is logical when considering that many of the entities and phenomena
contained within the genre exist beyond the parameters of our natural universe and, as such,
numerous sounds associated with such entities cannot originate from an emulation of reality.
Hug accepts, however, that sounds within this context are afforded acceptance by the listener
if they are appropriately representative. This requirement is complicated by the definition of
this term, as it could refer to convention as established by preceding games or comparable
motion pictures (extending even to sounds that reflect a description within a novel) or
expectation based upon prior experience of analogous physical interactions.
The motion picture Jurassic Park (Spielberg, 1993) presented a complete sonic characteristic
for the Tyrannosaurus Rex based entirely upon conjecture inferred from the sparse evidence
available. The acoustic attributes (intensity, relative pitch, timbre, ADSR [attack, decay,
sustain, release], etc.) of the beast’s mighty roar are based upon assumptions of the physical
properties of the animal, surmised from the skeleton, yet the roar is believable because it
matches expectations arising from the listeners’ own suppositions (drawn primarily from
visual analysis). This extends even to sounds that have significant (but not complete)
grounding in reality, such as the earth-shaking stomp of the dinosaur. Whilst some elements
of the physical interaction can be proven (nature of the surface being walked upon, acoustic
properties of the surrounding environment, etc.) the intensity of the sound is inferred as is the
weight of the creature, the typical velocity of its steps, the contours and shape of the foot,
etc.; all are estimates.
Survival horror games clearly do not have a fixed position between fantasy and reality. For
example, a modern-day, zombie-themed title may contain many sounds analogous to the real
world (gunshots, limbs being severed, carcasses being consumed, etc.) whilst a science fiction
horror title may contain comparatively more fantastic sounds (plasma rifle shots, tractor
beams, teleports, etc.). As a result, this genre is (arguably) particularly difficult to approach
with an established consensus on sound design, but conversely, to effectively produce such a
system would be an impressive and substantial step forward.
FEAR AND SOUND FROM A FIRST-PERSON PERSPECTIVE
The recent developments in computer video games have consequences for the first-person
shooter (FPS). Specifically, the ‘run and gun’ association with FPS games has arguably now
become a stereotype. Critically acclaimed and commercially successful first-person
perspective titles have begun to move away from the ‘mindless shooter’ generalisation by
means of intricate narrative and character development, more sophisticated problem solving,
moral choice, strategic cooperative multiplayer and role-playing elements such as weapon
upgrading, item management and open-world questing. The changes being witnessed here are
not part of a linear progression but rather a branching of the FPS into a number of sub-genres,
developing in parallel.
Although the first-person shooter game type is arguably not representative of the survival
horror genre, several modern titles exist that utilise this perspective (e.g. Darkness Within:
Pursuit of Loath Nolder [Zoetrope Interactive, 2007] and Call of Cthulhu: Dark Corners of
the Earth [Headfirst Productions, 2005]). The FPS genre has topped the global sales charts
between 2009 and 2011 (Independent.co.uk, 2009; Guardian.co.uk, 2010; Guardian.co.uk,
2011), supporting assertions that the format is both extremely popular and substantially
profitable. With relevance to the thesis, several practical considerations position the FPS
format above the various alternatives, most notably the availability of powerful graphical user
interfaces (GUIs) that enable high levels of customisation and world building tools without
requiring substantial programming ability. First-person software development kits (SDKs) are
arguably the most advanced toolsets for modern game creation, incorporating the latest
technological developments in sound, graphics, artificial intelligence systems and scripting.
FPS game modification communities are substantial, with individuals and groups
experimenting with the engines and sharing results. As a result, FPS SDKs provide a wealth
of opportunities to develop bespoke test-games tailored precisely to the desired research
methodology.
With regards to game sound, the currently available technology of the FPS engine presents a
wealth of opportunity for experimentation. At the time of writing, the two principal
competitors in this milieu are the Unreal Engine (Epic Games, 1993-2012) and the
CryEngine (Crytek, 2004-2012). Currently in their third incarnations, these game engines
both exist as free-to-use internet-based versions (regularly updated, fully downloadable SDK,
incorporating all the features and technology but providing minimal content [models, sounds,
textures, etc.]) and game-based versions (bundled SDK that includes access to all content
present in the accompanying game but cannot be updated and thus the technology can
become out-dated). Both engine SDKs provide impressive arrays of audio manipulation tools
that transform basic sounds into integrated sound events (an individual or group of sounds
integrated into a game via digital signal processing native to the engine). Sounds can be
accurately localised within 3D space (multi-channel playback compatible); attenuated in
response to physical objects within the virtual environment; looped and randomised to enable
longer ambient sounds without large numbers of varied source recordings; and
modulated in terms of pitch, volume, equalisation, etc. to enable a single repetitive source
sound to appear as many. Both engines also support audio optimisation and are compatible
with modern compression algorithms (mp2/mp3, Ogg Vorbis, etc.) allowing the sound
designer to implement complex sonic landscapes and events whilst remaining resource-efficient (Mycryengine.com, 2012; Unrealengine.com, 2012). Although both the Unreal and
CryEngine systems are comparable in many respects, the latter arguably provides greater
functionality and control due to full integration with the FMOD middleware program, a
professional-grade toolset designed as a middle-ground between digital audio workstation
(DAW) and game sound implementation tool, ultimately offering a larger number of options,
parameter settings and features than the Unreal Engine’s native audio system (fmod.org,
2012).
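The idea of modulating a single repetitive source sound so that it appears as many can be illustrated with a short, engine-agnostic sketch. It does not reproduce the API of the Unreal Engine, the CryEngine or FMOD; the names PlaybackParams and randomised_params and the default ranges (two semitones of pitch, three decibels of attenuation) are assumptions chosen purely for illustration. Each time the sound is triggered, a small random pitch offset and gain reduction are drawn and converted into the playback-rate and amplitude multipliers that an audio system would typically apply.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackParams:
    """Per-trigger variation applied to one source recording (hypothetical)."""
    pitch_semitones: float   # positive = higher, negative = lower
    gain_db: float           # 0.0 = unchanged, negative = quieter

    @property
    def playback_rate(self) -> float:
        # Twelve semitones (one octave) doubles the playback rate.
        return 2.0 ** (self.pitch_semitones / 12.0)

    @property
    def linear_gain(self) -> float:
        # Standard decibel-to-amplitude conversion.
        return 10.0 ** (self.gain_db / 20.0)


def randomised_params(rng: random.Random,
                      max_pitch_semitones: float = 2.0,
                      max_cut_db: float = 3.0) -> PlaybackParams:
    """Draw a fresh pitch and volume variation for one playback of the sound."""
    return PlaybackParams(
        pitch_semitones=rng.uniform(-max_pitch_semitones, max_pitch_semitones),
        gain_db=rng.uniform(-max_cut_db, 0.0),
    )


# Trigger the same footstep recording five times with differing parameters.
rng = random.Random(2012)
for _ in range(5):
    p = randomised_params(rng)
    print(f"rate x{p.playback_rate:.3f}, gain x{p.linear_gain:.3f}")
```

Restricting the gain variation to attenuation only is a design choice made here so that randomisation can never push a sound above its authored level.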
As a genre, the first-person shooter provides the desired tools and commercial appeal.
Furthermore, it is the nature of the first-person perspective that supports the fear and sound
contexts of this thesis. Third-person perspectives may effectively elicit fear as an empathetic
emotion (Perron, 2004), analogous to typical filmic approaches. Sound design, however,
arguably connects more directly with the audience/player as a result of the immersive quality
of sound. This is particularly true when considering extra-diegetic sound design (voice-overs,
music, hyper-real effects) but acoustic treatments of diegetic sounds are also ordinarily
intended to affect the audience directly (for example, an extended period of silence followed
by a jolting, intense sound). Whether playing a computer video game or watching a film,
sound exists within a three-dimensional environment within which the viewer is also placed,
the energy waves travelling through real spaces and reflecting off real surfaces (Grimshaw,
2007). Film sound designers commonly mix sounds to a first-person perspective, with the
audience (as opposed to a diegetic character) as the central point of reference, often with the
intention to facilitate immersion (Massey, 2004). The same is true in the majority of survival
horror games’ sound design, within which third-person visuals are paired with first-person
sound. Logically, it could be assumed that the intention here is to combine the immersive impact
of one aspect with the empathetic effect of the other. This thesis does not attempt to assess
the truth in this assumption nor compare first and third-person perspectives for fear elicitation
effectiveness and instead acts upon the assumption that immersion and presence, consolidated
by both sound and visuals, have greater potential to successfully evoke fearful responses from
players.
The potential value of better understanding fear, both in general and within a game sound
context, is discussed in chapter 3. However, the thesis is intended to present frameworks
integrating computer gameplay and sound that could be expanded upon in future research to
study discrete emotions other than fear. The foremost reasoning for selecting fear as the
particular emotional state under inspection is that fear is debatably the sole emotion, outside
of those intrinsically associated with gaming as an activity (frustration, excitement, pride,
etc.), that has defined an entire genre. Fear can also be ostensibly described as a relatively
more concrete emotional state, with a greater consensus regarding metaphor and simile.
Consequently, fear is potentially a more suitable candidate for quantitative analysis in
comparison with a more abstract emotion such as joy. This thesis retains focus consistently
upon fear-related assessment of game sound and does not explore alternative emotional states
but suggests such a route as a viable future extension.
Throughout the thesis, there is a distinct emphasis upon sound as opposed to visuals or
mechanics and the intentions of the study are to support developments exclusively in sound
effects (non-speech/non-musical) for computer video game applications. Within CVG studies
(and indeed several related disciplines) sound, particularly sound effects (as opposed to
speech and music), is underrepresented when compared to specialities such as graphics,
artificial intelligence and game mechanics (Collins, 2008) and the technology surrounding
game sound is equally overshadowed by graphics and artificial intelligence (Parker &
Heerema, 2008). This in itself is a valid reason to pursue a study in game sound but
additional motivations include improving accessibility and experience for visually impaired
audiences and capitalising upon the immersive, three-dimensional nature of sound as a
solution to the shortfalls of current visual displays. It is arguably appropriate to assume that
successful development of virtual worlds indistinguishable from our own, that elicit deep and
genuine emotional states akin to reality, cannot be attained by way of a single developmental
aspect. Ultimately, equivalent breakthroughs across all relevant areas will be required to
achieve such goals and it is therefore crucial that sound not be left behind.
OUTLINE OF THESIS STRUCTURE
The overarching structure of the thesis can be separated into three segments, the first of
which provides literature reviews relevant to the four primary areas of study: emotions
(chapters 2 and 3), acoustic ecologies (chapters 3 and 4), virtuality/game theory (chapter 4)
and psychophysiology (chapter 5). A discussion regarding the nature of web-mediated
experimentation environments is also presented as part of the methodology chapter (6),
providing additional background research in preparation for a related preliminary trial. The
second segment encompasses the methodologies (chapter 6) and results/discussion (chapter
7) of three preliminary trials that assess the fear-related affective potential of various
sounds and digital signal processing (DSP) parameters. The final segment consolidates the
conclusions raised throughout the thesis and presents a set of hypothetical frameworks
intended to reflect the conclusions of the preceding chapters and visualise the conceptual
processes that occur during sound perception and emotion processing in a computer video
game context. These frameworks are the primary contribution this thesis makes to knowledge
and go some way to supporting the primary research hypothesis by proposing a set of models
that present auditory processing and emotional experience as intrinsically linked to
physiology and psychology, revealing how influencing one area of the framework will have
significant and potentially dramatic effects upon another.
The thesis closes with an overview of future study and, in the appendices, documents raw
data obtained from the preliminary trials and presents a design document for Xpresence, a
bespoke biofeedback software program intended to manipulate game parameters in response
to user-affective data.
The next chapter commences the literature review with an overview of emotions, both within
general and CVG contexts. The importance of human emotion throughout our evolutionary
history is considered and key perceptions are documented. In preparation for further
discussion regarding automated emotion recognition, classification methods are also
evaluated alongside an account of current neuroscience perspectives and an introduction to
relevant concepts of fear.
Chapter 2: Understanding
Emotion and the Nature of Fear
in Games
INTRODUCTION
A comprehensive and precise understanding of emotion, both in terms of evocation and
recognition, could signify a substantial landmark in games advancement. Emotionality in
gaming already plays a pivotal role in the design and development of new games titles
(Lazarro, 2004). This chapter provides an introduction to the thesis by way of outlining one
of the key areas of study it builds upon. Beginning with a testament to the value of research
concerning human emotion, this chapter then presents an overview of relevant concepts
relating to general emotion theory. Later sections increase specificity, discussing human
emotion within a computer video game context; and finally, the experience of fear is outlined
to pre-empt more detailed discussion within later chapters.
THE IMPORTANCE OF HUMAN EMOTION
The original debate regarding human emotion concerned whether a
formal function of emotion existed at all. Stoic philosophers who proposed enlightenment
through logic and reason would argue that emotions arose from false judgements (Baltzly,
2010). Contesting Athenian philosophies asserted that emotions were not irrational, but were
instead a function of cognition (Konstan, 2006: p.421). Descartes (1649) believed that emotions
could be both functional and dysfunctional depending upon the context. Darwinian theories
(see Darwin, 1872) asserted the functionality of emotion: insisting upon an evolutionary basis
for emotion origin within which emotions have developed to become crucial to our ecological
adaptation and survival. Oatley, Keltner and Jenkins (2006) document several specific
emotion functions that relate to evolutionary thought: orientation (prioritisation of attention
towards imminent threats/opportunities), organisation (emotional influence upon
physiological processes to support survival-related action) and communication (supporting
social interaction and ultimately, reproduction). Contemporary research into mirror neurons
(neurons that fire both when performing an action and observing the same action in others to
promote empathetic understanding [Keysers, 2011]) supports communication as a pivotal
function of emotion.
The concept that successful interaction between humans relies heavily upon emotional
communication and understanding has been applied to human-computer interaction (HCI) by
Reeves and Nass (1996), who argue that both natural and social factors are prevalent during
interactions between man and machine. Picard (2000) posits that a lack of emotional
understanding between humans inexorably leads to frustration and the same effect is possible
in certain human-computer interactions. Emotional interactivity between software and user
has influenced the consumer sales of computer technology (Norman, 2004) and existing
research has revealed significant positive correlation between user enjoyment and emotional
excitement (specifically, suspense) within a computer video game context (Klimmt et al.,
2009).
Emotionality is now an established component of many computer video game (CVG) titles
and a growing body of research supports its importance (Freeman, 2003; Holbrook et al.,
1984). Perron (2005) asserts that emotional experiences resulting from gameplay have great
potential for improving player experience and that the more intense the emotion, the greater
the perceived experience. Perron also describes the experience of fear within a survival horror
game as a pleasure and a significant incentive to play. In addition to a positive influence upon
immersion, performance and learning (Shilling et al., 2002), emotionality has the potential to
grant players access to a wider spectrum of emotional states than can be easily achieved in
reality (Svendsen, 2008: p.74). As Wilde (1891) suggests:
“Because art does not hurt us, the tears that we shed at a play are a type of the exquisite
sterile emotions that it is the function of art to awaken.” (Oscar Wilde, 1891)
It could be asserted that the majority of the populations living in developed nations are
currently fortunate enough to exist in an environment within which many of the evolutionary
functions of our being are not fully required to ensure our survival. For us, survival denotes
something quite different from that which evolution first developed our emotions for. To an
extent, we are capable of going far beyond utilising our emotions to survive an environmental
scenario (such as fear engaging a flight response to escape a predator) to actively manipulate
our environment in a way which may remove the possibility of a threat even occurring. As
such, we are equipped with emotional functionalities that exist somewhere between under-exerted and redundant. One particular theory is that, in many ways, artistic creation has
sought to fill this void by way of engaging our most intense emotion processes and enabling
us to fully immerse ourselves within the human experience without the traditionally
associated physical risk. The latter sections of this chapter will explore this idea further in an
attempt to elucidate why we desire emotional experiences, including those perceived to be
inherently negative.
EMOTION: ORIGINS, DEFINITIONS & PERSPECTIVES
This section presents a brief history of emotion research and outlines the developing nature of
emotion theory; both in terms of what emotions are and where they originate from. More
contemporary literature is presented that makes reference to emotions and affect, specifically
within a computer video game context, to better elucidate some of the central concepts that
this thesis will build upon.
In terms of etymology, emotion can be differentiated from commonly associated
terminology: affect, feeling, temperament and mood. For the purposes of the thesis, affect is
utilised as the blanket-term, encompassing all associated vocabulary. Positioning affect as an
overarching term takes influence from Fleckenstein (1991: p.448), who describes affect as
‘an extricable element of cognition’. Watson and Clarke (1994) distinguish emotions from
moods, asserting that the former are event-related, reactionary responses to individual
triggers whilst the latter are repetitious, cyclic and better thought of as emotional
summarisations of a more prolonged period. Gray and Watson (2001) suggest that emotions
are responses to external stimuli whilst mood refers to responses to both external and internal
bodily processes. They also suggest that whilst emotions may be of a full range of intensities,
mood is characteristically of low to moderate intensity. Thayer et al. (1996: p.5) connect the
term mood to feeling, describing mood as ‘a background feeling that persists over time’.
Terms such as feeling and temperament appear largely to function as synonyms, whilst
emotion and mood help clearly differentiate immediate and prolonged affective response.
Consideration for human emotion reaches back to Greek philosophy. Aristotle’s conceptual
approach documented theories that arguably remain pertinent in modern research. Emotion
was perceived to be dependent upon our belief structures, associated with bodily action and
able to give an argument a perceived truth. Emotions were, however, argued to be capable of
generating irrational behaviour and were characterised as a component of the human
condition that one must take responsibility for. A suitable, encompassing hypothesis that
notably resonates with modern theory (discussed later within this thesis) is the concept of an
interrelationship existing between emotions, cognition and behaviour; within which emotions
may manipulate perception (or action, judgement, etc.) whilst conversely, these entities may
also manipulate emotion. Cartesian philosophy expands emotion theory further, presenting
six fundamental discrete emotions (wonder, desire, joy, love, hatred and sadness) and
perpetuating the notion of regular interplay between emotions, physiological factors and the
environment. Descartes (1649) also proclaimed the existence and involvement of a soul
within an emotion framework and argued that conscious thought cannot entirely regulate
emotion, but can suppress it to an extent. Cartesian theory posits that all emotional states
have inherent function; however, in correspondence with Greek philosophy, particular
emotional states within certain contexts are described as dysfunctional. Darwin’s work (1872)
may have been one of the pioneering scientific explorations of emotion; nevertheless, the
conclusions of Darwinian thought still largely resonate with the philosophies that preceded it.
What could arguably be depicted as Darwin’s primary contribution to emotion theory was his
notion of an evolutionary origin and development process for human emotion. Within this
framework, emotions develop steadily over time in response to changing environments that
necessitate emotional activity/processing for survival. Figure 1 (below) outlines a
comprehensive, albeit not exhaustive, list of emotion perspectives and notions that are
separated into macro (more general, overarching) and micro (more specified, component)
theories to provide a concise overview of general emotion theory. Many of the theories
detailed here are described in greater detail within Changing Minds: in Detail (Straker, 2010)
and Understanding Emotion (Oatley, Keltner & Jenkins, 2006).
Figure 1: Consolidation of relevant emotion theories

Macro Theories | Description | Citation
James-Lange theory | Emotions are in response to bodily changes that are themselves responses to the environment. | James (1884)
Cannon-Bard theory | Bodily changes are in response to emotions which are more directly tied to the environment. | Cannon (1931)
Singer-Schachter / Two-factor theory | Emotions are equivalently susceptible to physiological and cognitive factors. | Schachter & Singer (1962)
Situated Emotion theory | Emotions are not completely internalised. Emotion is the product of an organism’s investigation of its environment. | Griffiths & Scarantino (2009)
Affective Events theory | Emotions exist within a timeline and are influenced and caused by events (emotion episodes). | Weiss & Cropanzano (1996)
Appraisals theory | Emotions follow appraisal (or evaluations) of an event and are relational (relate self to object). | Arnold (1954)
Component Process theory | Emotions are broad entities that exist as a series of synchronised physiological and cognitive components. | Scherer (1988)
Dramaturgical theory | Life is a drama within which emotions support our roles and enable a dramatic performance built around strategy and rules. | Hochschild (1979)
Drive theory | Emotions are amplifiers of drive (hunger, lust, etc.) which define action choices via motivation or repulsion. | Tomkins (1962)
Evolutionary theory | Emotions originated and developed via evolutionary processes and can transcend behavioural and cognitive influence to be compared across different cultures and species. | Darwin (1872)
Neuro-physiological theory | Emotion is a valence-related mental state organised within the limbic system. | Gainotti (2000)

Micro Theories | Description | Citation
Affect Perseverance | An emotional bias/preference may continue after the initial appraisal has been invalidated. | Sherman & Kim (2002)
Durability Bias | Tendency to over-estimate the duration of a novel emotional state. | Wilson et al. (2000)
Focalism | Emotional state may cause fixed attention upon a specific entity, causing a false assumption that our current emotions are completely dependent upon that entity. | Erber & Tesser (1992)
Impact Bias | Tendency to over-estimate the overall impact of a novel emotional state. | Gilbert et al. (1992)
Mood-Congruent Judgement | Judgements are inescapably affected by emotion. | Isen et al. (1978)
Mood-Congruent Memory | The presence of emotion supports the passage of information into the long-term memory (LTM). | Eich, Macauley & Ryan (1994)
Mood-dependent Memory | Recollection of emotional memories is more likely to correspond to the emotional state being experienced. | Eich, Macauley & Ryan (1994)
Opponent-Process theory | Emotions act in opposing pairs (e.g. pleasure – pain). Experience of one will suppress the other. | Solomon (1980)
Emotional Contagion | Emotional state can spread quickly through large crowds. | Jones & Jones (1995)
Nearing the turn of the 20th century, researchers with expertise exclusively related to emotion
theory began to emerge. James (1884) subscribed to the philosophy that a clear correlation
existed between emotional experience and physiological changes, but in addition, proposed a
causal pathway through which such physiological effects preceded (and were directly
responsible for) emotional experience (the James-Lange theory of emotions). In response to
the lack of conclusiveness presented within this explanation, a reverse-theory unsurprisingly
surfaced, claiming the opposite causal pathway within which emotional experience was
responsible for physiological effects (the Cannon-Bard theory). Cannon (1931) presented a
neuroscience perspective of emotion, insisting that emotional states were dependent upon the
neural programming of the brain. Cannon (1931) and Bard (1928) presented the three-level
function system that differentiated emotional thought from reflexive and logical, placing each
on a continuum based upon each type’s level of autonomy (emotions were centralised
between the highly autonomic reflexive thought and the largely conscious logical thought).
In very broad terms, theories of emotional experience can be separated into three categories:
somatic theory (physiology rather than judgement is essential to emotion), cognitive theory
(judgement is the crucial determinant of emotion) and perceptual theory (a hybridisation of
the other two). Oatley, Keltner and Jenkins (2006) differentiate emotions from moods,
personality and affect. Here affect is presented as a blanket term encompassing emotion,
mood and personality as sub-classifications (and is used in this way throughout the thesis),
each of which is distinguished by its intensity and temporal characteristics. Emotions have
quick onset and dissipation times and typically last for a relatively short time, whereas moods
have a lower intensity, more gradual fade in/out quality and can last for weeks, months or
years. Personality is incorporated as it acts as a determiner of emotion and mood (e.g.
introversion = anxiety = fear [within a social context]); however, it is itself determined by
associated emotions and moods.
The Dramaturgical theory (Hochschild, 1979) positions emotions as enablers of socio-cultural
roles (for example, love provides a general script for the role of protector within a
relationship between parent and child). Isen (1987) highlighted the socio-cultural aspect of
emotional function, positing that emotional experience within one situation can affect social
judgements and behaviour within a subsequent situation; an ongoing looping mechanism that
will be regularly referenced within the thesis. Oatley, Keltner and Jenkins (2006) provide
additional insight into a culture-based understanding of emotion, presenting a complication in
western cultures: the need to reconcile the perception of inappropriate behaviour as
emotional with the romanticist perspective that regards emotional behaviour as the true
expression of individuality and humanity.
Another distinction separates western and eastern perspectives into independent (based upon
individualism, in which people are viewed as autonomous and unique) and interdependent
(relating to collectivism; sense of shared goals and social cohesion) forms respectively.
Within these perspectives different emotions, usually contextually bound, are treated
differently. Oatley, Keltner and Jenkins (2006) present anger as an example, stating that an
interdependent culture may look upon anger as unacceptable between relations but an
independent culture is more likely to accept this emotion as an assertion of authority or
independence, a phenomenon that has been observed in relevant research (Miyake et al.,
1986). Benedict (1946) connects the notion of values to emotion, stating that in western
society a value such as sincerity relates to behaviour that genuinely reflects emotional state;
whilst eastern society perceives sincerity more as behaviour that satisfies a social duty
without emotional conflict. Just as different cultural perspectives can alter the way in which
emotions are experienced within a social environment, certain emotions have been shown to
not exist within certain cultures due to the social value structure of that society. Other
cultures claim ownership of an emotional state that is unique to them, such as the German
term Schadenfreude (shameful joy: pleasure derived from the suffering of others) or the
Bengali term Obhiman (sorrow caused by the insensitivity of a loved one [Russell, 1991]).
Hyper- and hypo-cognition of emotions refer to mechanisms by which an emotion can be
given increased or decreased priority via cultural developments and social discussion.
EMOTION CLASSIFICATION SYSTEMS
Classification of human emotion is another ongoing area of study that can be separated into
two primary approaches - discrete and dimensional. The six emotional states model of
Descartian theory exemplifies early discrete classification and has since been presented in
many distinct incarnations. Plutchik (2002) proposed a wheel of emotions (figure 2), a
circular arrangement of eight primary bipolar emotional states, each of which is further
broken into sub-components as determined by the intensity of the emotional experience.
Figure 2: The Wheel of Emotions, by Robert Plutchik (2002)
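The arrangement of the wheel can be sketched as a small lookup structure. The following Python
sketch is illustrative only: the opposite pairings and intensity labels are taken from
Plutchik's published wheel rather than from the description above, and the names OPPOSITES,
INTENSITIES, opposite and label are hypothetical.

```python
# Illustrative sketch of Plutchik's wheel: eight primary emotions arranged
# as four bipolar pairs, each with three intensity levels.

# Each primary emotion and the emotion opposite it on the wheel.
OPPOSITES = {
    "joy": "sadness",
    "trust": "disgust",
    "fear": "anger",
    "surprise": "anticipation",
}
# Make the lookup symmetric (sadness -> joy, and so on).
OPPOSITES.update({b: a for a, b in OPPOSITES.items()})

# Labels ordered from mild, through the primary term, to intense.
INTENSITIES = {
    "joy": ("serenity", "joy", "ecstasy"),
    "trust": ("acceptance", "trust", "admiration"),
    "fear": ("apprehension", "fear", "terror"),
    "surprise": ("distraction", "surprise", "amazement"),
    "sadness": ("pensiveness", "sadness", "grief"),
    "disgust": ("boredom", "disgust", "loathing"),
    "anger": ("annoyance", "anger", "rage"),
    "anticipation": ("interest", "anticipation", "vigilance"),
}


def opposite(emotion: str) -> str:
    """Return the emotion at the opposite pole of the wheel."""
    return OPPOSITES[emotion]


def label(emotion: str, level: int) -> str:
    """Return the label for a primary emotion at level 0 (mild) to 2 (intense)."""
    return INTENSITIES[emotion][level]
```

For instance, label("fear", 2) yields the high-intensity term for fear, whilst opposite("fear")
returns the primary emotion at the other pole of that axis.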
Differentiation between emotions has drawn upon alternative origin perspectives; basic
(determined by biology and evolution, universal to all humans) and complex (idiosyncratic,
socio-cultural) emotions relate to the theories documented earlier within this chapter. The
alternative means by which emotions are believed to be triggered have also been utilised to
differentiate emotions into discrete categories. States evoked by external stimuli within the
environment have been identified as separate from those caused by internal physiological
states (hunger, pain, fatigue, etc.), categorised respectively as Classical and Homeostatic
(also referred to as Primordial) emotions (Craig, 2008; Denton, 2006). Prior to Descartian
classification, the 1st Century BC Chinese encyclopaedist Li Chi (referenced by Russell,
1991: 426) identified joy, anger, sadness, fear, love, disliking and liking. A critic of
Descartian dualism, Spinoza (1677) challenged Descartes‘ six emotions with a simplified
paradigm that included only pleasure, pain and desire. In contrast to the relatively minimalist
classical categorisations of human emotion, contemporary classification largely favours a
more comprehensive list with many terms suggesting a leaning towards high-specificity,
socio-culturally driven emotions that include: separation distress, aversive self-consciousness
(Prinz, 2004) and pride in achievement (Ekman, 1999).
The distinct lack of cohesion between contemporary emotion classification systems may
suggest that discrete classification of emotions is an inappropriate approach to emotion
recognition and feedback systems as the significant inconsistencies between contemporary
emotion class lists may restrict such recognition systems to being imprecise or potentially,
completely arbitrary. Dimensional classification systems provide a practical alternative and
typically plot emotional data along two- or three-dimensional axes, using related factors
that are less difficult to ascertain from physiological measurements (including: heart
rate, respiration rate, galvanic skin response, electromyography and electroencephalography)
and are commonly interpreted as arousal and valence along a two-dimensional plane. Dimensional
classification arguably reflects the lateralisation (specialisation of function between the brain
hemispheres) effects documented within emotion theory. Lateralisation theory has presented
a number of assertions relevant to classification development, including that the right
hemisphere is superior at recognition of emotional expression (Strauss and Moscovitch,
1981) and that emotions of positive valence correlate to greater left hemispherical activity,
whilst negative valence corresponds to greater right hemispherical activity (Davidson et al.,
2003). The exact nature of the relationship between lateralisation effects and emotions is yet
to be fully uncovered, and it
has been suggested that emotion processing itself is largely dominant within the right
hemisphere (Tucker, 1992), which if true, limits the practical application of the above
theories.
Dimensional classification can itself be sub-categorised into several commonly utilised
forms: the circumplex model (see Russell, 1980), the ‘consensual’ positive activation /
negative activation model (PANA, see Watson & Tellegen, 1985) and the vector model
(Bradley et al., 1992). Both the vector and circumplex models integrate valence (centralised
at zero, with the potential for positive and negative measures) and arousal (typically
commencing at ‘1’ and increasing in integers). The difference is a disagreement with regards
to the possibility of neutral valence / high intensity emotional states. Within the circumplex
model, such an entity is possible (creating a characteristic circular ‘O’ shape); whereas within
the vector model it is not (creating a ‘<’ shape).
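The disagreement between the two models can be illustrated with a minimal Python sketch. It is
not drawn from Russell (1980) or Bradley et al. (1992): the cut-off values HIGH_AROUSAL and
NEUTRAL_BAND, and the names AffectPoint and admissible, are assumptions introduced purely for
illustration.

```python
from dataclasses import dataclass

# Hypothetical cut-offs, chosen only to make the example concrete.
HIGH_AROUSAL = 7     # arousal at or above this counts as "high intensity"
NEUTRAL_BAND = 1.0   # |valence| below this counts as "neutral"


@dataclass(frozen=True)
class AffectPoint:
    valence: float  # centred at zero; negative and positive values allowed
    arousal: int    # commences at 1 and increases in integers


def is_neutral_high_intensity(p: AffectPoint) -> bool:
    """True for the contested case: neutral valence with high arousal."""
    return p.arousal >= HIGH_AROUSAL and abs(p.valence) < NEUTRAL_BAND


def admissible(p: AffectPoint, model: str) -> bool:
    """Report whether a model allows the state described by p."""
    if p.arousal < 1:
        raise ValueError("arousal commences at 1")
    if model == "circumplex":
        return True  # neutral valence / high intensity states are possible
    if model == "vector":
        return not is_neutral_high_intensity(p)  # such states are excluded
    raise ValueError(f"unknown model: {model}")
```

Under these assumptions, AffectPoint(valence=0.0, arousal=9) is admissible within the
circumplex model but not within the vector model, whilst a strongly valenced point at the same
arousal is admissible within both.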
Rubin and Talarico (2009) describe the PANA model as ‘a 45 degree rotation of the
circumplex model’, in which positive and negative activations are ‘anchored’ at opposite
ends of the model. Li et al. (2010: p.146) argue that although the PANA model is an
improvement, all incarnations of the circumplex model lack ‘acceptable theoretical and
psychometric integrity’, as the positioning of discrete emotion labels along the dimensional
plane reflects how lay people visualise emotion structure in their minds rather than an
informed, evidenced framework. Continuing to explore this problem, Li et al. discuss the
bifurcation model, a relatively modern approach born from complexity theory (as opposed to
the reductionist approaches that are more commonly employed as scaffolds of emotion
theory) that asserts a dynamic nature of discrete emotions and a fluid, self-organising system
in which ‘emotions aggregate (or disaggregate) depending on different environmental
circumstances’ and ultimately strive to reach and maintain equilibrium (Li et al., 2010:
p.148). The potential for practical application of these contrasting approaches is arguably
dependent upon the specific aims and characteristics of the project being undertaken.
Classification within emotion research is an area of continuing development and is not
exhaustively documented within this chapter. Later chapters further explore and evaluate
classification in greater detail, within the context of computer video game-based emotion
feedback loop systems.
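The description of the PANA model as a 45 degree rotation of the circumplex model can be made
concrete with a short Python sketch. Two assumptions are made here that are not stated in the
text: arousal is re-centred at zero before rotation, and the positive activation axis is taken
to lie between pleasant valence and high arousal (with negative activation between unpleasant
valence and high arousal). The function name to_pana is hypothetical.

```python
import math
from typing import Tuple


def to_pana(valence: float, arousal: float) -> Tuple[float, float]:
    """Project a zero-centred (valence, arousal) point onto axes rotated by 45 degrees.

    Returns (positive_activation, negative_activation).
    """
    c = math.cos(math.pi / 4)  # equals sin(pi / 4), roughly 0.7071
    positive_activation = c * (arousal + valence)
    negative_activation = c * (arousal - valence)
    return positive_activation, negative_activation
```

Because the transformation is a pure rotation, the distance of a point from the origin is
unchanged; only the axes against which it is read differ. A pleasant, highly aroused state
therefore loads entirely on positive activation, whilst an unpleasant, highly aroused state
loads entirely on negative activation.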
The debate between basic and constructionist accounts of emotion is of particular relevance
to this thesis as it concerns our potential to recognise/interpret emotions by way of machine
code. The basic account of emotion asserts that the function, meaning and, therefore,
expression of each emotion are quantifiably distinct. As such, recognition systems
would not require contextual knowledge as a unique neural and physiological pattern would
elucidate distinction between the various emotional states (Ekman, 1992). Constructionist
theory, however, integrates semantic meaning and conceptual knowledge and argues that
such factors would blur the distinctions between emotional states if only brain activity and
physiology were observed (Barrett, 2006). In development of a software-based emotion
recognition system, it must be decided whether contextual information is necessary to enable
successful categorisation and, if so, by what means such information will be acquired. This
particular question is explored in more detail later within the thesis.
THE NEUROSCIENCE OF EMOTION
“[I]t is becoming increasingly accepted that emotions comprise a significant component of
rational thinking and human behaviour” (DeGroot & Broekens, 2003)
It is the term ‘rational’ in the above quote that points to the lines between emotion and
cognition, instinct and logic, and feeling and reason becoming increasingly blurred, and
suggests that a comprehensive understanding of human thought processes must include
emotion as an integral component; as Perron (2004) states: ‘cognition and emotions work
together’. The previous section approached emotion theory from a range of perspectives but
with the focus being on theoretical models and philosophical notions. Within this section, the
biological and chemical processes that constitute affective neuroscience and embody
emotional experience are discussed. Our perception of emotions within contemporary society
continues to connect emotions to human biology, for example, the symbolic association
between emotion and the heart still permeates recent music and art. VanScoy (2006) asserts
that emotions and human biology share a steadfast connection: ‘emotions are […] whole-body
states that activate hormonal responses, the cardiovascular system and other systemic
reactions’. Winkle (2000) insists upon a significant connection between emotions and body
chemistry, positing that an enduring suppression of anger inexorably leads to anxiety and
depression by way of toxicosis. Particular chemical secretions have been associated with
specific emotional states, including: cholecystokinin with fear/panic attacks (Bradwejn,
1993), dopamine with desire/motivation (Rolls, 2000) and serotonin with aggression
(Crockett et al., 2008). Research has questioned this association, however, documenting
several occurrences of individuals receiving damage to emotion-related physiology yet
maintaining an accurate mental understanding of emotion (see Oatley,
Keltner & Jenkins, 2006). Popular beliefs allude to the potential for controlling our emotional
state by way of conscious physiology control (closing the eyes, slowing breathing, changing
stance, etc.); however, some of these assumptions have been contested (Conrad et al., 2007).
Whilst there may be a consensus that the primary organ responsible for emotion processing is
the brain, debate remains as to which particular structures within the brain support emotions,
how they interrelate and the role of the nervous system within the overall process. The
collection of brain structures known as the limbic system is commonly believed to house the
central processes of emotional experience (MacLean, 1952; Panksepp, 2005) although
developing research increasingly advocates the involvement of high-level cortical structures
in emotion processing (Bechara, Damasio & Damasio, 2000; Cardinal et al., 2002; Maddock,
1999). Rolls (2000) postulates that connections between the pre-frontal cortex and the
amygdala reveal a cortical association with emotional valence and motivation. One
explanation for this association, which supports a more fragmented perspective between
reason and emotion, is that the higher cortical functions observed during emotional
experience are acting primarily as suppressors (Levesque et al., 2003).
LeDoux (1995) advocates the amygdala as the primary neural structure involved with
emotion processing, and this particular structure has also been described as playing ‘a crucial
role in the development and expression of conditioned fear’ (Davis et al., 1992: p.255).
Anderson (2002) contests the amygdala’s prioritisation, presenting evidence that damage to
the structure did not impact upon patients’ expression of emotion, nor their ability to
experience varying emotional valence. The notions that specific neural structures are of
elevated importance in emotion processing do not suggest that emotional experience is
contained within individual structures, and the ongoing debate regarding the prioritisation of
emotion-related structures suggests that emotional experience is a hugely complex system;
potentially incorporating every element of the human body, from the largest organ systems to
the smallest chemical components and the billions of individual interactions that occur
between them.
Figure 3: Neural structures commonly associated with emotion processing (see Dalgleish, 2004)

Structure | Role | Relevant Research
Amygdala | Detection of emotional significance | LeDoux, 1995
Anterior Cingulate | Associated with motivational behaviour / Supports subjective awareness of emotions | Jackson et al., 2006
Cerebellum | Emotion regulation | Sell et al., 1999
Hippocampus | Inhibition, memory and space / associated with anxiety | Grey & McNaughton, 2000
Hypothalamus | Translates electrical impulses of the nervous system and hormonal secretions of the endocrine system | Papez, 1937
Insula Cortex | Embodies emotional experience in physiological changes via connection to various structures that regulate autonomic bodily functions | Marley, 2008
Prefrontal Cortex | Apply executive function (analysis, introspection) / connection to ventromedial prefrontal cortex | Price, 1999
Ventral Striatum | Associated with several limbic structures / association to emotion derived from reward | Gregorious-Pippas, Tobler & Schultz, 2009
Dalgleish (2004) presents an in-depth account of the neural structures involved in emotion
processing, a summary of which (incorporating references to relevant literature) is presented
in Figure 3. Whilst traditional cognitive neuroscience has characteristically omitted
emotion from models of thought processing (Cacioppo & Gardner, 1999), distinguishing
affective neuroscience (the study of emotion’s neural routines) as a separate field,
contemporary research has revealed overlap between these areas of study (Davidson, 2000),
suggesting that emotion and cognition are best observed as interrelating elements of an
inclusive framework. LeDoux et al. (2004) support this notion in an account of developing
lists of neural structures believed to be associated with emotion. These lists reveal an initial
separation of limbic and cortical structures (governing emotional and rational thought
respectively), and the inclusion of various cortical structures into emotion-function lists
reveals a gradual blurring of the initial distinctions.
One of the most significant debates surrounding affective neuroscience is that of autonomic
specificity. To provide a brief informative background to elucidate this issue, the nervous
system can be separated into two classifications: the central nervous system (CNS) and the
peripheral nervous system (PNS). The former refers to the bone-housed structures within the
brain, retina and spinal column; whilst the latter denotes neural networks that exist outside of
the central structure and which connect the CNS to the various organs of the human body.
The PNS houses two further sub-categorisations: the somatic nervous system (SoNS) and the
autonomic nervous system (ANS). Whilst the SoNS characteristically represents voluntary
movements, the ANS refers to the neural pathways between the brain (thought) and body
(action) that typically reflect subconscious behaviours. The final (for the purposes of this
outline) sub-categorisation distinguishes the sympathetic nervous system (SNS) from the
parasympathetic nervous system (PSNS); the former referring to excitatory processes
(increased heart-rate, deeper respiration, etc.), whereas the latter represents inhibitory
processes that suppress the effects of the SNS in order to maintain an internal equilibrium, or
homeostasis (Cannon, 1926). The SNS has been associated with the fight or flight response,
whilst the PSNS has been described as critical to social behaviours (facial expression,
vocalisation) via the ventral vagal complex (Porges, 1998). For a comprehensive study of the
nervous system, see
Peretto (1992).
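The sub-categorisations outlined above form a simple hierarchy, which can be sketched as a
nested structure in Python. The sketch is illustrative only; the names NERVOUS_SYSTEM and
path_to are hypothetical and the tree contains only the divisions named in this outline.

```python
# Nested dictionary mirroring the classification described in the text:
# each key is a division and each value holds its sub-divisions.
NERVOUS_SYSTEM = {
    "CNS": {},                # central nervous system
    "PNS": {                  # peripheral nervous system
        "SoNS": {},           # somatic nervous system (voluntary movement)
        "ANS": {              # autonomic nervous system
            "SNS": {},        # sympathetic (excitatory processes)
            "PSNS": {},       # parasympathetic (inhibitory processes)
        },
    },
}


def path_to(target, tree=NERVOUS_SYSTEM, trail=()):
    """Return the chain of divisions leading to target, or None if absent."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == target:
            return here
        found = path_to(target, children, here)
        if found is not None:
            return found
    return None
```

Calling path_to("PSNS") walks the tree depth-first and returns the chain of divisions from the
peripheral nervous system down to the parasympathetic branch.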
The significant interrelating connections that exist between the brain, nervous system and
bodily organs have generated opportunity for a great flurry of continuing empirical and
theoretical research that attempts to identify and quantify causal associations between the
many elements of this complex system. The autonomic specificity debate exists as a primary
interest within this field. Levenson (2003: p.212) refers to autonomic specificity as ‘the
notion that emotions can be distinguished in terms of their associated patterns of autonomic
nervous system activity’, a perspective that relates to the basic account of emotion detailed
earlier within this chapter. There is significant support from relatively recent research that
emotional specificity exists (Levenson, 1992; Witvliet & Vrana, 1995). However, more
recent literature has suggested that although the observable autonomic effects reveal
differences between emotions, there is little to no discernible pattern that might support an
emotion recognition system (Christie & Friedman, 2004). Levenson (2003) appears to
support this assertion, describing the major criticisms of autonomic specificity as a priori,
primarily in that they critique opposing data but do not present their own. However,
Levenson does agree that patterns of autonomic specificity may not enable emotion
recognition, as differences are ‘likely to be “prototypical” in nature, with particular
occurrences of a given emotion showing variation around […] central tendencies’.
EMOTIONS & COMPUTER VIDEO GAMEPLAY
The focus of the thesis is upon emotional experience with relevance to CVG sound. Whilst
later chapters will address this context in greater detail, this section provides an outline of
emotional processes that exist within a computer video game playing experience. The ability
of computer video games to evoke emotions in players is well documented, particularly in
research connecting gameplay to aggression (Winkel, Novak & Hopson, 1987). The
interactive nature of a computer video game distinguishes it from passive recreational media
and therefore dictates unique emotion characteristics. Games require player-action to
progress and the emotional state of the player can be described as a facilitator of that action
(DeGroot & Broekens, 2003). Certain research has consequently argued that game
development should centre around the player, not the game (Ermi, 2005). Several terms
relevant to both positive and negative emotional experience have been closely associated with
gaming, including interest, enjoyment/fun, anger and frustration (Perron, 2004).
Perron (2004) presents several emotion concepts that are largely unique to the CVG medium,
though they share characteristics with, and are based upon, film theory. Perron refers to Tan
(1996) in describing the F-emotion (fiction emotion: empathetic states also referred to as
witness emotions because they arise from the individual’s observation of the fictional
environment/scenario) and the A-emotion (artefact emotion: derived from appreciation of the
artistry/craft that built the observed fiction, felt during brief realisations that the film/game is
not real). In a later paper, Tan (2000) also proposes the R-emotion (representative emotion:
denoting states evoked from action/interaction within a fictional world). Perron shifts the
focus exclusively onto CVG, disclosing the G-emotion (gameplay emotion), emotions that
arise from a hybridisation of fiction (narrative) and representative (action) emotion. For
example, when undertaking the role of Gordon Freeman in Half-Life 2 (Valve, 2004), we may
witness the dystopia within which we are placed as we read a newspaper headline entitled:
‘Earth surrenders’. We may also simultaneously reflect on the fact that this environment is a
direct consequence of our own prior actions and that it is our responsibility to produce further
action to set things right.
Arguably, action and narrative are intrinsically tied within story-driven computer video
games and the G-emotion acknowledges this to present an emotion-system unique to
computer gameplay. In addition, Perron (2004) also differentiates between circumstance-caused
(events outside of a player’s and NPC’s control), other-caused (direct
responses to NPC action) and self-caused events (direct responses to player action), arguing
that the same circumstance could produce a different discrete emotional state depending on
the event type. For example, in Left 4 Dead (Valve, 2008), receiving a first-aid pack from a
fellow survivor may evoke liking (other-caused) whilst presenting the first-aid to that
survivor may evoke pride (self-caused) and finding additional aid during the level when you
are close to death may evoke joy (circumstance-caused).
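The first-aid example above can be expressed as a minimal Python sketch in which the event type,
rather than the object involved, selects the emotion. The sketch simply encodes the three cases
given in the example; the names EventCause, FIRST_AID_EMOTIONS and likely_emotion are
hypothetical and no claim is made that Perron's framework reduces to a fixed lookup.

```python
from enum import Enum


class EventCause(Enum):
    """Perron's three event types."""
    CIRCUMSTANCE = "circumstance-caused"  # outside player and NPC control
    OTHER = "other-caused"                # response to NPC action
    SELF = "self-caused"                  # response to player action


# The same object (a first-aid pack) mapped to a different discrete
# emotion depending on who or what caused the event.
FIRST_AID_EMOTIONS = {
    EventCause.OTHER: "liking",        # receiving aid from a fellow survivor
    EventCause.SELF: "pride",          # presenting aid to that survivor
    EventCause.CIRCUMSTANCE: "joy",    # finding aid when close to death
}


def likely_emotion(cause: EventCause) -> str:
    """Return the emotion the example associates with an event type."""
    return FIRST_AID_EMOTIONS[cause]
```

A fuller model would presumably take game context into account as well as the event type, but
the lookup captures the point that identical stimuli need not imply identical emotional states.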
One of the key characteristics of a computer video game is the lack of a predetermined
sequence of events and the very essence of the word ‘game’ necessitates that it must be
possible to both win and lose. Therefore, it is highly unlikely that a player would successfully
overcome every obstacle within a game on the first attempt and instead, would likely replay
several sections of gameplay repeatedly before progression. Consequently, developers cannot
contemplate player emotional states solely upon initial exposure to the stimuli they have
embedded, but instead must consider how a varied number of repetitions may impact upon
player emotions over time. The integral nature of this characteristic relates to the positioning
of challenge as a game-relevant emotional state. Ermi and Mäyrä (2005) describe challenge
as consisting of cognitive load and pacing, arguing that ‘quality of gameplay is good when
these challenges are in balance with each other […] and the abilities of the player’. Klimmt
(2003) argues that the connection between challenge and emotion is the shifting between
positive and negative states, in which challenge difficulty will result in frustration (and
possibly anger and aggression) that, in turn, will enable success to evoke intense euphoria and
pride (as a result of overcoming an obstacle that the prior negative emotional state caused to
be perceived as significantly difficult) in tandem with a feeling of relief as the stimuli that
once caused intense negative emotional states are relinquished. Perron (2004) reflects this
notion with the concept of motive-consistency and motive-inconsistency, within which
Perron asserts that the two opposites are connected and that negative emotions, derived from
challenge (frustration, anger, contempt, sadness), facilitate future-based positive emotions
when the obstacles are overcome.
Within a survival horror game context in particular, challenge is often a frustration for
developers, as a core element of fear is the unknown and repeated experience of the same
event quickly suppresses the fear-content. Challenge also encourages high quantities of
action, dictating larger numbers of enemies which also leads to repeated exposure even if the
player does not repeat gameplay sections. The Dead Space (Visceral Games, 2008) series
exemplifies the pacing problem of challenge: it arguably presents a genuinely frightening
opening but (despite outstanding atmosphere and artistic quality) the intensity diminishes
as repeated exposure to enemies quickly enables the player to memorise the
characteristics of the enemy (reducing shock and improving coping ability), create effective
strategies to defeat them and even predict when they will ambush. The contextual
dependency of emotions means that the same action/activity/stimulus can have significantly
different emotional effects if the situation is varied (Blythe & Hassenzahl, 2003). In this
scenario, designers are essentially damned however they approach the problem, as slow
pacing and low challenge may facilitate a more consistent and intense fearful experience, yet
a linear and unrewarding gameplay experience will be the inevitable side-effect.
Alongside challenge, immersion stands as a crucial and well-documented gameplay
experience that has been directly associated with emotion (Brown & Cairns, 2004; Ermi,
2005; Nacke & Lindley, 2008). Whilst this is not a precise definition, immersion has been referred
to as the ‘essence of games’ (Radford, 2000). Brown and Cairns (2004) provide a
comprehensive review of immersion research, citing depth, realism and atmosphere as key
components. Their three-level theory provides a logical characterisation of
immersion, separating engagement (interaction with an accessible entity that the user has an
interest in), engrossment (‘when game features combine in such a way that the gamers’
emotions are directly affected by the game’ [Brown & Cairns, 2004: p.1299]) and total
immersion. Winograd and Flores (1986) refer to Heidegger’s ‘being ready to hand’, asserting
that the invisibility (a system that is easy to use off-line without conscious thought) of the
control interface enables immersion by allowing the user to focus their attention entirely upon
the action taking place on the screen, an argument that may have been ignored in the
development of modern gesture-based control systems such as the Kinect (Microsoft, 2010).
Ermi and Mäyrä (2005) separated immersion into three discrete categories: sensory (relating
to realism and technical quality [graphics, sounds etc.]), challenge-based (attention is
focussed upon an objective, with cognitive faculties directed towards completing that
objective) and imaginative immersion. Although few games would focus entirely upon a
single form, the primary mode of immersion featured within a game is likely to depend upon
its genre and audience; for example, Tetris (Pajitnov, 1984) focuses upon challenge-based
immersion, relying on the addictiveness of the gameplay and desire to achieve ever-higher
scores whereas Myst (Cyan, 1993) targets sensory immersion with a focus upon highly
detailed first person 3D game worlds that encourage exploration and sensory stimulation.
Total immersion, as described by Brown and Cairns (2004), relates to presence (in their
description, an experience that gives the player a sense of detachment from reality), which is
in itself an affective gameplay state that relates to emotionality. Nacke and Lindley (2008)
describe presence as a combination of sensory and challenge-based immersion, coupled with
‘feelings of empathy and atmosphere’. Lombard and Ditton (1997) support this definition,
describing presence as ‘a psychological experience of non-mediation, i.e. the sense of being
in a world generated by the computer instead of just using a computer’. In a study conducted
by Ermi and Mäyrä (2005), parental opinion regarding their child’s gameplay experience
revealed a possible connection between immersion and emotion when parents displayed
concern that their children were too immersed within their games. The parents’ primary
reason for this concern was that they perceived their children to be more engaged emotionally
with activities within the game-world as opposed to reality.
Another emotion-relevant gameplay experience is flow, a theory posited by Csikszentmihalyi
(1990) that assimilates both challenge and immersion, stating that a successful balance of
challenge and skill causes a player to become engrossed within the activity. Here, flow is
essentially a temporal measurement of enjoyment (as determined by immersion and
challenge) that can be broken if gameplay is too difficult or if there is a game event that
disrupts immersion (such as a jarring diegetic shift within Fallout 3 [Bethesda, 2008] in
which your player statistics, which so far had been set using diegetic systems, can be reset
from an extra-diegetic interface window). This definition is not uncontested and a significant
number of contrary theories of flow have been presented within recent literature (Novak et
al., 2000). Nevertheless, flow has been acknowledged as a lay-term that many game players
have a fluent concept of (Nacke & Lindley, 2008). McMahan (2003) implicitly connects the
concept of flow to immersion when stating that one of the three conditions required to
generate immersion within gameplay is a consistent game world (alongside matching of
player expectations and meaningful activities provided for the player), suggesting that a break
in flow will interrupt immersion. Ellis et al. (1994) presented a four-channel model of flow
that is frequently used within the context of computer gameplay experience (Nacke &
Lindley, 2008), in which flow is a trade-off of boredom and anxiety that requires a high but
balanced level of challenge and skill. According to the model, a balanced but low level output
of these measures results in an apathetic response. Kivikangas (2006) revealed the lack of
significant correlation between flow and psychophysiological measures of basic emotions,
suggesting that flow could be better perceived as an emotion in its own right.
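To make the four-channel model concrete, the following minimal Python sketch classifies a
pair of challenge and skill scores into one of the four channels. The 0 to 1 scale, the 0.5
midpoint and all class and function names are illustrative assumptions rather than values or
terms taken from the cited literature. The two imbalanced channels follow the usual reading
of the model, in which challenge exceeding skill yields anxiety and skill exceeding challenge
yields boredom.

```python
from enum import Enum


class Channel(Enum):
    FLOW = "flow"
    ANXIETY = "anxiety"
    BOREDOM = "boredom"
    APATHY = "apathy"


def classify_channel(challenge: float, skill: float, midpoint: float = 0.5) -> Channel:
    """Map normalised challenge and skill scores (0.0 to 1.0) to a channel.

    The midpoint threshold is an assumption made for illustration only.
    """
    high_challenge = challenge >= midpoint
    high_skill = skill >= midpoint
    if high_challenge and high_skill:
        return Channel.FLOW      # high, balanced challenge and skill
    if high_challenge:
        return Channel.ANXIETY   # challenge outstrips skill
    if high_skill:
        return Channel.BOREDOM   # skill outstrips challenge
    return Channel.APATHY        # balanced but low challenge and skill
```

For example, classify_channel(0.9, 0.8) falls in the flow channel, whereas
classify_channel(0.2, 0.1) falls in the apathy channel. A hard threshold is the simplest
possible choice here; a practical system would more plausibly use graded boundaries.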
Marsella and Gratch (2002) distinguish two modes of emotion modelling associated with
computer gameplay: communication-driven modelling refers to developing character
believability by way of improved non-player character (NPC) emotional expression; whilst
appraisal-driven modelling shares characteristics with emotion recognition, describing the
development of systems that generate an NPC emotion state based upon their pre-set
beliefs/desires and their current circumstances. Academic research has repeatedly attempted
to elucidate computer game-playing experience by way of a dimensional model. Pine and
Gilmore (1999) presented a two-factor model placing participation (the level of interactivity,
ranging from passive to active) and connection (relating to immersion, ranging from
absorbing to immersive) along respective dimensional axes. From this model, four broad
classifications of game playing experience are revealed: entertainment (absorption and
passive participation), educational (absorption and active participation), aesthetic (immersion
and passive participation) and escapist (immersion and active participation). Frome (2007)
presents eight classes of gameplay emotions based upon the source of the emotion and the
role of the player. Ecological sources refer to ‘when a player responds to a videogame in the
same way [they] respond to the real world’ (e.g. actually ducking your head whilst moving
your avatar through an enclosed space), whilst narrative, game and artefact emotions relate to
terms documented earlier within this chapter. The observer-participant and actor-participant
roles relate to the witness-action distinctions documented earlier and combine with the source
types to generate the eight classes.
Figure 4: Frome’s (2007) eight classes of gameplay emotion

                        Source of Emotion
Audience Role           Ecological            Narrative             Game          Artefact
Observer-participant    Sensory environment   Narrative situations  Game events   Design
Actor-participant       Proprioception        Roleplay              Gameplay      Artistry
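The classification in Figure 4 can be read as a simple two-key lookup, in which a source of
emotion and an audience role jointly select one of the eight classes. The sketch below
encodes the figure in Python. The identifiers and the lower-case labels are assumptions made
for illustration and are not drawn from Frome (2007).

```python
from itertools import product

SOURCES = ("ecological", "narrative", "game", "artefact")
ROLES = ("observer-participant", "actor-participant")

# Each (source, role) pair selects one class, as laid out in Figure 4.
EMOTION_CLASSES = {
    ("ecological", "observer-participant"): "sensory environment",
    ("narrative", "observer-participant"): "narrative situations",
    ("game", "observer-participant"): "game events",
    ("artefact", "observer-participant"): "design",
    ("ecological", "actor-participant"): "proprioception",
    ("narrative", "actor-participant"): "roleplay",
    ("game", "actor-participant"): "gameplay",
    ("artefact", "actor-participant"): "artistry",
}


def emotion_class(source: str, role: str) -> str:
    """Return the class of gameplay emotion for a source and audience role."""
    return EMOTION_CLASSES[(source.lower(), role.lower())]
```

Under this encoding, emotion_class("Game", "Actor-participant") returns "gameplay".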
Frome’s (2007) classification system is arguably comprehensive and rational, but the
presence of so many conflicting concepts regarding gameplay emotions confirms that our
understanding suffers from the same lack of precision and clarity that plagues emotion theory
as a whole. There is nonetheless a significant value in these concepts with relevance to
development of emotion-recognition systems within games, specifically in terms of
establishing context to enable vague dimensional physiological data to be processed through
game event/situational filters to reveal accurate discrete emotional states.
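The idea of passing dimensional physiological data through a game event filter can be
sketched as follows. This is a hypothetical illustration only: the arousal and valence scales,
the 0.6 threshold, the context labels and the mapping from context to discrete emotion are
all assumptions introduced here and are not results reported in this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectSample:
    arousal: float  # assumed scale: 0.0 (calm) to 1.0 (highly aroused)
    valence: float  # assumed scale: -1.0 (negative) to 1.0 (positive)


# Hypothetical filter: high arousal with negative valence is ambiguous on
# its own, so the current game context selects the discrete emotion.
NEGATIVE_HIGH_AROUSAL_BY_CONTEXT = {
    "threat_present": "fear",
    "repeated_failure": "frustration",
    "grotesque_scene": "disgust",
}


def infer_emotion(sample: AffectSample, context: str,
                  arousal_threshold: float = 0.6) -> str:
    """Combine dimensional data with game context to name a discrete emotion."""
    if sample.arousal < arousal_threshold:
        return "undetermined"  # too little activation to classify
    if sample.valence >= 0:
        return "excitement"    # assumed label for positive high arousal
    return NEGATIVE_HIGH_AROUSAL_BY_CONTEXT.get(context, "undetermined")
```

The same physiological reading therefore yields "fear" when a threat is present and
"frustration" after repeated failure, which is the disambiguating role that context is
argued to play above.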
AN OUTLINE OF FEAR DEFINITIONS AND TERMINOLOGY
Within the realm of fictional media, the experience of fear can be sub-categorised by the
terms: horror, terror, suspense and shock. Horror, by definition, is reactive and describes
disgust or recoil from an object or scenario that grossly offends the senses; it is commonly
related to revelation, shock and surprise and often relates to the cinematic term gore. Shock
and surprise differ from horror, however, in that they do not necessitate a disgusting stimulus,
but rather an intense surprise that the receiver did not anticipate. Terror precedes horror and is
related to suspense, apprehension and anxiety. A feeling of terror is typically the result of
suggestive stimuli resulting in a perceived sense of conviction that something horrific is
about to occur. Suspense differs from terror in that it is not necessarily an anticipation of
something horrific. Suspense can be used to describe both a precursor to terror and a more
generic feeling of anticipation in the presence of the unknown. Varma (1966) supports the
distinction between terror and horror; defining terror as the ‘awful apprehension’ and horror
as the ‘sickening realisation’.
This paper posits the notion of threat as an absolute necessity for experiencing fear and, in
the context of a computer video game, argues that the only way to create a fear response is to
present the player with a threat. In a virtual environment, irrespective of whether the threat is
real (fear of failing, witnessing the grotesque, being startled) or virtual (fear of death,
suffering, personal loss), it is vital that it be perceived as significant by the player if a fear
response is to be generated. For the purpose of this paper, Varma’s (1966) definition is
altered to fit this specific context. Although a threat is not necessarily required to induce
horror, this definition of a horrific experience is not one that will induce fear, but more likely
sadness or disgust. Witnessing a grotesque scene or act may be described as a horrific
experience yet without an attached threat to the witness, fear will not be experienced.
For the purposes of the thesis, horror is defined as the individual events (a fallen branch snaps
nearby, a heavy panting can be heard behind you, etc.) within a complete experience that
present, or imply, threat. The more penetrating events can be described as shock-horror,
referring to explicit individual revelations that cause intense pre-cognitive reactions
(exclamation, screams, jaw/fist clenching, involuntary jump, etc.) and are characteristically
short and intense with heavy attack. Conversely, suspense-horror describes an implicit event
that manufactures the perception of a threat without causing a shock response and instead
contributes to a more sustained, developing and gradually intensifying experience. Terror is
defined as the overall fearful experience, the complete scenario. Assuming that a significant
threat is a necessity of fear, terror begins at the point a threat is perceived and ends only when
the threat is resolved. The feeling of terror is not constant, however, and can vary in intensity
depending on horrific stimuli and the player’s unique emotional processing. These definitions
are strictly created for the purpose of classifying experiences of fear as induced by sound in a
computer video game and do not intend to replace or modify existing definitions outside of
this context. Figure 5 visualises these concepts to elucidate the interactions between horror
and terror within an exemplary survival horror game sound context.
Figure 5: Terror over time – a theoretical example of the relationship between horror and terror
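The relationship between horror events and terror described above can be approximated
with a toy simulation. The sketch below is not the model behind Figure 5; it is a minimal
illustration in which every constant (baseline, boosts and decay rates) and every name is an
assumption. Terror is non-zero only while a threat is perceived, shock-horror events add a
sharp spike that fades quickly, and suspense-horror events add a smaller contribution that
fades slowly and so accumulates.

```python
BASELINE = 0.1        # assumed terror level while any threat is perceived
SHOCK_BOOST = 0.6     # assumed size of a shock-horror spike
SUSPENSE_BOOST = 0.15  # assumed size of a suspense-horror increment
SHOCK_DECAY = 0.3     # shock fades quickly (short, heavy attack)
SUSPENSE_DECAY = 0.95  # suspense fades slowly (sustained build)


def terror_curve(events, threat_start, threat_end, duration):
    """Return terror intensity per time step.

    events maps a time step to "shock" or "suspense"; terror exists only
    for threat_start <= t < threat_end.
    """
    shock = suspense = 0.0
    curve = []
    for t in range(duration):
        if not (threat_start <= t < threat_end):
            shock = suspense = 0.0   # no perceived threat, so no terror
            curve.append(0.0)
            continue
        shock *= SHOCK_DECAY
        suspense *= SUSPENSE_DECAY
        kind = events.get(t)
        if kind == "shock":
            shock += SHOCK_BOOST
        elif kind == "suspense":
            suspense += SUSPENSE_BOOST
        curve.append(round(min(1.0, BASELINE + shock + suspense), 3))
    return curve
```

Calling terror_curve({3: "suspense", 5: "shock"}, threat_start=2, threat_end=8,
duration=10) gives a curve that is zero before the threat is perceived, rises gently with the
suspense event, peaks at the shock event and returns to zero once the threat is resolved.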
THE VALUE OF FEAR
One of the primary concerns of this thesis is enhancement of the experience of fear within the
context of survival horror computer gameplay through manipulation of game sound. In
relation to this objective, this section briefly outlines the contextually relevant definitions and
terminology surrounding fear and also discusses the notion of positive fear associations,
addressing why we would consciously wish to experience fear.
The intrinsic presence of hope and fear within a suspenseful narrative increases the potential
for user-character investment and emotional experience (Alwitt, 2002). In a non-interactive
context, Zillman (1996: p.200) identifies fear as the crucial component of suspense; arguing
that concern for the well-being of the narrative characters and anticipation of negative plot
resolutions generates the suspense that hooks the viewer. For video games, the habitual
association with scenarios containing a variety of severely negative potential outcomes
suggests that fear and suspense are already acknowledged and, within game genres such as
survival-horror, fear is a necessity (Tinwell et al., 2010). In A Philosophy of Fear, Svendsen
(2008) argues that there exist a number of reasons why fictional fear can manufacture a
positive emotional experience (p.75). Svendsen begins by suggesting that fictional fear is a
physically safe means of experiencing danger and is therefore more likely to induce positive
feelings such as excitement (p.76). Svendsen elaborates upon this paradoxical notion,
asserting that any emotional experience results in a measure of positive feedback derived
from the resultant ‘feeling of being alive’, and that experiencing high intensity emotions
increases the likelihood of this positive sensation, regardless of the valence of the initial
emotion (Svendsen 2008: p.75). Svendsen also identifies relief as a further positive
consequence to the ceasing of a negative emotion. Relating closely to gameplay experience,
Perron (2004) contributes by suggesting an instinctive feeling of success results from facing
your fears, standing tall against a terrifying force or creature, and succeeding. Perron’s
argument can be compared to that of Kant (1964). Kant describes the procedure of positive
experience drawn from fear-inducing stimuli: ‘The first shock of the sublime is turned
around, in such a way that we gain an awareness of the elevated in ourselves, namely reason,
and that judgement therefore finally experiences a feeling of delight’ (p.80). These concepts
can be comfortably applied to the experience of a survival horror game and it would be
logical to assert that a possible attraction to this gaming genre is the opportunity to face a
fearful object/scenario and to overcome it. The feeling of delight is a consequence of the
feelings of success (the enemy has been vanquished), relief (you have survived) and
excitement (resulting from conscious appreciation that you are experiencing rare and intense
emotions).
The literature in this field does not conclude that a positive response to a negative affect is
manifested purely by the desisting of the initial negative emotion. Kant’s (1964) concept of
the sublime suggests that there is an intrinsic aesthetic quality to objects and acts traditionally
perceived as horrific and macabre. According to Kant, a sublime experience results from the
appreciation of an object’s unfathomable size (mathematically sublime) or an object’s
incomprehensible power (dynamically sublime). The survival horror genre itself testifies to
the value of dark artistry, in creating both horrific creatures and scenes of gruesome death and
destruction as gameplay selling points. Furthermore, Kant’s (1964) definition of the
mathematically and dynamically sublime can be seen manifested in many survival horror
antagonists, particularly final boss characters (antagonist characters typically battled during
level or game finales). Svendsen (2008) continues to explore these ideas via interpretation of
an Aristotelian concept, catharsis. Catharsis suggests ‘there is a favourable effect on the
observer because he or she witnesses fearful impressions from a scene’ (p.87). Aristotle
himself does not detail a procedure for these effects; however, Svendsen interprets an
‘emotional discharge in which the observer gets rid of inner tensions that it would otherwise
be difficult to find expression for in society’, and also proposes a potential contribution to
personal moral development – ‘catharsis would teach us to fear the right things in the right
way at the right time’.
In addition to gameplay and physical design applications, the concept of the sublime could
also be related to character and narrative designs. In both fictional worlds and in reality it is
the serial murderers who commit their acts for no other reason than the act itself - to create art
through terrible crime – thus becoming, under de Quincey’s (2006) categorization, sublime
artists. Analysis of several popular culture mediums reveals that it is the ‘sublime murderer’
who is the most captivating and terrifying, such as: John Doe from the film Seven (Fincher,
1995), Sander Cohen from the game Bioshock (2K, 2007) and Francis Dolarhyde from
the novel Red Dragon (Harris, 1981). Such characters arouse a far greater fear because the
lack of a traditional motive removes conventional logic, decreases predictability and suggests
a far more monstrous persona. Captivation could be explained via the same processes that
attract an audience to fear - the inherent excitement in exploring the unknown, the different
and the dangerous (Svendsen, 2008).
The survival horror genre provides many examples that reflect all of the ideas detailed above.
Final boss characters are predominantly colossal and powerful, inspiring Kant’s sense of the
mathematically and dynamically sublime. Games with a focus on narrative often include De
Quincey’s sublime characters who perceive morality as subordinate to aesthetics, including:
Sander Cohen and Doctor Steinman from Bioshock (2K, 2007) or Serial Killer X from
Condemned: Criminal Origins (Monolith, 2005). The creatures of many survival horror titles
match Svendsen’s concept of dark attraction, with many adversaries being horribly disfigured
human or animal creations dripping blood and bile, seen acting out graphic acts of extreme
violence. Finally, no survival horror game is complete without scenes that build suspense,
elements that shock, and narratives that assign the player the role of victim or prey.
The ultimate goal of research into computer video games (CVG) is to enhance the experience
of playing them. This thesis is concerned with enhancing the experience of fear within the
genre of survival horror through manipulation of game sound. Two immediate questions are
hereby raised – will enhancing the experience of fear improve the player’s positive
experience of game-play, and would someone consciously wish to feel fear? Consider this
scenario: You experience a fictional stimulus (horror film or computer video game). The
experience not only causes you to jump in your seat, but also induces a lingering sense of
dread during which your mind generates personalised scenarios based upon your experience.
You feel anxious and unnerved, and even lie awake at night unable to shake the sensation of
fear. These feelings pass yet the experience remains significant, and you find yourself
recommending the stimulus to others. Anyone who has experienced such a phenomenon
could be inclined to agree in retrospective appraisal of the experience, that they felt some
sense of positive affect towards it. A second question is presented: Why would someone
consciously wish to experience fear?
Developing Kant’s (1995) concepts regarding the sublime has sired a controversial theory:
that if morality were presupposed for aesthetics, then positive appreciations such as beauty,
symmetry and artistry could be applied to acts of extreme violence, torture and murder (De
Quincey, 2006). Svendsen (2008) asserts that under certain conditions, human perception
can place aesthetics above morality, consequently freeing the individual to experience
enjoyment from witnessing (and even participating in) acts of extreme violence, torture and
murder. In reality, such a hierarchical shift would most likely be the result of extreme
circumstances (Oppenheimer and Frohlich, 1996). In a virtual world however, this can be
achieved through detachment (aspects of gameplay that restrict the opportunity to connect
emotionally with a non-player character) and through the construction of a framework of
virtual rules. These methods allow the player freedom from moral responsibility and even to
delight in acts ranging from the morally questionable, such as killing enemy soldiers or alien
monsters in Half-life 2 (Valve, 2004) to deplorable acts, such as attacking and killing
innocent civilians in Grand Theft Auto 3 (Rockstar, 2001) or Call of Duty: Modern Warfare 2
(Infinity Ward, 2009).
Exploring the above argument, it has been theorised that to fully appreciate the sublime of
terrible acts, we must do more than separate ourselves from morality. Svendsen argues that
‘an aesthetics of transgression always presupposes morality, since it is morality that makes
the transgression possible […] The fact that a given act exceeds a moral or legal norm is an
important precondition of its aesthetic quality’ (Svendsen, 2008: p.86). This assertion could
be interpreted to suggest that CVG players may seek opportunities to commit acts they
understand to be morally or legally wrong provided that there is a guarantee of no real-world
consequences. The reason for this is the aforementioned attraction to new experiences,
exploration of virtual worlds vastly differing from our everyday reality and a freedom from
responsibility. Pearce’s (2008) survey corroborated this argument, suggesting that exploration
and new experiences were a top requirement for players. Consulting the game summaries on
the backs of many CVG cases strongly suggests that games industry marketing also believes
in this concept. A convenience sample of survival horror games (Fear 2, Condemned, Dead
Space, Left 4 Dead and Bioshock) all described their offering of new experiences; actions the
player could perform, weapons and special powers to be wielded, enemies to be confronted
and exciting new worlds and scenarios to explore.
CONCLUSIONS & CHAPTER SUMMARY
The information presented within this chapter advocates emotion theory as a worthy field of
contemporary study, with distinct potential value for cognition theory, human-computer
interaction developments and computer video games. Beyond this, better understanding of
emotion and affective processes has inherent merit in a more general sense, providing greater
insight into our everyday experiences and behaviours. Comprehensive frameworks, which
can be applied to the development of any product or service, would enable CVG businesses
to progress through better understanding of their clientele and may have notable worth for all
mediums of communication.
When assessing the value of fear, different forms of value are revealed; the most notable
being cathartic release (coupled with intense excitatory responses) that enables individuals to
experience fear within a physically safe environment and phobia therapy (repeated exposure
to fear-object, again within a safe environment, to facilitate coping practice). This chapter
also addressed the relevant concepts of the sublime (aesthetic appreciation of horror and
violence) both in the form of events and characterisation. The outline of fear definitions and
terminology provides an introduction to fear concepts and these initial ideas are expanded
upon within the next chapter, in which fear and its associated details are analysed in greater
detail, alongside a discussion examining the modern concepts of fear processing both within a
general and a CVG contextualisation.
Chapter 3
Understanding Fear and
Game Sound – Definitions,
Processes and Variables
“The oldest and strongest emotion in mankind is fear and the oldest and strongest kind of
fear is fear of the unknown.” Bleiler E.F, Foreword written in - Lovecraft H.P, (1973)
Supernatural Horror in literature
INTRODUCTION
This chapter documents core theoretical research and associated experimentation relevant to
the study of fear. Commencing with a brief examination of connected areas of study
(human-computer interaction and cognition), this chapter investigates the definitions and
sub-categorisations of fear, alongside an analysis of the emotional potential of sound and
individual acoustic/psychoacoustic parameters that could modulate an affective experience
during computer video gameplay. These discussions provide a theoretical foundation upon
which a new framework for the fear experience, both in a general and computer
game-specific context, is constructed (chapter 9). Such a model is intended to expose a core
procedure of human fear response that may transcend interpersonal differences, ultimately
with the intention to better inform creative decisions relevant to computer video game
development and sound design.
FEAR: PERSPECTIVES AND ASSOCIATED THEORY
The intrinsic presence of hope and fear within a suspenseful narrative increases the potential
for user-character investment and emotional experience (Alwitt, 2002). In a non-interactive
context, Zillman (1996, p.200) identifies fear as the crucial component of suspense, arguing
that concern for the well-being of the narrative characters and anticipation of negative plot
resolutions generates the suspense that hooks the viewer. For video games, the habitual
association with scenarios containing a variety of severely negative potential outcomes
suggests that fear and suspense are already acknowledged and within game genres such as
survival-horror, fear is a necessity (Tinwell et al., 2010). Whilst the fundamental approach to
fear manipulation is drawn from the creative instinct of the developer, a theoretical
framework for understanding the macro and micro processes that exist within a fearful
experience would serve to increase the intensity and reliability of fear-induction design in
survival horror video games, films and beyond.
The potential for a video game to evoke emotional responses beyond those intrinsically
drawn from gameplay (such as frustration, anger, joy, competitiveness) supports the
increasing sophistication of game design that is allowing virtual worlds to more closely
reflect reality. Whilst several game genres would arguably benefit from emotion-related
developments, it is the fundamental first-person shooter (FPS) approach of positioning the
player within a simulated environment with a first-person perspective that supports the notion
of ecological relationships paralleling those of the real world. For the purposes of this
chapter, ecology refers to the relationship between a living organism and their surroundings.
A computer video game places the player in a virtual environment which in itself exists
within reality, revealing three interrelating entities (the player, the game and the environment)
that give a virtual ecology a character dissimilar to the traditional, natural ecology.
For the purposes of this discussion, reality refers to the everyday world in which we operate;
whilst virtuality refers to the artificial environment contained within a computer video game.
Players relate to the environment of reality through visual, acoustic, kinaesthetic, haptic,
olfactory and gustatory interactions that can be mapped onto ecological profiles. A more
detailed discussion regarding virtuality (and an attempt to untangle the array of associated
terminology) is presented in the next chapter. Initial contemplation regarding the nature of
reality and virtuality posits the terms as related opposites, the two sides of our existence. As
discussed in greater detail below, the distinction between these terms is notably unclear, with a
range of variables potentially determining the real or unreal nature of an entity. In response to
this, a reality-virtuality continuum, connecting the real with the virtual and positioning them
on polar extremes of the spectrum, is a concept that acknowledges the difference between
real and virtual but also the explicit association that exists between them. The inherent
problem with a continuum is the assumption that it requires polar extremes, which would
dictate that reality and virtuality exist in some form as absolutes. Calleja (2012)
reconciles the terms, placing virtuality as an aspect of reality, acknowledging the distinction
whilst emphasising the connection between the two. With regards to this chapter, it is the
association, rather than difference, between reality and virtuality that supports the argument
that in order to achieve an inclusive understanding of an ecological system within a virtual
model, we must first comprehend the ecology as it exists within reality.
This chapter primarily utilises a cognitive perspective of psychological processes to construct
the fear framework. Evolutionary and behavioural concepts are incorporated; however, they
are presented as components of a bespoke cognitive design. The cognitive perspective has
long been associated with computer functionality, to the degree that the processes of a
computer provide a metaphorical scaffolding to support the cognition concept. Traditional
cognitive theory disassociates emotion from cognition, describing each as individual
processes that interact (Zajonc, 1984). Cognition theory has recently been challenged,
however, and emotions have been argued to be not only an integral component of the
cognitive process, but also influencing factors of attention, thought and behaviour (Carroll,
1999; Grimshaw, Lindley & Nacke, 2008; Norman, 2004; Shinkle, 2005). Niedenthal et al. (1999) argue that
the intense nature of an emotional experience supports the preservation of past events, stored
as salient recollections within long-term memory (LTM), and that these memories have the
capacity to influence present comprehension and decision making.
Human-computer interaction (HCI) is a popular topic for research and is arguably one of the
key components of an emotion-based framework that details interactions between player and
game. HCI has been argued to be largely social, and an understanding and appreciation of
emotional function is as crucial to human-computer interfacing as it is to exchanges between
two people (Reeves & Nass, 1996). The concept of Emotional Intelligence has been applied
to computer software in efforts to reduce user frustration and increase productivity (Picard,
2000), although this does not explore beyond the functional, nor venture into the potential for
developments in a recreational and/or social HCI context, such as a computer video game.
The nature of computer video games makes them ideal for exploring human-computer
relations (Keeker et al., 2004). Within certain games, further relations can also be observed.
Virtual worlds can depict representations of any corporeal form (of both real and fictitious
origin) and, furthermore, such worlds utilise semantics and metaphor to project abstract
concepts. Graphics technology is propelling us closer towards photo-realistic computer
generated characters, and artificial intelligence (driven by HCI research), with the capacity to
evoke player emotions through believable affective displays and model accurate emotional
responses to player input, has the potential to dramatically improve the intensity and diversity
of our emotional experience whilst playing computer video games. Such diversity will
arguably facilitate improved potential for evoking discrete emotions, such as fear.
The subsequent sections within this chapter explore the concept of fear; first by defining key
components and relevant terminology, then utilising existing research to explore the inner
processes, individual variables and relationships that characterise fear. Differences between
reality and virtuality-based experience are documented, and game sound research is reviewed
to assess the roles of acoustic parameters, psychoacoustic listening modes and cognitive
appraisals of audio, within a virtual ecology of fear.
UNDERSTANDING FEAR
For the purposes of this chapter, fear is positioned as a master-term, from which all associated
words (horror, terror, anxiety, suspense, etc.) branch. Freud (1956) asserts that fears are
subconscious efforts to avoid disturbing experiences; generating aversive behavioural
responses to stimuli perceived as threatening to an individual's physical and/or psychological
well-being. A more recent perspective defines fear similarly as 'an activated, aversive
emotional state that serves to […] cope with events that provide threats to the survival or
well-being of organisms' (Öhman, 2000). The fear response is commonly associated with
aversive behaviour (Brown et al., 1951; Öhman, 2000; Schneider, 2004) neatly characterised
by Gray (1971) as fight, flight and freeze actions. This concept positions object (stimulus),
perception (of threat) and response (aversive action) as organised components of an
interactive process. Although such an understanding may appear over-simplified, the notion
of object, threat and response is rarely questioned; rather, subsequent accounts build upon and
expand this foundation.
A step towards a more reductionist view positions the awful apprehension of terror and the
sickening realisation of horror (Varma, 1966) as crucial elements of the fear sensation.
Rockett's (1988) understanding of these terms strongly reflects Varma's; describing
horror as a revelatory event, incurring deep upset manifest as overt human behaviour, and
terror as the anticipatory trepidation. Terror is evasive, action-orientated and situational
whilst horror encourages fixation and is object-focused (Schneider, 2004; Perron, 2004). The
two components are very possibly co-dependent (each requiring the other to exist) and the
causality between them appears to be unidirectional; meaning that an existing appreciation of
a horrific stimulus is required to rouse the relevant sensation of terror, as can be observed in
phenomena such as phobias (Poag, 2008: p.232) and post-traumatic stress disorder (Yehuda,
2002). The computer video games industry shows an awareness of the various cogs spinning
within the fear machine; with different titles employing varied tactics based upon these
elements. A classic example of this distinction is notable upon comparison of the original
Resident Evil (Capcom, 1996) and Silent Hill (Konami, 1999) games; the former relying
heavily upon horrific gore and startle, and the latter employing steady pacing and terrifying
slow-building tension. This is not to say that these titles were opposites in their approach.
Both utilise gore, violence, horrific monstrous antagonists, tense uncertainties, and striking
revelations. Baird's (2000) fear process of 'a character presence, an implied off-screen threat
and a disturbing intrusion' arguably applies to both games; the difference lies in the
subtle variations in pacing and direction of attention.
One of the core elements of a horrific experience is startle (also referred to as shock or
surprise). Bradley et al. (2002: p.463) state that 'abruptness is the key to startle elicitation:
Ideally, the rise-time of the startle stimulus should be instantaneous'. The two components of
startle are documented in Reisenzein (2000) as 'evaluation of the stimulus as unanticipated'
and 'reaction time'; supporting the notion that a startle must be both unexpected and sudden
(allowing little or no time to appraise the situation cognitively or produce a rational reaction).
Perceiving the startle effect as a variable in the horror-terror interaction helps us to neatly
distinguish the two gaming approaches to evoking fear during play. Whereas the horror
approach utilises immediate startle probes that encourage autonomic response behaviour, the
terror approach employs forewarning and paced revelations that support cognitive appraisals
and generation of unnerving hypotheses from our expectations of the macabre. For the
purposes of understanding startle in a CVG context, it is the temporal element that establishes
the difference between horror and terror-based approaches to fear elicitation.
Terror, anxiety and suspense cannot be viewed simply as indicators of intensity. Whilst it is
logical to assume that the relative values of the quantitative variables associated with fear
(probability, temporal immediacy, potential damage, coping ability and spatial proximity of
the negative event) can distinguish between these three types, they do not merely exist on a
basic linear construct. Whereas certain definitions of anxiety bear resemblance to terror
(Rachman, 2004: p.3; Attwell, 2006: p.2), anxiety can refer to a relatively long-term state of
distress incited by more general, implicit cues (Brown et al., 1951). Stevenson (2008: p.11)
identifies anxiety as an internal experience, greatly associated with physiological responses of
the Sympathetic Nervous System (SNS); a view shared by Bourke (2005: p.189), who
describes anxiety as fear from within and, to distinguish between fear and anxiety, quotes
Freud: '[A]nxiety relates to the condition and ignores the object, whereas in the word fear
attention is focussed on the object'. Freud's theory does not suggest that the framework of
anxiety is devoid of an object; rather, that the connection between object and individual is
indirect and distant.
The concept of object is analogous to that of threat which is at the heart of an anticipatory
fear response (Lazarus, 1964). Other synonymous terms within this context include danger
and peril (Bleiler, 1973). Ultimately, such terms could be defined as loss of that which is
perceived valuable and gain of that which is painful. If threat is defined as the true underlying
source of a fear response, it could be suggested that the threat is neither the invading entity
nor the action; but instead the loss that may result. For example, the true source of our fear is
not necessarily the psychotic killer advancing, or the act of a vicious attack, but the
permanent damage or death that their assault signifies. The notion of loss has been applied to
our understanding of CVG emotional experiences with research suggesting that loss of
progress and flow is a key contributor to stress and tension during gameplay (Perron, 2005;
Shinkle, 2005). Although both fear and anxiety can be reduced to well-being defence
procedures, the absence of an immediate and objective threat distinguishes between the two.
Here anxiety is described as an undesirable internalisation of the horror-terror process and it
is consequently associated with the purely negative response to fear stimuli. After
experiencing a terrifying stimulus, such an internalisation would lead to continued production
of unnerving hypotheses for a prolonged period; even after the object has been removed (the
film is over, the game console switched off). The fearful sensation would continue outside the
boundaries of the stimulus and potentially attach itself to perceptually related entities - all of
this outside the control of the individual. Within a CVG context, all cognitive, autonomic and
behavioural responses to fear can be viewed as positive provided they occur only within the
temporal boundaries of gameplay and the user-defined intensity margins. The players have
willingly subjected themselves to these stimuli, understanding the consequences but reserving
the right to cease all frightening sensation at their command and expectant that removal of the
fear object will do so. In this circumstance, a continued sensation without object denotes loss
of emotional control and there is the potential that such anxiety may harm emotional and
physical wellness; in extreme cases insomnia, paranoia, panic attacks, paraesthesia etc. are
possible (Marks and Mataix-Cols, 2004: p.6).
Acting in this framework as a counterpart to anxiety, suspense is defined as a desirable
emotional sensation and identified as a critical component of fiction media, and also a driver
of CVG enjoyment (Klimmt et al., 2009). Zillman defines suspense as 'an experience of
uncertainty whose hedonic properties can vary from noxious to pleasant' (Zillman, 1996:
p.200), suggesting that the value of uncertainty is the causal variable that defines the
experience and that high levels of uncertainty are likely to be distinctly unpleasant. Within
the boundaries of fictional media, however, this unpleasantness would arguably be a lack of
coherence in the plot or a difficulty for the audience to relate to the events rather than a
response of genuine upset. The notion of uncertainty is arguably a requirement of both fear
and suspense (Carroll, 1996: p.73; Massumi, 2005; Perron 2004). Furthermore, it can be
attributed to both the concepts of terror and horror; the former because, as Bleiler (1973)
states, 'uncertainty and danger are always closely allied; thus making any kind of an unknown
world a world of peril and evil possibilities', the latter because shock is an intrinsic part of a
horrific event (distinguishing horror from pain, sadness and disgust). Massumi argues that
fear is derived from threat, and that a genuine threat cannot take a substantial and immediate
form; instead the nature of a threat is an indeterminate futurity, '[i]ts future looming casts a
present shadow, and that shadow is fear' (Massumi, 2005). Perron (2004) argues that without
uncertainty, suspense cannot occur; a view supported by Comisky and Bryant (1982), who
noted that participant-rated suspense was minimal when either success or failure appeared
absolutely certain. Carroll (1996) posits that audiences are even capable of experiencing
suspense during repetitions of fiction because the investment in the protagonist is sustainable
over several repeat experiences, and recidivist behaviour creates a sense of denial where the
outcome is displaced and the focus is on the present chronology within the fiction.
PROCESSES AND VARIABLES WITHIN FEAR
The previous section differentiated between the terms associated with the human fear
response; identifying them as individual components within a larger paradigm. This section
builds upon these foundations by exploring the dynamic interactions that exist between these
components and the internal and external variables that potentiate the various response
behaviours, in an effort to understand the process of fear from beginning to end.
Existing theory asserts that fear responses originate from both a central evolutionary circuit
and conditioned behavioural responses (Staats & Eifert, 1990). Evolutionary based emotional
responses (arguably including fear) are hard-wired processes that can be observed in both
humans and animals (Panksepp, 1991), suggesting that a fear response is likely to be
instinctive and display comparable response behaviours between individuals and even species
in certain circumstances.
In Fear (The Spectrum Said), Brian Massumi (2005) argues that, if exposed to the same fear
stimulus, each individual will experience the sensation differently; a notion supported by
Cacioppo et al. (1993) who observed varying emotional experiences between individuals in
response to identical physiological and somatic states. Massumi does, however, outline a
process which exists along a temporal plane that can be interpreted as a universal framework
of the fear experience. Threat is the origin of a fear sensation yet ironically is a futurity that
can only be manifest in the present if a fearful response is generated. Massumi refers to the
chronological order of events as the line of fright, and argues that, during the initial stages of
fear the emotional sensation and physical actions of the body are indistinguishable, moving in
parallel along the line of fright. At this stage emotional and physical responses are both
governed by the conditioned autonomic processing of the threat. Overt physical action
(characterised as fight or flight within a fear scenario), although typically determined
consciously by the SNS, appears automated.
Literature suggests that certain involuntary motor movements (covering face, shutting
eyelids, evasive running, etc.) can result when the action impulse is relayed directly through the
spinal cord without first reaching the brain; a procedure known as a reflex arc
(Ganong, 2001: p.123). Massumi (2005) argues that beyond this stage the subconscious and
cognitive loops begin to diverge as the former continues to be influenced primarily by the
origin stimulus and begins to desist over time. For example, when confronted by a predator
the subconscious response to run is activated and, as the threat reduces due to increased
distance, speed decreases and the autonomic action tendency concludes. Massumi describes
the cognitive loop as cumulative; taking continuing influence from the changing
environment, the response actions and internal representations. Cognitive processing
continues beyond the cessation of the subconscious loop and it is at this stage that initial
shock and automated response subsides, allowing emotional evaluation and reflective thought
to occur. The above describes a process similar to the stimulus-behaviour-emotion-interpretation (motor feedback) pathway of Ellsworth (1991), whose research also
documented two alternative pathways and suggested that the nature of the stimulus would
determine which was employed.
The interior of human emotion processing consists of the subcortical system and cognitive
appraisal; two interrelated, continuous feedback loops connecting the physical environment
to the human mind (Lang et al., 2000). Located in structures such as the thalamus (Öhman,
2000), the sub-cortical routine is concerned with the immediate environment and information
is only partially processed, allowing for more instantaneous communication with the
autonomic nervous system (ANS); which commands several physiological responses known
to be affected by fearful stimuli such as heart rate, respiration, pupil dilation, and blood flow
(Funkenstein, 1958: p.223). In contrast, cognitive appraisal (located in the prefrontal cortex)
introduces numerous conceptual notions such as logic, comprehension, and semantics; it also
involves the identification and communication of our emotional states (Mériau et al., 2006).
One perspective argues that high level construals originate from low-level received sensory
input in a bottom-up model (Clarke, 1997). For example, a creaking floorboard heard
downstairs could, under cognitive analysis, return increasingly high level construals such as
'there is an intruder downstairs', leading to 'their intention may be to hurt me' and finally 'I am
in danger'.
Cognition is capable of regulating the sub-cortical output, the somatic response and (to an
extent) autonomic reactions for various task-orientated goals; including suppression,
accentuation and false response (Ekman & Friesen, 1975; Gross & Levenson, 1993; Ochs et
al., 2005). Sotres-Bayon et al. (2006) state that the high level thought processes originating
from the medial prefrontal cortex (mPFC) play a vital role in emotion regulation. Within a
healthy human model, mPFC regulation uses rational thought to identify when the emotional
consequences of a stimulus change from threatening to secure, suggesting that one potential
cause for unwanted anxieties is failure of the mPFC to regulate autonomic reactions.
Incorporating cognitive reasoning demands we acknowledge the differences that exist
between individuals such as gender, culture and personality and their potential influence on
affectivity and emotional response (Mériau et al., 2006; Hamann & Canli, 2004). In addition,
the theoretically vast and continually expanding nature of long-term memory (LTM)
generates the notion that present cognitive processes could potentially be influenced by
innumerable items of LTM information gathered throughout an individual's life.
Lang et al. (2000) argue that the reactions of the human body to negative stimuli 'depend on
the activation of an evolutionarily primitive subcortical circuit, including the amygdala and
the neural structures to which it projects'. They suggest that fear appraisal and response
originate from human ancestry and the evolutionary principle of survival; a procedure that
reveals matching response patterns 'as [we] process objective, memorial, and media stimuli'.
Further research expands upon this notion, positioning cognitive reasoning as an integrated
development (much like an upgrade). Research (Lakoff & Johnson, 1999: p.4; Wilson, 2002)
identifies reason as evolutionary; arguing that all information processing (including rational,
higher level cognition) and behavioural response are developments of animal processes; a
notion called rational Darwinism, that places humans on a continuum with animals and
ultimately suggests that the nature of our thought processes will continue to evolve as time
passes. A review of relevant literature reveals notable support for the concept of such an
integrated autonomic-cognitive system. Much as elements of the respiratory system can be
controlled via automated and conscious commands for adaptive efficiency, effective response
to fearful stimuli demands that the system be responsive to time constraints and threats
developing at a socio-cultural rather than evolutionary rate. For example, in the event of an
ambush mugging attack, the subcortical responses of fight, flight and freeze may have
detrimental consequences (the victim is outnumbered and likely to be outrun) and a cognitive
compliance response requiring override of the subcortical impulse and a rationalisation
argument (that losing possessions is a fair price for life and health) is most likely to ensure
survival.
The above argument identifies the physical and abstract components that make up the fear
response yet the question remains as to how these systems work together to mobilise the most
appropriate behaviour in response to the vast array of fear-related scenarios. To understand
this, we must attempt to chronologically examine the individual sub-processes and related
variables. Within this framework, the fear process must arguably commence with an input
threat assessment to establish which routine to activate, horror or terror. The characteristics
of threat-associated stimuli under initial scrutiny are physical and temporal distance
(Blanchard & Blanchard, 1989; Fanselow, 1994). Immediacy of the threat as defined via
these variables activates the horror-pathway leading to defensive action, and nociceptive
reflexes should damage be sustained (Lang, 1995). Increased distance instead stimulates the
terror-pathway, characteristically resulting in immobility, bradycardia and hyper-attentiveness (Smith, 1991); a response notably referred to as the behavioural inhibition
system (Gray, 1982).
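The input threat assessment described above can be sketched as a small routing function. This is an illustrative sketch only: the names (`Threat`, `select_pathway`), the units and both threshold values are hypothetical and are not drawn from the cited literature, which states only that immediacy in physical and temporal distance activates the horror-pathway and that increased distance activates the terror-pathway. Treating either kind of immediacy as sufficient is a further assumption.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen purely for illustration.
NEAR_DISTANCE_M = 2.0
IMMINENT_SECONDS = 1.5


@dataclass(frozen=True)
class Threat:
    distance_m: float         # physical distance to the threat
    time_to_contact_s: float  # temporal distance to the threat


def select_pathway(threat: Threat) -> str:
    """Return 'horror' for an immediate threat, otherwise 'terror'.

    Assumption: immediacy on either variable is enough to count as immediate.
    """
    immediate = (threat.distance_m <= NEAR_DISTANCE_M
                 or threat.time_to_contact_s <= IMMINENT_SECONDS)
    return "horror" if immediate else "terror"


# Responses that the text associates with each pathway.
RESPONSES = {
    "horror": ("defensive action", "nociceptive reflexes if damage is sustained"),
    "terror": ("immobility", "bradycardia", "hyper-attentiveness"),
}

if __name__ == "__main__":
    distant = Threat(distance_m=20.0, time_to_contact_s=30.0)
    pathway = select_pathway(distant)
    print(pathway, RESPONSES[pathway])
```

In an adaptive game the two inputs might be estimated from the game state (for example, enemy distance and approach speed), but how those estimates are obtained is outside the scope of this sketch.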
Within a genuinely fearful situation, several cues may be observed and terror may not always
precede horror. A horrific experience is partially characterised by a startle response and
consequently, any cue perceived to be sudden has the potential to initiate the horror-pathway.
However, the intensity of the stimulus dictates the subcortical activation and the degree to
which the cognitive feedback loop can attenuate behaviour. Gameplay during a particularly
frightening scene may include several sudden audio stimuli that stimulate a low-intensity
response (creaking floorboard, object knocked over) accentuating the terror in anticipation of
the final revelation. Fanselow (1994) describes three stages of fear behaviour that can be
readily applied to a survival horror scenario: pre-encounter defence refers to initial anxiety
experienced when entering an environment where predators are expected to appear (a dark
tunnel, old mansion, or dilapidated factory); post-encounter defence describes heightened fear
in response to cues that signify the presence of a predator (approaching footsteps, nearby
items knocked over, etc.); and circa-strike defence refers to an intense fight or flight response
when within the region of physical contact and imminent threat (revelation of monster and attack).
The descriptions of the latter two stages reveal striking similarity to our established
definitions of terror and horror respectively. The concept of pre-encounter defence, however,
is one that has not yet been addressed within our fear framework and for the purposes of this
chapter is referred to as the caution stage.
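As a rough illustration of how the three stages might be tracked during gameplay, the sketch below maps three boolean game-state flags onto the caution, terror and horror labels used in this chapter. The flag names, the `classify_stage` function and the additional `NONE` state are hypothetical conveniences introduced for the example; Fanselow (1994) does not describe the stages in these terms.

```python
from enum import Enum


class FearStage(Enum):
    NONE = "no defensive state"  # assumption: added for completeness
    CAUTION = "pre-encounter defence"
    TERROR = "post-encounter defence"
    HORROR = "circa-strike defence"


def classify_stage(predator_expected: bool,
                   presence_cue_detected: bool,
                   within_strike_range: bool) -> FearStage:
    """Map simple game-state flags onto a stage, most urgent first."""
    if within_strike_range:
        return FearStage.HORROR
    if presence_cue_detected:
        return FearStage.TERROR
    if predator_expected:
        return FearStage.CAUTION
    return FearStage.NONE


if __name__ == "__main__":
    # Entering a dark tunnel where enemies are expected, with no cue yet heard.
    print(classify_stage(True, False, False))
    # Approaching footsteps are heard.
    print(classify_stage(True, True, False))
    # The monster is revealed and attacks.
    print(classify_stage(True, True, True))
```

Checking the most urgent condition first reflects the observation that terror need not always precede horror: a sudden attack is classified as circa-strike even if no earlier cue was registered.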
Fearful stimuli can be understood as emotional prompts and cognitive cues for problem
solving (Perron, 2004). Understanding of the relationship that exists between cognitive and
subcortical processing requires identification of the variables that determine the degree of
control each opposing force will exert. Yurgelun-Todd and Killgore (2006) identify a positive
correlation between increasing age during adolescence and prefrontal cortex activity
measured during a fear-related activity. Hale et al. (1995) revealed that increasing the level of
fear arousal in a message would result in a shift from systematic (comprehensive analysis /
high cognitive load) to heuristic (partial analysis / economic cognitive load) processing and
concluded that there is a positive correlation between fear sensation and economisation of
processing routine. Whilst the immediacy of the threat determines the type of behavioural and
autonomic response, it is the intensity of the fear sensation that defines the dynamic between
cognitive and subcortical control. Reber, Schwarz and Winkielman (2004) identify ease of
perceptual processing as a causal variable of emotional experience. Causes of disassociation
between input cues such as semantics, modality (visual, auditory, etc.) and attributes have
been shown to decrease temporal processing speeds and evoke negative emotional valence
(Spence et al., 2001). In accordance with the routines described earlier, an increased negative
emotional experience is expected to further increase activation of the subcortical response
(and, correspondingly, attenuate cognitive processing); the mind essentially perceiving the
complexity of the threat cues as a rise in danger level. However, ease of processing should
not be confused with ease of identification. Within the context of audio processing, Alho and
Sinervo (1997) argue that sub-cortical (referred to as pre-attentive) processing can be
observed in subjects when appraising complex patterns of sound; this suggests that the subcortical routine is capable of processing more than very basic stimuli. Although this
autonomic process is capable of identifying a deviant object within a complex and dynamic
environment, the task of identification is still arguably a base-level thought process in
accordance with Bloom's taxonomy of thought (Krathwohl & Anderson, 2001).
The purpose of the terror routine is to alter the physiological state in a way that maximises
opportunity for aversive response should an immediate threat be presented. Utilising positron
emission tomography (PET) to measure cerebral blood flow, Kimbrell et al. (1995) noted that
fearful stimuli induced greater blood flow in the inferior frontal gyrus (associated with the
go/no go principle) and the left temporal pole (associated with the ability to make lexical and
semantic links between different words, making it possible to understand a story [Dupont,
2002]). Conversely, fearful stimuli revealed decreased activity in the right medial cortex
(high-level executive functions and decision-related processes [Talati & Hirsch, 2005]), the
right superior frontal cortex (self-awareness [Goldberg et al., 2006]) and the parietal lobe
(integrates sensory information from different modalities, particularly determining spatial
sense and navigation). Here, neurobiology supports behaviour, as the inferior frontal gyrus and
right medial cortex initiate an urgent and direct response routine, while the left temporal pole
and parietal lobe can be attributed to context (the participants were recollecting past
experiences of anxiety, not experiencing physical fearful stimuli). The contextualization of
the Kimbrell et al. experiment suggests that overall activity is unlikely to fit the above profile
in a direct-interactive fearful scenario and ethical considerations limit researchers' ability to
expose participants to immediate physical threats. Fortuitously, the nature of a CVG
environment allows for a simulation that may well reveal the exact neurophysiology of an
individual's fear response.
Bradley et al. (2005) compared the effects of pleasant/unpleasant stimuli and threat/safe
associations on the startle reflex. Experiment results revealed that unpleasant stimuli
potentiated startle regardless of subtext and that stimuli connoting threat potentiated startle
regardless of inherent meaning. This supports the notion that the fear response process is
sensitive to both objective and subjective fear-object attributes. Orgs et al. (2007) and
Van Petten and Rheinfelder (1995) collected evidence that conceptual priming via two related
inputs produced faster reaction times when compared to unrelated inputs. In response to a
fearful scenario we are primed by the initial stimulus, allowing us to respond to associated
subsequent stimuli immediately; as Smith (1999) states: 'A fearful mood puts us on
emotional alert, and we patrol our environment searching for frightening objects', allowing us
to react with more immediacy and increasing the probability of successfully evading the
threat. The above findings support the notion that subconscious appraisal (dependent on
biological variation and behavioural conditioning) of a terror stimulus stimulates a pattern of
physiology that primes the individual for action in response to a horror stimulus (immediate
threat). In the context of a horror film, prior knowledge of upcoming events generated
increased sensations of fright and upset (Cantor et al., 1984). However, the nature of the
forewarning arguably contained little information that could be utilised to aid survival
(supporting uncertainty); with cues consisting of shadowy figures and sounds of masked
position as opposed to cues that could reveal the location, identity or weaknesses of the
threat. This suggests that forewarning cues that insinuate threat rather than describe it have
greater potential to evoke a terror response.
As mentioned earlier, the startle component of a horrifying experience has the potential to
significantly alter the intensity of the overall sensation. It has been proposed that the startle
mechanism is continuous and that the human individual is unlikely to ever be in a complete
state of non-alert (Hoffman & Searle, 1965). Brown et al. (1951) suggest, conversely, that a
startle response can be potentiated by a preceding cue that connotes danger and threat via a
conditioned association. The response is sensitive to the individual's current emotional state
and, consequently, such affect-toned material preceding a startle probe has the potential to
significantly potentiate or attenuate the intensity of the response (Lang et al., 1993; Roy et al.,
2008). Relevant experimentation reveals that this effect remains consistent despite cross-modality between stimuli (for example, visual affect stimuli followed by auditory startle
probe) and that positive affect preceding a startle probe invariably reduces the behavioural
response whereas a negative emotional state has a magnification effect (Frijda, 1994; Vrana
et al., 1988; Yartz & Hawk, 2000). One particular emotional state, capable of dramatically
potentiating the startle effect is anxiety (Cuthbert et al., 2003; Kumari, 2001); an effect
further increased if the nature of the anxious state relates semantically or perceptually to the
startle probe. The threat of pain and the induction of disgust have also been associated with
potentiation of the startle reflex (Bradley et al., 2005). Yartz and Hawk (2002) found that
fearful stimuli caused a greater startle response when compared to disgusting stimuli, but
only in male participants. This suggests that we cannot predict the startle potential (and
consequently horror potential) of a terror stimulus without knowledge of individual character
and psychophysiological features.
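The direction of the affect-startle relationship reviewed above (negative affect potentiates, positive affect attenuates) can be expressed as a toy scaling function. The linear form, the valence range of -1 to 1 and the `gain` parameter are all assumptions made for illustration; none of the cited studies proposes this formula. The per-player `gain` argument is included to reflect the point that startle potential cannot be predicted without knowledge of the individual.

```python
def modulated_startle(base_magnitude: float,
                      valence: float,
                      gain: float = 0.5) -> float:
    """Scale a baseline startle magnitude by the preceding affective state.

    valence: -1.0 (strongly negative) to 1.0 (strongly positive).
    gain:    hypothetical per-player sensitivity, between 0.0 and 1.0.
    Negative valence increases the result; positive valence decreases it.
    """
    if not -1.0 <= valence <= 1.0:
        raise ValueError("valence must lie in [-1, 1]")
    if not 0.0 <= gain <= 1.0:
        raise ValueError("gain must lie in [0, 1]")
    return base_magnitude * (1.0 - gain * valence)


if __name__ == "__main__":
    for label, valence in (("negative", -1.0), ("neutral", 0.0), ("positive", 1.0)):
        print(label, modulated_startle(1.0, valence))
```

A psychophysiological game system would need to estimate both the valence and the gain from measured data for each player; the constants here are placeholders.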
This relationship between fear types is arguably not unidirectional, and a horrific experience
has the potential to influence future terror appraisals. Barlow and Durand (2009: p.123)
provide a review of anxiety causes; proposing an integrated model of anxiety induction
entitled triple vulnerability theory. This theory identifies generalised biological vulnerability
(diathesis: genetic causes of fear/anxiety susceptibility), generalised psychological
vulnerability (associated with a lack of self-confidence and an overarching belief that the
world is a dangerous place) and specific psychological vulnerability (a belief pertaining
towards a discrete object or situation) as causes of anxiety. The latter can be associated with
horrifying experience in that specific, intensely emotional events such as these are strong
candidates for future anxiety developments. A horrific experience can potentially connect a
substantial number of seemingly disparate items via conceptual networking links (Medin et
al., 2000), creating an intricate mesh of associations and, as a result, massively increasing the
number of memory items that could potentially impact upon the perception of future objects
or events as fear-related and terror-inducing. Upon examination of the above literature, the
bi-directional relationship appears to exist within a two-stage framework of fear (terror
primes horror). If we are to accept Fanselow's (1994) three-tier construct, then there is the
relationship between pre-encounter (environment orientated caution) and post-encounter
defence (object orientated terror) to consider. Understanding of the exact nature of these
relationships remains, at present, theoretical. However, an initial hypothesis, that the same
priming relationship that exists between terror and horror also incorporates caution and that
the three stages are interrelated, is worthy of consideration.
Conscious cortical brain activity occurring during a terror state may arguably be working
against the survival instinct; shifting focus away from the immediate environment to
contemplate higher level construals and, consequently, reducing ability to react to a sudden
threat. The imperfect nature of the human response system means that we are susceptible to
both false-positive (believing there is a genuine danger when there is not) and false-negative
(believing there is no danger when there actually is) errors. Differences between individual
coping styles are exemplified by the monitor-blunter spectrum. With regards to the extreme
poles of this continuum, monitors are individuals highly sensitive to fearful stimuli; revealing
strong semantic associations between the context of threat and numerous memory items. In
contrast, a blunter is comparatively insensitive to fearful stimuli and significantly less likely
to associate memory items or current stimuli with their concept of threat (Folkman & Lazarus,
1990; Miller, 1998). Sparks (1989) revealed that individuals identified as monitors generated
positive emotional responses in the presence of forewarning cues and negative response in
their absence; individuals identified as blunters revealed opposite results. Emotional response
was identified via debrief questionnaire and galvanic skin response (GSR) data designed to
assess the overall experience as opposed to phasic examination of the startle response. This
suggests that although forewarning invariably amplifies startle, an absence of warning may
create a more frightening overall experience (providing the individual is a monitor). A logical
conclusion could be that, in order to maximise the potential intensity of both a terrifying
(preceding) and horrifying (startle) experience, the environment preceding a startle probe is
required to contain stimuli that connote negative affect but that (in the case of monitors)
reveal little to no information that could assist in coping (size, position, movement, speed,
etc.).
In summary, the subcortical processes and autonomic physiology in response to terror stimuli
are designed to prepare the body for a horrific confrontation. The function of higher level
thought is primarily homeostatic; attenuating the subcortical routine to mitigate the
physiological changes in response to absence of threat. It is the cognitive appraisal routine,
however, that is most susceptible to fault. The nature of the cortical system means that
thoughts transcend the here and now and, as a result, past experiences and future conjecture
can bias an otherwise objective evaluation of the current situation. These biases resemble
type 1 and type 2 statistical hypothesis errors (false positive and false negative) that,
respectively, cause needless anxiety through the conjuring of an unreal threat that is
unsubstantiated by objective evidence or risk an individual under-preparing in the face of a
threat. The chronological
period within which both errors can occur is after the physiological priming of terror and
before the possibility of horrific revelation. A suitable period of time between these events
allows the cognitive functions to analyse the situation. It is here that a blunter may
underestimate the threat by semantically associating the stimulus to non-threatening concepts
whilst the monitor overestimates, relating the stimulus to inappropriately dangerous theories.
At this time, such a structure remains theoretical and would require real-time observation of
neural activation during a genuine fear experience. It is also acknowledged that these
dynamics are focussed upon the short term and do not account for effects such as prolonged
horrific experiences on the perception of terror and anxiety.
Tom Garner
2012
University of Aalborg
FEAR AND COMPUTER VIDEO GAMES
Within the context of a computer video game, the sensation of fear cannot directly stem from
an actual threat to physical wellbeing; if that were the case, the respective industries would
have been required to take a rather relaxed view on ethical concerns. Instead, such media
attempt to displace actual physical damage via representation; creating a virtuality in which
the audience can be hunted, attacked, injured and killed without actual sustained damage
(within reality). Fictional media can utilise the notion of self-perpetuating fear, wherein
audience members are afraid that the media may make them terrified or cause them to jump
in their seats (Massumi, 2005). However, if identification and acceptance of a threat is central
to a genuine fear response, phobophobia alone may not comprehensively clarify a fearful
experience during gameplay. Instead, an alternative may exist in which an artificial
environment is (at least partially) accepted as reality; causing a player to experience a fear
that is comparable to a natural fear sensation.
Although a distinction between virtuality and reality at first appears clear, such boundaries
rapidly deteriorate upon close inspection. The previous paragraph suggests that acceptance of
a virtual environment as real has a substantial potential for inducing more intense affect, and
perceptions of a genuine fear experience. The line separating man from machine and reality
from virtuality is becoming increasingly blurred (Sorgatz, 2007). This may serve both to
facilitate a computer video game player’s perception of a virtual threat as real and to manifest
an experience of genuine fear in the absence of physical danger. In Discrimination and
Perceptual Knowledge (1976), Alvin Goldman suggests that a representative entity (described
as façade) can generate a false perception of reality. Within a real-world scenario a proposal
fulfilling an individual’s requirement for acceptance (as real) may not necessarily be the
truth. Goldman elucidates, describing a papier-mâché facsimile of a barn that on visual
appraisal is undeniably a barn; however, deeper investigation (additional sensory data,
attempts to exploit expected function from object) unveils the deception and upon discovery
alters the perceiver’s reality, simultaneously disproving the knowledge of the prior proposal.
The subjective perception of reality is distinguished from the acquisition of objective
knowledge; however, Goldman’s theory suggests that knowledge is, itself, a belief. Within a
CVG context, virtual worlds have potential to represent reality and exploit causal
assumptions to facilitate immersion and manufacture belief in the façade. The facsimile barn
within reality may generate a perceptual certainty for the viewer via no more than distant
visual data; by comparison the virtual barn may provide a wealth of additional inputs
(additional visuals, audio, interactive physics) to consolidate its appraisal as real.
The information detailed above is focussed primarily on the core processes of fear and
arguably describes the phenomenon as a negative experience necessary for coping with threat.
Irrespective of this, research has suggested that not only is fear capable of producing positive
emotional experiences (Andrade & Cohen, 2006; Svendsen, 2008: p.75) but that it is a
sensation that many will actively seek out as a way of emotional exploration (Pearce, 2008).
Poole (2000) argues that within the context of a supposedly pleasurable overall experience
such as a computer video game, consumers wish to experience emotional variety and
intensity within user-defined boundaries. Perron (2005) describes the pleasurable experience
of fear as recreational terror, likening the experience to that of a rollercoaster; a sensation
that lies between security and uncertainty. Placing the player experience between these polar
opposites is possible via manipulation of boundaries (Perron, 2005; Pinedo, 2004). A gamer
who wishes to experience the visceral nature of war is nevertheless highly unlikely to
desire the experience of being shot. Consequently, very few fictional horror media genuinely
terrify their audience (Schneider, 2004: p.135). Boundaries between desirable and detrimental
emotional experiences arguably follow the same principle and, by definition, real terror is an
emotional experience no individual is likely to long for, but paradoxically the terms terror
and terrifying remain two of the most prevalent descriptors used to market computer video
games within the survival horror genre.
Andrade and Cohen (2006) argue that the disparity between individuals is a variable likely to
impact upon whether a fearful scenario is regarded entirely as a negative experience or as a
co-activation of positive and negative. For example, a survival horror game that is perceived
to be deeply upsetting and disturbing to one individual may be experienced as partly
disturbing and partly exhilarating to another. Andrade and Cohen put forward the notion of
displacement as the potential cause of this variation between individuals; a concept echoed by
Perron (2005). Displacement in this context refers to the psychological distance the
individual has placed between themselves and the stimulus. Such an effect may even cause
players to misread a horrific emotional cue and respond with laughter (Giles, 1984).
Displacement theory resonates with that of ‘a bounded experience of fear’ (Pinedo, 2004:
p.106). In a CVG context, displacement appears to be the antonym of presence and
immersion; if a player had a deep sense of presence within a virtual scenario, a fear stimulus
would have greater potential to create an exhilarating and intense adrenaline response, but
also increases the risk of genuinely disturbing and upsetting the player to the point of
withdrawal from the game. Alternatively, if a player was too far displaced from the scene, the
apprehension that the stimuli do not constitute an actual threat to well-being would reduce the
danger of genuine upset but would simultaneously risk nullified impact, boredom and
misreading of the stimuli. Experiencing horror and terror sensations does not require the
objects of fear to exist within reality, as individuals can be moved by the imaginary (Carroll,
1990: p.88); a notion referred to by Tan (1996) as fiction emotions. Survival horror games
project a real threat of physical danger, creating a partial experience ‘bounded by the tension
between proximity and distance, reality and illusion’ (Pinedo, 2004: p.107). These games
(like all fictional horror media) seek to strike a balance, raising the realism and immersion to
a level that enraptures the audience; evoking autonomic behavioural responses to create a
genuine sensation of fear, whilst providing time (periods of absence from threat) and reality
cues (events and/or entities that remind the player that they are playing a game) for
subsequent relief.
Effective manipulation of the fear-boundary concept requires a full appreciation of the factors
that make CVG a unique medium. The interactivity of a computer video game immediately
differentiates it from a film, providing genuine player consequences (loss of progress, threat
of repetition) alongside an increased association between fictional protagonist and player self
(Shinkle, 2005) and the opportunity for “testing the limits of the game, playing with the game
instead of playing the game” (Perron, 2004). The fluctuating temporal nature of a CVG
experience (as a result of save, load and checkpoint replay functions) means that already
experienced horror moments (a difficult boss fight where the avatar was killed and the player
must return to the last checkpoint) can induce terror during a repeat play; the horror event
re-characterised as a terror stimulus (Perron, 2004). Fear-induced action tendencies that cannot
be realised when watching a film are a central aspect of the dynamic between the player and
the game, and these interactions generate virtuality-based feedback emotions, described by
Perron (2004) as gameplay experiences. Examples of gameplay experiences include
challenge (Nacke & Lindley, 2008), game immersion (Brown & Cairns, 2004), flow
(Csíkszentmihályi, 1990) and frustration (Perron, 2005) and must be considered before a
comprehensive ecology of CVG fear can be constructed.
The discussion documented above suggests that evolutionary developments are crucial to the
characteristics of the fear process as the ultimate purpose of fear is arguably to facilitate
homeostasis by enabling the body to evade dangers that may cause injury, dysfunction or
death. The exact nature of the fear process, however, is much less straightforward; integrating
preconscious autonomic (both evolutionary/biological and behavioural) circuits with
cognitive internalisation, reflective with reflexive thought and integrating memory,
personality, physiology and environment into a complete scenario. The following sections of
this chapter will progress these notions further by exploring the affective qualities of audio
and laying the theoretical groundwork that will inform an integrated conceptual design that
assimilates the ecological fear framework with game sound theory in an effort to construct a
virtual acoustic ecology of fear.
THE POTENTIAL OF ACOUSTIC AND PSYCHO-ACOUSTIC
SOUND PARAMETERS TO CREATE AND INTENSIFY FEAR
The preceding section addressed the substantial terminology associated with fear, elucidating
the individual processes that exist within a fearful experience and demonstrating how they
interact along a chronological path. Here, both the acoustic and psychoacoustic parameters of
sound that can potentiate fear are discussed. Existing empirical and conceptual work is
addressed and then expanded upon; integrating acoustic parameters, audio classes and modes
of listening, into a structure of fear that is then re-contextualised into a gameplay-relevant
acoustic ecology. These sections commence with an outline of various acoustic parameters
that have been associated with human emotional response; followed by an exploration into
the emotionality of sound in an attempt to elucidate ways in which sound can not only
propagate emotional meaning, but also evoke a listener’s emotional reactions across a wide
variety of discrete emotions that includes, but is not limited to, fear.
EMOTIONAL PROPERTIES OF SOUND
Sound is a critical component to consider when developing emotionality, as it is directly
associated with the user’s experience of emotions (Shilling et al., 2002; Alves & Roque,
2009). Parker and Heerema (2007) suggest that sound carries more emotional content than
any other part of a computer game. Grimshaw et al. (2008) discovered that players felt
significant decreases in immersion and gameplay comfort when audio was removed from
gameplay; an assertion also made by Jørgensen (2006) who, via observations and
conversations with players, revealed that an absence of sound caused a reduction in
engagement such that ‘the fictional world seems to disappear and that the game is reduced to
rules and game mechanics’. Foley sound design supports the emotionality of sound effects in
creating both fantastic and everyday worlds. Ekman (2008) describes how ‘often non-realistic
sounds are purposefully used to make the action sound better’. She exemplifies this process:
‘walking on cornstarch sounds much “more real” on film than the actual sounds of walking
on snow’. Shilling et al. (2002) quote industry professionals: ‘A game or a simulation without
an enriched sound environment is emotionally dead and lifeless’, implying that sound effects
must be analysed in terms of their emotional qualities so that they may be implemented in a
way that will maximise the audience’s sensory experience.
If we accept that sounds can be manipulated to maximise emotionality, it is reasonable to
assume that specific game genres require specific audio ‘emotioneering’ (Freeman, 2003).
Therefore the survival horror genre, most commonly associated with the emotion of fear,
would require emotion-based sound design that strived to evoke fearful responses (Kromand,
2008). As will be discussed later within this section, there are many acoustic and
psychoacoustic properties of sound that could be investigated as to their fear-inducing
potential. Some are quantitative, in that they can be objectively measured and applied to
synthesis and audio processing. Others are more qualitative, based upon perception of a
sound’s (or collection of sounds’) meaning(s) and are influenced by factors such as culture,
experience, context and expectation. Later sections within this chapter survey relevant
academic literature concerning the affective properties of sound in both quantitative and
qualitative classes, particularly with reference to discomforting properties within the context
of computer video games. Slaney (2002) concedes that the dynamic characteristics of sound
make it difficult to analyse using objective acoustical measurements. Nevertheless, several
approaches have been documented that identify quantifiable sonic parameters that can be
associated with a sound’s emotionality. Cho et al. (2001) provided evidence that pressure level,
loudness and sharpness of a sound can directly affect emotional valence and intensity.
Loudness and sharpness are admittedly perceptual, psychoacoustic properties; however, using
a model outlined by Zwicker and Fastl (1999), such properties can still be measured to
provide objective values.
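As a hedged illustration of how such properties can yield objective values, the following Python sketch computes sound pressure level in decibels from an RMS pressure and converts a loudness level in phons to sones using the common doubling-per-10-phon approximation. It is not the Zwicker and Fastl model itself, which additionally accounts for critical bands and masking; the function names are illustrative assumptions and do not originate from the cited literature.

```python
import math

# Standard reference pressure for sound in air: 20 micropascals.
REFERENCE_PRESSURE_PA = 20e-6


def sound_pressure_level_db(rms_pressure_pa):
    """Sound pressure level in dB SPL for a given RMS pressure in pascals."""
    return 20.0 * math.log10(rms_pressure_pa / REFERENCE_PRESSURE_PA)


def phons_to_sones(loudness_level_phon):
    """Approximate perceived loudness in sones.

    Uses the rule that loudness doubles for every 10 phon increase,
    with 40 phon defined as 1 sone. The approximation is only
    intended for loudness levels of 40 phon and above.
    """
    return 2.0 ** ((loudness_level_phon - 40.0) / 10.0)


# A pressure of 1 Pa RMS corresponds to roughly 94 dB SPL.
level = sound_pressure_level_db(1.0)
# A sound at 60 phon is approximated as four times as loud as one at 40 phon.
relative_loudness = phons_to_sones(60.0) / phons_to_sones(40.0)
```

Sharpness is deliberately omitted here because its calculation depends on a specific loudness distribution across critical bands, which is beyond what a short sketch can represent faithfully.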
Moncrieff et al. (2001) reference attack-decay-sustain-release (ADSR) as a quantifiable
sound energy parameter, showing a significant association between ADSR and specific
emotional responses. Bach et al. (2009) document the concept of increasing intensity as a
measurable audio property that is psychoacoustic in character owing to its intrinsic function
as a warning cue, while signal to noise ratio (Ekman, 2008) can also affect a sound’s emotional
impact by altering the ease of cognitive processing. Periodicity, tempo and rhythm have the
potential to elicit substantial affect through audio-physiological effects such as entrainment
wherein, according to Alves and Roque (2009), a rhythmic simulation of a heartbeat, steadily
increasing in tempo, has the potential to induce an increase in the heart rate of the listener.
Parker and Heerema (2007) suggest that an evolutionary survival instinct exists today that
encourages humans to associate low-pitched sounds (growls and rumbles) with predators and
consequently experience fear in response to such a stimulus.
Reverberation is one regularly implemented effect that can affect a player’s perception of the
game environment (Grimshaw, 2007). Alongside reverberation, an important function of the
audio effect delay is to provide architectural and material information regarding the listener’s
environment: long reverberations and delays suggest reflective spaces that are large in
comparison to the listener who can be made to feel quite small and lonely through this
technique. Winer (1979) documents how the application of frequency manipulation or
equalisation (EQ) affects a sound’s emotionality and aesthetic. Localization of a virtual
object, although currently limited in terms of game implementation, has significant
emotion-related potential (Winer, 1979; Sonnadaraa et al., 2006; Steele & Chon, 2007). The
Doppler effect can also be measured objectively and manipulated to create a more realistic
illusion of position, direction and speed. Compression and normalisation techniques are used
regularly across a multitude of audio applications; whilst their primary function is to limit
erroneous sound pressure levels and create a more uniform audio stream, manipulation of
such parameters creates noticeable differences to a sound’s psychoacoustic properties and
therefore warrants investigation as an emotioneering parameter.
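To illustrate why the Doppler effect lends itself to objective measurement, the sketch below applies the standard Doppler formula for sound in still air to a moving source and listener. The speed of sound (343 m/s, approximating dry air at around 20 degrees Celsius) and the function and parameter names are assumptions made for this example; a game audio engine would typically apply a comparable calculation per frame using the relative velocity along the line between source and listener.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for dry air at roughly 20 degrees Celsius


def doppler_frequency(source_hz, source_speed_towards_listener,
                      listener_speed_towards_source=0.0,
                      speed_of_sound=SPEED_OF_SOUND_M_S):
    """Frequency heard by the listener for a source emitting source_hz.

    Speeds are in metres per second and are positive when the two
    parties move towards one another, negative when moving apart.
    """
    if abs(source_speed_towards_listener) >= speed_of_sound:
        raise ValueError("source speed must be below the speed of sound")
    return (source_hz * (speed_of_sound + listener_speed_towards_source)
            / (speed_of_sound - source_speed_towards_listener))


# An approaching source is heard at a raised pitch, a receding one at a lowered pitch.
approaching = doppler_frequency(440.0, 20.0)
receding = doppler_frequency(440.0, -20.0)
```

The rising pitch of an approaching source is one cue by which a listener may infer direction and speed without any visual information.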
Earlier sections within this chapter detailed the characteristics of the sub-categories of fear:
horror (associated with shock/surprise) and terror (suspense, anxiety and threat). Established
literature describes implementation of this knowledge via a number of audio design
techniques. Breinbjerg (2005) posits that intentional ambiguity of a sound's source and
location is critical to building suspense and terror, arguing that ‘[k]nowing that something is
happening around the corner, without knowing precisely what it is, is most frightening’.
Breinbjerg also suggests that a lo-fi audio soundscape consisting of many interfering sounds
can increase disorientation and decrease the player’s perceived coping ability. Kromand
(2008) exemplifies this by describing the implementation of sensory fillers (sounds irrelevant
to gameplay) that nevertheless resemble sounds relevant to gameplay. This practice dissolves
the barrier between diegetic and non-diegetic sound, consequently encouraging the player to
cautiously treat every sound as a threat harbinger; suspense is characterized (in this context)
as a more prolonged, less intense feeling of terror. Kromand (2008) suggests that this can be
achieved via a system of audio ‘warning’ cues that steadily reveal localization and movement
information. He argues that the consequently slow rise in intensity, together with the absence
of any clear indication of when the inevitable shock will occur, manifests as suspense for the
player. Parker and Heerema (2007) propose that acousmatic sounds perceived as threatening
increase the sensation of terror: ‘A prey animal that can only hear the predator is in an unknown
amount of trouble, and it pays to believe the worst’. Reber, Schwarz and Winkielman (2004)
argue that positive value judgements of audio strongly correlate with the ease with which
they can be processed. Inverting this argument supports the notion that a sound that is
difficult to identify, localize, and/or apply semantic meaning to, will evoke negative
judgements.
Shock-horror requires a different approach. Despite Alfred Hitchcock’s famous objection to
shock (often referred to as cheap and simplistic), it remains a hallmark of the survival horror
game genre. The most frightening part of the original Resident Evil (Capcom, 1996) is
arguably the shocking moment when two mutant dogs jump through a window to attack the
player’s avatar. Xu et al. (2005) state that an audio shock is most effective when it is
preceded by silence; a technique utilized in the aforementioned example. Cho et al. (2001)
insist that acoustical properties of audio (specifically intense loudness and sharpness) can
produce quantitative increases in negative emotional valence. These sonic characteristics are
typically descriptive of audio designed to shock. Kromand (2008) details a deceptive
technique that can arguably be associated with shock. This technique first establishes a sonic
convention that aids player survival (Kromand uses the radio from Silent Hill 2 [Konami,
2001] as an example) then intentionally defies this convention and morphs the semantic
meaning of the sound from supportive to antagonistic.
Cox (2007) tested various sounds assumed to be disgusting and horrible in nature, suggesting
that (mainly as a result of cultural factors) individual sounds can have distinctly different
levels of perceived disgust. There appears to be a fine line between the disgusting and the
horrific and, although Cox suggests that a sound can be exclusively either, it seems
reasonable to assume that perceived disgust will impact upon an overall sensation of horror
when combined with a perceived threat. Parker and Heerema (2007) describe third-person
audio cues as distinctly horrific in nature. They use a human scream as an example, asserting
that ‗[a]s humans we tend to react with emotional similarity when we hear such sound, not in
sympathy so much as in fear of whatever is inflicting pain or fear on the other‘. In this
example the sound is not only shocking due to its sudden, sharp and intense acoustic quality,
but also horrific in that it implies the presence of a horrific creature and/or act.
POTENTIALLY FEAR EVOKING ACOUSTIC PARAMETERS
Relatively little literature exists concerning the effects of quantitative acoustic properties
upon a listener’s emotional state (with specific reference to fear). However, relevant research
indirectly referencing this area of study or presenting related theories is detailed throughout
this section. Several concepts regarding the potential of specific sound parameters to
manipulate emotional affect are concisely collated by Grimshaw (2009). Identified
notions include: rapid onset/offset (attack/release) of an audio signal relates to a perception of
urgency, slower attack relative to faster release increases perceived intensity by way of
connoting an approaching source, and both loudness and frequency equalisation have the
capacity to attenuate and amplify negative emotional activation.
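The onset and offset notions above can be made concrete with a simple amplitude envelope. The following Python sketch builds a piecewise-linear ADSR envelope and measures its attack time, contrasting a sudden onset with a gradual one. The linear ramps, the 1 kHz envelope rate and all parameter values are assumptions chosen for readability; they are not taken from the studies cited, and practical synthesisers often use exponential rather than linear segments.

```python
def adsr_envelope(attack, decay, sustain_level, sustain, release, sample_rate=1000):
    """Piecewise-linear ADSR amplitude envelope as a list of floats in [0, 1].

    attack, decay, sustain and release are durations in seconds;
    sustain_level is the amplitude held during the sustain stage.
    """
    def ramp(start, end, seconds):
        # Linear segment from start towards end, excluding the end value itself.
        n = int(round(seconds * sample_rate))
        return [start + (end - start) * i / n for i in range(n)]

    return (ramp(0.0, 1.0, attack)
            + ramp(1.0, sustain_level, decay)
            + ramp(sustain_level, sustain_level, sustain)
            + ramp(sustain_level, 0.0, release)
            + [0.0])


def attack_time(envelope, sample_rate=1000):
    """Time in seconds from onset to the first occurrence of the peak amplitude."""
    return envelope.index(max(envelope)) / sample_rate


# A sudden onset (5 ms attack) versus a slowly introduced sound (1.5 s attack).
sudden = adsr_envelope(attack=0.005, decay=0.05, sustain_level=0.6, sustain=0.2, release=0.3)
gradual = adsr_envelope(attack=1.5, decay=0.05, sustain_level=0.6, sustain=0.2, release=0.3)
```

Multiplying either envelope sample-by-sample with a source waveform (resampled to the audio rate) would impose the corresponding onset profile on that sound.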
Attack (a component of ADSR) refers to the distance (time) between the onset of a sound and
the intensity/volume peak. Research relating to this parameter suggests that short attack
periods (sudden, immediately intense) potentiate greater startle responses that are likely to be
interpreted as frightening events; a response similar to the primitive Moro reflex, often
referred to as the Strauss reflex in adults (Mulhall, 2011). In contrast, long attack periods
slowly introduce a sound to the listener thereby greatly reducing startle potential. Long attack
may however be employed when presenting a sound that establishes setting and builds
tension (Parker & Heerema, 2007). An attack of considerable length may have further
connotations of threat, suggesting that the source is moving towards the listener (Bach et al.,
2009). Grimshaw (2009) addresses audio de-localisation (manipulating the sound to mask the
position of the source), suggesting that in the context of a predator sound, occlusion of the
source’s position may augment the fear sensation, but that this de-localisation effect cannot
be generalised to all sounds. Localisation refers to the positioning of the originating audio
source in relation to the player and is determined by the interrelations between the source, the
listener and the surrounding environment. Various assertions associated with localisation
exist. Increased localisation difficulty may cause increased fear sensation due to uncertainty
of source location reducing coping ability and consequently amplifying the severity of the
threat (Ekman & Kajastila, 2009; Grimshaw, 2009). Point-like sounds that are easily
localised within the environment are perceived to be more frightening if localised behind the
player than in front (Ekman & Kajastila, 2009).
Acousmatic audio (sounds that have no visible source on screen) causes similar emotional
effects if connoting a threat by limiting information that may support a coping strategy
(Chion, 1994). Grimshaw (2009) also documents several additional psychoacoustic sonic
properties associated with negative emotion experience; highlighting the unexpected nature
and occurrence of an audio entity and the concept of defamiliarisation (the
processing/distortion of a familiar sound to create the strange/uncanny). Sounds identified as
approaching evoke a greater fear response than sounds identified as receding (Bach et al.,
2009). Signal to noise ratio (Ekman, 2008; Grimshaw, 2009) expresses a similar principle,
that decreased clarity (low-fidelity) resulting from a low signal to noise ratio may increase a
fear sensation by way of increasing the difficulty in identifying and localising the signal
source. Low quality or distorted audio may also lead to an uncanny sensation resulting from
exposure to sound that connotes a familiar entity, but is presented in a deformed manner that
generates an unsettling psychological disturbance (Kirkland, 2009).
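Signal to noise ratio is among the more readily quantified of the parameters discussed above. The sketch below, offered as one possible approach rather than a method used by the cited authors, computes SNR in decibels from RMS amplitudes and scales a noise bed so that a mix reaches a chosen target SNR. The test tone, the uniform noise source and the function names are assumptions made purely for illustration.

```python
import math
import random


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """Signal to noise ratio in decibels, based on RMS amplitudes."""
    return 20.0 * math.log10(rms(signal) / rms(noise))


def mix_at_snr(signal, noise, target_snr_db):
    """Scale the noise so the mix has the target SNR.

    Returns a tuple of (mixed samples, scaled noise samples).
    """
    gain = rms(signal) / (rms(noise) * 10.0 ** (target_snr_db / 20.0))
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(signal, scaled_noise)]
    return mixed, scaled_noise


# Hypothetical material: a 220 Hz tone and uniform noise, one second at 8 kHz.
rate = 8000
rng = random.Random(0)
tone = [math.sin(2.0 * math.pi * 220.0 * i / rate) for i in range(rate)]
hiss = [rng.uniform(-1.0, 1.0) for _ in range(rate)]

# A low target SNR buries the tone in noise, reducing its clarity.
murky_mix, murky_noise = mix_at_snr(tone, hiss, target_snr_db=0.0)
clear_mix, clear_noise = mix_at_snr(tone, hiss, target_snr_db=30.0)
```

Lowering the target value is one simple way of producing the decreased clarity described above, although whether a given value is perceived as threatening would remain an empirical question.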
Tempo derives from the musical definition, specifically referring to the frequency of repeated
sounds (or repetitions of significant components of a soundscape). Research regarding
entrainment (synchronisation between resonant systems) has asserted that alternative tempo
properties have the potential to change the rate of brainwaves, heartbeat, respiration, etc.
(Alves & Roque, 2009). This theory can be applied to a gameplay horror scenario by
suggesting that increased tempo of threat-relevant audio may augment a fear sensation both
cognitively (quick tempo of enemy footsteps connotes fast moving opponent and increased
threat) and physiologically (by way of entrainment, the quick tempo of repeating audio
encourages increased heart-rate, which is then interpreted as increased stress and perception
of fear). Grimshaw (2009) suggests that ‘frequency might have an effect on the
unpleasantness of sound and this may lead to negative affect’. This notion has been explored
further, addressing the perceptual acoustic property of sharpness (Cho et al., 2001). The
sharpness concept refers to the frequency and purity of a sound and suggests that increased
sharpness (higher frequency and purer tone) produces increased discomfort and negative
affect for the listener. Parker and Heerema (2007) suggest that evolutionary development
instils instinctive fear responses to certain audio properties. Specifically, slowly evolving,
low frequency audio encourages a fear response by way of association with predator
growling, whilst comparatively, high-frequency sounds evoke the same response via
instinctive connotations of human screams.
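The tempo-based entrainment idea discussed in this section (for example, a heartbeat or footstep sound that steadily quickens) can be sketched as a schedule of onset times. The Python example below assumes a tempo that is interpolated linearly from one beat to the next; the function name, the linear interpolation and the tempo values are illustrative assumptions and are not drawn from Alves and Roque (2009).

```python
def accelerating_beat_onsets(start_bpm, end_bpm, beats):
    """Onset times in seconds for a pulse whose tempo rises (or falls) steadily.

    The tempo is interpolated linearly per beat from start_bpm to end_bpm,
    and each beat is followed by an interval of 60 / current tempo seconds.
    """
    onsets = []
    time_s = 0.0
    for i in range(beats):
        onsets.append(time_s)
        bpm = start_bpm + (end_bpm - start_bpm) * i / max(beats - 1, 1)
        time_s += 60.0 / bpm
    return onsets


# A simulated heartbeat quickening from a resting 60 bpm towards 120 bpm.
heartbeat = accelerating_beat_onsets(start_bpm=60.0, end_bpm=120.0, beats=16)
intervals = [b - a for a, b in zip(heartbeat, heartbeat[1:])]
```

Each onset time could be used to trigger a single heartbeat or footstep sample, so that the interval between repetitions shortens as the perceived threat escalates.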
CONCLUSIONS AND CHAPTER SUMMARY
This chapter has presented a review of relevant literature to support two hypothetical
frameworks: the first accounting for the processes of a fearful experience within a computer
video game context and the second detailing the interaction between real and virtual acoustic
ecologies within a game sound context. From this discussion, several opportunities for further
study are revealed. Adjustments in quantitative acoustic parameters such as reverberation,
tempo, rhythm, loudness and spectral frequency could be compared in both
situation-integrated and disassociated classes during survival-horror gameplay to establish the potential
of individual acoustic qualities in modulating the intensity of player fear. Use of
proprioceptive audio remains inconsistent within mainstream horror game titles and the
opportunity exists to compare presence, contextualisation and acoustic characteristics of this
sound type. Event logging systems support the collection of player action during play,
providing an opportunity to confirm the classification of a sound as an attractor or retainer.
Electroencephalogram hardware supplies a means of testing listening function by way of
measuring cortical activity to suggest the way in which the player is auditioning the sounds
presented. Ultimately, this information has the potential to allow game sound developers to
direct the perceived intensity of the player-fear experience by effectively preparing the player
in the initial stage, then utilising biometrics and avatar action logging to identify (in
real-time) heightened anxiety and emotional arousal, activating both terror and horror sounds that
capitalise upon prior preparation. Such a system has the potential to both extend the replay
value of a horror game and present a genuinely frightening recreational experience, testing
the nerves of even the most hardened survival horror computer game enthusiasts. This
chapter nominates ADSR, pitch, sharpness, periodicity, reverberation, loudness, equalisation,
and localisation parameters as strong candidates for potential fear elicitation, based upon the
scenarios they can signify.
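The adaptive system outlined in this summary could take many forms. As a purely hypothetical sketch, the Python example below compares each incoming biometric reading (for instance, a skin conductance value) against a rolling baseline and, when arousal is heightened, selects a class of sound cue. The window size, the threshold ratio, the class and function names, and the rule mapping game state to terror or horror cues are all assumptions introduced for illustration; none is specified in this chapter.

```python
from collections import deque


class ArousalMonitor:
    """Flags readings that exceed a rolling-mean baseline by a given ratio."""

    def __init__(self, window=5, threshold_ratio=1.2):
        self.history = deque(maxlen=window)
        self.threshold_ratio = threshold_ratio

    def update(self, reading):
        """Return True if the reading indicates heightened arousal.

        Always returns False until the baseline window has been filled.
        """
        if len(self.history) < self.history.maxlen:
            self.history.append(reading)
            return False
        baseline = sum(self.history) / len(self.history)
        heightened = reading > baseline * self.threshold_ratio
        self.history.append(reading)
        return heightened


def choose_cue(heightened, threat_revealed):
    """Hypothetical rule: terror cues before a threat is revealed, horror cues after."""
    if not heightened:
        return None
    return "horror" if threat_revealed else "terror"


# Simulated readings: a stable baseline followed by a sharp rise.
monitor = ArousalMonitor(window=3, threshold_ratio=1.2)
flags = [monitor.update(r) for r in [1.0, 1.0, 1.0, 1.1, 1.5]]
cue = choose_cue(flags[-1], threat_revealed=False)
```

A practical system would need to account for individual differences in baseline physiology and for signal artefacts, both of which this sketch ignores.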
The next chapter provides additional background information before an amalgamation of
theoretical concepts is undertaken. The notion of virtuality is explored in an attempt to
elucidate the nature of virtual environments with relevance to game sound and the
first-person shooter (FPS) genre. Embodied cognition theory is discussed in detail and forms a
core foundation from which the meaning of virtuality is explored. In addition, the subsequent
chapter will also document audio functionality and the modes of listening; stressing the
importance of contextualisation upon audio perception and investigating the continuum of
sonic virtuality in an attempt to comprehend the relationship between audio properties and
perception of a sound as real (‘genuine’) or virtual (‘false’).
Chapter 4
Embodied Cognition and
Sonic Virtuality
Garner, Tom A.
University of Aalborg
2012
Chapter 4: Embodied Cognition
and Sonic Virtuality
INTRODUCTION
This chapter explores acoustic ecology (AE) theory within the domain of computer video
games and examines the variables and measurements of sonic virtuality. Also discussed are
the various concepts connected with embodied cognition (EC) and knowledge theory, together
with the associated notion that reality is an illusion and that all existence is inherently virtual.
As stated within the thesis introduction, the term computer video games (CVG) refers chiefly
to modern games, more specifically, the first-person shooter (FPS) genre that places the user
within a three dimensional virtuality. This chapter explores concepts of virtuality, primarily
to better elucidate the nature of its technological associate, virtual reality. It first discusses the
possibilities of virtuality and its distinction from reality and artificial reality, then
addresses the virtuality of sound, deliberating on what determines a sound to be
real or virtual, and presents a graded continuum of virtuality. Previously referenced within the
thesis, embodied cognition theory is explored in greater detail and applied to virtuality and
acoustic ecology concepts, incorporating modes of listening, attention and cross-modal
effects, construal level theory and psychological distance in an effort to better elucidate the
nature of sound within a virtual environment.
DEFINING VIRTUALITY
A precise understanding of virtuality is notably hindered by disagreements concerning its
exact definition. From a perspective of popular usage, the word virtual is synonymous with
near and almost, which, when applied to our understanding of virtual reality (VR), describes a near-perfect recreation of reality, placing virtuality a single notch below reality at
the top of a continuum, with unreal at the base. The term virtual is also echoed in words such
as essential and fundamental - a reflection that could describe VR as an approximation of
actuality that achieves the most significant aspects of the reality it imitates whilst being
distinguishable in the minor details. We differentiate between virtuality and virtual reality,
identifying the former as a more general term for describing an entity as less than real. In
contrast, virtual reality can refer to both a technological development implementing computer
generated sensory information, and an individual’s subjective perception of reality. As the
definition of virtuality will be developed throughout the chapter, the opposing term real will
also be addressed. To clarify later statements, real will primarily (unless otherwise specified)
refer to entities (events, objects, processes, etc.) that exist outside of VR and are observable
within the natural world (as opposed to an emulated one).
John Leslie King (2007: p.13) identifies the inherent meaning of virtuality as the opposing
counterpart to what is real, suggesting that without the virtual, reality could not exist. Logical
deduction develops this notion to posit that virtuality has existed for as long as we have been
able to comprehend reality. The difficulty with this definition lies in our limited and abstract
understanding of real. King (p.14) cites research identifying cognitive perception of the
unreal as real (referencing actual improved sound quality affecting perceived visual quality of
a multimodal stimulus). What we perceive is unequal to what is. This notion could be
extended to suggest that each individual possesses their own unique reality, containing
various perceived truths that are accepted within that reality with a conviction comparable to
that felt towards objective truths. This calls into question the existence of so-called objective
truths and asks: is all existence inherently virtual? Does the very nature of our thought
processing distil all knowledge and certainties to beliefs and opinions? Could the oxymoronic
signifier virtual reality, in fact, be the single concrete truth?
Accepting virtuality as simply the antithesis of reality would classify the term as anything
unreal (unnatural, inauthentic, false, etc.). From a philosophical perspective, such a definition
is valid at a theoretical level, but ultimately limits our opportunity for development, both in
terms of conceptual understanding and technological advancement. A more detailed
understanding of virtuality would support creative design choices within various industries,
developing the immersive qualities of various media to ultimately generate a better user
experience for innumerable products.
The associations between virtuality as a concept and virtual reality as a technology advocate
computer video games as a prime medium for virtuality research. Whilst visual information
from a game exists on a two dimensional plane, game sound exists within the same 3D
environment as the listener (Grimshaw, 2008), revealing greater complexity and blurring the
lines that separate reality from virtuality; consequently promoting sound as the preferred
sensory modality for this research.
CONCEPTS OF VIRTUALITY
‘There has never been a totally secure view of reality, certainly not in the industrial era of
history. People say that the world is not as real [as] it used to be.’ (Woolley, 1993: p.6)
From a highly theoretical perspective, it is appropriate to suggest that, should a computer
video game successfully generate a virtual world indistinguishable from that of reality, then
potentially any dream from within the imagination could be fully experienced in both body
and mind; with a further opportunity for such dreams to be shared, exchanged and
experienced alongside other people. Such possibilities reveal a substantial potential value for
virtuality research, whereby a comprehensive understanding of human perception could
facilitate technology capable of placing a user within a total immersion environment that is
perceivable as reality but with the limitless freedoms that virtuality could afford.
Chalmers et al. (2009) argue that although human beings perceive the world with five senses,
cross-modal effects can have substantial impact upon perception ‘even to the extent that large
amounts of detail of one sense may be ignored when in the presence of other more dominant
sensory inputs’. This may go some distance to explaining the immersive capacity of a
computer video game. Whilst information received during gameplay may typically be limited
to visual, audible and haptic data, the way in which the information is presented generates an
illusion of full sensory input (a concept that is more fully explored later in the EC
discussion). Hughes and Stapleton (2005) support this notion, arguing that ‘the goal of VR is
to dominate the senses, taking its users to a place totally disconnected from the real world’.
In her book The Rational Imagination, Ruth Byrne (2005: p.3) identifies counter-factual
imagination as a component of creative thought in which an impossibility, related to reality,
can be experienced within the mind’s eye. Reality is transformed through manipulation of
facts into an alternate world of fiction, a world that could arguably be described as virtual.
Imagining an alternate reality in this way is not a passive ‘dream-like’ experience but rather
an (inter)active one, directed by the creator in real-time. This differentiates the alternative
reality of counter-factual imagination from film, theatre and fiction novels. Computer video
games, however, share with imagination the opportunity for real-time user interaction within
an impossible environment. For Michael Heim (1998: p.4), virtual reality is technological: an ‘emerging field of applied science’. This focus differentiates between terms associated
with unreal, separating virtuality from Artificial Reality, a phrase that Woolley (1993: p.5)
defines as a circumstance in which a fiction has become fact through creation of a product
that originally only existed within a fictional world. A technological perspective is
best served by avoiding more abstract theoretical notions; it requires a clear distinction between reality and
virtuality, alongside a practical method for measuring virtuality. Heim (1998)
distinguishes between hard and soft virtual reality (VR). Soft VR is described as essentially a
diluted form; that is, anything based in computers or that can be argued as other than real
(Heim illustrates by way of advertising techniques that state their product or service as the
real thing, suggesting that their competitors are less than real and, in essence, virtual). Heim
argues that such a definition is counterproductive to the progression of virtual reality
development and presents the three standards of Virtual Reality (1998: p.7): immersion,
interactivity and information intensity.
In the book Architecture Depends (2009: p.87), Jeremy Till asserts that an established
statement by Laurie Anderson (that virtual reality will never be fully accepted as truth until
developers ‘learn how to put in dirt’) is not to be literally translated and subsequently posits
that dirt refers to temporality. The more obvious, literal definition appears valid nonetheless
in that the technological restraints put upon virtual world construction (e.g. texture tiling,
model duplication, sound repetition, etc.) dictate a flatness that differentiates VR from
reality. Within even the most complex game engines, every pixel is accounted for and clearly
stated within the program code whilst reality is encapsulated in (seemingly) infinite and
uncontrollable variations. Modern VR systems do attempt emulation of temporality,
particularly within the genre of role-playing and real-time strategy games. Weapons may
deteriorate, buildings may fall into disrepair, plants and animals may starve, and relationships
may lose closeness; these are but a few examples of gameplay mechanics simulating the
temporal dirt of reality, with consequences and interrelations that relate directly to the player.
However, the association between virtual reality and time documented by Till asserts that
such efforts remain distinctly unreal because such temporal elements are autonomous and
most commonly relate to an internal clock as opposed to the globally shared system of time.
Hughes and Stapleton (2005) consolidate relevant literature to elucidate various conceptual
positions on the reality/virtuality continuum. In this framework reality and virtual reality
(VR) exist on polar extremes with mixed reality (MR - a relatively even distribution of real
and virtual elements within an environment) occupying the centre point, augmented reality
(synthetic objects added into a real landscape) positioned between MR and reality, and
augmented virtuality (real objects, situated within a virtual landscape) located between MR
and VR. In addition, Hughes and Stapleton note that MR also refers to the entire spectrum of
reality/virtuality and that the reality types are points on a continuum rather than discrete
classes of reality.
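The ordering described by Hughes and Stapleton can be sketched as a simple data structure. The following Python example is an illustrative sketch only: the ordering of the five labels follows the text above, but the numeric positions (0.0 to 1.0, evenly spaced) and the helper nearest_label are assumptions introduced here to show how an environment might be treated as a point on a continuum rather than a member of a discrete class.

```python
# Ordering follows Hughes and Stapleton (2005); the numeric positions are
# illustrative assumptions chosen only to preserve that ordering.
CONTINUUM = {
    "reality": 0.0,
    "augmented reality": 0.25,
    "mixed reality": 0.5,
    "augmented virtuality": 0.75,
    "virtual reality": 1.0,
}


def nearest_label(position: float) -> str:
    """Name the labelled point closest to a position in the range 0.0-1.0."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    return min(CONTINUUM, key=lambda label: abs(CONTINUUM[label] - position))
```

An environment placed at 0.3, for instance, would be reported as closest to augmented reality while still retaining its exact position, which reflects the point that the reality types are markers on a continuum.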
THE VIRTUALITY OF SOUND
A player is situated in their living room, seated comfortably and initiating their game
experience. As they commence by progressing through the game setup menus, a variety of
feedback beeps confirm their actions within the interface. As the player enters the game, they
hear a transcendent voice counting down to match-start followed by verbal orders from a
non-player character (NPC) heard over a short-wave radio. The player hears the sound of
their plasma rifle charging, their footsteps in the snow and the rattle of their equipment as
they proceed. As an enemy is observed and engaged, the sound of plasma weapon fire
dominates the soundscape, followed by a visceral impact expressed by the sound of screams
and melting flesh. As the enemy lies sprawled across the battlefield, their voice (originating
from a live player via voice-over internet protocol [VoIP]) is heard loud and clear:
“Whatever! Lucky shot!” as a chorus of cheers confirms the player’s righteous kill.
The above scenario elucidates the complexities that arise when attempting to classify and
measure sonic virtuality. Feedback sounds supporting menu/interface navigation have no
physical relationship between action and sound, with the chosen sound sample retaining only
a semantic association. Transcendent voice-overs are pre-recorded material with no
identifiable (real or virtual) source whilst NPC radio messages originate from a clearly
identifiable source but are not propagated via a natural physical source. The sound of a
fictional entity such as a plasma rifle (a symbolic auditory icon [Grimshaw and Schott,
2008]) does not possess physical causality and can only be created by way of synthesis or
sampling (recording) of an alternative sound (again, with only a semantic link between sound
and source/action). Footsteps in the snow may utilise a sample taken from a genuine source
yet often a recording of walking on corn starch is implemented (an oft-used technique in
cinema). This represents a common example of hyper-real sound, in which designers have
consistently utilised an alternative sound sample to the extent that a listener is more likely to
recognise the false sample as more genuine than the physical sound.
All of these sounds originate from the game engine, but they still exist within reality, propagated by artificial (but nonetheless physical) equipment, travelling and reflecting
through real spaces (Grimshaw, 2008). In a similar vein, Natkin (2000) documents a practical
implementation of virtual soundscapes within real spaces, in which the sound waves are
propagated via headphones and utilise postproduction processing to create an artificial
representation of the physical space. In this circumstance it is not the actual sound that is
virtual, but instead the method with which the sound is broadcast. For Natkin, this
propagation of sound is virtual, though the sound samples themselves are highly likely to be
classified by many as real. Bronkhorst (1995) provides a correlating argument, identifying a
differentiation of sound virtuality between sounds that originate from a natural and from an
artificial propagation source.
This notion is complicated further when considering VoIP sound. Although VoIP transmitted
sound originates from a live speaker, the information is digitised then re-encoded as sound
wave data before it reaches the listener and commonly a delay exists between speaking into
the headset and receiving the sound from the sound output. Whilst the sound input possesses
a stronger temporal and semantic association to the received output, the propagation is
nonetheless artificial. This can be expanded to question the reality of electronically amplified
live sound and further complexity arises when considering the difference between analogue
and digital signal processing (an issue that has garnered significant debate in recent years). If
we were to define virtual sound as any sound propagated by an artificial projection medium,
then a significant proportion of academic experimentation into sound deals exclusively with
virtual sound. This highlights the following questions: is the nature of
recording/playback/amplification of a sound originally emitted from the natural world enough
to make that recording virtual? Is there a distinct separation between virtual and artificial?
At present, mainstream computer video games generate sensory data in visual, sound and
tactile modalities. As a result, even game developers that strive to achieve realistic
soundscapes must expand the purpose of sound to perpetuate a simulated sensation of
olfactory, tactile and gustatory input via representational inference. A visually rendered
corpse alone may not trigger an olfactory sensation of decay, but the sound of buzzing flies
and wriggling maggots around it has a much greater potential to do so. This issue establishes
a clear divide between reality and the virtuality of current computer video games. Sound
designers are often required to compromise between playability (a sound design that supports
player action via extra-diegetic sound feedback, representative of other sense data) and
realism (a soundscape more akin to reality, focussed upon diegetic sounds). The notion of
realism is questionable, however, when observing a game experience as a whole. Although a
realistic virtual soundscape may (virtually) reflect its reality counterpart, the lack of sound
input compensating for other sensory modalities creates an incomplete experience, lacking
immersion and, ironically, appearing unrealistic. Extended exposure to fictitious Hollywood-esque Foley sounds has meant that genuine source recordings of many dynamic sounds
(shotgun blasts, footsteps in the snow, etc.) are often perceived to be unrealistic. As the lines
separating the virtual and the real become increasingly blurred, the audience is becoming
more likely to be immersed by the hyper-real than the real. Games are not simply simulations
of real events; they are unique constructs that are better perceived as real-life activities
(Shinkle, 2005).
Several questions concerning virtuality are hereby raised: how can we better understand sonic
virtuality? Where does reality end and virtuality begin? Can virtuality in relation to sound be
classified and measured and, if so, how? Subsequent sections address these questions and the
chapter includes a conceptual framework to facilitate clearer classification of sounds and
support future development of a workable measurement system for sonic virtuality. Here, the
various concepts of virtuality (as they relate to sound) form a framework that distinguishes
between real and virtual sound, whilst allowing space for incremental points between the
polar extremes. Understanding the complex interrelations between human and environment
(be it real or virtual) is fundamentally shaped by the psychological approach adhered to when
theorising the processes of the human mind. This chapter advocates EC theory to explore the
way in which we define, perceive and interact with sound. Virtual and real acoustic ecologies
are also explored within the context of affective response to computer game play.
VIRTUAL ACOUSTIC ECOLOGY AND EMBODIED COGNITION
The following sections explore the relationships that exist between player, game and
environment from a more distant but comprehensive perspective. Commencing with a
detailed review of embodied cognition (EC) theory, they explore the relevance of EC to
the way in which we perceive and relate to game sound. Original acoustic
ecology theory is presented and contrasted with the more recent notion of virtual acoustic
ecologies and finally, EC theory is overlaid to support the production of an embodied virtual
acoustic ecology (eVAE), presented in chapter 9.
Human-audio interactions have been demonstrated to be self-sustaining, autopoietic systems
referred to by Grimshaw (2008) as acoustic ecologies. Within the domain of reality, the
acoustic model has already been discussed, refined, and more recently has been adapted;
integrating relevant game theory to establish a virtual acoustic ecology to elucidate the
dynamic interaction between the player(s) and artificially generated audio landscapes.
Understanding the complex interrelations between human and environment is fundamentally
shaped by the psychological approach adhered to when theorising the processes of the human
mind. This chapter cites Embodied Cognition (EC) as the preferable perspective template for
our ecology model, with a detailed examination of EC concepts and their impact upon the
understanding of fear processing during game play. Situational and temporal cognition are
evaluated within a CVG context, alongside cognitive offloading mechanisms and bottom-up
construction theory. Core concepts of embodied cognition, acoustic ecology, CVG
experience and fear processing theory are brought together to inform construction of an
acoustic ecology of fear within a virtuality framework. Beginning with an overview of fear
conceptualisation and processing from various perspectives, these sections progress by
examining the six main concepts of embodied cognition and integrate them to put forward an
embodied perspective of fear processing which is then applied to construct a model of virtual
ecology between player and game world with specific focus upon the audio landscape and the
experience of fear.
CONCEPTS OF EMBODIED COGNITION
The mind-body problem associated with Cartesian Dualism (discussed in more detail later
within this chapter) questions the notion that the mind exists as a separate entity from the
body in favour of an integrated system. In searching for an approach to elucidate this
conundrum, research can be referenced that supports the concept of such a unified system and
identifies problems with a centralised framework. Stepper and Strack (1993) revealed that the
posture of an individual had a notable effect on emotional response. Duckworth et al. (2002)
produced an experiment that revealed faster reaction times to positive stimuli when
responding by moving a lever towards the body, and matching results to negative stimuli
when moving the lever away from the body; revealing an association between emotional
valence and physical movement. Core cognitive tasks such as reading and comprehension
have also been associated to bodily states. Havas et al. (2007) induced smile and frown states
during emotion comprehension tests and discovered that smile induction facilitated
understanding of positive events and inhibited the comprehension of negative events, whilst
frown induction facilitated negative understanding and inhibited positive understanding.
Existing research strongly supports the notion of integrated cognition, stating that ‘[t]he
biological mind is, first and foremost, an organ for controlling the biological body’ (Clarke,
2007: p.1).
Traditional artificial intelligence (AI) frameworks operate on a sense-model-plan-act (SMPA)
system, a framework based on traditional cognitive theory that has yet to meet the immediate,
dynamic requirements of a real environment. Anderson (2003) criticises the central SMPA
system, suggesting that in attempting to react to real-world dynamics, a central system would
need to store an individual response plan for every potential future outcome. The number of
outcomes would depend on the number of variables, compounded by the variance of each
variable and by every possible interaction between two or more variables; as
such, the likely number of stored action strategies would be unmanageable. Anderson (2003)
suggests that an intelligence structure must relate abstract thought to primitive, evolutionary
devices but does point out that, although this target has not been achieved and handling of
such data is beyond current computing technology, progress is being observed and a central,
representational model should not be altogether dismissed. The fundamental lack of
immediate responsiveness is arguably the essential problem associated with the centralised
cognition concept. As also documented earlier within this chapter, Clarke’s (1997: p.21)
concept of the representational bottleneck reveals the efficiency limitations of a central
processing design and advocates the integrated model. Von Uexkull’s (1957) concept of
Umwelt may explain how the mind reduces incoming data to increase efficiency of
processing, but further questions are raised: is perception defined by lifestyle, and what
factors dictate the nature of these desires and needs?
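Anderson's (2003) objection to a central store of response plans, raised earlier in this section, can be made concrete with simple arithmetic. The Python sketch below is an illustration under stated assumptions, not Anderson's own formulation: it assumes each environmental variable is discrete with a finite number of states and that every combination of states counts as a distinct outcome requiring its own stored plan. The function name count_response_plans is hypothetical.

```python
from math import prod


def count_response_plans(states_per_variable: list[int]) -> int:
    """Count the joint outcomes a central planner would need a stored plan for."""
    return prod(states_per_variable)


# Three variables with 2, 3 and 4 possible states give 2 * 3 * 4 joint outcomes.
small_world = count_response_plans([2, 3, 4])

# Ten variables with ten states each already give ten billion joint outcomes.
larger_world = count_response_plans([10] * 10)
```

Because the count is a product rather than a sum, each additional variable multiplies the number of plans to be stored, which is the sense in which the number of action strategies quickly becomes unmanageable.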
We cannot hope to truly understand the nature of our reality; a task described by Fox (1997)
as ‘like trying to carry water in our hands. It is not a thing to grasp or keep’. Fox references
Heidegger’s concept of Befindlichkeit, as ‘the way our thrownness is disclosed to us’ (1997),
and suggests that how we interpret, attend to or be with our thrownness is our only real
freedom. Although disagreement exists surrounding the exact nature of this concept, Dreyfus
(1991: p.173) concludes that it is our emotional state that defines our individual
Befindlichkeit, proposing a parallel between affect, thrownness and embodied cognition in
that an individual cannot be separated from their emotions; as Dreyfus states, ‘we cannot get
behind our moods; we cannot get clear about them, and we cannot get clear of them’ (p.173).
Rapid eye movement (REM) dream states provide a possible example of disembodied
cognition, in which partial nullification of primary sensory cortices allows the mind to become
disconnected from the environment; it is posited that the sensation of presence and
experience is consequently internalised (Laureys & Tononi, 2009: p.100).
Dreams originate from the forebrain regions that also govern much of cognitive processing
during conscious states (Bischof & Bassetti, 2004; Solms, 1997). During REM sleep, the
limbic regions of the brain such as the amygdala have been shown to exhibit greater activity
during a dream state than in wakefulness. Areas of the prefrontal cortex that receive input
from limbic structures show significant activation during REM sleep, including the areas
of the forebrain that process mental imagery, spatial awareness and symbolic representation
(Laureys & Tononi, 2009: p.94). Research posits that higher level thought processes can
occur during dream states, including: conscious perceptual representation (LaBerge &
Degracia, 1998); speech production (Salzarulo & Cipolli, 1974); and metacognition (Kahan
& LaBerge, 1994). This suggests that cognitive interpretation of internally generated data is
plausible when we consider our ability to make sense of our surroundings during a dream.
A lucid dream (in which the dreamer is aware that they are dreaming) has significant
potential for facilitating higher-level internalised thought. LaBerge and Degracia (1999)
review a number of texts that claim individuals are capable of situating themselves within a
dream world, in which their simulated senses can influence their actions and react to temporal
influence due to an awareness of precedents and antecedents within the dream. Sensory
stimulation across all modalities, whilst acknowledged not to be real, is nonetheless felt by
the dreamer as a vivid sensation that is close to that experienced in reality. LaBerge and
Levitan (1998) conducted studies on lucid dreamers, evaluating the difference in subjective
sensation of somatosensory experiences. The results indicated that the brain is capable of
modelling particular touch sensations; specifically light touch and pressure were vividly
experienced (however pain was not) during lucid dreaming. LaBerge and Degracia (1999:
p.299) go on to posit that the experiences that occur during a lucid dream are likely to
be remembered after waking and, ultimately, have the potential to transcend from the dream
world to reality and alter the course of a dreamer’s waking life.
Cognitive thought during wakefulness is arguably influenced by immediate sensory input to
the degree that the environment must be integrated into all frameworks of human thought
processing, thereby observing the mind within an ecological perspective. However, during
unconscious states associated with dreaming, the mind appears capable of interpreting
internally generated data, evoking emotional sensations via the limbic regions of the brain
and also stimulating autonomic physiological activity and virtual motor responses.
Furthermore, the research into lucid dreams supports the notion that the mind is able to reflect
upon internalised scenarios and respond with voluntary virtual interactions. Internally
generated stimuli during dream experience support the notion of sensory simulation (a
central aspect of embodied cognition theory); suggesting that this phenomenon can occur in
both consciousness and dream states. This could suggest that during a dream state,
information regarding sensory input (collected during wakefulness and stored in the long
term memory) is recalled and reconstituted as a simulated experience, essentially creating a
virtual environment within which the cognitive processes can remain embodied.
Embodied cognition can be understood as thought processing within the here and now. The
fundamental idea behind time-pressured cognition (the now) is that all human thought can
potentially be influenced by the concept of time as perceived by the individual, and relating
to objects or events. Liberman and Trope (1998) illustrated how an individual’s perception
towards a future event could change in response to different relative temporal distances.
Personal evaluation has also been described as susceptible to the influence of psychological distance (PD); as research by
Freitas, Salovey and Liberman (2001) reveals, individuals were likely to employ a negative,
diagnostic assessment when it was expected in the more distant future but more likely to
prefer a positive, non-diagnostic assessment when it was perceived as imminent. Greater
temporal distance encourages more generalized thought (one cannot see the trees for the
forest) whereas immediacy evokes increased specificity (one cannot see the forest for the
trees). Time therefore manipulates attention and becomes a significant factor in appraisal and
decision-making (Liberman & Trope, 2008). Temporal distances are interrelated quantifiable
values that, alongside hypotheticality, spatial and social distance, establish PD and influence
higher-level cognitive processes such as evaluation and prediction (Bar-Anan et al., 2007;
Liberman & Trope, 2008; Stroop, 1935).
The central notion of situated cognition (the here) is that all informational processing is
susceptible to the continuous stream of incoming sensory data (Wilson, 2002). Furthermore,
any sensory information that is stored in long-term memory (LTM: alongside any relationship
between the sensory input and associated objects, events, physiology, behaviour, etc.) has the
potential to influence future thoughts regardless of construal level or context. Wilson (2002)
suggests that thought processing gradually builds a framework of automated subcortical
routines. Regularities in comparable circumstances encourage an automated response
generated by sensorimotor simulation; essentially a behavioural response, preceding
cognitive appraisal and contextualised by conditioned representational links. This concept is
supported by Garbarini and Adenzato (2004) who argue that cognitive representation relies
on virtual activation of autonomic and somatic processes as opposed to a duplicate reality
based in symbols. For example, the sound of hissing is likely to have an acute physiological
impact upon an individual with an anxiety towards snakes. Although establishing threat
connotations (there is a snake nearby, snakes are poisonous, snakes could bite me) to the
object requires cognitive signification, a history of ophidiophobia would support a
conditioned subroutine, bypassing lengthier cognitive processing to connect the object (hiss
sound) directly to the ANS. An embodied theory would not accept pure behavioural
conditioning however and, instead, would suggest that the object would first stimulate virtual
sensory data (a snake’s image, movement, etc.) that characterise the actual stimulus and
generate a threat interpretation. The entire process remains fundamentally cognitive but only
a fraction of the input data needs fully appraising as the simulated data is already directly
linked to the ANS through conditioning; supporting an efficiently responsive process
achieved via reduced cognitive load.
Recollection of memories to deduce and arrange future plans is also embodied in attachments
to sensory data. Existing research has argued that memory retrieval can cause a re-experiencing of the sensory-motor systems activated in the original experience, the
physiological changes creating a partial re-enactment (Gallese, 2003; Niedenthal, 2007). The
notion of implicit memory, relating to perceptual fluency and procedural skill (Johnston,
Dark & Jacoby, 1985) supports the developmental nature of embodied cognition. Wilson
(2002) argues that implicit memory is automated action; acquired through practice whereby
repetition instils conditioned movements and reduces the need for full cognition. Returning to
our snake example, declarative knowledge of how to effectively deal with a snake may reveal
a coping strategy. However, a lack of practice requires increased cognitive processing
(remembering instructions, talking internally through the action), reducing the immediacy
and accuracy of the action. By contrast, implicit knowledge of the coping strategy forged
through practice and repetition generates direct pathways between the stimulus and the
somatic nervous system, allowing fast and controlled response action. Wilson (2002) argues
that these processes of perception and action have the potential to become ‘co-opted and run
“off-line”, decoupled from the physical inputs and outputs that were their original purpose, to
assist in thinking and knowing.’ A potential consequence of this theory is that any prior
thought process that generated representations and relations between objects has the potential
to impact upon any future thoughts regardless of construal level.
In consideration of the various perspectives and concepts detailed above, it is a hypothesis
of this thesis that the fear response exists on a two-level model of thought processing:
cognitive associative functions that bypass conscious appraisal to directly connect sensory
data to both the autonomic nervous system and motor responses; and full cognition, where
reflective thought and emotional state are defined and the entire situation is rationalised and
comprehended. It is the behavioural-cognitive level that is central to the experience of fear, as
the initiating process and the determiner of immediate response. Sensory simulation, as
observed in dream states and everyday tasks, helps to explain the procedure in which the
mind manifests a notion of threat from a single, disparate stimulus by generating a full
context via virtual associated stimuli.
ACOUSTIC ECOLOGY
Sound has the capacity to convey more emotional content than any other component of a
CVG experience (Parker & Heerema, 2007). Research suggests that human recognition of the
fear emotion peaks during exposure to audio stimuli, indicating that audition has a
greater association with the fear response than any other modality of sensory input (DeSilva et
al., 1998). Sounds have the potential to not only influence an audience’s perception of a
visual scene (Tinwell, 2009) but also to generate immersion, depth and emotional colour via
sensory simulation.
Research concerning environmental sound is no longer in its infancy and yet the exploration
of emotion processes within CVG non-musical audio contexts is still a recent venture.
Existing research suggests that audio conveys negative emotional data (specifically fear and
sadness) more effectively than visuals (DeSilva et al., 1998) and controlled sounds associated
with virtual characters have been shown to increase the perceived scariness of those characters
(Tinwell, 2009). Quantitative psychophysiological research has recorded biometric (facial
muscle, cardiac and electro-dermal) activity in response to various sounds and revealed
significant variation in response between different sounds (Bradley & Lang, 2000). Ekman
and Kajastila (2009) further suggest that specific characteristics of an audio signal may
impact upon the listener’s emotional response, revealing a difference in fear response in
reaction to changes in sound position and spread. This information not only identifies an area
of research in need of further development, but also nominates audio as a feasible approach to
manipulating a player’s fear response.
Preceding acoustic ecology construction, this section explores the functions and processes of
listening. Breinbjerg (2005) describes the anthropocentric nature of the listening experience
and indicates that, unlike vision, audio stimuli cannot be shut out. Breinbjerg suggests that
sounds facilitate the perception of physical properties attributed to objects outside our visual
perspective and can confirm or enhance perceived detail of physical properties that lie within
our visual perspective. Breinbjerg also describes the function of listening as a way of
realising the design of the set (immersion in the nature of the environment) and the narrative
(objects and/or actions that the listener may need to react to) of a landscape. Kromand (2008)
supports this notion, proposing that sound exists as a purveyor of information and immersion.
Tuuri et al. (2007) posit that perceptual processing is intrinsically linked to the functionality
of audio, stating: ‘The procedural chain of events, actions and causalities in a situation can
give an indicative meaning even to a meaningless beep’. Studies have argued that contextual
scenarios can have a tremendous effect upon the way we listen and, consequently, research
has identified several modes of listening that attempt to account for such effects. Gaver
(1993) suggests that non-specific, or everyday listening refers to hearing events within the
environment rather than the sounds themselves; a definition supported by Tuuri et al. (2007).
Gaver elaborates, insisting that, during everyday listening, audio perception bypasses
conscious semantic translation. The sound of a car’s engine accelerating is not consciously
perceived as such, but simply as a car. Chion (1994) identifies this type of listening as
causal listening, and defines the process as ecological and event-orientated. With regard to
the study of CVG sound, Collins (2006) describes causal listening as the ‘preparatory
function of game sound, affording the player information relating to game objects’ positions
and dynamics’.
Psychophysiological experimentation has provided quantitative data to support the notion of
automatic, instinct-based audio processes. Alho (1997) presents the concept of pre-attentive
audio processing and identifies pre-conscious brain activity resulting from infrequent changes
in repetitive musical patterns. Bach et al. (2009) discovered that, overall, reaction times for
audio cues were slower than those for visual targets, a logical outcome when considering that
an audio signal exists along a temporal plane and, consequently, time is required to
project all information contained within the sound (as opposed to the immediate disclosure of
data when viewing a static image). Cusack and Carlyon (2004) expand upon the above
concepts in their exploration of attention processes on audio perception. They describe a
“hierarchical decomposition of the soundscape” wherein attention is focussed on ever
increasing levels of specificity. This concept reflects the notion of reduced listening (Chion,
1994), which describes conscious attention towards the sounds themselves. Ekman (2009)
posits that attention ‘will guide the traversal of the “listening hierarchy” and so determine
what detail the listener will and can attend to at each moment’.
Attention is one of four influences contributing to audio perception, as identified by Ekman
(2009) alongside proprioceptive sounds, emotions, and multimodal processing.
Proprioceptive sound refers to sound occurring inside or conducted through the human body
(swallowing, heartbeat, sounds conducted through bones). Ekman (2009) suggests that
extreme stress can also impact upon sonic perception. Using examples including soldiers and
law enforcement officers, Ekman describes auditory acuity (an increased sense of clarity and
specificity) and auditory blunting (the loss of sound detail or inability to hear very loud
sounds) as involuntary perceptual filters, and observes from a survey that auditory blunting is
a commonly occurring phenomenon in scenarios generating extreme stress. Referring back to
Massumi (2005), increased acuity is a potential response to the apprehensive terror phase
(senses are primed and attention focuses on data associated with the threat), whilst blunting
neatly integrates with the horror phase (innate physical response behaviour is prioritised and
present sensory data and cognitive appraisal are attenuated until the action reaches a stop).
Emotional influence incorporates personal preferences, cultural and social factors and Ekman
proposes that these factors “compose a frame of reference in evaluating heard sounds”, a
function that reveals similarity to the notion of semantic listening (Chion, 1994). Multimodal
perception describes the impact of data acquired from other human senses upon auditory
perception. Recent research has typically explored multimodal phenomena in terms of
audio/visual effects (Adams et al., 2002; Ma & McKevitt, 2005; Özcan & van Egmond, 2009;
Väljamäe & Soto-Faraco, 2008) with several articles addressing the concept as part of the
acoustic ecology of CVG experience (e.g. Ekman, 2009; Grimshaw & Schott, 2008).
Grimshaw and Schott (2008) support Chion’s (1994) three-part construct of semantic,
reduced and causal listening but also incorporate a fourth. This mode, entitled navigational
listening, is defined by Grimshaw and Schott as sound that guides the individual through the
world via audio beacons. Grimshaw (2007) argues that a central aspect of human interaction
with sound is the ability to: conceptualise the position of the object (as relative to the
individual); identify movement speed and direction; and support kinaesthetic interaction with
objects within the environment. Schafer (1994) suggests that sounds can facilitate the
identification of spaces or territories without requiring visible boundaries. Breinbjerg (2005)
extends the notion of space by categorising types. Architectural space describes areas with
quantitative and measurable sizes/boundaries such as an indoor environment. Relational
space describes the space indicated by the distance and position of sound sources in relation
to the listener. Breinbjerg also describes the notion of space as place; the phenomenon of
semantic meaning attributed to the audio environment that identifies a place within a
historical and geographical context. This concept can be compared to the notion of temporal
functions of audio; as Grimshaw (2007) asserts, sound ‘also has the ability to indicate a point
or period of time in the past, present or future’. Space as place and historical temporality bear
relation to Parkes and Thrift's (1980) concepts of paraspace and paratime (particularly social
time).
Tuuri et al. (2007) identify eight discrete modes of listening that can be positioned along the
CLT continuum; from the low level construals associated with pre-cognitive listening to
higher level source, context and quality orientation. In this framework a distinction is made
between connotative (immediate, free associations labelled pre-cognitive) and semantic
listening (cognitive evaluation of symbolic / conventional meaning); a separation supporting
the assertion that higher level associative thought functions can transcend from the conscious
to the pre-cognitive via conditioning. The specific modes of listening support the process of a
fear experience; at the pre-conscious level, reflexive listening connects the sound object to
the ANS and SNS for immediate physiological support (a sudden noise may cause a startle,
increased respiration and perspiration), whilst connotative listening stimulates the perception
of associated virtual stimuli (simulated multi-modal sensory input that supports the audio).
Within a more generalised cognitive appraisal, causal listening utilises the information gained
to identify a potential source whilst empathetic listening uses the same data to assess affective
content and propose emotional motivation. Requiring more attuned focus upon the sound,
functional listening describes an attempt to identify the function of the sound and
consequently, the possible function of the source object. At the higher construal level of
audio comprehension lies semantic, critical (an evaluation of the associative strength between
audio and function) and reduced listening (an awareness of the individual properties of the
audio signal). Although comprehension of acoustic properties via reduced listening requires
conscious higher-level cognitive appraisal, variations in these parameters (position,
movement, loudness, etc.) have been revealed to influence precognitive and emotional
responses (Ekman & Kajastila, 2009; Garner et al., 2010). Several properties of a sound have
well-established associations to environmental information that is perceived as significant
during a fear experience. The relative loudness of an audio signal suggests object size and
relative distance (Winer, 2005) whilst an increasing or decreasing volume indicates
movement and direction (Bach et al., 2008). Low frequency sounds reflect the growling of a
predator whilst high-pitched sounds reflect pain and screaming, both of which signify a threat
(Parker & Heerema, 2007). Within a genuine fear evoking scenario it is unlikely that critical
and reduced listening will be utilised; however within the context of a computer video game,
it is probable (particularly as their coping increases) that a player will assess the effectiveness
of the game sound in achieving the desirable effect and consequently utilise reduced listening
to support their conclusions (for example – the sound that accompanied the sudden monster
attack was not frightening because the relative loudness was not great enough).
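The loudness cues described above can be illustrated with a minimal sketch. It assumes an idealised point source in a free field, where sound level falls by roughly 6 dB for each doubling of distance (the inverse-distance law). This assumption, the function names and the 0.5 dB tolerance are illustrative only and are not drawn from the studies cited; game audio engines typically apply their own attenuation curves.

```python
import math


def level_change_db(reference_distance_m, distance_m):
    """Level change in dB, relative to the level at the reference distance,
    for an idealised point source in a free field (assumption)."""
    if reference_distance_m <= 0 or distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(reference_distance_m / distance_m)


def movement_cue(levels_db, tolerance_db=0.5):
    """Classify a series of level readings (oldest first) as a movement cue.

    A rising level is read as an approaching source, a falling level as a
    receding one. The tolerance is a hypothetical threshold, not an
    empirically derived value.
    """
    if len(levels_db) < 2:
        raise ValueError("at least two readings are required")
    change = levels_db[-1] - levels_db[0]
    if change > tolerance_db:
        return "approaching"
    if change < -tolerance_db:
        return "receding"
    return "static"


# A source moving from 8 m to 2 m, with 1 m as the reference distance.
readings = [level_change_db(1.0, d) for d in (8.0, 4.0, 2.0)]
cue = movement_cue(readings)
```

Under these assumptions, each halving of distance raises the level by about 6 dB, so the rising series of readings is classified as an approaching source, which is the kind of cue the paragraph above associates with threat appraisal.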
Grimshaw (2007) posits that a comprehensive understanding of soundscape optimisation
facilitates more than simply a replication of reality into virtuality; instead game sound design
should seek to understand the relationship between the virtual audio landscape and the
acoustic ecology of reality. Grimshaw also identifies the complexities that arise from
synchresis of audio/visual relationships, the simultaneous presentation of acousmatic and
ideodiegetic sounds and the kinaesthetic feedback loop resulting from player interaction. It is
this loop that is of prominent importance when one considers that the essence of a game
playing experience is arguably interaction. Ekman (2008) argues that: ‘Games ask players to
become active and play. Hence, sounds must support action, respond to player control and
often survive high repetitiveness’. Here, sound is identified primarily as a ‘facilitator and
confirmatory of action’ and, therefore, the most effective sounds will have a transparent
perceptual association to a gameplay element/event. Gärdenfors (2002) testifies to the
importance of this association by documenting the use of sound specifically designed to
enhance a gameplay action, which in reality has no association with that particular sound; a
good example would be the sound of Mario jumping in Super Mario Bros. (Nintendo,
1985).
At present, mainstream computer video games generate sensory data in visual, audio and
tactile modalities, and the latter is limited to interaction with the control interface. Feeling the
sensation of physical action within the virtual world is typically limited to vibration feedback,
providing a gross generalisation of tactile experience. To differentiate between various
environment textures or to utilise olfaction to further characterise a scene are notable research
endeavours (Hoffman et al., 1998; Richard, Tijou, Richard & Ferrier, 2006) but not
mainstream or commercial ones. As a result, even games that strive to achieve realistic
soundscapes must manipulate the limited resources to perpetuate a virtual sensation of
olfactory, tactile and gustatory input via representational inference. A virtual corpse alone
may not trigger an olfactory sensation of decay, but the sound of buzzing flies and wriggling
maggots around it arguably has a much greater potential to do so. This issue establishes a
clear divide between reality and the virtuality of current CVG; audio designers are often
required to compromise between playability (a sound design that supports player action via
extra-diegetic sounds, representative of other sense data) and realism (a soundscape more
akin to reality, focussed upon diegetic sounds). The notion of realism is questionable,
however, when observing a CVG experience as a whole. Although a realistic virtual
soundscape may accurately reflect its reality counterpart, the lack of audio input
compensating for other sensory modalities creates an incomplete experience, lacking
immersion and consequently appearing unrealistic. Extended exposure to fictitious
Hollywood foley sounds has meant that genuine source recordings of many dynamic
sounds (shotgun blasts, footsteps in the snow, etc.) are often perceived to be unrealistic
(Parker & Heerema, 2007). Within the realms of fiction media, the lines separating the virtual
and the real have become blurred, creating scenarios where an audience is more likely to be
immersed by the hyper-real than the actual. Computer video games are not simply
simulations of real activities and, particularly within the fictional constructs of a survival
horror game, are better perceived as real-life activities themselves (Shinkle, 2005). Grimshaw
and Schott (2008) point out that a developing understanding of sound design (within a
specific context/domain) resulting from experience in playing several similar computer video
games or learning a single game system can alter sonic perception via the learning of
conventions. Sounds that initially signified an unknown enemy now identify the physical
characteristics, motivation and damage potential, allowing a player to make an informed
threat assessment. Within a survival horror game, such a system could be intentionally
exploited to support the player (see Left 4 Dead, Valve 2008) but also has the potential to
undermine the effectiveness of a terror stimulus.
A sound without context is effectively a shell that can have little significance for the listener.
The contextualisation of sound is arguably a process central to acoustic ecology, where
assumption and expectation originating from the individual’s perception of their current
environment establish a context frame that supports associations made between the sound
and information regarding the source (Özcan & van Egmond, 2009). Situational knowledge
provides a filter of associations perceived as irrelevant (Bar, 2004), allowing the mind to
more efficiently reach appropriate conclusions (such as, knowledge of dog ownership is
likely to filter associations between light impact sounds heard during the night and the
concept of intruder as source). Bar and Ullman (1996) argue that contextualisation occurs in
two stages: firstly the sound stimulates generation of associated sensory simulations, then the
collective information is correlated to more salient representations stored in the long-term
memory. Contextualisation has the capacity to alter completely a listener’s perception of
an audio source, and preceding audio signals are also capable of manipulating the perceived
source of a subsequent sound (Ballas & Mullins, 1991).
In An Introduction to Acoustic Ecology (2000), Wrightson states that sound cannot be
disconnected from the natural environment. The acoustic environment is determined by
the culmination of all processes and physical properties of the world, and consequently cannot be
sustained without change to its nature during environmental changes. The audio landscape is
connected to both the individual and the environment by a bi-directional influence potential
(Truax, 1984) that incorporates both external and internal sound. Internal sound is referred to
by Schafer (1977) in the form of internalised dialogue; suggesting that such a phenomenon
can modulate attention and attenuate sounds originating from the environment, and that
individuals may consciously restrict environmental sounds to support internal dialogue and
may also amplify incoming environment sound to attenuate internal dialogue. Sound reflects
the ecology, establishing equilibrium across the sonic spectrum (incorporating frequency,
tempo, rhythm and volume), allowing individual sounds to be distinguishable in even dense
audio environments (Krause, 1993). Listening becomes an embodied experience, dependent
on not only the past memories of the listener, but also on their present state and the countless
interactions that occur between the listener and the ever-changing environment. Wrightson
(2000) presents an example to elucidate this point, referring to the impact of industrialisation
on both the functions of audio within human society and the gradual deterioration of audition
accuracy, whereby human hearing, once capable of determining a range of audio subtleties, can
now only describe sound in (comparably) polar extremes. Industrialisation has also been
charged with damaging the acoustic ecology in a way that ultimately is threatening to life.
Barot (1999) argues that certain artificial sounds can acoustically match
bird mating calls, thereby creating a sonic barrier between animals in a
way that limits their capacity to reproduce. The above argument supports the fusion of sound
with ecology, suggesting an interrelationship whereby life has the capacity to influence sound
and sound has the capacity to influence life; a concept succinctly outlined by Truax (1984).
The concept of a virtual sonic ecosystem within a first-person shooter (FPS) resonates with
the notions of embodied cognition and acoustic ecology in that it embraces an amalgamation
approach; classifying listener, soundscape and environment as inseparable components in an
integrated system. Virtual acoustic ecology asserts that CVG sound exists as part of an
intricate relationship between the player, the audio engine (virtual soundscape) and the
resonating space (real acoustic environment). Grimshaw and Schott (2008) describe FPS
audio engines as sonification systems, in which both individual sounds and collections of
sounds have both intended meaning (as established by the designer) and received meaning
(player interpretation). In The Acoustic Ecology of the First-Person Shooter (Grimshaw,
2008) the nature of this system is elucidated via identification of: the various functions of
FPS audio; the individual relationships that exist between components of the system; the
perceptual factors that influence interpretation; and the unique circumstances that
contextualise the ecology within a CVG framework.
The foundational belief underpinning an embodied ecology construct of listening is that our
biology, the enveloping sensory input of the present and the LTM data representing our
history, are all crucial factors in thought processing and behaviour determination; essentially
an intricate matrix of causality with emotional affect residing at the helm. Emotion directs
attention towards specific current stimuli and filters out sensory data that possesses a low
emotional relevance. LTM retrieval and memory transfer function between short-term
memory (STM) and LTM depend upon the emotional value of the content and, therefore,
govern how an individual’s history will impact upon their current cognition and behaviour.
During an audio-induced fear sensation, the intensity of the emotional experience will
influence the listening modes, determining the nature of the sonification data that will be
encoded from the audio. That information may in turn determine the reappraisal and
characterisation of the audio, subsequent auditory data perception, attention focus, emotional
state changes, etc. Without the capacity to evoke emotion, an audio input could only be
interpreted in an objective and detached manner via reduced listening. The emotional content
of the audio instigates recollection of emotion-related LTM data, facilitating subjective
analysis and representational association.
The principal views of embodied cognition theory provide opportunity for an intriguing
framework within which a more comprehensive theory of game sound can be exhibited.
Whilst it is not presumed that complete understanding of such a complex system is quite within
our grasp, it does reveal that the various theories surrounding this study appear to be, in part,
components working in cooperation with each other to create a dynamic and constantly
changing system. The nature of our environment, coupled with our memories and current
physiology, determines our perception of sound, including: the mode of listening we employ,
whether a sound evokes an emotional response and the exact profile (including intensity) of
the experienced affect.
EMBODIED COGNITION AND VIRTUALITY
The following is a hypothetical thought process designed to elucidate a central concept of
embodied cognition: A contemplative fellow briefly imagines an act of defying gravity and
leaping upwards into space – a notion that, with hindsight, is revealed to have originated from
a brief glance at the sky. As the thought formulates, a visual representation of James Bond on
a jetpack, the physical sensation of leaping ever higher and John Williams’ iconic Superman
theme all manifest themselves. In this example, a single conceptual construct has been
characterised by both virtual sensory stimuli and simulated motor actions. A concept that
many may experience is personalised by the immediate environment and our unique memory,
formed by the inimitable experience of personal history. The concept that cognition is a
biological process grounded by bodily experience and the environment has garnered
increasing support in recent years (Garbarini & Adenzato, 2004). Shinkle (2005) argues that,
although sensory input and physical response are characteristic of a computer video game,
many game studies methodologies overlook the impact that embodied cognition and
physiological response have on game experience. Shinkle also posits that memories of past
events are not limited to stored knowledge of external proceedings; emotional responses
(characterised by physiological states and discrete response behaviours) play a crucial role in
defining these perceptions and experiences. This notion concurs with the argument described
so far in this chapter; a system that processes objective and affective data initially at an
autonomic level, producing physiological changes that are felt by the individual and fed back
into the system for cognitive processing. Questioning the exact nature of this system refers to
a considerable philosophical quandary; the mind-body problem, which asks whether such a
system is a) centralised within the brain and capable of detached processing (Cartesian
Dualism), or b) an integrated system that incorporates all somatic and autonomic actions and
the ecology of the surrounding environment.
Andy Clark’s 1997 book Being There: Putting Brain, Body and World Together Again
advocates the concept of integrated cognition, stating that ‘[m]inds are not disembodied
logical reasoning devices’ (Clark, 2007: p.1) and that rejection of the centralised processor
concept in favour of an embodied perspective is an increasingly popular attitude in the fields
of robotics and artificial intelligence. The concept of integrated cognition bears some
similarity to the notion of autopoiesis, especially in the autopoietic concept of a consensual
domain. This domain is brought about by the structural coupling (the interplay) between
mind, body and environment (see Winograd & Flores, 1986: p.46,49). Clark (p.21) further
questions classical cognitive theory by means of the representational bottleneck concept
which states that, for a central processing unit to function, all sensory data must be converted
into a single symbolic code for comprehension then translated into various data formats to
carry out the different motor responses. Such a process is theorised to be time consuming and
expensive, leading to the conclusion that a centralised system could not possibly respond
adequately to real-time pressures of everyday life. Sensory filters, such as attention, reduce
the processing load by ‘sensitizing the system to particular aspects of the world – aspects that
have special significance because of the environmental niche the system inhabits’ (Clark,
2007: p.24), a system that Clark relates to Jakob von Uexküll’s (1957) concept, Umwelt (a
reduced perception of the real environment as defined by the individual’s needs, desires and
lifestyle).
Construal level theory (CLT) argues that increasing psychological distance
(space-time-relevance) promotes more abstract thought (Liberman & Trope, 2008); however,
psychological distance (PD) can only be measured in relation to the here and now and is
consequently dependent on the current environment. The here and now notion that constitutes
part of the EC theory is detailed in Margaret Wilson’s Six Views of Embodied Cognition
(2002). This text argues that a cognitive model must: be established in a real-world
environment context (situated cognition – the here); recognise temporal and real-time effects
(time-pressured cognition – the now); acknowledge the environment as an integral part of the
model (an ecological framework); and accept that the function of all cognitive thought is
ultimately to guide action whether in the immediate circumstance or in planning for a future
event. A futurity can be related to virtuality in that it cannot be directly interacted with and
exists as an insubstantial entity. If we accept that contextualisation determines a significant
proportion of a sound’s perceptual make-up, then any attachment of PD may severely affect
the perceived reality of a sound (for example, the sound of a distant car alarm may be less
real than an individual’s home fire alarm as the latter is more firmly rooted within their
personal reality).
The concept of EC posits that thought cannot exist outside the here and now and that
conscious appraisal of an object or situation cannot be detached from sensory input, a notion
not dissimilar to that of thrownness. First established by Martin Heidegger in the 1927
publication Sein und Zeit (Being and Time), the concept of thrownness is succinctly
communicated within the contexts of computer science and cognition by Winograd and
Flores (1986: pp.33-35). Within this text, thrownness supports the principles of EC, in that it
refers to existence within the world as fundamentally inseparable from the environment in
which we exist. Winograd and Flores document that to not impact upon the environment is
essentially impossible, as even to do nothing has consequences. They also state that an
individual cannot separate themselves from a situation to reflect upon it, as the situation is not
a static entity but rather a continuous movement, and accurate prediction of outcomes (outside
of a laboratory) is consequently unachievable. This constant fluctuation and evolution of
existence makes a stable representation of the environment (or a situation within it)
unattainable; Winograd and Flores argue that ‘every representation is an interpretation’ (p.35) and that post-event
analysis remains fraught with subjective bias.
If we are to agree that all thought is under the continuous and forceful influence of the
surrounding environment, current personal physiology and long-term memory, and also that
such circumstances dictate that no perception can reflect reality entirely, then we could
further posit that all experience is virtual. We are the sole population of our own virtual
realities. Our universe supports countless parallel worlds, each with many consistencies, but
many with striking differences. In developing a continuum of sonic virtuality, the above
theory argues that real sound is impossible, and the polar extremes cannot be absolute real
and absolute virtual.
Garbarini and Adenzato (2004) argue that cognitive representation relies on virtual activation
of autonomic and somatic processes as opposed to a duplicate reality based in symbols. For
example the sound of hissing is likely to have an acute physiological impact upon an
individual with an anxiety towards snakes. Although establishing threat connotations (there is
a snake nearby, snakes are poisonous, snakes could bite me) to the object requires cognitive
signification, a history of ophidiophobia would support a conditioned subroutine, bypassing
lengthier cognitive processing to connect the object (hiss sound) directly to the autonomic
nervous system. An embodied theory would not accept pure behavioural conditioning
however and, instead, would suggest that the object would first stimulate virtual sensory data
(a snake's image, movement, etc.) that characterise the actual stimulus and generate a threat
interpretation. The entire process remains fundamentally cognitive, yet only a fraction of the
input data needs fully appraising as the simulated data is already directly linked to the human
autonomic nervous system (ANS – a primarily subconscious system, chiefly controlling
involuntary physical actions) through conditioning; supporting an efficiently responsive
process achieved via reduced cognitive load.
This suggests that the embodied cognitive processing of a sound can be significantly affected
by the presence (or absence) of cognitive shortcuts. Özcan and Van Egmond (2009) discuss
the way in which ambivalent sound can have dramatically different meanings depending
upon associated visual stimuli. Without such contextualisation support, the listener's
perception of the sound would be greatly dependent upon long-term memory (established
conventions, past experiences, etc.), current environment (temperature, space, etc.) and
physiology (including established neuro-pathways such as cognitive bypass routines).
Essentially, the listener is immersed within their own exclusive virtual reality, where the
perception is embodied and the received sound is as unique as the individual.
Recollection of memories to deduce and arrange future plans is also embodied in sensory
data. Existing research has argued that memory retrieval can cause a re-experiencing of the
sensory-motor systems activated in the original experience, the physiological changes
creating a partial re-enactment (Niedenthal, 2007). This notion strongly relates to the auditory
concept, phonomnesis (an imagined sound that can be unintentionally perceived as real
[Augoyard & Torgue, 2005]). In this scenario, the mind (in response to an initial stimulus)
generates a re-experiencing of a sound that can be classified as virtual.
Augoyard and Torgue provide an invaluable reference guide to various acoustic and
psychoacoustic events in their work, Sonic Experience (2005), within which various
phenomena are documented that relate (in varying ways) to both EC and virtuality. As an
individual experiences numerous and complex soundscapes throughout their lives, such
phenomena are potentially commonplace. Sound may bring forth a past memory by way of
anamnesis; it may force their attention upon a specified place through hyper-localisation, or
even provoke a sensation that the space within which the listener is positioned is shrinking
(narrowing [Augoyard & Torgue, 2005]). As a listener, we have the perceptual capacity to
focus our attention upon an individual speaker within a room of thousands by disregarding all
irrelevant sound information (known as the cocktail party effect). Psychoacoustic entities can
manipulate listener behaviour and dictate future action via incursion (alarm, phone ringing,
etc.), dictate the listener's level of vigilance (the Lombard effect) and even incite a euphoric
state (phonotonie). With regard to virtuality, three perceptual phenomena of particular
interest are phonomnesis, remanence (a perceptual continuation of a sound that is no longer
being propagated) and the Tartini effect (a sound that is physiologically audible but that has
no physical existence); here a combination of tones will provoke the sensation of an
additional frequency that is not physically present (an occurrence that has been implemented
in military and crowd-control applications [Augoyard & Torgue, 2005]). Such auditory
phenomena arguably relate heavily to EC theory; if the listener were capable of detached and
objective sound perception, then such sonic illusions and auditory holograms would not exist.
RECONCILING ACOUSTIC ECOLOGY WITH SONIC VIRTUALITY
Originating in the 1960s within R. Murray Schafer's seminal works, acoustic ecology refers
to 'how organisms interpret and are affected by natural and artificial sounds' (Esbjörn-Hargens & Zimmerman, 2009: p. 491). This ecology incorporates the auditioning organism
as an integral component of the soundscape, in which separate individuals may receive
widely different sound information from the same acoustic space. This clarifies the
connection between AE and EC, in that the embodied nature of listening creates a
personalised virtual auditory experience; because the individual is part of the foundation of
ecology theory, AE is, therefore, virtual. Within this chapter, general conceptual notions
relating to virtuality have been explored and consequently, this thesis asserts that there exist
several ways in which EC theory may determine all sound (indeed all existence) to be
inherently virtual. Progressing from these ideas, this section details human hearing
(incorporating modes of listening, psychoacoustic theory and contextual conventions),
alongside virtual AE models.
The concept of a virtual sonic ecosystem within a first-person shooter (FPS) resonates with
the notions of EC and AE in that it embraces an amalgamation approach, as outlined by
Grimshaw and Schott (2008); classifying listener, soundscape and environment as
inseparable components in an integrated system. Their virtual acoustic ecology (VAE) asserts
that game sound exists as part of an intricate relationship between the player, the sound
engine (virtual soundscape) and the resonating space (real acoustic environment). First-person perspective game sound utilises a substantial number of sounds with the explicit
purpose of representing a virtual environment that the player will find immersive, irrespective
of the fact that they are not physically situated within that world. Such sound is intrinsically
associated with virtual actions and entities, but the sound waves physically exist purely
within the domain of reality; each sound resonating within actual spaces and interacting with
real surfaces before reaching the listener's ear.
The foundational belief underpinning this construct is that our biology, the enveloping
sensory input of the present and the long-term memory data representing our history are all
crucial factors in thought processing and behaviour determination; essentially an intricate
matrix of causality. Emotion directs attention towards specific current stimuli and filters out
sensory data that possess a low emotional relevance. Long-term memory (LTM) retrieval and
the transfer of memories between short-term memory and LTM can depend upon the
emotional value of the content (Friestad & Thorson, 1986) and, therefore, could manipulate
how an individual's history impacts upon their current cognition and behaviour. During
audition, the intensity of the emotional experience and the conscious desire of the listener
will influence the listening modes; determining the nature of the sonified data that will be
extracted from the sound. That information may in turn determine the reappraisal and
characterisation of the sound, subsequent auditory data perception, attention focus or
emotional state changes, formulating a circulatory loop.
Figure 1: The Sonic Virtuality Framework of Variables
The above diagram consolidates the associated theoretical concepts of VAE, EC and
virtuality to establish a framework of variables relative to sonic virtuality (Figure 1). A sound
created from component waveforms by way of synthesis belongs to a more virtual sound class
than a sound with a natural origin. Although the framework also classifies recorded
and artificially propagated sound as virtual, unless the sound is synthesised, the ultimate
origin is arguably more real. Propagation differentiates between natural resonance and
electronic amplification, classifying the latter as more virtual. Several variables connected to
semantic association are presented; the rationale stating that a sound with several, relevant
semantic attachments to entities perceived as genuine will support the reality of the sound
(for example, a voice with semantic attachments to an NPC will resonate with more truth than
a transcendent voice of god).
SOUND FUNCTIONALITY AND MODES OF LISTENING
The previous section outlined a set of (relatively) objective acoustic variables that determine
the virtuality of a sound. Here, we explore contextualisation and listening function, asserting
that such effects further enable the personal virtual realities documented earlier.
A sound without context can have little significance for the listener, and may potentially be
classified as virtual because the listener cannot attach the sound to an entity that exists within
their reality. The contextualisation of sound is a process central to AE, where assumption and
expectation originating from the individual's perception of their current environment
establish a context frame that supports associations made between the sound and
information regarding the source (Özcan & Van Egmond, 2009). Contextualisation has the
capacity to alter completely a listener's perception of sound information and, consequently,
research has identified several modes of listening that attempt to account for such effects.
Gaver (1993) states that non-specific, or everyday listening refers to hearing events within the
environment rather than the sounds themselves. Gaver elaborates, asserting that during
everyday listening, sound perception bypasses conscious semantic translation. The sound of a
car's engine accelerating is not consciously perceived as such, but simply as a car.
Cusack and Carlyon (2004) expand upon the above concepts in their exploration of the effects of
attention processes on sound perception. They describe a 'hierarchical decomposition of the
soundscape' wherein attention is focussed on ever increasing levels of specificity. This
concept reflects the notion of reduced listening (Chion, 1994), which describes conscious
attention towards the sounds themselves. In their conceptual framework of listening modes,
Tuuri et al. (2007) indirectly support EC theory, stating that listeners 'do not perceive sounds
as abstract qualities, rather, we denote sound sources and events taking place in a particular
environment'. Their work identifies eight discrete modes of listening that can be positioned
along the construal level theory continuum. The act of separating an individual sound from a
composite (hi-hats from a drum loop for example) is that of clearly distinguishing the
received sound information from the actual soundscape. In this circumstance, our mode of
listening is generating a virtual representation of the real sound environment.
Tuuri et al. (2007) also argue that 'some sounds encourage the use of certain modes more
strongly than others'. In A Climate of Fear (Garner & Grimshaw, 2011) we argue that the
modes of listening are largely determined by the perceived intensity of the sound; as
determined by psychological distance, physiological reflex and immediate affective response.
Consider a fire alarm as an example. When listening to such a significantly intense sound, a
high-level listening mode (e.g. reduced listening – evaluating the frequency and temporal
difference between the component tones) is highly improbable and the listener is effectively
forced to respond by way of reflexive (break current behaviour, immediate new action) and
connotative (imminent danger, must evacuate!) listening. The sound itself was intentionally
designed to evoke such a response, supporting the acceptance of this theory within general
product design industries. A continuous, unchanging sound may fluctuate in terms of the
listening mode it encourages due to the complexities of the sound ecology. Returning to our
fire alarm, whilst the immediate listening mode demands a reflexive function, prolonged
exposure to the sound affords the listener time to appraise the sound with higher level
cognition. At this point a listener may begin to evaluate the causality (Where is that alarm
coming from?) or the functionality (Is that actually the fire alarm, or is it the burglar alarm?)
of the sound. Such changes in perceptual listening modes are primarily instinctive and,
although conscious control is possible, the notion of choice is essentially an illusion, as the
embodied nature of the listener's personal virtuality manipulates the outcomes of the choice.
Within a complex soundscape we are capable of attenuating sonic input to focus on
increasing levels of specificity. A city soundscape may be reduced to the compound sounds
of a motorbike, which can then be concentrated to the individual sound of tyres treading
asphalt. We may reflexively recoil from these sounds to avert the vehicle, or evaluate the
qualities of the sound to cross the street without visually confirming the location of the bike.
In such a circumstance, individuals placed within the same acoustic space may provide
dramatically different descriptions of the sonic environment even if they were required to
provide an objective account of their experience. Through hearing alone, we may even
mistake the vehicle for a taxi or van through misinterpretation of semantic associations. John
Greco (2010) argues that 'our causal explanations typically cite only one part of a broader
causal condition'. This may explain how such a sound misinterpretation may occur, as the
listener perceives an engine sound, establishes a causal syllogism (motorbikes have engines, I
hear an engine, therefore a motorbike is present) and makes a false assumption that is
accepted as reality unless conflicting information is provided. The well-established Gettier
Problem (Gettier, 1963) highlights issues with justified true belief theory, proposing various
scenarios (similar to that above) that argue an individual may (justifiably) accept a truth as
knowledge from a misinterpretation of information. Here truth is accepted as reality despite
the fact that the justification is virtual and it could further be asserted that (in many scenarios)
an individual is capable of accepting a fallacy as knowledge (and feeling justified in doing
so) due to the nature of the virtuality of the causal explanation.
CONCLUSIONS AND CHAPTER SUMMARY
This chapter has utilised concepts of EC theory to help explain why both reality and virtuality
remain deeply subjective and personal concepts. This, in turn, can be applied to our
understanding of listening to suggest that all sound may be subjective in terms of the way it is
appraised by the human mind, and whilst modern equipment may be capable of quantitative
analysis, its lack of an embodied existence (interconnecting emotions, memories and
physiology) enables its objectivity whilst we remain subjectively biased, unable to detach
from our embodied reality.
A review of relevant literature and logical deduction has provided a theoretical differentiation
between real and virtual computer video game sound which could precipitate a great deal of
future empirical research. At this stage it would be inappropriate to describe these variables
as concrete determiners of virtuality. Instead, the intention is to highlight the potential factors
that could have an impact upon perceived virtuality. It is also suggested that virtuality can
only be perceptual and is always sensitive to the nature of the individual's personal reality.
That such individual virtual existences are truth is a bold but compelling statement; the notion
that a higher level of being may have control over our lives in the same way that we control
our representative game avatars is one best left for another day.
In preparation for the experiment methodologies documented in chapter 6, the next chapter
hosts a detailed analysis of psychophysiological approaches to data acquisition with regards
to a range of general and specific, thesis-relevant applications. The concept of biometrics is
addressed in greater detail and electrodermal activity (EDA: skin conductance),
electroencephalography (EEG: brainwaves) and electromyography (EMG: muscle activity)
measures are scrutinised in terms of their advantages and limitations in comparison to
alternative psychophysiological methods. Alternative approaches to hardware configuration,
participant preparation and data processing are also comparatively analysed. These
discussions will then lead to three initial experiments, within which EMG and EDA biometric
data alongside qualitative debrief player responses reveal the effects of digital signal
processing (DSP) treatments upon objective and subjective measures of fear elicitation.
Chapter 5
Psychophysiology and Biometric
Feedback Systems in Computer
Video Gameplay
Garner, Tom A.
University of Aalborg
2012
INTRODUCTION
A review of psychophysiological methods and approaches is presented within this chapter;
commencing with a general overview of the field and concluding with a more detailed
discussion regarding the potential application of specific psychophysiological measurements
within the contextualisation of emotion recognition, biofeedback and computer video
gameplay. The intention is to provide the necessary theoretical background required to
expand upon the more qualitative audio analysis discussions documented in previous chapters
and to support the last of the three subsequent preliminary trials, obtaining quantitative
electrodermal and electromyographic data in response to audio cues within a horror-themed
first-person shooter (FPS) game level. This chapter also investigates the
electroencephalography (EEG) biometric measure with relevance to sounds and emotions.
This includes a comparative analysis of alternative approaches to EEG data collection, from
hardware configuration/montage setup, to participant preparation and signal processing
technique.
PSYCHOPHYSIOLOGY: DEFINITIONS AND APPLICATIONS
'The field of psychophysiology is concerned with the measurement of physiological responses
as they relate to behaviour' (John L. Andreassi, 2006: p.1)
In the broadest sense, psychophysiology refers to the study of the relationships that exist between
physiological and psychological processes. Despite being a relatively young research field,
which Cacioppo et al. (2007) describe as 'an old idea but a new science', psychophysiology
has branched into a wide range of applications and integrated with various other disciplines
including dermatology (Panconesi & Hautmann, 1996) and psychopathology (Fowles et al.,
1988). Modern psychophysiology was envisioned in response to the physiology/psychology
divide problem (that between the two they provide a comprehensive explanation of human
behaviour yet remain distinctly separate fields of study) that also led to the creation of
physiological psychology (see Wundt, 1904).
The distinction between physiological psychology and psychophysiology has itself been a
point of contention between researchers. Andreassi (2006) provides a detailed overview of
the dissimilarities, initially focussing upon differences in methodological approach despite a
shared goal in understanding the physiology of behaviour. Andreassi cites Stern (1964), who
suggests that psychophysiology studies psychological independent variables (IV) and
physiological dependent variables (DV) whilst in physiological psychology the opposite is
true (physiological IV and psychological DV). Andreassi notes that some researchers have
presented particular methodologies that are exceptions to the above rule, and states that this
has led to differentiations between the two fields in terms of subject matter (see Furedy,
1984). To summarise Andreassi (2006), psychophysiology is a primarily non-invasive
approach to physiological data collection in response to psychological manipulations,
obtained from living human beings. Drachen et al. (2010), citing Cacioppo et al. (2007),
present a similar description, defining psychophysiology as '[investigating] the relationships
between psychological manipulations and resulting physiological activity (measured in living
organisms to understand mental and bodily processes and their relation to each other)'.
For the purposes of the thesis, the term biometrics refers to a definition that deviates slightly
from traditional understanding. IBM (2012) presents biometrics as 'the science of identifying,
or verifying the identity of a person based on physiological or behavioural characteristics', a
definition that is relatively constant throughout academic literature and typically concerns
identity, security and product development (Ashbourn, 2000; Nanavati et al., 2002;
Woodward, 2003). Within this contextualisation, biometrics identifies affective states from
physiological input. The term is thereby used throughout this thesis to refer to
psychophysiological data collection methods and processes (such as electromyography, skin
conductance response, heart rate).
The collection and analysis of psychophysiological data to interpret an individual's emotional
state is a firmly established methodological approach. Our understanding of the exact role of
physiological changes within a framework of human behaviour and thought processing is,
however, not comprehensive and there remains uncertainty regarding the exact nature of the
emotional experience (as discussed in chapter 2). Russell (2003) asserts that physiology is a
component within a larger process, whereby cognitive appraisals of physiological state
changes determine emotion; a theory not dissimilar to the two-factor theory of emotion as
described by Schachter and Singer (1962) with its derivation dating back to the James-Lange
(physiology determines emotion) and Cannon-Bard (emotion and physiology arise simultaneously) debate.
It could be viewed that psychophysiological methods of experimental data collection
circumvent the origin of emotion debate, instead functioning on the assumption that
irrespective of chronological order, emotions and physiology are intrinsically linked to the
degree that a comprehensive account of physical (biometric) state changes can reveal
subjective emotional information. Psychophysiology is therefore not as persistently
concerned with chronologically ranking the chicken and the egg as it is intent on proving
their correlation to each other.
Biometric data collection addresses several problems experienced when evaluating emotions
via self-report, such as affect insensitivity and emotion regulation (Ohman & Soares, 1994).
Research has documented circumstances in which the agendas of the individual facilitate
regulation (suppression, enhancement, false presentation) of outward emotional expression,
providing severe reliability concerns if relying entirely upon visual analysis and self-report to
interpret emotional state (Jackson et al., 2000; Russell et al., 2003). Biometric data collection
has the potential to circumvent this problem via measurement of emotional responses
characteristically associated with the autonomic nervous system (ANS), which is significantly
less susceptible to conscious manipulation (Cacioppo et al., 1992). Recent research
concerning emotion suggests that comprehensive information from the participant and the
environment is required alongside biometric data before any accurate emotional
interpretations can be generated. A great deal of research exists whereby physiological
readings are cross-examined this way in the search for reliable associations and causal
patterns (Cuthbert et al., 2003; Ekman & Friesen, 1975).
Nacke and Mandryk (2010) provide a positive testimony for eye-tracking as a cost-effective
solution capable of giving further insight into human behaviour, particularly in a visual media
context such as computer video games (they assert that modern software can perform this
operation using only a standard webcam as acquisition hardware). Eye-tracking solutions
have appeared within a notable number of varied research studies that include: exploring how
users interact with web-mediated search engines (Granka, Joachims & Gay, 2004), studying
the effectiveness of animation-based learning materials (Boucheix & Lowe, 2010) and
examining abnormal perceptual strategies in patients with autistic spectrum disorder (Sasson
& Elison, 2012). For the purposes of research into affective states during gameplay, eye-tracking information would primarily support the contextualisation of data from additional
biometrics, while the ability to pinpoint where on-screen a player is focusing, at any given
moment, could allow associations between in-game events/entities and significant biometric
readings to be better informed and more accurate.
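As a hedged illustration of how gaze position might contextualise other biometric readings, the Python sketch below labels each gaze sample with the on-screen region it falls within and averages a co-recorded biometric value per region. The region names, coordinates, sample format and function names are all hypothetical; they are assumptions made for the example and do not derive from Nacke and Mandryk (2010) or from any particular eye-tracking package.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Axis-aligned screen rectangle occupied by a game entity (pixels).

    Hypothetical structure for illustration only.
    """
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def label_gaze(samples, regions):
    """Attach the name of the fixated region (or None) to each gaze sample.

    samples: iterable of (time_s, x, y, biometric_value) tuples (assumed format).
    """
    labelled = []
    for time_s, x, y, value in samples:
        # First region containing the gaze point, or None if the gaze is elsewhere.
        target = next((r.name for r in regions if r.contains(x, y)), None)
        labelled.append((time_s, target, value))
    return labelled


def mean_by_target(labelled):
    """Average the biometric value recorded while each target was fixated."""
    totals = {}
    for _, target, value in labelled:
        total, n = totals.get(target, (0.0, 0))
        totals[target] = (total + value, n + 1)
    return {target: total / n for target, (total, n) in totals.items()}


# Invented example data: two on-screen entities and four gaze samples.
regions = [Region("enemy", 100, 100, 200, 200), Region("door", 400, 50, 500, 300)]
samples = [
    (0.0, 150, 150, 4.0),
    (0.5, 160, 140, 6.0),
    (1.0, 450, 100, 2.0),
    (1.5, 700, 500, 1.0),
]
averages = mean_by_target(label_gaze(samples, regions))
```

In practice the regions would move with the game entities from frame to frame, so a real implementation would need region data synchronised to the same clock as the gaze and biometric streams.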
Electrocardiography (the standard instrument for measuring heart rate) and respiration
measurements are also acknowledged as worthy biometrics across many research studies.
However, they characteristically appear as supportive physiological measures alongside
additional biometrics (e.g. Cuthbert et al., 2003; Vrana, 1993; Weber et al., 2009).
Kivikangas et al. (2010) suggest that, although cardiac activity is widely used throughout
academic literature, the heart is associated with a variety of bodily processes and therefore
inferring a psychological state solely from changes in cardiac activity is likely to be
unreliable. As such, it would be appropriate to suggest that such biometrics perform a
supportive role in this context, strengthening the correlations between psychological events
and physiological responses.
ELECTRODERMAL ACTIVITY
Both within and outside of computer video games research, the term electrodermal activity
(EDA) can refer to a variety of physiological response measures and has associations with
several other closely related terms. Kivikangas et al. (2010) assert that EDA and galvanic
skin response (GSR) are too often confused and that a clear and agreed upon set of definitions
is necessary to avoid methodological error. In her review of electrodermal developments up
to the 1980s, Christie (1981) cites several research articles calling for GSR to be retired,
asserting that its use has led to unnecessary complications (Edelberg, 1972; Venables &
Christie, 1973). Christie (1981) does address this issue, placing EDA as the central,
overarching term under which all associated terminology is a sub-classification (figure 1).
Within this framework, electrodermal activity is divided into endosomatic (the invasive
application of electrodes under the skin to measure impulses directly from the sympathetic
neurons) and exosomatic (surface-placed electrodes). Further distinction is determined by
finer variations in the methodology, primarily the type of current under analysis and the
characteristic of the electrical signal being studied. Temporal characteristics of the
methodology provide the final distinction, separating long-term tonic measurements (EDL –
electrodermal levels) from short-term, phasic activity (EDR – electrodermal response).
Figure 1: Framework of EDA terminology (Christie, 1981)
The invasive nature of endosomatic acquisition is naturally unsuitable for application in most
psychophysiological studies and consequently is highly unlikely to appear within
methodologies relevant to this thesis. It has been suggested that the difference between the
direct current (DC) and alternating current (AC) variations of exosomatic electrodermal
activity measurement is limited (in terms of obtained data [Fowles et al., 1981] as the
receptor measurement is always compared to a baseline [uniphasic]). DC measurements are
nonetheless the most commonly employed method within relevant studies, and the term skin
conductance features regularly as the biometric descriptor (e.g. Cuthbert et al., 2003;
Kivikangas et al., 2010; Koelsch et al., 2008; Vrana, 1993). From the information presented
above it is rational to assert that, for the purposes of experimentation relevant to this thesis,
phasic skin conductance response (SCR) is the most appropriate EDA sub-type due to the
low invasiveness of application, relevance to short-term activity variation and commonality
alongside comparable research. Despite this framework being established many years ago,
GSR still appears in modern research although it most typically refers to skin conductance
(e.g. Mirza-Babaei, 2011).
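The tonic/phasic distinction described above can be illustrated with a short Python sketch. It is a minimal illustration rather than the processing pipeline used in this thesis: the running-median baseline, the 21-sample window, the 0.05 microsiemens threshold and the function names are all assumptions chosen for the example, and the trace is synthetic.

```python
from statistics import median


def split_tonic_phasic(conductance, window):
    """Split a skin conductance trace (microsiemens) into tonic and phasic parts.

    The tonic level (EDL) is approximated by a centred running median; the
    phasic response (EDR/SCR) is what remains once that level is subtracted.
    """
    half = window // 2
    tonic = []
    for i in range(len(conductance)):
        lo = max(0, i - half)
        hi = min(len(conductance), i + half + 1)
        tonic.append(median(conductance[lo:hi]))
    phasic = [c - t for c, t in zip(conductance, tonic)]
    return tonic, phasic


def count_responses(phasic, threshold=0.05):
    """Count upward crossings of the (assumed) amplitude threshold."""
    count = 0
    above = False
    for value in phasic:
        if value > threshold and not above:
            count += 1
        above = value > threshold
    return count


# Synthetic trace: flat 2.0 microsiemens baseline with two brief responses.
trace = [2.0] * 50
for start in (10, 30):
    for offset, rise in enumerate((0.1, 0.3, 0.2, 0.1)):
        trace[start + offset] += rise

tonic, phasic = split_tonic_phasic(trace, window=21)
n_responses = count_responses(phasic)
```

Here the running median stands in for the tonic level and the residual for the phasic response; the window length would need tuning to the sampling rate of the recording hardware, and more elaborate decomposition methods exist.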
Lang et al. (1993) relate the conductance level of the skin to sweat production from the
eccrine glands, which are resistors of electrical current. Sato et al. (1989) provide a detailed
account regarding the physiology of sweat secretion, positing that activation of the sympathetic nervous system (SNS)
increases sweat secretion in preparation for bodily action. Sato et al. continue, stating that the
chemical composition of sweat possesses a significantly lower electrical resistance than the
dermal layers which, during increased secretion, results in an increase of electrical
conductivity between neurons and the skin surface that enables a meaningful EDA signal to
be recorded.
EDA is associated with the peripheral nervous system (Nacke et al., 2009) and more
specifically, is connected to the sympathetic nervous system (SNS) which, as discussed in
chapter 2, associates EDA exclusively with autonomic and excitatory processes (Poh et al.,
2010). Recent physiological studies have presented evidence in support of this association
through electrically stimulating autonomic regions of the brain to reveal reliable EDA
increase responses (Lanteaume et al., 2007; Mangina & Beuzeron-Mangina, 1996). Brain
structures within the limbic system (specifically the hypothalamus and amygdala) have been
associated with EDA, suggesting a connection between EDA and emotion-related activity
(Yokota et al., 1962). Critchley et al. (2000) observed SCR levels in tandem with functional
magnetic resonance imaging (fMRI) data in an effort to correlate increased SCR to particular
brain regions. Their study revealed a strong association between SCR and preconscious
structures (that include the prefrontal cortex, anterior cingulate, parietal lobe, and cerebellum)
directly relating skin conductance to emotional and instinctive processes.
Research has testified to the autonomic-excitatory function of EDA via observations of
distinct correlations between increased levels of electrodermal activity and excitatory
behavioural disorders such as Attention Deficit Hyperactivity Disorder (Rosenthal & Allen,
1978). Stress has also been connected to increased EDA levels, the rationale being that stress
disrupts balance within the ANS by eliciting significant excitatory activity impacting upon
heart rate, respiration and sweat secretion. Research concerning psychopathology has
revealed a notable correlation between attenuated EDA in response to stress and pathologic
behaviour (Fung et al., 2005); a response that was also observed in patients who suffered
injury to particular brain structures (Critchley et al., 2000). Vaez-Mousavi et al. (2007) assert
that stimuli demanding immediate or significant attention also generate notable EDA
increases. The nature of EDA and its connections to an array of physiological and
behavioural processes can be broadly simplified to stating that electrodermal activity
characteristically measures human arousal (Gilroy et al., 2012; Hedman et al., 2009; Nacke &
Mandryk, 2010) and is currently the most commonly employed biometric when attempting to
quantify this psychological trait (Ravaja, 2004).
ELECTROMYOGRAPHY
Electrical impulses, generated during muscle contractions, are the foundation of
electromyography (EMG). Chemical induction and electrical stimulation generate a voltage
difference that causes muscles (specifically, striated muscle) to contract, and it is the same
voltage difference that is recorded during an EMG assessment (Gilroy et al., 2012). Gielen
(2010) provides a detailed account of the physiology surrounding electromyographic study,
describing the biological sub-structures within muscle tissue and elucidating the neural
pathways that connect the brain to individual muscles via the central and peripheral nervous
systems. Electrical impulses travel across this intricate array of pathways towards muscle
fibres and fibrils as action potentials (AP), stimulating the secretion of neurotransmitters and
causing depolarisation (opening a gate within cell membranes that exchanges the position of
sodium and potassium ions) which, in turn, generates new APs that travel across the length of
the muscle, causing further depolarisation and releasing calcium ions that facilitate the
activation of a complex interaction between muscle substructures that finally leads to
movement.
As with EDA, electromyographic data can be collected by way of internal (intramuscular)
and external (surface) methods that likewise have advantages and limitations. For the
purposes of psychophysiology, surface electromyography (sEMG) is most appropriate (as
opposed to primarily medical diagnostic applications that more often employ intramuscular
assessments) due to the non-invasive and painless application of the surface sensors that
facilitate freedom of movement (Gielen, 2010). Gielen also notes that sEMG supports more
accurate test reproducibility as sensor arrangements can be accurately marked. Such
advantages do come at the cost of weaker signal strength (primarily due to electrical
resistance from the skin) limiting the effectiveness of sEMG to muscles close to the skin
surface (Franssen, 1995).
EMG surface sensors are typically gold or silver plated for enhanced conductivity and, as
with EDA, often require a conductive gel to work effectively (Gielen, 2010). Archetypal
electrode montages are bipolar with two active electrodes (connected to a differential
amplifier) placed upon the most substantial area of muscle under analysis whilst a third
electrode (placed on a region with no discernible muscular activity such as the ear lobe or
occipital protuberance [back of the head]) acts as a reference (Franssen, 1995). Facial
electromyography (fEMG) characteristically employs a surface electrode design to detect
activation of various regions of facial musculature and it is commonly correlated with the
study of facial and emotional expression. Research concerning fEMG varies in the exact
muscles under observation; however, the corrugator supercilii (forehead frown muscle),
orbicularis oculi (closes the eyelids) and zygomaticus major (cheek smile muscle) are most
commonly utilised within psychophysiological research due to their association with emotional
valence (Lang et al., 1998; Larsen et al., 2003). To further support clarity within this chapter,
throughout the remainder of this discussion, fEMG will simply be referred to as EMG
because the following articles cited within this section either refer explicitly to fEMG or
discuss issues that apply as equally to fEMG as they do to any other form of
electromyography.
APPLICATION AND LIMITATIONS
Whilst the majority of functional advantages relevant to EDA also apply to skin conductance
response (SCR), the reverse is not always true. To support clarity, this section shall focus
upon advantages and limitations that, although they may not be explicitly referred to as SCR
in the literature cited, are nonetheless equally applicable to this particular subclass of
electrodermal activity.
Whilst EDA data collection techniques certainly have merit, they are also not above
methodological concerns. The primary benefits of utilising SCR as a biometric include low
running costs and easy application (Boucsein, 1992; Nacke & Mandryk, 2010), non-invasive
sensors that allow freedom of movement and a well-established link with the common target
of arousal measurement due to distinct and exclusive connectivity with the sympathetic
nervous system (Fowles, 1986; Lorber, 2004). Another distinct advantage to SCR is that
secreted sweat is not required to reach the surface of the skin for a discernible increase to be
observed, allowing researchers to identify minute changes that would certainly not be
noticeable from visual observation (Bolls, Lang & Potter, 2005; Mirza-Babaei, 2011).
In contrast to these advantages, traditional electrodermal sensor setups have been associated
with various erroneous factors such as motor activity (Roy et al., 2008) and emotion
suppression (Jackson et al., 2000; Wegner et al., 1990). Use of these techniques for the
purposes of measuring emotional response has been described as fraught with classification
difficulties and it is acknowledged that biometrics alone are not always reliable indicators of
discrete emotions (Cacioppo, 2007; Hazlett & Benedek, 2007). However, as discussed below,
such concerns are rapidly being addressed and resolved as the technology behind the method
advances.
Kivikangas et al. (2010) point to a relatively noise resistant signal as a significant advantage
of EDA applications and also support the notion of a distinct association between EDA and
arousal, particularly in comparison to the more ambiguous inferences associated with
electromyography and heart rate. By way of contextualising arousal response, recent
experimentation has revealed EDA to be an effective correlate of aggression and sociopathic
tendency. Cacioppo et al. (2007) do, however, balance the argument by suggesting a
‘many-to-one’ relationship in which a single physiological measurement can be attributed to a
number of psychological phenomena. From this argument it appears logical to assume that
although EDA certainly has the potential to accurately reflect particular psychological
structures, an informed and multi-faceted method is required to achieve such goals. One
particular success story is that of Gross, Fredrickson and Levenson (1994) who utilised EDA
biometrics to test two alternative theories of crying, revealing that when individuals cry they
consistently experience increased arousal (as determined by EDA). This knowledge was used
to disprove the theory that crying facilitated homeostasis and attenuation of negative affect in
favour of the notion that crying instead created an intensely aversive state that motivates the
individual to address the source of the sadness.
In their discussion into the limitations of EDA, Kivikangas et al. (2010) suggest that the
temporal resolution of 1.0 to 4.0 second delayed responses is slow. Whilst this is certainly
true in direct comparison against electromyographic signal data, there remains potential to
accurately map events to EDA signal provided the presentation of stimuli is not too
temporally narrow. Gilroy et al. (2012) support this notion, asserting that EDA ‘has a quick
response (onset of 1.0 to 3.0 seconds) with a long decay period’. Whilst the subjective
descriptors of quick and slow arguably depend upon the specifics of the methodology, it
would be unfair to suggest that EDA lacks the temporal resolution for valid application in all
event-related studies.
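The point about stimulus spacing can be illustrated with a minimal Python sketch. It assumes that stimulus timestamps and detected SCR onset times (in seconds) are already available, and it takes the 1.0 to 4.0 second latency window quoted above as its default. The function name, its parameters and the example timings are hypothetical and are not drawn from any of the cited studies.

```python
from typing import Dict, List, Tuple


def match_scr_to_events(
    event_times: List[float],
    scr_onsets: List[float],
    min_latency: float = 1.0,
    max_latency: float = 4.0,
) -> Tuple[Dict[float, List[float]], List[float]]:
    """Attribute SCR onsets to the stimulus events that may have caused them.

    An onset is attributed to an event when it occurs between min_latency
    and max_latency seconds after that event. The second return value lists
    onsets that fall inside the window of more than one event (ambiguous).
    """
    matches: Dict[float, List[float]] = {t: [] for t in event_times}
    ambiguous: List[float] = []
    for onset in scr_onsets:
        candidates = [
            t for t in event_times
            if min_latency <= onset - t <= max_latency
        ]
        for t in candidates:
            matches[t].append(onset)
        if len(candidates) > 1:
            ambiguous.append(onset)
    return matches, ambiguous


# Hypothetical timings: events 10 s apart map cleanly to one response each.
spaced, spaced_ambiguous = match_scr_to_events([0.0, 10.0], [2.1, 12.5])

# Events only 1.5 s apart: one onset at 3.0 s could belong to either event.
crowded, crowded_ambiguous = match_scr_to_events([0.0, 1.5], [3.0])
```

With widely spaced events each onset has a single candidate cause, whereas closely packed events leave the attribution ambiguous, which is the practical meaning of a presentation that is too temporally narrow.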
Lorber (2004) refers to the behavioural activation and inhibition systems (see Gray, 1987) and
asserts that EDA reflects motivational action systems, suggesting therefore that EDA may
have application as a measure of motivation or desire. EDA has also been successfully
employed as part of biofeedback systems with medical application and has been revealed to
be an effective form of supportive treatment for children within the autistic spectrum
(McLeod & Luccy, 2009) and individuals displaying antisocial or psychopathic tendencies
(Lorber, 2004). In a similar vein, Nagai et al. (2004) utilised an EDA-based biofeedback
system to support patients with drug-resistant epilepsy, revealing significant seizure reduction
between the test and control groups.
Whilst traditional EDA hardware setups are non-invasive and the application procedure is not
painful, issues with hypersensitive reactions to electrode gels and movement-restrictive wires
that limit researchers to short-term acquisition procedures are significant concerns
(Hedman et al., 2009). Poh et al. (2010) exhibit an intriguing solution to such long-term
discomfort and motion artefact issues commonly associated with standard finger-based EDA
sensor setups. They present a design in which the EDA sensors are integrated into a bespoke
wrist band to obtain measures from the ventral side of the distal forearms. This application is
certainly advantageous within a CVG context as the hands are not encumbered with
hardware, enabling full freedom of movement, significantly reduced artefact (as the hands are
the primary source of physical movement during play), a more natural and immersive
gameplay experience, and also a reduced risk of white coat hypertension and greater
ecological validity of obtained data.
Wireless connectivity protocols that include infrared and Bluetooth have been integrated into
a wide selection of biometric hardware devices (see Biomedical.com; Biof.com;
Psychlab.com) and affordable, commercial-grade SCR wireless devices (for a comprehensive
description of such a system, see Strauss, 2005) include the Q Sensor (Affectiva.com, 2012)
and the GSR Shimmer (Shimmer-research.com). Hedman et al. (2009) present the iCalm
sensor, a similar EDA-based wristband alongside accompanying software that was designed
to enable users from the general public to acquire, analyse and interpret their own
physiological data without developer or researcher support. Systems such as these provide
real portability and high usability, enabling users to provide data from a natural environment,
separated from computers and wires to shift the user’s focus away from the mind-set of being
tested towards the activity being undertaken.
EMG provides very high temporal resolution (accurate to the millisecond), removes bias
present in visual observation, supports automation and is capable of detecting minute
muscular action potentials (Bolls, Lang & Potter, 2001). EMG analysis of particular facial
muscles has been described as ‘the primary psychophysiological index of hedonic valence’
(Ravaja & Kivikangas, 2008). Positive emotional valence is typically characterised by a high
level of zygomaticus major activity and correspondingly low level of corrugator activation
whilst negative valence yields the opposite (Harmon-Jones & Allen, 2001; Kallinen &
Ravaja, 2007). Countering some of these benefits, EMG measurements are relatively
sensitive to noise originating from both muscular cross-talk (action potentials of
neighbouring muscles confound the signal stream) and technical issues that include
inaccurate sensor placement, loose connections and interference from electrical appliances in
close proximity (Kivikangas et al., 2010). Kivikangas et al. cite the FUGA (fun in gaming)
project that assessed the validity and reliability of several biometrics (including EMG, EDA,
functional magnetic resonance imaging, electroencephalography and eye-tracking) and
concluded that although all approaches came with methodological pitfalls, such
issues could be overcome with careful and informed planning and execution, alongside
logical interpretations of obtained data. Research within this project further supported the
value of such biometric systems, presenting a prototype system capable of accurately
identifying every time an individual was playing a computer video game within a three week
period from EMG and EDA data.
BIOMETRICS IN CONTEXT: EMOTION
‘Psychophysiological measures […] can serve as “windows” on the mind and as “windows”
on the brain.’ (Coles, 1989)
This section progresses from a more general discussion regarding EDA and EMG biometrics
to focus upon specific contexts relevant to the thesis. Commencing with emotion and
affective frameworks, this section then addresses research concerning psychophysiology
within the contexts of sound and, finally, computer video games. As discussed in chapter 2,
affect is positioned as the master-term, under which emotion and mood refer respectively to
short-term, event-related and long-term, situation-related responses.
Arguably, one of the principal psychophysiological associations is that between physiology
and emotion. Research within this field is concerned with the quantification of human
experience; searching for a means to better understand how and why we feel. Brave and Nass
(2002) posit that ‘emotions can be expressed via several channels, such as voice, speech,
facial responses and physiological responses’. Nacke et al. (2009) assert that both EDA and
HR are proven successful measures of arousal and emotion but note that this is primarily true
within controlled laboratory environments. Ravaja (2004) also testifies as to the inherent
value of biometrics in emotion recognition systems, but states that they lack considerable
reliability if used independently. Critchley et al. (2002) support the use of EDA as an
indicator of emotion, stating that EDA ‘provides a sensitive and convenient measure of
assessing alterations in sympathetic arousal associated with emotion, cognition and attention’.
Critchley et al. further state that electrodermal activity can be indicative of many discrete
emotions that include pain, excitement, anxiety and apprehension, provided the methodology
is properly designed. An ‘appropriate’ methodology characteristically refers to simultaneous
use of several biometrics alongside subjective data collection and cross-examination of all
elements before inferring psychophysiological patterns (Drachen et al., 2010).
Biometrics within usability and user experience (UX) studies reveals a powerful practical
application of study into the physiology of emotions, an advantage that has been exploited for
UX testing in computer video games (Ravaja et al., 2006). Mirza-Babaei et al. (2011) provide
a detailed discussion of UX methods and their limitations, asserting that whilst observation
and subjective user-feedback can solve the majority of usability issues, biometrics are
becoming an invaluable tool for understanding issues related to users’ feelings, further stating
that ‘in certain categories of issue, [biometrics] reveal up to 63% more issues than
observation alone’. Drachen et al. (2010) state that the rich level of detail and objectivity of
biometrics within UX studies provides researchers with the information required to correct
for bias in traditional subjective methods and provides event-related data that, during the
course of an experiment, may be forgotten by the participant at the time of debrief.
Psychophysiology and UX/usability has been taken a step further, with recent technologies
integrating biometrics into their hardware to increase efficiency of testing and increase
product development speeds (Gualeni, 2012).
Mailhot et al. (2008) tested various musical excerpts that had already been subjectively
classified along the valence (positive-negative) spectrum; recording EDA, EMG and eyeblink data in response to these excerpts. The results revealed that music (pre-described
subjectively as unpleasant) increased corrugator activity whilst greater EDA was recorded in
response to pleasant musical excerpts. Ravaja (2002) compared viewer response to static and
dynamic facial images; data collected revealed a strong correlation between positive
self-report and zygomaticus major activation. Dimberg (1986) documented physiological patterns
identified when comparing user response to fear-inducing and neutral stimuli. The results
detail increased zygomaticus major activation in response to neutral stimuli, compared to
increased corrugator activity and EDA in reaction to fearful stimuli.
Such research asserts that activation of particular muscle groups corresponds reliably to
valence-charged stimuli, suggesting that EMG data has the potential to identify emotional
valence across a broad range of values and can therefore be used to compare not only positive
to negative, but also to explore the many varying degrees of these poles (Sato et al., 2008).
Recent work by Van den Broek (2006) exposed participants to stimuli pre-classified into four
emotional categories and revealed significant difference in the skewness of data distributions
(obtained from frontalis, corrugator supercilii, and zygomaticus major EMG) between each of
the emotion groups, suggesting that EMG signals that initially appear comparable can,
potentially, be clearly distinguished via descriptive statistics.
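The role of skewness as a distinguishing descriptive statistic can be shown with a short Python sketch. It computes the moment-based (Fisher-Pearson) skewness coefficient for two sets of amplitude values that share the same mean. The function name and the sample values are hypothetical and chosen purely for illustration; they are not data from Van den Broek (2006), and the exact estimator used in that study is not stated here.

```python
from typing import Sequence


def sample_skewness(values: Sequence[float]) -> float:
    """Return the moment-based skewness g1 = m3 / m2**1.5 of the values.

    m2 and m3 are the second and third central moments. A value of zero
    indicates a symmetric distribution; a positive value indicates a longer
    tail towards high amplitudes.
    """
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant signal: no asymmetry to report
    return m3 / m2 ** 1.5


# Two hypothetical rectified EMG amplitude samples with an identical mean (3.0).
steady_activity = [1.0, 2.0, 3.0, 4.0, 5.0]   # symmetric around the mean
bursty_activity = [1.0, 1.0, 1.0, 2.0, 10.0]  # mostly low, one large burst

steady_skew = sample_skewness(steady_activity)
bursty_skew = sample_skewness(bursty_activity)
```

Although both samples would appear identical if only mean amplitude were compared, the second yields a clearly positive skewness, which is the kind of distributional difference the paragraph above describes.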
Existing research has documented increased EDA during exposure to a fear-inducing
stimulus (Bradley et al., 2008; Dimberg, 1986; Meehan et al., 2002) and other negative
emotional states such as disgust (Jackson et al., 2000); however, positive emotional states
have also caused significant increases in skin conductance (Mailhot et al., 2008). Whilst such
studies do not postulate that such associations indicate EDA as a viable sole indicator of fear or
disgust, this information does suggest that measuring arousal via EDA with appropriate
contextualisation could support artificial fear recognition. Researchers attempting to
effectively utilise biometrics for data recognition are typically required to utilise some form
of the valence-arousal model (VAM – a dimensional model referred to in chapter 2). The
corrugator supercilii and zygomaticus major electromyographic approach to the VAM is
founded on the principle that the former muscle is activated during negative (frown) affect
and the latter in response to positive (smile) emotional experience, a model that takes
influence from the facial action coding system (FACS) established by Ekman and Friesen
(1978). Activation of the zygomatic muscle has, however, also been associated with negative
emotional states, specifically disgust (Mailhot et al., 2008). Research has revealed substantially
low correlations between positive occurrences and smile activation during certain sporting
events (Ruiz-Belda et al., 2002) and Russell et al. (2003) document several studies in which
smiling appears to be largely a form of conscious social communication rather than reactive
expression. A potential solution to this problem is documented by Larsen et al. (2003), stating
that the corrugator supercilii is capable of representing both positive and negative valence.
Activation of corrugator muscle tissue is potentiated by negative affect but also has been
revealed to be inhibited by positive affect (Lang et al., 1993). This information suggests that
corrugator activity may provide sensitive readings of valence whilst zygomatic activation
could suggest intense disgust, happiness or imply mixed emotions.
The notion of ambivalence has been addressed by Larsen et al. (2003) who suggest that
positive and negative affect should be treated ‘as separable components of the affect system,
rather than as opposite ends of a bipolar valence continuum’, suggesting that circumstances
may arise in which both positive and negative emotions could be received during an
electromyogram recording. Corrugator activity is described as ‘sparsely represented in the
motor cortex and […] less likely to be involved in such fine voluntary motor behaviours as
articulation and nuanced display rules’ (Ekman & Friesen, 1975) suggesting that corrugator
activation is unlikely to be affected by conscious suppression or false activation. As with
many psychophysiological measurements, corrugator activity has been described as
insufficient at identifying discrete emotional states. Within a computer video game context,
Hazlett (2003) associates frustration with corrugator activation, identifying increased readings
in novice users during failure to complete tasks and during interaction with software rated as
difficult to operate. Waterink and Van Boxtel (1994) observed a correlation between
corrugator muscle activity and exerted mental effort, whilst some researchers associate
sadness and disgust with corrugator stimulation and correlate fear with corrugator relaxation
(Lang et al., 1993; Yartz & Hawk, 2002). This information suggests some disagreement with
regard to the functionality of EMG (specifically corrugator activity) in emotion recognition.
What does appear more consistent, however, is the notion that corrugator activity is less
susceptible to suppression, false response and ambivalence.
Also referred to as the circumplex emotional model (Russell, 1980), the VAM plots
psychophysiological data onto a two-dimensional (X, Y) construct, positioning the affective
experience along a positive-negative (pleasant-unpleasant) and low-high activation
(calm-excited) continuum (Ravaja et al., 2004). Utilising both EDA and EMG assessment systems,
Bradley and Lang (2000) revealed a significant correlation between negative emotional affect
and increased electrodermal activity, startle reflex potentiation and corrugator
electromyographic (EMG) readings, supporting the association between these specific
biometric measures and hedonic valence. The valence-arousal model acts effectively as an
intermediate approach to emotion recognition, in that it does not have to assume discrete
emotions. Both the ratio-scale of the axes and the positioning of emotional states within the
2D space can be determined by the researcher and be based upon experiment-specific
contextualisation data.
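By way of illustration, the mapping of biometric readings onto the two-dimensional valence-arousal space might be sketched in Python as follows. This is a deliberately simplified sketch under stated assumptions: both EMG inputs are taken to be pre-normalised to the range 0 to 1, valence is approximated as zygomaticus activity minus corrugator activity, and arousal as the EDA deviation from a resting baseline scaled by a researcher-chosen range. All names, scalings and quadrant labels are hypothetical, and the cautions raised earlier regarding zygomaticus ambiguity still apply.

```python
from dataclasses import dataclass


@dataclass
class AffectPoint:
    valence: float  # -1.0 (unpleasant) to +1.0 (pleasant)
    arousal: float  # -1.0 (calm) to +1.0 (excited)


def clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Restrict a value to the closed interval [low, high]."""
    return max(low, min(high, value))


def to_affect_point(
    corrugator: float,    # normalised corrugator EMG, 0..1 (assumed)
    zygomaticus: float,   # normalised zygomaticus EMG, 0..1 (assumed)
    eda: float,           # current skin conductance reading
    eda_baseline: float,  # resting skin conductance for this participant
    eda_range: float,     # researcher-chosen scaling for the arousal axis
) -> AffectPoint:
    """Place one set of readings on the valence-arousal plane."""
    if eda_range <= 0:
        raise ValueError("eda_range must be positive")
    valence = clamp(zygomaticus - corrugator)
    arousal = clamp((eda - eda_baseline) / eda_range)
    return AffectPoint(valence=valence, arousal=arousal)


def quadrant(point: AffectPoint) -> str:
    """Label the quadrant using the pleasant/unpleasant and calm/excited poles."""
    horizontal = "pleasant" if point.valence >= 0 else "unpleasant"
    vertical = "excited" if point.arousal >= 0 else "calm"
    return f"{horizontal}-{vertical}"


# Hypothetical reading: strong frown activity with raised skin conductance.
example = to_affect_point(
    corrugator=0.8, zygomaticus=0.1, eda=6.0, eda_baseline=4.0, eda_range=4.0
)
example_label = quadrant(example)
```

Because the axis scaling and the placement of labelled states are parameters rather than fixed constants, the sketch reflects the flexibility noted above: the researcher decides how the space is calibrated for a given experiment.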
BIOMETRICS IN CONTEXT: SOUND
Psychophysiological research with a focus upon audio stimuli has explored the physiological
effects of speech, music and sound effects to varying degrees. Koelsch et al. (2008) revealed
that changes in musical expression could evoke variations in SCR, HR and event-related
potentials (measured via electroencephalography). As discussed in chapter 4, quantitative
psychophysiological measures have been utilised to assess psychological response to sound
in various academic texts (Bradley & Lang, 2000; Ekman & Kajastila, 2009). The arousal
and valence experimentation concerning visual stimuli has recently been extended to address
audio. Bradley and Lang (2000) collected electromyogram and electrodermal activity data in
response to various auditory stimuli. Experimentation revealed increased corrugator activity
and heart rate deceleration in response to unpleasant sounds, and increased EDA in reaction
to audio stimuli qualitatively classified as arousing. Jancke et al. (1996) identify muscle
activation in the auxiliaries of the forehead as producing significant, high resolution data in
response to audio. Electrodermal activity has been utilised to differentiate between motion
cues, revealing increased response to approach sounds (Bach et al., 2008) and event-related
potentials (collected via electroencephalography) reveal changes in brain-wave activity in
response to deviant sounds within a repeated standard pattern (Alho & Sinervo, 1997).
In an effort to rank the horribleness of various sounds, Cox (2008) provides data in support
of gender, age, geographical location and cultural biases when assessing affective qualities of
a sound. Experimentation data also suggested that ‘[i]f the source or event is identifiable, than
[sic] a respondent’s description of a sound is likely to be dominated by the source or event,
rather than the properties of the signal’. Cox (2008) suggests that it is the revelation of source
(regardless of whether it is correct or not) that strongly determines affective response;
however, Cox does not discount the intrinsic acoustic properties as potential inducers of
negative emotional states. Intense scraping sounds and particular combinations of tones
(creating dissonance) are described as intrinsically unpleasant across gender, age and sociocultural types.
The information documented above suggests that biometric data collection has the potential
to reveal emotion states with greater objectivity and temporal precision than that obtained by
way of qualitative methodological practice. Utilising EMG and EDA measurements, the
affective quality of a sound can be positioned within the two-dimensional space of valence
and arousal, with recent work even alluding towards the potential for discrete emotion
classification via physiology data. The use of qualitative information collection alongside the
quantitative physiological data is a recurring practice. Although the validity of self-report
within this research field has been questioned (Kappas & Pecchinenda, 1999), the history of
experiments where both methods are employed or such a practice is recommended (Case &
Wolfson, 2000; Cox, 2007; Grimshaw, 2008) provides reassurance that self-report data,
combined with biometric results, can be reliable and valid.
Nacke, Grimshaw and Lindley (2008) measured tonic levels of EMG and EDA during
gameplay, comparing the effects of both sound effects and music upon biometric output.
Their results showed no significant difference in tonic physiological data but concluded, in
line with many researchers in comparable studies, that analysis of tonic levels is an
inappropriate approach to data collection and therefore support the use of event-related
approaches.
BIOMETRICS IN CONTEXT: COMPUTER VIDEO GAMES
Research methodologies incorporating biometrics within the field of computer video games
are diverse, with studies addressing: the influence of gaming uncertainty on engagement
within a learning game (Howard-Jones & Demetriou, 2008), the impact of playing against
human-controlled adversaries in comparison to bots upon biometric response (Ravaja et al.,
2006) and developing biometric-based adaptive difficulty systems (Ambinder, 2011).
Research into the latter has revealed that although non-biometric approaches to adaptive
difficulty exist (e.g. Left 4 Dead [Valve, 2008]), those utilising affective physiology systems
enhance the gaming experience more significantly (Liu et al., 2009).
According to Gualeni et al. (2012), commercial interest in biometrics has extended to casual
and web-mediated gaming, including mobile applications. They describe the Biometric
Design for Casual Games (BD4CG) project, an extended research study exploiting biometric
technology to better assess player experience of casual and social gaming. Existing research
has also supported the merit of biometric data as both a quality control tool, allowing
developers an objective insight into the emotional valence and intensity that their game is
likely to evoke (Hazlett, 2006; Keeker et al., 2004), and also as part of an integrated gaming
system that connects the biometric data to the game engine, thereby creating a game world
that can be manipulated by the player’s emotional state (Sakurazawa et al., 2004). There is
increased agreement concerning what is meant by gameplay emotional states and the game
experience questionnaire, or GEQ (IJsselsteijn et al., 2008), is a well-established and
frequently employed qualitative debrief tool (Gerling et al., 2011). However, the
questionnaire itself is not an exhaustive framework of game experience and it is still argued
that there currently exists no commonly agreed upon theory of gameplay experience
(Kivikangas et al., 2010).
The advantages documented above are not without counterbalancing limitations, as
Kivikangas et al. (2010) state: ‘the various studies using psychophysiological measures do
not yet form a common field […] thus, we have number of separate results for any separate
research questions but very little accumulated knowledge’. In addition, the task of
understanding thought and feeling during computer video gameplay is a grandiose challenge
as games: produce at least two sensory modalities at any one time, require complex cognitive
appraisals with varied semantic understandings, require the player to formulate strategies at
varied time-scales (Klimmt, 2003) and motivations for gameplay can vary both between
individuals and sessions (Kallio et al., 2011). Kivikangas et al. (2010) note that a significant
proportion of understanding behind psychophysiology comes from more static and simplistic
testing stimuli that have themselves been rigorously tested using subjective reporting before
physiological factors are considered, such as the International Affective Picture System
(IAPS) and International Affective Digitized Sounds (IADS) databases. Therefore, immediate
research should be concerned with understanding the relationships between dynamic stimuli
and physiological effects (Bonanno & El-Nasr, 2012). Gow et al. (2010) testify to the
difficulty of inferring specific emotional experience during gameplay due to both the
complex nature of the game itself and the presence of erroneous variables outside the
gameworld. It must also be considered that the nature of a virtual reality that typifies several
game genres (particularly the FPS) may cause established psychophysiological correlations to
falter.
Picard (2010) notes that in a previously undertaken assessment of player affective state, the
most significant event occurred when the game controller malfunctioned, supporting the
notion of a clear distinction between emotions based in reality and those originating from
diegetic elements of the game. The former is expected to evoke a significantly more intense
emotional response when equated to the latter even if the nature of the event is comparable.
For example, witnessing a murder or losing money within a game may produce a muted
experience (of fear and frustration respectively) in contrast with a real-life equivalent
scenario. Toet et al. (2009) revealed that although darkness within reality is a reliable cause
of anxiety measured in HR and cortisol level, the same is not true in an FPS game. Studies
correlating gameplay emotions and biometrics have conceded that, upon statistical analysis of
obtained data, the strong correlational evidence is somewhat marred by significant variation
observed between individual participants (Mandryk et al., 2006). However, this is not to say
that entities that exist beyond the veil of what many may describe as ‘reality’ have no
capacity to make us feel. Joly (2012) testifies, referring to Wall-E (Stanton, 2008): ‘being
computer generated does not preclude the elicitation of affective experience’. Certainly, if the
computer-generated visual representation of a barely humanoid and almost speechless robot,
interacting with inanimate objects as the sole sentient inhabitant of a desolate wasteland can
elicit significant affective change, then there must be ample potential for a computer video
game to achieve comparable results.
Tom Garner
2012
University of Aalborg
Immersion and flow during gameplay are jeopardised if the player is required to
simultaneously provide an introspective account of their affective state (IJsselsteijn et al.,
2007), and post-game debriefing often overlooks details because participant recall is
unreliable. Electromyogram and skin conductance hardware can be quickly and easily applied to
the participant (and removed), providing a safe, non-invasive measurement that is unlikely to
draw user attention away from the test stimuli (Cacioppo et al., 1986; Hazlett & Benedek,
2007; Huang et al., 2004). Both EDA and EMG signals are highly sensitive to small
physiological changes and provide a significantly greater temporal resolution than self-report
(approximately 10ms, 100 data points per second), allowing player response to be accurately
examined along a timeline and connected to game events of matching chronology (Hazlett &
Benedek, 2007). Mirza-Babaei et al. (2011) describe EDA as almost ideal for
games testing, both during experimentation and post-test to support subjective debrief
responses, but concede that use of a game control pad is very likely to produce movement
artefact. The temporal characteristics of EDA (specifically SCR) have been sourced to
support pacing within an adaptive game system to avoid both prolonged periods of low
intensity and over-saturation of arousing stimuli (Gilroy et al., 2012). Increased EDA has
been presented as a potential indicator of specific in-game events, including: scoring a goal,
engaging in a fight and collecting points (Mandryk & Inkpen, 2004). Ravaja et al. (2006)
suggest that EMG and EDA are, together, reliable indicators of reaching a goal (as identified
by reduced corrugator activity, increased zygomatic activity and decreased EDA). In analysis
of an FPS title, Kivikangas and Ravaja (2010) observed a positive biometric response to
player-death but a negative response when the player killed an opposing character (though
note that in a multiplayer environment, killing an opponent instead elicits a positive
response).
The temporal precision of EDA and EMG has facilitated experimental designs
that extract only biometric data that correspond to a pre-defined and repetitive event in the
search for patterns within the data (Kivikangas et al., 2010). Weber et al. (2009) analysed the
biometrics of 50 minutes of gameplay, utilising data-logging techniques to compare obtained
data within different macro-stages of gameplay (e.g. safety/danger, exploration/re-treading,
etc.), a systematic approach that has been praised as a progressive and fruitful methodology
(Kivikangas et al., 2010). Ravaja and Kivikangas (2008) analysed specific epochs of data (one
1-second epoch before the event and two 1-second epochs after event onset) as an approach to accurately
extracting relevant biometric data with which to correlate against event descriptors. Picard
(2010) asserts that biometric testing must focus upon ecological validity, ‘characterizing
patterns of data from individuals and clusters of similar individuals experiencing emotions in
real life’, a method that, thanks to the recent developments in biometric design and
affordability, is becoming an increasingly realistic option. The assessment of biometric
methodologies is arguably being approached from many angles, with recent research also
questioning the validity of commonly employed statistical analysis tools. Bagiella et al.
(2000) argue that use of multivariate and repeated measures analysis of variance (ANOVA)
techniques commonly leads to false rejection of null hypotheses (Type I error) and that
mixed-effect models should be considered as a viable analysis alternative.
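The event-locked epoch approach described above (one epoch before an event and two after its onset) can be illustrated with a minimal sketch. The function name `extract_epochs`, the 100 Hz sampling rate and the synthetic EDA trace are assumptions made for illustration only; they are not taken from the cited studies.

```python
def extract_epochs(signal, sample_rate_hz, event_time_s,
                   pre_s=1.0, post_epochs=2, epoch_s=1.0):
    """Return one pre-event epoch and `post_epochs` consecutive post-onset epochs."""
    n = int(round(epoch_s * sample_rate_hz))          # samples per post-onset epoch
    onset = int(round(event_time_s * sample_rate_hz))  # sample index of event onset
    pre_start = onset - int(round(pre_s * sample_rate_hz))
    if pre_start < 0 or onset + post_epochs * n > len(signal):
        raise ValueError("event too close to the edge of the recording")
    pre = signal[pre_start:onset]
    post = [signal[onset + i * n: onset + (i + 1) * n] for i in range(post_epochs)]
    return pre, post


def mean(values):
    return sum(values) / len(values)


# Hypothetical EDA trace sampled at 100 Hz: a flat baseline for 5 seconds,
# followed by a steady rise once a game event occurs at t = 5 s.
sample_rate_hz = 100
eda = [2.0] * 500 + [2.0 + 0.01 * i for i in range(300)]

pre, post = extract_epochs(eda, sample_rate_hz, event_time_s=5.0)
baseline = mean(pre)
# Change of each post-onset epoch relative to the pre-event baseline.
changes = [mean(epoch) - baseline for epoch in post]
```

Expressing each post-onset epoch as a change from the pre-event baseline is one common way of reducing the influence of individual differences in resting level, although other normalisation schemes are equally plausible.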
Psychophysiology and Biometric Feedback Systems in Computer Video Gameplay
Cheng and Cairns (2005) argue that facial EMG can effectively measure the intensity of a
gaming experience and Ravaja et al. (2004) correlate facial EMG and EDA state changes to
gameplay specific experiences; their research identifying both a positive correlation between
increased EDA readings and self-report of spatial presence within the game, and a significant
association between positively indexed EMG data and positive game events (reaching a
goal). Increasing EDA has also been associated with gaming uncertainty (Howard-Jones &
Demetriou, 2008), and Drachen et al. (2010) revealed a similar correlation between
self-reported gameplay experience and biometrics across three individual AAA (triple-A: a term
typically referring to big-budget games produced and marketed by large, established
companies) first-person shooter titles. Emotion research specified towards gameplay has
utilised similar biometric and subjective debrief methods to recognise game-relevant
emotional states, including: frustration, boredom, challenge and fun (Mandryk et al., 2006).
It has been asserted that biometric data collection is sufficient to facilitate improved
communication between human and computer (Van den Broek, 2006). Human-computer
interaction research in this vein has extended to CVG theory and physiological data has been
collected in various experimentation concerning games (Mandryk et al., 2006; Nacke &
Lindley, 2008). Hazlett (2006) measured EMG response to varied game events and
discovered significant correlation between specific game events identified as positive (using
self-report during a video review of gameplay) and zygomatic activity, and an analogous
relationship between negative events and corrugator activity. An interesting study conducted
by Ravaja et al. (2008) provided further evidence supporting the association between
increased EDA and gameplay intensity and also discovered a surprising relationship between
electromyographic valence data and a first-person shooter (FPS) experience. Standard FPS
protocol places the player in a hunter/hunted scenario, requiring the player to kill large
numbers of virtual adversaries whilst their avatar is itself under constant threat of being
killed. Ravaja et al. revealed that during FPS action, players experienced intensely negative
valence whilst shooting an adversary and, comparatively, experienced positive valence as a
result of being shot; concluding that the patterns of stress and relief experienced within an
FPS environment do not follow the expected success-failure formula, as dictated by other
game genres.
The notion of integrating biometrics into a product has also been applied to CVG
development with sensors built into the game controller in most cases (Sykes & Brown,
2003). Utilising biometrics in gameplay has also become manifest in adaptive game systems
with established research utilising biometrics as an additional modality within game design
(Nijholt, Bos & Reuderink, 2009). One notable example is Nacke and Mandryk (2010),
who amalgamated ECG, EMG and EDA into a side-scrolling 2D shooter, in which biometrics
would control a number of game parameters (including enemy size, enemy speed, weapons
statistics and player jump height). The essential premise is to employ biometrics as an
enhancement of gameplay rather than a core mechanic, an approach that has much promise if
such technology is to be integrated into gaming as an evolution rather than as a gimmick-based
and mechanically flawed innovation. Gilroy et al. (2012) present a similar approach that
exploits biometrics within a passive interaction system that connects user output to non-
player character moods/personalities, subtle narrative changes and pacing. Such changes are
not required to be predictable or reliable (in fact are likely to require the opposite) in contrast
to biometric systems that attempt to control direct and precise player-movement/interaction
parameters.
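As a hedged illustration of how a biometric reading might modulate a game parameter such as enemy speed, the sketch below normalises a raw reading against per-player calibration bounds, smooths it and maps it linearly onto a parameter range. The names `normalise` and `SmoothedParameter`, the calibration values and the smoothing factor are all hypothetical; the cited systems do not document their mappings at this level of detail.

```python
def normalise(value, low, high):
    """Scale a raw reading into [0, 1] using per-player calibration bounds."""
    if high <= low:
        raise ValueError("calibration range must be positive")
    return min(1.0, max(0.0, (value - low) / (high - low)))


class SmoothedParameter:
    """Map a normalised biometric level onto a game parameter range.

    Exponential smoothing is used so that momentary spikes (for example
    movement artefacts) do not cause abrupt changes in gameplay.
    """

    def __init__(self, param_min, param_max, smoothing=0.2):
        self.param_min = param_min
        self.param_max = param_max
        self.smoothing = smoothing  # 0 = never update, 1 = no smoothing
        self.level = None

    def update(self, normalised_reading):
        if self.level is None:
            self.level = normalised_reading
        else:
            self.level += self.smoothing * (normalised_reading - self.level)
        return self.param_min + self.level * (self.param_max - self.param_min)


# Hypothetical calibration: this player's skin conductance spans 2.0 to 8.0 microsiemens.
enemy_speed = SmoothedParameter(param_min=1.0, param_max=3.0, smoothing=0.5)
first = enemy_speed.update(normalise(5.0, low=2.0, high=8.0))
second = enemy_speed.update(normalise(8.0, low=2.0, high=8.0))
```

Whether higher arousal should raise or lower a given parameter is a design decision; a pacing-oriented system might invert the mapping to avoid over-saturation of arousing stimuli.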
Gualeni et al. (2012) support the integration of biometric technology, stating this will provide
opportunity to obtain data from significantly larger sample sizes than with laboratory testing,
greatly increasing the power of statistical inferences. Also providing a counter-argument,
Gualeni et al. state that (in addition to problems with invasive and distracting hardware) such
systems would undoubtedly require additional resources for developers to program into the
game and also demand additional cost collating and analysing the collected data. Biofeedback
gameplay systems have also proved beneficial as alternative medical treatments, for example
Jitaree et al. (2012), who document an EMG-biofeedback game that provides elderly patients
an automated exercise program in order to reduce the risk of injuries sustained from falling.
As with the other emotion-related biometric designs, research into computer video game
experience must have a solid and informed methodology with carefully crafted and distinct
research questions. Experimental design must be closely examined to address extraneous
variables, and significant preparation and care must be taken in the setup of the
sensors/hardware (Kivikangas et al., 2010).
The above sections within this chapter have discussed the applications, advantages and
limitations of biometric investigation, primarily electrodermal activity and electromyography.
Figure 2 below summarises the key points raised within this chapter. Also deliberated upon
was the effectiveness of these biometrics, specifically within the fields of audio, emotion
recognition and computer video games. The review of literature revealed that biometrics can
be used effectively to distinguish affective states in response to varied acoustic stimuli and
gameplay and that carefully designed methodologies have great potential to produce reliable
results.
A review of alternate methodologies advocates the valence-arousal model (VAM) as a
potentially appropriate foundation for emotion recognition within a gameplay scenario,
although comprehensive analysis comparing VAM statistics and qualitative user-response is
needed to enable this model to accurately predict discrete emotional states. Within this
framework, a comprehensive set of qualitative descriptors, ranging from individual and
compound events to overarching themes, moods and atmospheres, will flesh out the VAM and
ultimately increase the accuracy of a CVG emotion recognition system. The discussion presented
within this chapter advocates biometrics within an array of CVG applications that includes
product development, adaptive gameplay mechanics and additional control modalities.
Biometrics is also asserted to have significant future potential within mobile technologies,
such as tablet computers, handheld consoles and mobile communication devices. Although a
number of limitations to this technology are presented, most are counterbalanced with
promising current and future developments.
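The idea of attaching discrete emotion descriptors to the valence-arousal plane can be sketched as a nearest-point lookup. The coordinates in `EMOTION_POINTS` are invented purely for illustration and carry no empirical weight; a working system would need to derive them from the kind of comparative analysis called for above.

```python
import math

# Hypothetical (valence, arousal) coordinates, each in the range -1 to 1.
EMOTION_POINTS = {
    "fear": (-0.7, 0.7),
    "joy": (0.7, 0.6),
    "boredom": (-0.4, -0.7),
    "calm": (0.6, -0.6),
}


def nearest_emotion(valence, arousal, points=EMOTION_POINTS):
    """Return the emotion tag whose point lies closest to the given reading."""
    return min(
        points,
        key=lambda tag: math.hypot(valence - points[tag][0],
                                   arousal - points[tag][1]),
    )


# A negative-valence, high-arousal reading falls nearest to the 'fear' point.
label_a = nearest_emotion(-0.6, 0.8)
# A positive-valence, low-arousal reading falls nearest to the 'calm' point.
label_b = nearest_emotion(0.5, -0.5)
```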
Figure 2: Summary chart of EDA and EMG

Electrodermal Activity (EDA)
    Advantage                            Limitation
    Non-invasive                         Erroneous motor activity
    Low running costs                    Emotion suppression
    Easy application                     Weak classification
    Freedom of movement                  Temporal factors
    Established connection to arousal    Sensitivity to electrode gel
    Can identify minute changes          Awkward with games controllers
    Noise resistant
    Good onset temporal resolution
    Affordable wireless
    Real-time information

    Application: Stress reduction; ADHD/Autism; Socio/Psychopathology; Brain damage;
    Attention; Arousal; Aggression; Motivation/desire; Biofeedback systems; Usability / UX

Electromyography (EMG)
    Advantage                            Limitation
    Non-invasive                         Weaker signal strength
    Painless                             Limited to surface muscles
    Reproducible sensor montages         Emotion suppression
    Can identify minute changes          Weak classification
    Excellent temporal resolution        Sensitive to noise
    Sensors don’t require gel            Can restrict movement
    Established connection to valence    Expense of pro. systems
    Real-time information

    Application: Valence; Usability / UX; Product development; Biofeedback systems;
    Exercise training
ELECTROENCEPHALOGRAPHY
Electroencephalography (EEG) is the recording of electrical activity that is achieved by way
of electrodes, placed in specific arrangements across the scalp. The flows of ionic current
within the neurons of the brain cause voltage oscillations that can be observed with EEG
signal acquisition equipment. Although EEG study has been traditionally associated with
medical research (most notably the diagnosis of epilepsy), there is a growing body of
research that has great interest in exploiting EEG technology to better understand and,
ultimately, to communicate directly with the human mind. This chapter documents the
potential of EEG in this latter regard and presents an argument for this particular method of
psychophysiological data acquisition as both a robust and reliable approach to computer
video game biofeedback systems. Commencing with an overview of the advantages and
limitations of EEG (within both general and game contexts), this chapter explores the
associations of audio to EEG data, documents the competing methods of signal acquisition
and filtration, highlights prominent feature extraction techniques and also emotion
classification systems. A related analysis of current consumer grade EEG hardware,
discussing the potential of such devices to create affordable and accessible biofeedback
systems is presented within the conclusions in chapter 9.
ADVANTAGES, LIMITATIONS AND APPLICATIONS OF EEG
When considering which method of biometric data collection is more appropriate it is
important to first have a clear understanding of the purpose for which this data is to be used.
Biometric researchers often favour EEG for the purpose of obtaining direct brainwave data
that, unlike other approaches (electromyography and respiration, for example), is less
susceptible to subject biases such as emotion suppression and false response (Murugappan et
al., 2009). Direct brainwave analysis bypasses conscious input and enables the researcher to
observe internal or covert processing (Mulholland, 1973). If a direct analysis of brain-centred
neural activity is necessary, EEG boasts advantages over neuroimaging alternatives that
include functional magnetic resonance imaging (fMRI), positron emission tomography (PET)
and magnetoencephalography (MEG). This is primarily in terms of the significantly lower
expense of EEG equipment (Vespa, Nenov & Nuwer, 1999), but also due to substantially
greater portability and ease of use (Hamalainen et al., 1993). Higher temporal resolution
(potentially several thousands of samples per second) is a substantial advantage (Fisher et al.,
1992) with particular value for computer video game play analysis, as responses can be
accurately mapped to game events/transitions even if they occur in quick succession. EEG
is comparatively less invasive for the participant and does not expose them to damaging
radiation or magnetic fields. Unlike fMRI and PET, EEG has no associations with
claustrophobia (Murphy & Brunberg, 1997). An advantage that is of particular interest to
auditory research is that EEG equipment emits minimal noise, and the wireless capabilities of
modern headsets facilitate the separation of the participant from the receiving computer,
enabling tests to occur in potentially silent environments.
EEG is, of course, not without disadvantages. The positioning of the electrodes (irrespective
of positioning system employed) is focussed mainly on the higher, cortical regions of the
brain and, as a result, is not ideal for measuring lower regions such as the medulla,
diencephalon, or pons. Perhaps the most significant issue with EEG is low spatial resolution
(Srinivasan, 1999), a characteristic that severely limits our potential to accurately determine
the location from which the electrical activity originates. Signal to noise ratio is an additional
problem (Schlögl, Slater & Pfurtscheller, 2002) due to the various sources of signal artefact
such as eyeblink, facial muscle contraction and electronic interference from proximate
devices. Whilst EEG and fMRI have been recorded simultaneously to combine the spatial and
temporal resolutions (DiFrancesco, Holland & Szaflarski, 2008), such an approach then
negates many of the benefits of EEG and is certainly inaccessible to the majority of computer
video game and audio researchers.
The limitations and obstacles documented above have arguably not dissuaded researchers
from using EEG for a range of purposes that include: robotic control by way of brain
computer interfacing (Ranky, 2010), determination of meditation and attention levels
(Crowley et al., 2010) and assessment of the mindsets of athletes (Stanley et al., 2004). EEG
study has also provided correlations between brain activity and task efficiency (Chouinard et
al., 2003), perceptual feature binding (Schadow et al., 2007), emotional valence (Crawford et
al., 1996), and discrete emotional states (Takahashi, 2004). Seigneur (2011) utilised EEG to
develop a brain-computer interface that had the potential to recognise discrete emotional
states (specifically happiness) and transmit that information great distances by way of a
web-based network. It was revealed that the system could be integrated into any web service, but
was intended for popular social networking sites in which users would automatically generate
positive feedback (Facebook likes) in response to particular EEG data patterns. It was
theorised that this system could ultimately facilitate an ‘economy model in which people pay
depending on the emotions they have experienced’. Another internet-based biometric setup
utilised EEG-based emotion recognition to control a commercial music site in which the
determined emotional state would regulate the particular style, genre or individual track being
played (Liu et al., 2010).
Campbell et al. (2010) employed a bespoke classification algorithm to discriminate P300
signals (event related potentials observable as positive deflections in voltage with latencies of
approximately 300ms) for the purposes of partially controlling a mobile phone (specifically
initiating the dialling process in response to the P300, which is itself initiated by visual
recognition of the desired recipient). Ismail et al. (2011) utilised EEG data in tandem with eye
tracking to automatically generate emotion-related tags (boredom, joy, etc.) in real-time as
the participant was reading a segment of text. Emotion-related classification algorithms have
also been employed within video game contexts. Gonzalez-Sanchez et al. (2011) utilised
EEG data (again, alongside eye tracking) to identify frustrating gameplay events within
Guitar Hero (Red Octane, 2005). Bespoke games have even been developed around an EEG
driven brain-computer interface (BCI) as a core gameplay mechanic (Qiang et al., 2010).
Chanel (2009) developed an EEG biometric measuring system that was integrated into a
classic puzzle game (reminiscent of Tetris) to differentiate between emotional states
experienced at varying levels of difficulty, working towards a biofeedback system capable of
managing the adaptive difficulty of gameplay in real-time.
Improving technology is gradually solving many of the characteristic limitations of EEG. As
increasingly multi-channel systems are developed and sensor arrangement models are
continually refined, the spatial resolution of EEG is arguably increasing (Vaisanen et al., 2008).
Coyle et al. (2008) present a potential solution for reducing EEG artefacts within a
game-relevant biometric system. EEG data is synchronised with infra-red tracking of the head,
wrists, feet and torso to enable automatic identification and removal of artefacts caused by
muscle activity (a system that itself could be achieved with consumer grade equipment,
specifically the infra-red Microsoft Kinect game controller or the gyroscope mechanism
within the Emotiv headset). The account of recent EEG related experimentation supports the
assertion that although this particular approach to a biometric affect measuring system is not
without problems, there nonetheless remains a strong confidence amongst researchers in this
method. The following section will return to the difficulties and methodological barriers
associated with EEG, to explore the alternative strategies employed by researchers to
circumvent these issues and work towards the development of an accurate and reliable
interpretative system.
PERSPECTIVES ON EEG ACQUISITION AND FILTRATION
Continuing from the assertion that, despite some methodological concerns, EEG remains a
biometric medium with significant future potential and what could be described as a fan-base
of researchers across various disciplines, this section outlines the process of data acquisition
and initial processing to firstly reveal how many of the concerns associated with EEG can be
overcome, but also to support a new strategy of EEG acquisition/processing that could form
part of the gameplay-biofeedback system that has been theorised upon throughout this thesis.
As stated earlier, EEG recordings are obtained by way of electrodes placed across the scalp.
In most cases a conductive gel or paste is used and participants are required to prepare the
scalp with light abrasion, reducing the impedance and improving signal to noise ratio (SNR).
One of the most well-established approaches to electrode placement is the international 10-20
system (Jasper, 1958), an arrangement that has transcended EEG to be used in other forms of
biometric data acquisition (Herwig et al., 2003). This system typically incorporates 21
electrodes, of which some are placed precisely at identifiable anatomical locations and
consecutive electrodes are placed at fixed distances from these points in steps of 10% to 20%,
accounting for variations in cranial size (Herwig et al., 2003). Chatrian et al. (1985) were the
first to propose a higher-resolution 10-10 (also known as the 10%) system which increased
the number of electrodes to 74. The approach consequently gained recognition as the new
standard in EEG placement (Klem et al., 1999). Oostenveld and Praamstra (2000) supported
the continuing resolution evolution with a 10-5 system that provided a potential placement
structure for over 250 individual electrodes. Such dense electrode EEG caps are now
commercially available (Pflieger & Sands, 1996), suggesting that although increased spatial
resolution does dictate a greater monetary cost, the financial burden of a current flagship
system will decrease significantly upon release of a superseding design. With reference to
more relevant research, the international 10-10 system appears to be popular for the purposes
of emotion recognition systems (Murugappan et al., 2009; Murugappan et al., 2010; Rizon,
2010; Silva et al., 2002). In parallel with these developments, a differentiated research track
exists that concerns developments of a more recreational nature. Discussed in greater depth
later in this chapter, low-cost consumer grade headsets are evolving in the opposite direction;
utilising fewer electrodes to exploit the benefits of portability, ease of use and affordability.
Such devices feature prominently in recreation-focussed research projects (for example:
Crowley et al., 2010; Ismail et al., 2011; Ranky, 2010; Sourina & Liu, 2011).
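The proportional logic of the 10-20 arrangement can be shown with a short sketch for the midline electrodes only. It assumes the conventional fractions of the nasion-to-inion distance (10%, 30%, 50%, 70% and 90%); the function name `midline_positions` and the 36 cm head measurement are hypothetical.

```python
# Conventional midline sites of the 10-20 arrangement, expressed as fractions
# of the distance measured over the scalp from nasion to inion.
MIDLINE_FRACTIONS = {"Fpz": 0.10, "Fz": 0.30, "Cz": 0.50, "Pz": 0.70, "Oz": 0.90}


def midline_positions(nasion_inion_cm):
    """Return the distance of each midline site from the nasion, in centimetres."""
    return {
        name: round(fraction * nasion_inion_cm, 2)
        for name, fraction in MIDLINE_FRACTIONS.items()
    }


# Hypothetical participant with a 36 cm nasion-to-inion measurement.
positions = midline_positions(36.0)
```

Because every site is defined as a proportion rather than an absolute distance, the same arrangement scales to participants with different cranial sizes.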
This variation suggests a divergence of opinion dependent upon the goals of the research,
with medical applications prioritising spatial resolution to enable more accurate diagnoses
whilst recreational purposes favour systems with greater potential for integration into
commercial entertainment systems. Despite such divergences in hardware design, several
other characteristics of EEG acquisition remain relatively constant. Pre-test preparation of the
participant by way of skin abrasion is recommended in academic texts (Teplan, 2002) as a
way of improving signal quality. However, this is not always consistent and it is argued that
particular EEG setups (such as those incorporating active or spiked electrodes) can produce a
usable signal without skin preparation (Griss et al., 2001; ). External signal noise generated
by electric devices is typically attenuated via a simple band-pass filter (Chanel et al., 2005).
Other common causes of signal noise include ocular, muscular and vascular interference, which
commonly occurs below 4Hz and above 30Hz (Murugappan et al., 2010). Removing these
frequencies completely may severely impact upon the Delta and Gamma frequency bands, and
therefore a more complex approach is needed. The most common solutions to this problem are
preparatory steps (e.g. encouraging minimal movement from participants) and simultaneous
acquisition of biometrics from interference generators, such as electromyography to identify
ocular or muscular activity and electrocardiography to identify vascular activity (Murugappan
et al., 2010). Spatial filtering (a mathematical process that focuses upon the central
distribution of an EEG waveform to attenuate noise) is employed regularly within relevant
research (Rizon, 2010). Surface Laplacian (SL) filtering, first presented by Hjorth (1975),
computes the averages of a set number of nearest neighbours (in relation to the EEG signal)
and removes the resultant values from the signal. Such a process is relatively simple to
execute and consequently is a favoured form of spatial filtering in emotion recognition
systems (e.g. Levis, 1995; Takahashi, 2004). From this we can infer that research tracks
relevant to this thesis prioritise simplicity and efficiency in their data processing systems,
most likely to minimise computational processing and system hardware demands. Much as
the acquisition hardware is required to integrate successfully into a variety of commercial
applications, so too is the software that interfaces between them.
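The nearest-neighbour form of surface Laplacian filtering described above can be sketched in a few lines. This is a simplified, unweighted version written for illustration; the function name `hjorth_laplacian`, the choice of neighbours and the sample values are assumptions rather than details drawn from the cited studies.

```python
def hjorth_laplacian(samples, neighbours):
    """Subtract the mean of each channel's neighbours from that channel.

    samples:    dict mapping channel name -> list of sample values
    neighbours: dict mapping channel name -> list of neighbouring channel names
    """
    filtered = {}
    for channel, names in neighbours.items():
        filtered[channel] = [
            samples[channel][t] - sum(samples[n][t] for n in names) / len(names)
            for t in range(len(samples[channel]))
        ]
    return filtered


# Hypothetical two-sample recording: a shared noise component (1, then 2)
# appears on every channel, while Cz also carries a local signal of 5.
samples = {
    "Cz": [6.0, 7.0],
    "Fz": [1.0, 2.0],
    "Pz": [1.0, 2.0],
    "C3": [1.0, 2.0],
    "C4": [1.0, 2.0],
}
filtered = hjorth_laplacian(samples, {"Cz": ["Fz", "Pz", "C3", "C4"]})
```

In this toy case the component common to all electrodes cancels and only the activity local to Cz remains, which is the intended effect of the filter.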
EEG FEATURE EXTRACTION AND EMOTION CLASSIFICATION
In continuation from the preparatory and pre-processing phases of EEG recording practice, it
is then required that the raw signal be transformed into inferential data to support the purpose
of the acquisition. This section explores both well-established and experimental approaches
to such transformations, and concludes with a discussion regarding which of these
approaches could be most beneficial for an emotion-classification feedback system within a
computer video game context.
Before feature extraction and classification methodologies are discussed, EEG channel
representations, or montages, are briefly addressed. As with all aspects of EEG acquisition
and processing, the montage selection is another point of contention between researchers and
often determined by the specific purpose(s) of the system. Common montages include bipolar
(the difference between two adjacent electrodes), referential (the difference between an
electrode and a designated reference), laplacian (the difference between an electrode and the
weighted mean of its surrounding electrodes) and average reference (collective outputs of all
electrodes are summed and averaged). Electrode arrangements (such as the international 10-20
system) have also been referred to in research as montages, which complicates our
understanding. Because a system such as the 10-20 can arguably be employed with several of
the montages detailed above, this chapter refers to the electrode placement systems as
arrangements and the interrelations between electrodes and signal amplifiers as montages.
Whilst a specific justification for the choice is rarely given, emotion recognition
research appears to characteristically favour referential montage setups (Campbell et al.,
2010; Chanel et al., 2005; Ekanayaki, 2010; Ismail et al., 2011) although examples of bipolar
montage (Bos, 2006) and average reference (Achaibou et al., 2007) are still present and often
dictated by the hardware employed. It should also be mentioned that the two commercial
headsets under scrutiny later within this chapter both employ the referential montage system.
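To make the distinction between montages concrete, the sketch below derives bipolar, referential and average-reference channels from the same raw samples. The function names, the three-electrode recording and the choice of Cz as reference are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def bipolar(channel_a, channel_b):
    """Difference between two adjacent electrodes, sample by sample."""
    return [a - b for a, b in zip(channel_a, channel_b)]


def referential(channel, reference):
    """Difference between an electrode and a designated reference electrode."""
    return [x - r for x, r in zip(channel, reference)]


def average_reference(channels):
    """Re-reference every electrode to the mean of all electrodes."""
    names = list(channels)
    n_samples = len(channels[names[0]])
    means = [
        sum(channels[name][t] for name in names) / len(names)
        for t in range(n_samples)
    ]
    return {
        name: [channels[name][t] - means[t] for t in range(n_samples)]
        for name in names
    }


# Hypothetical two-sample recording from three electrodes.
recording = {"F3": [4.0, 6.0], "F4": [2.0, 2.0], "Cz": [3.0, 1.0]}

f3_minus_f4 = bipolar(recording["F3"], recording["F4"])
f3_vs_cz = referential(recording["F3"], recording["Cz"])
avg_ref = average_reference(recording)
```

The bipolar and referential derivations are arithmetically identical; they differ only in whether the second electrode is an adjacent site or a fixed reference shared by all channels.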
Commencing with what is one of the most commonly employed analysis tools, the Wavelet
Transform primarily serves to separate raw EEG into component frequency (sinusoidal)
bands. An EEG signal consists of δ (Delta: <4Hz), θ (Theta: 4-8Hz), α (Alpha: 8-13Hz),
β (Beta: 13-30Hz), γ (Gamma: 30-100+Hz) and µ (Mu: 8-13Hz) bands. Berger (1929) argued
that frequency analysis was crucial to meaningful EEG appraisal and this notion remains
valid, as can be observed from the numerous research projects that incorporate frequency
analysis methods. Details on these bands of frequency and summaries of common
associations believed to exist between individual bands and specific activities/thought
processes have already been documented (Sanei & Chambers, 2007). Graps (1995) presents
an online introduction to wavelet analysis history and developments that concisely
differentiates between the original Fourier transform (continuous transformation of signal-time
domain to signal-frequency domain), discrete Fourier transform (a sampled equivalent
of the standard Fourier transform) and windowed Fourier transform (an approach to
frequency analysis of aperiodic signals). An arguably more relevant differentiation made
within this document is that between the fast Fourier transform (FFT) and the discrete
Wavelet transform (DWT). Graps (1995) identifies the key difference between FFT and DWT
to be that ‘individual wavelet functions are localized in space. Fourier sine and cosine
functions are not’. In condensed terms, FFT processes all elements of raw EEG data within a
fixed space whereas DWT provides a space that is flexible. The practical consequence of this
difference is that DWT resolutions are variable, enabling both detailed analysis processing
and signal discontinuity isolation. These advantages have resulted in DWT being advocated
for use in a range of research interests, from speech signal noise reduction (Fan et al., 2004;
Wieland, 2009) to characterisation of transient random processes in oceanic engineering
(Gurley & Kareem, 1997). Both FFT and DWT processes appear to recur regularly within
emotion-recognition EEG systems (Flórez et al., 2007; Levis, 1995; Murugappan et al., 2010;
Rizon, 2010; Takahashi et al., 2004) and a review of related literature does reveal a general
preference towards wavelet-based analysis.
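Fourier-based separation of a signal into the frequency bands listed above can be sketched as follows. For brevity the example uses a direct discrete Fourier transform from the standard library rather than an optimised FFT routine; the function name `band_powers`, the 128 Hz sampling rate and the synthetic test signal are assumptions for illustration.

```python
import cmath
import math

# Band edges in Hz, following the ranges given in the text (Mu omitted
# because it shares the 8-13Hz range with Alpha).
BANDS_HZ = {
    "delta": (0.0, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 100.0),
}


def band_powers(signal, sample_rate_hz):
    """Sum the scaled squared DFT magnitude falling within each band."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS_HZ}
    for k in range(1, n // 2 + 1):  # skip k = 0, the DC offset
        freq = k * sample_rate_hz / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        for name, (low, high) in BANDS_HZ.items():
            if low <= freq < high:
                powers[name] += abs(coeff) ** 2 / n ** 2
    return powers


# Synthetic one-second signal: a strong 10 Hz component plus a weaker 20 Hz one.
sample_rate_hz = 128
signal = [
    2.0 * math.sin(2 * math.pi * 10 * t / sample_rate_hz)
    + math.sin(2 * math.pi * 20 * t / sample_rate_hz)
    for t in range(sample_rate_hz)
]
powers = band_powers(signal, sample_rate_hz)
dominant = max(powers, key=powers.get)
```

Because the transform is computed over the whole window, the result describes which frequencies are present but not when they occur within the window, which is the limitation that wavelet analysis is intended to address.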
Probing deeper into utilisation of wavelet transform processing, sub-categories of DWT
(typically db4, db8, sym8 and coif5 algorithms) are revealed within emotion recognition EEG
systems (Levis, 1995; Murugappan et al., 2010; Murugappan et al., 2011; Rizon, 2010)
suggesting a lack of consensus regarding the relative advantages and disadvantages between
the wavelet functions and also implying that a strong methodology should include various
functions to provide a more comprehensive analysis. Whilst frequency analysis is clearly a
form of feature extraction, a much wider array of statistical tools has been utilised within
EEG research in the quest for more reliable and accurate approaches to EEG signal
interpretation. Murugappan et al. (2010) provide a concise overview of various feature
extraction methods that include: common spatial patterns, asymmetrical power of specific
frequency bands, and power of each frequency band alongside mean of raw signal.
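A multi-level discrete wavelet decomposition can be sketched with the Haar wavelet, the simplest member of the Daubechies family. Haar is used here only because it can be written without an external library; it is not one of the db4, db8, sym8 or coif5 functions reported in the cited studies, and the function names and sample values below are hypothetical.

```python
import math


def haar_step(values):
    """One decomposition level: pairwise scaled sums (approximation) and differences (detail)."""
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values) - 1, 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values) - 1, 2)]
    return approx, detail


def haar_dwt(signal, levels):
    """Return the final approximation and a list of detail coefficients per level."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2 ** levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] holds the highest-frequency content
    return approx, details


def energy(coefficients):
    return sum(c * c for c in coefficients)


# Hypothetical eight-sample signal decomposed over three levels.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(signal, levels=3)

# Energy per level is one simple feature that could feed a classifier.
level_energies = [energy(d) for d in details]
```

Each level halves the frequency range under analysis, so with a 128 Hz sampling rate the first detail level would correspond very roughly to 32-64Hz, the second to 16-32Hz and so on; the exact band boundaries depend on the wavelet function chosen.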
Established statistical techniques, which include mean, standard deviation, power, analysis of
variance (ANOVA) and multivariate analysis of variance (MANOVA), also make
appearances within relevant experimental research methodologies (e.g. Jones et al., 2001).
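As a hedged illustration of the feature extraction step, the sketch below computes the mean and standard deviation of a raw window alongside the power in two frequency bands. The band limits (alpha as 8 to 13 Hz, beta as 13 to 30 Hz), the 128 Hz sample rate and all function names are assumptions chosen for the example, and a naive discrete Fourier transform stands in for an optimised FFT.

```python
import cmath
import math
import statistics

def band_power(samples, sample_rate, low_hz, high_hz):
    """Sum the normalised DFT power of bins with low_hz <= frequency < high_hz."""
    n = len(samples)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if low_hz <= freq < high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(samples))
            power += abs(coeff) ** 2 / n ** 2
    return power

def extract_features(samples, sample_rate):
    """Build a small feature vector from one window of raw samples."""
    return {
        "mean": statistics.fmean(samples),
        "std": statistics.pstdev(samples),
        "alpha_power": band_power(samples, sample_rate, 8.0, 13.0),
        "beta_power": band_power(samples, sample_rate, 13.0, 30.0),
    }

# Illustrative input: one second of a pure 10 Hz sine wave (not real EEG data).
SAMPLE_RATE = 128
window = [math.sin(2 * math.pi * 10 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
features = extract_features(window, SAMPLE_RATE)
```

Because the synthetic signal sits entirely at 10 Hz, its power falls within the assumed alpha band and the beta band remains effectively empty.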
Classification systems provide the next logical step within EEG analysis processing, using
data generated by way of feature extraction to generate output data that are essentially an
artificial replication of our brain’s cognitive process (of course only within this highly
specific context and circumstance). Within the emotion-recognition system, classification
algorithms assess the processed EEG data and make a predetermined judgement based upon
parameters and thresholds designated by the researcher. As with many of the steps already
discussed, EEG classification algorithms vary significantly between research projects and
currently there is no distinct consensus regarding optimum approaches to classification
methodologies.
One of the most conceptually accessible classification systems is the arousal-valence model
(AVM) associated with frontal EEG asymmetry. AVM essentially identifies human emotions
by correlating an obtained EEG reading against a pre-established programmable statement
that connects discrete emotion tags (fear, joy, disgust, anger, etc.) to points along a two-dimensional plane. Cacioppo (2004) asserts that frontal asymmetry can be viewed to act as
either a moderator (‘differential activation is […] instrumental in the production of tonic
affective states’) or mediator (differential activation is ‘thought to dampen or augment the
process’) of physical response to emotion-evoking stimuli. Which perspective is adhered to
will define whether frontal asymmetry (or indeed any electrical impulse obtained via EEG) is
believed to be crucial to the response occurring at all, crucial to the scale or finer specifics of
the response, or merely a by-product of the response. Currently, research cannot ascertain a
causal connection between frontal asymmetry and response (specifically, affective state) and
evidence that is in support of this connection is largely correlational (Allen et al., 2001).
Irrespective of the exact nature of frontal asymmetry, the AVM has been described as ‘one of
the most promising and fertile in the field’ (Cacioppo, 2004) and has proved effective in
distinguishing various valence-related affective states, such as fear from joy in response to
musical excerpts (Schmidt & Trainor, 2001) and attraction from withdrawal (Coan & Allen,
2003). Jones et al. (2001) identified frontal asymmetry as a reliable indicator of depression
and similar studies have reliably utilised this classification system to identify anxiety
(Wiedemann et al., 1999). The AVM is effectively elucidated in an article by Liu et al.
(2010), who visualise an adapted AVM based upon Russell’s Circumplex Model (Russell,
1980). In summary of the process, the majority of relevant research asserts that heightened
left hemisphere activity is indicative of positive emotional states and conversely, right
hemisphere activity is indicative of negative emotional states. In addition, greater amplitude
of neural activity indicates intense emotional states (fear and joy, as opposed to sadness and
satisfaction). Whilst significant evidence supports this model, acceptance of this rule as an
absolute has been contested (e.g. Liu et al., 2010).
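The rule summarised above can be sketched as a simple decision function. The four emotion tags are those named in the text; the function name, the use of a single arousal threshold and the assumption that the inputs are already estimates of hemispheric activity (rather than raw band power) are hypothetical choices made for illustration, and the sketch inherits the caveat that the rule itself is contested.

```python
def classify_affect(left_activity, right_activity, arousal_threshold):
    """Map two frontal activity estimates onto one of four emotion tags.

    Valence: left-dominant activity is treated as positive, right-dominant
    as negative. Arousal: overall amplitude compared against a threshold
    that the researcher would need to calibrate (assumed here).
    """
    valence = left_activity - right_activity
    arousal = left_activity + right_activity
    high_arousal = arousal >= arousal_threshold
    if valence >= 0:
        return "joy" if high_arousal else "satisfaction"
    return "fear" if high_arousal else "sadness"

# Illustrative values only: strong, right-dominant activity.
label = classify_affect(left_activity=2.0, right_activity=5.0, arousal_threshold=6.0)
```

In this arrangement the horizontal axis of the plane is valence and the vertical axis is arousal, with each quadrant carrying one tag.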
A Support Vector Machine (SVM) is a non-probabilistic linear classifier that traditionally
exists as a binary model, able to distinguish between two discrete categories (though it
may be adapted for multiclass use through approaches such as the one-versus-all calculation
[Chapelle, Haffner & Vapnik, 1999]). SVM classifiers feature in emotion recognition systems
frequently (e.g. Flórez et al., 2010; Sourina & Liu, 2011; Takahashi, 2004). SVM
classification is a relatively simple approach and consequently less likely to demand high
computer processing resources. Chanel (2009) asserts that strong generalisation of datasets
and ‘interesting performances in high-dimensional feature spaces’ are characteristic
advantages of SVM processing. Whilst this may advocate SVM as a viable approach to
classification, emotion recognition accuracy ratings vary (for example, 41.7% [Takahashi,
2004], 64-71% [Flórez et al., 2010]). Murugappan et al. (2010) implicitly reject SVM in
favour of K Nearest Neighbour (KNN) and Linear Discriminant Analysis (LDA) approaches.
KNN and LDA classification algorithms also feature in several emotion-recognition models
(Levis, 1995; Murugappan et al., 2011; Rizon, 2010). The dates of these publications suggest
that the developmental history of classification systems does not disregard one approach in
favour of another but, instead, researchers appear to be focussed upon improving the various
approaches simultaneously, with greater attention paid to the minute details. This research
track has led to the employment of various subclasses of classification formulas, including
Relevance Vector Machines (Chanel, 2009) and Absolute Logarithmic Recoursing Energy
Efficiency (Murugappan et al., 2010).
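Of the classifiers named above, K Nearest Neighbour is the most compact to demonstrate. The following is a minimal sketch rather than any cited author's implementation: the two-element feature vectors, the labels and the choice of k = 3 with Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Return the majority label among the k training points nearest to query.

    training: list of (feature_vector, label) pairs.
    """
    ranked = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, hand-made feature vectors (not experimental data).
training_set = [
    ((0.10, 0.90), "fear"), ((0.20, 0.80), "fear"), ((0.15, 0.85), "fear"),
    ((0.90, 0.20), "joy"), ((0.80, 0.10), "joy"), ((0.85, 0.15), "joy"),
]
prediction = knn_classify(training_set, (0.20, 0.90))
```

The approach requires no training phase, but every classification scans the full training set, so its cost grows with the amount of stored data.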
It could be asserted that, to properly support any EEG biofeedback system, a
comprehensive evaluation of all models of processing is required. Such an
approach would, however, be extremely time-consuming and from an overview of relevant literature it
is clear that algorithms, classification frameworks etc. are subject to ongoing development
and any comprehensive evaluation would quickly become outdated. In response to this
argument, the notion that a restricted or windowed evaluation of methodologies would
provide a compromise between informed design and practical efficiency is arguably
appropriate. It could further be posited that, due to the multitude of variables that potentially
influence best processing method (equipment, context, desired output, etc.), comprehensive
pre-testing is advised to enable the direct comparison of a range of processing approaches
within a controlled scenario.
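A pre-test of the kind advised above could, as one possibility, compare candidate classifiers on the same dataset using leave-one-out cross-validation. The sketch below assumes two simple stand-in classifiers (nearest neighbour and nearest centroid) and a hand-made dataset; none of these choices are taken from the cited studies.

```python
import math

def nearest_neighbour(training, query):
    """Label of the single closest training point."""
    return min(training, key=lambda item: math.dist(item[0], query))[1]

def nearest_centroid(training, query):
    """Label of the class whose mean feature vector is closest to query."""
    groups = {}
    for features, label in training:
        groups.setdefault(label, []).append(features)
    centroids = {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in groups.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], query))

def leave_one_out_accuracy(classifier, dataset):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (features, label) in enumerate(dataset):
        training = dataset[:i] + dataset[i + 1:]
        if classifier(training, features) == label:
            correct += 1
    return correct / len(dataset)

# Hypothetical, well-separated feature vectors (not experimental data).
dataset = [
    ((0.10, 0.90), "fear"), ((0.20, 0.80), "fear"), ((0.15, 0.85), "fear"),
    ((0.90, 0.20), "joy"), ((0.80, 0.10), "joy"), ((0.85, 0.15), "joy"),
]
results = {
    "nearest_neighbour": leave_one_out_accuracy(nearest_neighbour, dataset),
    "nearest_centroid": leave_one_out_accuracy(nearest_centroid, dataset),
}
```

Running every candidate against identical data and an identical scoring procedure is what makes the resulting accuracies directly comparable.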
CONCLUSIONS AND CHAPTER SUMMARY
From the discussions throughout this chapter, EEG, EDA and EMG psychophysiological
measures show significant potential for accurate and reliable quantitative assessment of
player affect. Although there are clearly no perfect systems currently available, the research
conducted so far strongly advocates biometrics as a key component of emotion recognition.
The key problem lies in the translation between quantitative physiology and qualitative
psychology, highlighting the difference in embodied existence between man and machine.
Appropriate contextualisation is arguably the solution, artificially embodying the biometric
data via attachments to emotion-related language and expression. Chapter 6 presents
methodology designs for three related preliminary experiments, all of which focus upon
assessment of affective experience in response to sounds modulated by way of digital signal
processing (DSP). The intention of these experiments is to discover if such detached
modulations (the DSP effects have no predefined contextual association to the situation,
theme or atmosphere of the testing environment/game) are capable of intensifying/
attenuating a fearful experience, or if context is indeed essential to successful manipulation of
an individual’s emotional state.
Chapter 6: Methodology Designs
INTRODUCTION
To better understand the interrelationships that exist between sound and emotion, three
preliminary experimental trials were built, the intention being to assess the potential of
objective acoustic parameters to evoke and/or modulate fear during computer video
gameplay. One key hypothesis raised from the literature reviews and discussions of previous
chapters is that the embodied nature of listening dictates that objective acoustic parameters, if
mechanical in nature and detached from semantic meaning, are incapable of altering a
player’s emotional state. If it is accepted that all listening is embodied, then such a thing as a
completely objective acoustic parameter cannot exist, as it is inappropriate to suggest that a
listener could not attach their own connotations to even the most discrete quantitative
parameter. This differentiates objective from quantitative and within the contexts of
acoustics, parameters (including volume, reverberation, delay, attack, etc.) can be measured
in quantitative scales (primarily seconds, hertz and decibels), yet cannot be labelled objective.
However, as with the concept of virtuality, objectivity/subjectivity can arguably be
understood as a continuum, within which particular acoustic factors can possess a higher
objectivity value. The sounds utilised within these experiments are treated with a range of
quantitative digital signal processing (DSP) effects with minimal intended semantic
attachment. The source sounds themselves (particularly within experiments 1 and 2) arguably
also possess limited connotative information, as there is no discernible connection between the
sounds heard and the circumstances (both diegetic and extra-diegetic) the listener is placed in.
Should the obtained data reveal a statistically significant difference between treatment groups,
then the evidence would support the claim that sound is capable of modulating emotion without distinct
contextual support.
EXPERIMENT 1:
1.1 WEB MEDIATED ASSESSMENT OF AFFECTIVE GAME SOUND
The preliminary testing documented within this thesis utilises the internet as a platform for
data collection. Collections of sampled audio files that feature in both the biometric pre-trial
and the real-time intensity experiment are also presented in two internet-mediated tests. The
full details of these internet trials are disclosed later in the chapter. The following three
sections examine existing relevant theory and experiment-obtained information to
contextualise a subsequent empirical investigation and also establish a theoretical foundation
to support the methodology (documented later within this chapter).
Using the internet to cast a wider net and access millions of potential participants has a
significant appeal but is, however, fraught with methodological difficulties. This section
provides an assessment of electronic research (incorporating internet-mediated testing)
reliability, comparing online internet responses with data collected in traditional testing
environments and presenting an overview of the advantages and limitations of using the
internet to perform various forms of data collection. The focus of this review is upon internet-mediated approaches to experimentation methodologies for primary research, not conducting
secondary research (such as literature reviews). The information presented within this
literature review then feeds into the methodology for a bespoke internet-mediated testing
environment, in which quantitative data is collected in response to a variety of audio samples
to evaluate perceived user-defined fright, emotional valence and arousal. Results,
documented in the next chapter (7), support the hypothetical frameworks that are presented in
chapter 9.
The primary test hypothesis for Experiment 1 states that both intensity and valence measures
of user-affect will differ significantly between control (untreated) and test (DSP treated)
sounds. It is further expected that there will be significant difference between individual
source sounds. The secondary hypothesis asserts that there will be no significant difference
between online and offline test group results.
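One way the primary hypothesis might be examined is with a paired comparison of ratings for the same source sound in its control and treated forms. The sketch below computes a paired t statistic; the choice of test is an assumption for illustration (the statistical procedure actually used is documented elsewhere in the thesis), and the ratings are invented placeholder values.

```python
import math
import statistics

def paired_t_statistic(control, treated):
    """t statistic for the mean of the paired differences (treated - control)."""
    if len(control) != len(treated) or len(control) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [t - c for c, t in zip(control, treated)]
    sd_diff = statistics.stdev(diffs)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    standard_error = sd_diff / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

# Invented intensity ratings for five participants (illustrative only).
control_ratings = [3, 4, 2, 5, 3]
treated_ratings = [5, 6, 3, 7, 4]
t_value = paired_t_statistic(control_ratings, treated_ratings)
degrees_of_freedom = len(control_ratings) - 1
```

The resulting statistic would then be compared against the critical value for the stated degrees of freedom at the chosen significance level.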
1.2 THE POSSIBILITIES OF E-RESEARCH
‘[The internet] holds the promise to achieve further methodological and procedural
advantages for the experimental method and a previously unseen ease of data collection for
scientists and students.’ (Reips, 2002)
The internet provides an ever-increasing wealth of possibilities for psychological research
and the number of legitimate web-based experiments is growing significantly (Birnbaum,
2004; Lewis et al., 2008). Online technology supports the hosting of highly interactive test
environments, provides a dialogue between researcher and participant, can deliver surveys
and complex tests to a great audience, and even supports real-time access to physical
laboratory environments from a remote location. Data obtained by both
traditional (pencil and paper) and electronic (online network) methods have yielded comparable
results and it has been suggested that the internet is the next step forward in developing and
presenting survey materials (Lewis et al., 2008). Progress appears set to continue and it has
been posited that the Internet ‘will decisively shape the nature of psychological research’
(Nosek & Banaji, 2002).
This chapter specifically addresses web experiments, first by way of outlining the substantial
benefits then proceeding to address methodological concerns and potential solutions. A range
of relevant literature is documented and cross-examined alongside current technological and
socio-cultural circumstances. The term web experiment here refers to an online task that
requires participants to interact with web-based materials (images, audio, video, etc.) and
provide response data either directly from the task (e.g. response time, task actions) or post-task (e.g. opinion obtained in debrief). Although varied terminology for this online medium
of psychological experimentation exists, Reips (2002) argues that web experiment is the
preferred term, being historically the first term implemented for this purpose.
Web experimentation requires no physical space or materials (excluding client side hardware)
and is unlikely to require additional personnel for operations, development or maintenance
(Stanton & Rogelberg, 2001). The cost of web space is becoming increasingly affordable and
domains can be continually reused for new web experiments, whilst professional grade web
authoring software is becoming increasingly powerful, accessible and affordable. The nature
of web experiments requires that the entire process be displayed and open, providing access
to anyone who may wish to replicate or develop an existing test (Reips, 2002). Furthermore,
online participation bypasses scheduling difficulties, supports the simultaneous input of
multiple users and provides continuous access to web experiments, facilitating the rapid
delivery of data from large samples (Reips, 2002). The opportunity for anonymity during
participation provides access to individuals/groups that would refuse to take part in an offline
equivalent test (Schmidt, 1997). Integration of pre-established online protocols can facilitate
the automation of data collection and storage; in addition, obtained data can be refined by
elimination of incomplete or undesired responses and additional information (reaction time,
response time/date) can be collected and stored automatically (Reips, 1995).
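The automated refinement described above might be sketched as follows. The field names (including the fright, valence and arousal ratings), the record layout and the function names are assumptions for the example rather than a description of the system actually built.

```python
from datetime import datetime, timezone

# Hypothetical set of fields that a complete response must contain.
REQUIRED_FIELDS = ("participant_id", "sound_id", "fright", "valence", "arousal")

def stamp_response(response, now=None):
    """Return a copy of the response with the time of receipt recorded."""
    stamped = dict(response)
    stamped["received_at"] = (now or datetime.now(timezone.utc)).isoformat()
    return stamped

def is_complete(response):
    """A response is complete when every required field has a non-empty value."""
    return all(response.get(field) not in (None, "") for field in REQUIRED_FIELDS)

def refine(responses):
    """Discard incomplete responses before analysis."""
    return [r for r in responses if is_complete(r)]

# Illustrative records only.
raw = [
    {"participant_id": "p01", "sound_id": "s1", "fright": 4, "valence": 2, "arousal": 5},
    {"participant_id": "p02", "sound_id": "s1", "fright": 3, "valence": "", "arousal": 4},
]
stored = [stamp_response(r) for r in raw]
usable = refine(stored)
```

Stamping on receipt and filtering before analysis keeps the raw submissions intact, so discarded records remain available for examining dropout.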
Generating powerful datasets that inspire confidence in research conclusions is a highly
desirable characteristic of both academic and commercial research. Large sample sizes also
support increased generalisability of results to the general population (Horswill & Coster,
2001) and will increase the statistical power of obtained results. The internet offers a
potential solution and has been described as an ‘inviting opportunity to reach large numbers
of people’ (Welch & Krantz, 1996). In some cases the internet has proved to be the single
solution for academic researchers who require large sample numbers and have limited
resources (Klauer et al., 2000). The increased accessibility and relative ease of data input in
computer-based psychological tests (compared to traditional pencil and paper methods)
enables higher response rates and more comprehensive information to be received (Kiesler &
Sproull, 1986). In addition to connecting the researcher to large sample sizes, relatively rare
and highly specified populations may also be accessible (Schmidt, 1997). Web
experimentation has the potential to overcome geographical distances and provides links to a
wide variety of societies and cultures that may be inaccessible by other recruitment solutions
(Reips, 2002).
The online adaptation of existing field or laboratory research has the potential to increase the
external validity of such prior experiments by way of providing a more natural environment
(most likely the participant's home), where the participant completes the experiment in
comfortable and familiar surroundings (Reips, 1995). Reduction of experimenter effects and
wider participant access to increase representativeness of the sample supports external
validity further (Reips, 1997). Nosek and Banaji (2002) argue that it is the removal of
experimenter-based coercion effects that grants internet methodologies this increased validity.
A suitably constructed web experiment may provide anonymity for the participants, reducing
potential deception and encouraging input of sensitive information that a participant may feel
uncomfortable revealing (Reips, 2002). Welch and Krantz (1996) noted that a significant
number of participants comfortably provided critical feedback by way of a networked system,
suggesting a greater willingness to criticise through an indirect medium compared to direct
interfacing with the researcher. It has also been asserted that anonymous and indirect
participation reduces the tendency to respond in a socially desirable way, providing freedom
from social convention bias (Booth-Kewley, Edwards, & Rosenfeld, 1992).
Stanton and Rogelberg (2000) argue that modern computing technologies support increased
interactivity in web experimentation. Graphics, animations, video, sound and images are all
supported by the modern internet and consequently are available to utilise within web
experiments (Mutz, 2011: p.7). The internet also exists as a component of remote-distance
laboratory interaction, a substantial evolution in laboratory experimentation with significant
application potential in industry, academic and education contexts (Schauer et al., 2007).
Although the premise is over a decade old, it is still undergoing regular development and
gaining in popularity. LabVIEW (www.ni.com/labview) is a long-running solution,
essentially providing an online graphical user interface (GUI) representative of a real
laboratory environment. Users can consequently interact with a physical laboratory
environment in real-time and receive experiment data from their remote location. Modern
computer technology enables these interfaces to present realistic virtual worlds in which high
quality graphic representations of physical equipment and devices support increased
immersion, and consequently high user-confidence and performance whilst operating the
program (Scheucher et al., 2009).
1.3 THE METHODOLOGICAL PERILS OF E-RESEARCH
To effectively utilise the internet for conducting web experiments it is vital that a number of
technical, methodological, procedural and ethical concerns be addressed (Reips, 2002).
Walsh et al. (1999) propose that the perceived credibility of information obtained from the
internet varies substantially depending upon demographic variables, and that many perceive
internet-based research to be significantly less credible than that conducted using printed
media. Whilst this arguably applies mainly to secondary research practices, it is still
conceivable that such negative attitudes may also extend to web experimentation. With
almost 60% of the United Kingdom accessing the internet daily and consistent increases in
internet usage (ONS, 2010), it is conceivable that confidence in the internet as a viable
medium for academic study is growing. However, it remains likely that many would still
maintain distrust towards internet-mediated research, and consequently it is strongly
advisable that every precaution be made to strengthen the validity, reliability and
generalisability of any web experiment methodology.
With the ultimate purpose of this section being to support the development of a web
experiment pertaining to variation of audio samples, it is important that possible variances in
the presentation of audio via the web be addressed. Roychoudhuri et al. (2003) state that the
‘human ear is more sensitive to quality degradation than the human eye’, and therefore every
effort should be made to maintain audio quality. This problem can be extended further with
regards to client side variances (hardware, browser, etc.), raising the concern that such
differences all have the unwanted potential to alter the presented sound. Roychoudhuri et al.
(2003) provide a detailed description of the quality and performance of various audio
compression formats presented through a number of connection types and speeds, their work
suggesting that various formats and compression types should be pre-tested on a range of
connection speeds to confirm uniformity.
The nature of computer-mediated web experiments allows a researcher to overcome
geographical and logistical barriers to connect participants to tests without necessitating a
physical presence. It is, however, this absence of researcher presence that generates
significant methodological issues, whereby validity becomes increasingly questionable due to
lack of control over extraneous variables. In the place of a traditional and controlled
laboratory environment, web experiments are completed in a variety of surroundings
with the researcher unable to directly control: the participant, any characteristics of the room
(temperature, light, humidity, acoustics, etc.), the presence of distracters (other people,
background noise, participant multi-tasking, etc.), the time of day or the equipment
(hardware, operating system, software, etc.) used. This presents additional concerns with
regards to testing sound by way of a web experiment, as participants may use an array of
alternative sound outputs (speakers, headphones, surround sound systems, etc.) and specific
control parameters (user-set volume, equalisation, stereo panning, etc.).
Without interaction between participant and researcher, confirmation that the participant fully
understands the requirements of the tasks is very difficult to obtain and participants may
(accidentally or deliberately) confound results by completing the task with another individual
also present, by accidentally leaving questions unanswered, or by intentionally providing
false information (Nosek & Banaji, 2002; Reips, 1997: p.381; Reips, 2002). Krantz (2001)
suggests that precise control of how test materials are presented and perceived without the
attendance of a researcher is particularly difficult. Several potential solutions for this specific
problem include: a list of requirements as part of the briefing (complete the task alone, be in a
quiet place, etc.), utilising web scripting to limit the allocated time for task completion, and
debrief questions asking the participant to report on their environment (Nosek & Banaji,
2002).
The attendance of a live researcher characteristically imposes social pressure upon the
participant that, although it may be treated negatively as a form of bias, may also contribute
towards reduction of deliberate deception (Nosek & Banaji, 2002). Without careful
participant observation, there is a notable risk of participants missing a crucial requirement of
the task or performing an action that disrupts the experimentation procedure. Participants
may use web controls (back, forward, refresh, etc.) in a way that compromises the experiment
(Nosek & Banaji, 2002), or they may neglect an option from a drop-down menu, causing a
missing or incorrect response (Reips, 2007: p.375). Such problems can, however, be easily
addressed with careful design and pretesting. Modern web authoring software such as
Dreamweaver CS6 (Adobe, 2012) provides accessible and powerful tools to enable discrete
control over participant input. Text input fields can require precise numbers of characters or
digits and specific formats (email, web address, etc.), whilst access to browser control
functions can be limited, hidden time limits applied to interactions and input validation tools
confirm that all data has been correctly entered before the user can continue. Scripts can
facilitate immediate written feedback should a participant make a mistake or leave a page
incomplete.
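The kind of input validation outlined above is usually applied in the browser, but the same checks can be mirrored on the receiving side. The following is a minimal sketch in Python under several assumptions: the form fields (age, email, platform), the deliberately loose email pattern and the wording of the feedback messages are all hypothetical.

```python
import re

# Loose structural check only; it does not prove the address exists.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
# Value assumed to be submitted when a drop-down is left on its placeholder.
UNSELECTED = ""

def validate_form(form):
    """Return a dict mapping field name to feedback message; empty means valid."""
    errors = {}
    if not re.fullmatch(r"\d{1,3}", form.get("age", "")):
        errors["age"] = "Please enter your age in digits."
    if not EMAIL_PATTERN.fullmatch(form.get("email", "")):
        errors["email"] = "Please enter a valid email address."
    if form.get("platform", UNSELECTED) not in ("pc", "console"):
        errors["platform"] = "Please choose your preferred gaming platform."
    return errors

# Illustrative submission with a neglected drop-down menu.
feedback = validate_form({"age": "27", "email": "someone@example.org", "platform": ""})
```

Returning every message at once, rather than stopping at the first failure, allows the page to present all corrections to the participant in a single pass.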
Producing datasets that can be confidently extrapolated beyond the confines of the
experimental context in which they were obtained is best achieved by way of cross-validating
results using alternative research modalities (Cho & LaRose, 1999). Stanton and Rogelberg
(2000) suggest that ‘research comparing browser-based results with other modalities paints
an optimistic picture’, referencing a number of experiments that reveal notable consistency
between results obtained via web experimentation and alternative mediums. It may, however, be
inappropriate to rely solely upon cross-modal testing to confirm generalisability without also
addressing the related generalisation issues intrinsically associated with web experimentation
and their potential solutions.
A significant asset of internet-mediated research, such as access to large sample groups, can
only be fully exploited at the cost of significantly increased risk to the generalisability of
obtained results. Depending upon the specific details of the recruitment method, the nature of
web experimentation has the potential to generate a sample of participants that do not
adequately represent a target population and, therefore, obtained data lacks generalisability. It
has been suggested that because web experimentation inherently requires participants to have
internet access, obtained results are likely to become skewed towards overrepresentation of
particular demographics (Stanton & Rogelberg, 2000). Subsequent survey studies have
suggested that such effects (geographic location, age, gender, financial status, etc.) are
diminishing as the global penetration of the internet is exponentially increasing (Nielsen Net
ratings, 2000). However, demographic diversity has been revealed to significantly skew data
from online studies if specific differences relevant to the research question are not controlled
(Stamler et al., 2000; Yost & Homer, 1998).
The absence of human contact within a computer-mediated web experiment also has the
potential to weaken generalisability if the study relates to any form of human interaction
(Reips, 2002). However, if the research is concerned with human computer interactions, then
it would be logical to assume that computer-based methodologies support, rather than hinder,
ecological validity. Within the context of the Experiment 1 research question, several aspects
of the web experiment methodology support the generalisability of obtained data. Personal
computers (PCs) used for web browsing are also commonly employed for gaming and
consequently the same hardware (monitor, speakers/headphones, control interfaces) is
increasingly likely to feature in both the web experiment and in recreational gameplay. In
order to capitalise upon this opportunity, participants should be required to indicate whether their
preferred platform for gaming is a PC or a console.
Furthermore, the separation between participant and researcher may result in a greater risk of
participant-sourced bias. Reips (2002) highlights multiple submissions as a common problem
due to the difficulty in identification of participants, suggesting that the same participant may
provide multiple datasets that cannot be separated by IP address (if using different terminals
or a connection that refreshes the IP address). Such an event may occur innocently (the
participant was unaware that they were not to repeat the test), maliciously, or as a result of
weak website design. A simple and efficient solution to innocent participant error and website
design could be to incorporate an explicit request to participants that they must only perform
the test once, explaining the reasons and importance of this; also to control data submission
(e.g. avoiding several sets of data being sent via multiple clicks of the submit button) whilst
ensuring that participants are given immediate feedback after submission of data (Cho &
LaRose, 1999). It could also be suggested that data collected from an expert in the field
may affect the generalisability of the results. Reips (1997) argues that providing an
opportunity for such an expert to identify themselves and their data is a viable solution to this
problem, allowing such persons to explore such experiments and provide feedback without
affecting the data.
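The two controls discussed above (accepting a dataset only once, and letting experts flag their own data) can be sketched together. The class name, the idea of issuing one token per test session and the record layout are assumptions made for illustration; as the text goes on to note, such measures do not guard against deliberate repeat participation.

```python
class SubmissionStore:
    """Accepts one dataset per session token and separates expert submissions."""

    def __init__(self):
        self._by_token = {}

    def submit(self, token, data, is_expert=False):
        """Store data once per token. Returns True if accepted, False if a repeat."""
        if token in self._by_token:
            return False  # e.g. a second click of the submit button
        self._by_token[token] = {"data": data, "is_expert": is_expert}
        return True

    def analysable(self):
        """Datasets to analyse: self-identified experts are excluded."""
        return [r["data"] for r in self._by_token.values() if not r["is_expert"]]

# Illustrative usage with hypothetical tokens and ratings.
store = SubmissionStore()
first = store.submit("session-001", {"fright": 4})
repeat = store.submit("session-001", {"fright": 4})
store.submit("session-002", {"fright": 1}, is_expert=True)
```

The boolean returned by `submit` can drive the immediate feedback shown to the participant, confirming receipt on the first submission and explaining the rejection on any repeat.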
Whilst such identification measures are logically sound for establishing some control, they do
not account for deliberate participant action. Stanton and Rogelberg (2000) refer to access
control (restriction of admission to web materials) and authentication (verification of
participant identity) as potential solutions to participant sourced bias; however, this raises
further questions regarding how to implement such control, who is worthy of access and what
impact such a method will have upon sample size. Existing research fortunately argues that
relatively low rates of multiple submission do not compromise reliability of results; however,
Frick et al. (2001) strongly advise identification data be collected before users begin
experimentation and Reips (2002) further suggests repeated questioning of standard
information (age, date of birth, etc.) to support the truthfulness of the participant. Generating
a comprehensive account of participant information may significantly reduce the risk of such
problems; however (as will be elaborated later in this section), methods of obtaining and
storing individual participant records present further difficulties in the form of ethical
concerns.
In addition to participant demographic issues, significant dropout rates may also negate the
generalisability of obtained results (by way of selection bias) if the causes of dropout
discriminate for or against specific population groupings. A study conducted by Buchanan
and Reips (2001) revealed that personality differences existed between Mac and PC users and
documented a correlation between JavaScript-enabled users and lower average education
levels. Their research argues that web experiments should employ only basic and widely
available technology, suggesting that rich multimedia websites could alienate particular
demographics. This is regrettable, because the incorporation of Flash (Adobe) and streaming
audiovisual materials supports the production of attractive and engaging web pages which
could arguably serve to increase participant uptake and retention throughout the experiment.
In agreement with Buchanan and Reips (2001), it has been suggested that such materials
require specific software and greater hardware resources to operate, subsequently causing
variation in user experience due to alternative connection speeds, client-side equipment
(Schmidt, 2000) and negatively impacting upon both experimental validity and
generalisability.
Addressing the above concern, modern internet and personal computer capabilities have
arguably reduced compatibility concerns considerably. A Millward Brown (2011) survey
claims that 99% of web browsers are compatible with Flash technology with 80% Java-enabled. Flash, in particular, could be positioned as a fundamental internet technology, with
both YouTube (averaging 2 billion views per day) and Facebook (more than 2 billion videos
watched per month) requiring Flash technology (Internet in Numbers, 2010). Although this
cannot account for potential variations in browsing and multimedia playback speeds,
comprehensive pre-testing of web materials, (accounting for potential variations in hardware,
operating system, web browser, security software, third party plugins, connection type, and
connection speed) goes a great distance towards administering control over such systematic
biases (Stanton & Rogelberg, 2000), ultimately reducing both dropout risk and erroneous
variation in between participant experiences.
Dropout curves should be differentiated from dropout rates, and are a measure of when a
dropout occurs during the process of experimentation. Dropout curves risk a compromise of
validity via analysis bias if incomplete data sets are not identified and discarded (Reips,
2002). Although removal of incomplete responses may solve this concern, methods of
reducing dropout are preferable, as they serve to retain the potential of large sample sizes.
Acting upon the notion that dropout rates are highest during the opening few minutes of a
web experiment, Reips et al. (2001) suggest implementing a warm-up phase, utilising several
incidental tasks/questions before proceeding to the experimental phase.
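As a minimal sketch of the distinction drawn above, the following Python fragment contrasts a dropout curve (the proportion of participants still present at each page) with a single overall dropout rate. The function names, the four-page task and the example records are hypothetical and are not taken from the experiments reported in this thesis.

```python
from collections import Counter

def dropout_curve(last_pages, total_pages):
    """Return the proportion of participants who completed each page 1..total_pages.

    last_pages: the last page each participant completed
                (0 = left before finishing the first page).
    """
    last_pages = list(last_pages)
    n = len(last_pages)
    if n == 0:
        raise ValueError("no participant records supplied")
    counts = Counter(last_pages)
    curve = []
    remaining = n
    for page in range(1, total_pages + 1):
        # Anyone whose last completed page was the previous one has left.
        remaining -= counts.get(page - 1, 0)
        curve.append(remaining / n)
    return curve

def dropout_rate(last_pages, total_pages):
    """Single overall figure: proportion who did not complete the final page."""
    last_pages = list(last_pages)
    return sum(1 for p in last_pages if p < total_pages) / len(last_pages)

# Hypothetical records for ten participants on a four-page task.
records = [4, 4, 0, 1, 4, 4, 2, 4, 0, 4]
```

A curve that falls steeply over the first pages and then flattens would be consistent with the early-dropout pattern that a warm-up phase is intended to absorb.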
1.4 ETHICAL CONCERNS
Participant privacy and anonymity are arguably the primary ethical concerns associated with
web experimentation (Cho & LaRose, 1999) and any research conducted through this
medium must carefully adhere to appropriate conduct whilst enrolling participants and
handling personal data (Davidson, 1999). Indirect approaches to recruitment (registering with
a search engine, advertising banners within other sites), although unobtrusive and ethically
preferable, are relatively ineffective in recruiting large numbers of individuals (Tuten et al.,
2000). Comparatively, direct requests by email, post or phone may yield greater responses
but may be perceived by some as invasive. Careless recruitment practices may have a
negative impact upon participant privacy, and also encourage malicious response from
individuals who feel angered by persistent and/or intrusive requests for participation (Cho &
LaRose, 1999). Viral advertising is believed by some to be the most effective approach to
enrolling participants (Nosek & Banaji, 2002) and, if conducted appropriately via word of
mouth and an attractive, interesting website, it may be the ideal approach to satisfying both
ethical and sample size criteria.
The selecting of participants for interactive web experiments offers a diverse range of
options. Of these, self-selection supports larger sample sizes, but in addition to associations
with socio-cultural biases (discussed earlier) raises ethical concerns due to researchers
utilising internet-mediated methods of collecting participant identity information (Reips,
2002). Identification of the participant is invaluable when attempting to control self-selection
bias, multiple submissions and sample characteristics. However, direct requests and
covert automatic collection of identification data greatly threaten a participant's
anonymity whilst also raising concerns regarding vulnerability towards potential fraud and
identity theft (Stanton, 2008). Such fears are becoming increasingly justified, with 2005 seeing
56 billion US dollars lost to corporate and consumer identity theft (Romanosky, 2011) and
UK fraud associated with banking information generating £609.9 million in consumer losses
in 2008 (ONS, 2011). Failure to account for such concerns may severely deter potential
respondents and dramatically reduce response rates (Bartel-Sheehan & Grubbs-Hoy, 1999).
As a result, it has never been more crucial that participants' personal information be securely
handled or not required during web experimentation. Nosek & Banaji (2002) argue that the
storing of sensitive material on internet servers leaves the data vulnerable to theft if not
meticulously protected. Although modern protection services do exist, which offer strong
(but not necessarily impenetrable) security, it could still be suggested that transferring
server-stored material to offline devices is the only way to guarantee protection from online data
theft.
Analysis of existing research suggests an unfortunate obligation to compromise between
validity, ethics and generalisability. For example, automatically collecting an array of
participant information (IP address, operating system, screen resolution, browser type)
requires the use of plug-in based procedures such as JavaScript which have been revealed to
produce variation in browsing speeds and compatibility issues, leading to systematic bias and
increased dropout rates (Schwarz & Reips, 2001). Although both Java and JavaScript
procedures have become increasingly common, with over 82% of United States based
internet browsers compatible (WebKnow, 2011), significant issues remain, with both
YouTube and Facebook forums documenting problems associated with this technology. This
creates a genuine concern as, without secure knowledge of client-side software, the
researcher cannot account for variations in participant experience derived from these
differences. Reips (2000) provides a potential solution in multiple site entry technique, a log
file setup facilitating the collection of hyperlink information, allowing the researcher to
observe what site led the participant to the experiment. Unlike plugins using JavaScript,
log-based automated data collection procedures have not been associated with website variation.
However, most information collected using this method is inferential and although systematic
variation is unaffected, ethical issues (participants may not wish researchers to know details
of other websites they have accessed) remain present. Conducting a web experiment that is
unsuitable for children or that wishes to collect data from only adult respondents is fraught
with additional ethical difficulties. Nosek & Banaji (2002) suggest a range of potential
solutions, including minimising features that might appeal to children, recruiting from
adult-dominated sources, direct invitation only, and authentication by way of a centralised
registering process.
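The multiple site entry technique described above can be illustrated with a short Python sketch that tallies the referring host for each logged visit. The log format (a timestamp and a referrer URL separated by a tab), the function name and the example entries are assumptions made purely for illustration; they do not describe the log setup used by Reips (2000).

```python
from collections import Counter
from urllib.parse import urlparse

def tally_entry_sites(log_lines):
    """Count how many visits arrived from each referring host.

    Each line is assumed (hypothetically) to be 'timestamp<TAB>referrer_url';
    an empty or '-' referrer is counted as 'direct'.
    """
    tally = Counter()
    for line in log_lines:
        parts = line.rstrip("\n").split("\t")
        referrer = parts[1].strip() if len(parts) > 1 else ""
        if referrer in ("", "-"):
            tally["direct"] += 1
            continue
        # Keep only the host, discarding the path of the referring page.
        host = urlparse(referrer).netloc.lower()
        tally[host or "unknown"] += 1
    return tally

# Hypothetical log entries.
sample_log = [
    "2011-03-01T10:00:00\thttp://forum.example.org/thread/12",
    "2011-03-01T10:05:00\t-",
    "2011-03-01T10:09:00\thttp://forum.example.org/thread/40",
    "2011-03-01T10:12:00\thttp://news.example.com/",
]
```

Retaining only the referring host, rather than the full address, is one possible way of limiting how much of a participant's browsing history is stored, although it would not remove the ethical concern entirely.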
Additional care must also be taken to ensure that participants fully understand the nature of
the experiment and are willing to contribute. In a laboratory environment, a researcher has the
opportunity to brief the participant, obtain informed consent and respond to any unforeseen
issues that cause discomfort or distress. The absence of a researcher in web experiments
means that real-time feedback is not possible and, consequently, briefing should be
comprehensive and consent forms should be transparent, with possible participant questions
anticipated and answered (Nosek & Banaji, 2002). This raises questions regarding the ethical
acceptability of duplicity or intentional withholding of information during the briefing
process. Whilst such actions may support the requirements of the experiment and reduce
participant bias, the risk to a participant's wellbeing is greater when compared to an
equivalent scenario with a researcher present.
In summary of this review, internet-mediated experimentation can clearly overcome barriers
associated with traditional laboratory experimentation, providing access to large sample sizes
via a medium that is highly cost effective. High participation rates increase the potential for
external validity, and more statistical power is afforded during data analysis. Large quantities
of data can be collected, pre-processed, stored and even analysed automatically using
server-side routines. This enables the design of increasingly efficient and, consequently, desirable
methodologies. Such benefits naturally do not come without cost, and careful attention to
methodological pitfalls is essential during the design stage. For the purposes of audio-based
experimentation, variations in hardware, settings, connection speeds, audio compression and
encoding formats must be addressed to support any conclusions drawn from obtained
datasets.
The methodology below consolidates most of the theory presented thus far (specifically the
discussions relating to internet-mediated research, fear theory and the emotion
potential of sound) to support the development and execution of an online interactive testing
environment built to assess the impact of digital signal processing effects upon users'
subjective emotional experience. Online and offline testing methods are also compared to
evaluate both the advantages and limitations of internet-mediated testing, as already
discussed within this chapter.
1.5 METHODOLOGY INTRODUCTION
The methodology of experiment 1 takes its influence from both the preceding
internet-mediated experimentation literature review and the chapters concerning sound, perceptual
listening and affective states. Data is obtained by way of both online (internet-mediated,
uncontrolled testing environment) and offline (local-network, maintained environment)
methods to simultaneously provide information regarding the ability of digital signal
processing (DSP) effects to alter the emotion evocation potential of a sound whilst also
comparing online to offline environments to test the assumptions regarding web-mediated
limitations documented earlier in this chapter. Experiment 1 is split into two separate tests,
the purposes of which are to exploit existing theoretical and applied research for the function
of developing an internet-mediated audio testing environment within which the affective
quality of preselected sounds can be assessed and recorded as quantitative measurements. The
specific interest of this research is in the potential of variable audio parameters to influence
the valence and intensity of a listener's fear sensation in response to a presented sound.
Selections of acoustic parameters that can be easily manipulated using DSP are employed.
DSP selection is supported by research documented earlier within the thesis (chapter 3), that
asserts manipulation of these parameters may have the capacity to alter the fear potential of a
source sound (a term used within this chapter to refer to the original, unprocessed incarnation
of a selected sound). The information gathered from this experiment feeds into the
development of the final experimental trial documented within this thesis; a bespoke
computer video game level incorporating these sounds and measuring the affective states of
participants by way of psychophysiological measures to further explore which sounds are
most suitable for evoking a particular emotional state during gameplay.
1.6 WEBSITE DESIGN
Two separate web experiments (referred to on the website as online tasks) were developed
and hosted on gameresearchers.co.uk (site no longer live), a bespoke academic website
constructed exclusively (and in its entirety) to support these tests. The design language is
primarily Extensible Hypertext Markup Language (XHTML) 1.0 and Cascading Style Sheets
(CSS), defining the basic structure and appearance of the web pages. PHP (version 5.3.4)
scripting was employed specifically within the web experiment pages to enable collection and
processing of user-inputted data between web pages, and transfer of information to a
designated email account. All programming and debugging tasks were carried out using
Dreamweaver CS4 (Adobe, 2008) with additional programs enlisted to develop assets and
materials. Graphical images were processed in Photoshop CS4 (Adobe, 2008), animations
developed using Flash CS5 (Adobe, 2010), video clips developed in Vegas Pro 9 (Sony,
2009) and audio files edited with Cubase 5.1 (Steinberg, 2009), all of which were then
imported into Dreamweaver before uploading onto a hosting domain.
Despite existing concerns that rich web content may both exclude potential participants, and
generate variation between user experiences (Buchanan & Reips, 2001), current statistical
information (Millward Brown, 2011) supports the implementation of Flash as a highly
compatible format for Mac OS X (Apple), Windows (Microsoft) and Linux operating systems. In
comparison, accessible alternative formats of audiovisual playback including QuickTime
(.mov/.mp4), Windows Media (.wmv/.avi) and RealPlayer (.rm) were all revealed in pre-tests
to be more likely to require installation of additional software (particularly between
alternative operating systems). Gameresearchers.co.uk heavily utilises the Flash (.flv/.swf)
formats for animations, interactive buttons, audio triggers and full motion videos (FMV). In
addition to providing a reliable platform, Flash allows significant control over how online
media can be interacted with, enabling greater researcher control over the website
environment.
Van Duyne et al. (2007: p.10) insist that website design should first and foremost reflect the
needs and preferences of the user, a concept referred to as user-centred web design. Pertinent
design specifications associated with this concept include ease of use, performance,
satisfaction, and content. Van Duyne et al. differentiate this design mode from
technology-centred design (a showcase of web technology and technical mastery) and designer-centred
design (prioritising the aesthetic and creative image of the website); they assert that both
technical and design aspects should be chiefly governed by user requirements. Walsh (2009)
outlines several development points and asserts that such considerations are vital to the
production of a good website. Precedence (dominance of visual material that directs the
attention of the user), clear spacing, orientation, typography (font, colour, size, paragraphing),
usability, alignment, clarity and consistency criteria emphasise the considerations that must
be made to accommodate potential users and support the assertion that accessibility and
function should be prioritised over aesthetics and technology. Responding to the above
specifications, gameresearchers.co.uk adopts a contemporary, two-tone colour scheme with
limited text, padding (clear space between elements) and a consistent style. Animated Flash
sequences are positioned to direct user attention towards the desired links and a particular
emphasis is placed upon the links to the two web experiments.
1.7 THE HORROR GAME SOUND DESIGNER (HGSD)
The intention behind this online task was to place the participant in the role of sound designer
for the development of a fictitious project. Various contextual circumstances were presented
alongside alternative sounds (in some sections, DSP variations of the same sound; in others,
entirely different sounds) and the participant was required to select the sound that they
believed best supported the relevant context. The inspiration for the design of this task is
largely accredited to the Participatory Audio Research Tool (PART), developed by the
Interactive Institute, Sweden (www.tii.se). PART utilises video to depict a situation and
contextualise the associated audio. Participants are required to manipulate the audio by
various parameter changes and then submit the processed sound they believe would most suit
the circumstances (Fagerlönn & Liljedahl, 2009; Fagerlönn, 2010). Such freedom for audio
manipulation is beyond the scope of this chapter and consequently, HGSD user input is
restricted to selection of a single sound from a collection as opposed to submitting
customised audio.
Figure 1: Screenshots for Horror Game Sound Designer (left) and Sounds of Fear (right)
Within HGSD, eight individual web pages host interactive audio-visual material and request
participant feedback. All sections require the participant to view a video clip to establish
context, then listen to a selection of audio samples and decide which sound is most
appropriate. Only one sound can be selected within each section and the participant is not
required to rank the sounds but select a single sound that they deem the most suitable within
the presented context. Each sound is embedded within an animated (Flash) button, responsive
to the mouse. Placing the mouse over a button activates the sound (alongside a visual cue to
reinforce usability) whilst leaving the button space terminates and resets the sound to remove
the potential for multiple sounds overlapping, ultimately supporting accessibility. The video
material can be controlled (play, pause, stop, tracking, audio mute and volume control) via a
standardised control panel integrated into the video boxes. This allows the participant to
activate and replay the video at their discretion, ultimately providing them ample time and
flexibility to thoroughly compare and evaluate the sounds. Browsing movement throughout
the task is restricted to limit variation between user experiences; browser navigation (back,
forward and refresh) is disabled so users can only progress via purpose-built navigation icons
embedded into the pages, five-minute viewing limits are applied to each page and
confirmation windows are activated if the participant closes the browser.
During the pretesting phase (documented in more detail below), respondents were asked to
rate the sensitivity of several personal questions to provide an approximation of which
questions were likely to increase dropout rates by raising security concerns. The results
revealed that although date of birth was rated as highly sensitive, identification of age
grouping (18-25, 26-30, etc.) rated low. Gender, marital status and nationality also rated low
whereas income and contact information rated high. Specific information relevant to the
research question (hours spent playing games per week, number of games bought in a month,
preferred gaming platform, computer specification, operating system, web browser, audio
hardware) all received low ratings, whilst over 90% of respondents revealed they would
prefer to volunteer such information rather than have it taken without their knowledge via
web technology. A username identification system, in which participants created a 7-9 digit
memorable identification tag (to separate individual datasets), replaced the original email
request in response to the pre-test findings. Following from the eight sections within HGSD a
final debrief page is presented that requests additional information. In response to pre-test
information, this page requests user age group, gender and nationality, number of hours spent
playing games per week (0, 1-5, 6-10, 11-15, 16-20, 21+) and sound hardware used during
testing (stereo speakers, stereo headphones, surround compatible speakers, surround
headphones). JavaScript processes and IP address logs are not used at any point within
gameresearchers.co.uk.
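The identification tag and the banded debrief responses described above lend themselves to simple server-side checks. The Python sketch below is illustrative only: the site itself used PHP, the function names are hypothetical, and the tag is assumed to consist purely of numeric digits, which the text does not state explicitly.

```python
import re

# Assumption: the 7-9 digit tag is purely numeric.
ID_TAG_PATTERN = re.compile(r"[0-9]{7,9}")

def is_valid_id_tag(tag):
    """True if the tag is 7 to 9 digits long and contains nothing else."""
    return ID_TAG_PATTERN.fullmatch(tag) is not None

def hours_band(hours):
    """Map whole weekly gaming hours onto the bands listed on the debrief page."""
    if hours < 0:
        raise ValueError("hours cannot be negative")
    if hours == 0:
        return "0"
    if hours >= 21:
        return "21+"
    # Bands are five hours wide and start at 1, 6, 11 and 16.
    lower = ((hours - 1) // 5) * 5 + 1
    return f"{lower}-{lower + 4}"
```

Collecting banded rather than exact values follows the pre-test finding that grouped responses were rated as less sensitive than precise personal details.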
1.8 THE SOUNDS OF FEAR (SOF)
In contrast to the Horror Game Sound Designer, SOF utilises the Self-Assessment Manikin
(SAM), a scale system that relates quantitative measurements to emotional experience by
way of graphical representations (Bradley & Lang, 1994). Although the original SAM scale
utilises a three-dimensional model (activation, valence and dominance), it is the former two
measurements that are employed within this online task. This is because the information
retrieved from SOF would ultimately serve the development of an electromyography-related
emotion measurement framework which characteristically exists within a two-dimensional
model of activation and valence (Ravaja et al., 2004; Russell, 1980). Subsequently, users
were required to rate each of the 24 sounds (12 sources, each in treated and untreated
incarnations) for both intensity (measured on an ordinal scale from 1 [low] to 9 [high]) and
valence (same ordinal scale, 1 = highly negative, 5 = no clear valence, 9 = highly positive).
With the exception of a debrief questionnaire (identical in both form and content to that
which is utilised in HGSD) that is presented after the main task, The Sounds of Fear web
experiment is contained within a single HTML page to facilitate accessibility and allow
participants to observe the test environment whilst simultaneously viewing the tutorial video.
The SOF page hosts twenty-four individual audio files, embedded within interactive Flash
buttons that activate the sound when the mouse is positioned over the button and reset the
sound if the mouse leaves the button proximity (much the same way as in HGSD). Adjacent
to each sound button are two dropdown menus requiring a numerical (1-9) selection relative
to both activation and valence. Spry Validation tools (established in Dreamweaver) ensure
that all required fields are completed before information can be submitted, presenting a
warning to the participant if a section is incomplete. Copies of the SAM activation and
valence image are presented both above and below the sound buttons, allowing participants to
easily refer to the scale whilst listening to the audio. Limited browser controls and the
username identification procedure employed within HGSD are also present in SOF.
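The completeness check performed on the SOF page can be sketched as follows. The website relied on Spry Validation within Dreamweaver; the Python fragment below only illustrates the equivalent logic, and the data layout and field names are hypothetical.

```python
SOUND_COUNT = 24      # 12 sources, each in treated and untreated form
SCALE = range(1, 10)  # SAM ratings run from 1 to 9 inclusive

def find_incomplete(ratings):
    """Return (sound_index, field) pairs that are missing or out of range.

    ratings: dict mapping sound index (1..24) to a dict with
             'activation' and 'valence' entries (hypothetical field names).
    """
    problems = []
    for sound in range(1, SOUND_COUNT + 1):
        entry = ratings.get(sound, {})
        for field in ("activation", "valence"):
            if entry.get(field) not in SCALE:
                problems.append((sound, field))
    return problems
```

A submission would be accepted only when the returned list is empty; otherwise the listed fields could be highlighted in a warning to the participant.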
1.9 SOUND DESIGN
The audio samples embedded within both online tasks were sourced from pre-existing game
titles: Half-Life 2 (Valve, 2004), Doom 3 (id Software, 2004), and Amnesia: The Dark
Descent (Frictional Games, 2010). Prior to uploading, all sound files were treated using
Cubase 5.1 (Steinberg), a digital audio workstation software title hosting various third-party
plugins to support audio editing. All sounds were compressed via Cubase into CD quality,
MP3 format (256kbps, 16 bit resolution, 44.1 kHz) stereo samples averaging around 500
kilobytes of required disk space per minute of audio by way of a 5:1 compression ratio.
Figures 2 and 3 document all sounds embedded within both task webpages; outlining the
audio variations, parameter settings and processing details.
The Sounds of Fear assessed the affective participant responses to variations in six
preselected sound parameters: 3D positioning, anticipation period, attack, loudness, pitch and
sharpness. Each of the six sound variations was tested twice (to support the assertion that a
particular processing effect would alter emotional response in the same way, irrespective of
the source sound) and both a treated and an untreated version were presented, generating a total
of twenty-four sounds. 3D positioning is also referred to as localisation; a psychoacoustic
perception of the location of an audio source, gained from both acoustic and environmentally
sourced information (Grimshaw, 2009). Both the ease of identifying a source and the
positioning/movement of a sound have been associated with manipulation of the fear
response (Ekman & Kajastila, 2009; Bach et al., 2009). 3D positioning of audio samples was
achieved using a 5.1 surround processor within Cubase. Untreated audio variations were
equally balanced across all channels to centralise the location, whilst the treated variation
manipulated the surround output to simulate movement. In one instance the sound was
panned from front-right to rear-left and in the other, from rear-left to front-right (ensuring that
the effect would be almost identical between surround and stereo outputs). Attack refers to
the difference (in time) between the onset of a sound and the initial intensity peak. Research
relating to this parameter suggests that short attack periods (sudden, immediately intense)
potentiate greater startle responses that are likely to be interpreted as frightening events. In
contrast, long attack periods slowly introduce a sound to the listener thereby greatly reducing
startle potential (Parker & Heerema, 2007). Horror-themed sounds with an attack of less than
100ms were selected as untreated audio variations and a volume envelope within Cubase was
employed to overlay a fade-in effect, effectively increasing the attack value.
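The effect of the fade-in treatment on attack can be illustrated with a minimal Python sketch. It assumes a mono signal held as a list of floating-point samples and a linear ramp; the envelope shape and lengths actually applied within Cubase are not specified here, so the function names and parameter values are hypothetical. For simplicity, the onset of the sound is taken to be the first sample.

```python
def apply_fade_in(samples, sample_rate, fade_ms):
    """Scale the opening of a mono signal by a linear ramp from 0 to 1.

    samples: list of floats in the range -1.0..1.0
    fade_ms: ramp length in milliseconds (a hypothetical parameter)
    """
    fade_len = min(len(samples), int(sample_rate * fade_ms / 1000))
    out = list(samples)
    for i in range(fade_len):
        out[i] = samples[i] * (i / fade_len)
    return out

def attack_time_ms(samples, sample_rate):
    """Time from the first sample to the first occurrence of the absolute peak."""
    peak = max(abs(s) for s in samples)
    first_peak = next(i for i, s in enumerate(samples) if abs(s) == peak)
    return 1000 * first_peak / sample_rate
```

Applied to a sound that reaches full intensity immediately, a 200 ms ramp moves the first intensity peak to roughly 200 ms after onset, lengthening the attack beyond the 100 ms threshold used to select the untreated sounds.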
Anticipation period refers to the temporal distance between the onset of an emotional priming
cue and a stimulus. For the purposes of the SOF task, both cues are presented within a single
audio sample, commencing with a priming sound of predetermined length immediately
followed by a startle cue. Relevant research has revealed that such pre-startle stimuli have
considerable potential to augment a subsequent startle probe (Bradley et al., 2005) by raising
our alert level and preparing us for immediate and direct response behaviour (Smith, 1999).
Preliminary testing of various priming sounds supported the selection of two comparatively
different ambient sounds (a high-pitched, continuous alarm and a reverberated water drip),
both of which could be looped to create samples of any required length. Cubase was used to
loop and blend the audio priming samples, generating two variations (one a 5 second prime,
the other a 20 second prime) for both sounds.
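The construction of a primed sample (a looped ambient sound of fixed duration followed immediately by a startle cue) can be sketched in Python as below. Audio is represented as plain lists of mono samples and the blending applied at loop points is omitted; both simplifications, along with the function and parameter names, are assumptions made for brevity.

```python
def build_primed_sample(prime_loop, startle, sample_rate, prime_seconds):
    """Repeat prime_loop until it fills prime_seconds, then append startle.

    prime_loop, startle: lists of mono samples
    prime_seconds: required priming duration (5 or 20 in the SOF task)
    """
    if not prime_loop:
        raise ValueError("prime_loop must contain at least one sample")
    target = int(sample_rate * prime_seconds)
    repeats = -(-target // len(prime_loop))  # ceiling division
    primed = (prime_loop * repeats)[:target]
    return primed + list(startle)
```

Calling the function twice with the same loop and startle cue, once with a 5 second and once with a 20 second duration, yields a pair of samples that differ only in anticipation period.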
As previously referred to in chapter 3, Parker and Heerema (2007) suggest that evolutionary
development may have instilled instinctive fear responses to extreme pitches. Low frequency
audio may encourage a fear response by way of association with predator growling, whilst
comparatively, high-frequency sounds may evoke the same response by way of instinctive
connotations to human screams. The Sound Shifter P plugin (Waves) manipulated the pitch of
the treated audio samples, creating the most achievable difference between treated and
untreated variations without significantly distorting the sample. Research by Cho et al. (2001)
revealed that increased loudness and sharpness (higher frequency and purer tone) both have
the potential to produce discomfort and negative affect for the listener. This concept has not
yet been directly related to the emotional experience of fear. However, it is possible that
within a horror context, an increasing discomfort induced by audio could potentially be
perceived by the listener as a more catalytic emotional induction. A simple digital signal
boost within the Cubase native mixer was implemented to increase the decibel level enough
to create a significant difference, without distorting the sample or generating a volume level
that would be highly uncomfortable for the listener. The Q10 parametric equaliser plugin
(Waves) was used within Cubase to boost a high frequency channel (2 kHz) whilst
attenuating the overall volume to emphasise the tonal sharpness whilst ensuring that loudness
could not be a simultaneously contributing factor.
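The loudness treatment amounts to a gain change expressed in decibels, applied with care to avoid clipping. The Python sketch below shows the standard conversion from decibels to a linear amplitude multiplier together with a simple clipping check; the function names are hypothetical and the decibel values actually used in the treatment are not stated in the text.

```python
def db_to_gain(db):
    """Convert a decibel change into a linear amplitude multiplier."""
    return 10 ** (db / 20)

def boost(samples, db):
    """Apply a gain change and report whether any sample would clip.

    Samples are floats with full scale at +/-1.0.
    """
    gain = db_to_gain(db)
    boosted = [s * gain for s in samples]
    clipped = any(abs(s) > 1.0 for s in boosted)
    return boosted, clipped
```

A boost of 6 dB roughly doubles the amplitude, so a sample peaking above half of full scale would clip under that treatment and would require a smaller boost.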
Audio treatment in the Horror Game Sound Designer incorporates greater numbers of
variations per audio parameter; between three and five in comparison to the two variations
presented in SOF. Increasing the number of variations per parameter enabled greater detail in
the assessment at the potential risk of generating alternative sounds that participants would be
unable to differentiate between. The digital processing of sounds was accomplished using the
same equipment and techniques as the sounds utilised for the SOF task.
Figure 2: Audio variations presented in the Horror Game Sound Designer task
Figure 3: Outline of audio variations assessed in the Sounds of Fear task
1.10 PRETESTING AND PARTICIPANT RECRUITMENT
In addition to gathering qualitative data regarding potential participants' opinions of privacy
and identity issues, both the online tasks and hosting website were rigorously assessed for
compatibility, speed, accessibility, navigation and aesthetic quality. All pages within
gameresearchers.co.uk were tested for variances across several operating systems,
connection types/speeds and web browsers. The configurations that produced no noticeable
variation in comparison to the offline version were detailed on the website and participants
were asked to only continue if their system was compatible. Both the PHP scripting
(employed to transmit data from the site to a designated email address) and Flash content
proved compatible with no significant performance variation across all pre-test computers.
Provided the internet connection was broadband, specific types including cable, Asymmetric
Digital Subscriber Line (ADSL), Symmetric Digital Subscriber Line (SDSL) and Local Loop
Unbundling (LLU) revealed no noticeable differences in performance or compatibility (this
includes both wired and network variations). Interaction with the online tasks whilst other
internet activities were occurring via the same connection did however diminish the browser
speed of both experiments and consequently, participants were requested to close all other
internet functions (including other users accessing the same connection from another
location) during completion of the test. Sample sizes varied between HGSD (N = 38, male =
25, female = 13) and SOF (N = 28, male = 16, female = 12) despite multiple requests within
the website for participants to complete both tests.
1.11 OFFLINE TESTING
In recognition of the importance of cross-validating results using alternative research
modalities (Cho & LaRose, 1999), an offline version of the entire website was constructed.
Every aspect of this version was identical to the online website, with the single exception
that submitted participant data was transferred directly to a log file rather than an email
account. Participants (N=6, 1 female and 5 male – completed both experiments) were all
students from a college in North West England who met the same filtration criteria as that set
in the online experiment. Participants navigated the site and completed the tasks using an
Apple MacBook (2.4 GHz Intel Core 2 Duo processor, 2 GB RAM) running at 1280x800
visual resolution and Triton AX pro 7.1 surround headphones. The test environment consisted
of an indistinct, small classroom space (low artificial light and only an internal observation
window) containing a single desk and chair. Participants were given two minutes to navigate
around the website with a researcher available to address any concerns. The researcher
remained present during the tutorial video, and each participant was asked if they had any
further questions before providing written consent and beginning the experiments (during
which the researcher left the room, returning between tests and for the debrief). The
completion time for setup, both tasks and debrief averaged 10 minutes 26 seconds (mean).
EXPERIMENT 2:
2.1 REAL TIME VALUE OF PRESELECTED SOUND PARAMETERS DURING GAMEPLAY
Although revealing significant difference in qualitative affect between DSP audio treatment
groups in a controlled, audio-only, environment does provide invaluable evidence relevant to
the thesis, it must be acknowledged that the results documented within the previous section
cannot be automatically applied to a computer video game context. Within such
circumstances an array of additional complex stimuli are presented to the player alongside
additional goals and motivations. The presented sounds when experienced during gameplay
are susceptible to various contextualising elements, from synchresis with visual information
to a player motivation to impress the researcher. Bearing this in mind, the following sections
document the second of three preliminary trials, an assessment of players' subjective
self-report of fear intensity in response to varied audio treatment groups (specifically pitch,
loudness and localisation) within an interactive gameplay environment. The initial hypothesis
for Experiment 2 reflects that of Experiment 1, asserting that a significant difference will be
observable between DSP treated sound and control groups. In addition, it is expected that the
real-time qualitative responses will not significantly interrupt gameplay and will correlate
with players' post-gameplay analysis of their emotional state.
2.1 PRELIMINARY TESTING
Table 1 represents a range of sound properties and effects organized according to their
objective/subjective parameters, their potential (based on the literature review) for inducing
different types of fear and the ability of a game’s procedural audio engine to manipulate them.
Several preliminary trials were conducted using the same game level and selection of sounds
used for the experiment described below. These preliminary trials utilized the same procedure
that is outlined in section 2.6 below and similar equipment was used, but data was collected
entirely using debrief self-report. In these trials, participants were asked to complete a
modified version of the questionnaire that was to be used in the main experiment. 20
individuals participated and 3D positioning, distortion, chorus/modulation, equalization,
loudness, reverberation, stereo panning, ADSR, dissonance, and pitch were selected as
treatments. Each treatment was applied to a separate sound and players compared the original
to the treatment once in each trial. Mean participant results revealed 3D positioning
(particularly sound coming from a sharp left or right), pitch (particularly high-pitched sound)
and loudness (specifically greater relative loudness) to be notably effective in increasing
participants’ perceived intensity ratings. These three treatments were consequently selected
for the experiment and are shown in red in Figure 1.
Figure 1: Potential affective properties of sound with those currently
under investigation highlighted in red
2.2 PREPARATION OF SOUNDS
The 5 sounds utilized in the experiment were all taken from the Source engine originally
created for Half-Life 2 (Valve, 2004). In its untreated state, each sound is presented as a
single monophonic channel. In addition to the 5 test sounds, avatar footstep and vegetation
rustling sounds can also be heard during gameplay.
2.3 GAME LEVEL DESIGN
A bespoke game level was judged to be the most appropriate choice of presentation medium
for the sounds. Because the specific interest of this research is to develop the audio within a
survival horror computer game, contextualization is key to producing results
with ecological validity. Whilst this method could allow several non-sonic variables
(particularly audio/visual synchresis and gameplay-related emotional experience) to impact
upon the results, it should be acknowledged that any correlation/data patterns drawn from this
experimentation must be observed within the context of a computer video game as this is the
only environment in which the research aims to apply gained knowledge.
Figure 2: Overhead view of test level
The custom level was built using the unmodified CryEngine 2 (Crytek, 2007) game engine
and sandbox level editor. Although the game engine supports third-person and first-person
perspective play, research suggests that a first-person display can increase the sense of
urgency and immersion (Calleja, 2007). The avatar is not completely absent however, and (in
traditional first-person shooter (FPS) style) visible forearms, hands and a pistol are
outstretched into the virtual world. The level was non-linear with no suggested direction and
could be completed by reaching one of three evacuation points. To achieve the desired
aesthetic and encourage any negative player valence to be fear-related, certain survival horror
conventions were utilized, including a night-time setting and a dense forest environment.
Near-zero visibility without the aid of a flashlight restricts the field of vision (Perron, 2004),
creating large volumes of ‘blind space’ (Bonitza, 1982). In keeping with not only survival
horror convention but also traditional FPS formats, the player was pursued during the level
by an unknown creature which facilitates the hunter and hunted principle (Grimshaw &
Schott, 2008). This creature was, however, only implied through the narrative in the level
introduction and the sounds heard during gameplay.
Control layout was addressed to support gameplay accessibility and increase the chance of
participants using tacit knowledge to control their avatar and keeping their focus more
explicitly on the sound, graphics and atmosphere (Cunningham, Grout & Hebblewhite,
2006). The default controls followed the standard setup found on most FPS games and the
participant was given the opportunity to customize the controls before playing. The audio for
the level differs depending on the level type. Type 1 used untreated audio whilst types 2, 3
and 4 used treated audio (pitch shift, 3D and loudness respectively). All level types housed
the same group of 5 source sounds (a distant zombie call, a nearby twig snap, a woman’s
scream, a monster’s attack scream and a sudden distorted monster scream) activated by a
series of proximity triggers built in concentric circles. Regardless of gameplay, a minimum of
5 seconds of silence was guaranteed between sounds. All sound points were fixed and always
produced the same sound (not accounting for treatment variations). All sounds within a
specified type were treated with equal parameter settings of the same DSP process. This
treatment was one of the following: 3D (binaural processing placing the sound to the right
side of the player), loudness (an intensity increase of 25 dB), or pitch (a 300 cent rise in pitch
compared to the untreated sound). Given that this is a preliminary experiment to assess which
factors, easily processed by a game audio engine, might be most emotive in the survival
horror game context, such differences were designed to be noticed without being too obvious
– the fine-tuning is for later experiments.
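
As a worked illustration of the magnitudes involved, the two numeric treatments above can be converted to linear ratios using the standard definitions of the cent and the decibel. This is a minimal sketch: the function names are illustrative and do not come from the game engine or the DSP tools used in the study, and the decibel conversion assumes an amplitude (rather than power) ratio.

```python
def cents_to_frequency_ratio(cents: float) -> float:
    """Frequency ratio for a pitch shift in cents (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)


def db_to_amplitude_ratio(gain_db: float) -> float:
    """Linear amplitude ratio for a gain in decibels (amplitude convention)."""
    return 10.0 ** (gain_db / 20.0)


pitch_ratio = cents_to_frequency_ratio(300)   # three equal-tempered semitones
loudness_ratio = db_to_amplitude_ratio(25)

print(f"300 cents -> frequency x{pitch_ratio:.3f}")   # x1.189
print(f"25 dB     -> amplitude x{loudness_ratio:.2f}")  # x17.78
```

Under these assumptions, the pitch treatment raises every frequency component by roughly 19%, while the loudness treatment multiplies signal amplitude by roughly 18.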
2.4 ENVIRONMENT AND GAME EQUIPMENT
The game level ran on a bespoke 32-bit PC with Windows Vista (Service Pack 2) operating
system, AMD Phenom 2 (3.2GHz) quad core processor, 8GB RAM, ATI Radeon 4850
(1.5GB) GPU. At time of writing, this is a mid-level gaming specification PC able to run
most new release games at medium/high settings. The PC monitor was an LG 22-inch LCD
screen, supporting the game level’s 1920x1080 (full HD) graphics resolution. This
configuration was designed to resemble a typical consumer home setup that was powerful
enough to run a game representative of current gaming technology, whilst avoiding an elite
specification that would be likely to exclude the majority of the casual gaming community.
The testing was executed in a small studio space, providing natural light and a glass partition
window through which participants could be observed without disruption.
The sound was processed and reproduced via an Asus Xonar 7.1 sound card and Tritton AX
Pro 7.1 headphones. It has been suggested that the choice of headphones or speakers could
be a significant contextual variable (Cox, 2008), particularly in terms of localization and
immersion (Grimshaw, 2007), as well as impact. In a comparable study, Murphy and Pitt (2001)
show a preference towards headphone use, arguing that it ‘enables the designer to incorporate
more complex sound objects whose subtleties will not be lost due to background noise,
speaker cross-talk, etc.’. The nature of headphones (namely speakers very close to the ear,
attenuating background noise and limiting acoustic effects that may distort/alter the sound
between the speaker and the ear) suggests that they are more likely to produce a more
immersive experience, and the commercial availability of a range of headphones (many
specifically designed for computer video games) suggests that headphone use is common
within a player’s typical environment.
2.5 PARTICIPANTS
Similar experiments in related fields of study reveal a large range of participant numbers,
with smaller numbers ranging from 15-25 and larger numbers reaching 100. Although
practical constraints for this experiment set the participant number at 12, a number of relevant
published experiments reveal that statistical significance is possible with relatively small
participant numbers (Moffat & Kiegler, 2006; Nacke & Lindley, 2008; Ullsperger et
al., 2007). The 12 participants each experienced a different order of the 4 level variations
(untreated audio, pitch shift, 3D surround and loudness increase). This structure was
implemented to reduce order effects which have been identified as a further possible cause of
bias (Nacke & Lindley, 2008). All participants were students or recent university graduates
aged between 18 and 55, 9 male and 3 female. Participants were asked for their gender, age,
ethnic background and game playing proficiency. Each participant was also required to
complete an ethics form that informed them of the horror themes of the game and also asked
them to document any visual or hearing impairment and sign a disclaimer stating that they
consented to the information they would provide being used for the purposes of the
experiment. No sensitive personal information was collected.
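
The counterbalancing described above (a different presentation order for each of the 12 participants) can be sketched as follows. The thesis does not state how the orders were chosen, so random sampling without replacement from the 24 possible orders is an assumption made purely for illustration; a Latin square design would be an equally plausible alternative. The names `LEVEL_TYPES` and `assign_orders` are hypothetical.

```python
import itertools
import random

LEVEL_TYPES = ("untreated", "pitch shift", "3D", "loudness")


def assign_orders(n_participants: int, seed: int = 0) -> list:
    """Return one distinct presentation order per participant."""
    all_orders = list(itertools.permutations(LEVEL_TYPES))  # 4! = 24 orders
    if n_participants > len(all_orders):
        raise ValueError("not enough distinct orders for every participant")
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    return rng.sample(all_orders, n_participants)


orders = assign_orders(12)
for participant, order in enumerate(orders, start=1):
    print(participant, " -> ".join(order))
```

Because only 12 of the 24 possible orders are used, this approach reduces order effects without fully balancing them.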
2.6 PROCEDURE
Before playing, each participant was given a brief detailing the exact procedure, along with
game instructions and control information. Participants were aware that they needed to rate
the emotional impact of a sound, but not that fear (or negative valence) was under
investigation. Participants were required to provide their own single word descriptors to
illustrate the emotion they perceived, thereby not biasing subjective response towards fear.
The game level took between 50 and 140 seconds to complete and each participant played 3
variations which, including the brief and debrief time, set the total typical completion time at
10 minutes for 1 audio property. Testing four separate treatments in a single sitting would
take approximately 55 minutes (allowing 5 minute breaks between each treatment test). The
debriefing questionnaire required immediate response after each play-through, followed by a
more detailed set of questions to be answered after the last level was completed.
2.7 DATA COLLECTION
Moffat and Kiegler (2006) argue that, although ‘physiological measurements […] can be
valuable in helping to read the emotional state of game players’, the links between emotion
and physiological response are currently unreliable and psychophysiological data collection
alone cannot provide a complete account of a participant’s emotional state. An overview of
psychophysiology suggests that quantitative response measurement (heart-rate, galvanic skin
response, electromyography, etc.) is capable of providing accurate emotional valence and
intensity data (Cacioppo, Tassinary & Berntson, 2000), but cannot distinguish between
different emotional states of the same valence. Research has attempted to counter this
problem via near-simultaneous collection and correlation of objective physiological response
and subjective player responses (Bach et al., 2009; Grimshaw, Lindley & Nacke, 2008).
Cacioppo, Tassinary and Berntson (2000) admit that ‘specific types of measurement of
different physiological responses…are not by themselves reliable indicators of well-characterized feelings’, suggesting that empirical data must be cross-examined alongside
additional data sources. To provide supporting data, direct participant opinions were collected
using a real-time vocal response system. A software-based digital audio workstation (Pro
Tools LE 8) synchronized to the game engine recorded participants’ vocalized input via the
integrated headset microphone whilst they played the game. The initial participant mandate
requests that the player rates the ‘emotional impact’ of each sound heard using a specified
scale and then communicates that score vocally. During gameplay (all types) a visual prompt
[1-2-3-4-5] appeared on the screen for 2 seconds immediately after a key-sound was
triggered. The headphone setup (section 2.4) recorded the vocal responses via a microphone
integrated into the headset. Audio data was recorded as a separate channel and synchronized
to a video recording of the in-game performance. The exact time of the vocal response was
recorded as text data in the game event log. The rationale for this approach comes from two
concepts: memory and flow. Rugg and Petre (2007) argue that a participant’s explicit
knowledge regarding information they have recently received is stored in short-term memory
(STM), requiring rehearsal or meaningful association to migrate towards long-term memory
(LTM). Whilst it may be a fair assumption that an intense emotional response could facilitate
such a memory translation there are no guarantees, and the possibility that participants could
forget a number of the sounds by the end of the level presents a genuine risk. Conversely,
requesting that the participant break from the game to respond immediately to each stimulus
severely diminishes the potential for flow. Flow is central to attention and, consequently, a
break in flow negates immersion (Brown & Cairns, 2004; Csikszentmihalyi, 1990).
Because the intention is to evaluate sounds’ emotive potential within a game environment it
is vital, in order to achieve contextual validity, that the player feels that they are playing a
game. Real-time vocalization is an attempt to find a middle-ground between the two
extremes. The number of audio samples used obeys Miller’s law (Miller, 1956) and subjects
were asked to rate each sound using a 5 point scale (1=least emotive, 5=most emotive), in
keeping with recommendations suggested by recent research (Gillham, 2000) and
experimentation (Steele & Chon, 2007): subjects spoke or shouted the appropriate number
while playing in response to the visual prompt [1-2-3-4-5]. Freeze, fight and flight are
response actions associated with fear-inducing stimuli (Bracha, 2004). Perron (2005) argues
that such response actions can be applied to the experience of fear in a computer video game.
Reversing this, an analysis of player action and performance might reveal insight into their
emotional state. Case and Wolfson (2000) suggest that emotional arousal can greatly impact
upon performance: ‘[W]hen highly aroused, people tend to be faster but less accurate, and
they focus mainly on the most salient aspects of a task’. Currently there is no existing
framework correlating player performance and fear-related arousal; a broad analysis of
performance may help further support the other data (user-input, psychophysiological
response) and provide a launching pad for further study within this specific area. To this
purpose, Fraps (v.3.2, 2010) real-time video capture software was implemented to provide a
complete visual recording of each participant’s actions within the game.
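
The matching of logged vocal responses to key-sound triggers can be sketched as below. This is an illustrative assumption about how the two timestamp streams might be paired, not the procedure actually used in the study: each trigger is paired with the first rating spoken within a fixed window, and the 5 second default simply mirrors the minimum silence guaranteed between sounds. All names and log values are hypothetical.

```python
from typing import List, Optional, Tuple


def pair_responses(
    trigger_times: List[float],
    responses: List[Tuple[float, int]],
    window_s: float = 5.0,
) -> List[Optional[int]]:
    """For each sound trigger, return the first rating spoken within window_s.

    trigger_times: onset time (seconds) of each key sound, ascending.
    responses: (time in seconds, rating 1-5) per vocal response, ascending.
    A trigger with no response inside its window is paired with None.
    """
    paired: List[Optional[int]] = []
    next_response = 0
    for onset in trigger_times:
        # Skip responses that occurred before this sound was triggered.
        while (next_response < len(responses)
               and responses[next_response][0] < onset):
            next_response += 1
        if (next_response < len(responses)
                and responses[next_response][0] - onset <= window_s):
            paired.append(responses[next_response][1])
            next_response += 1
        else:
            paired.append(None)
    return paired


# Hypothetical log values, for illustration only.
triggers = [12.0, 30.5, 47.2]
spoken = [(13.1, 4), (48.0, 2)]
print(pair_responses(triggers, spoken))  # [4, None, 2]
```

Recording a missing response explicitly as `None`, rather than dropping the trigger, keeps the paired list aligned with the fixed sequence of sound points.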
The debrief questionnaire requested only explicit knowledge from participants and was in 3
sections. Section A required participants to provide individual words that they felt reflected
the atmosphere of the game and the emotional content of the sounds and then to rate the
perceived ‘scariness’ and difficulty of the level overall using 5 point ordinal scales. This
section was answered immediately after the player had completed each game level and was
repeated for each play through. Section B requested each participant to rank the 3 levels in
order of perceived ‘scariness’ and provide quality control information regarding sense of
immersion, flow and general game experience using the same 5 point scale system. Section C
asked the participants to state how often they played computer video games and if they
suffered from any visual or hearing impairment. They were also asked their age, gender,
nationality and country of origin. Subjects' participation and the data collection were
conducted in accordance with Aalborg University's Research Ethics Framework.
EXPERIMENT 3:
REAL TIME BIOMETRIC FEAR ASSESSMENT OF GAME SOUND
The preliminary experimentation documented within this section takes influence from the
previous biometrics discussion and advances upon results and conclusions obtained from the
previous two trials. Electrodermal activity (EDA) and electromyography (EMG) signal data
is collected from two groups of participants, both playing a bespoke game level. The design
of both the control (group A) and test game (group B) levels is identical with the exception of
DSP sound treatments that overlay particular sound events in the test group. Acoustic
treatments (pitch shift, sharpness, tempo, distortion, attack time, localisation and signal to
noise ratio) are then compared to control datasets in a search for significant difference in
arousal, corrugator supercilii activity and qualitative post-play feedback. It is hypothesised
that both EDA and EMG measures will reveal significantly different datasets between groups.
It is also expected that the physiological data will reflect subjective responses presented by
participants in the debrief.
3.1 BESPOKE GAME DESIGN
To enable effective comparison of the desired audio variables (outlined in section 1.2), a
bespoke first-person perspective game level was developed, entitled The Carrier. This game
places the player in the dark bowels of a sinking ship, with a race against time to reach the
surface. The presence of a dangerous creature is alluded to via scripted animation sequences
within the gameplay, and the intention is for the player to feel that they are being hunted. The
level was produced primarily using the CryEngine 2 sandbox editor (Crytek, 2007) and all
in-game graphical objects, characters and particle effects are taken from the associated game,
Crysis. The game level designs follow a sequence of prescribed events designed to subtly
manipulate the player’s actions. Plausible physical barriers, disabling of the run and jump
functions and a logical progression of game scenes restrict the player to following a more
uniform direction and pacing. These constraints are complemented by the reduced visibility
settings, which provide plausibly restricted vision and movement to encourage (rather than
force) players to follow the desired linear path. Graphical elements orientate and direct the
player and invisible walls are utilised where (absolutely) necessary to avoid players straying
or accidentally becoming locked between objects. Ambient atmospheres and sound events of
indeterminate diegetic status, positioned in the darkness, further the perception of a larger,
open world to add some credibility and realism to the game environment, despite its notably
linear design.
As the player progresses through the game level they are subjected to several crafted in-game events utilising sound as the primary tool for evoking fear. During these events, user
control is sometimes manipulated to ensure that player focus can be directed appropriately
(this takes the form of forcing the player-view towards an event and then freezing the
controls for a short time). The decision to use this technique is arguably a point of contention
between first-person shooter titles. For example, Half-Life 2 (Valve, 2004) was recognised for
never manipulating the player’s perspective during single events, whilst Doom 3 (id
Software, 2004) takes full control, manipulating the camera angle to create a cut-scene effect.
The former title prioritises flow and consistent diegetic narrative at the risk of the player
missing parts of (or even the entire) event, whilst the latter places a precedence upon
accentuating the scene and creating a filmic style with the possibility of reducing gameplay cohesion and immersion. Other games attempt to present a compromise, such as in Crysis 2
(Crytek, 2011) where the player is presented with an icon that indicates a significant event is
occurring (nearby building collapsing, alien ship passing overhead) and if the player selects
that option their viewing perspective is automatically manipulated to best observe the event.
The custom game level built for this experiment therefore is intended as a compromise,
ensuring that the player will fully observe the stimuli whilst minimising the disruption to
flow. The manipulations themselves are relatively subtle, and occur only three times within
the game.
The opening scene of the level presents the premise and endeavours to create an initial sense
of familiarity and security via recognisable architecture and everyday props. This atmosphere
is juxtaposed against a dark and solitary environment to create a sense of unease from within
the familiar. Subsequent scenes utilise conventional survival horror environments whilst
implied supernatural elements and scenarios also draw heavily from archetypal horror
themes. First-person perspective is retained but the customary FPS heads-up display and
weapon wielding is omitted, giving the player no indication of avatar health or damage
resistance, and also removing the traditional ordnance that increases player coping ability and
diminishes vulnerability-related fears. The avatar has no explicit appearance, character or
gender and is anchored into the gameplay via physics-generated audio (footsteps, rustling of
vegetation, interactions with objects, etc.) and the avatar’s shadow. The player is required to
navigate the level and complete basic puzzles to succeed. Unbeknownst to the player, their
avatar cannot be killed or suffer damage; this ensures that load/save elements are unnecessary
and that no player will repeat any section of gameplay, thus further unifying the collective
experiences of all participants.
3.2 SOUND DESIGN
The CryEngine 2 (Crytek, 2007) integrates the FMOD (Firelight Technologies, 2007) game
audio management tool and consequently provides advanced audio development tools
including occlusion simulation, volume control, three-dimensional sound positioning and
physically based reverb generation. These features allow custom sounds to be easily
incorporated into the game and controlled without the need for third-party DSP plugins or
resource-costly audio databases. Unfortunately, the engine has understandable limitations and
processing modalities such as attack envelopes and pitch shifting cannot be achieved with the
same level of precision as could be achieved with a professional digital audio workstation. To
this end, the custom sounds were pre-treated and separate sound files were generated for both
variations of each key sound. For the purpose of this experiment, the 7 modalities (attack
[AT], distortion [DN], localisation [LN], pitch [PT], sharpness [SH], signal to noise ratio
[SNR] and tempo [TP]) generated 12 key sounds (two sounds for each modality – to support
the argument that if a DSP effect were to generate a significant difference, this would be
observable when tested on two different sounds). Due to time limitations and gameplay
restraints, SNR and TP parameters could only be tested once per game type. Two variations
of each sound were developed (group A the control and group B the treatment) as contrasting
extremes of each modality, producing a total database of 24 audio files per game. Figure 1
outlines the use of audio employed throughout both test levels.
Figure 1: Custom Audio Databases, Variables and Parameter Details

Sound Name        DSP modality            Control (group A)         Variant (group B)
Diegetic Music    Distortion              No additional DSP         Frequency distortion
Ship Voice        Distortion              No additional DSP         Frequency distortion
Heavy Breath      Localisation            Centralised               Left to right sweep
Monster Scream    Localisation            Centralised               Full left pan
Woman Screams     Pitch                   No additional DSP         300 cent pitch raise
Ship Groans       Pitch                   No additional DSP         500 cent pitch drop
Man Screaming     Sharpness               No additional DSP         12 dB gain @ 1.7kHz with 1 octave of bandwidth (7.0 dB gain reduction)
Man Weeping       Sharpness               No additional DSP         12 dB gain @ 5kHz with 1 octave of bandwidth (4.0 dB gain reduction)
Chamber banging   Attack                  2 second linear fade-in   0 second attack
Monster Growl     Attack                  1 second linear fade-in   0 second attack
Bulkhead Slams    Tempo                   20 BPM                    30 BPM
Engine Noise      Signal to noise ratio   No noise present          Noise present
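
To make the attack contrast in Figure 1 concrete, the following minimal sketch applies a linear fade-in to a list of samples. It is an illustration only: the study pre-treated its sounds in a digital audio workstation, and the function name, the constant test signal and the deliberately tiny sample rate are all assumptions chosen so that the output is easy to read.

```python
def apply_linear_fade_in(samples, sample_rate, attack_s):
    """Return a copy of samples with a linear fade-in lasting attack_s seconds.

    An attack of 0 leaves the signal unchanged (immediate onset).
    """
    ramp_length = min(int(round(attack_s * sample_rate)), len(samples))
    shaped = list(samples)
    for i in range(ramp_length):
        shaped[i] *= i / ramp_length  # gain rises linearly from 0 towards 1
    return shaped


# Illustrative constant signal at a deliberately tiny sample rate.
sample_rate = 4
signal = [1.0] * 12                                        # 3 seconds of audio
control = apply_linear_fade_in(signal, sample_rate, 2.0)   # group A style
variant = apply_linear_fade_in(signal, sample_rate, 0.0)   # group B style
print(control)
print(variant)
```

The control version reaches full level only after 2 seconds, whereas the variant starts at full level, which is the difference the attack modality is intended to test.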
3.3 PILOT STUDY
In preparation for the main trial, participants (n=8) played through a beta version of the test
game whilst connected to EMG and EDA hardware. Following the trial, each participant was
debriefed and asked to disclose their opinions regarding gameplay and biometric hardware
experience. Recurring feedback from the players included orientation difficulty due to over-contrast and low brightness of graphics, difficulty in solving the puzzles and absence of
player damage/death resulting in lack of a convincing threat. Preliminary testing aided
calibration of standard decibel levels and several participants revealed difficulties operating
the control interface, notably coordination of the mouse (look) and keyboard (movement)
functions. In response, the final version operated using a simplified keyboard-only WSAD
(basic movement controls: forward, backward, left strafe and right strafe respectively) control
layout (the space bar was the only other control button, used to interact with objects), with
reduced colour saturation and increased overall brightness. There remained no player death due to
the significant variation in completion time it would cause in addition to requiring players to
revisit sections of the level. Puzzles were simplified and steps were taken to increase usability
during these sections, clarifying the correct route/action via clearer signposting. Pilot-test
biometric data revealed spikes in both EMG and EDA measures immediately after
application of the sensors and following being told that the test had started.
3.4 TESTING ENVIRONMENT AND EQUIPMENT
The game level ran on a bespoke 64-bit PC with Windows Vista Home Premium (Service
Pack 2) operating system, AMD Phenom 2 X4 955 (3.2GHz) quad core processor, 8GB RAM,
ATI Radeon 4850 (1.5GB) GPU. At time of writing, this is a mid/high level gaming
specification PC able to run most new release games at high settings. Peripheral specification
includes LG 22-inch LCD Monitor (supporting 1920x1080 output resolution), Microsoft Wireless
Desktop 3000 mouse and keyboard, Asus Xonar 7.1 sound card and Tritton AX Pro 7.1
headphones. During the pilot tests, facial expression was collected using a Technet USB
webcam and Camtasia Studio 7, which enabled the recording of in-game activity with facial
observations presented as picture-in-picture. However, it was revealed that running this
system in tandem with the game and biometric acquisition software severely reduced the
frame rate of the video recording. As a result the webcam element was removed and Fraps
(Beepa, 2007) was employed as a lighter consumer of processing resources. Biometric data
was collected using a Biopac MP30 data acquisition unit and Biopac Student Lab Pro v3.7
interface software. Experimentation was carried out in a small studio space, providing only
artificial light and attenuation of outside environment noise.
3.5 PARTICIPANTS, PROCEDURE AND ETHICS
10 participants (9 Male and 1 Female) were recruited for this experiment, none of whom was
involved in the preliminary testing. All were volunteers, undergraduate students studying
information technology or audio engineering, and were aged between 18 and 27. All
participants rated their prior experience and gaming confidence as moderate or high and
stated familiarity with FPS type games and PC standard gaming controls. Participants were in
the majority British but also present were one Portuguese and one French individual (both of
whom were fluent in English). Experience in survival-horror games revealed some variation
however, with self-report ratings ranging from 1 to 10 (1-10 scale) with a mean score of 4.7.
Participants were informed that the research aim was to explore the emotional potential of
sound within a computer video game context. Each individual was also made aware of strobe
lighting, visual images that may be perceived as frightening or upsetting and the full
biometric data collection procedure. Participants were asked to sign a disclaimer stating they
were willing to continue, were handed an instruction card outlining the game controls and
were supervised as the data collection and peripheral equipment was set up. The supervising
researcher ensured that each participant was ready and that a satisfactory EMG/EDA baseline
was displayed before leaving the testing room. Synchronisation of gameplay with both the
biometric and video recordings was achieved by mapping the respective start and stop
actions to the same key, allowing the participant to automatically synchronise the entire data
collection process with little difficulty. Upon completion of the game levels, the researcher
would re-enter the room to begin the test debriefing; including participant completion of a
brief questionnaire to assess perceived difficulty, overall experience intensity, immersion,
disruption caused by the biometric sensors, and past gaming experience. Participants were
also asked to watch the video of their gameplay performance and provide a voice-over
commentary, with focus upon their developing emotional state and identification of discrete
emotions. Average completion time for setup, trial and debrief was 45 minutes per person.
3.6 DATA COLLECTION
Electrodermal activity and electromyography hardware were configured to synchronise
with the game engine timestamp, allowing significant biometric readings to be accurately
matched with their corresponding chronological point of gameplay. The event logging system
was utilised to identify overall completion time. The Fraps (Beepa, 2007) screen capture
software was utilised to generate a video render of the participants’ performance which was
then incorporated into the test debrief, in which participants were required to observe their
gameplay, the intention being that participants would re-experience the affective states felt
during the game and be able to more accurately describe their emotions in reference to
specific game events. The test debrief replayed the video capture to the player, who was
required to describe his or her experience in real-time, generating an audio commentary that
was overlaid with the original video. Participants were asked to classify any events perceived
as emotionally relevant, describe their affective state and identify any individual sounds
perceived to be incongruous (specifically; low quality, inappropriate distortion, or unbefitting
connotations) within the gameworld. Debriefing was concluded with several closed questions
to establish variation between participants, reveal any prior game playing experience/skill,
and assess the comfort, intrusion and flow interruption of the psychophysiological hardware
using a generic 5-point scale. EDA data was collected from the right index and middle fingers
of each participant by way of a SS57L Biopac EDA sensor lead and isotonic electrode gel.
BSL shielded SS2LB leads connected to trimmed, disposable EL501 electrodes were utilised
to collect facial EMG data. Existing research warns that precise positioning of EMG
electrodes is a difficult task (Huang et al., 2004), therefore utmost care was taken to apply
the hardware to each participant. A light abrasive treatment was applied to the skin in and
around the areas on which the sensors would be placed to reduce electrode-skin impedance
(Hermens et al., 2000). Electrodes were then applied across the midline of the muscle (De
Luca, 1997) and surgical tape was used to reduce motion artefact (Huang et al., 2004).
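
Once the biometric stream shares a timebase with the game event log, a reading can be related to a specific sound event by examining a short window after the event's timestamp. The sketch below shows one way this might be done; it is not the analysis procedure used in the thesis, and the baseline and window lengths, the function name and the example trace are all hypothetical.

```python
def event_locked_response(samples, sample_rate, event_time_s,
                          baseline_s=1.0, window_s=5.0):
    """Peak signal change after an event, relative to the pre-event mean.

    samples: signal values (e.g. skin conductance) at a fixed sample_rate (Hz).
    baseline_s and window_s are illustrative defaults, not study values.
    """
    event_index = int(round(event_time_s * sample_rate))
    start = max(0, event_index - int(round(baseline_s * sample_rate)))
    end = min(len(samples), event_index + int(round(window_s * sample_rate)))
    baseline = samples[start:event_index]
    post_event = samples[event_index:end]
    if not baseline or not post_event:
        raise ValueError("event lies too close to the edge of the recording")
    baseline_mean = sum(baseline) / len(baseline)
    return max(post_event) - baseline_mean


# Hypothetical trace sampled at 2 Hz with a sound event at t = 2 s.
trace = [2.0, 2.0, 2.0, 2.0, 2.1, 2.6, 3.0, 2.8, 2.5, 2.3]
peak = event_locked_response(trace, sample_rate=2, event_time_s=2.0)
print(peak)  # 1.0
```

Subtracting a pre-event baseline is one common way of reducing the influence of slow drift in the signal; other normalisation choices are possible.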
CONCLUSIONS AND CHAPTER SUMMARY
The intention of these three experimental trials is, primarily, to assess the capacity of sound
to evoke/modulate a fearful response without distinct semantic association. In addition, the
divergence between the methodologies is intended to provide valuable information regarding
various stimuli setup and data collection techniques. The following chapter therefore
discusses not only the results based upon the primary goal, but also evaluates the
effectiveness of these three separate testing approaches to support future experiment design.
Chapter 7
Experiment Results and
Discussions
Garner, Tom A.
University of Aalborg
2012
INTRODUCTION
This chapter documents the analysed results from the three preliminary experiments,
previously detailed in chapter 6, alongside a retrospective evaluation considering both the
results and the successes/limitations of the methodologies employed. The final conclusions
extracted from these discussions are then to be correlated with the information within the
earlier, literature review chapters to collectively support the design of the final hypothetical
frameworks: an ecology of fear within a virtual environment, a fear-based classification
system for game sound, a model visualising the interactions between real and virtual acoustic
ecologies in a gameplay context and a framework for an embodied virtual acoustic ecology.
EXPERIMENT 1:
1.1 HORROR GAME SOUND DESIGNER RESULTS
Data collection for HGSD testing was retrieved from the same automated email system
utilised for the SOF experiment. The characteristics of the obtained data required primarily
non-parametric testing and consequently the asymptotic chi-squared test was employed, again
using PASW (v.18: IBM, 2009). The nominal nature of sound localisation preference first
required that the data be reclassified along a numerical scale (numbered 1-2-3-4-5, referring
to left, left to right, centre, right to left and right respectively). The same procedure was also
required for general preferred sound (numbered 1-2-3-4-5, referring to thunder, insect, train,
gunshot, and door slam) and audio output (1-2-3-4, encoded from stereo speakers, stereo
headphones, surround speakers, surround headphones respectively). Figures 2-4 below
represent the three categories of DSP treatments in which significant difference was
observed.
Figure 1: Chi-square analysis of cumulative results

Measure          χ2        df    Sig.
Attack 1         5.737     2     .057
Attack 2         24.368    2     .000
Pitch 1          .842      2     .656
Pitch 2          2.579     2     .275
Delay 1          .211      2     .900
Delay 2          .684      2     .710
Pos. 1           13.053    4     .011
Pos. 2           4.895     4     .298
Attack 1 dif.    25.158    5     .000
Attack 2 dif.    13.789    5     .017
Pitch 1 dif.     4.316     5     .505
Pitch 2 dif.     19.158    5     .002
Delay 1 dif.     8.737     5     .120
Delay 2 dif.     18.842    5     .002
Pos. 1 dif.      15.053    5     .010
Pos. 2 dif.      12.842    5     .025
Diff. Sounds     32.789    4     .000
Figure 2: Histogram representation of favoured individual sound based upon user-selection
Figure 3: User-preference regarding localisation DSP upon animal-scream sound
Figure 4: User-preference regarding attack DSP upon monster-breathing sound
Analysis reveals statistical significance for preferred general sound (diff. sounds), localisation
DSP upon an animal scream sound (position 1) and attack time upon a heavily breathing
monster sound (attack 2), revealing a significant preference towards thunder and door slam in
the first measure, towards centred and left to right moving localisation in the second and
towards a slow-building 15 second attack in the third. Qualitative difficulty ratings relating to
each preference-measure reveal significance in several places. Attack 1 and attack 2 difficulty
ratings follow a normal distribution around a central tendency, indicating that the majority
perceived the task to be 'moderately' difficult. Pitch 2 and position 1 difficulty ratings,
however, reveal a left-skewed curve, indicative of an easy task, whilst delay 2 and position 2
indicate easy/moderately easy. Accounting for alternative independent variables, the
Chi-Squared test of independence of categorical variables (Pearson Chi-Square test statistic)
analysed the effects of gender, audio output, gameplay experience and online/local variation.
The gender independent variable (IV) revealed significant difference within the pitch 2
(χ2=6.957, p=.031), position 1 dif. (χ2=12.505, p=.028) and position 2 dif. (χ2=13.150,
p=.011) measures. The online/local IV revealed statistically significant difference within the
attack 1 dif. (χ2=11.050, p=.05) measure. The hours playing games IV presented
significance within the attack 2 (χ2=19.960, p=.03) and position 1 (χ2=29.344, p=.015)
measures. The audio output IV revealed no significant effect upon any of the dependent
measures.
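As an illustration of how a chi-square statistic and its significance value relate, the following is a minimal sketch in Python rather than the PASW procedure used for the thesis. The function names and the example counts are hypothetical, and the goodness-of-fit function assumes equal expected counts across categories. The upper-tail probability uses the closed-form series that exists for integer degrees of freedom.

```python
import math


def chi2_gof_statistic(observed):
    """Chi-square goodness-of-fit statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df."""
    if x <= 0:
        return 1.0
    t = x / 2.0
    # Start from the known closed forms for df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-t)
    else:
        a, q = 0.5, math.erfc(math.sqrt(t))
    # Recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1).
    while a < df / 2.0:
        q += t ** a * math.exp(-t) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical counts for a three-option preference measure (not thesis data).
counts = [5, 5, 20]
stat = chi2_gof_statistic(counts)
print(stat, chi2_sf(stat, df=len(counts) - 1))

# Statistics reported in Figure 1 can be passed straight to chi2_sf.
print(round(chi2_sf(5.737, 2), 3))   # Attack 1
print(round(chi2_sf(13.053, 4), 3))  # Pos. 1
```

Feeding the reported Attack 1 and Pos. 1 statistics into `chi2_sf` gives values that round to the Sig. entries listed for those measures in Figure 1.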
1.2 SOUNDS OF FEAR RESULTS
The data collected from both the online and offline versions of this test were sent via an
automated email service to a designated address. Upon completion, the information was
transferred to PASW for analysis. In accordance with standard analysis procedure, the
repeated measures nature of the experiment dictated employment of paired samples t-test
analysis (two-tailed) to distinguish between treated and untreated groups across all 12 sound
sources and repeated measures ANOVA to present pairwise comparisons, testing for
significant difference between all 24 presented sounds. Both analysis measures were utilised
for intensity and valence separately. The results from both online and offline testing were
collated together to analyse the effects of DSP as the independent variable (IV) and were
separated later for direct comparison of the online/offline IV.
Figure 5: Snippet of ANOVA results. Green highlights significant difference between sources irrespective
of treatment. Red highlights circumstances in which difference is only significant with one treatment type

Intensity Measure
Sound Type (1)              Sound Type (2)                  Mean Difference (1-2)    Std. Error    Sig.
1. Footsteps - DSP          09. Voice Radio - DSP           -3.250                   .691          .019
1. Footsteps - DSP          13. Church Door - DSP           -2.143                   .465          .024
1. Footsteps - DSP          17. Woman Scream - DSP          -2.714                   .570          .016
1. Footsteps - DSP          18. Woman Scream - Untreated    -3.036                   .572          .004
1. Footsteps - DSP          20. Manhole - Untreated         -2.357                   .449          .004
1. Footsteps - DSP          23. Water Lurker - DSP          -3.036                   .596          .007
1. Footsteps - DSP          24. Water Lurker - Untreated    -3.250                   .531          .000
2. Footsteps - untreated    09. Voice Radio - DSP           -3.286                   .709          .022
2. Footsteps - untreated    12. Tree fall - Untreated       -2.321                   .434          .003
2. Footsteps - untreated    13. Church Door - DSP           -2.179                   .451          .013
2. Footsteps - untreated    17. Woman Scream - DSP          -2.750                   .481          .001
2. Footsteps - untreated    18. Woman Scream - Untreated    -3.071                   .568          .003
2. Footsteps - untreated    20. Manhole - Untreated         -2.393                   .492          .012
2. Footsteps - untreated    23. Water Lurker - DSP          -3.071                   .595          .005
2. Footsteps - untreated    24. Water Lurker - Untreated    -3.286                   .539          .000

Descriptive statistical analysis of the intensity dependent variable revealed relatively low
standard deviations between participants (minimum = 1.647, maximum = 2.78) but also small
variations in mean responses between treated and untreated groups. The paired differences
t-test revealed statistically significant difference between groups in the following sound
sources: Voice Radio (t=2.743; p=.011), Church Door Slam (t=2.698; p=.012), Manhole
Cover Scrape (t=-2.333; p=.027) and Monster Roar (t=-2.698; p=.012). Valence dependent
variable standard deviations were lower than for intensity (between 1.162 and 2.457) but
initial analysis of mean distributions suggested a similar pattern to that observed within the
intensity group.
Figure 6: Bar chart representation of intensity and valence mean results from SOF experiment
Descriptive statistics revealed a full range of valence responses (1-9) and a clear majority
preference (predominantly 3s, 4s and 5s) for most of the 24 sounds. The paired differences
t-test revealed statistically significant difference between groups for only the Zombie sound
(t=2.091, p=.046).
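The paired-samples comparison of treated and untreated versions of a sound can be sketched as follows. This is an illustrative Python outline, not the PASW output; the function name and the ratings are hypothetical, and only the t statistic and degrees of freedom are computed (the two-tailed p value would then be read from a t distribution).

```python
import math
import statistics


def paired_t(treated, untreated):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    diffs = [a - b for a, b in zip(treated, untreated)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Hypothetical 1-9 intensity ratings from four listeners (not thesis data).
treated = [5, 6, 7, 8]
untreated = [4, 4, 6, 6]
t, df = paired_t(treated, untreated)
print(round(t, 3), df)
```

Because each listener rates both versions of the same source, the test works on the per-listener differences, which is what makes it appropriate for a repeated measures design.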
Repeated measures ANOVA testing utilised Mauchly's test to check the sphericity
assumption and Greenhouse-Geisser for corrections. Multivariate tests for the intensity
measure revealed significant difference between sounds (F=32.96, p<.001). In contrast to the
result obtained from the paired samples t-test, ANOVA pairwise comparisons of the intensity
measure identified no difference of statistical significance between paired treated and
untreated sounds. However, significant difference was identified between large numbers of
individual sound types (summarised in figure 5 and visualised in figure 6). The notable
observation presented in this data is that for some of the sound sources, both variations of the
sound are significantly different to both variations of another (examples highlighted in green
in figure 5). However, some sounds only reveal significant differences between a single
variation of each paired source, suggesting that (in that particular instance) it could be the
DSP effect that generates the perceived difference in intensity. Figure 7 exemplifies this
difference, revealing that without DSP treatment (in this case, a loudness boost) the voice
radio sound cannot be statistically differentiated from any of the other 23 sounds in terms of
intensity. In comparison, with DSP treatment the same sound reveals significantly greater
intensity means than five other sounds. The same repeated measure ANOVA testing for the
valence variable revealed no statistical significance between any specific pairings of the 24
sounds, nor was any difference overall presented. Figure 6 presents both intensity and
valence measures adjacently, revealing the limited range of mean responses within the
valence group compared to the intensity group.
Finally, to compare the online to the local network datasets, a random sample of the 22 online
participants was taken to compare equivalent sized online/local samples and one-way
ANOVA was employed. Results indicated significantly greater mean responses for online
datasets in the following intensity measures: untreated voice radio (F=15.244, p=.003),
untreated woman scream (F=6.659, p=.027) and untreated water lurker (F=12.987, p=.005)
whilst pitch treated zombie (F=7.313, p=.022) revealed a greater mean intensity response for
the local dataset. With regards to valence measures: untreated footsteps (F=5.548, p=.04),
and sharpness treated tree-fall (F=9.308, p=.012) revealed greater mean response for the
online dataset whilst localisation treated footsteps (F=5.976, p=.035), untreated voice radio
(F=12.8, p=.005), untreated woman scream (F=5.74, p=.038) and untreated monster roar
(F=7.71, p=.02) all received greater mean valence responses from the local dataset.
Descriptive statistics also revealed greater range and frequency of extreme responses in the
online group. One-way ANOVA comparison of the local and online dataset ranges confirms
the difference as significant (F=7.059, p=.009).
Figure 7: Table revealing substantial effect DSP has had on differentiating source from other sounds

Sound Type (1) = 9 (Voice Radio - DSP)
Sound Type (2)    Mean Diff. (1-2)    Std. Error    Sig.
1                 3.250*              .691          .019
2                 3.286*              .709          .022
3                 2.429*              .432          .002
4                 2.179               .517          .069
5                 2.679*              .542          .010
6                 2.964*              .528          .002
7                 1.679               .669          1.000
8                 2.536               .676          .233
10                1.536               .560          1.000
11                1.500               .576          1.000
12                .964                .645          1.000
13                1.107               .578          1.000
14                2.643               .630          .073
15                1.714               .537          .979
16                1.964               .656          1.000
17                .536                .521          1.000
18                .214                .500          1.000
19                1.929               .600          .928
20                .893                .483          1.000
21                2.286               .563          .104
22                1.000               .463          1.000
23                .214                .594          1.000
24                .000                .575          1.000

Sound Type (1) = 10 (Voice Radio - untreated)
Sound Type (2)    Mean Diff. (1-2)    Std. Error    Sig.
1                 1.714               .570          1.000
2                 1.750               .548          .986
3                 .893                .557          1.000
4                 .643                .497          1.000
5                 1.143               .577          1.000
6                 1.429               .550          1.000
7                 .143                .648          1.000
8                 1.000               .686          1.000
9                 -1.536              .560          1.000
11                -.036               .541          1.000
12                -.571               .553          1.000
13                -.429               .645          1.000
14                1.107               .645          1.000
15                .179                .540          1.000
16                .429                .616          1.000
17                -1.000              .517          1.000
18                -1.321              .499          1.000
19                .393                .618          1.000
20                -.643               .502          1.000
21                .750                .650          1.000
22                -.536               .489          1.000
23                -1.321              .604          1.000
24                -1.536              .528          1.000

1.3 DISCUSSION
Results obtained during the collective analysis of SOF data revealed statistically significant
differences between paired variations of some, but not all, source sounds. Unfortunately, there
was little consistency between source sounds for single DSP effects, with the exception of
loudness, which revealed a statistically significant difference in both associated sound sources
(voice radio and church door slam). This would suggest that an increase in loudness has
greater potential to consistently impact upon the perceived intensity levels of a sound, whilst
the effectiveness of pitch and localisation (the other DSP effects to reveal significance, but only
on a single source sound) is more context-dependent. The zombie call sound (treated with a
pitch raise) was revealed to be the only significant valence-related pair, suggesting that DSP
effects are unlikely to have a significant impact upon perceived valence ratings. The
contextualisation of the zombie sound is arguably responsible for the effectiveness of pitch in
this example and the distinct raise in pitch creates both a notable distortion capable of
breaking the immersive flow and a distinctly comedic effect reminiscent of parody-horror
stylisation.
The descriptive statistical analysis indirectly presents evidence in support of the power of
contextualisation in that, despite the theme of fear/horror being descriptively prominent
throughout the web-testing, the sounds are not given clear context in terms of their source or
purpose. The perceived context of these sounds primarily relates to a neutral, relaxed and
comfortable testing environment with no inherent threat or risk. This is revealed in the
presence of multiple valence responses of 8 and 9 (high to very high positive valence) across
all test sound groups and the mean range being just off-centre (3.14 – 4.61). One-way
ANOVA testing across all SOF results revealed statistically significant differences in user
intensity-responses between many of the 24 presented sounds. Although no treated/untreated
pair differed significantly, the notable quantity of distinguishable sounds within a
controlled context certainly supports the assertion that individual sounds are capable of
generating different perceptual intensity responses. It does not, however, support the same
assertion in relation to valence.
Results obtained from the online/local comparison ANOVA test provide some insight into the
impact of inconsistencies between testing environments within this context. Both intensity
and valence responses appear to have been affected by online-based variables but again, this
is not consistent across all sound groups and no clear association is discernible between
online/local environments and treated/untreated sound sources. The absence of both treatment
groups being significantly affected by the online/local variation suggests that DSP effects
may impact upon the potential of online variations to alter intensity responses. Alongside the
comparison of range and extreme response frequency, this data analysis suggests that there is
a small but nonetheless significantly lower reliability rating for web-based experimentation.
Analysis of the HGSD data presents an expected preference towards more easily identifiable
sounds with sudden attack times and inherent horror significations. Preference towards
centred localisation is surprising, primarily because of the assumption that full left/right
would defy expectation as few sounds are presented in such a way. The data could be
questioned upon further overview of the position 2 group which reveals that although the
centred position is clearly the majority selection, if extreme left and right were combined, the
difference then would be marginal. Although the observed data differed significantly from the
Chi-Square expected result, both centred and left to right moving localisation scoring equal
values unfortunately reveals no real preference. The attack test matches the expectation of the
hypothesis, revealing a preference towards a gradually building 15 second attack, a DSP
effect that clearly alters the connotations of the sound to imply that the source is approaching.
Data regarding qualitative difficulty participants experienced in differentiating and selecting
sounds provides no conclusive evidence to support the hypothesis that increased difficulty of
task could have an erroneous impact upon participants' selections. Incorporation of the
gender, audio output and gameplay experience/history data revealed strong potential for such
factors to significantly influence dependent results and highlight them as variables that must
be actively controlled to ensure accurate and reliable data in future experiments. That the
participants‘ audio hardware did not produce significantly different data in any of the groups
is surprising; however, the preliminary nature of this experiment suggests that it should not be
discounted as a potentially erroneous factor and more definitive testing would be required
before an evidenced statement could be made.
One of the primary issues with this trial has been the lack of local participants (N=6). This
concern reduces the statistical power of the analysis but also reminds us of the inherent value
of web-mediated testing environments with regards to participant access. Other limitations lie
within the nominal and ordinal nature of most DV groups. Future experimentation should
undoubtedly explore pitch, attack and preceding silence period via ratio scales and compare
larger numbers of alternative parameters than are presented here. Localisation could also be
extended by comparing dependent measures in degrees within a surround sound environment
to explore both static and dynamic localisation effects.
1.4 CONCLUSIONS AND EXPERIMENT SUMMARY
This chapter presented the first of three preliminary experiments contained within the thesis.
Obtained results suggest a requirement for more stringent testing and, ideally, larger sample
sizes to confirm some of the concepts the data analysis has suggested. Results also signify
that contextualisation may be a decisive factor in the potential of a sound to evoke affective
response but this experiment has also highlighted particular DSP effects that have greater
potential to support a context-driven psychoacoustic characteristic (such as immediate
loudness reflecting a powerful danger or attack envelope connoting an approaching threat).
The subsequent chapter will take influence from assertions presented here with regards to the
methodology for a second preliminary trial, this time incorporating various sounds into a
bespoke game environment and assessing participants' emotional states by way of a real-time
vocalisation approach to affective data collection.
EXPERIMENT 2:
2.1 RESULTS
Each participant (N=12) completed the four game segments featuring the four alternative
sound treatments, completed questionnaire sections between levels and provided
demographic and further data in the debrief at the end of the session. All participants were
White British, 9 male and 3 female, with ages ranging from 18 to 55. During the debrief
participants rated the immersive quality of their overall experience and how disruptive to
flow providing the real-time audio responses was (figure 8).
Figure 8: Mean/Standard deviation of participant ratings of immersion/flow disruption

              N     Range    Min.    Max.    Mean    SD
Immersion     12    2.00     2.00    4.00    3.25    .621
Disruption    12    2.00     1.00    3.00    2.33    .7785
Figure 9: Bar chart representation of average RTI measures between control and treatment groups
In terms of descriptive statistics, figure 9 elucidates the average RTI responses between
sound treatment groups, revealing little difference between the treatments or control group.
To test the statistical significance of the results, one-way repeated measures ANOVA was
employed via PASW/SPSS v.18 (IBM, 2010), with sound treatment type as the within subject
factor for the dependent variable data collected. The dependent variable classes tested were
completion time and 3 measurements associated with player in-game use of the run function
(total activation number / total time run was activated / mean run time).
Figure 10: Mean (+ standard deviation) of DV measurements for four sound modalities

                           Sound treatment factors
DV measures                Untreated            Pitch               Surround             Loudness
Completion time            244.348 (184.823)    136.735 (89.453)    217.997 (182.061)    178.137 (114.879)
No. of RUN activations     2.08 (1.929)         2.67 (3.393)        1.42 (2.021)         2.17 (2.823)
Total RUN time             21.606 (23.894)      29.969 (34.836)     23.856 (32.757)      23.41 (32.621)
Mean RUN time              13.518 (19.97)       15.324 (20.206)     15.028 (24.56)       17.744 (26.563)
Mauchly's test of the DV measurements completion time (χ2 (5) = 7.157, p > .05) and total
run time (χ2 (5) = 9.529, p > .05) signified that the assumption of sphericity had been met.
However, number of run activations (χ2 (5) = 13.108, p < .05) and mean run time (χ2 (5) =
123.203, p < .05) indicated a violation. The Greenhouse-Geisser estimates of sphericity were
employed to correct the degrees of freedom (ε = .671, ε = .598 respectively). One-way
repeated measures ANOVA revealed no statistically significant difference between the four
sound treatment factors when measuring number of run activations (F(2.041, 22.15) = .954, p >
.05), total run time (F(3, 33) = .953, p > .05) or mean run time (F(1.794, 19.736) = .608, p > .05).
Completion time did indicate significance (F(3, 33) = 2.888, p < .05), revealing that the objective
DV measurement of level completion time was significantly affected by the different sound
treatment factors.
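The Greenhouse-Geisser correction amounts to multiplying both degrees of freedom by the estimated ε before the F statistic is evaluated. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the inputs are the four treatment levels, the 12 participants and the ε reported for mean run time.

```python
def greenhouse_geisser_df(levels, participants, epsilon):
    """Return (effect df, error df) after scaling by the sphericity estimate."""
    df_effect = (levels - 1) * epsilon
    df_error = (levels - 1) * (participants - 1) * epsilon
    return df_effect, df_error


# Uncorrected df for 4 treatments and 12 participants are (3, 33).
print(greenhouse_geisser_df(4, 12, 1.0))

# Mean run time: epsilon = .598 as reported in the text.
df_effect, df_error = greenhouse_geisser_df(4, 12, 0.598)
print(round(df_effect, 3), round(df_error, 3))
```

With ε = .598 this yields 1.794 and 19.734, in line with the corrected degrees of freedom reported for mean run time; the small gap in the error term is presumably due to ε being printed to three decimals.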
Measurements associated with the run function were tested further to assess correlation
between use of the run function and RTI. The Spearman's Rank correlation (two-tailed) was
selected to test the relationship between these variables (R(RTI, number of run activations) = -.483,
p < .001; R(RTI, total run time) = -.634, p < .001; R(RTI, mean run time) = -.55, p < .001), indicating a
moderate/strong RTI-total run time correlation, a weak RTI-number of runs correlation and a
moderate RTI-mean run time correlation. The debrief questionnaire also revealed variation in PC player
experience and confidence (PEC) prior to testing between participants. These grouped
differences were evenly distributed and ranked (25%, R = 1-4) across the total sample.
Multivariate variance analysis (MANOVA) assessed the significance of variation between
PEC groups when measuring completion time and RTI (F (3, 44) = 20.616, p < .001).
Bonferroni post hoc revealed specific significant difference lay between PEC levels 1 and 3
(x̄1 - x̄3 = 247.831, p < .001), 1 and 4 (x̄1 - x̄4 = 233.606, p < .001), 2 and 3 (x̄2 - x̄3 =
216.728, p < .001) and between levels 2 and 4 (x̄2 - x̄4 = 202.503, p < .001) with completion
time the dependent variable. Comparable results were obtained in post hoc with RTI the
dependent variable (x̄1 - x̄3 = 1.55, p < .001), (x̄1 - x̄4 = 1.8, p < .001), (x̄2 - x̄3 = 1.217, p
< .001), (x̄2 - x̄4 = 1.467, p < .001).
Comparisons between real-time intensity averages and debrief intensity ratings averaged
across all treatment types revealed a moderate/strong correlation via Spearman's Rank
Coefficient (R(RTI, debrief intensity) = .694, p < .001), suggesting that although players reported
similarly both during and after playing the game, the 30% margin of error arguably supports
the use of vocal intensity responses collected during the game. The final analysis tests
searched for difference (measured in RTI) between the game level source-sounds (figure 11).
A series of Friedman's tests of variance for repeated measures were employed to identify
significant difference between source-sounds, and search for any interaction between
source-sound factors and sound treatment factors. Results revealed significant difference between
source sounds across all treatments: untreated (χ2 (4) = 20.593, p < .001), pitch (χ2 (4) =
16.964, p < .01), 3D (χ2 (4) = 16.176, p < .01), loudness (χ2 (4) = 22.545, p < .001) and when
tested across all treatments (χ2 (4) = 71.891, p < .001). As with earlier tests, no significant
difference was identified between treatment type (χ2 (3) = 2.123, p > .05).
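The Spearman's Rank coefficients reported in this section (RTI against the run-function measures and against debrief intensity) follow the usual procedure of ranking both variables and correlating the ranks. The sketch below is an illustrative Python version with average ranks for ties; the function names and the data are hypothetical and it does not compute the p value.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-player values (not thesis data).
rti = [1.5, 2.0, 3.5, 4.0, 4.5]
total_run_time = [60.0, 42.0, 30.0, 12.0, 5.0]
print(spearman(rti, total_run_time))
```

A negative coefficient, as in this made-up example, means that higher intensity ratings go with less time spent running.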
Figure 11: Results of Friedman’s test series

                  Sound Treatment Factors
Sound Name        Untreated    Pitch    3D      Loud
Zombie Call       2.75         2.63     2.25    1.81
Twig Snap         1.38         1.56     1.69    1.69
Woman Scream      2.69         3.31     3.44    3.38
Monster Attack    3.75         3.38     3.31    3.69
Intense Scream    4.44         4.13     4.31    4.44
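For readers unfamiliar with the Friedman test, the statistic is built from within-participant ranks: each participant's ratings of the sounds are ranked, the ranks are summed per sound, and the sums are compared with what equal treatment would produce. The following Python sketch shows the textbook statistic without the tie correction that statistical packages usually apply; the function names and the ratings are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def friedman_statistic(table):
    """table[i][j] is participant i's rating of condition j (no tie correction)."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical intensity ratings: 3 participants x 3 sounds (not thesis data).
ratings = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(friedman_statistic(ratings))
```

The statistic is referred to a chi-square distribution with k - 1 degrees of freedom, which matches the df of 4 reported for the five source-sounds.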
2.2 DISCUSSION
Whilst this chapter has identified a great number of audio parameters that have the potential
to affect player intensity response, the scope has currently limited this study to only three.
The results obtained present no conclusive evidence to support the hypothesis
that pitch alteration, decibel level or binaurally processed panning techniques affect player
intensity response. One possible explanation is that the set levels for the 3 treatments were
too conservative to trigger significantly different intensity readings and that a more focussed
analysis on an individual parameter across a greater range of treatment values would be of
profit to further investigations. Within the remit of a preliminary test, such lack of significant
conclusiveness is not unexpected.
Incorporation of real-time audio responses and game engine event logging allowed an
accurate synchronisation of several data sets and provided an opportunity to analyse data at
(and around) specific points of gameplay. Testing sound treatment modalities via various
measurements of in-game avatar running action revealed no significant difference between
groups for any of the three measures (number of run activations, total time spent with run
engaged, average length of run time). Several possible reasons lie in alternative player
motives for activating the run function (accelerating once the level exit has been
visually identified, or varying gameplay when the player experiences frustration or
boredom) and in inexperienced players' difficulties in coordinating run and movement
controls simultaneously. Event logging provided a data filtration system, only recording data
if certain criteria were met. Parameters of this logic-gate were set to record run associated
data only between activation of a key sound and for the subsequent five seconds of gameplay
(the logic suggesting that any run activity occurring during this time would be more likely to
be a response to the key sound). This system unfortunately did not reveal any further insight
due to too little data recorded (under such stringent filters) to perform any statistical analysis,
suggesting that the run function was not used as an evasive player movement in reaction to
any key sound. The Spearman's Rank Correlation test provided further analysis of data
regarding possible relationships between the emotional intensity ratings (RTI) and the three
run-function measurements. Results identified a strong link between the participants' RTI and
total run time, and a moderate relationship between RTI and mean run time; suggesting that
analysis of the run function may have future potential as an objective measurement.
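The event-logging filter described above, which keeps run data only when it falls within five seconds of a key sound, can be sketched as below. This is an illustrative Python outline rather than the game engine's actual logging code; the function name, the timestamp format (seconds from level start) and the example values are assumptions.

```python
def runs_following_key_sounds(key_sound_times, run_times, window=5.0):
    """Keep run activations that occur within `window` seconds after a key sound."""
    kept = []
    for run in run_times:
        # A run passes the gate if any key sound started at most `window` s earlier.
        if any(start <= run <= start + window for start in key_sound_times):
            kept.append(run)
    return kept


# Hypothetical timestamps in seconds (not thesis data).
key_sounds = [10.0, 40.0]
run_activations = [3.0, 12.5, 15.0, 15.1, 44.9, 60.0]
print(runs_following_key_sounds(key_sounds, run_activations))
```

A gate this strict discards most activations, which is consistent with the observation that too little data survived the filter for statistical analysis.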
Results of the data analysis also suggest that other critical factors were affecting the
dependent variables, most notably player experience and confidence (PEC) prior to
experimentation. As previously asserted in chapter 3, fear cannot be experienced without a
genuine perception of threat and that perception can alter in both presence and intensity
depending on the appraised severity of threat and the individual's capacity to overcome that
threat. It is therefore argued that high levels of experience (knowledge of game conventions,
a variety of fear induction tactics, etc.) coupled with confidence and adept skill in game
controls are very likely to reduce the threat severity and increase the coping ability. The test
results presented above support this, presenting a highly significant negative correlation
between PEC and RTI. At both a qualitative and quantitative level, the testing revealed that
players with very little gaming experience were more likely to struggle to reach the level exit
when they were feeling more intense fear, to an extent that they reported frustration and
dislike towards the game. Quantitative data analysis supported this finding, revealing a
significant correlation between RTI and completion time. The causal role of completion
time is open to interpretation: because the sensation of fear-intensity positively
correlates with completion time, which variable is the
cause and which is the effect may fluctuate throughout the game. Taking longer to reach the
level exit may increase negative emotional valence (worry, fear, frustration) which in turn
creates a feedback loop as increased negative valence causes the player to lose their way or
begin travelling in circles (as we observed in several of the gameplay capture videos). It is
also observed that completion time can impact upon the potential of a sound to evoke
increased fear response. Existing literature has suggested that forewarning cues which signify
that a frightening event is imminent (such as extended periods of unexpected silence), may
significantly increase the impact of a sudden sound (Perron, 2004), and increased completion
time dictates a greater mean time between key sound events. Such an aspect of sound design
shows great potential to manipulate player fear response and is certainly a candidate for
further study.
The nature of the experimental design afforded the opportunity to run statistical analysis of
variance between each of the 5 alternative sounds heard in every play-through. Because each
player reported an emotional intensity rating for each sound for four repetitions (across the 4
treatment types) order effects can arguably be dismissed alongside audio/visual interaction
effects due to the repetitive and low visibility graphic environment. Results posited that
despite contaminating variables (alternative audio treatments, PEC), a significant difference
in RTI existed between groups and an ordinal rank and specific mean differences between
each sound were also identified. Such findings provide a strong argument for the value of
game sound in manipulation of player emotional states, and call for the continuation of this
research line of enquiry to establish exactly what sonic differences caused these significant
and substantial variances in fear-response intensity.
The results of this experiment confirm that differences between sound parameters can affect
the degree of intensity an individual experiences whilst playing a survival horror game. The
specific measures tested (3D, loudness and pitch), although not statistically significant, do
reveal potential if subject to greater parameter extremes. Although not formally analysed via
statistical data, initial observation posits periodicity (specifically, the length of silence
experienced before a key sound) as a good candidate for individual study. The statistically
significant difference observed between source-sounds further identifies timbre and attack
(ADSR) as strong potential candidates for further study. Future experimentation will explore
each of the above sound parameters individually, assessing each parameter across a detailed
number of measures rather than a single measured difference between the treatment group
and the control. Psychophysiological measurements to increase objectivity will also be
integrated into all further experiments.
The results documented above support the notion that the experience of fear is (at
least within the confines of this context) a complex matrix of interacting variables. Whilst
real-time intensity response and event logging have proved substantially valuable, the exact
execution of this approach requires minor alteration. Real-time audio responses collected
during periods of silence in the game, in addition to immediately after each key-sound, could
provide a more detailed account of player emotion, but at the risk of further interrupting
immersion and flow. The substantial interference effect of player experience and confidence
prior to testing is acknowledged; however, future experiments should endeavour to focus upon
increasing breadth and depth, testing with both a wider range of DSP effects and a greater
number of parameters within each effect.
In terms of the test game itself, the substantial difference between players' completion times
(both between participants and between separate playthroughs) could arguably be an
erroneous source of tension, capable of increasing the RTI responses. Players lost and
struggling within the game level may experience greater fear elicitation as their circumstance
acts as an affective primer. The open-world style of level design is likely responsible for this
issue and presents a compromise between level design that reflects modern games (but leaves
substantial room for variation in the player experience) and linear design that provides greater
control of variation between gameplay experiences (but may be perceived as sterile and non-representative by the player).
2.3 CONCLUSIONS AND EXPERIMENT SUMMARY
Results obtained from this study support data event logging as an effective way to collect
detailed gameplay statistics in an efficient manner. The lack of conclusive or statistically
significant evidence strongly supports the notion that a more systematic methodology
(controlling for erroneous variables such as player-experience/confidence) is more likely to
yield greater results. The issues with level design may not necessarily require severe
compromise on either extreme and the results from this experiment have since informed the
level design of subsequent experiments (primarily encouraging deceptively linear
environments with significantly more distinct and frequent navigational signposts integrated
within the set pieces).
In preparation for the final preliminary experiment, the upcoming chapter hosts a detailed
analysis of psychophysiological approaches to data acquisition with regards to a range of
general and specific, thesis-relevant applications. The concept of biometrics is addressed in
greater detail and electrodermal activity (EDA: skin conductance) and electromyographic
(EMG: muscle activity) measures are scrutinised in terms of their advantages and limitations
in comparison to alternative psychophysiological methods. This discussion will then lead to
the final trial, within which EMG and EDA biometric data alongside qualitative debrief
player responses, shall reveal the effects of a new set of DSP treatments upon objective and
subjective measures of fear elicitation.
EXPERIMENT 3:
3.1 RESULTS
Data obtained from both EDA and EMG acquisition was initially extracted in 5.00 second
epochs. The baseline epoch for each participant was collected between 30.00 and 35.00
seconds of the 1 minute rest period before the game began. This allowed time for the
signal to stabilise after the inevitable rises immediately following sensor application, whilst leaving
time before the test began so as to avoid test-start anxiety. Data relevant to each test sound was
extracted, also as 5 second epochs, but each one commenced in synchronisation with the
sound onset. Integrated signal analysis tools within BSL Pro v.3.6.7 (Biopac, 2001) provided
descriptive statistical data for both biometric signals across each epoch. Both
electrodermal activity and electromyographic data were measured for minimum and maximum
peaks within the 5 seconds, plus a mean, area and slope output.
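The epoch extraction described above can be sketched in code. The following is a minimal illustration only and is not the BSL Pro implementation: the function name `epoch_stats`, the sample-rate argument and the definitions of area (trapezoidal integral) and slope (least-squares linear fit) are assumptions introduced here, and BSL Pro may compute these measures differently.

```python
from statistics import mean


def epoch_stats(samples, sample_rate_hz, start_s, duration_s=5.0):
    """Return min, max, mean, area and slope for one epoch of a signal.

    `samples` is a list of amplitude values recorded at `sample_rate_hz`.
    The epoch starts `start_s` seconds into the recording (hypothetical API).
    """
    first = int(round(start_s * sample_rate_hz))
    last = first + int(round(duration_s * sample_rate_hz))
    if first < 0 or last > len(samples) or last - first < 2:
        raise ValueError("epoch lies outside the recording")
    epoch = samples[first:last]

    dt = 1.0 / sample_rate_hz
    times = [i * dt for i in range(len(epoch))]

    # Area (assumed definition): trapezoidal integral over the epoch.
    area = sum((a + b) / 2.0 * dt for a, b in zip(epoch, epoch[1:]))

    # Slope (assumed definition): least-squares fit of amplitude against time.
    t_bar, y_bar = mean(times), mean(epoch)
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(times, epoch))
             / sum((t - t_bar) ** 2 for t in times))

    return {"min": min(epoch), "max": max(epoch), "mean": y_bar,
            "area": area, "slope": slope}


# Baseline epoch: 30.00 to 35.00 seconds of the rest period.
# Test epoch: 5 seconds starting at the onset of a test sound.
# baseline = epoch_stats(eda_signal, sample_rate_hz, 30.0)
# test = epoch_stats(eda_signal, sample_rate_hz, sound_onset_s)
```

The same function serves both the baseline and the sound-synchronised epochs; only the start time changes.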
The next step (along with all subsequent analytical processes) was carried out with PASW
Statistics (v.18, IBM, 2009): the difference between the baseline and each
test epoch was calculated to generate a differential dataset (see figure 1). The final step produced descriptive
statistics of the differential dataset and then utilised the t-test for independent
samples to search for statistically significant differences between the means of the control and
treatment groups for each test sound. Results of the t-test revealed a significant difference
between groups for the muffled scream (sharpness treatment) sound in the EDA-mean (t=2.377, p=.045), EDA-max (t=-2.357, p=.046) and EDA-min (t=-2.457, p=.039) dependent
variables; this was the only notable difference in the EDA output measures. Of the EMG
outputs, only EMG-max revealed a significant difference between groups, at the muffled
whimper (also sharpness treatment) sound (t=2.669, p=.028). Each individual sound (both
treated and untreated) was separately tested but revealed no statistical significance between
any of the sounds for any measure of EDA or EMG. An overview of the descriptive statistics
suggests that overly high variance between players, present even when testing the differential
data, is the likely cause of the absence of significant difference between sounds.
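The differential calculation and the independent-samples comparison can be illustrated with a short sketch. This is not the PASW procedure itself: the function names are hypothetical, the pooled-variance (Student) form of the t statistic is assumed, and no p-value is computed. With ten players split evenly between control and treatment groups (an assumption), the test has 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def differential(baseline, test_epoch):
    """Subtract each baseline measure from the matching test-epoch measure."""
    return {key: test_epoch[key] - baseline[key] for key in baseline}


def independent_t(group_a, group_b):
    """Return Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical usage: one differential value per player for a single
# measure (e.g. EDA-mean) and a single test sound.
# t, df = independent_t(control_differentials, treatment_differentials)
```

Running the comparison on differential rather than raw values removes each player's resting level, although, as noted above, considerable between-player variance can remain.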
Figure 13 presents a visual overview of the EMG and EDA signal data with notable increases
in either biometric synchronised to an event. Overlaying biometric output with event data
reveals that the majority of EDA/EMG spikes and peaks can be attributed to specific events
within the game; however, not all of these events are directly tied to a sound and more often
probably relate to a visual entity such as a lifeless body or flash of light. Visual analysis of the
overall biometric signal also reveals significantly raised skin conductance levels during the
first 60-80 seconds of gameplay, and a steadily increasing rate of EMG activity from
commencing gameplay to completing the level. These trends are generally representative of
all players' test results (particularly the initial high EDA and steadily increasing EMG).
Figure 12: Extract of collated dataset – differential data for all measures of EMG

Group A Mean EMG
Sound              Min         Max         Mean        Area        Slope
Music             -0.01786     0.00398    -0.0002      0.00216    -0.00008
Ship Groan        -0.02888     0.01376    -0.00018     0.01312     0.00032
light explosion   -0.02186     0.00902    -0.00014     0.01158     0.00194
Muffled Scream    -0.01954     0.01128    -0.00016     0.0076     -0.00048
Man Whimper       -0.01762     0.00654    -0.00012     0.01156    -0.00074
Pipe Banging      -0.03338     0.02134    -0.00012     0.01662     0.00002
Breath            -0.0142      0.01126    -0.00014     0.0193      0.00194
Ship Voice        -0.03244     0.02026    -0.00012     0.01438     0.00042
Scream            -0.0217      0.00772    -0.00014     0.0231      0.00026
Animal Scream     -0.02594     0.01472    -0.00014     0.01426     0.00008
Monster Roar      -0.02264     0.01354    -0.00016     0.01536     0.00002
Door Slams        -0.0233      0.0131     -0.0001      0.01734    -0.00134

Group B EMG
Sound              Min         Max         Mean        Area        Slope
Music             -0.0013     -0.00492    -0.00016    -0.00554     0.00052
Ship Groan         0.0036     -0.00456    -0.00016    -0.00266     0.00044
light explosion   -0.00756     0.00672    -0.00018    -0.00268     0.00002
Muffled Scream    -0.0036      0.00282    -0.00006    -0.00574    -0.0003
Man Whimper        0.00128    -0.00366    -0.00012    -0.00166     0.00008
Pipe Banging      -0.00452    -0.00076    -0.00014    -0.00036    -0.00018
Breath            -0.00086     0.0033     -0.00012     0.00338     0.00022
Ship Voice        -0.01362     0.00702    -0.00014     0.00812     0.00182
Scream            -0.0062      0.01298    -0.0001      0.00338    -0.00074
Animal Scream     -0.01012     0.02012    -0.00016    -0.0002      0.00028
Monster Roar      -0.02568     0.01352    -0.00016     0.00704     0.00212
Door Slams        -0.00636     0.00264    -0.00018     0.00488     0.0008
Figure 13: Full EMG(red)/EDA(blue) signal output + synchronised events/related screenshots
Figure 14: Descriptive statistics & Chi-Square analysis of qualitative debrief responses

Descriptive Statistics
                   N     Mean      Std. Deviation   Minimum   Maximum
Comfort            10    7.1000    .73786           6.00      8.00
Flow disruption    10    2.7000    .82327           2.00      4.00
Intensity          10    6.7000    1.33749          4.00      8.00
Difficulty         10    3.3000    .67495           3.00      5.00
Frustration        10    2.2000    .78881           1.00      3.00

Chi-Square Test Statistics
               Comfort    Flow disruption   Intensity   Difficulty   Frustration
Chi-square     28.000a    28.000a           18.000a     56.000a      26.000a
df             9          9                 9           9            9
Asymp. Sig.    .001       .001              .035        .000         .002
a. 10 cells (100.0%) have expected frequencies less than 5. The minimum expected cell frequency is 1.0.
Non-parametric analysis compared the control and test groups via the Mann-Whitney U test
for two independent samples to assess if any significant variation existed between the groups
with regards to subjective intensity, frustration and difficulty ratings. Results revealed no
statistical significance between groups for any of the three dependent variables. Descriptive
statistics and the Chi-Square test were administered, including all ten players' results, to assess
comfort of biometric sensors, perceived difficulty of the game itself, subjective frustration
levels and the extent to which the sensors and wires obstructed the flow of gameplay and
sense of immersion. The results (shown above in figure 14) reveal statistical significance for
all measures and the descriptive means reveal a relatively high level of comfort and low
disruption during gameplay alongside moderately high intensity ratings, low difficulty and
very low frustration levels. Information collected from qualitative discussion with
participants revealed that context/situation heavily influenced how they felt about a sound.
For example, many commented that the consecutive slams at the end of the level created
suspense and fear by signifying urgency and a time-limit to reach the level end. Participants
also commented that the alien roar sound was very intense and discomforting. Participants
did not comment directly upon the fast tempo of the slam sounds or the immediate attack and
high volume of the alien roar sound, suggesting limited conscious awareness of the
quantitative sound techniques being employed. Figures 15 and 16 present line graph
representations of particular statistical measures of both EMG and EDA, revealing the
difference between sound treatment groups and individual sounds.
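For readers unfamiliar with the Mann-Whitney U test applied to the subjective ratings above, the statistic can be computed from pooled ranks as in the following minimal sketch. The function name and any ratings passed to it are hypothetical, ties are handled by average ranks (an assumption about the procedure), and the significance lookup performed by the statistics package is not reproduced here.

```python
def mann_whitney_u(group_a, group_b):
    """Return the Mann-Whitney U statistic (the smaller of U_a and U_b)."""
    pooled = sorted(list(group_a) + list(group_b))

    # Average 1-based rank for each distinct value; tied values share a rank.
    ranks = {}
    for value in set(pooled):
        positions = [i + 1 for i, v in enumerate(pooled) if v == value]
        ranks[value] = sum(positions) / len(positions)

    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[v] for v in group_a)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical usage with debrief ratings from the two groups:
# u = mann_whitney_u(control_intensity_ratings, treatment_intensity_ratings)
```

Because the test operates on ranks rather than raw scores, it suits ordinal rating scales and small groups such as those used here.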
Figure 15: Line graph representing mean EMG peak values (red = control, blue = DSP treated sound)
Figure 16: Line graph representing mean EDA values (red = control, blue = DSP treated sound)
3.2 DISCUSSION
As a preliminary experiment, several concerns were foreseeable from the outset whilst others
were revealed throughout the course of the study. It is acknowledged that the overarching
themes, atmospheres and stylisations of a computer video game are a likely source of
variation upon the perception of individual sounds and compound sound events/ambiences,
the rationale being that such factors form a contextualising framework against which smaller
entities within the game are appraised. To that end it was desirable that two contrasting game
levels would be created, providing an opportunity to observe if particular sonic variables had
the potential to reliably alter affective states across varied contextual environments. A second
game level (Silence) was created prior to testing and tasked the player with searching for their
missing friend in a foggy woodland environment. A more cinematic feel juxtaposed against a
different environment with contrast in the objective, characterisation and colour created a
dissimilar experience whilst remaining within the horror theme remit. Unfortunately, its use
raised the projected length of testing to 90 minutes per individual and was deemed
inappropriate with regards to available resources.
The Biopac EDA/EMG hardware and accompanying software have proven to be a robust and
accessible solution. The integrated signal analysis features provide highly usable raw figures
upon which statistical analysis can be performed. Connection between players and sensors
proved consistent, and the positive reviews from players concerning comfort of use and lack
of flow/immersion disruption further advocate both the Biopac system and EDA/EMG
biometrics in general as effective data acquisition tools within computer video games
research. With regards to the game level design and control interface, observation of the
participants' movements, in conjunction with debrief questionnaire responses, revealed a
generally high level of usability, with most players able to move effectively throughout the
level. The lack of a heads-up display (HUD), text-based instruction and mini-map navigation
was regarded by some players as initially confusing, but not a source of frustration. Feedback
comments also suggested that the absence of an extra-diegetic HUD improved immersion
(via improved realism) and that undetermined player-health feedback increased tension (by
way of restricting access to usual game statistics such as hit-points and enemy attack damage,
thereby limiting the players' ability to form a defence/coping strategy).
Players also commented that the absence of weapons signified that there were no combat mechanics, and that a
few minutes of gameplay without being attacked or killed further suggested
that they could not die, a realisation that quickly reduced fear intensity by way of
removing the threat, both within the diegetic narrative (no threat of avatar damage/death) and
as an extra-diegetic gameplay tension (no threat of having to repeat part of the level). The
limited control interface received mixed reviews, with most negative commentary coming from highly
experienced players who felt the lack of look freedom (the WASD setup restricted
observation movement to the Z-axis) differed too greatly from convention and limited
exploration. Players with lower experience ratings, however, presented a positive report,
asserting that the simplified controls increased accessibility. In a search for potential
erroneous variables that could have reduced the capacity of the DSP treatment to produce
meaningful results, aspects of the testing environment are definite candidates. With reference
to electrodermal activity, it could be asserted that ecological validity and laboratory-based
bias concerns reduce the capacity to extract meaningful data. It was noted that various aspects
of the testing environment (continuous presence of the researcher, white-coat syndrome from
biometric hardware application, and the unfamiliar environment, control hardware, action-button
configuration, audio setup and computer monitor/graphics setup) were likely to raise anxiety
levels prior to testing and consequently to potentially invalidate all subsequent measures in
response to game-based stimuli.
The particular durations and placement of data extraction epochs present additional potential
for erroneous effects, the most likely problem being the placement of the baseline epoch and
the limited time between starting the recording process and starting the game (at which point
meaningful measuring begins). EDA trends strongly suggest a reliable peak in activity lasting
for (based upon obtained results) up to 90 seconds after beginning the recording and a long
decay extending over several minutes before a low-level plateau is reached. One potential
causal factor could be that the participant was informed when the recording had started,
prompting a preparatory state of alertness and focus. As noted within the previous chapter, the
exact placement and duration of epochs varies between studies, with little agreement as to the ideal. As such it
could be appropriate to apply extra resources towards systematic analysis of the biometric
signal data from a variety of epoch configurations in search of an ideal arrangement within
the specific context of this study.
Whilst the lack of statistically significant difference between the test and control groups
could be disheartening, it must be remembered that the primary function of the experiment
was to explore new territory, uncover potential methodological problems and present possible
solutions for use in future study. Furthermore, the correlation between quantitative biometric
data and qualitative debrief responses suggests that data acquisition was notably successful
and that the issues are more likely to be associated with the testing environment, equipment
and procedure.
3.3 CONCLUSIONS AND EXPERIMENT SUMMARY
Although the results obtained from this study do not support the hypothesis that quantifiable
acoustic parameter changes can alter the perceived intensity of fear-related emotional states,
they do support the use of biometrics within such experimental scenarios and present a
valuable base from which to build through further study. This chapter strongly advocates a
systematic approach to game sound testing, to allow for the plentiful erroneous factors to be
addressed. Ecological validity is a substantial concern and future testing will ideally attempt
to recreate a comfortable playing environment, comparable to that which a player would
experience in their own home. Researcher presence should arguably be minimised and
players should not be informed when recording has begun, reducing white-coat effects.
Future testing will integrate video tutorials (reducing live researcher presence and adding
uniformity to the participant instruction process) and present the player with an interactive
tutorial level, allowing them to become familiarised with the interface, control mapping, etc.
and also providing time for stress factors associated with initial exposure to the testing
environment to subside. However, recreating such an environment may not be necessary as
the developments in biometric technology may soon present us with affordable equipment,
suitable for home use and with the robustness and accuracy of research-grade equivalents to
source meaningful data acquisition directly from the natural environment. The following
chapter discusses the current opportunity for such testing to occur, presenting a selection of
consumer-grade biometric headsets that provide affordability and ease of application.
WITH REFERENCE TO THE ACADEMIC REVIEW
The opportunities and concerns raised with regards to the logistics of Experiment 1 certainly
resonated with several arguments presented in the relevant review (chapter 6). Comparatively
larger numbers of participants were made available (Reips, 2002), physical space and
material requirement was minimal (Stanton & Rogelberg, 2001) and the automated data
collection/collation process meant that once the tests were live on the internet, there was no
additional work required until enough results had been posted (Reips, 1995). Several
methodological issues associated with internet-based testing were also reflected during
Experiment 1, specifically compatibility issues between browsers and operating systems (see
Buchanan & Reips, 2001). In terms of the data collected, the limited difference between
online and offline datasets in both HGSD and SOF suggests that the lack of researcher
control over participants during online testing (see Krantz, 2001; Nosek & Banaji, 2002;
Reips, 2007) may not have been particularly damaging with this methodology and within this
specific context.
Direct experience with the psychophysiological measures of electromyography (EMG) and
electrodermal activity (EDA) supported many of the assertions referenced within the
biometrics chapter (5). EDA proved to be a reliable indicator of arousal (Gilroy et al., 2012;
Hedman et al., 2009; Nacke & Mandryk, 2010) and in context, this could be utilised to infer
anxiety and stress during gameplay. The EDA equipment was, as expected, easy to apply and
operate (Boucsein, 1992; Nacke & Mandryk, 2010) with minimal intrusion for the player
(Lorber, 2004). Motor activity (Roy et al., 2008) proved not to be a source of erroneous
variation and the temporal resolution, traditionally described as slow (Kivikangas et al.,
2010) proved adequate for accurate synchronisation to in-game stimuli. EMG confirmed
expectations for high temporal resolution and sensitivity (Bolls, Lang & Potter, 2001) but did
not prove to be a reliable indicator of negative valence, contradicting much previous
research (Harmon-Jones & Allen, 2001; Kallinen & Ravaja, 2007). This outcome does,
however, support the notion that recreational fear is an ambivalent experience (Perron, 2005;
Svendsen, 2008: pp. 75-76). In terms of fear experience, the data obtained from these three
experiments strongly supports the concept of a complex, embodied system that is susceptible
to reflexive shock, slow-building apprehensive/suspenseful terror and more subtle variations
between the two. Despite strong effort to control erroneous variables during testing, between-participant differences remained noteworthy, supporting the notion of interpersonal affective
influences such as gender, culture and personality (Mériau et al., 2006; Hamann & Canli,
2004).
The experiments undertaken within this thesis do not confirm or deny many of the auditory
processing theories documented within the academic review chapters as this would be beyond
the scope of the study. However, with regards to the primary focus of all three experiments
(assessing the emotional/affective potential of quantifiable acoustic parameters) the obtained
data does reveal both resonance and dissonance when considered alongside some of the
previously referenced research. Assertions regarding the affective potential of parameters such as immediate attack
(Moncrieff et al., 2001), slowly increasing intensity (Bach et al., 2009), increasing tempo
(Alves & Roque, 2009), low pitch (Parker & Heerema, 2007), and unclear localisation
(Breinbjerg, 2005) are supported by the results obtained from these experiments, but not
conclusively. The notion that some sounds may have the capacity to universally evoke a
particular emotional state by way of underlying evolutionary factors (Parker & Heerema,
2007) remains uncertain due to the presence of contradictory data (for example, participants
producing a limited/no clear affective response to sounds specifically designed to evoke
evolution-based fear responses). The assertion that sound can be processed pre-attentively
(Alho, 1997) is supported by the notably faster response times of EMG (and, in some cases,
even EDA) to stimuli than the real-time qualitative responses of players. Indirectly, the lack
of clear patterns within the data could be perceived as supportive of the concepts that imply
auditory processing as a complex matrix of variables. This includes the discrete modes of
listening (Chion, 1994; Gaver, 1993; Grimshaw & Schott, 2008), embodied cognition factors
(Wilson, 2002), attention filtering (Ekman, 2009), multi-modal effects (Adams et al., 2002;
Ma & McKevitt 2005; Özcan & van Egmond 2009; Väljamäe & Soto-Faraco 2008) and
sonification data (Grimshaw, 2007; Schafer, 1994). As a result, data obtained from the three
experiments indirectly supports a hypothetical framework of auditory processing that
incorporates such concepts.
CHAPTER SUMMARY
The conclusions taken from Experiment 1 weaken the hypothesis that quantitative acoustic
parameters can manipulate emotional experience, whilst evidence collected from both
Experiments 2 and 3 suggests that context and situational signifiers are essential to evoking
fear. Qualitative data taken from these experiments does however suggest that certain DSP
effects may cause a context-based response from players without intentional design. Sounds
incorporating reverberation, loudness, periodicity/tempo, pitch-shift, sharpness and attack
were indirectly described by many participants as significant factors in their emotional states
during play. It is acknowledged that the methodologies of these experiments could benefit
from further development, but all showed genuine potential, in particular the capacity of
psychophysiological hardware (EMG/EDA) to produce relevant and accurate quantifiable
data. The final hypothesis taken from these trials is that context is essential to emotional
experience and, whilst quantitative sound parameters cannot directly influence affect, they can
alter context and therefore influence affect indirectly. Context/situation is therefore presented as the
mediator between sensory input and affective output.
Chapter 8
Hypothetical Frameworks
INTRODUCTION
This chapter consolidates the conclusions extracted from both the literature review and
experimentation chapters to present several hypothetical frameworks based upon
interrelations and processes within audio perception and emotional experience. The purpose
of these designs is to provide initial (but nonetheless, well supported) hypotheses of what
component entities and interactions exist within the emotion-listening process. The inclusion
of evidence sourced from both academic review and primary experimentation supports the
assumption that although these frameworks are expected to develop gradually, they are
highly unlikely to be rejected completely. Future research is expected to prompt small
modifications but, overall, the development of emotion-sensitive game sound will arguably
be more efficient if based upon (and building from) these frameworks.
INTERACTIONS AND PROCESSES WITHIN AN ECOLOGY OF FEAR
Figure 1 outlines the interaction of the fear response process within a natural environment
and is not contextually specified towards computer game play. Display of the theory
associated with human fear response reveals the complex nature of the interactions, with the
majority of the causal links bidirectional between framework entities.
Figure 1: Interactions of the fear response process
The process presented here follows the Cannon-Bard theory (1931) of emotions, with sensory
data processed via some aspect of the central nervous system (though not necessarily the
brain). Although figure 1 does not directly reference sound or computer video gameplay,
many of the key entities and processes featured here are also presented in later (more thesis-specific) frameworks. The key assertions within this framework are as follows:
1. Brain processing of sensory data is highly unlikely to be exclusively cognitive or sub-cortical. Instead, there exists a continuum where (depending on the influencing
factors) cognitive or sub-cortical processes take precedence in varying proportions.
2. The interactions between the mind (the brain and central nervous system [CNS]) and
body are largely bidirectional, meaning that although the process begins with
impulses translated via the CNS, the resultant changes in physiology are felt by the
individual and therefore fed back into the system to influence the future processing.
For example, an individual may hear an unfamiliar sound, interpret a threat and feel
fear. This response may then trigger increased heart-rate, perspiration and a perceived
drop in temperature. The individual has an awareness of these physiological state
changes that they interpret as confirmation of threat that, consequently, heightens the
fear experience.
3. All elements within the mind/body process are susceptible to embodying influences
either directly or indirectly.
Figure 2: Ecology of fear within a virtual environment
Figure 2 is largely built around the fear-stage concepts of pre-encounter, post-encounter and
circa-strike defence (Fanselow, 1994). Reaching a particular stage is dependent upon the
psychological distance between the individual and the fear-object. Embodied cognition and
computer video game theory contextualise this framework, which is intended to reveal the
many variables that exist within a fear scenario; both those contained within the individual
(personal fear profile, memory, triple vulnerability theory, etc.) and the external variables
originating from the natural environment and the virtual game world. This framework
elucidates the ways in which almost any element of the system can impact upon any other.
This refers to both the macro (environment influences player, player influences game, game
influences environment) and micro elements. For example, low background environment
sound increases the signal-to-noise ratio. This makes game sound more intense and the player
experiences circa-strike defence during play. Their dislike of this experience causes the player
to turn down the sound within the game system.
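The signal-to-noise example can be made concrete using the conventional decibel formula for amplitude ratios, SNR = 20 log10(A_signal / A_noise). The sketch below is illustrative only; the function name and the RMS amplitude values are hypothetical and are not measurements from the experiments.

```python
from math import log10


def snr_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in decibels for two RMS amplitudes."""
    if signal_rms <= 0 or noise_rms <= 0:
        raise ValueError("RMS amplitudes must be positive")
    return 20 * log10(signal_rms / noise_rms)


# Hypothetical values: the game sound level is unchanged in both rooms.
game_sound_rms = 1.0
noisy_room = snr_db(game_sound_rms, noise_rms=0.1)
quiet_room = snr_db(game_sound_rms, noise_rms=0.01)
# The quieter room yields the higher ratio even though the game sound
# itself has not changed.
```

Under these assumed values, reducing the background level alone raises the ratio, which is the mechanism by which the environment influences the perceived intensity of game sound in the example above.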
INTEGRATING AUDIO CLASSIFICATIONS INTO A FEAR FRAMEWORK
Goodman (2010: p.69) states that 'cognitive faculties of the auditory cortex do not need to be
engaged for fear responses to be engaged', positing that autonomic processes can evoke a
fear response from an audio stimulus as a result of innate processes or operant conditioning.
Whilst this suggests that audio signals are capable of inducing a fear sensation by way of
biological, psychodynamic and cognitive processes, the question remains as to what
properties within an audio stimulus cause a fear response and whether such parameters can be
manipulated to attenuate or amplify a fear response. This section discusses various
classifications of game sound (reflexive/cognitive responses, listening functions,
representative functions) and positions them within a framework of fear.
The power of suggestion has been documented in studies pertaining to experiences of the
paranormal (Lange & Houran, 1997) and arguably proposes that it is the preparation of the
individual, by establishing a situational context before exposure to explicit fear cues
(activation of pre-encounter defence [see chapter 3]), that chiefly determines the impact of
subsequent fear stimuli. Garner et al. (2010) compared relative pitch, loudness, and
localisation changes across several sounds experienced whilst playing a computer game and
discovered only limited association between acoustic parameter modulations and emotional
impact. Referring back to this thesis, the experiments documented within chapters 6 and 7 did
not intentionally characterise the nature or situational context for the sounds employed and it
is therefore suggested that not integrating a controlled contextualisation may have attenuated
the potential of the parameter modifications. Adding a heavy reverberation without context
may have little effect on the impact of a sound, whilst a pre-established gameplay element in
which the player is required to identify the position of a sound to avoid the source may have a
significant impact as, within this context, the reverberation obscures localisation, reduces
coping affordance and increases player-action uncertainty. These notions support the
hypothesis that fear, in response to audio stimuli, cannot be significantly augmented by way
of universal quantitative acoustic parameter manipulation. The modulating of acoustic
parameters must be integrated as part of a situational framework that considers both an
established fear experience profile and variation between individuals; creating perceptual
audio characteristics that are the key to effective fear manipulation.
Referring back to chapter 3, Fanselow's (1994) three stage structure of fear induction (pre-encounter, post-encounter and circa-strike) provides a contextually relevant method of sound
classification in which psychoacoustic/perceptual characteristics can be mapped onto a
framework of fear. The nature of pre-encounter defence dictates that stimuli have greater
psychological distance (that they are future-orientated, physically distant and
hypothetical/suggestive). These sounds are to set the scene and establish mood, tone and
atmosphere, consequently denoting that the threat-object is not (yet) present and the audio
stimuli available instead signify the immediate environment and entities indirectly relating to
the threat. Aionoplast, chronoplast and topoplast functions of sound, first presented in
Grimshaw (2007), denote periods within history, the passing of time and the architectural
space of the environment respectively. Sounds that possess Schafer’s (1994) archetypal sound
(historical, symbolic and mysterious sounds), soundmark (sounds native to a particular
location that support identification of place) or keynote (ubiquitous and often ever-present
sounds that are not always consciously attended to and, as such, exist in the background of
the soundscape) characteristics arguably belong within the pre-encounter stage as their
primary function is to establish the scene and present an implicit, rather than explicit, threat.
Kinediegetic (initiated directly by player action) and proprioceptive (internal bodily) sounds
are more complex and likely to transcend Fanselow’s stages of fear, depending upon their
relative intensity. Whereas low intensity variations (light shaking of player-held lantern,
sound of player’s slightly elevated heartbeat) suggest initial caution (pre-encounter), high
intensity variations (player gasp/scream, dropped lantern hitting floor, etc.) instead reflect
increasing terror (post-encounter) or revelatory horror (circa-strike).
Stimulus appraisal in the pre-encounter stage is likely to employ cognitive, high level
construal appraisals and anxiety is hypothesised to be present yet relatively low while
listening function is expected to be functional, semantic and/or reduced as the listener is
afforded more time with which to fully assess the scenario. Within a computer game context,
critical listening is also feasible, whereby the player may assess the quality and
appropriateness of the sound. To successfully evoke pre-encounter defence, the sonic
environment must suggest a locale in which threat exists at a psychological distance. Avatar
footsteps treading on disembodied flesh and bone, distant screams of an agonised victim,
reverberant acoustic paraspaces (a term coined by Parkes and Thrift [1980] that Grimshaw
[2007] refers to as ‘a space that, within the acoustic ecology, [that] describes and provides
immersory and participatory affordances to do with location, time and cultural and social
factors’) that obscure localisation, all connote danger at a distance and strongly advise
caution without presenting a sound that is directly causally related to the threat-object.
The above classifications of sound possess specific traits that arguably imbue them with high
psychological distance (PD) and consequently, within our fear framework, sounds within
these classifications retain a pre-encounter fear function. Post-encounter defence, however,
demands decreased PD whilst maintaining a degree of uncertainty. Within this phase there is
arguably a great deal of flexibility available as alternative aspects of PD can be manipulated
to reach the same affective endeavour. Signal sounds (foreground sounds that are designed to
be consciously attended to [Schafer, 1994]) are hypothetically more appropriate within this
phase, whereby the player is expected to perceive these sounds as originating directly from
the threat source. If we are to accept that the terror stage potentially activates the behavioural
inhibition system, freezes movement and potentiates hyper-attentiveness, then an audio
stimulus that generates such a response matches the profile of a retainer – a sound that
encourages a player to remain in the same location (term originally from McMahan [2003]
and adapted for sound by Grimshaw & Schott [2008]). As noted above, kinediegetic and
proprioceptive sounds may also be present; however, the more intense nature of the post-encounter stage suggests that such sounds should reflect this increase (heavier breathing,
increased heart-rate, lighter footsteps, etc.). In terms of PD, the hypotheticality and social
distance parameters of sounds that fit within the post-encounter phase are reduced as the
source is assumed to be actual and attentive towards the player.
Assuming player attention is more acutely focussed, causal, empathetic, semantic and
functional listening are expected within the post-encounter phase as the player may attempt to
derive actionable information to support a coping strategy. Here audio designers may decide
upon what information they wish to reveal. Sounds that disclose threat intention and
emotional state may serve to accentuate fear intensity whilst localisation data may attenuate
it. Acoustic properties that signify physical characteristics are deeply subjective in their
capacity to modulate a fear response. Clichéd characteristics, including large size, fast
movement, unpredictable behaviour, distorted appearance, and great strength, are preferential
but their effectiveness remains at the mercy of the player’s individuality.
As discussed previously in chapter 3, the circa-strike defence (revelatory horror event)
operates initially by way of an automated behavioural process dependent upon evolutionary
and conditioned response routines. As a result, initial appraisal of horror-type audio stimuli is
expected to induce reflexive and connotative listening functions to support immediate and
decisive response behaviour. An audio stimulus within this context could be described as an
attractor – a sound that induces immediate player response (McMahan [2003] again, adapted
for sound by Grimshaw & Schott [2008]). High intensity kinediegetic and proprioceptive
sounds can reflect the nature of the horrifying sensation (gasp, scream, player damage).
However, because the complete horror response is not solely an innate, spontaneous reaction,
we must continue along the line of fright (Massumi, 2005) where the initial reflexive function
is gradually replaced by higher level appraisal as the individual moves out of the circa-strike
defence state by increasing the PD between themselves and the threat (e.g. running away
from threat to increase physical distance, placing barriers between self and threat to increase
temporal distance, etc.). The horrifying stimulus can then be more comprehensively evaluated
as the individual reverts to either a terror or caution state (or is relieved from the fear
sensation entirely) depending upon the new circumstance. Within this transition between
circa-strike and resolution states, sounds within the immediate environment may act as
calmers, signifying increases in PD (decreased volume of the threat giving chase to signify
you are successfully evading, the slam and locking sounds of a heavy door that denote
increased safety as a result of an effectively positioned barrier, etc.).
Figure 3 consolidates the above theories to explicate the interactions between audio stimuli
and emotional response in a fear context. It is suggested that perceived characteristics of a
sound determine the processing pathway and that appraisals of stimuli have the potential to
influence the perception of subsequent audio input; in certain cases, conceptually priming the
individual for appropriate action in response to possible, high intensity stimuli. In addition to
the three states documented by Fanselow (1994) a relief state (referred to as safe) is added to
create a more complete set of fear states. Comparative analysis of the four alternative fear-arousal states reveals increasing autonomic processing and limited cognitive appraisal in
response to greater perceived intensity (relating back to figure 1). Listening function reflects
this assumption; aligning critical and evaluative functions to cognitive appraisal whilst
immediate and reflexive listening is allied with an autonomic response.
Figure 3: Classification of game sound within a fear scenario
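The classification discussed above lends itself to a simple lookup structure. The following is a minimal sketch only, assuming Python as the notation; the names FearState, StageProfile, FEAR_FRAMEWORK and stages_for_player_sound, the qualitative distance labels and the 0.5 intensity threshold are all illustrative assumptions rather than part of the framework itself. Filing calmers under the safe state is likewise an interpretation, as the text positions them within the transition out of circa-strike.

```python
from dataclasses import dataclass
from enum import Enum


class FearState(Enum):
    SAFE = "safe"                      # relief state added to Fanselow's three
    PRE_ENCOUNTER = "pre-encounter"    # caution
    POST_ENCOUNTER = "post-encounter"  # terror
    CIRCA_STRIKE = "circa-strike"      # horror


@dataclass(frozen=True)
class StageProfile:
    psychological_distance: str  # qualitative label (assumed wording)
    sound_classes: tuple         # sound classes the text associates with the stage
    listening_modes: tuple       # listening functions the text expects


# Entries paraphrase the discussion above; listening modes for SAFE are left
# empty because the text does not specify them.
FEAR_FRAMEWORK = {
    FearState.SAFE: StageProfile("increasing", ("calmer",), ()),
    FearState.PRE_ENCOUNTER: StageProfile(
        "high",
        ("archetypal sound", "soundmark", "keynote"),
        ("functional", "semantic", "reduced", "critical"),
    ),
    FearState.POST_ENCOUNTER: StageProfile(
        "decreased",
        ("signal sound", "retainer"),
        ("causal", "empathetic", "semantic", "functional"),
    ),
    FearState.CIRCA_STRIKE: StageProfile(
        "lowest",
        ("attractor",),
        ("reflexive", "connotative"),
    ),
}


def stages_for_player_sound(intensity, threshold=0.5):
    """Candidate stages for a kinediegetic or proprioceptive sound.

    intensity is a normalised 0..1 value; the threshold is a hypothetical
    cut-off between 'low' and 'high' intensity variations.
    """
    if intensity < threshold:
        return (FearState.PRE_ENCOUNTER,)
    return (FearState.POST_ENCOUNTER, FearState.CIRCA_STRIKE)
```

A light lantern rattle might therefore be queried as stages_for_player_sound(0.2), whilst a player scream at 0.9 would return both of the higher-intensity stages, leaving gameplay context to discriminate between them.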
Figure 4 extends the framework documented above to produce a visual representation of a
theoretical structure that elucidates the interrelations between individual player, the virtual
game world and the environment of reality. The purpose of this framework is to display the
complex interactions and processes that occur during gameplay which could then be
exploited to manipulate a player’s fear response through informed audio design.
Figure 4: Interactions between virtual and real acoustic ecologies in a gameplay context
Ultimately, this framework could potentially act as the foundation for an automated fear
evaluative audio system, capable of combining data from the game engine (acoustic
properties, intended perceptual quality, player action, gameplay contextualisation, etc.) with
real-time biometric input to effectively evaluate the fear evoking potential of sound
(individual or collective) with relevance to the player. Such a system could potentially
generate an understanding of an individual’s personal fear profile, which could then be used to alter the
properties of audio featuring later in the game, maximising the fear experience.
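As a hedged illustration of how such a personal fear profile might be accumulated, the sketch below pairs descriptors supplied by the game engine with the change in a biometric arousal reading around each sound event. It assumes a normalised arousal value is already available from the biometric hardware; SoundEvent, FearProfile, the tag vocabulary and the averaging scheme are hypothetical and are not drawn from any existing system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SoundEvent:
    name: str
    tags: tuple  # perceptual/contextual descriptors from the engine (hypothetical)


@dataclass
class FearProfile:
    """Running mean of arousal change per descriptor, for one player."""
    totals: dict = field(default_factory=lambda: defaultdict(float))
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def update(self, event, arousal_before, arousal_after):
        # Attribute the observed change in arousal to every tag on the event.
        delta = arousal_after - arousal_before
        for tag in event.tags:
            self.totals[tag] += delta
            self.counts[tag] += 1

    def score(self, tag):
        # Mean arousal change for a tag; 0.0 if the tag has not been observed.
        n = self.counts.get(tag, 0)
        return self.totals[tag] / n if n else 0.0

    def rank_tags(self):
        # Descriptors ordered from most to least arousing for this player.
        return sorted(self.counts, key=self.score, reverse=True)


profile = FearProfile()
profile.update(SoundEvent("distant_scream", ("distant", "vocal")), 0.2, 0.5)
profile.update(SoundEvent("door_slam", ("sudden",)), 0.3, 0.4)
ranking = profile.rank_tags()
```

Later audio could then be selected or processed to favour the highest-ranked descriptors, although a simple mean of this kind ignores habituation and gameplay context, both of which the framework treats as significant.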
AN EMBODIED VIRTUAL ACOUSTIC ECOLOGY
In this section, the advantages of embodied cognition (EC) theory in advancing the
understanding of game sound, particularly within an interactive and ecological context are
examined. The final framework, visualising an embodied virtual acoustic ecology (eVAE) is
also presented to better assert the advantages of EC theory and elucidate the interactions
between player, environment and computer video game with a focus upon soundscape
variations and brain-processes. Chapter 4 documented several auditory phenomena (largely
taken from Augoyard & Torgue [2005]) that included anamnesis, narrowing and the Lombard
effect. Within this section it is asserted that if we are to acknowledge the existence of such
effects, it is logical to consequently assume that auditory processing is an embodied event,
dependent upon the relationship between physical environment, memory and physiology.
Developing from R. Murray Schafer’s (1994) notion of acoustic ecology, Grimshaw and
Schott (2008) propose that, within a CVG context, there exists a virtual acoustic ecology
(VAE) that combines the player, game soundscape (derived from the game’s engine) and
environment (known as the resonating space) as an integrated system. Whilst this framework
arguably incorporates EC theory, the VAE construct could be cross-examined with the
exploration into EC documented earlier within this chapter in an effort to develop a more
fully embodied virtual acoustic ecology framework (eVAE).
Figure 5 visualises a potential procedural chain to better elucidate the looping mechanisms
and inter-relating variables that impact upon our perception of game sound within an
embodied framework. Critical elements of the VAE construct remain (such as soundscape,
resonating space, sound functionality and perceptual factors) but specific constructs within the
player are now presented that suggest the functionality of EC. At the origin stage,
soundwaves are acknowledged to be resultant from a complex matrix of historical and
circumstantial factors (asserting that the sound is not only dependent upon the here and now,
but also a highly complex chain of past events that have led, by way of causality, to the
present) but, irrespective of this stage, the resultant wave can always be reduced to velocity,
waveform amplitude and cycle frequency. Resonating spaces are asserted as key determiners
of the here and now of EC theory, in that the physical makeup of the environment may
(through only minor perturbations in signal processing) dramatically alter the perceptual data
extracted during cognition.
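The reduction of a wave to velocity, amplitude and frequency, and the sensitivity of that wave to its environment, can be illustrated with a short worked example. The sketch below uses the common linear approximation for the speed of sound in dry air (roughly 331.3 m/s at 0 °C, rising by about 0.606 m/s per degree Celsius) and the relation wavelength = velocity / frequency; the function names are illustrative only, and the example considers air temperature alone, one of many environmental variables.

```python
def speed_of_sound(temp_c):
    """Approximate speed of sound in dry air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c


def wavelength(frequency_hz, temp_c):
    """Wavelength in metres of a tone of the given frequency at temp_c."""
    return speed_of_sound(temp_c) / frequency_hz


# The same 440 Hz source occupies a slightly different physical wavelength
# in a cold space than in a warm one.
cold = wavelength(440.0, 10.0)
warm = wavelength(440.0, 30.0)
```

The difference is small, yet it alters how the wave interacts with reflecting surfaces of fixed size, which is one way in which minor environmental perturbations may change the signal that ultimately reaches the listener.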
The dynamic nature of resonating spaces further accommodates the notion of real-time
cognition as changes within the physical environment (shifting temperatures, position/density
of reflecting surfaces, new materials entering/leaving the resonant space, etc.) have
significant potential for signal attenuation/amplification, meaning that no two sonic
waveforms should have precisely the same acoustic data outside of a heavily controlled
environment. The internal system map displayed here acknowledges the embodiment theory
that the central processing unit (CPU) is continuously affected by incoming sensory data, the
physiology and the long-term memory of the listener. The term Black Box alludes to the
limitations of this mapping in that the actual process of converting neural input signals into
output impulses (that drive both external action and internal looping systems) remains
unknown. One immediate application of this visualisation is derived from its highlighting of
key points within the listening process that a designer could focus upon in an attempt to
artificially replicate a desired sonic perception. The most apparent eVAE framework element
to replicate/synthesise would arguably be the soundwave data (the acoustical information that
constitutes a complete sound) and this is certainly a common choice within game sound
design.
Figure 5: A modified Virtual Acoustic Ecology implementing key Embodied Cognition theory
The line of sound detailed within this framework has interestingly coincided with the
developmental approach to game sound design. 8-bit era designers would focus upon
manipulation of the origin elements of the sound (plot/circumstance factors or graphics that
clearly signified the source of a sound, such as a whirling loop sound presented in tandem
with a recognisable UFO sprite, all revealed within an alien-themed game) due to the
limitations of waveform synthesis to replicate organic sources. As the game's medium shifted
from cartridge to compact disc, designers moved one step forward and attempted to create
convincing sonic illusions by focussing upon the soundwaves themselves. These sounds were
designed to clearly reflect their origins and the current environment, acknowledging the here
and now. In our current seventh generation of gaming, we are arguably now one step further,
experimenting with virtual acoustics as our technology enables us to artificially replicate
complex acoustic environments. The key characteristic of these artificial acoustic
environments is their capacity to function in real-time, acknowledging the time-pressured
view of EC theory. To explain, Wilson (2002) described embodied cognition as situated in
both space and time. Time-pressured cognition refers to the notion that ‘situated agents must
deal with the constraints of [real-time]’. Referring back to the current FPS game engines,
audio technology enables sonic landscapes to act (and react) in real-time. For example, the
sound profile of an approaching monster (splashes, roars, grunts, etc.) in Amnesia: The Dark
Descent (Frictional, 2010) is augmented, in real-time, in response to interactions between the
player’s actions and physics parameters within the game. If the player moves quickly from
the monster, overall volume decreases whilst ambient reverberation is increased (although the
sounds themselves otherwise continue), signifying to the player that, although the monster is
still in pursuit, there is greater distance between them and that they consequently have more
time to escape. This progression suggests a general consensus between game sound
developers that an embodied approach is highly fruitful in increasing the experience of the
real during gameplay, as each step of progress has incorporated a key view of EC theory. The
embodied virtual acoustic ecology progresses further by considering the interactions between
both the virtual and actual resonating spaces and takes artificial manipulation right up to the
ear itself.
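The real-time behaviour described in the pursuit example (volume falling and reverberation rising as the player gains distance) can be sketched in a few lines. This is a minimal illustration under stated assumptions and not a description of how Amnesia: The Dark Descent or its engine actually implements the effect: the inverse-distance gain law is a standard acoustic approximation, whilst the linear wet-mix mapping, the 30 m maximum distance and all function names are hypothetical.

```python
import math


def distance_gain(distance_m, ref_distance_m=1.0):
    """Inverse-distance law: linear gain halves (about -6 dB) per doubling of distance."""
    d = max(distance_m, ref_distance_m)  # no boost inside the reference distance
    return ref_distance_m / d


def reverb_wet_mix(distance_m, max_distance_m=30.0):
    """Hypothetical linear mapping: proportion of reverberant signal rises with distance."""
    return min(max(distance_m / max_distance_m, 0.0), 1.0)


def monster_mix(distance_m):
    """Mix parameters for a pursuing threat at the given distance from the player."""
    gain = distance_gain(distance_m)
    return {
        "gain": gain,
        "gain_db": 20.0 * math.log10(gain),
        "wet": reverb_wet_mix(distance_m),
    }


near = monster_mix(2.0)
far = monster_mix(8.0)
```

Evaluated once per audio update against the current player-to-threat distance, the two values give the player continuous information about psychological (here, physical) distance without interrupting the source sounds themselves.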
The amalgamation of circumstances required to facilitate even a simple sound contains a
large number of elements, any of which could be artificially replicated to help the sound be
perceived as real. Take the specific sound wave generated from a gunshot as an example. Even before we
consider the environmental impact put upon the wave as it travels from the source to the ear,
it is true that such an event cannot happen without a complex set of requirements being met.
There needs to be a gun, a bullet, a shooter and a target. There must be a motive, driven by
incentive and/or disincentive, which itself requires a complex arrangement of entities,
associations and processes. Early game developers lacked the technology to artificially
replicate a believable gunshot sound wave, but they could replicate the circumstances leading
to that sound, artificially replicating the shooter as a player avatar, the target/weapon as a
sprite graphic while the motive was established via plot or simply the player’s awareness that
“this is a game and it is my job to shoot things”. These techniques presented the player with
an associative dataset that, when combined with the soundwave data, could manifest a
perception of the sound as real.
Currently, most of these methods could be described as non-invasive, in that they only
replicate a segment or segments of the data processes that occur externally to the human
body. In relation to this, the embodied virtual acoustic ecology diagram reveals the possibility
that, unless we push deeper, into the brain itself, we may have almost reached the
limit of how immersive and believable we can make sonic environments. If it were
possible to replicate either input impulses (converted from sensory data) or the output neural
impulses (converted from input signals via the Black Box), it could essentially short-circuit
the framework, enabling the internal loop to function without actual sensory input. One
important question to consider is which neural impulse node (in or out) should be replicated?
The answer to this question could be dependent upon the comparative difficulty of
distinguishing I/O signals from electrical noise.
CONCLUSIONS AND CHAPTER SUMMARY
The hypothetical frameworks presented within this chapter take influence from all the
information already documented earlier within the thesis to ensure that their structure and
properties are well-supported, although it is acknowledged that further testing is necessary to
confirm absolutely their correctness.
The central hypothesis, relevant to all of these frameworks, is that an awareness of a concrete
psychological process that determines user-response to auditory stimuli would provide a great
advantage over improvisatory sound design, in computer video game contexts and beyond.
Understanding how the various influences of our embodied experiences cause our unique
perceptions of sound could ultimately be used to accurately predict perception from
controlled sound characteristics. These frameworks deliberately attempt to arrange the
complex notions of audio perception and emotion processing as structured blueprints,
characteristically similar to computer language. The intention is to create a representation of
these processes that is compatible with electronic code in a way that would enable a software
program to automatically predict perception outcomes from a programmed artificial
awareness of the sound’s properties and contextual information. From this, the software could
actively manipulate the emotion outcome by controlling the sound output within the game
engine, then evaluate the effectiveness of the sound via biometric feedback, enabling the
system to self-customise in response to each unique player.
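The manipulate-then-evaluate cycle outlined above resembles a simple feedback controller. The following is a minimal sketch, assuming a single normalised arousal reading from biometric hardware and a single abstract sound intensity setting exposed by the game engine; AdaptiveSoundController, its parameter names and the proportional update rule are illustrative assumptions, not a proposed implementation.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSoundController:
    target_arousal: float       # desired normalised arousal (0..1)
    intensity: float = 0.5      # current sound intensity setting (0..1)
    learning_rate: float = 0.5  # hypothetical step size

    def step(self, measured_arousal):
        """Nudge the intensity setting towards the target arousal.

        If the player is less aroused than intended, intensity rises;
        if more aroused, it falls. The result is clamped to 0..1.
        """
        error = self.target_arousal - measured_arousal
        adjusted = self.intensity + self.learning_rate * error
        self.intensity = min(1.0, max(0.0, adjusted))
        return self.intensity


controller = AdaptiveSoundController(target_arousal=0.7)
first = controller.step(0.5)   # under-aroused: intensity is raised
second = controller.step(0.9)  # over-aroused: intensity is lowered
```

The clamp also provides a natural place to enforce an upper limit on intensity, should a designer wish to prevent the system from escalating indefinitely with an unresponsive player.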
Chapter 9
Conclusions and Future Work
Garner, Tom A.
University of Aalborg
2012
Chapter 9: Conclusions and
Future Work
INTRODUCTION
This closing chapter revisits the discussions, frameworks and experiments of the thesis to
present a final summary of the contributions achieved and concluding arguments that will
inform future study. An outline of further research that is expected to continue on from this
thesis is also documented as is a brief comparative analysis of two consumer EEG devices
that might be used for such research.
SUMMARY OF PhD PROGRAMME
Understanding emotionality is a crucial aspect of human-computer interaction and sound is a
critical component to consider when developing emotionality as it is directly associated with
a user’s experience of emotion (Alves & Roque, 2009). This thesis has documented
theoretical research and associated experimentation within the study of acoustics and fear.
The work produced was continuously framed within the context of computer video games.
The primary aim of the thesis was to collate literature from a range of disciplines to develop a
framework of acoustic ecology within the context of fear. The intention was to develop our
understanding of the role that sounds (excluding music and speech) play in eliciting fear during
gameplay and to provide quantifiable evidence to support the hypothesis that manipulation of
acoustic properties could affect the intensity of an individual’s fearful experience.
This thesis brought together core concepts of embodied cognition (Wilson, 2002), acoustic
ecology (Truax, 1978), virtual acoustic ecology (Grimshaw & Schott, 2008), computer video
game experience (Grimshaw, 2007) and fear processing theory (Massumi, 2005) to construct
three hypothetical frameworks: an acoustic ecology of fear (both within and outside of a
virtuality context), a model of virtual and natural acoustic ecology interactions and an
embodied virtual acoustic ecology model. Beginning with an overview of emotions, fear
conceptualisation and sound processing, the thesis examined the six main concepts of
embodied cognition (Wilson, 2002), thrownness, construal level theory and psychological
distance (Heidegger, 1927; Lieberman & Trope, 2008; Winograd & Flores, 1986). These
concepts were strongly advocated within the thesis and heavily influence the hypothetical
frameworks that could, in turn, provide the basis for a future research programme (outlined
below). Existing empirical and conceptual research concerning acoustic parameters, audio
classes and modes of listening was also amalgamated and refined within a survival horror
game context, and the thesis additionally includes a consolidation of literature relevant to internet-mediated experimentation.
Empirical investigation included several experiments measuring players’ experience of fear
by way of both innovative subjective analysis (real-time intensity vocalisation) and
quantitative biometrics. Obtained data revealed that changes in acoustic parameters of game
sound can have a significant impact upon the player’s emotional (fear) experience and both
empirical data and secondary research were amalgamated to produce a hypothetical process of
fear that was then re-contextualised into a gameplay-relevant acoustic ecology. The intention
of these hypothetical designs was primarily to present a transparent, testable framework of
audio perception and fear experience that would enable game designers to better understand
the emotional responses audiences would have to their sound, enabling them to create a
desired impact upon their audience more effectively and efficiently than improvisatory (trial
and error) processes.
The experiments presented within this thesis utilised many of the ideas taken from the
preceding literature review chapters to begin exploring the emotional qualities of acoustic
parameters. This work was preliminary (and also served to test the effectiveness of the
methodologies themselves) and future work is expected to provide more highly specified
detail, most specifically regarding the variances in fear elicitation that can be observed in response to a
comprehensive range of parameters within individual acoustic effects. Increased specification
could address the impact of particular parameter settings, for example: level of high-pass
filtering within reverberation, degree angle within localisation, and individual frequency
bands within equalisation. Such detail would enable the development of a comprehensive
understanding of the relationships that exist between quantitative acoustic manipulation and
subjective emotional experience within a computer video game context. This could
eventually lead to the creation of a concrete sound-emotion reference guide, enabling sound
designers to immediately discover the expected affective potential of a sound based on its
acoustic and contextual characteristics. Such a system would likely evolve over time as social
and cultural factors shift our affective perceptions; therefore the original frameworks
presented within this thesis would remain valuable as tools to update such a reference.
CONCLUSIONS AND RETROSPECTIVE EVALUATIONS: CHAPTER 2
Chapter 2 discusses the origins, definitions and perspectives concerning human emotions,
including an assessment regarding the value of developing a greater understanding of the
underlying processes and an evaluative overview of alternative classification techniques. An
argument correlating physiology and emotion is presented, followed by an approach to
emotions from within a computer video games context and finally, an introductory synopsis
of fear (and associated terminology).
A key assertion of Embodied Cognition theories referenced in this thesis is that emotions are
an integral part of human thought processing. They are posited as vital to communication,
decision-making, survival and reproduction. Consideration of emotions will arguably be of
great benefit to human-computer interaction development and could significantly improve the
power of educational tools/software. Emotions are inherently functional and dysfunction only
arises in response to sociocultural clashes where the rapidly changing, geographically bound
and contextually specific requirements of behaviour within relationships often demand that
we oppose our own natural impulses. Chapter 2 also provides a consolidation of various
theories of emotion, enabling efficient access to an array of theoretical standpoints. Theory
surrounding emotion study has developed significantly across time, with some past beliefs
expunged and replaced with heavily contrasting hypotheses. Currently, there remains an
absence of unanimity across both large and small populations, with disagreements present
at every scale from individuals to entire cultures/civilisations. An overview of literature concerning
emotions and neuroscience indicates that there is a great deal of information that suggests
involvement of various brain structures in emotion processing but the technology and
understanding is not yet capable of confirming such assertions or revealing the precise nature
of their involvement.
With regards to the development of artificial emotion recognition systems, biometric emotion
classification structures utilising dimensional models that differentiate emotional states via
trait descriptors (intensity, valence, dominance, etc.) are argued to be more appropriate than a
discrete model. Artificial emotion recognition should therefore not attempt to determine a
specific emotion (fear, joy, sadness, etc.) from biometrics alone, but instead gather a
dimensional profile and contextualise that information from environment/circumstance data
to finally infer the emotion. The dimensional model is also preferable because the question
will not always be ‘what emotion is being experienced?’ but instead ‘what is the nature of this
emotion and how is it changing within the same discrete experience?’ During survival horror
gameplay, an individual may be experiencing fear throughout, but the specifics are dynamic
and fluctuate throughout play. In this scenario, parameters such as valence, intensity and
dominance are of heightened importance.
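The two-step approach argued for above (gather a dimensional profile, then contextualise it to infer the emotion) can be expressed as a short sketch. It is offered under explicit assumptions: the value ranges, the 0.3 intensity threshold, the single threat_present context flag and the names DimensionalProfile, infer_emotion and describe_change are all hypothetical, and a practical system would draw upon far richer context data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DimensionalProfile:
    intensity: float  # 0..1, derived from biometrics
    valence: float    # -1 (negative) .. 1 (positive)
    dominance: float  # 0 (overwhelmed) .. 1 (in control)


def infer_emotion(profile, context):
    """Infer a discrete label only after contextualising the dimensional profile."""
    if profile.intensity < 0.3:  # hypothetical threshold
        return "calm"
    if profile.valence < 0:
        if context.get("threat_present"):
            return "fear"
        return "negative arousal (unclassified)"
    return "positive arousal (unclassified)"


def describe_change(previous, current):
    """How the emotion is changing within the same discrete experience."""
    return {
        "intensity": current.intensity - previous.intensity,
        "valence": current.valence - previous.valence,
        "dominance": current.dominance - previous.dominance,
    }


earlier = DimensionalProfile(intensity=0.5, valence=-0.4, dominance=0.4)
now = DimensionalProfile(intensity=0.8, valence=-0.6, dominance=0.2)
label = infer_emotion(now, {"threat_present": True})
change = describe_change(earlier, now)
```

Both readings in the usage example would be labelled fear within a threat context, yet the change record shows intensity rising and dominance falling, which is precisely the within-emotion variation that a discrete model would fail to capture.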
Because we no longer live in environments within which we must continually and literally
fight for our survival (hunting/gathering food, defending from predators, etc.), our evolution-defined emotional mechanisms are not exerted in the same ways and now fit into different,
sociocultural frameworks in which primal emotional experiences can be virtually experienced
in a recreational context. With regards to emotional experience within a CVG context, this
chapter distinguishes between artefact (A-), fiction (F-), gameplay (G-) and representative
(R-) emotions (Perron, 2004; Tan, 1996). Irrespective of whether the player’s emotional state
is gameplay, fiction, representation or artefact-orientated, game sound designers have to
remain aware of the interactive nature of computer video gameplay. Repeated playthroughs
(from small sections to replaying the entire game) will undoubtedly evoke differences in
emotional response. An additional related conclusion is that a greater consensus of emotion
terminology and theory within a CVG context is required to increase the pace of pragmatic
development.
With the focus switched towards fear and its subsidiaries, horror is determined to be a more
immediate, intense experience and more closely related to shock and disgust; it is revelatory.
In contrast, terror is measured and reflects suspense; it is anticipatory. Threat is a necessity
for fear elicitation, the significance of which will largely determine the intensity and
dominance of the affective response. Fear within a recreational context is positioned as a
desirable emotional experience and it is argued within various academic texts (documented
within Chapter 2) that the attraction of recreational fear lies in the potential to experience fear
and overcome it or relish the relief when the stimulus subsides. The presence of an intrinsic
aesthetic appreciation of the dark and macabre (the concept of the sublime) is also asserted and this
chapter closes with a model of the interactions between horror and terror within a survival
horror gameplay context, elucidating (theoretically) the way in which individual horror-events augment the terror quota over time.
CHAPTER 3
With an emphasis upon the nature and processes of fear, chapter 3 commences by suggesting
that current sound design practice within survival horror games is largely a creative and
improvisatory process, with designers instinctively crafting the soundscape from
subconscious influences within their own experience. In response to this, a comprehensive
understanding of the core processes that underpin a fear-related experience could provide a
more grounded, stable foundation of design upon which creative processes can add character,
artistry and aesthetic structure. It further states that (as also suggested in chapter 2)
understanding emotional experience is highly valuable in human-computer interaction
applications, primarily with regards to usability and decreasing user-frustration but also in the
development of more fluid and organic operations and workflows and increasing user
engagement and productivity levels.
Chapter 3 provides a thesis-contextualised definition of fear subcomponents; differentiating
horror, terror, suspense, shock, anxiety and disgust. One significant assertion raised here is
that there is a hard-wired core of emotional processing, formed and reinforced over many
years by way of evolutionary development that transcends sociocultural and individual
differences. Threat is concluded to be a necessity of fear and is determined by the nature of
potential loss (the most extreme loss being death of one’s self or loved ones).
With regards to CVG applications, genuine fear is posited as the instinctive response to the
perception of a threat as real; therefore, to evoke an authentic fear response, a survival horror
game must immerse the player within the game world and make the narrative and virtual
environment feel immediate and dominant. It is argued that immersion is intrinsically
connected to realism but not simply to the concept defined as an objective set of virtuality
characteristics, indistinguishable from reality. Immersion arguably also requires affective
realism (Hudlicka, 2009), both in terms of NPC communication reflecting reality and the
ways in which the game alters the player’s emotional state. Players need to experience
emotional investment in the narrative and feel genuine desire to explore, progress and
conquer the game both in a diegetic (rescue the hostages, discover the secret, save the world,
etc.) and extra-diegetic (complete all achievements, beat the game before friends, unlock new
gameplay modes/levels, etc.) sense.
An overarching assertion within this thesis is that approaches to immersion via affective
realism demand a foundational structure in much the same way as physical realism (for
example physics engines consist of an established set of mathematical rules to reliably
simulate gravity, collision, friction, etc.). Therefore, a structured framework of fear
processing could arguably support the creation of a realistic affective ecology that fulfils
player expectations without being mechanical or predictable. This system connects study
regarding psychophysiology/biometrics to the fear-related research, in that player biometrics
and contextualising scenario descriptors can arguably provide the input with which the
affective framework interprets player-emotional state and triggers an appropriate output
response within the game engine. From both an ethical and commercial perspective, it must
be acknowledged that an entirely genuine experience of fear would arguably negate the
positive affective potential in recreational terror and instead could potentially be disturbing
and deeply upsetting for the audience. Therefore, it is crucial that as designers explore new
approaches to increasing the fear-elicitation potential of their craft, they remain vigilant to the
dangers of making fear too real.
The closing sections of this chapter present a discussion regarding the acoustic and
psychoacoustic properties that may connect sound to fear. Within these sections, sound is
posited as a crucial element to consider in the designing of computer video games intended to
evoke emotional responses, due to the significant potential of sound to alter affective states.
Chapter 3 also presents a range of academic texts that support the potential of objective
acoustic parameters to effectively evoke emotional responses. Low pitches, rumbling timbres,
immediate attack times, gradual volume increases (connoting an approaching source),
distortions, dissonances, sharp tones and high contrast volumes are submitted as strong
potential candidates for fear elicitation. Localisation techniques are also asserted as providing
potential for fear elicitation, specifically exploiting surround sound systems to place sounds
behind the player or utilising reverberation and delay DSP to mask localisation, making the
source difficult to locate within virtual 3D space. In relation to horrific episodes, sounds that
signify disgusting events, if presented in a manner that also evokes shock/surprise, have
notable potential to generate an intensely horrific experience. Acoustic parameters associated
with disgust include low-pitch rumbling, guttural sounds that connote events such as
vomiting and prolonged, high-frequency, sharp tones analogous to fingernails on a
blackboard. Unexpected sonic events that break from an established pattern have been
associated with negative valence and therefore are suggested as an additional approach to
survival horror sound design, both relevant to musical composition and sound effects. Rapid
onset/offset of sounds is concomitant with perceived urgency and, although not expected to
single-handedly evoke a fear response, the potential to influence the intensity of the
experience is noteworthy. Sound design in conjunction with ambient soundscapes and with
game visuals presents additional opportunity for systematic fear elicitation, specifically
utilising acousmatic sound (sound in the absence of a visible source) that conjures feelings of
de-familiarisation and the uncanny. Likewise, manipulating the ambient sonic background
provides opportunity to mask localisation, create composite dissonance or establish an
extended period of silence with which to create a jarring shock. Increasing tempo is
connected with the notion of entrainment; specifically, gradual escalations of tempo are
suggested to increase heart and respiration rates whilst also being representative of increasing
predator speed and accelerating urgency.
It is advocated that preparation of a player’s affective state prior to the stimulus is one
of the most powerful approaches to generating an intense experience. Essentially, the player’s
state of mind before the stimulus is vital. This approach strongly reflects the definition of
terror outlined within the thesis and suggests that shock-horror events may not necessarily be
ineffective (or even cheap and amusing) if they are a component of a well-structured
overarching terror scene.
CHAPTER 4
Chapter 4 discusses the concepts and frameworks associated with what it means for a sound
to possess virtuality. Also incorporated is a detailed account of embodied cognition (EC)
theory that is integrated into concepts of virtuality and acoustic ecology. The term virtuality is
differentiated from virtual reality; the former describing a continuum between real and
artificial whilst the latter is an electronic emulation of reality. Chapter 4 discusses the notion
of perceived truths, suggesting that the inescapable bias intrinsically tied to our view of
existence removes the capacity for truly objective perception. We therefore all possess our
own unique, virtual representation of reality without ever experiencing it completely. It is
further suggested that, irrespective of the sonic characteristics that determine a sound to be
virtual, all sound is propagated within real acoustic space between speakers and the ear and is
therefore susceptible to real acoustic treatment and consequently it is asserted that no sound
can be classified as entirely virtual.
Although a typical commercial computer video game can only express its virtual world by
way of two (or sometimes three) sensory modalities, this thesis argues that the nature of
human perception can be exploited to circumvent this problem. Intense and immediate
stimuli can dominate the senses and focus attention; therefore if a computer video game could
evoke realistic representations of sonic environments and maintain player attention via
sensory dominance, the complete experience could potentially be perceived as real.
The discussion in chapter 4 asserts that immersive and believable game sound is essential to
the gaming experience due to this limited number of sensory modalities directly associated
with gameplay (sound, vision and sometimes touch). The arguably gross/generalised
employment of tactile stimulation alongside an absence of olfactory and gustatory stimuli
within mainstream games results in a sparse virtual environment and therefore there is a
demand for sound and vision to dominate the gameplay experience and also represent the
missing sensory modalities (for example, sounds of coughs and holding of breath in response
to disgusting smell, representing olfactory sense; reloading earcon heard when ammunition is
collected, reflecting tactile modality).
Later sections within this chapter return to the concept of unique realities between individuals
and emotion is positioned as both a governing factor in human attention (which elements of
the shared environment are noticed and if/how much information is extracted from them) and
in how obtained sensory data are processed. Consequently, if we are to assume that a truly
objective perception of existence is impossible then emotions could be, in a sense, our reality.
Chapter 4 also presents in-depth discussions regarding acoustic ecologies (AE) that support
the virtual acoustic ecology (VAE) concept and relate to the eVAE construct, an embodied
virtual acoustic ecology that visualises the VAE concept within an embodied cognition
framework (presented in chapter 8). The assertion that emotional frameworks are essential in
development of artificial intelligence systems is maintained, positing that AI should be able to
reflect and reflex as well as compute.
Lucid dream states are discussed, raising interesting questions regarding whether truly
internalised processing is possible. This section ultimately concludes that, whilst the sensory
data experienced within a dream state are internally generated, such stimuli consist of
memories whose origins inexorably lead back to the environment and experiences in waking
life. In addition, thought processing during dream states nevertheless remains susceptible to
particular, immediate environmental and physiological factors that include: environment and
internal body temperatures, light penetrating eyelids, current illnesses, various potential
tactile inputs and, most notably, sound. Chapter 4 refers to Augoyard and Torgue (2005) in
suggesting that various documented auditory phenomena support the concept of an embodied
listening process. A complex interrelationship is revealed as sound is shown to influence
thought processes (anamnesis, narrowing) whilst the mind appears capable of generating
phantom sounds (phonomnesis, remanence), creating a perceptual continuation of sounds that
have ceased or even a sound that (within the current space-time) never existed. As a result of
this, it is concluded that the process of listening cannot ever entirely deconstruct a sound into
objective acoustic data and subjective perceptual information.
Such concepts relating to EC theory dictate that human perception frameworks (both relevant
to fear and sound processing) must acknowledge time pressure, situation/context, local
environment, hypotheticality and relevance as key variables within a thought-processing
model. For the purposes of the thesis, such variables are classified under the blanket term
contextualisation, which here refers to all elements relevant to the sound, beyond the
quantitative acoustic properties of the sound itself. This includes EC (all factors associated
with the here and now) variables and also past-based factors that include: pre-scene memories
(dependent upon recollection filter - memories relevant to current scenario are more likely to
be recollected), episodes within overall scenario experienced prior to current event,
awareness of prior physiological state (providing reference to compare against new state) and
inferred history associated with current stimulus. Chapter 4 presents a table consolidating
several variables that may contribute to a sound being perceived as real or virtual; these
comprise: natural (human voice) against artificial (synthetic sine wave) origin, causal
(gunshot sound reflects hammer hit on shell casing and gunpowder explosion) against
symbolic (human voice shouting bang! reflects gunshot) representation, immediate versus
delayed temporality (for example, synchrony of voice to mouth movement), natural versus
artificial (electric amplification, speakers, etc.) propagation, live against recorded
presentation, analogue against digital signal processing, and visible versus hidden
(acousmatic) source.
It is also suggested that one notable way in which these numerous variables can potentially
alter output thoughts and behaviours in response to audio stimuli is by way of listening
modes. It is suggested that the amalgamation of the acoustic nature of a sound and the
contextualisation factors of the scenario determines the way in which we hear and,
consequently, what information we infer from the sound(s). For example, consider a
screeching of car tyres and prolonged horn sounds heard within two scenarios: whilst
crossing the road and whilst sitting on a bench nearby. In the first circumstance, the sounds
present a high intensity (determined by close proximity) and knowledge of the current
situation (awareness of scenario as personally relevant and the source as immediate) implies
imminent danger. The consequent response is expected to be reflexive listening, supporting
an immediate autonomic response and subsequent evasive behaviour. The second
circumstance presents a less intense and less dominant sound and the individual’s awareness
of their current circumstance and environment does not denote impending danger. As a result,
connotative and empathetic listening modes are more probable as the listener appraises the
source, assesses the scene and considers the experience for the person who may be in danger.
CHAPTER 5
This chapter consolidates a range of literature to deliberate the applications and limitations of
electrodermal activity (EDA), electroencephalography (EEG) and electromyography (EMG)
both in general and thesis-specific contexts. The various definitions of psychophysiology and
biometrics are addressed and both are discussed within the milieus of emotion, sound and
computer video games.
Psychophysiology is the study of observable behaviours in living human organisms to enable
a better understanding of the relationships that exist between psychology and physiology.
Non-invasive experimental procedures are characteristic of this study. Biometrics as a term is
utilised in a way that deviates from tradition; within this thesis it refers to discrete
physiological measures that indicate psychological (specifically affective) activity. The
original definition bears some similarities, namely that both utilise physiology for
identification and classification, but the traditional application of security is replaced with
communication and recreation. Chapter 5 also asserts that whilst conception of the origins of
emotion is crucial for a complete understanding of the workings of the mind it is arguably
less than essential for certain psychophysiological investigations. With an intention to infer
affective state changes from physiology, it is arguably immaterial as to the chronology,
provided the relationship is accurate. In a more general sense, biometrics provides several
distinct advantages over subjective data collection, namely: overcoming participant difficulty
in affect-related self-analysis, circumventing false response or intentional
repression/accentuation and accurate identification of minute state changes. Biometric
approaches to user feedback are posited to be invaluable to usability and user experience
testing, chiefly due to the advantages detailed above. The current capabilities regarding
eye-tracking technology are also discussed and it is concluded that this biometric may hold much
promise as a supportive indicator of the event/entity that has caused a physiological state
change.
Establishing a clear definition of electrodermal activity is undertaken within this chapter,
positioning EDA as a blanket term under which the hardware configuration, temporal
characteristics of the recording and output measurement determine the subtype. Skin
conductance response (the type of interest within the thesis) employs a non-invasive,
exosomatic hardware configuration, assesses epochs of data synchronised to short-term
events and measures electrical conductance (as opposed to impedance, resistance or
admittance). The biological connections of the human nervous system reveal associations
between various brain structures and EDA. The presented review of relevant literature
strongly advocates EDA primarily as an effective measure of arousal, a crucial axis of a
dimensional model of affect. Notable advantages of EDA include: affordability and low
running costs, ease and non-invasiveness of application, freedom of movement for
participants during testing, relatively noise-resistant output signal and accuracy of
measurement. A temporal resolution of 1-4 seconds is relatively low compared to other
biometrics; however, for the purposes of emotion assessment in response to game events,
EDA could be considered more than capable of identifying individual events provided the
game utilised adequate pacing. Such scenarios highlight the value of eye-tracking technology,
suggesting that the significantly greater temporal resolution (and ability to differentiate
multiple events occurring simultaneously) would present a suitable solution.
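As a rough illustration of the event-synchronised epoch approach described above, the following Python sketch estimates a skin conductance response as the peak rise over a short pre-event baseline. The function name, the baseline length and the four-second window are assumptions made for illustration only; they are not parameters taken from the experiments reported in this thesis.

```python
def scr_amplitude(conductance, sample_rate_hz, event_index,
                  baseline_s=1.0, window_s=4.0):
    """Peak conductance rise after an event, relative to the mean of a
    short pre-event baseline. Expects a plain list of conductance samples.
    All default values are illustrative assumptions."""
    base_n = int(baseline_s * sample_rate_hz)
    win_n = int(window_s * sample_rate_hz)
    start = max(0, event_index - base_n)
    baseline = conductance[start:event_index]
    epoch = conductance[event_index:event_index + win_n]
    if not baseline or not epoch:
        raise ValueError("event too close to the edge of the recording")
    baseline_mean = sum(baseline) / len(baseline)
    # A fall below baseline is not treated as a response, so clamp at zero.
    return max(0.0, max(epoch) - baseline_mean)
```

Because the window spans several seconds, two game events that fall inside the same window cannot be separated by this measure alone, which is the pacing constraint noted above.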
Within chapter 5, several recent hardware developments (in addition to eye-tracking) are
documented that provide usable solutions to several of the typical limitations of EDA
measurement. Wireless systems give users freedom from the confines of a testing
environment and support ecological validity by way of facilitating use within participants’
own homes. Wrist-band and fingerless glove supports for EDA sensors lift the restrictions
upon arm, hand and finger movement during testing to reduce distraction and obstruction in
tasks that require such movements. Dry sensor technology increases comfort for the user and
also diminishes both the risk of allergic reactions and the time required to apply/remove the
equipment. Such hardware setups are strongly advocated for use in CVG emotion-related
testing, primarily because of the inherent value in maintaining ecological validity (allowing
participants to play in their own living environments so data more accurately reflect real
scenarios) as it is asserted that player emotions are highly susceptible to lab-based biases
(researcher presence, social pressure/expectation, etc.). In addition, sensor setups that allow
total freedom of movement allow players to engage with control interfaces naturally, a
significant plus when considering the obstructive nature of standard finger-attached, wired
setups. It is concluded that the connection between arousal and EDA is established and
reliable yet, as a standalone data source, it is arguably inappropriate to infer any additional
psychological characteristics without additional biometrics and/or contextualisation
information.
The electromyography subtype relevant to the thesis is identified as facial electromyography
(fEMG), itself a subclass of surface electromyography (sEMG); fEMG is referred
to throughout simply as EMG for clarity and efficiency. A considerable body of research identifies
EMG as a reliable measure of hedonic valence that, alongside EDA, facilitates construction
of a two-dimensional model of affect measurement. In addition to this, EMG shares many
advantages with EDA (non-invasive, affordable, high accuracy levels at minute discrepancies,
minimal restriction of movement) and, in addition, provides an extremely high temporal
resolution. Noise susceptibility and difficulty of accurate application are acknowledged as
potential pitfalls but overall it is stated that, provided necessary care is taken, EMG is
arguably an efficient and robust biometric.
Under consideration, and within the context of emotion recognition, biometric measures are
largely viewed as effective approaches to differentiating both discrete emotion classifications
and emotion characteristics (valence, intensity, dominance) provided two criteria are met. The
first is that several complementary biometrics are used simultaneously to address the intrinsic
limitations that exist within them (for example, EMG addresses the temporal limitations of
EDA, whilst EDA solves the noise susceptibility of EMG). The second is that a
comprehensive collection of contextual and situational information is collected, both from the
participant directly via qualitative assessment and from pre-established descriptors relating to
the scene, atmosphere, and motivations, for example, to bridge the gap between concrete
observations and abstract concepts.
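The combination of an arousal measure (EDA) with a valence measure (EMG) can be sketched as a simple quadrant lookup. The example below assumes both values have already been normalised to the range -1 to 1; the normalisation step, the function name and the four labels are illustrative assumptions and do not represent a validated taxonomy.

```python
def circumplex_quadrant(arousal, valence):
    """Map normalised arousal and valence (each in [-1, 1]) to a coarse
    quadrant label. Labels are illustrative, not a validated taxonomy."""
    if not (-1.0 <= arousal <= 1.0 and -1.0 <= valence <= 1.0):
        raise ValueError("arousal and valence must lie in [-1, 1]")
    if arousal >= 0:
        return "excited" if valence >= 0 else "distressed"
    return "calm" if valence >= 0 else "bored"
```

A lookup of this kind captures only the first criterion. The contextual descriptors required by the second criterion would need to be supplied separately before any discrete emotion label could be inferred.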
With relevance to automated identification of fear-related affective states, EMG is positioned
as a highly suitable component of such a system, due to activity in the corrugator supercilii
(frown muscle) being consistently associated with negative affect throughout the literature. It
is also asserted that the temporal characteristics of corrugator activity may also support
differentiation of various negative states within the fear spectrum. For example, an immediate
and quickly dissipating spike of EMG may indicate shock in response to a horrific event
whilst more frequent, and sustained, low-intensity rises in activity could arguably be better
attributed to terror. Electrodermal activity has been shown to increase in response to positive
stimuli in a comparable manner to that of negative stimuli, with no clear statistical process
capable of reliably distinguishing between positive and negative arousal experiences.
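The suggested temporal distinction between a brief spike and a sustained rise in corrugator activity could be approximated as follows. This is a deliberately minimal sketch: it considers only the duration of the longest run above a threshold and ignores intensity and frequency, and the threshold, the one-second cut-off and the labels are hypothetical values chosen for illustration.

```python
def classify_corrugator_pattern(emg, sample_rate_hz, threshold,
                                max_spike_s=1.0):
    """Label a rectified, baseline-corrected corrugator EMG epoch.

    Returns "shock-like" when activity exceeds `threshold` only in runs
    no longer than `max_spike_s`, "terror-like" when at least one run
    lasts longer, and "none" when the threshold is never exceeded.
    Threshold and duration values are illustrative assumptions.
    """
    longest = current = 0
    for sample in emg:
        # Count consecutive samples above threshold; reset on any drop.
        current = current + 1 if sample > threshold else 0
        longest = max(longest, current)
    if longest == 0:
        return "none"
    duration_s = longest / sample_rate_hz
    return "shock-like" if duration_s <= max_spike_s else "terror-like"
```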
This chapter (and indeed the thesis) does not attempt to advocate an ideal model for
dimensional affect measurement. Instead the most common approach (the circumplex model)
is evaluated and its variations are considered. Whilst ambivalence is assuredly a concern
when attempting to assess affective states by way of a dimensional model, it could be
suggested that the nature of EMG somewhat limits our capacity to measure positive and
negative affect as separable components, primarily due to the absence of an equally reliable
facial indicator of positive valence (zygomatic activation has been associated with positive
affect but also with neutral and negative valence). Whilst it has been claimed that relaxation of the
corrugator muscle is itself an indication of positive affect, there does not appear to be a
substantial consensus regarding this contention.
Within chapter 5, a range of academic literature is also reviewed to establish the potential of
sound stimuli to alter affective states as measurable via biometrics. It is concluded that
various psychophysiological measures (EMG, EDA, EEG, heart rate) respond reliably to
particular sonic events, most notably: unexpected events, pattern deviations and a range of
certain acoustic parameters. The academic literature reviewed within chapter 10 identifies
dissonance, localisation and movement as potential acoustic parameters, variations in
which are expected to be discernible via biometrics. It is acknowledged that
sociocultural factors are likely to influence responses to auditory stimuli, particularly with
regards to sounds with significant connotative attributes. However, it is maintained that
certain acoustic and psychoacoustic characteristics evoke responses that originate from
evolutionary development and consequently transcend sociocultural differences. Furthermore,
the primal heritage of fear supports the argument that auditory stimuli that evoke this emotion
are more likely to overcome the sociocultural barrier than many other emotions.
This chapter also concludes that biometrics has great potential to support responsive and
engaging adaptive gameplay mechanics, particularly those related to emotional experience.
Biometrics is beginning to permeate most gaming platforms, from online social network
games to serious/educational virtual environments and has furthermore been posited as
a reliable indicator of game-specific emotional states such as immersion, flow, presence,
frustration, challenge and fun. Just as adaptive difficulty can increase or attenuate challenge
in response to gameplay parameters (health, shooting accuracy, completion time, etc.), an
adaptive fear system may alter the affective intensity in response to biometric parameters.
This would not only enable a game to avoid extreme emotional highs (anxiety, genuine upset)
and lows (boredom, disengagement), but also increase the replay value by varying the
experience to defy expectation and maintain a sense of the unknown in a game that has
already been completed by the player. It is acknowledged, however, that study concerning
biometrics in games applications is in its infancy and there is insufficient agreement between
researchers with regards to many of the connections between biometric data and gaming
experiences. Computer video games exist in a wide variety of highly discernible genres with
distinct game mechanics, motivations, atmospheres, perspectives, interfaces and interactions
(and this is certainly not an exhaustive list). Consequently, any associations that prove
reliable in one genre cannot be generalised across the multitude of genres or even between
individual titles within the same genre (assuming distinct variances in some of the categories
documented above are present). The implication for tests of game designs is that they must
possess transparent likenesses of the games or genres they represent in order that conclusions
can be justifiably extrapolated.
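The adaptive fear system described above is analogous to adaptive difficulty and can be sketched as a simple feedback rule. The example assumes a normalised arousal estimate between 0 and 1 derived from biometric input; the target band, the step size and the function name are hypothetical and would require empirical tuning for any particular game.

```python
def adapt_fear_intensity(intensity, arousal, low=0.3, high=0.8, step=0.1):
    """Nudge a game's fear intensity (0..1) so that measured arousal (0..1)
    stays inside a target band. Band limits and step are illustrative."""
    if arousal > high:
        intensity -= step   # player over-aroused: ease off
    elif arousal < low:
        intensity += step   # player under-aroused: escalate
    # Keep the result within the valid intensity range.
    return min(1.0, max(0.0, intensity))
```

Called once per game event, a rule of this kind avoids both extremes identified above: it attenuates intensity when arousal suggests genuine upset and escalates it when arousal suggests disengagement.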
In addition, chapter 5 provides a comprehensive review of electroencephalography (EEG),
discussing the relative advantages and limitations of this biometric within a computer video
games context. Within this chapter it is asserted that EEG possesses several advantages over
many of the alternative biometrics, including: high temporal resolution, portability,
affordability and ease of application. Whilst such advantages are also characteristic of EMG
and EDA, EEG additionally provides direct access to brain activity, as opposed to indirect
measures by way of an intermediate (muscle activity, cardiac activity, sweat secretion, etc.).
Whilst other biometric technology shares this benefit (such as functional magnetic resonance
imaging and positron emission tomography), EEG claims distinct advantages over such
equipment, namely: non-invasive hardware, no radiation or magnetic fields and equipment
that facilitates freedom of movement and activity during testing.
Limitations are also documented (limited access to lower brain structures, low spatial
resolution and signal-to-noise ratio); however, it is advocated that research communities (both
in games studies and other disciplines) remain optimistic that EEG will yield significant
results in the near future. Recent studies documented within this chapter reveal EEG
to have wide application, and several promising approaches to solving some of
electroencephalography’s limitations are also presented.
With regard to methodological options for EEG data collection, Laplacian, referential and
bipolar sensor montages are considered and it is concluded that referential is the most
appropriate setup, primarily in response to the observation that the majority of relevant
existing research utilises this montage and that the current generation of consumer-grade
EEG headsets all employ the referential montage as standard. A brief overview of EEG
feature extraction methods is presented but no in-depth comparison of the various alternatives
is offered, neither is there a detailed technical account of these methods. A brief comparative
analysis of fast Fourier transforms (FFT) and discrete wavelet transforms (DWT) is
documented, highlighting that researchers are arguably required to decide between ease of
use and efficiency of the former, and greater control of the latter. DWTs are concluded to be
preferable for the purposes of emotion feedback loops within a gameplay context.
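To make the Fourier side of this comparison concrete, the sketch below estimates the power of a signal within a frequency band using a discrete Fourier transform. It uses a naive transform written with the Python standard library for transparency; an FFT routine returns the same coefficients far more quickly. The function name and normalisation are assumptions for illustration, and the wavelet alternative is not shown.

```python
import cmath
import math

def band_power(signal, sample_rate_hz, low_hz, high_hz):
    """Mean squared DFT magnitude of `signal` over bins in [low_hz, high_hz].
    Naive O(n^2) DFT for clarity; normalisation is an illustrative choice."""
    n = len(signal)
    powers = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate_hz / n
        if low_hz <= freq <= high_hz:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            powers.append(abs(coeff) ** 2 / n ** 2)
    if not powers:
        raise ValueError("no DFT bins fall inside the requested band")
    return sum(powers) / len(powers)
```

The transform treats the whole epoch as stationary, so it reports how much energy a band contains but not when within the epoch that energy occurred. This is one reason a wavelet approach may suit a continuous emotion feedback loop better.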
Classification options are also discussed in which frontal asymmetry is proposed to be a
potential indicator of valence. Other features of EEG have also been connected to an
attention/concentration and relaxation continuum, a measure that could be related to arousal,
potentially facilitating an effective circumplex model of dimensional affect assessment by
way of a single biometric. In addition, a range of mathematical classification algorithms are
briefly evaluated to reveal further disagreement between researchers with regards to
classification methodologies, with new and updated approaches being published regularly. It
is concluded that personal preference holds some influence over which classification tool will
be utilised and it is therefore suggested that a new series of research should ideally
experiment with several alternatives by way of a preliminary study before committing to a
particular scheme.
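Frontal asymmetry is commonly expressed as the difference between the natural logarithms of alpha-band power at a right and a left frontal electrode. The sketch below computes that index under the assumption that alpha power has already been extracted for a homologous electrode pair; the function name is hypothetical, and any interpretation of the result in terms of valence remains a hypothesis rather than an established mapping.

```python
import math

def frontal_alpha_asymmetry(left_alpha_power, right_alpha_power):
    """Return ln(right) - ln(left) alpha power for a homologous frontal
    electrode pair. Inputs must be positive band-power estimates."""
    if left_alpha_power <= 0 or right_alpha_power <= 0:
        raise ValueError("band power must be positive")
    return math.log(right_alpha_power) - math.log(left_alpha_power)
```

An index of zero indicates symmetrical alpha power. The sign and size of any departure from zero would need to be validated against a chosen classification scheme in a preliminary study, as recommended above.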
It is concluded that the above arguments position EEG as a suitable biometric for continuing
study in this specific field. This section of chapter 5 is consequently an introductory academic
discussion to preface future empirical research that is discussed below, in the closing sections
of this conclusions chapter.
Overall, this chapter concludes that biometric data acquisition (particularly EMG, EEG and
EDA) has notable worth in game experience and emotion assessment. The multitude of
research papers documenting these measures in addition to a gradually increasing consensus
level (relating to both theoretical and methodological issues) arguably implies a significant
interest in the potential of biometrics within this field but also increasing faith that this
approach will yield significant results in the future.
CHAPTER 6
Chapter 6 presents the collective methodologies for the three preliminary experiments
undertaken during the course of this study. Whilst each individual experiment differed in data
collection and overall design, the purpose remained consistent: to discover the potential of
quantifiable acoustic parameters to evoke and/or modulate emotional (fear) experience. As
preliminary trials, they were not expected to reveal conclusive evidence but, instead, search
for circumstances in which manipulating sounds with digital signal processing effects
produced statistically significant differences in emotion measures when compared to
untreated controls. This in turn, would reveal if differences in game sound had any
discernible effect upon player-emotion as, if not, there would be little value to the
forthcoming hypothetical frameworks.
The opening section of chapter 6 establishes the potential value and relative limitations
associated with utilising the internet as a medium for academic experimentation. It is asserted
that internet-facilitated testing is affordable, efficient, supports multiple simultaneous users,
can reach participants across the globe, has potential for very large sample sizes, and supports
automation of data collation and filtering. Overall, the generally high level of accessibility
and novelty associated with web experiments generates significant participant interest to
further increase sample sizes. Utilising the internet also provides greater ecological validity
by way of allowing individuals to participate in experiments from their own homes. Carefully
crafted web experimentation can remove the need for the researcher’s physical presence
during testing, reducing researcher-based coercion and white-coat syndrome. As
internet-relevant technology develops, so too will the opportunities for web-based experimentation.
Progression within virtual reality design may ultimately facilitate highly immersive virtual
testing environments, greatly comparable to real equivalents and providing the high levels of
control afforded by traditional laboratory setups. Virtual web experiments could support
access to participants from around the world and present test environments that would be
either difficult or even impossible to obtain in reality.
There arguably remains a stigma against web-based research, placing such approaches
immediately on the back foot. Compatibility issues with experimentation materials (graphics,
sounds, interactive elements, embedded video, etc.) require careful consideration and the
complexities of internet and digital technology present a substantial challenge for researchers.
Alternative compression algorithms and connection types/speed in addition to compatibility
issues relating to hardware, browsers and third-party plugins (Flash, Java, etc.) inevitably
result in difficulty when attempting to create a uniform test environment. In many cases, even
the basic layout and colour scheme of a website can vary when viewed in alternative
browsers. Researcher absence limits experimental control further. The primary concern is
restricted regulation over intentional participant bias/deception, but there is also concern
regarding the lack of confirmation that the user understands the task/process and is willing to
participate throughout. Additional related problems include the risk of multiple or incomplete
submissions and several participants completing the same form and submitting responses as
an individual. It is also asserted that the nature of web experiments encourages participant
samples to be skewed towards a specific demographic and therefore obtained data lack
generalizability.
Ethical concerns connected to web experimentation chiefly involve participant anonymity
and data protection. This unfortunately creates a conflict between the demands of ethical
security and test reliability because the most effective approaches to participant control
(automated IP address collection, repeated personal information gathering, required username
and password creation, etc.) directly infringe upon security of personal information. Invasive
recruitment practices present an additional ethical concern and it is asserted that viral
advertising is arguably the most effective compromise solution, enabling interest in a web
experiment to spread organically without pressuring potential participants directly. Modern
social networking technologies are also advocated as potential approaches to soft, viral
advertising techniques within which word-of-mouth can spread between virtual networks and
links to the test can be shared to provide increased accessibility.
Following from the above discussion, chapter 6 presents the methodologies for the three
experiments, beginning with the internet-mediated trials. This experiment consisted of two
individual testing environments (the horror game sound designer [HGSD] and the sounds of
fear [SOF]), in which the affective potential of a range of quantitative DSP effects was
examined through contrasting approaches. Whilst HGSD utilised a greater degree of
interactivity, requiring participants to select sounds they felt most appropriately reflected the
accompanying visual material, SOF assessed users' affective responses more directly,
requesting that they provide 1-9 scale intensity and valence measures for each sound.
The second experiment utilised a bespoke game level built from the Crysis (Crytek, 2007)
game engine, incorporating treated and untreated sound groups into a playable game.
Participants provided data through post-questionnaire assessment but the key method of data
collection was real-time intensity (RTI) measures, a five-point scale that players would
vocalise when prompted during gameplay.
The third and final experiment took its influence from both the previous chapter on
biometrics and the prior two preliminary trials. A new test game was presented and
electromyography and electrodermal activity data, consolidated alongside subjective player
experience reports, allowed an assessment of affective responses to alternative DSP
treatments (pitch, periodicity, sharpness, attack time, localisation and signal to noise ratio)
comparable to those utilised within the earlier experiments.
CHAPTER 7
Chapter 7 documents the results, evaluative analyses and conclusions relating to the three
experiments, beginning with the internet-mediated test. With regard to post-experiment
evaluation, the majority of participants for this test noted that the minimalist visual design,
accompanied by large interactive icons/buttons and visual plus textual feedback cues,
created an accessible and transparent experience. Employment of a relatively modern and
certainly professional grade web-building tool (Dreamweaver CS4) ensured that all of the
embedded material and core programming language utilised within the website was
compatible with current web browser applications. Problems did arise with regard to third-party plugins, specifically embedded Flash animations that became incompatible with a new
version of a major web browser that had been released one month before testing began. This
gives an impression of the timeline by which web technology can change, with previous
incarnations becoming obsolete very quickly. Tutorial videos demonstrating how the
experiment worked proved an effective approach to presenting uniform instructions to
participants, although it was noted by several participants that a video recording of a visible
instructor speaking to the player (as opposed to a voice-over) would have improved clarity
further and increased participant comfort.
In terms of the valence-arousal model (VAM: a classification system mapping intensity and
positive/negative measures onto a two-dimensional plane) approach to participant response, it
is asserted that, whilst intensity and valence measures do elucidate affective differences
(particularly if both terms are clearly defined to the participant during briefing), there are no
explicit directions as to the discrete emotions being experienced and, as such, obtained data
are arguably limited in specificity. Additional data collection approaches embedded within
the test may present a possible solution; for example, requiring the participants to describe
their emotional state (either free text entry or multiple choice from a list of predefined
descriptors) following audition of each sound.
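The mapping of 1-9 intensity and valence ratings onto a two-dimensional plane can be illustrated with a minimal sketch. It assumes that low valence scores denote negative affect and that the scale midpoint (5) is neutral; the function names and quadrant labels are hypothetical and are not taken from the experiment software. Note that the output remains a region of the plane rather than a discrete emotion, which is precisely the limitation described above.

```python
from typing import Tuple

SCALE_MIN, SCALE_MAX = 1, 9             # 1-9 rating scale used in the SOF test
MIDPOINT = (SCALE_MIN + SCALE_MAX) / 2  # assumed neutral point (5.0)


def to_plane(valence: int, intensity: int) -> Tuple[float, float]:
    """Centre both ratings on the midpoint and rescale them to [-1, 1]."""
    for name, value in (("valence", valence), ("intensity", intensity)):
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"{name} must lie between {SCALE_MIN} and {SCALE_MAX}")
    half_range = (SCALE_MAX - SCALE_MIN) / 2
    return ((valence - MIDPOINT) / half_range,
            (intensity - MIDPOINT) / half_range)


def quadrant(valence: int, intensity: int) -> str:
    """Name the quadrant of the plane (hypothetical labels, not emotions)."""
    x, y = to_plane(valence, intensity)
    if x == 0 or y == 0:
        return "neutral axis"
    polarity = "positive" if x > 0 else "negative"
    level = "high-intensity" if y > 0 else "low-intensity"
    return f"{level} {polarity}"


print(quadrant(2, 8))
```

A rating of valence 2 and intensity 8 falls in the high-intensity negative quadrant, the region in which fear-related responses would be expected, although the model alone cannot distinguish fear from, for example, anger.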
Overall, the results obtained from the internet-mediated experiments suggest great potential
for web experiments, as the approach not only provided access to a significantly larger sample
size than would have been possible in a local study, but also presented a number of
interesting differences in affective potential between untreated and loudness-treated groups
and between different source sounds. The final proposition raised from this trial
is that, although the specific sonic characteristics that determine affective response
are as yet unknown, there is evidence to suggest that an auditory phenomenon that
transcends interpersonal differences really does exist.
The second experiment presented a preliminary assessment of fear-related affective intensity
in response to various DSP audio treatments based upon those employed within the web
experiments. Although some trends within the data were observable, there was no conclusive
evidence to support the hypothesis that loudness, localisation (via 3D positioning) or pitch
would have a significant impact upon subjective player affect responses. This is accounted
for in a subsequent discussion by a number of potential erroneous variables and an overall
lack of differentiation between the treatment groups. Future work is expected to exert greater
control over these variables and to experiment systematically with wider extremes of
treatment parameters.
The data collection approaches utilised for this experiment did, in contrast, prove to be highly
effective. Real-time intensity (RTI) readings, in which participants provided spoken intensity
statements (integers 1-5) when cued by an automated scripting program implemented by way
of the game engine, enabled subjective user feedback to be more accurately associated with
specific in-game events and sounds. Whilst the relatively narrow 1-5 RTI range was
employed to increase accessibility for the players, it could be asserted that such a limited
range of subjective responses could limit the power of any obtained data. In future testing a
1-9 range (as utilised within the SAM [self-assessment manikin] scales of valence,
intensity and dominance) could be more effective, as players would be given finer
resolution for differentiating stimuli. Automated event logging also showed notable potential as
an efficient approach to data collection and filtering. During testing, the coordinates of the
player's location, sum and duration of run function activations, and completion times were all
automatically collected and presented within an auto-generated text file. The scripting system
(the CryEngine Flow graph) supports additional mathematical functions that would enable
basic descriptive statistical analysis to be carried out automatically and in real time (although
such features were not used within this experiment). Data logging provides accurate
timestamps for any desired event or set of events, supporting simple yet effective correlation
against response data, an approach arguably more reliable and precise than manually
establishing event times from observation of gameplay videos.
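The correlation of timestamped event logs against RTI responses can be sketched as follows. This is an illustration only, not the Flow Graph script used in the experiment: the tuple layouts, the event names and the `max_lag` cut-off (the longest delay at which a spoken rating is still attributed to an event) are all assumptions.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Event = Tuple[float, str]   # (timestamp in seconds, logged event name)
Rating = Tuple[float, int]  # (timestamp in seconds, spoken RTI value 1-5)
Aligned = Tuple[Optional[str], int, Optional[float]]


def align_ratings(events: List[Event], ratings: List[Rating],
                  max_lag: float = 10.0) -> List[Aligned]:
    """Pair each rating with the most recent logged event preceding it.

    Ratings with no event in the preceding max_lag seconds are kept but
    paired with None, so that unmatched responses remain visible.
    """
    ordered = sorted(events)
    times = [t for t, _ in ordered]
    aligned: List[Aligned] = []
    for r_time, value in sorted(ratings):
        idx = bisect_right(times, r_time) - 1  # last event at or before rating
        if idx >= 0 and r_time - times[idx] <= max_lag:
            aligned.append((ordered[idx][1], value, r_time - times[idx]))
        else:
            aligned.append((None, value, None))
    return aligned


log = [(12.0, "door_slam"), (40.5, "whisper")]   # hypothetical event log
spoken = [(14.5, 4), (43.0, 2), (90.0, 1)]       # hypothetical RTI readings
for row in align_ratings(log, spoken):
    print(row)
```

Because both streams carry timestamps from the same clock, the pairing is deterministic and repeatable, which is the advantage over manually estimating event times from gameplay video.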
Player experience and confidence (PEC) levels are asserted to be critical variables that should
be controlled in order to yield valid results. It is suggested that the primary reason that PEC
has such influence over results is connected to coping; specifically that players with high
PEC ratings are fluent in FPS controls and avatar coordination whilst also having experience
of comparable aesthetic designs, atmospheres and scenarios. As a result, for players with a
high PEC rating, test games are unlikely to present surprising or unknown gaming territory,
affording them (utilising their knowledge of conventions and playing skill) a coping
confidence that negates the fear-evoking stimuli. Finally, completion time is also
acknowledged as a potential source of erroneous variation. Longer completion time
consistently leads to higher overall intensity responses, suggesting that as players struggle to
reach the level exit, increased frustration and anxiety provide a priming effect, accentuating
the evocative potential of subsequent audio stimuli.
The third experiment integrated psychophysiological measures, a new game level and several
additional DSP treatments. RTI collection was not employed but debrief questions were
presented to participants. Despite the suggestion that test games should ideally represent the
style/mechanics/interface/etc. of their associated genre (in this case, survival horror) several
aspects of the game's design were unique; most notably, the control interface that reduces
movement to forward, reverse and left/right rotation via the WASD keys, omitting the
characteristic mouse-to-look function. The intention was to limit erroneous gameplay
variation (primarily completion time) between participants with differing PEC levels.
Ultimately, it is concluded that this was a mistake as, although variations in completion time
and player-confidence in exploration and progression were attenuated, there remained distinct
variation between some participants. Future associated testing should arguably employ a
commonplace control interface and control for PEC through sample screening, the additional
logic behind this approach being that participants with very low PEC are not regular game
players or have little to no interest in games and therefore do not reflect the target consumer
group.
Other design choices that, whilst not necessarily representative of the genre, can be observed
in existing titles proved more successful. The omission of a heads-up display arguably
increased immersion and the absence of weapons/defence (initially) reduced coping ability.
Both EDA and EMG equipment met expectations set by the preceding literature review, in
terms of ease of application, consistent signal connection, minimal task distraction and user comfort. Unfortunately, EDA proved considerably susceptible to erroneous laboratory-based
effects, supporting the previous chapter's assertion that ecological validity is of prime
importance when measuring electrodermal activity.
Consideration of the obtained data generated several logical recommendations for future
study. Effort to recreate a generic home-style environment within which to execute testing is
expected to reduce erroneous anxiety levels and increase ecological validity. Researcher
absence throughout gameplay is advised to reduce social pressure derived from the player's
assumptions regarding the expectations of the researcher. The inclusion of an extended period
of initial gameplay (comparable to the test section) before biometric recording commences
supports the player in acclimatising to the unique features of the game, both diegetic
(atmosphere, style, narrative, motivation, etc.) and extra-diegetic (unique controls,
navigation, perspective, etc.). Again, this is to reduce erroneous anxiety and reflect
comfortable gameplay circumstances to support inferences that any increases in fear-related
experience are due to the intentional, crafted elements of the game.
Shifting focus to the sounds employed within the experiments, the particular digital signal
processing (DSP) effects applied within these tests were chosen based upon secondary
research into relevant theoretical and experiment-based literature in addition to their
presumed ease of application within this context. It is acknowledged that other DSP effects might
potentially have been equally effective and future study should attempt a comprehensive and
systematic series of comparable trials to ensure that nothing is overlooked. This is beyond the
scope of this thesis. Contextualisation of sound is strongly asserted to be a significant
contributor to its affective potential. Due to the absence of an established context or any
additional sensory data, participants were likely to embody each sound within their own
context (possibly even extending to falsely assuming the source/nature of the sound). Such a
variable could potentially have had an erroneous impact upon results and future study should
strive to correct this.
Overall, the results from these experiments revealed evidence that not only can different
sounds evoke significantly different emotion-related responses but adjustment of individual
acoustic parameters can also have a notable effect upon affective experience. This study does
not reveal the exact nature of how these interactions work or present a quantifiable
emotion/affect value for individual sounds/parameters. It does however support the argument
that both sound and specific acoustic parameters have the potential to influence emotional
response and therefore helps to validate the hypothetical frameworks.
CHAPTER 8
Chapter 8 commences with two models of fear processing. The first begins with a fear object,
contained within the environment, that (depending upon the nature of the various variables
within the scenario) is processed by way of collaboration between autonomic and cognitive
systems. Response to the input is expected to reflect the nature of the processing; meaning an
input that is largely processed autonomically will generate a more instinctive response.
Conversely a stimulus that is processed primarily via cognition and appraisal is anticipated to
engage a more measured and deliberate response. For example, unknown footsteps on a
creaking floorboard may generate a momentary automated response (gasp, freeze, increased
heart-rate, perspiration) but the most temporally significant response will be cognitive
appraisal as the listener consciously hypothesises upon alternative causes for the sound and
actively modifies their behaviour to facilitate the gathering of further information (such as
maintaining silence to hear more) and confirmation or rejection of initial hypotheses.
The second model is contained within a computer video gameplay context; however, the input
variables include entities from both the game engine and the local environment. Within this
model, interpersonal effects (a personal fear profile that includes personal history, phobias
and coping ability) plus the nature of the stimulus determine the nature of the fear
processing by way of perceived psychological distance. Stimuli appraised as psychologically
distant (hypothetical, faraway, future orientated) enable pre-encounter defence: temporary
autonomic response with extended cognitive appraisal. Psychologically proximal stimuli
(certain, close, immediate) are significantly more likely to trigger circa-strike defence:
extended autonomic behavioural response with minimal cognitive appraisal. The model takes
influence from continuing ecological models by including a feedback loop in which aspects
of the final fear experience (dependent upon its nature) are fed back into the model via the
personal fear profile.
The next hypothetical frameworks documented in Chapter 8 include a visualisation of the
interactions between real and virtual acoustic ecologies within a survival horror CVG
context. The fear model, documented prior to this, is extended to incorporate sonification data,
listening modalities and environment/player and game/player interactions; the intention being
to create a detailed yet transparent model of auditory processing within a survival horror
CVG context.
Finally, chapter 8 presents the eVAE model, a framework that places the player and game
sound (generated from the game's engine) within a shared resonating space and considers
synchresis between audio and visual stimuli, sharing a likeness with the model established by
Grimshaw and Schott (2008). The key difference in the eVAE model is the focus upon the
brain/mind of the player, which is displayed as internal and external looping mechanisms.
The external loop receives sensory data from the local environment and the player's
physiological state. The biological body (including the ear and nervous system) translates the
auditory data into electrical impulses that are then recoded against the player's individual
personality profile (contained in the long-term memory and relating to the personal fear
profile documented in chapter 3). The output impulses of this procedure affect physiology
(e.g. increased heart-rate, adrenaline secretion, muscular tension) that, in turn, alters physical
behaviour (kinaesthetic action: clenching game controller, pressing buttons, etc.) and finally
gameplay action (avatar pulls trigger of weapon, sidesteps behind cover, etc.). Each of these
elements is itself looped directly back into the system whilst gameplay action
simultaneously leads to gameplay response dictated by the engine in reply to the gameplay
action (NPC enemy is shot, avatar evades enemy fire, etc.). This information is also then
integrated back into the system via the sensory inputs and collated to dictate further
processing and player response behaviours.
The eVAE model's internal loop acknowledges the possibility that the mind does not simply
receive and process input to then generate appraisals and output response behaviour, but also
feeds back the appraisal information immediately into the system. For example, inspired by
Dead Space (Visceral, 2008) a player may receive sensory data during computer video
gameplay that is interpreted as: there is a huge recurring banging sound and intermittent
roars heard in the next room and past experience informs me that a boss battle is anticipated
beyond this door. Appraisal of the situation, largely from contextualised auditory data, is then
processed to generate a strategy: stock up on ammunition and switch to contact beam. The
internal loop mechanism denotes that this strategy can be immediately reintroduced into the
system for confirmatory reappraisal without new stimuli from the local environment: but
what if the challenge is a swarm of small enemies? Maybe I should equip the pulse rifle
instead. The focus of eVAE is upon cognitive auditory appraisal that processes the data by
way of the brain and it does not reference neural shortcuts that bypass the brain in order to
enable immediate response behaviour. In the discussion surrounding the eVAE model, a key
assertion of the thesis is that, as games technology develops, designers are moving closer to
the ear with regards to the focus of their strategies; from initial concentration upon the source
and motivation/context of the sound, to crafting of the sound itself, and finally to careful
manipulation of the entire virtual soundscape. Modern noise-cancelling surround sound
headphones and acoustically treated gameplay environments progress a step further in
manipulating the resonating space that the player him/herself inhabits. Whilst currently
'science fiction' in tone, it could be argued that the next logical phase will involve either
intentional manipulation of the player's body and/or bypassing the ear completely by way of
transmitting synthetic auditory information impulses directly into the brain.
FUTURE STUDY: CONSUMER GRADE ACQUISITION DEVICES
This section refers back to information discussed within chapter 5 (Psychophysiology),
contextualises it within CVG applications and focuses upon consumer grade EEG recording
headsets and accompanying software. It commences with an overview of the two currently
competing systems, the MindWave from Neurosky and the EPOC from Emotiv.
The MindWave is a highly competitively priced, wireless EEG acquisition headset that
requires relatively low computer system resources and utilises a single active electrode that is
placed on the forehead and a reference electrode, connected to the left ear lobe (Neurosky,
2011). Although the commercial release of the device is very recent, the proprietary engine
and SDK were presented earlier in the MindSet (2009), a comparable Neurosky EEG device
that is essentially identical to the MindWave with the exception of integrated Bluetooth
headphones. Academic research utilising the Neurosky system (that can be directly applied to
the MindWave) dates back to 2008, suggesting that research access predated the commercial
release. The MindWave exploits a proprietary algorithm to convert raw EEG data into
measurements of cognitive attention and meditation (termed eSense measurements), rated in
integers from 0-100 (Wu, Liu & Tzeng, 2011). These measurements are based upon Alpha
and Beta frequency EEG data that are generated via a fast Fourier transform (Peters et al.,
2009). The associated marketing publication claims that the algorithm is adaptive, capable of
adjusting to individual fluctuation and trends of the user (Myndplay, 2011). Temporal
resolution of data collection is high (>1000 samples per second). Included with the hardware,
a software development kit is provided (MindWave Development Tools) that supports a
moderate range of languages (C/C++, C#, Java) and, for the purposes of game development,
is highly suitable for integration into big-budget, commercial/mainstream game engines,
mobile game applications and web-based gaming technologies.
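The eSense algorithm itself is proprietary and is not reproduced here. The step it is reported to build upon, namely deriving alpha and beta band data from raw EEG by way of a Fourier transform, can nevertheless be sketched. The example below assumes conventional band limits (alpha 8-13 Hz, beta 13-30 Hz) and, to remain free of external libraries, evaluates the discrete Fourier transform directly rather than through a fast Fourier transform implementation; the two are mathematically equivalent, the latter simply being faster. The function name and the synthetic test signal are hypothetical.

```python
import cmath
import math
from typing import Dict, List

# Conventional band limits in Hz (an assumption, not Neurosky's definition)
BANDS = {"alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_powers(samples: List[float], sample_rate: float) -> Dict[str, float]:
    """Sum the normalised squared DFT magnitudes that fall within each band."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    mean = sum(samples) / n
    centred = [s - mean for s in samples]  # remove the DC offset
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2 + 1):         # positive-frequency bins only
        freq = k * sample_rate / n
        hits = [name for name, (lo, hi) in BANDS.items() if lo <= freq < hi]
        if not hits:
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        for name in hits:
            powers[name] += abs(coeff) ** 2 / n ** 2
    return powers


# One second of a synthetic 10 Hz (alpha-range) sine sampled at 256 Hz
signal = [math.sin(2 * math.pi * 10 * i / 256) for i in range(256)]
print(band_powers(signal, 256))
```

For the synthetic signal nearly all of the power falls within the alpha band. How such band values are subsequently converted into 0-100 attention and meditation scores is not public, so any further mapping would be speculation.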
The Neurosky system (incorporating both the MindWave and MindSet devices) has been
praised for affordability, portability, ease of use and its recognition of attention/meditation
levels is mostly agreed to be reliable and accurate (Cowley et al., 2010; Peters et al., 2009;
Rebolledo-Mendez & de Freitas, 2008; Vourvopoulos & Liarokapis, 2011). Rebolledo-Mendez and de Freitas (2008) posit that the capacity to reliably measure attention via the
Neurosky system could allow researchers to infer further knowledge with regards to
motivation and interest. Further research (albeit written by the manufacturing company)
asserts a robust EEG signal output, comparable to that of professional grade hardware such as
Biopac systems (Neurosky, 2009). The use of a single active electrode presents obvious
drawbacks when considering the potential application of the MindWave as an emotion-recognition system. Tammen and Loviscach (2008) assert that the primary functionality of
the MindSet is to differentiate attention levels and that such data cannot be further processed
to provide reliable insight into the user's emotional state. Spatial resolution is too low to
facilitate frontal asymmetry assessments, eliminating the possibility of effective arousal-valence measurement. Rebolledo-Mendez et al. (2009) note that although raw acquisition is
of high temporal resolution, the processed attention levels are presented at only 1 Hz,
reducing temporal sensitivity significantly and presenting a notable delay between state
changes and system response. Rebolledo-Mendez et al. also state that whilst the dry electrode
improves ease of use and user comfort, connection reliability is affected and a broken
connection can take between seven and ten seconds to be re-established. Peters et al. (2009)
concur with several of these issues but argue that the MindWave system would prove
effective in conjunction with other modalities of EEG acquisition.
The EPOC (Emotiv, 2008) utilises a 14-electrode configuration based upon the international
10-20 arrangement (Campbell et al., 2010) and a referential montage setup that measures
difference between the active electrodes and two references placed on opposite sides of the
head. Within the class of consumer grade headsets the EPOC currently has more
electrodes than any of its competitors (Seigneur, 2011). The headset operates with a
sample rate of 128 Hz and 16-bit resolution (Liu et al., 2010), revealing a higher technical
specification when compared to the MindWave. In parallel with the MindWave, documented
advantages to the EPOC include affordability, portability and ease of use/comfort (Adelson,
2011; Campbell et al., 2010; Lievesley et al., 2011). Flórez et al. (2011) documented a 71%
accuracy rating in distinguishing between mental tasks and the capacity of the EPOC to
acquire EEG data has also been compared to medical grade products in terms of reliability
and accuracy (Stytsenko et al., 2011).
The EPOC can be purchased in various forms from consumer to research grades; the primary
difference between the options is the accompanying software (consumer level limits the user
to pre-set methods of connecting EEG to software applications and visualisations). The
research edition of the software includes the full software development kit, which allows full
customisation of the processing algorithms and generation of new ones, though, as with the
MindWave, API programming knowledge is required before custom setups can be created.
The Emotiv Experimenter (Adelson, 2011) is an externally developed freeware program that
integrates into the EPOC system enabling significantly greater control of EEG processing and
classification algorithms. The EPOC system is also compatible with OpenVibe (INRIA,
2012), another freeware program that facilitates significant control over every aspect of EEG
processing, from signal filters to classification algorithms, providing pre-set parameters
whilst permitting high levels of modification (Renard et al., 2010). The brain-computer
interface (BCI) research community has revealed an awareness and appreciation of this
software application, and OpenVibe has been utilised as an EEG interface for gaming
development purposes (Congedo et al., 2011) and to support the use of the EPOC itself as a
reliable hardware tool (Ekanayake, 2010).
The notion that the EPOC is a match for professional or medical grade EEG is not widely
accepted and particular research has argued against such a statement (Campbell et al., 2010).
Ismail et al. (2011) argue that the required preparation of the wet-sensors can be time-consuming and that the headset can become uncomfortable after prolonged use. The software
algorithms for classification and interpretation of thoughts have been criticised as unreliable
and ineffective as a conscious-thought game control interface (Adelson, 2011). Ekanayake
(2010) asserts that the electrode arrangement does not cover critical scalp locations and that
this fact limits its effectiveness in comparison to higher-grade systems. Adelson (2011) also
states that the EPOC suffers from high sensitivity to electromyography (EMG) input that, in
many circumstances, leads to users controlling a game through physical movements
(clenched jaw, strained ocular muscles) rather than genuine EEG mind-control.
Upon consideration of the above discussion it is asserted that the EPOC setup holds the
greater advantage over the MindWave. This is primarily due to its electrode arrangement and
the available external freeware applications, which give the EPOC more power to enable effective emotion
classification utilising many of the techniques outlined in chapter 5. The Emotiv EPOC
shows great potential in providing a solid balance of affordability and robust effectiveness
and is expected to feature as the primary hardware peripheral utilised within further study
extending from this thesis. Whilst the EPOC headset is arguably less comfortable to wear for
extended periods, more difficult to apply and has a lower sample rate than the MindWave, the
technology offered by the EPOC is undoubtedly closer to professional/medical grade
equipment whilst also accessible and compatible with modern gaming technology.
FUTURE STUDY: OUTLINE
Continuing research will aim to advance the theoretical concepts documented within the PhD
thesis, retaining focus upon the emotionality of audio and quantification of subjective
experience through psychophysiology. Future study would be anticipated to provide further
theoretical advances and also yield a greater volume of empirically sourced data and practical
prototyping with real-world application. It is envisioned that, alongside a comprehensive
body of secondary inter-disciplinary research and a systematic series of empirical trials, two
prototype software systems will be developed with value to both academic and commercial
spheres. Produced work is expected to generate meaningful results, actionable conclusions
and prototype systems of significant interest to usability, quality assurance,
acoustics/psychoacoustics, computer video game development, artificial intelligence, audio
therapy and cognitive neuroscience research fields.
Future research should strive to further consolidate and substantiate the concepts
documented within this thesis through the provision of additional data and demonstration of
clear value of the results via successful prototyping. The primary focus of continuing research
should encompass fear-associated emotional experience and acoustics, extending beyond
'what is the potential of quantifiable acoustic parameters to modulate affective response' to
'what is the potential of individual variables (within acoustic parameters of sound) to either
intensify or attenuate a person's sensation of fear'. This will incorporate more detail to,
hopefully, inform sound designers not only which acoustic characteristics are likely to create
a desired emotional response, but also which specific levels/measures of these parameters achieve it
(beyond the scope of the thesis, which was limited to comparative analysis of 2-3 variations).
Theoretical and empirical work will advance from the more general concepts detailed within
this PhD thesis to explore the full potential of a range of acoustic parameters in greater depth,
elucidating not only the emotioneering (Freeman, 2003) potential of an individual acoustic
parameter, but also the comparative potential of a range of variations within the same
parameter. Combinations of parameter settings will also be addressed, facilitating the
expansion (and hopefully the completion) of the thesis' framework for audio design within
emotion-manipulation applications.
It is expected that future study expanding beyond this thesis will include a substantial
quantity of pragmatic effort. Practical applications of the theoretical developments
(documented above) are to include the development of a prototype software tool. The initial
prototype system (Prototype 1) is an observation tool intended to aid researchers. It would enable
real-time two-way interaction between a computer video game program and biometric
software, viz., biofeedback. It would essentially be a data gathering and display system that
allows the recording of physiology to be synchronised to game events and
physiology measurements to affect game events. Much of the acquisition and initial
processing of biometric data would be automated and synchronised to the key events within
the user’s experience, reducing the risk of human error and significantly improving the
efficiency of data collection during empirical trials. This system would also enable the data to
be presented visually in real-time to the user. The researcher would be able to observe the
same readings synchronised to both a video recording of the user’s in-game activity and the
user’s audio commentary (should real-time subjective data collection be required). All the
information would be recorded as a video file, facilitating easy revisiting of each test and the
automated production of valuable presentation materials. Such a system can be implemented
within a traditional desktop computer/console system or may utilise mobile technology to
create an iOS or Android compatible program enabling the system to work effectively on a
range of alternative formats and capitalise upon the current trends of mobile application
development and portable computing technology.
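The synchronisation of physiological recordings to game events described for Prototype 1 can be illustrated with a minimal sketch. It is not a specification of the prototype: the Sample and GameEvent types, the window lengths and the baseline-subtraction measure are all assumptions made purely for illustration.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Sample:
    t: float      # seconds since session start (hypothetical shared clock)
    value: float  # one physiological reading, e.g. skin conductance


@dataclass
class GameEvent:
    t: float      # time of the event on the same clock
    label: str    # e.g. "door_slam" (illustrative name only)


def window_around(samples, event, before=1.0, after=3.0):
    """Return the samples with event.t - before <= t <= event.t + after.

    Assumes `samples` is sorted by timestamp.
    """
    times = [s.t for s in samples]
    lo = bisect_left(times, event.t - before)
    hi = bisect_right(times, event.t + after)
    return samples[lo:hi]


def event_response(samples, event, before=1.0, after=3.0):
    """Mean post-event value minus mean pre-event baseline.

    Returns None when either side of the event has no samples.
    """
    win = window_around(samples, event, before, after)
    pre = [s.value for s in win if s.t < event.t]
    post = [s.value for s in win if s.t >= event.t]
    if not pre or not post:
        return None
    return sum(post) / len(post) - sum(pre) / len(pre)
```

Given one reading per second that steps from 1.0 to 3.0 at t = 5, calling event_response with an event at t = 5 returns the step size. Automating this alignment is the kind of task the text expects to reduce human error during trials.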
The second prototype (Prototype 2) will utilise the theoretical framework and empirical data gathered
using the initial prototype for a specific purpose: to create a biometric feedback loop system
in which the user is able to manipulate the audio landscape of the virtual reality by way of
psychophysiological measures (specifically their brainwaves via EEG). The intention is to
support two contrasting applications. The first is to aid individuals who suffer from phobias by
presenting them with an audio landscape that is representative of their fear and providing them
with real-time biofeedback measures that they aim to attenuate. The audio framework is used
here to increase/decrease difficulty, giving users an adaptive difficulty that makes the
program accessible at first and increasingly challenging to aid their coping development. The
second application utilises the same system with the opposing objective of intensifying a
fearful experience for recreational purposes. In this scenario, the system creates a ‘living’
acoustic virtual reality that reacts in real-time to the player’s emotional state as they progress
through the game. The core gameplay mechanic requires players to control their emotional
state to move between game levels as the soundscape attempts to undermine their efforts. For
example, a player may be required to achieve a relaxed and calm state in order to progress but
is pitted against a soundscape designed to elicit fear and discomfort. The hardware used
within these systems is open to several variations. Electroencephalography, galvanic skin
response and heart rate are the preferred measures of physiology as, at present,
they are the more difficult to manipulate via conscious thought and therefore are more
representative of our emotional subconscious.
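The adaptive difficulty loop proposed for Prototype 2 can be sketched as a simple proportional controller. This is only one possible reading of the idea: the normalised arousal score (0 to 1), the target value, the gain and the function names are all hypothetical and are not drawn from the thesis.

```python
def clamp(x, lo=0.0, hi=1.0):
    """Restrict x to the range [lo, hi]."""
    return max(lo, min(hi, x))


def next_intensity(intensity, arousal, target=0.5, gain=0.2):
    """One step of the loop for the phobia-training application.

    Soundscape intensity rises while the user's arousal is below the target
    (they are coping) and falls when arousal exceeds it. All quantities are
    assumed to be normalised to 0..1.
    """
    return clamp(intensity + gain * (target - arousal))


def run_session(arousal_readings, start=0.2, target=0.5, gain=0.2):
    """Apply the controller to a sequence of readings and return the intensity trace."""
    intensity = start
    trace = []
    for reading in arousal_readings:
        intensity = next_intensity(intensity, clamp(reading), target, gain)
        trace.append(round(intensity, 3))
    return trace
```

For the recreational application, the same loop could in principle be run with a different target or a negated gain so that the soundscape works against the player's attempt to stay calm. That variation is an assumption here, not a documented design.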
Continuing study within this field has wide-ranging potential merit for both industrial and
academic application. Biometric feedback systems and psychophysiology advances have
specific value in user-experience (UX) and quality assurance (QA) testing; an avenue of
research that is of critical importance to manufacturers of any consumer product from
automotive to PC tablet design. Fear and anxiety can produce significant barriers between
product and user, therefore a thorough understanding of attenuating these emotional states
may allow designers to greatly improve accessibility of their products, allowing them to
expand their customer base and to enable those who would normally be excluded by
fear/anxiety to experience and enjoy the benefits of modern technology. In addition to these
benefits, a biometric feedback system capable of varying the intensity of phobia-related audio
soundscapes could have a great potential in attenuating a range of phobias by allowing the
user to develop their coping ability through repeat practice and real-time progress feedback.
Significant practical application is also asserted within the field of recreational computer
video gameplay. Within a recreational context, fear can be a gateway to joy and excitement
(Perron, 2005; Svendsen, 2008) and emotions are often a prerequisite to immersive gameplay
experiences (Shilling, 2002). Consequently, systems capable of accurately interpreting a
player’s emotional state might not only support innovative gameplay mechanics (e.g. players
must master their emotional output to progress) and real-time adaptive gameplay (e.g.
emotional output affects an artificial intelligence director, making the game experience unique
between both individual players and repeat plays), they could also facilitate a deeper level of
communication with non-player characters which may result in more meaningful
relationships between the player and the characters of the game. As a body of theoretical
work, such study would consolidate a wide range of research from various disciplines to
provide a succinct analysis of fear/anxiety theory and could also further advocate sound as a
powerful tool in the manipulation of human emotion. A more detailed understanding of the
affective power of audio could enable sound designers to better incite and convey emotional
characteristics across a wide variety of sonic applications from fire alarms to cinematic foley
(reproduction of everyday sounds, developed in film/TV post-production) design. A
conceptual framework of fear amplification/attenuation could be further expanded upon to
better understand additional emotional traits such as joy or anger.
CHAPTER SUMMARY AND FINAL STATEMENT
Referring back to the introduction chapter and the primary hypothesis stated within, the
evidence gathered and literature reviewed supports the assumption that game sound,
biometric data and qualitative game experience descriptors have the potential to operate as a
psychophysiological feedback loop system.
The first foundational assumption, that human emotion can be understood as arrangements of
quantitative variables that exist within the brain and body, cannot be conclusively
acknowledged or rejected, but could be extended to include environment as an additional
source of variables. Whilst the evidence gathered via the academic review and empirical
studies supports this hypothesis (and the hypothetical frameworks embody it), proof that such
systems can be employed to accurately predict emotional outcomes from specific sound
characteristics is needed to support this hypothesis conclusively. With regards to the second
foundational assumption of the introduction (that quantitative acoustic parameters have the
potential to modulate the affective value of a sound without contextual support), this study
suggests that this could be true. The evidence reveals that certain acoustic parameter changes have a
statistically significant impact upon player-emotion measures. However, the circumstances in
which these parameters work are not consistent between experiments and more extensive
testing is required to reach a definite conclusion. The possibility of a game sound/biometric
feedback loop system is ultimately supported by the evidence within this thesis, as the
potential for both game sounds to evoke emotion and the psychophysiological equipment to
analyse it is promising.
With regards to the empirical study undertaken as part of this thesis, the web-mediated
experiment reveals some strong associations between user-rated emotional valence and effect
and particular sounds (and digital sound treatments). Whilst evidence does not provide a
comprehensive account regarding the potential of digital signal processing to
accentuate/attenuate an emotional response, this trial arguably elucidates several
methodological difficulties lying within this test, thereby presenting a valuable guide for
future research to better refine testing of this kind. The real-time intensity (RTI) experiment
presents a novel yet effective approach to qualitative data collection and also reveals several
propositions as to how affect testing within this context might be enhanced. The
psychophysiology/biometrics experiment presents a strong argument for the employment of
electro-dermal activity and electromyography as measures of human affect within a CVG
context, revealing some reliable correlations between such physiological measures and fear-related game experiences. This collective research is anticipated to greatly support future
research into biometric feedback applications within gaming and has also presented enough
data to further support sound as a valid form of stimulus for modulating fear during computer
video gameplay.
This thesis has also presented a range of theoretical explorations, comprising: emotionality,
psychoacoustics, psychophysiology and game studies. The principal theoretical contributions
put forward within the thesis are: an ecological framework of fear within a CVG context, a
new perspective regarding the virtuality of sound, a new structure of gameplay experience
that amalgamates fear processing and embodied listening during play, and an embodied
virtual acoustic ecology (eVAE) model. It is anticipated that sound designers and researchers
concerned with manipulating fear-related affect in response to sound will find this thesis of
interest and value. By presenting research-supported frameworks of both how we listen and
how we experience fear in a CVG context, it is hoped that the improvisatory or trial-and-error
approach to sound design within this field will be more regularly rejected in favour of a
methodical approach, centred around the audience as they exist within their environments.
Ultimately the intention is to support the development of future games capable of
transcending social, cultural, age and gender barriers to ensure that everyone is equally, and
deeply, scared – but in the most positive sense.
This thesis has significantly contributed to the understanding of the affective potential of
game sound. The hypothetical frameworks take existing knowledge (regarding sound
perception, embodied cognition, and emotions) and apply them in a unique way to a CVG
context. Modern approaches to measuring player experience are implemented for the unique
purpose of assessing fear responses to sound during computer video gameplay, providing
original data to support new conclusions. It is hoped that, in addition to driving forward
future research, this thesis will encourage more structured and logical game sound design
practices that denote an appreciation for the complex embodied world we listeners live in,
ultimately supporting the development of game sound systems that can respond to the natural
acoustic environment and the physiological characteristics of the player, as well as the
traditional input from the controller.
Chapter 10
Appendix: Future Work,
References & Complete Datasets
Garner, Tom A.
University of Aalborg
2012
Chapter 10: Future Work,
References and Complete
Datasets
INTRODUCTION
This closing chapter presents a design document for a bespoke software application, intended
to begin development in the immediate future. Commencing with an outline of the
circumstances from which the Xpresence development began, an initial blueprint of the
software is presented alongside a statement of potential value, relevant to the themes and
ideas raised within this thesis. The chapter concludes with a complete reference list and set of
raw data figures obtained from the three experimental trials.
XPRESENCE: PURPOSE AND FUNCTIONALITY
The functionality of the software arose from observable limitations of currently available
software in performing certain biofeedback tasks with a high degree of user-control. The
overarching function of Xpresence is to facilitate two-way communication between the
Emotiv Epoc headset and the majority of computer video game engines, through which
electroencephalographic response to game events can cause elements within the game to
change. To specify, the primary component functions of Xpresence are to: acquire data from
the Emotiv Epoc headset, process the raw EEG and gyroscope data to support extraction of
custom features, provide an automated but fully editable classification tool, facilitate various
real-time visualisation methods and enable any data derived from the headset to be mapped
to computer keystrokes.
Whilst various comparable software titles do currently exist, the specific requirements of the
future research documented above reveal certain limitations. The research edition of the
Emotiv system incorporates a software development kit (SDK) with an array of implementations that
includes a key mapping tool and real-time 2D graph generation. The chief concern with the
former is the distinct restriction in terms of specifiable independent variables. Users can
select from a small range of predetermined, subjective descriptors (relaxation, excitement,
engagement, frustration and meditation); however, the particular processing and classification
algorithms utilised by the Emotiv system cannot be customised and therefore the connection
between quantitative statistical EEG features (frequency analysis, signal spikes, mean
activity, etc.) and qualitative descriptors is beyond user control. One feature of the Emotiv
system that is potentially beneficial is the ability to record brief epochs of EEG activity and
then map that output to a keystroke, meaning that if a characteristically similar signal is received,
the system will recognise it and consequently activate the set keystroke. However, limitations
are also present with this tool in that the exact method by which the system compares activity
is undetermined and, in practice, using this tool for conscious control (such as moving an
avatar forward) lacks reliability. Therefore the facility to customise the recognition method by
specifying particular statistical features would be a notably useful feature of Xpresence.
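To illustrate what a user-customisable recognition method might look like, the sketch below compares a recorded template epoch with an incoming epoch using only the statistical features the user selects. The feature set, the relative tolerance and all function names are assumptions for illustration. They do not describe the Emotiv comparison method (which the text notes is undetermined) or any finalised Xpresence design.

```python
from statistics import mean, pstdev

# Hypothetical menu of selectable features; each maps an epoch (a list of
# sample values) to a single number.
FEATURES = {
    "mean": mean,
    "std": pstdev,
    "peak_to_peak": lambda xs: max(xs) - min(xs),
}


def feature_vector(epoch, names):
    """Compute the chosen features, in order, for one epoch."""
    return [FEATURES[name](epoch) for name in names]


def matches(template, candidate, names, tolerance=0.15):
    """True when every chosen feature of `candidate` lies within a relative
    `tolerance` of the same feature computed on `template`."""
    reference = feature_vector(template, names)
    observed = feature_vector(candidate, names)
    for ref, val in zip(reference, observed):
        scale = abs(ref) if ref != 0 else 1.0
        if abs(val - ref) / scale > tolerance:
            return False
    return True
```

Because the compared features are named explicitly, a researcher could see and adjust exactly which signal properties trigger a keystroke.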
The visualisation toolset within the Emotiv system shares similar advantages and limitations.
Raw and frequency-domain (generated via a standard FFT) measures are generated in real-time to enable live observation of EEG data. The restrictions of this system again relate to a
lack of user control over the mathematical processes that lie between the acquired raw EEG
and the Emotiv-defined qualitative descriptors. Although the recorded signal can be exported,
there are no native tools within the software that allow statistical analysis of the data.
Alternative software packages present a resolution to the control limitations associated with
the Emotiv system yet remain an incomplete solution, primarily due to communication
restrictions with both the Emotiv headset and game engines. The BSL Pro (Biopac,
2012), currently at version 3.7.7, presents an array of analytical tools, including: FFT,
histogram generation, smoothing, difference, peak detection and waveform mathematics,
alongside a suite of digital filters for comprehensive initial signal processing. Parameter
control of these tools is a mixture of low and high-level, with user access to a selection of
parameter pre-sets with some low-level control afforded in several of the more basic
functions (e.g. filter cut-off frequency and number of coefficients settings in the low-pass
digital filter and sample number defining within the smoothing tool). This presents a balanced
compromise between control and accessibility, allowing users access to a range of advanced
functions without alienating them with highly complex interfaces.
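The low-level parameters mentioned above (the number of samples in the smoothing tool, and the cut-off frequency and number of coefficients in the low-pass filter) can be made concrete with a short sketch. It uses a moving average and a Hamming-windowed sinc FIR design as generic textbook stand-ins. The BSL software's actual algorithms are not documented here, and the function names are illustrative.

```python
import math


def moving_average(signal, n_samples):
    """Smooth by averaging each point with up to n_samples - 1 preceding points."""
    out = []
    running = 0.0
    for i, x in enumerate(signal):
        running += x
        if i >= n_samples:
            running -= signal[i - n_samples]
        out.append(running / min(i + 1, n_samples))
    return out


def lowpass_coefficients(cutoff_hz, sample_rate_hz, n_coefficients):
    """Hamming-windowed sinc low-pass FIR taps, normalised to unity gain at 0 Hz.

    Requires n_coefficients >= 2.
    """
    fc = cutoff_hz / sample_rate_hz          # cut-off in cycles per sample
    mid = (n_coefficients - 1) / 2
    taps = []
    for k in range(n_coefficients):
        m = k - mid
        ideal = 2 * fc if m == 0 else math.sin(2 * math.pi * fc * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (n_coefficients - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_filter(signal, taps):
    """Causal FIR filtering; samples before the start are treated as zero."""
    return [
        sum(t * signal[i - j] for j, t in enumerate(taps) if i - j >= 0)
        for i in range(len(signal))
    ]
```

For example, lowpass_coefficients(30.0, 128.0, 21) would give a 21-tap filter with a 30 Hz cut-off for a signal sampled at 128 Hz. The sampling rate is an assumed value and should be replaced with that of the hardware in use.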
The clear limitation with the BSL software is that it communicates only with associated
Biopac hardware that will far exceed low budgets. Mind Workstation (Transparent Corp.,
2012) presents a potential solution. This software exists as a midpoint between the Emotiv
and BSL systems in terms of user control and statistical toolset range. Analysis is largely
limited to measuring frequency bands, raw data and the qualitative descriptors (frustration,
excitement, etc.) set by the Emotiv hardware. However, there is greater control over these
features when compared to the Emotiv system and, unlike BSL, it supports connectivity to
both the Emotiv Epoc and Neurosky Mindwave headsets. Visualisation options are of a high
standard and can also be generated in real-time. In addition, the Mind Workstation (MW)
presents a native biofeedback toolset, enabling users to loop EEG data back into the system to
manipulate stimuli in real-time. The principal limitation with this system is that the stimuli
are self-contained within the software. Users are able to import custom sounds and visuals
but there is no clear means to incorporate a third party system (such as a game engine) into
the MW biofeedback routine.
The MW system is a commercial software title (currently priced at $490 for the most
advanced version) that, while considerably more cost-effective than an equivalent BSL
system, lacks research-grade control (in certain areas) and, when combined with the Epoc
headset costs, may be an overly expensive solution for some budgets. The final software
presented here provides an effective solution to the budgetary and control limitations of the
previous packages. OpenVibe (Inria, 2012) is a free to use, open-source software package that
provides comprehensive user control over signal processing, feature extraction, classification
and acquisition setup. OpenVibe utilises a flow-graph interface style with user-generated
patches built up of connectable nodes (similar to MaxMSP [Cycling ’74] and Reaktor [Native
Instruments]). Control within the toolset is largely similar to that of the BSL system and
provides a solid level of freedom for designing custom biofeedback routines. The single
notable concern with the OpenVibe software is the lack of accessibility (particularly to those
without programming knowledge/experience) when attempting to establish communication
between OpenVibe and a third party program. The system employs a VRPN (virtual reality
peripheral network) to enable effective communication between any programs that are written
in the C++ programming language. The difficulty is that setup of the VRPN requires significant
programming ability and there is currently little regulated support available due to the open-source nature of the product. In response to the information documented above, Xpresence is
intended to be a cost-effective graphics-based solution, requiring no low-level programming
ability and only minor occurrences of text-based commands (fully supported by way of a
comprehensive in-built help system) for the most advanced tools. Connectivity is initially to
be limited to the Emotiv Epoc headset. The primary functional intention of Xpresence is to
develop a software package capable of the following:
1) Professional and research-grade signal processing and statistical analysis tools
2) Comprehensive classification integrating both biometric and stimulation (game data) input
3) Keystroke mapping to connect classification outputs to peripheral systems (game engines)
4) Visualising biometrics and game data with a range of customisable presentation styles
XPRESENCE: DESIGN DOCUMENT
The aesthetic design for Xpresence takes influence from modern audio/music plugins for
digital audio workstations that emulate physical equipment, such as the Solid State Logic
4000 (Waves, 2008), a collection of software programs that replicate the processing of the
renowned SSL4000 and 9000 physical mixing desks. Although Xpresence does not employ
graphical buttons, knobs and faders for all functions, such entities are integrated where
possible to increase user-accessibility and engagement. Standard drop-down menu options are
available but in most cases, the functions presented within these menus can also be accessed
directly within the graphic user interface (GUI) window. EEG record and playback is
controlled via a selection of animated buttons that colourise in response to user-action to
provide immediate and clear feedback. All primary subsections of the program are presented
within the main screen (figure 1) as clearly distinguishable nodes and the vertical tool
window incorporates standardised icons for general functionality (new, save, cut, copy, paste,
etc.). All icons/symbols reveal a text descriptor if the mouse pointer is positioned over them.
Font and size of text are designed to balance clarity and individual aesthetic, with the
colour scheme/background also intended to support this by way of a minimalist design with
high contrast to clearly distinguish between sections of the GUI, supporting faster user-familiarisation. The superscript integers presented below point to corresponding numbers
within the screenshot images to aid visual interpretation.
The functionality of the primary window centres on the flow graph window and incorporates
a headset connectivity window, a play/record control panel and timer, a standard drop-down
menu array and two quick-access toolbars, one containing the interactive icons representing
the main Xpresence nodes/modules and the other providing easy access to generic tools. The
headset connectivity window¹ is very similar in design to the equivalent tool provided with
the Emotiv system as a visual representation of headset connection is essential and the
Emotiv window is transparent and easily interpretable. Like the Emotiv tool, the connectivity
window updates in real-time in response to connectivity changes, registering black for no
connection, red for limited, orange for poor, yellow for good and green for excellent. The
play/record control² is a moveable and hideable panel allowing immediate recording and
playback function of EEG data without requiring the user to open the acquisition server.
The standard drop-down menu³ allows alternative access to all of the functionality available
via the quick-access menus plus links to the help/tutorial menu, view controls and structural
nodes. The help/tutorial menu is presented in a standard Microsoft Windows format, with
content split into chapters and keyword search functionality for fast reference. The view
menu enables limited control over the general appearance/colour schemes and allows the
quick access menus and play/record panel to be hidden (creating space that is automatically
filled by expansion of the flow-graph window). Structural modules (accessed by way of the
modules menu) allow more complex patches to be built by enabling any function module to
have multiple (between 2 and 5) inputs or outputs. For example, the game engine and
acquisition modules could both be connected to the classification module to enable headset
EEG data to be cross-referenced against real-time game event data via a 2 to 1 structural
node.
The generic quick-access toolbar⁴ presents immediate access to common functions (new
project, save, load, cut, copy and paste) in addition to a comments tool (selecting the tool
opens a moveable text box to enable labelling of individual modules or sections of a complete
patch to aid navigation around complex patches or remind the user of specific settings) and a
help icon (when selected, hovering the mouse pointer over certain areas reveals a concise
description of that element). The function module toolbar⁵ provides immediate access to the
primary modules used within Xpresence. Each icon within the toolbar represents one of the
key Xpresence tools and can be easily dragged and dropped onto the flow-graph window. The
flow-graph window⁶ itself is relatively basic in design. Modules can be added, removed and
duplicated using the quick-access tools, via a small drop-down menu that appears in response
to right clicking the module and in response to keystrokes/short-cuts (e.g. ctrl+X for cut,
ctrl+D for duplicate, etc.). Single and sustained left clicks towards the base or top of the
function modules activate the connection wire. With this selected, the wire can be extended to
connect with the input/output of any other node (although not all connections will create a
valid function). Double left clicking of the function modules opens up new windows in which
the settings and parameters particular to that tool are presented.
Figure 1: Xpresence main screen and primary module arrangement window
Figure 2 presents the acquisition server, visualisation tool, keymap generator and
record/playback setting window. The acquisition server⁷ takes its influence largely from
OpenVibe (Inria, 2012) in terms of the adjustable settings offered, namely: alternative
connection ports, age/gender adjustment and device identification (ultimately supporting
simultaneous multiplayer functionality). A second play/record panel is presented alongside
the option to collect data from any arrangement of the 14 electrodes, enabling researchers to
examine specific spatial locations and economise on computing power. The visualisation
tool⁸ enables researchers to present graphs and topographical imagery in real-time, both in
response to live data acquisition and playback of pre-recorded signal streams. Within a single
visualisation window, three viewing ports are presented that enable data to be viewed in up to
three alternative formats that include standard graphs and topography in both 2D and 3D
versions. Each viewport can be detached from the main visualisation window and then size-adjusted or set to full-screen in much the same way as with a typical computer media player.
The keymap generator⁹ takes influence from the Emotiv tool, primarily because the research
edition SDK being utilised to build Xpresence will facilitate easy access to the coding behind
this tool. Within this tool, researchers can select from multiple connected headsets and
establish a rule that connects a trigger with a behaviour and identify the application they
would like this rule to apply to. The functionality of the trigger is the primary bespoke
addition when comparing the Emotiv keymap generator to the one within Xpresence.
Whereas the trigger within the Emotiv tool is limited to facial expressions and pre-established (non-editable) features, Xpresence provides communication between the
keymap generator and the classification module (discussed below) that allows the user to
create a range of custom triggers from the raw data. The behaviour settings allow the
connection between trigger and keystroke some basic customisation, enabling researchers to
determine how long the keystroke is held for, set a delay between trigger onset and keystroke
activation and configure repetition/periodicity. The record/playback window¹⁰ incorporates a
basic toolbar that opens a generic browser window, enabling users to load pre-recorded
headset data files (instead of using live data) and save recordings to employ later.
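The behaviour settings described for the keymap generator (hold duration, onset delay and repetition/periodicity) can be sketched as a small rule object that expands one trigger into a schedule of key-down and key-up events. The KeymapRule fields and the schedule function are hypothetical names chosen for illustration. Actually sending keystrokes to an application is platform-specific and is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class KeymapRule:
    trigger: str            # name of a classification output (illustrative)
    key: str                # keystroke to emit
    hold_s: float = 0.1     # how long the key is held down
    delay_s: float = 0.0    # delay between trigger onset and first press
    repeats: int = 1        # number of presses per trigger
    period_s: float = 0.5   # spacing between the starts of repeated presses


def schedule(rule, trigger_time):
    """Expand one firing of `rule` into (time, action, key) tuples."""
    events = []
    for i in range(rule.repeats):
        press = trigger_time + rule.delay_s + i * rule.period_s
        events.append((press, "down", rule.key))
        events.append((press + rule.hold_s, "up", rule.key))
    return events
```

A rule such as KeymapRule("calm", "w", hold_s=0.25, delay_s=0.5, repeats=2, period_s=1.0) fired at t = 10 s would produce two presses of the "w" key, the first beginning half a second after the trigger.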
Figure 3 provides a screenshot of the signal processing and feature extraction tool. The BSL
Pro software (Biopac, 2012) inspires several elements of both design and function, with the
main signal display, timescale bar and feature analysis windows positioned in a comparable
arrangement. With regards to functionality: customisable timescales, marker tools, viewer
settings and analysis windows all take cues from the BSL software. The most notable
differences presented in the Xpresence tool are the custom feature configuration and the
simplified, streamlined aesthetic design. The main signal display¹¹ presents raw EEG data
from all (or selected) Epoc headset channels. Within this window, both live and pre-recorded
signal data can be visualised. To the lower left position of the main display window, the
display control panel¹² is placed, incorporating (from left to right) a marker generator (that
positions a marker upon the display window wherever the timeline indicator is currently
placed), the settings tool (that opens a small drop-down window, allowing users to choose
between mouse pointer functions) and the timescale tool (enabling users to set the
chronological scale and time measurement).
Figure 2: Module windows (Visualisation, Read/write, Keymap Generator & Acquisition Server)
The main drop-down menu¹³ follows a similar functionality (chiefly with regards to the file,
edit and help options) to the one presented in the initial Xpresence window. Processing is a
pivotal menu, however, as it provides the only route to signal processing tools. Although
it will incorporate various digital filters (high-pass, low-pass, band-stop, etc.), it is expected
that this element of the overall software will include only essential processing initially, and
integrate additional processes as plugins later in development. The display menu enables the
user to set the number of feature analysis rows (up to three can be used) and adjust several
visual parameters. This includes: the thickness and resolution of the signal streams, the colour
coding of each channel and each feature analysis box, automated vertical and horizontal
signal scaling and marker reveal/hiding. It is also possible to select any combination of the
fourteen EEG channels for visualisation in the 2D graph.
The quick-access menu¹⁴ provides the user with an accessible alternative to navigating the
drop-down menus for some of the most common tasks, including icons for (from top to
bottom): automatic vertical scaling, automatic horizontal scaling, move to peak/next peak,
display grid, add comments, place marker and visual display settings.
The feature analysis toolbar¹⁵ initially presents the user with a single row of eight boxes, each
of which reveals a drop-down menu that, when selected, displays a feature and an EEG
channel (including select all) to attribute that feature to (e.g. mean (4) = the mean measure
for channel 4). Features can be established statistical calculations (mean, min, max, integral,
etc.), pre-established affective measures (meditation, excitement, engagement, etc.) or custom
equations built by the user. It is the custom feature configuration tool that distinguishes the
feature extraction module from Biopac BSL Pro or the Emotiv control panel, supporting the
user with a relatively simple script-based system for building bespoke features. The custom
feature configuration (CFC) window can be accessed via the feature extraction option in the
main drop-down menu and also by a quick-access icon found in the top right of the screen.
The CFC tool itself is a simple code-editing window with a complement of standard buttons
(save, load, apply, cancel). The exact nature of the code is not fully determined; however, the
intention is to allow the user to build equations based around mathematical functions,
qualitative descriptors and raw EEG data. The code window can also automate epoch
recording, allowing a user to select specific points in time for the automated analysis to be
executed. For example, if a user wished to define a custom feature as occurring, within a 30 second
epoch between 1m15s and 1m45s, when the mean signal from channels 1-7 is 25% (or more)
greater than the mean signal from channels 8-14, they would first establish the epoch times
(E1start=1:15, E1end=1:45), then state the feature as an equation (f=m[c1:c7]>m[c8:c14]) and
finally, state the value of the greater than symbol (>=25%+). This equation is then saved and
given a user-defined name. The use of specific symbols and the syntax of the script are not yet
established and the above equations are presented merely to elucidate the system.
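Since the CFC script syntax is not yet established, the worked example above can instead be expressed as a Python stand-in. The sketch below assumes a channel-major recording (a list of 14 lists of samples), a default sampling rate of 128 Hz and positive-valued signal measures (for instance rectified or amplitude data, so that a percentage comparison is meaningful). All three are assumptions for illustration only.

```python
from statistics import mean


def epoch_slice(channel, start_s, end_s, sample_rate_hz):
    """Samples of one channel between start_s and end_s (seconds)."""
    return channel[int(start_s * sample_rate_hz):int(end_s * sample_rate_hz)]


def group_mean(recording, channels, start_s, end_s, sample_rate_hz):
    """Mean of all samples from the given 1-based channels inside the epoch."""
    values = []
    for c in channels:
        values.extend(epoch_slice(recording[c - 1], start_s, end_s, sample_rate_hz))
    return mean(values)


def custom_feature(recording, sample_rate_hz=128, start_s=75, end_s=105, margin=0.25):
    """True when, within the epoch 1m15s to 1m45s, the mean of channels 1-7 is at
    least `margin` (25%) greater than the mean of channels 8-14."""
    left = group_mean(recording, range(1, 8), start_s, end_s, sample_rate_hz)
    right = group_mean(recording, range(8, 15), start_s, end_s, sample_rate_hz)
    return left >= right * (1 + margin)
```

The three user inputs in the text map onto the function arguments: the epoch times become start_s and end_s, the equation becomes the comparison of the two group means, and the value attached to the greater-than symbol becomes margin.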
Appendix: Future Work, References & Complete Datasets
Figure 3: Signal Processing and Feature Extraction windows
Tom Garner
2012
University of Aalborg
Figure 4 represents a typical view of the classification tool, another flow-graph based
system designed to support the implementation of complex class algorithms with relative
ease. The classification tool primarily takes its influence from the GUI scripting engines
utilised within modern FPS source development kits such as the Unreal SDK (2012) and the
CryEngine 3 SDK (Crytek, 2012) that replace conventional, text-based coding with graphics
nodes that present all relevant parameters as visible controls. The structure of the
classification tool also takes influence from the classification trees presented in SPSS (IBM,
2012). This approach allows users to create complex interconnecting patches, capable of
controlling intricate systems such as artificial intelligence or game physics, without requiring
the user to navigate bulky text files or learn difficult syntax and commands. The main flow-graph
arrangement window16 is positioned in the centre of the screen and is initially blank.
Nodes can be added via right-clicking anywhere in the blank space, revealing a drop-down
menu containing all available nodes. The primary node is the feature node17, a condition-based
matrix with a user-specified number of outputs, each of which is activated based upon
the rules set by the user. For example, in the first node the user could specify the feature as
raw EEG signal at channel 5 (electrode FC5) then connect one output to S<25µV (signal is
below 25µV), another to S=25:75µV (signal is between 25 and 75µV), and a third to S>75µV
(signal is greater than 75µV).
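The behaviour of such a feature node can be approximated as a list of labelled rules, each attached to one output. The sketch below is illustrative only: it is written in Python rather than the intended C++, the names `make_feature_node` and `fc5_node` are hypothetical, and treating the 25µV and 75µV boundaries as belonging to the middle band is an assumption, since the text does not specify boundary handling.

```python
def make_feature_node(rules):
    """Build a node from (label, predicate) pairs.

    The returned function lists the labels of every output whose rule
    is satisfied by the incoming feature value.
    """
    def node(value):
        return [label for label, predicate in rules if predicate(value)]
    return node


# Three outputs for the raw signal (in microvolts) at channel 5.
fc5_node = make_feature_node([
    ("S<25uV", lambda s: s < 25),
    ("S=25:75uV", lambda s: 25 <= s <= 75),  # boundaries assumed inclusive
    ("S>75uV", lambda s: s > 75),
])
```

Because the rules here are mutually exclusive, exactly one output is active for any input; a user-defined node with overlapping rules would simply activate several outputs at once.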
Each output could be coupled directly to the keymap generator by way of connecting them to
a send node18 or instead link to a second feature node via the in port19 to create a more
complex classification network. Double-left clicking upon a feature node reveals another
custom feature configuration tool20 with comparable functionality to the CFC presented in the
processing and feature extraction tool and consequently means that there is some overlap
between the two toolsets. This is primarily due to a need for usability feedback, to determine
whether the CFC would be more appropriate within the feature extraction or classification
tool. Small differences do exist between the two, with the classification CFC integrating a
graphical keypad display, presenting the user with the most common mathematical functions.
As with other aspects of the Xpresence program, hovering the mouse pointer over the keypad
buttons will reveal a text description to further aid accessibility by way of reducing the
number of mathematical elements a user may have to learn before proficient use. Colour
coding is also different, though it is anticipated that the script colour schemes are to be fully
customisable.
In addition to the function and send nodes, the classification tool will incorporate various
logic (AND, OR, NOT, gate, XOR, etc.) and mathematical nodes (abs, add, sin, counter, etc.)
to support the construction of large, complex patches. In addition to the classification tree-style
documented above, the Xpresence classification tool will enable users to implement
classification algorithms common to established academic literature, such as support vector
machines and linear discriminant analysis, enabling researchers to use (and customise)
advanced and current classification processes.
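As a rough indication of how logic and mathematical nodes might combine within a patch, the sketch below models a handful of them as plain functions. It is a hedged illustration in Python, not the planned implementation: the node table, the `CounterNode` class and `example_patch` are all hypothetical names, and the patch itself (an XOR of two upstream outputs feeding a counter) is invented purely to show the wiring.

```python
import operator

# Hypothetical logic nodes: each takes boolean inputs and returns one boolean.
LOGIC_NODES = {
    "AND": operator.and_,
    "OR": operator.or_,
    "XOR": operator.xor,
    "NOT": operator.not_,
}


class CounterNode:
    """Mathematical node that counts how often its input has been active."""

    def __init__(self):
        self.count = 0

    def __call__(self, active):
        if active:
            self.count += 1
        return self.count


def example_patch(signal_high, feature_met, counter):
    """Fire when exactly one upstream output is active, and count the firings."""
    fired = LOGIC_NODES["XOR"](signal_high, feature_met)
    return fired, counter(fired)
```

In a flow-graph editor each of these functions would appear as a node with visible ports; the text-based form is shown only to make the intended data flow explicit.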
Figure 4: Emotion Classification flowgraph tool and Feature Editor programming window
BUILD AND IMPLEMENTATION STRATEGY
The Emotiv API provides the core functionality around which Xpresence will be built. As a
proof of concept, the C++ programming language dictated by the API is expected to enable
control over any Windows PC-based games title. This choice does limit some testing
applications, most notably Apple Mac products (Xcode) and Microsoft Xbox Live Arcade
(C#), which would otherwise facilitate prototype testing upon what are arguably the most
globally popular mobile and console-based independent gaming platforms respectively. Provided the
initial development is successful, a port between platforms could yield further value as a
future endeavour. It is also a distinct possibility that the Xpresence program could be
extended to communicate with internet applications, particularly web-based/Flash and social
games, ultimately meaning the technology could permeate all forms of popular modern
gaming.
The majority of graphics have already been developed, primarily using PowerPoint
(Microsoft, 2010) and Illustrator CS5 (Adobe, 2010) and all functionality programming is
expected to be achieved via Visual Studio C++ (Microsoft, 2010). The initial games engine
Xpresence will attempt to communicate with will be the CryEngine 3 SDK (currently v3.4,
Crytek, 2012). This is chiefly due to the accessibility of the engine's flow graph system
(enabling quick setup of links between keystrokes and game parameters) and the capacity to
manipulate the game sound with a high degree of control. In addition, the CryEngine
showcases some of the most advanced graphics and AI features currently within modern
gaming, enabling many of the game responses to EEG input to both look and sound
particularly impressive.
REFERENCES
2K (2007) Bioshock [Computer Video Game]
Achaibou, A., Pourtois, G., Schwartz, S., and Vuilleumier, P. (2007) Simultaneous Recording
of EEG and Facial Muscle Reactions during Spontaneous Emotion Mimicry,
Neuropsychologia, 46, pp.1104–1113
Adams, W.H., Iyengar, G., Lin, C., Naphade, M.R., Neti, C., Nock, H.J. and Smith, J.R.
(2002) Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues,
EURASIP Journal on Applied Signal Processing, 2, pp.1-16
Adelson, M. (2011) An Experimentation and Mind-reading Application for the Emotiv
EPOC. Princeton University, http://compmem.princeton.edu/experimenter/
ExperimenterReport.pdf
Adobe (2008) Dreamweaver CS4 [Computer Software]
Adobe (2010) Flash CS5 [Computer Software]
Alho, K., and Sinervo, N. (1997) Pre-attentive Processing of Complex Sounds in the Human
Brain, Neuroscience Letters, 233, pp.33–36
Allen, J.J.B., Harmon-Jones, E. and Cavender, J.H. (2001) Manipulation of Frontal EEG
Asymmetry through Biofeedback alters Self-reported Emotional Responses and Facial EMG.
Psychophysiology, 38, pp.685-693
Alves, V. and Roque, L. (2009) A Proposal of Soundscape Design Guidelines for User
Experience Enrichment, in: Audio Mostly 2009, September 2nd -3rd, Glasgow.
Alves, V. and Roque, L. (2011) Guidelines for Sound Design in Computer Games, in: Game
Sound Technology and Player Interaction: Concepts and Developments (Ed. Mark
Grimshaw) IGI Global, pp.362-383
Alwitt, L. (2002) Suspense and Advertising Responses, Journal of Consumer Psychology,
12:1, pp.35-49
Ambinder, M. (2011) Biofeedback in Gameplay: How Valve Measures Physiology to
Enhance Gaming Experience, in: Proceedings Game Developers Conference (GDC), Citeseer
Anderson, C.A. (2002). Violent Video Games and Aggressive Thoughts, Feelings, and
Behaviors, in: (Eds.) Calvert, S.L., Jordan, A.B. and Cocking, R.R., Children in the Digital
Age, pp.101-119, Westport, Praeger Publishers.
Anderson, M.L. (2003) Embodied Cognition: A Field Guide, Artificial Intelligence, 149,
pp.91-130
Andrade, E.B. and Cohen, J.B. (2006) Affect-Based Evaluation and Regulation as Mediators
of Behavior: The Role of Affect in Risk Taking, Helping and Eating Patterns, in: (eds.)
Kathleen D. Vohs, Roy F. Baumeister, and George Loewenstein, Do Emotions Help or Hurt
Decision Making? A Hedgefoxian Perspective, Russell Sage, 2006
Andreassi, J. (2006) Psychophysiology: Human Behaviour and Physiological Response,
Psychology Press (5th edition)
Arnold, M. (1954) The Human Person, New York: Ronald
Ashbourn J. (2000) Biometrics: Advanced Identity Verification, Springer-Verlag, London,
UK
Attwell, K.H. (2006) 100 Questions & Answers about Anxiety, Jones & Bartlett Publishers,
Inc.
Augoyard, J. and Torgue, H. (2005) Sonic Experience: A Guide to Everyday Sounds,
McGill-Queen's University Press, Canada
Bach, D.R., Schachinger, H., Neuhoff, J.G., Esposito, F., Salle, F.D., Lehmann, C., Herdener,
M., Scheffler, K., Seifritz, E. (2008) Rising sound intensity: an intrinsic warning cue
activating the amygdala, Cerebral Cortex, 18, pp.145-150
Bach, D.R., Neuhoff, J.G., Perrig, W., and Seifritz, E. (2009) Looming sounds as warning
signals: The function of motion cues, International Journal of Psychophysiology, 74:1,
pp.28-33
Bagiella, E., Sloan, R. P., and Heitjan, D. F. (2000) Mixed-effects models in
psychophysiology. Psychophysiology, 37, pp.13-20
Baird, R. (2000) The Startle Effect. Implications for the Spectator Cognition and Media
Theory. Film Quarterly, 53:3, pp.13-24
Baltzly, D. (2010) Is Plato's Timaeus Panentheistic? Sophia, 49:2, pp.193-215
Ballas, J. A., and Mullins, T. (1991) Effects of context on the identification of everyday sounds,
Human Performance, 4:3, pp.199–219
Bar, M. (2004) Visual objects in context. Nature Reviews: Neuroscience, 5:8, pp.617–629
Bar-Anan, Y., Liberman, N., Trope, Y. and Algom, D. (2007) Automatic processing of
psychological distance: Evidence from a Stroop task, Journal of Experimental Psychology,
136, pp.610-622
Bar, M. and Ullman, S. (1996) Spatial context in recognition, Perception, 25, pp.343-352
Bard, P. (1928). A diencephalic mechanism for the expression of rage with special reference
to the sympathetic nervous system, American Journal of Physiology, 84, pp.490-516
Barlow, D. and Durand, V. (2009) Abnormal Psychology: An integrative Approach,
Wadsworth CENGAGE learning
Barot, T. (1999) Songbirds forget their tunes in cacophony of road noise, The Sunday Times,
January 10th.
Barrett, L.F. (2006) Solving the Emotion Paradox: Categorization and the Experience of
Emotion, Personality and Social Psychology Review, 10:1, pp.20-46
Bartel-Sheehan, K., and Grubbs-Hoy, M. (1999) Flaming, complaining, abstaining: How
online users respond to privacy concerns, Journal of Advertising, 28:3, pp.37-51
Bechara, A., Damasio, H. and Damasio, A.R., (2000) Emotion, decision
making and the orbitofrontal cortex, Cerebral Cortex, 10, pp.295-307
Beepa (2010) Fraps v.3.2 [Computer Software]
Benedict, R. (1946) Patterns of Culture, New York: Penguin
Bethesda (2008) Fallout 3 [Computer Video Game]
Birnbaum, M.H. (2004) Human Research and Data Collection via the Internet, Annual
Review of Psychology, 55, pp.803-832
Bischof, M., Bassetti, C.L. (2004) Total dream loss: a distinct neuropsychological
dysfunction after bilateral PCA stroke. Annals of Neurology, 56, pp.583-586
Blanchard, R.J. and Blanchard, D.C. (1989) Attack and Defense in Rodents as
Ethoexperimental models for the Study of Emotion, Progress in Neuro-Psychopharmacology
and Biological Psychiatry, 13, pp.3-14
Bleiler, E.F., (Foreword) written in - Lovecraft H.P. (1973) Supernatural Horror in
literature, Dover Publications
Blythe, M. and Hassenzahl, M. (2003) The semantics of fun: Differentiating enjoyable
experiences, in: (Eds.) M. Blythe, C. Overbeeke, A.F. Monk, and P.C. Wright, Funology:
From Usability to Enjoyment, Dordrecht: Kluwer, pp.91-100
Bonanno, A.T. and El-Nasr, M.S. (2012) Event-related Physiological Response in a Horror
Game, PX Workshop, FDG'12, May 29th, 2012, Raleigh, NC, USA
Bonitzer, P. (1982). Le champ aveugle. Essais sur le cinéma, Cahiers du cinema and
Gallimard, Paris.
Bolls, P., Lang, A. and Potter, R. (2001) The Effects of Message Valence and Listener
Arousal on Attention, Memory, and Facial Muscular Responses to Radio Advertisements,
Communication Research, 28:5, pp.627-651
Booth-Kewley, S., Edwards, J.E., & Rosenfeld, P. (1992) Impression management, social
desirability, and computer administration of attitude questionnaires: Does the computer make
a difference? Journal of Applied Psychology, 77, pp.562-566.
Bos, D.O. (2006) EEG-Based Emotion Recognition: The Influence of Visual and Auditory
Stimuli, http://hmi.ewi.utwente.nl/verslagen/capita-selecta/CS-Oude_Bos-Danny.pdf, 2006.
Boucheix, J. and Lowe, R.K. (2010) An eye tracking comparison of external pointing cues
and internal continuous cues in learning with complex animations, Learning and Instruction,
20:2, pp.123-135
Boucsein, W. (1992) Electrodermal activity, New York: Plenum Press.
Bourke, J. (2005) Fear: A Cultural History, Virago Press
Boyce, T. (2011) Silent Hill 2: An Analysis, essay for University of Lethbridge,
http://gameplay-archive.org/documents/Boyce_SilentHill2(2011).pdf
Bracha, S.H. (2004). Freeze, flight, fight, fright, faint: Adaptationist perspectives on the acute
stress response spectrum, CNS Spectrums, 9:9, pp.679-685
Bradley, M.M. and Lang, P.J. (1994) Measuring Emotion: The Self-Assessment Manikin and
the Semantic Differential, Journal of Behavior Therapy and Experimental Psychiatry,
25:1, pp.49-59
Bradley, M.M. and Lang, P. (2000) Affective reactions to acoustic stimuli, Psychophysiology,
37, pp.204-215
Bradley, M., Greenwald, M.K., Petry, M.C. and Lang, P.J. (1992) Remembering pictures:
Pleasure and arousal in memory, Journal of Experimental Psychology: Learning, Memory, &
Cognition, 18, pp.379-390
Bradley, M., Lang, P.J. and Cuthbert, B.N. (2002) A Motivational Analysis of Emotion:
Reflex-Cortex Connections, In: J.T. Cacioppo et al., Foundations in Social Neuroscience,
MIT Press
Bradley, M.M., Moulder, B. and Lang, P. (2005) When good things go bad: The reflex
physiology of defense. Psychological Science., 16, pp.468-473
Bradley, M., Silakowski, T. and Lang, P. (2008) Fear of pain and defensive activation, Pain,
137:1, pp.156-163
Bradwejn, J. (1993) Neurobiological investigations into the role of cholecystokinin in panic
disorder, Journal of Psychiatry and Neuroscience, 18:4, pp.178-188
Brave, S. and Nass, C. (2002) Emotion in HCI, In: (Eds.) J. Jacko and A. Sears, The HCI
Handbook, Hillsdale, NJ: Lawrence Erlbaum Associates
Breinbjerg, M. (2005) The Aesthetic Experience of Sound – staging of Auditory Spaces in
3D computer games, In: Aesthetics of Play, Bergen, Norway, October 14th-15th,
http://www.aestheticsofplay.org/breinbjerg.php
Bronkhorst, A. W. (1995) Localization of real and virtual sound sources, Journal of the
Acoustical Society of America, 98:5, Pt. 1, Nov 1995, pp.2542-2553
Brown, J.S., Kalish, H.I. and Farber, I.E. (1951) Condition Revealed by magnitude of startle
response to a stimulus, Journal of Experimental Psychology, 41, pp.317-328
Brown, E. and Cairns, P. (2004) A Grounded Investigation of Game Immersion, Conference
on Human Factors in Computing Systems, pp.1297-1300
Buchanan, T., and Reips, U.D. (2001) Platform-dependent biases in Online Research: Do
Mac users really think different? In: (Eds.) K.J. Jonas, P. Breuer, B. Schauenburg, and M.
Boos, Perspectives on Internet Research: Concepts and Methods.
Byrne, R. M. J. (2005) The Rational Imagination: How People Create Alternatives to Reality,
MIT Press, Boston, MA
Cacioppo, J.T. (2007) Handbook of Psychophysiology. Cambridge University Press
Cacioppo, J.T., Petty, R.E., Losch, M.E. and Kim, H.S., (1986) Electromyographic activity
over facial muscle regions can differentiate the valence and intensity of affective reactions,
Journal of Personality and Social Psychology, 50, pp.260-268
Cacioppo, J.T., Bush, L.K. and Tassinary, L.G. (1992) Microexpressive facial actions as a
function of affective stimuli: replication and extension, Psychological Science, 18,
pp.515-526
Cacioppo, J. T., Klein, D. J., Berntson, G. G. and Hatfield, E. (1993). The psychophysiology
of emotion. In: (Eds.) M. Lewis and J.M. Haviland, Handbook of Emotions, pp.119-142. New
York, Guilford Press.
Cacioppo, J. T., Tassinary, L. G. and Berntson, G. G. (2007) Psychophysiological Science,
Handbook of psychophysiology, pp.3-26
Cacioppo, J.T., and Gardner, W.L. (1999) Emotion, Annual Review of Psychology, 50,
pp.191-214
Cacioppo, J.T., Tassinary, L.G., and Berntson, G.G. (2000) The Handbook on
Psychophysiology, Cambridge University Press
Calleja, G. (2007) Revising immersion: A conceptual model for the analysis of digital game
involvement, Situated Play, University of Tokyo, Japan, September 24th-28th
Campbell, A., Choudhury, T., Hu, S., Lu, H., Mukerjee, M., Rabbi, M. and Raizada,
R.D.S (2010) NeuroPhone: brain-mobile phone interface using a wireless EEG headset, ACM
MobiHeld, 2010, pp.1-6
Cannon, W.B. (1926) Physiological regulation of normal states: some tentative postulates
concerning biological homeostatics, In: (Ed.) A. Pettit and C.Richet: ses amis, ses collègues,
ses élèves, p.91. Paris: Éditions Médicales, 1926.
Cannon, W.B. (1931) Again the James-Lange and the thalamic theories of emotion,
Psychological Review, 38, pp.281-295
Cantor, J., Ziemke, D. and Sparks, G. (1984) The Effect of Forewarning on Emotional
Responses to a Horror Film. Journal of Broadcasting, 28:1, pp.21-31
Capcom (1996) Resident Evil [Computer Video Game]
Capcom (2012) Resident Evil 6 [Computer Video Game]
Cardinal, R.N., Parkinson, J.A., Hall, J. and Everitt, B.J. (2001) Emotion and Motivation: the
Role of the Amygdala, Ventral Striatum, and Prefrontal Cortex, Neuroscience and
Biobehavioral Reviews, 26, pp.321-352
Carroll, N. (1990) The Philosophy of Horror: or Paradoxes of the Heart, New York and
London: Routledge
Carroll, N. (1999) Film, Emotion, Genre, in: (Eds.) G. Smith and C. Plantinga, Passionate
Views: Film, Cognition and Emotion, Johns Hopkins University Press, Baltimore, 1999,
pp.21-47
Carroll, N. (1996) The Paradox of Suspense, in: Suspense: conceptualizations, theoretical
analyses, and empirical explorations, Lawrence Erlbaum Associates Inc.
Case, G. and Wolfson, S. (2000) The effects of sound and colour on responses to a computer
game, Interacting With Computers, 13:2, pp.183-192
Chalmers, A., Howard, H. and Moir, C. (2009) Real Virtuality: a step change from virtual
reality, Proceedings of the 2009 Spring Conference on Computer Graphics, New York, NY
Chanel, G., Kierkels, J.J.M., Soleymani, M. and Pun, T. (2009) Short-term emotion
assessment in a recall paradigm, International Journal of Human-Computer Studies,
67:8, pp.607-627
Chanel, G., Kronegg, J., Grandjean, D. and Pun, T. (2005) Emotion assessment: Arousal
evaluation using EEG's and peripheral physiological signals. Technical Report 05.02,
Computer Vision Group, Computing Science Center, University of Geneva, 2005.
Chapelle, O., Haffner, P. and Vapnik, V.N. (1999) Support vector machines for
histogram-based image classification, Neural Networks, 10:5, pp.1055-1064
Chatrian, G.E., Lettich, E. and Nelson P.L. (1985) Ten percent electrode system for
topographic studies of spontaneous and evoked EEG activity, American Journal of EEG
Technology, 25, pp.83-92
Chen, J. (2006) Flow in Games, University of Southern California, Los Angeles, USA
Cheng, K. and Cairns, P. (2005) Behaviour, realism and immersion in games, Proceedings of
CHI 2006, Conference on Human Factors in Computing Systems. ACM Press, pp.1272-1275
Chion, M. (1994) Audio-Vision: Sound on Screen, C Gorbman Ed., Columbia University
Press, New York
Cho, H., and LaRose, R. (1999) Privacy issues in internet surveys, Social Science Computer
Review, 17, pp.421-434
Cho, J., Yi, E., and Cho, G. (2001) Physiological responses evoked by fabric sounds and
related mechanical and acoustical properties. Textile Research Journal, 71:12, pp.1068-1073
Chouinard, S., Briere, M., Rainville, C., and Godbout, R. (2003) Correlation between evening
and morning waking EEG and spatial orientation, Brain & Cognition, 53:2, pp.162-165
Christie, M. J. (1981) Electrodermal activity in the 1980s: a review, Journal of the Royal
Society of Medicine, 74, pp.616-622
Christie, I.C. and Friedman, B.H. (2004) Autonomic specificity of discrete emotion and
dimensions of affective space: a multivariate approach, International Journal of
Psychophysiology, 51, pp.143-153
Clark, A. (1997) Being There: Putting Brain, Body, and World Together Again. MIT Press,
Cambridge, MA.
Clark, L. A. and Watson, D. (1994) Distinguishing Functional from Dysfunctional Affective
Responses, in: (Eds.) P. Ekman and R.J. Davidson, The Nature of Emotion: Fundamental
Questions, New York: Oxford University Press, pp.131-136
Coan, J.A. and Allen, J.B. (2003) Frontal EEG asymmetry and the behavioral activation and
inhibition systems, Psychophysiology, 40, pp.106-114
Coles, M.G.H. (1989). Modern mind-brain reading: Psychophysiology, physiology, and
cognition, Psychophysiology, 26, pp.251-269
Collins, K. (2006) Introduction to the Participatory and Non-Linear Aspects of Video Games
Audio, [Website] www.gamessound.com
Collins, K. (2008) Game Sound: An Introduction to the History, Theory, and Practice of
Video Game Music and Sound Design, MIT Press
Comisky, P. and Bryant, J. (1982) Factors involved in generating suspense. Human
Communication Research, 9:1, pp.49-58
Congedo, M., Goyat, M., Tarrin, N., Ionescu, G., Varnet, L., Rivet, B., Phlypo, R., Jrad, N.,
Acquadro, M. and Jutten, C. (2011) Brain Invaders: a prototype of an open-source P300-based
video game working with the OpenViBE platform, 5th International Brain-Computer
Interface Conference 2011, Graz, Austria
Conrad, A., Muller, A., Doberenz, S., Kim S., Meuret, A.E., Wollburg, E. and Roth, W.T.
(2007) Psychophysiological effects of breathing instructions for stress management, Applied
Psychophysiological Biofeedback, 32, pp.89–98
Coyle, S., Ward, T., Markham, C. and McDarby, G., (2004) On the suitability of
near-infrared (NIR) systems for next-generation brain–computer interfaces. Physiological
Measures, 25, pp.815-822
Crowley, K., Sliney, A., Pitt, I. and Murphy, D. (2010) Evaluating a brain-computer Interface
to categorise human emotional response, 10th IEEE International Conference on Advanced
Learning Technologies, (ICALT), July, 2010
Cox, T. (2007) Bad vibes: an investigation into the worst sounds in the world, 19th ICA
Madrid
Cox, T. (2008) Scraping sounds and disgusting noises, Applied Acoustics, 69:12,
pp.1195-1204
Craig, A.D. (2008) Interoception and emotion: A neuroanatomical perspective, in: Lewis, M.,
Haviland-Jones, J.M. and Feldman Barrett, L., Handbook of Emotion (3rd ed.). New York:
The Guilford Press, pp.272–288
Crawford, H.J., Clarke, S.W. and Kitner-Triolo, M. (1996) Self-generated happy and sad
emotions in low and highly hypnotizable persons during waking and hypnosis: Laterality
and regional EEG activity differences, International Journal of Psychophysiology, 24,
pp.239-266
Critchley, H.D., Elliott, R., Mathias, C.J. and Dolan, R.J., (2000). Neural activity relating to
generation and representation of galvanic skin conductance responses: a functional magnetic
resonance imaging study. Journal of Neuroscience, 20, pp.3033-3040
Critchley, H.D., Mathias, C.J. and Dolan, R.J., (2002) Fear conditioning in humans: the
influence of awareness and autonomic arousal on functional neuroanatomy, Neuron, 33,
pp.653-663
Crockett, M.J., Clark, L., Tabibnia, G., Lieberman, M.D., and Robbins, T.W. (2008)
Serotonin modulates behavioral reactions to unfairness, Science, 320(5884), p.1739
Crytek (2007) CryEngine 2 [Computer Software/Game Engine], Electronic Arts
Crytek (2007) Crysis [Computer Video Game], Electronic Arts
Crytek (2011) Crysis 2 [Computer Video Game], Electronic Arts
Csikszentmihalyi, M. (1990) Flow: The Psychology of Optimal Experience, Harper and Row,
New York
Cunningham, S., Grout, V., and Hebblewhite, R. (2006) Computer Game Audio: The
Unappreciated Scholar of the Half-Life Generation, Audio-Mostly 2006, Pitea, Sweden
Cusack, R. and Carlyon, R. P. (2004) Auditory Perceptual Organization Inside and Outside
the Laboratory, in: (Ed.) Neuhoff, J., Ecological Psychoacoustics, Elsevier Academic Press,
CA, USA
Cuthbert, B.N., Lang, P.J., Strauss, C., Drobes, D., Patrick, C.S. and Bradley, M.M. (2003)
The psychophysiology of anxiety disorder: Fear memory imagery, Psychophysiology, 40,
pp.407-422
Dalgleish, T. (2004) The emotional brain, Nature Reviews Neuroscience, 5, pp.583-9.
Darwin, C. (1872) The Expression of Emotions in Man and Animals, Oxford University Press
Davidson, S. (1999) From spam to stern: Advertising law and the Internet, in: (Eds.) D.W.
Schumann and E. Thorson, Advertising and the World Wide Web, pp.233-263, Mahwah,
NJ:Lawrence Erlbaum
Davidson, R.J. (2000) Cognitive neuroscience needs affective neuroscience (and vice versa),
Brain & Cognition, 42, pp.89-92
Davidson, R. J. (2003) Affective Neuroscience and Psychophysiology: Toward a Synthesis,
Psychophysiology, 40 (2003), pp.655-665, Blackwell Publishing Inc. Printed in the USA.
Davis, S., Butcher, S.P. and Morris, R.G.M. (1992) The NMDA receptor antagonist
D-2-amino-5-phosphonopentanoate (D-AP5) impairs spatial-learning and LTP in vivo at
intracerebral concentrations comparable to those that block LTP in vitro, Journal of
Neuroscience, 12, pp.21-34
DeGroot, D. and Broekens, J. (2003) Using Negative Emotions to Impair Gameplay. Proc. of
the BNAIC, pp.99-106
Denton, D.A. (2006) The primordial emotions: the dawning of consciousness, Oxford
University Press
De Luca, C. (1997) The use of surface electromyography in biomechanics, Journal of
Applied Biomechanics
De Silva, L.C., Miyasato, T., Nakatsu, R. (1998) Use of multimodal information in facial
emotion recognition, IEICE Trans. Inf & Syst, E81-D(1)
De Quincey, T. (2006) On Murder, (Ed.) R. Morrison, New York: Oxford University Press
Descartes, R. (1649) The Passions of the Soul, in: The Philosophical Works of Descartes
(E.S. Haldane and G.T.R. Ross, trans., Vol.1), London, Cambridge University Press, 1911
Difrancesco, M.W., Holland, S.K. and Szaflarski, J.P. (2008) Simultaneous EEG/functional
magnetic resonance imaging at 4 Tesla: correlates of brain activity to spontaneous alpha
rhythm during relaxation. Journal of Clinical Neurophysiology, 25, pp.255-264
Dimberg, U. (1986) Facial reactions to fear-relevant and fear-irrelevant stimuli, Biological
Psychology, 23, pp.153-161
Drachen, A., Nacke, L., Yannakakis, G.N., Pedersen, A.L. (2010) Correlation between heart
rate, electrodermal activity and player experience in First-Person Shooter games.
In press for SIGGRAPH 2010, ACM-SIGGRAPH Publishers
Dreyfus, H.L. (1991) Being in the World, a Commentary on Heidegger's “Being and Time”,
Division I, Cambridge, MIT Press
Duckworth, K. L., Bargh, J. A., Garcia, M., and Chaiken, S. (2002) The automatic evaluation
of novel stimuli. Psychological Science, 13, pp.513-519
Dupont, S. (2002) Investigating temporal pole function by functional imaging, Epileptic
Disorders, 4:7, pp.S17-S22
Edelberg, R. (1972) Electrical activity of the skin: Its measurement and uses in
Psychophysiology, In: (Eds.) N.S. Greenfield and R.A. Sternbach, Handbook of
Psychophysiology, pp.367-418, New York: Holt, Rinehart & Winston.
Eich, E., Macauley, D. and Ryan, L. (1994) Mood dependent memory for events of the
personal past, Journal of Experimental Psychology: General, 123, pp.201-215
Ekanayaki, H. (2010) P300 and Emotiv EPOC: Does Emotiv EPOC capture real EEG?
http://neurofeedback.visaduma.info/emotivresearch.htm
Ekman, P. (1992) An Argument for Basic Emotions, Cognition and Emotion, 6 (3/4),
pp.169-200
Ekman, P. (1999) Basic Emotions, in: (Eds.) T. Dalgleish and T. Power The Handbook of
Cognition and Emotion, pp. 45-60, Sussex, U.K., John Wiley & Sons, Ltd.
Ekman, I. (2008) Psychologically Motivated Techniques for Emotional Sound in Computer
Games, Audio Mostly 2008 (Pitea, Sweden)
Ekman, I. (2009) Modelling the Emotional Listener: Making Psychological Processes
Audible. Audio Mostly 2009, Glasgow 2nd-3rd September
Ekman P. and Friesen W.V. (1975) Unmasking the Face. A guide to recognizing emotions
from facial clues, Prentice-Hall, Inc., Englewood Cliffs, New Jersey
Ekman, P. and Friesen, W.V. (1978) Facial Coding Action System (FACS): A Technique for
the Measurement of Facial Actions, Consulting Psychologists Press, Palo Alto, CA.
Ekman, I. and Kajastila, R. (2009) Localisation Cues Affect Emotional Judgements – Results
from a User Study on Scary Sound. Proc. AES 35th Conference on Audio for Games, London
UK., CD-ROM
Ellis, G.D., Voelkl, J.E. and Morris, C. (1994) Measurement and Analysis Issues with
Explanation of Variance in Daily Experience Using the Flow Model, Journal of Leisure
Research, 26:4, pp.337-356.
Ellsworth, P. (1991) Some implications of cognitive appraisal theories of emotion, in: (Ed.)
K.T. Strongman, International Review of Studies on Emotion, 1, pp.143-160, Chichester, UK,
John Wiley & Sons.
Erber, R., and Tesser, A. (1992) Task effort and the regulation of mood: The absorption
hypothesis, Journal of Experimental Social Psychology, 28, pp.339-359
Ermi, L. and Mäyrä, F. (2005) Fundamental components of the gameplay experience:
analysing immersion, Proceedings of the DiGRA conference Changing views: worlds in play,
Vancouver, Canada. DiGRA, 2005
Esbjörn-Hargens, S. and Zimmerman, M.E. (2009) Integral Ecology: Uniting Multiple
Perspectives on the Natural World, Integral Books, Boston, MA
Fagerlönn, J. (2010) Distracting effects of auditory warning on experienced drivers, The 16th
International Conference on Auditory Display (ICAD-2010) June 9-15, 2010, Washington,
D.C, USA
Fagerlönn, J., & Liljedahl, M. (2009). AWESOME sound design tool: A web based utility
that invites end users into the audio design process. ICAD 09.
Fahs, T. (2009) IGN Presents the History of Survival Horror: Tracing Fear to its Primal
Roots, [Website] http://www.ign.com/articles/2009/10/30/ign-presents-the-history-of-survival-horror
Fan, N., Balan, R.V. and Rosca, J. (2004) Comparison of Wavelet and FFT Based Single
Channel Speech Signal Noise Reduction Techniques, Proc. SPIE 5607, Wavelet Applications
in Industrial Processing II, 127 (November 1, 2004); doi:10.1117/12.574050
Fanselow, M.S. (1994) Neural Organization of the Defensive Behaviour System Responsible
for Fear, Psychonomic Bull. Rev., 1, pp.429-438
Fischer, G., Grudin, J., Lemke, A. C., McCall, R., Ostwald, J., Reeves, B. N. and Shipman, F.
(1992) Supporting indirect, collaborative design with integrated knowledge-based design
environments, Human Computer Interaction, 7:3, pp.281-314
Fleckenstein, K. (1991) Defining affect in relation to cognition: A response to Susan
McLeod, Journal of Advanced Composition, 11, pp.447-53.
Flórez, F., Azorín, J., Iáñez, E. et al. (2011) Development of a Low-cost SVM-based
Spontaneous Brain-Computer Interface, In: NCTA 2011 – International Conference on
Neural Computation Theory and Applications. Special Session on Challenges in
Neuroengineering
Folkman, S. and Lazarus, R.S. (1990) Coping and Emotion, in: (Eds.) N.L. Stein, B.
Leventhal and T. Trabasso, Psychological and Biological Approaches to Emotion, Lawrence
Erlbaum Associates, Hillsdale, N.J., 1990, pp.313-332
Fowles, D.C. (1986) The Eccrine System and Electrodermal Activity, in: (Eds.) M.G.H.
Coles, E. Donchin and S.W. Porges, Psychophysiology: Systems, Processes and Applications,
pp.51-96, New York, Guilford Press
Fowles, D.C. (1988) Psychophysiology and psychopathology: A motivational approach,
Psychophysiology, 25, pp.373-391
Fowles, D.C., Christie, M.J., Edelberg, R., Grings, W.W., Lykken, D.T., and Venables,
P.H. (1981) Publication recommendations for electrodermal measurements.
Psychophysiology, 18, pp.232-239
Fox, R.G. (1997) On Thrownness, After Postmodernism Conference 1997,
http://www.focusing.org/apm_papers/fox.html
Firelight Technologies (2007) FMOD [Computer Software]
Franssen, J.L.M. (1995) Handboek oppervlakte-elektromyografie, First edition, (Ed.)
Franssen J.L.M. Utrecht, De Tijdstroom
Freeman, D. (2003) Creating Emotion in Games, New Riders Games.
Freitas, A.L., Salovey, P., and Liberman, N. (2001) Abstract and Concrete Self-evaluative
Goals, Journal of Personality and Social Psychology, 80, pp.410-412
Freud, S. (1956) Turnings in the ways of psychoanalytic therapy. In: (Ed.) Jones, E.,
Collected papers, 2, pp.392-402, London, Hogarth Press.
Frick, A., Bächtiger, M.T. and Reips, U.D. (2001) Financial Incentives, Personal Information,
and Dropout in Online Studies, in: (Eds.) U.D. Reips and M. Bosnjak, Dimensions of Internet
Science, pp. 209-219, Lengerich, Germany: Pabst Science.
Frictional (2010), Amnesia: the Dark Descent [Computer Video Game], Steam/Frictional
Friestad, M. and Thorson, E. (1986) Emotion Eliciting Advertising: Effects on Long-term
Memory and Judgement, Advances in Consumer Research, 13, pp.111-116
Frijda, N. H. (1994). Varieties of affect: Emotions and Episodes, Moods, and Sentiments, in:
(Eds.) P. Ekman and R. J. Davidson, The Nature of Emotion: Fundamental Questions, pp.59-67,
Oxford, UK: Oxford Univ. Press
Frome, J. (2007) Eight ways Videogames Generate Emotion, Situated Play, Proceedings of
DiGRA 2007 Conference
Fung, M. T., Raine, A., Loeber, R., Lynam, D. R., Steinhauer, S. R., Venables, P. H., et al.
(2005) Reduced Electrodermal Activity in Psychopathy-prone Adolescents, Journal of
Abnormal Psychology, 114, pp.187–196
Funkenstein, D. (1958) The Physiology of Fear and Anger, in: (Eds.) Reed, C. and Tomkins,
S. Psychopathology: A Source Book, Oxford University Press
Furedy, J.J. (1984) Generalities and Specifics in Defining Psychophysiology: Reply to Stern
(1964) and Stern (1984), Psychophysiology, 2, pp.2-4
Gainotti, G. (2000) Neuropsychological theories of emotion, in (Eds.) Gainotti, G. and Joan,
C., The neuropsychology of emotion. Series in affective science, pp. 214-236, New York, NY,
US: Oxford University Press
Gallese, V. (2003). The roots of empathy: The shared manifold hypothesis and the neural
basis of intersubjectivity. Psychopathology, 36, pp.171–180
Ganong, W.F. (2001) Review of Medical Physiology. McGraw-Hill Publishing, New York,
p.123
Garbarini, F. and Adenzato, M. (2004) At the root of embodied cognition: Cognitive science
meets neurophysiology, Brain and Cognition, 56, pp.100-106
Gärdenfors, D. (2002) Designing Sound-Based Computer Games, Digital Creativity, 14:2,
Malmö, Sweden: Narrativity Studio, Interactive Institute, 2002.
Garner, T.A. and Grimshaw, M. (2011) A Climate of Fear: Considerations for Designing a
Virtual Acoustic Ecology of Fear, Proceedings of the 6th Audio Mostly conference, Coimbra,
Portugal
Garner, T., Grimshaw, M. and Abdel Nabi, D. (2010) A Preliminary Experiment to Assess the
Fear Value of Preselected Sound Parameters in a Survival Horror Game. Proceedings of the
5th Audio Mostly Conference: A Conference on Interaction with Sound, Pitea, Sweden
Gaver, W. (1993) What in the World do we Hear?: An ecological approach to auditory event
perception, Ecological Psychology, 5:1, pp.1-29
Gega, L., Marks, I.M., Mataix-Cols, D. (2004) Computer-aided CBT self-help for anxiety
and depressive disorders: Experience of a London clinic and future directions. Journal of
Clinical Psychology/In Session, 60, pp.147-157
Gerling, K. M., Klauser, M. and Niesenhaus, J. (2011) Measuring the Impact of Game
Controllers on Player Experience in FPS Games, MindTrek '11, Proceedings of the 15th
International Academic MindTrek Conference, pp.83-86, ACM New York, USA
Gettier, E.L. (1963) Is Justified True Belief Knowledge? Analysis, 23, pp.121-123, Oxford
University Press
Giant Bomb.com (2012) www.giantbomb.com [Website]
Gielen, J. (2010) EMG Biofeedback for Virtual Reality Therapy, Masters Thesis, Ghent
University, Belgium
Gilbert, D.T., Pinel, E.C., Wilson, T.D., Blumberg, S.J. and Wheatley, T.P. (1998) Immune
neglect: A source of durability bias in affective forecasting, Journal of Personality and Social
Psychology, 75, pp.617-638
Giles, D. (1984) The Conditions of Pleasure in Horror Cinema, in: Barry Keith Grant
(ed.), Planks of Reason: Essays on the Horror Film, Metuchen, N.J.: Scarecrow Press, pp.38-54.
Gillham, B. (2000) Developing a Questionnaire, Continuum, London.
Gilroy, S.W., Porteous, J., Charles, F. and Cavazza, M.O. (2012) Exploring passive
user interaction for adaptive narratives, Proceedings of the 2012 ACM International
Conference on Intelligent User Interfaces, Lisbon, Portugal, 14th-17th February 2012,
New York: ACM, pp.119-128
Goldberg, R.F., Perfetti, C.A. and Schneider, W. (2006) Distinct and common cortical activations
for multimodal semantic categories, Cognitive Affective & Behavioral Neuroscience, 6:3,
pp.214–222
Goldman, A. (1976) Discrimination and perceptual knowledge, Journal of Philosophy, 73,
pp.771–791
Gonzalez-Sanchez, J., Chavez-Echeagaray, M. E., Atkinson, R., and Burleson, W. (2011)
ABE: An Agent Based Software Architecture for a Multimodal Emotion Recognition
Framework, Proceedings of the Ninth Working IEEE/IFIP Conference on Software
Architecture, pp.187-193
Goodman, S. (2008) Sonic Warfare: Sound, Affect, and the Ecology of Fear, MIT Press
Gow, J., Cairns, P., Colton, S., Miller, P. and Baumgarten, R. (2010) Capturing Player
Experience with Post-Game Commentaries, CGAT (2010), Singapore
Graps, A. (1995) An Introduction to Wavelets, IEEE Computational Science and
Engineering, 2:2, Summer 1995.
Gregorios-Pippas L., Tobler, P.N. and Schultz, W. (2009) Short-term temporal discounting of
reward value in human ventral striatum, J. Neurophysiol., 101:3, pp.150-172
Granka, L., Joachims, T. and Gay, G. (2004) Eye-tracking analysis of user behavior in www
search, ACM Conference on Research and Development in Information Retrieval (SIGIR)
Gray, J.A. (1971) The Psychology of Fear and Stress, New York: McGraw-Hill Book
Company
Gray, J.A. (1982) The Neuropsychology of Anxiety, New York: Oxford University Press
Gray, J.A. (1987) The psychology of fear and stress (2nd ed.) Cambridge, England:
Cambridge University Press.
Gray, J.A. and McNaughton, N. (2000) The Neuropsychology of Anxiety: An Enquiry into the
Functions of the Septo-Hippocampal System. Oxford University Press
Gray, E.K., and Watson, D. (2001) Emotion, mood, and temperament: Similarities,
differences, and a synthesis. In: (Eds.) R.L. Payne and C.L. Cooper, Emotions at Work:
Theory, Research, and Applications for Management, pp.21-43, Chichester, UK: Wiley.
Greco, J. (2010) Achieving Knowledge: a Virtue-theoretic account of epistemic normativity,
Cambridge University Press
Greene, N. (2012) Microsoft Patents Biometric, Pressure Sensitive Video Game Controller,
[Website] http://www.gamedynamo.com/article/showarticle/3516/en/Microsoft_patents_biometric_pressure-sensitive_video_game_controller
Griffiths, P.E. and Scarantino, A. (2009) Emotions in the Wild: The situated perspective on
emotion, in (Eds.) P. Robbins and M. Aydede, Cambridge Handbook of Situated Cognition,
Cambridge: Cambridge University Press, pp.437-453.
Grimshaw, M. (2007) The Resonating spaces of first-person shooter games. Proceedings of
the 5th International Conference on Game Design and Technology, Liverpool, November
14th-15th, 2007
Grimshaw, M. (2008) Autopoiesis and sonic immersion: modeling sound-based player
relationships as a self-organizing system, Conference Papers (Peer-Reviewed). Paper 1.
http://digitalcommons.bolton.ac.uk/gcct_conferencepr/1
Grimshaw, M. (2009) The Audio Uncanny Valley: Sound, Fear and the Horror Game, Audio
Mostly 4th Conference on Interaction with Sound, Glasgow, 2nd–3rd September,
http://digitalcommons.bolton.ac.uk/gcct_conferencepr/9/.
Grimshaw, M., Lindley, C.A., and Nacke, L. (2008) Sound and Immersion in the First-Person
Shooter: Mixed Measurement of the Player's Sonic Experience. Audio Mostly 2008, Pitea,
Sweden
Grimshaw, M., and Schott, G. (2008) A Conceptual Framework for the Analysis of First-Person
Shooter Audio and its Potential Use for Game Engines, International Journal of
Computer Games Technology, Vol. 2008
Griss, P., Enoksson, P., Tolvanen-Laakso, H.K., Meriläinen, P., Ollmar, S. and Stemme, G.
(2001) Micromachined Electrodes for Biopotential Measurements, Journal of
Microelectromechanical Systems, 10:1, pp.10-16
Gross, J. J., Fredrickson, B. L. and Levenson, R. W. (1994) The psychophysiology of crying.
Psychophysiology, 31, pp.460–468
Gross, J. J., & Levenson, R. W. (1993) Emotional suppression: Physiology, self-report, and
expressive behavior. Journal of Personality and Social Psychology, 64, pp.970-986
Gualeni, S., Janssen, D. and Calvi, L. (2012) How psychophysiology can aid the design
process of casual games: a tale of stress, facial muscles, and paper beasts, Proceedings of the
international conference on the foundations of digital games, ACM, New York, USA
Guardian.co.uk (2010) The UK's Top Selling Games of 2010 [Website]
http://www.guardian.co.uk/technology/gamesblog/2011/jan/10/top-selling-games-of-2010
Guardian.co.uk (2011) Best Selling Games of 2011 [Website]
http://www.guardian.co.uk/technology/gamesblog/2012/jan/11/best-selling-games-of-2011
Gurley, K. and Kareem, A. (1997) Applications of wavelet transforms in earthquake, wind
and ocean Engineering, Engineering Structures, 21, pp.149-167
Hale, J. L., Lemieux, R. and Mongeau, P.A. (1995) Cognitive processing of fear-arousing
message content, Communication Research, 22, pp.459-474
Hamalainen, M., Hari, R., Ilmoniemi, R.J., Knuutila, J. and Lounesmaa, O.V. (1993)
Magnetoencephalography: theory, instrumentation, and applications to noninvasive studies of the
working human brain, Reviews of Modern Physics, 65, pp.413-97
Hamann, S. and Canli, T. (2004) Individual differences in emotion processing, Curr Opin
Neurobiology, 14, pp.233–238.
Harmon-Jones, E., Allen, J. J. B. (2001) The role of affect in the mere exposure effect:
Evidence from physiological and individual differences approaches. Personality and Social
Psychology Bulletin, 27, pp.889–898.
Harris, T. (1981) Red Dragon [Novel] GP Putnams, USA
Havas, D.A., Glenberg, A.M. and Rinck, M. (2007) Emotion simulation during
language comprehension, Psychonomic Bulletin & Review, 14:3, pp.436-441
Hazlett, R. (2003) Measurement of user frustration: A biologic approach. Proceedings of the
ACM Conference on Human Factors in Computing Systems, pp.734–735
Hazlett, R. and Benedek, J. (2007) Measuring emotional valence to understand the user's
experience of software, Int. J. Human-Computer Studies, 65, pp.306–314
Headfirst Productions (2005) Call of Cthulhu: Dark Corners of the Earth [Computer Video
Game], Bethesda Softworks/2K Games
Hedman, E., et al. (2009) iCalm: measuring electrodermal activity in almost any setting. 3rd
International Conference on Affective Computing and Intelligent Interaction and Workshops,
2009 (Piscataway, N.J.: IEEE), pp.1-2
Heidegger, M. (1927) Being and Time, Halle, Niemeyer, Germany
Heim, M. (1998) Virtual Realism, Oxford University Press, New York, USA
Hermens, H., Freriks, B., Disselhorst-Klug, C. and Rau, G. (2000) Development of
recommendations for SEMG sensors and sensor placement procedures. Journal of
Electromyography and Kinesiology, 10, pp.361-374
Herwig, U., Satrapi, P. and Schönfeldt-Lecuona, C. (2003) Using the International 10-20
EEG System for Positioning of Transcranial Magnetic Stimulation, Brain Topography, 16:2,
pp.95-99
Hjorth, B. (1975) An on-line transformation of EEG scalp potentials into orthogonal
source derivations. Electroencephalography and Clinical Neurophysiology, 39, pp.526-530
Hochschild, A.R., (1979) Emotion Work, Feeling Rules and Social Structure, American
Journal of Sociology, 85:3, pp. 551-575
Hoffman, H. and Searle, J. (1965) Acoustic variables in the modification of startle reaction in
the rat, Journal of Comparative and Physiological Psychology, 60:1, pp.53-58
Hoffman, E., McCabe, K. and Smith, V. (1998) Behavioral foundations of reciprocity:
experimental economics and evolutionary psychology. Econ. Inquiry, 36, pp.335–352
Holbrook, N.J., Munck, A. and Guyre, P.M. (1984) Physiological functions of
glucocorticoids in stress and their relations to pharmacological actions. Endocrinology
Review. 5, pp.25–44
Horswill, M.S. and Coster, M.E. (2001) User controlled photographic animations,
photograph-based questions, and questionnaires: three Internet-based instruments for
measuring drivers' risk-taking behaviour, Behav. Res. Methods Instrum. Comput. 33, pp.46–58
Howard-Jones, P.A., and Demetriou, S. (2009) Uncertainty and engagement with learning
games. Instructional Science, 37:6, pp.519–536.
Howell, P. (2011) Schematically Disruptive Game Design, Proceedings of DiGRA 2011
Conference: Think Design Play
Huang, C., Chen, C. and Chung, H. (2004) The Review of Applications and Measurements in
Facial Electromyography, Journal of Medical and Biological Engineering, 25:1, pp.15-20
Humphries, M. (2011) Sony patents biometric controllers to monitor gamer's state [Website]
http://www.geek.com/articles/games/sony-patents-biometric-controllers-to-monitor-gamers-state-2011113/
Hug, D. (2011) New Wine in New Skins: Sketching the Future of Game Sound Design, in:
(Ed.) Grimshaw, M., Game Sound Technology and Player Interaction, IGI Global, pp.384-415, DOI: 10.4018/978-1-61692-828-5
Hughes, C. E. and Stapleton, C. B. (2005) The Shared Imagination: Creative Collaboration in
Augmented Virtuality, Proceedings of Human Computer Interaction International 2005
(HCII2005), Las Vegas, NV
IBM (2012) Biometrics [Website]
http://researcher.watson.ibm.com/researcher/view_project.php?id=1913
ID Software (2004) Doom 3 [Computer Video Game], Activision/Bethesda Softworks
IJsselsteijn, W., de Kort, Y., Poels, K., Jugelionis, A. and Bellotti, F. (2007) Characterizing
and Measuring User Experiences in Digital Games, in: Proceedings of the ACE Conference
2007 (ACM Publishers).
IJsselsteijn, W., Poels, K., and de Kort, Y. A. W. (2008) The Game Experience
Questionnaire: Development of a self-report measure to assess player experiences of digital
games, Eindhoven: TU Eindhoven
Independent.co.uk (2009) Best Selling Video Games of 2009 [Website]
http://www.independent.co.uk/life-style/gadgets-and-tech/news/best-selling-video-games-of-2009-modern-warfare-2-beats-nintendos-wii-1888662.html
Infinity Ward (2009) Call of Duty: Modern Warfare 2 [Computer Video Game], Activision
Internet in Numbers (2010) [Website] http://royal.pingdom.com/2011/01/12/internet-2010-in-numbers/
Isen, A. M., Shalker, T. E., Clark, M., and Karp, L. (1978), Resources required in the
construction and reconstruction of conversation, Journal of Personality and Social
Psychology, 36, pp.1-12
Isen, A. M., and Geva, N. (1987) The influence of positive affect on acceptable level of risk:
The person with a large canoe has a large worry, Organizational Behavior and Human
Decision Processes, 39, pp.145-154
Ismail, F., Biedert, R., Dengel, A. and Buscher, G. (2011) Emotional Text Tagging
http://gbuscher.com/publications/IsmailBiedert11_EmotionalTextTagging.pdf
Jackson, D., Malmstadt, J., Larson, C. and Davidson, R. (2000) Suppression and
enhancement of emotional responses to unpleasant pictures, Psychophysiology, 37,
pp.515–522
Jackson, P.L., Brunet, E., Meltzoff, A.N., Decety, J. (2006) Empathy examined through the
neural mechanisms involved in imagining how I feel versus how you feel pain: An event-related fMRI study, Neuropsychologia, 44, pp.752–761
James, W. (1884) What is an Emotion? Mind, 9:34, pp.188-205, Oxford University Press
Jancke, L., Vogt, J., Musial, F., Lutz, K., and Kalveram, K. T. (1996) Facial EMG responses
to auditory stimuli. International Journal of Psychophysiology, 22, pp.85-96
Jasper, H. H. (1958) The ten-twenty electrode system of the international
federation, Electroencephalogr. Clin. Neurophysiol., 10, pp.371–375
Jitaree, S., Phinyomark, A., Hu, H., Phukpattaranont, P. and Limsakul, C. (2012) Design of
EMG Biofeedback System for Lower-Limb Exercises of the Elderly Using Video Games,
Journal of Sports Science and Health, 12:2, pp.172-187
Johnston, W. A., Dark, V. J., & Jacoby, L. L. (1985) Perceptual fluency and recognition
judgments, Journal of Experimental Psychology: Learning, Memory, and Cognition, 11,
pp.3-11
Joly, J. (2012) Can a Polygon Make You Cry? [Website]
http://www.jonathanjoly.com/front.htm
Jones, M. B., and Jones, D. R. (1995) Preferred pathways of behavioural contagion. Journal
of Psychiatric Research, 29, pp.193-209
Jones, N. A., Field, T., Fox, N. A., Davalos, M. and Gomez, C. (2001) EEG during different
emotions in 10-month-old infants of depressed mothers, Journal of Reproductive and Infant
Psychology, 19:4, pp.295-312
Jorgensen, K. (2006) On the Functional Aspects of Computer Game Audio, Audio-Mostly
2006, Pitea, Sweden
Spielberg, S. (1993) Jurassic Park [Film], Amblin Entertainment, Universal Pictures
Kallinen, K., Ravaja, N. (2007) Comparing speakers versus headphones in listening to news
from a computer – individual differences and psychophysiological responses, Computers in
Human Behaviour, 23:1, pp.303–317
Kallio, K.P., Mäyrä, F., and Kaipainen, K. (2011) At Least Nine Ways to Play:
Approaching Gamer Mentalities, Games and Culture, 6:4, pp.327-353
Kant, I. (1964) The Critique of Judgement, (trans.) J.C. Meredith, Clarendon Press, Oxford
Kahan, T. L. and LaBerge, S. (1994) Lucid Dreaming as Metacognition: Implications for
Cognitive Science, Consciousness and Cognition, 3, pp.246-264
Kajastila, R. and Lokki, T. (2009) A gesture-based and eyes-free control method for mobile
devices, Ext. Abs. CHI 2009, ACM, pp.3559-3564
Kappas, A. and Pecchinenda, A. (1999) Don't Wait for the Monsters to Get You: A Video
Game Task to Manipulate Appraisals in Real Time, Cognition and Emotion, 13:1, pp.119-124
Keeker, K., Pagulayan, R., Sykes, J. and Lazzaro, N. (2004) The untapped world of video
games. CHI'2004, pp.1610-1611
Keysers, C. (2011) The Empathic Brain. Amazon
Kiesler, S., and Sproull, L. S. (1986) Response effects in the electronic survey, Public
Opinion Quarterly, 50, pp.402-413
Kimbrell, T.A., Ketter, T.A., George, M.S. et al. (1995) Assessment of PET data in individual
patients with mood disorders. ACNP Annual Meeting 1995 (Abstract).
Kimbrell, T. A., George, M. S., Parekh, P. I., Ketter, T. A., Podell, D. M., Danielson, A. L.,
Repella, J. D., Benson, B. E., Willis, M. W., Herscovitch, P., and Post, R. M. (1999) Regional
brain activity during transient self-induced anxiety and anger in healthy adults. Biological
Psychiatry, 46, pp.454-465.
King, J. L. (2007) Dig the Dirt: Hashing over Hygiene in the Artifice of the Real, in: (Eds.)
Crowston, K., Sieber, S. and Wynn, E., Virtuality and Virtualization, Boston: Springer,
pp.13-18
Kirkland, E. (2009) Horror Videogames and the Uncanny, Winter Forum on 'The Uncanny',
Chichester University
Kivikangas, J. M. (2006) Psychophysiology of Flow Experience: An Explorative Study.
University of Helsinki, Helsinki, Finland
Kivikangas, J. M., Ekman, I., Chanel, G., Järvelä, S., Salminen, M., Cowley, B., Henttonen,
P., Ravaja, N. (2010) Review on psychophysiological methods in game research, Proc. of 1st
Nordic DiGRA, DiGRA
Klauer, K.C., Munsch, J., & Naumer, B. (2000) On belief bias in syllogistic reasoning,
Psychological Review, 107, pp.852-884.
Klem, G.H., Lüders, H.O., Jasper, H.H. and Elger, C. (1999) The ten-twenty electrode system
of the International Federation, Electroencephalography and Clinical Neurophysiology, Suppl. 52, pp.3-6
Klimmt, C. (2003) Dimensions and determinants of the enjoyment of playing digital games:
A three-level model. In: (Eds.) M. Copier & J. Raessens, Level up: Digital games research
conference, pp. 246–257, Utrecht, The Netherlands: Faculty of Arts, Utrecht University
Klimmt, C., Rizzo, A., Vorderer, P., Koch, J., and Fischer, T. (2009) Experimental Evidence
for Suspense as Determinant of Video Game Enjoyment. CyberPsychology & Behavior, 12:1,
pp.29-31
Koelsch, S., Kilches, S., Steinbeis, N. and Schelinski, S. (2008) Effects of Unexpected
Chords and of Performer's Expression on Brain Responses and Electrodermal Activity. PLoS
ONE, 3:7
Konstan, D. (2010) Rhetoric and Emotion, in: Worthington, I., A Companion to Greek
Rhetoric, Wiley-Blackwell
Krathwohl, D. R, Anderson, L. W. (2001) A Taxonomy for Learning, Teaching, and
Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. Theory into
Practice, 41:4, pp.212
Krantz, J. H. (2001) Stimulus delivery on the Web: What can be presented when calibration
isn't possible, in: (Eds.) Reips, U.D. and Bosnjak, M., Dimensions of Internet Science,
pp.113-130, Lengerich, Germany: Pabst Science.
Krause, B. L. (1993) The Niche Hypothesis: A hidden symphony of animal sounds, the
origins of musical expression and the health of habitats, The Explorers Journal, Winter 1993,
pp.156-160
Kromand, D. (2008) Sound and the diegesis in survival horror games, in: Audio Mostly 2008,
Pitea, Sweden
Kumari, V. (2001) Enhanced Startle Reactions to Acoustic Stimuli in Patients With
Obsessive-Compulsive Disorder, American Journal of Psychiatry, 158, pp.134-136
LaBerge, S. (1998) Dreaming and Consciousness, in: (Eds.) S. Hameroff, A. Kaszniak, and
A. Scott, Toward a Science of Consciousness II (pp.494-504) Cambridge, MA: MIT Press
LaBerge, S., and DeGracia, D.J. (1999) Varieties of lucid dreaming experience, In: (Eds.)
R.G. Kunzendorf, and B. Wallace, Individual differences in conscious experience,
Philadelphia: John Benjamins Publishing Company.
National Instruments (2011) LabVIEW 2011 [Computer Software]
Lakoff, G. and Johnson, M. (1999) Philosophy in the Flesh: The Embodied Mind and Its
Challenge to Western Thought, Basic Books, New York
Lang, P.J. (1995) The Emotion Probe: Studies of Motivation and Attention, American
Psychologist, 50, pp.372-385
Lange, R., & Houran, J. (1997) Context-induced paranormal experiences: Support for Houran
and Lange's model of haunting phenomena. Perceptual and Motor Skills, 84, pp.1455-1458.
Lang, P.J., Bradley, M.M., Cuthbert, B.N., Patrick, C.J. (1993) Emotion and
psychopathology: a startle probe analysis. Prog Exp Pers Psychopathol Res, 16, pp.163-199
Lang, P.J., Bradley, M.M. and Cuthbert, B.N. (1998) Emotion, motivation, and anxiety: Brain
mechanisms and psychophysiology. Biological Psychiatry, 44, pp.1248-1263
Lang, P., Davis, M. and Öhman, A. (2000) Fear and anxiety: animal models and human
cognitive psychophysiology, Journal of Affective Disorders, 61, pp.137-159
Lanteaume, L., Khalfa, S., Regis, J., Marquis, P., Chauvel, P. and Bartolomei, F. (2007)
Emotion induction after direct intracerebral stimulations of human amygdala, Cereb. Cortex,
17, pp.1307-1313
Larsen, J., Norris, C. and Cacioppo, J. (2003) Effects of positive and negative affect on
electromyographic activity over zygomaticus major and corrugator supercilii,
Psychophysiology, 40:5, pp.776–785
Laureys, S. and Tononi, G. (2009) The Neurology of Consciousness: Cognitive Neuroscience
and Neuropathology, Elsevier Academic, Oxford, UK
Lazzaro, N. (2004) Why we Play Games: 4 Keys to More Emotion without Story, GDC, 2004
Lazarus, R.S. (1964) A Laboratory Approach to the Dynamics of Psychological Stress. The
American Psychologist, 19, pp.400-411
LeDoux, J.E. (1995) Emotion: clues from the brain, Annual Review of Psychology, 46,
pp.209-35.
LeDoux, J.E., Sarina, M.R. and Schafe, G.E. (2004) Molecular Mechanisms Underlying
Emotional Learning and Memory in the Lateral Amygdala, Neuron, 44, pp.75-91
Valve (2008) Left 4 Dead [Computer Video Game] Electronic Arts/Steam
Levenson, R.W. (2003) Autonomic specificity and emotion, In: (Eds.) R.J. Davidson, K.R.
Scherer, and H.H. Goldsmith, Handbook of affective sciences, pp. 212–224, New York:
Oxford University Press.
Levenson, R. W. (1992) Autonomic nervous system differences among emotions,
Psychological Science, 3, pp.23-27
Levesque, J., Eugene, F., Joanette, Y., Paquette, V., Mensour, B., Beaudoin, G., Leroux,
J.M., Bourgouin, P. and Beauregard, M. (2003) Neural circuitry underlying voluntary
suppression of sadness, Biol. Psychiatry, 53:6, pp.502-510
Levis, D.J. (1995) Decoding traumatic memory: Implosive theories of psychopathology, in:
(Eds.) W. O'Donohue and L. Krasner, Theories of behavior therapy, pp.180-206, Washington,
DC: American Psychological Association.
Lewis, I., Watson, B. and White, K.M. (2008) Internet versus paper-and-pencil survey
methods in psychological experiments: Equivalence testing of participant responses to health-related messages, Australian Journal of Psychology, 61:2, pp.107-116
Li, Y., Ashkanasy, N.M. and Ahlstrom, D. (2010) Complexity Theory and Affect Structure: a
Dynamic Approach to Modeling Emotional Changes in Organizations, in: (Eds.) Zerbe, W.J.,
Hartel, E.J. and Ashkanasy, N.M., Emotions and Organizational Dynamism, Emerald Group
Publishing, UK
Liberman N. and Trope Y. (2008) The role of feasibility and desirability considerations in
near and distant future decisions: A test of temporal construal theory, Journal of Personality
and Social Psychology, 75, pp.5–18
Liberman, N. and Trope, Y. (2008) The Psychology of Transcending the Here and Now,
Science, 322, pp.1201-1205
Lievesley, R., Wozencroft, M. and Ewins, D. (2011) The Emotiv EPOC neuroheadset: an
inexpensive method of controlling assistive technologies using facial expressions and
thoughts? Journal of Assistive Technologies, 5:2, pp.67-82
Lin, C.T., Ko, L.W., Chiou, J.C., Duann, J.R., Huang, R.S., Liang, S.F., Chiu, T.W. and Jung, T.P. (2008) Noninvasive neural prostheses using mobile and wireless
EEG, Proc. IEEE, 96:7, pp.1167–1183
Lin, Y.P., Duann, J.R., Chen, J.H. and Jung, T.P. (2010) EEG-based emotion recognition in
music listening, IEEE Trans. Biomed. Eng., 57:7, pp.1798–1806
Liu, C., Agrawal, P., Sarkar, N., and Chen, S. (2009) Dynamic Difficulty Adjustment in
Computer Games Through Real-Time Anxiety-Based Affective Feedback, International
Journal of Human-Computer Interaction, 25:6, pp.506-529
Liu, Y., Sourina, O. and Nguyen, M.K. (2010) Real-Time EEG-based Human Emotion
Recognition and Visualization, in: Proc. 2010 Int. Conf. on Cyberworlds, Singapore, pp.262-269
Lombard, M. and Ditton, T. (1997) At the heart of it all: The concept of presence. Journal of
Computer Mediated Communications, 3:2, http://www.ascusc.org/jcmc/vol3/issue2/lombard.html
Lorber, M. (2004) The psychophysiology of aggression, psychopathy, and conduct problems:
A meta-analysis. Psychological Bulletin, 130, pp.531–552
Ma, M. and McKevitt, P. (2005) Lexical Semantics and Auditory Presentation in Virtual
Storytelling, in: P. Fröhlich and M. Pucher, Proc. Of the Workshop “Combining Speech and
Sound in the User Interface”
Maclean, P.D. (1952) Psychiatric implications of physiological studies on frontotemporal
portion of limbic system (visceral brain), Electroencephalogr Clin Neurophysiol Suppl., 4,
pp.407-18
Maddock, R.J. (1999) The Retrosplenial Cortex and Emotion: New Insights from Functional
Neuroimaging of the Human Brain, Trends in Neuroscience, 22, pp.310-316
Mandryk, R. and Inkpen, K.M. (2004) Physiological indicators for the evaluation of co-located collaborative play, Presented at CSCW'04, Nov 6-10, 2004, Chicago, IL.
Mandryk, R.L., Inkpen, K., and Calvert, T.W. (2006) Using psychophysiological techniques
to measure user experience with entertainment technologies, Behaviour and Information
Technology (Special Issue on User Experience), 25:2, pp.141–158
Mangina, C.A. and Beuzeron-Mangina, J.H. (1996) Direct Electrical Stimulation of Specific
Human Brain Structures and Bilateral Electrodermal Activity, International Journal of
Psychophysiology, 22, pp.1-8
Marley, J. (2008) Developing a Model of the Insular Cortex and Emotional Regulation
Marsella, S. and Gratch, J. (2002) A Step Toward Irrationality: Using Emotion to Change
Belief, 1st international conference on Autonomous Agents and Multi-agent Systems,
Bologna, ACM, July 2002
Bioware (2007) Mass Effect [Computer Video Game] Microsoft Game Studios
Massey, H. (2004) Recommendations for Surround Sound Production, Grammy.org,
http://www2.grammy.com/PDFs/Recording_Academy/Producers_And_Engineers/5_1_Rec.pdf
Massumi, B. (2005) Fear - the Spectrum Said, Positions, 13:1, pp.31-48
Remedy (2001) Max Payne [Computer Video Game], Rockstar
McAllister, N. (2011) Biometrics – The Future of Video Games? Edge Online, [Website]
http://www.edge-online.com/features/biometrics-future-videogames/
McMahan, A. (2003) Immersion, engagement, and presence: A new method for analyzing 3D video games, in: M. J. P. Wolf & B. Perron (Eds.), The Video Game Theory Reader, pp.
67–87, New York: Routledge.
Medin, D. L., Lynch, E. B., & Solomon, K. O. (2000) Are there kinds of concepts? Annual
Review of Psychology, 51, pp.121–147
McLeod, D., Lucci, D. (2009). PDA Technology to Improve Self-Awareness in Teens with
ASD, International Meeting for Autism Research, May 7-9.
Meehan, M., Insko, B., Whitton, M., & Brooks Jr., F. P. (2002) Physiological Measures of
Presence in Stressful Virtual Environments. ACM Transactions on Graphics, Proceedings of
ACM SIGGRAPH, 21:3, pp.645-653
Mériau, K., Wartenburger, I., Kazzer, P., Prehn, K., Lammers, C.H., van der Meer, E.,
Villringer, A., Heekeren H.R. (2006) A neural network reflecting individual differences in
cognitive processing of emotions during perceptual decision making. NeuroImage., 33,
pp.1016–1027
Metacritic.com (2005) Resident Evil 4 Review [Website]
http://www.metacritic.com/game/playstation-2/resident-evil-4
Miller, G.A. (1956) The magical number seven, plus or minus two: Some limits on our
capacity for processing information. Psychological Review, 63, 2, pp.81-97
Miller, S. (1998) Monitors and Blunters: Different Patient Coping Styles. Oncology NEWS
international, 7:9
Mirza-Babaei, P., Long, S., Foley, E. and McAllister, G. (2011) Understanding the Contribution of
Biometrics to Games User Research, Proceedings of DiGRA 2011 Conference: Think
Design Play
Miyake, K., Campos, J., Kagan, J., & Bradshaw, D. (1986). Issues in socioemotional
development. In H. Stevenson, H. Azuma & K. Hakuta (Eds.), Child development and
education in Japan, pp.239-261, New York: W.H. Freeman and Company.
Moffat, D.C. and Kiegler, K. (2006) Investigating the effects of music on emotions in games,
in: Audio-Mostly 2006, Pitea, Sweden
Moncrieff, S., Venkatesh, S. and Dorai, C. (2001) Affect computing in film through sound
energy dynamics, in: Proceedings of the Ninth ACM International Conference on
Multimedia, 9, pp.525-527
Monolith (2005) Condemned: Criminal Origins [Computer Video Game], Sega
Monolith (2006) F.E.A.R [Computer Video Game], Vivendi Universal
Monolith (2009) F.E.A.R 2: Project Origin [Computer Video Game], WB Games
Mulhall, D. (2011) Our Molecular Future: How Nanotechnology, Robotics, Genetics and
Artificial Intelligence will Transform our World, Bioethics Research Library,
https://repository.library.georgetown.edu/handle/10822/547211
Mulholland, T.B. (1973) Objective EEG Methods for Studying Covert Shifts of Visual
Attention, In: F. McGuigan and R.Schoonover (Eds.) The Psychophysiology of Thinking,
Academic, New York, pp.109-151
Murphy, D., Neff, F. (2011) Spatial Sound for Computer Games and Virtual Reality, in: (Ed.)
M. Grimshaw, Game Sound Technology and Player Interaction: Concepts and
Developments. Hershey PA: Information Science Reference, pp.287-312
Murphy, D., and Pitt, I. 2001. Spatial sound enhancing virtual story telling, Lecture Notes in
Computer Science, 2197, pp. 20-29
Murphy, K.J. and Brunberg, J.A. (1997) Adult claustrophobia, anxiety and sedation in MRI,
Magnetic Resonance Imaging, 15, pp.51-54
Murugappan, M., Ramachandran, N., Sazali, Y. (2009) Classification of human emotion from
EEG using discrete wavelet transform, Journal of Biomedical Science and Engineering,
Vol.3 no. 4, pp.390-396
Murugappan, M., Rizon, M., Nagarajan, R. and Yaacob, S. (2010) Inferring of Human
Emotional States using Multichannel EEG, European Journal of Scientific Research,
48:2, pp.281-299
Murugappan, M., Rizon, M., Nagarajan, R. and Yaacob, S. (2010) Combining Spatial
Filtering and Wavelet Transform for Classifying Human Emotions Using EEG Signals,
Journal of Medical and Biological Engineering, 31(1): 45-51
Murugappan, M., Karthikeyan, P. and Yaacob, S. (2011) A review on stress inducement
stimuli for assessing human stress using physiological signals, IEEE 7th
International Colloquium on Signal Processing and its Applications (CSPA), pp.420–425,
4–6 March 2011
Mutz, D.C. (2011) Population Based Survey Experiments, Princeton University Press
Mycryengine.com (2012) www.mycryengine.com [Website]
Cyan (1993) Myst [Computer Video Game], Sunsoft
Nacke, L., Drachen, A., Kuikkaniemi, K., Niesenhaus, J., Korhonen, H., Hoogen, W., Poels,
K., IJsselsteijn, W., and Kort, Y. (2009) Playability and Player Experience Research, In:
Proc. DiGRA 2009
Nacke L. (2008) Focus on your players: Psychophysiological player experience logging as a
powerful tool for gameplay analysis, http://www.slideshare.net/acagamic/quo-vadisinteraction-and-psychophysiology-talk-2008
Nacke, L.E., Grimshaw, M.N., Lindley, C.A. (2010) More Than a Feeling: Measurement of
Sonic User Experience and Psychophysiology in a First-Person Shooter Game, Interacting
with Computers, doi:10.1016/j.intcom.2010.04.005
Nacke, L. and Lindley, C. (2008) Boredom, Immersion, Flow - A Pilot Study Investigating
Player Experience. In IADIS Gaming 2008: Design for Engaging Experience and Social
Interaction.
Nacke, L. E. and Mandryk, R. L. (2010) Designing Affective Games with Physiological
Input, Fun and Games, September 2011, Leuven, Belgium
Nagai, Y., Goldstein, L.H., Fenwick, P.B.C. and Trimble, M.R. (2004) Clinical efficacy of
galvanic skin response biofeedback training in reducing seizures in adult epilepsy: a
preliminary randomized controlled study. Epilepsy & Behavior, 5, pp.216–223.
Nanavati, S.H. and Rajkumar, H. (2002) Identity Verification Method Using a Central
Biometric Authority, US Patent, Fusion Arc Inc. New York, US
Natkin, S. (2000) Mapping a Virtual Sound Space into a Real Visual Space, ICMC, Berlin,
Germany
Niedenthal, P.M. (2007) Embodying Emotion, Science, 316, pp.1002-1005, DOI:
10.1126/science.1136930
Niedenthal, P. M., Halberstadt, J. B., and Innes-Ker, A. H. (1999) Emotional response
categorization, Psychological Review, 106, pp.337–361
Nielsen Netratings (2000) Internet year 1999 in review [Website] http://www.nielsennetratings.com/press_releases/pr_000120_review.htm
Nijholt, A., Bos, D. O. and Reuderink, B. (2009) Turning Shortcomings into Challenges:
Brain-Computer Interfaces for Games. Entertainment Computing, 1:2, pp.85-94
Norman, D.A. (2004) Emotional Design: Why We Love (Or Hate) Everyday Things. Basic
Books, New York, NY
Nosek, B.A. and Banaji, M.R. (2002) E-Research: Ethics, Security, Design, and Control in
Psychological Research on the Internet, Journal of Social Issues, 58: 1, pp. 161-176
Novak, T. P., Hoffman, D. L. and Yung, Y. F. (2000) Measuring the Flow Construct in
Online Environments: A Structural Modeling Approach. Marketing Science, 19:1, pp.22-42.
Oatley, K., Keltner, D. and Jenkins J.M. (2006) Understanding Emotions, John Wiley & Sons
Ochs, M., Niewiadomski, R., Pelachaud, C. and Sadek, D. (2005) Intelligent expressions of
emotions. In: Proceedings of First International Conference on Affective Computing &
Intelligent Interaction, Beijing, China
Öhman, A. 2000. Fear and anxiety: Evolutionary, cognitive, and clinical perspectives. In:
Lewis, M. & Haviland-Jones, J. (Eds.). Handbook of emotions, pp.573–593. New York: The
Guilford Press
Öhman, A. and Soares, J. J. F. (1994) Unconscious anxiety: Phobic responses to masked
stimuli. Journal of Abnormal Psychology, 103, pp.231–240
ONS (2010) Internet Access, [Website] Office for National Statistics survey,
http://www.statistics.gov.uk/cci/nugget.asp?id=8
ONS (2011) Internet Access, [Website] Office for National Statistics survey,
http://www.statistics.gov.uk/articles/nojournal/internet-access-q1-2011.pdf
Oostenveld, R. and Praamstra, P. (2000) The Five percent electrode system for high-resolution EEG and ERP measurements, Clinical Neurophysiology, 112, pp.713-719
Orgs, G., Lange, K., Dombrowski, J. and Heil, M. (2007) Is conceptual priming for
environmental sounds obligatory? International Journal of Psychophysiology, 65(2):162–166
Oppenheimer, J.A. and Frohlich, N. (1996) Experiencing Impartiality to Invoke
Fairness in the n-PD: Some Experimental Results, Public Choice, 86, pp.117 - 135
Özcan, E. and Van Egmond, R. (2009) The effect of visual context on the identification of
ambiguous environmental sounds, Acta Psychologica, 131, 110-119
Panconesi, E. and Hautmann, G. (1996) Psychophysiology of stress in dermatology: the
psychobiologic pattern of psychosomatics. Dermatol Clin, 14, pp.399–421
Panksepp, J. 1991. Affective neuroscience: A conceptual framework for the neurobiological
study of emotions. In: K. T. Strongman (ed.) International Review of Studies on Emotion, Vol.
1, pp.59–100. Chichester, UK: John Wiley & Sons.
Panksepp, J. (2005) Affective consciousness: core emotional feelings in animals and
humans, Consciousness and Cognition, 14, pp.30–80
Papez, J. W. (1937) A Proposed Mechanism of Emotion. Arch. Neurol Psychiat, 79, pp.217-224
Parker, J. and Heerema, J. 2007. Audio Interaction in Computer Mediated Games.
International Journal of Computer Games Technology, Vol. 2008, Article ID 178923.
Parker, J.R. and Heerema, J. (2008) Audio Interaction in Computer Mediated Games,
International Journal of Computer Games Technology, 2008, Article ID 178923
Parkes, D. and Thrift, N. (1980) Times, Spaces and Places: a Chronogeographic Perspective,
New York: John Wiley
Patel, N. (2009) Nintendo Wii Vitality Sensor detects your pulse [Website]
http://www.engadget.com/2009/06/02/nintendo-wii-vitality-sensor-detects-your-pulse/
IBM (2009) PASW v18 [Computer Software]
Pearce, C. (2008) The Truth About Baby Boomer Gamers: A Study of Over-Forty Computer
Game Players, Games and Culture, 3, pp.142
Peretto, P. (1992) An Introduction to the Modeling of Neural Networks, Cambridge
University Press, Cambridge, England.
Perron, B. (2004) Sign of a Threat: The Effects of Warning Systems in Survival Horror
Games, Cosign 2004 Proceedings, Art Academy, University of Split, p. 132-141,
Perron, B. (2005) Coming to Play at Frightening Yourself: Welcome to the World of Horror
Video Games, in Aesthetics of Play, Bergen, Norway, October 14th-15th
Peters, C., Asteriadis, S., Rebolledo-Mendez, G. (2009) Modelling user attention for human-agent interaction, 10th Workshop on Image Analysis for Multimedia Interactive Services,
pp.266-269
Pflieger, M.E. and Sands, S.F. (1996) 256-Channel ERP Information Growth, Neuroimage
Adobe (2008) Photoshop CS4 [Computer Software]
Picard, R.W. (2000) Toward computers that recognize and respond to user emotion. IBM
Systems Journal, 39, pp.3-4
Picard, R. W. (2010) Emotion Research by the People, for the People, Emotion Review, 2,
pp.250-254
Pine, B.J. and Gilmore, J.H. (1999) The experience economy. Harvard: Harvard Business
School Press
Plutchik, R. (2002). Nature of emotions. American Scientist, 89, p.349
Poag, M. (2008) Anxiety Disorders, in: Psychiatry Clerkship Guide (ed. Manley, M.) Mosby
Elsevier
Poh, M.Z., Loddenkemper, T., Swenson, N.C., Goyal, S., Madsen, J.R. and Picard, R.W.,
(2010) Continuous Monitoring of Electrodermal Activity During Epileptic Seizures Using a
Wearable Sensor, Conf Proc IEEE Eng Med Biol Soc, pp.4415-4418
Poh, M.Z., Swenson, N.C. and Picard, R.W. (2010) A Wearable Sensor for Unobtrusive,
Long-Term Assessment of Electrodermal Activity, IEEE Transactions on Biomedical
Engineering, 57:5, pp.1243-1252
Poole, S. (2000) Trigger Happy: The Inner Life of Videogames, London: Fourth Estate
Porges, S.W. (1998) Love: an emergent property of the mammalian autonomic nervous
system. Psychoneuroendocrinology, 23, pp.837-861
Price, J.L. (1999) Prefrontal cortical networks related to visceral function and mood. Ann N
Y Acad Sci. 877, pp.383–396
Prinz, J. (2004) Gut Reactions: A Perceptual Theory of Emotion. Oxford: Oxford University
Press.
Qiang, W., Sourina, O. and Khoa, N.M. (2010) A Fractal Dimension Based Algorithm for
Neurofeedback Games, Proc. CGI, 2010. http://www.ntu.edu.sg/home/eosourina/Papers/CGI2010NeurofeedbackGame.pdf
Rachman, S. (2004) Fear of Contamination, Behaviour Research and Therapy, 42:11,
pp.1227-1255
Radford A. (2000) Games and Learning about Form in Architecture. Automation in
Construction, 9, pp.379-385
Ubisoft (2006) Rainbow Six: Vegas [Computer Video Game]
Ranky, G.N. (2010) Analysis of a commercial EEG device for the control of a robot arm,
Proceedings of the 2010 IEEE 3th Annual Northeast Bioengineering Conference, 26th-28th
March 2010
Ravaja, N. (2002) Presence-related influences of a small talking facial image on
psychophysiological measures of emotion and attention. Proceedings of the 5th Annual
International Workshop Presence 2002. Porto, Portugal: University Fernando Pessoa
Ravaja, N. (2004) Contributions of psychophysiology to media research: Review and
recommendations. Media Psychology, 6, pp.193-235
Ravaja, N., Saari, T., Turpeinen, M., Laarni, J., Salminen, M. and Kivikangas, M. (2006)
Spatial presence and emotions during video game playing: does it matter with whom you
play? Presence Teleoperators & Virtual Environments, 15, pp.381–392
Ravaja, N. and Kivikangas, J. M. (2008) Psychophysiology of digital game playing: The
relationship of self-reported emotions with phasic physiological responses, Proceedings of
Measuring Behavior, pp.26-29
Ravaja, N. et al. (2004) Spatial Presence and Emotional Responses to Success in a Video
Game: A Psychophysiological Study, Presence, Proceedings of the NordiCHI 2004
Ravaja, N., Saari, T., Salminen, M., Laarni, J., and Kallinen, K. (2006) Phasic emotional
reactions to video game events: A psychophysiological investigation. Media Psychology, 8:4,
pp.343–367
Ravaja, N., Turpeinen, M., Saari, T., Puttonen, S. and Keltikangas-Jarvinen, L. (2008) The
Psychophysiology of James Bond: Phasic Emotional Responses to Violent Video Game
Events. Emotion, 8:1, pp.114-120.
Sunsoft (2000) RealMyst [Computer Video Game]
Reber, R., Schwarz, N. and Winkielman, P. (2004) Processing Fluency and Aesthetic
Pleasure: Is Beauty in the Perceiver's Processing Experience? Personality and Social
Psychology Review, 8:4, pp.364–382
Rebolledo-Mendez, G., Dunwell, I., Martínez-Mirón, E., Vargas-Cerdán, M., de Freitas,
S., Liarokapis, F. and García-Gaona, A. (2009) Assessing NeuroSky's Usability to Detect
Attention Levels in an Assessment Exercise, in: HCI, New Trends, pp.149-158
Rebolledo-Mendez, G. and De Freitas, S. (2008) Attention modeling using inputs from a
Brain Computer Interface and user-generated data in Second Life. In: The Tenth International
Conference on Multimodal Interfaces (ICMI 2008), Crete, Greece
Reeves, B. and Nass, C. (1996) The Media Equation, Center for the Study of Language and
Information, Stanford University
Reips, U.D. (1995) The Web experiment method [On-line document]. Available:
http://www.genpsy.unizh.ch/Ulf/Lab/WWWExpMethod.html.
Reips, U.D. (1997) Psychological experimenting on the Internet, In B. Batinic (Ed.),
Internet für Psychologen, pp. 245-265, Göttingen, Hogrefe
Reips, U.D. (2002) Standards for Internet-Based Experimenting, Experimental Psychology,
49:4, pp.243-256
Reips, U. (2001) The Web Experimental Psychology Lab: Five years of data collection on the
Internet, Behavior Research Methods, Instruments, & Computers, 33:2, pp.201-211
Reips, U. (2007) The Methodology of internet based experiments, in: (Eds.) Joinson, A.,
McKenna, K., Postmes, T., Reips, U. The Oxford Handbook of internet psychology, Oxford
University Press
Reisenzein, R. (2000) Exploring the strength of association between the components of
emotion syndromes: the case of surprise. Cogn. Emot. 14, pp.1–38
Remedy Entertainment (2010) Alan Wake, [Computer Video Game] Microsoft Game Studios
Renard, Y., Lotte, F., Gibert, G., Congedo, M., Maby, E., Delannoy, V., Bertrand, O. and
Lécuyer, A. (2010) Openvibe: An open-source software platform to design, test and use
brain-computer interfaces in real and virtual environments, Presence: teleoperators and
virtual environments, 19:1, 2010.
Richard, E., Tijou, A., Richard, P. and Ferrier, J. (2006) Multi-modal virtual environments
for education with haptic and olfactory feedback, Virtual Reality, 10:3, pp.207-225
Rizon, M. (2010) Discrete Wavelet Transform Based Classification of Human Emotions
Using Electroencephalogram Signals, American Journal of Applied Sciences, 7:7, pp.878-885
Rockett, W.H. (1988) Devouring Whirlwind. Terror and Transcendence in the Cinema of
Cruelty, Greenwood Press, New York.
Rockstar (2001) Grand Theft Auto 3 [Computer Video Game]
Rolls, E.T. (2000) Précis of the brain and emotion, Behavioral and Brain Sciences, 23,
pp.177-234
Romanosky, S., Telang, R. and Acquisti, A. (2011) Do Data Breach Disclosure Laws Reduce
Identity Theft, Journal of Policy Analysis and Management, 20:2, pp.256-286
Rosenthal, R.H. and Allen, T.W. (1978) An Examination of Attention, Arousal, and Learning
Dysfunctions of Hyper-Kinetic Children, Psychological Bulletin, 85, pp.689-715
Roy, M., Mailhot, J. P., Gosselin, N., Paquette, S., and Peretz, I. (2008) Modulation of the
startle reflex by pleasant and unpleasant music, International Journal of Psychophysiology,
71, pp.37–42
Roychoudhuri, L., Al-Shaer, E., Hamed, H. and Brewster, G.B. (2003) Audio Transmission
over the Internet: Experiments and Observations. IEEE International Conference on
Communications (ICC), Anchorage, Alaska
Rubin, D.C. and Talarico, J.M. (2009) A Comparison of Dimensional Models of Emotion:
Evidence from Emotions, Prototypical Events, Autobiographical Memories, and Words,
Memory, 17:8, pp.802–808
Rugg, G. and Petre, M. (2007) A Gentle Guide to Research Methods. McGraw-Hill
Education, Open University Press.
Ruiz-Belda, M., Fernandez-Dols J., Carrera, P. and Barchard, K. (2002) Spontaneous facial
expressions of happy bowlers and soccer fans. Cogn. Emot. 17, pp.315–326
Russell, J.A. (1980) A circumplex model of affect, Journal of Personality and Social
Psychology, 39, pp.1161-1178
Russell, J.A. (1991). Culture and the categorization of emotions, Psychological
Bulletin, 110:3, pp.426–50.
Russell, J.A. (2003) Core affect and the psychological construction of emotion.
Psychological Review, 110, pp.145–172
Russell, J.A., Bachorowski, J. and Fernandez-Dols, J. (2003) Facial and vocal expressions of
emotion. Ann. Rev. Psychol. 54, pp.329-349
Sakurazawa, S. et al. (2004) Entertainment Feature of a Game Using Skin
Conductance Response, Proceedings of ACE 2004, Advances in Computer Entertainment
Technology, ACM Press, pp.181-186
Salzarulo, P. and Cipolli, C. (1974) Spontaneously recalled verbal material and its linguistic
organization in relation to different stages of sleep. Biological Psychology, 2, pp.47–57
Sanei, S. and Chambers, J. (2007) EEG signal processing. Chichester, England; Hoboken,
NJ: John Wiley & Sons.
Sasson, N.J. and Elison, J.T. (2012) Eye Tracking Young Children with Autism. J. Vis. Exp.,
61
Sato, W., Fujimura, T. and Suzuki, N. (1989) Enhanced facial EMG activity in response to
dynamic facial expressions, International Journal of Psychophysiology, 70, pp.70–74
Sato, W., Fujimura, T. and Suzuki, N. (2008) Enhanced facial EMG activity in response to
dynamic facial expressions, International Journal of Psychophysiology, 70, pp.70–74
Schachter, S. and Singer, J. (1962) Cognitive, Social, and Physiological Determinants of
Emotional State, Psychological Review, 69, pp. 379–399
Schadow, J., Lenz, D., Thaerig, S., Busch, N.A., Fründ, I., and Herrmann, C.S. (2007)
Stimulus intensity affects early sensory processing: sound intensity modulates auditory
evoked gamma-band activity in human EEG. Int. J. Psychophysiol., 65, pp.152–161
Schafer, R.M. (1977) The Tuning of the World, New York: Knopf
Schafer, R.M. (1994) Our Sonic Environment and the Soundscape: The Tuning
of the World, Destiny Books, Rochester, Vermont
Schauer, F., Ozvoldova, M. and Lustig, F. (2007) Real remote physics experiments across
Internet - inherent part of Integrated E-Learning, Conference ICL2007, September 26th-28th,
2007, Villach
Scherer, K.R. (1988) Criteria for emotion-antecedent appraisal: A review. In (Eds.) G.H.B.V.
Hamilton, and N.H. Frijda, Cognitive perspectives on emotion and motivation, pp.89-126
Scheucher, B., Bailey, P.H., Gutl, C. and Harward, J.V. (2009) Collaborative Virtual 3D
Environment for Internet-Accessible Physics Experiments, iJOE, 5:1
Schlögl, A., Slater, M. and Pfurtscheller, G. (2002) Presence research and EEG, in:
Proceedings of the 5th International Workshop on Presence. Porto, Portugal, 2002.
Schmidt, W.C. (1997) World-wide Web Survey Research: Benefits, Potential Problems and
Solutions, Behaviour Research Methods, Instruments and Computers, 29, pp.274-279
Schmidt, W. C. (2000). The server-side of psychology Web experiments, in: (Ed.) M. H.
Birnbaum, Psychological experiments on the Internet, pp. 285-310, San Diego, CA:
Academic Press.
Schmidt, L.A. and Trainor, L.J. (2001) Frontal brain electrical activity (EEG) distinguishes
valence and intensity of musical emotions, Cognition & Emotion, 15:4, pp.487–500
Schneider, S. 2004. Toward an Aesthetics of Cinematic Horror. In: Stephen Prince (ed.), The
Horror Film, New Brunswick: Rutgers University Press, pp.131-149.
Schwarz, S. and Reips, U.D. (2001) CGI versus JavaScript: A Web experiment on the
reversed hindsight bias, In: U.D. Reips & M. Bosnjak (Eds.) Dimensions of Internet Science,
pp. 75-90. Lengerich, Germany: Pabst Science.
Seigneur, J. (2011) The Emotional Economy for the Augmented Human, Proceedings of the
2nd Augmented Human Interaction Conference, Article No. 24, ACM, New York, USA
Sell, L.A., Morris, J., Bearn, J., Frackowiak, R.S., Friston, K.J. and Dolan, R.J. (1999) Activation of
reward circuitry in human opiate addicts, European Journal of Neuroscience, 11:3, pp.1042–1048
Fincher (1995) Se7en [Film], New Line Cinema
Sherman, D.K. and Kim, H.S. (2002) Affective Perseverance: The Resistance of Affect to
Cognitive Invalidation, Personality and Social Psychology Bulletin, 28:2, pp.224-237
Shilling, R., Zyda, M., and Wardynski, E. C., (2002) Introducing emotion into military
simulation and videogame design: America's Army: Operations and VIRTE, GameOn
Conference, London.
Shinkle, E. (2005) Feel it, don't think: the significance of affect in the study of digital games,
Proceedings of DiGRA 2005: Changing Views: Worlds in Play
Silva, J.R., Pizzagalli, D.G., Larson, C.L., Jackson, D.C. and Davidson, R.J. (2002) Frontal
Brain Asymmetry in Restrained Eaters, Journal of Abnormal Psychology, 111:4, pp.676–681
Slaney, M. (2002) Semantic–Audio Retrieval, IEEE International Conference on Acoustics,
Speech and Signal Processing, Orlando, Florida, May 13th-17th
Smith, G.M. (1999) Local Emotions, Global Moods, and Film Structure, in: (Eds.) G. Smith
and C. Plantinga, Passionate Views: Film, Cognition and Emotion, Johns Hopkins University
Press, Baltimore, pp.103-126
Solms, M. (1997) The neuropsychology of dreams: A clinico-anatomical study. Mahwah, NJ:
Erlbaum
Solomon, R.L. (1980) The Opponent-Process Theory of Acquired Motivation: The Costs of
Pleasure and the Benefits of Pain, American Psychologist, 35, pp.691-712
Sonnadara, R., Alain, C. and Trainor, L. (2006) Effects of spatial separation and stimulus
probability on the event-related potentials elicited by occasional changes in sound location,
Brain Research, 1071, pp.175-185.
Sony (2009) Vegas Pro 9 [Computer Software]
Sorgatz, H., Christ, O. and Englert, T. (2007) Event Related Surface EMG Potentials in
Typewriting: The Effects of Motor Program and Thermal Conditions on EMG Pattern, in:
Noraxon EMG Meeting 2007, http://www.noraxon.com/emg/EMG-Meeting2007_Abstract.pdf#page=20
Sotres-Bayon, F., Cain, C.K. and LeDoux, J.E. (2006) Brain mechanisms of fear extinction:
historical perspectives on the contribution of prefrontal cortex, Biol Psychiatry, 60, pp.329–336
Sourina, O. and Liu, Y. (2011) A fractal-based algorithm of emotion recognition from EEG
using arousal-valence model, Biosignals, 2011, Rome, Italy
Sparks, G. (1989) Understanding emotional reactions to a suspenseful movie: The interaction
between forewarning and preferred coping style, Communication Monographs, 56:4, pp.325-340
Spence, C., Nicholls, M.E.R., and Driver, J. (2001) The cost of expecting events in the wrong
sensory modality, Perception and Psychophysics, 63, pp.330-336
Spielberg, S. (1993) Jurassic Park [Film], Amblin Entertainment, Universal Pictures
Spinoza, B. (1677) Ethics
Srinivasan, R. (1999) Spatial structure of the alpha rhythm: global correlation in
adults and local correlation in children, Clinical Neurophysiology, 110:8, pp.1351-1362
Staats, A. W. and Eifert, G. H. (1990) The paradigmatic behaviorism theory of emotions:
Basis for unification, Clinical Psychology Review, 10, pp.539–566
Stamler, T.B., Lafreniere, L.L. and Dumala, D.K (2000) The Internet: An effective tool for
nursing research with women, Computers in Nursing, 18, pp.13-18
Stanley, M. A., Diefenbach, G. J., and Hopko, D. R. (2004) Cognitive behavioral treatment
for older adults with generalized anxiety disorder. A therapist manual for primary care
settings. Behavior Modification, 28, pp.73–117.
Stanton, J.M. (2008) ICAO and the Biometric RFID Passport: History and Analysis,
in: (Eds.) C. J. Bennett and D. Lyon, Playing the Identity Card: Surveillance, Security and
Identification in Global Perspective, New York: Routledge, pp.253–267
Stanton, J.M and Rogelberg, S.G. (2001) Using Internet/Intranet Web Pages to Collect
Organizational Research Data, Organizational Research Methods, 4, p.200
Steele, D.L. and Chon, S.H. (2007) A Perceptual Study of Sound Annoyance, Audio Mostly
2007, pp.9-24
Steinberg (2009) Cubase 5.1 [Computer Software]
Stepper, S. and Strack, F. (1993) Proprioceptive determinants of emotional and nonemotional
feelings, Journal of Personality and Social Psychology, 64, pp.211–220
Stern, J.A. (1964) Toward a definition of psychophysiology, Psychophysiology, 7, pp.90-91
Stevenson, J. (2008) Anxiety Disorders: How to Overcome Panic Attacks, Phobia and
Anxiety, Cranendonck Coaching
Straker, D. (2010) Changing Minds: In Detail, 2nd Edition, Syque
Strauss, E. and Moscovitch, M. (1981) Perception of facial expressions, Brain and Language
13, pp.308–332
Stroop, J.R. (1935) Studies of interference in serial verbal reactions, J. Exp. Psychol., 18,
pp.643–62
Stytsenko, K., Jablonskis, E. and Prahm, C. (2011) Evaluation of Consumer EEG Device
Emotiv EPOC, CogSci Conference 2011, Ljubljana, June 17th-18th
Svendsen, L. (2008) A Philosophy of Fear. Reaktion books.
Sykes, J. and Brown, S. (2003) Affective gaming: measuring emotion through the GamePad,
in: Conference Supplement to the Conference on Human Factors in Computing Systems,
ACM, New York, pp.732-733
Takahashi, K. (2004) Remarks on Emotion Recognition from Bio-Potential Signals, 2nd
International Conference on Autonomous Robots and Agents, December 13th-15th, 2004,
Palmerston North, New Zealand
Talati, A. and Hirsch, J. (2005) Functional specialization within the medial frontal gyrus for
perceptual go/no-go decisions based on “what,” “when,” and “where” related information: an
fMRI study, J. Cogn. Neurosci., 17, pp.981–993
Tammen, H. and Loviscach, J. (2008) Emotion in Video Games: Quantitative Studies?
Workshop Proceedings Emotion in HCI – Designing for People, pp.25-29
Tan, E. S. (1996) Emotions and the Structure of Narrative Film: Film as an Emotion
Machine, Lawrence Erlbaum Associates, Mahwah, N.J.
Tan, E. S. (2000) Emotion, art, and the humanities, in: (Eds.) M. Lewis and J. M.
Haviland-Jones, Handbook of emotions (2nd ed.), pp.116-134, New York: Guilford Press
Teplan, M. (2002) Fundamentals of EEG Measurement, Measurement Science Review, 2:2,
pp.1-11
Thayer, J. F., Friedman, B. H., and Borkovec, T. D. (1996) Autonomic characteristics of
generalized anxiety disorder and worry, Biological Psychiatry, 39, pp.255–266
Till, J. (2009) Architecture Depends, MIT Press, Cambridge, MA
Tinwell, A. (2009) The Uncanny as Usability Obstacle, in: Proceedings of the “Online
Communities and Social Computing” workshop, HCI International, Springer, San Diego, CA,
USA, July 19th-24th
Tinwell, A., Grimshaw, M. and Williams, A. (2010) Uncanny behaviour in survival horror
games, Journal of Gaming and Virtual Worlds, 2:1, pp.3–25
Tomkins, S.S. (1962) Affect, imagery, consciousness: Vol. 1. The positive affects, New York:
Springer
Truax, B. (1978) The handbook of acoustical ecology. A.R.C. Publications, Vancouver
Truax, B. (1984) Acoustic Communication, New Jersey: Ablex Publishing.
Tucker, D.M. (1992) Developing emotions and cortical networks, in: (Eds.) M. R. Gunnar
and C.A. Nelson, Minnesota symposium on child psychology, Developmental behavioral
neuroscience, 23, pp.75–128, Hillsdale, NJ: Erlbaum.
Tuten, T.L., Bosnjak, M. and Bandilla, W. (2000) Banner-advertised Web-surveys,
Marketing Research, 11:4, pp.17-21
Tuuri, K., Mustonen, M. and Pirhonen, A. (2007) Same sound – Different meanings: A Novel
Scheme for Modes of Listening, Audio Mostly 2007, Ilmenau, Germany
Tychsen, A. and Canossa, A. (2008) Defining personas in games using metrics, in: Future Play
'08: Proceedings of the 2008 Conference on Future Play, pp.73-80, Toronto, Ontario,
Canada, 2008, ACM Press
Ullsperger, P., Erdmann, U., Freude, G. and Dehoff, W. (2007) When sound and picture do not
fit: Mismatch negativity and sensory interaction, International Journal of
Psychophysiology, 59, pp.3-7
Unrealengine.com (2012) www.unrealengine.com [Website]
VaezMousavi, S.M. and Barry, R.J. (2009) Individual differences in task related activation
and performance, International Journal of Psychophysiology, 15:1, pp.34-39
Väisänen, J., Väisänen, O., Malmivuo, J. and Hyttinen, J. (2008) New method for
analysing sensitivity distributions of electroencephalography measurements, Medical
and Biological Engineering and Computing, 46, pp.101–108
Väljamäe A. and Soto-Faraco S. (2008) Filling-in visual motion with sounds, Acta
Psychologica, 129, pp.249–254
Valve (2004) Half-Life 2 [Computer Video Game], Sierra Entertainment
Valve (2008) Left 4 Dead [Computer Video Game] Electronic Arts/Steam
Van Den Broek, E., Schut, M., Westerink, J., van Herk, J. and Tuinenbreijer, K. (2006)
Computing emotion awareness through facial Electromyography, Lecture Notes in Computer Science, 3979,
pp.52-63
Van Duyne, D.K., Landay, J.A. and Hong, J.I. (2007) The Design of Sites, 2nd Edition,
Prentice Hall
Van Petten, C. and Rheinfelder, H. (1995) Conceptual relationships between spoken words
and environmental sounds: Event-related brain potential measures, Neuropsychologia, 33, pp.
485–508
van Reekum, C.M., Johnstone, T., Banse, R., Etter, A., Wehrle, T. and Scherer, K.R. (2004)
Psychophysiological responses to appraisal dimensions in a computer game, Cognition and
Emotion, 18:5, pp.663-688
VanScoy, H. (2006) Unraveling the Biology of Emotions, Psych Central,
http://psychcentral.com/lib/2006/unraveling-the-biology-of-emotions/
Van Winkle, E. (2000) The toxic mind: the biology of mental illness and violence, Medical
Hypotheses, 55, pp.356-368
Varma, D. (1966) The Gothic Flame, Russell and Russell, New York
Venables, P.H. and Christie, M.J. (1973) Mechanisms, Instrumentation, Recording
Techniques, and Quantification of Responses, in: Prokasy and Raskin, pp.1-124
Vespa, P.M., Nuwer, M.R., Nenov, V., et al. (1999) Increased incidence and impact of
nonconvulsive and convulsive seizures after traumatic brain injury as detected by continuous
electroencephalographic monitoring, J. Neurosurg., 91, pp.750–760
Visceral Games (2008) Dead Space [Computer Video Game] Electronic Arts
Von Uexkull, J. (1957) A Stroll Through the World of Animals and Men, in: (ed.) Schiller,
C., Instinctive Behaviour, New York: International Universities Press
Vourvopoulos, A. and Liarokapis, F. (2011) Brain-Controlled NXT Robot: Tele-operating a
Robot through Brain Electrical Activity, Third International Conference on Games and
Virtual Worlds for Serious Applications (VSGAMES), pp.140-143, 4th-6th
Vrana, S. R. (1993) The psychophysiology of disgust: Differentiating negative emotional
contexts with facial EMG, Psychophysiology, 30, pp.279–286
Vrana, S. R., Spence, E. and Lang, P. (1988) The Startle Probe Response: A New Measure of
Emotion? Journal of Abnormal Psychology, 97:4, pp.487-491
Walsh, B. (2009) The Web Startup Success Guide, Springer-Verlag: New York
Walsh, E.O., McQuivey, J.L. and Wakeman, M. (1999) Consumers barely trust Net
advertising. Cambridge, MA: Forrester Research.
Waterink, W. and Van Boxtel, A. (1994) Facial and jaw-elevator EMG activity in relation to
changes in performance level during a sustained information task. Biological Psychology, 37,
pp.183–198
Weber, R. and Alsina, T.M. (2009) Controlling Behaviour of Elements in a Display
Environment, US Patent, Apple Inc., US
WebKnow Internet Statistics (2011) [Website] http://www.webknow.com/screen-resolutionstatistics.asp
Wegner, D. M., Shortt, J. W., Blake, A. W. and Page, M. S. (1990) The suppression of
exciting thoughts, Journal of Personality and Social Psychology, 58, pp.409–418
Weiss, H. M., and Cropanzano, R. (1996) Affective events theory: A theoretical discussion of
the structure, causes and consequences of affective experiences at work, in: (Eds.) B.M.
Staw and L.L. Cummings, Research in Organizational Behavior. Annual Series of Analytical
Essays and Critical Reviews, 18, pp.1-74
Welch, N. and Krantz, J.H. (1996) The World-Wide Web as a medium for psychoacoustical
demonstrations and experiments: Experience and results, Behavior Research Methods,
Instruments, and Computers, 1996, 28:2, pp.192-196
Wiedemann, G., Pauli, P., Dengler, W., Lutzenberger, W., Birbaumer, N. and Buchkremer,
G., (1999) Frontal brain asymmetry as a biological substrate of emotions in patients with
panic disorders, Archives of General Psychiatry, 56, pp.78–84
Wilde, O. (1891) The Critic as Artist, in: Intentions
Wilde, O. (1966) Complete works, London
Wilson, M. (2002) Six views of embodied cognition. Psychon. Bull. Rev., 9, pp.625–36
Wilson, T.D., Wheatley, T., Meyers, J.M., Gilbert, D.T. and Axsom, D. (2000) Focalism: a
source of durability bias in affective forecasting, Journal of Personality and Social Psychology,
78:5, pp.821-36
Winer, J.A. (1979) The Art of Equalisation, Popular Electronics Magazine,
http://www.ethanwiner.com/equalizers.html.
Winer, J.A. (2005) Decoding the auditory corticofugal systems, Hearing Research, 207:1-2,
pp.1–9
Winkel, M., Novak, D.M. and Hopson, M. (1987) Personality factors, subject gender and the
effects of aggressive video games on aggression in adolescents, Journal of Research in
Personality, 21, pp.211-223
Winograd, T. and Flores, F. (1986) Understanding Computers and Cognition: A New
Foundation for Design, Norwood, NJ, Ablex Publishing
Witvliet, C.V.O. and Vrana, S.R. (1995) Psychophysiological responses as indices of
affective dimensions, Psychophysiology, 32, pp.436–443
Woodward, A.L. (2003) Infants' developing understanding of the link between
looker and object, Developmental Science, 6:3, pp.297–311
Woolley, B. (1993) Virtual Worlds: a journey in hype and hyperreality, Oxford. Blackwell
Wrightson, K. (2000) An Introduction to Acoustic Ecology, Soundscape: the Journal of
Acoustic Ecology, 1:1, pp.10-13
Wu, J., Li, P. and Rao, S. (2008) Why They Enjoy Virtual Game Worlds? An Empirical
Investigation, Journal of Electronic Commerce Research, 9:3, pp.219-230
Wu, C.H., Liu C.J. and Tzeng, Y. (2011) Brain Wave Analysis in Optimal Color Allocation
for Children's Electronic Book Design, ftp://ftp.scu.edu.tw/scu/network/tanet2011/
TANet2011/K1/1225.pdf
Wundt, W. (1904) Principles of physiological psychology (5th German edition, Volume 1)
(E. B. Titchener, trans.) New York: Macmillan
Xu, M., Chia, L., and Jin, J. (2005) Affective content analysis in comedy and horror videos by
audio emotional event detection, in: IEEE International Conference on Multimedia and Expo.
Yartz, A., Hawk Jr, L. (2002) Addressing the specificity of affective startle modulation: fear
versus disgust, Biological Psychology, 59, pp.55–68
Yehuda, R. (2002) Treating Trauma Survivors with PTSD, American Psychiatric
Publishing.
Yokota, T. and Fujimori, B. (1962) Impedance Change of the Skin During the Galvanic Skin
Reflex, Japanese Journal of Physiology, 12, pp.200-209
Yost, P.R. and Homer, L.E. (1998) Electronic versus paper surveys: Does the medium affect
the response? Annual meeting of the Society for Industrial and Organizational Psychology,
Dallas, TX
Yurgelun-Todd, D.A. and Killgore, W.D. (2006) Fear-related activity in the prefrontal cortex
increases with age during adolescence: a preliminary fMRI study, Neurosci. Lett. 406,
pp.194–199
Zajonc, R.B. (1984) On the Primacy of Affect, American Psychologist, 39, pp.117-123
Zillmann, D. (1996) The psychology of suspense in dramatic exposition, in: Suspense:
conceptualizations, theoretical analyses, and empirical explorations, Lawrence Erlbaum
Associates Inc.
Zoetrope Interactive (2007) Darkness Within: Pursuit of Loath Nolder [Computer Video
Game], Lighthouse Interactive/Iceberg Interactive
Zwicker, E. and Fastl, H. (1999) Psychoacoustics – Facts and Models, 2nd edition, Springer-Verlag
i) Real-time Biometric Fear Assessment of Audio during Video Gameplay
ii) Game Completion Time

Player     Game Completion time (seconds)    Player      Game Completion time (seconds)
Player 1   1400.47                           Player 6    592.49
Player 2   703.139                           Player 7    532.62
Player 3   471.14                            Player 8    529.59
Player 4   779.18                            Player 9    501.36
Player 5   385.45                            Player 10   816.97
ii) Average (mean) Biometric Output: EMG (5 seconds) and EDA (5 seconds)

Game Events        EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Player 1           -0.1932    0.1817     -0.0019    8.6128     0.0000     48.1212    61.6455    55.1202    2620.0685  0.0008
Player 2           -0.1895    0.1371     -0.0019    2.1599     0.0000     76.2100    111.6257   86.0977    10582.1693 0.0094
Player 3           -0.0782    0.0703     -0.0019    1.1637     -0.0000    71.4417    104.2252   80.4326    6209.4030  -0.0415
Player 4           -0.0591    0.1224     -0.0019    3.6823     0.0000     87.9822    115.9794   96.6648    10385.2933 -0.0116
Player 5           -0.0733    0.0920     -0.0020    2.2600     0.0000     89.3097    172.3251   109.5869   12474.0917 -0.1267
Player 6           -0.1652    0.1339     -0.0019    38.9178    0.0002     70.3964    87.3718    75.7320    4990.1912  0.0014
Player 7           -0.1788    0.1317     -0.0019    2.4383     -0.0000    151.1002   186.0275   166.1906   3842.0477  -0.0034
Player 8           -0.3123    0.2554     -0.0019    8.8096     0.0000     70.9839    118.4464   87.8569    9217.1232  0.0287
Player 9           -0.2573    0.2706     -0.0019    1.7986     0.0000     81.3141    131.0043   95.5596    3826.1672  -0.0267
Player 10          -0.2201    0.1938     -0.0019    42.0299    -0.0001    6.1697     136.1389   104.1216   7623.4559  0.0497
Game Events [Player 1]: Raw EMG and Raw EDA (5 second sample following sound onset)

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0366    0.0382     -0.0019    0.0353     -0.0009    50.5842    51.2231    50.9041    0.5804     0.0097
Music              -0.0145    0.0098     -0.0020    0.0182     -0.0012    52.4826    53.8559    52.9507    2.2208     -0.1876
Ship Groan         -0.0450    0.0388     -0.0020    0.0417     0.0036     52.9404    54.2374    53.9851    0.6806     -0.0915
light explosion    -0.0531    0.0426     -0.0019    0.0570     0.0060     56.2744    58.9218    57.6651    2.1224     0.2852
Muffled Scream     -0.0303    0.0413     -0.0019    0.0323     -0.0007    56.2363    59.1736    57.7736    2.0850     0.3859
Man Whimper        -0.0349    0.0441     -0.0018    0.0506     -0.0018    55.9311    57.7698    56.6076    3.2259     0.1800
Pipe Banging       -0.0320    0.0391     -0.0019    0.0454     0.0030     57.2662    58.1360    57.7072    0.7651     0.1312
Breath             -0.0429    0.0365     -0.0019    0.0654     0.0077     57.6706    59.9518    58.9169    2.9419     0.1495
Ship Voice         -0.0386    0.0373     -0.0019    0.0424     0.0017     57.6706    58.7845    58.2172    1.2556     0.1190
Scream             -0.0392    0.0340     -0.0019    0.0611     0.0039     57.2052    61.2717    58.9945    2.2485     0.7946
Animal Scream      -0.0342    0.0289     -0.0020    0.0386     -0.0025    59.4177    61.6608    60.6169    3.9727     -0.1052
Monster Roar       -0.0452    0.0504     -0.0020    0.0578     -0.0013    59.5779    61.0504    60.2048    2.7418     -0.0229
Door Slams         -0.0695    0.0797     -0.0020    0.0695     -0.0073    61.1420    62.5229    61.7185    2.6656     0.0656

Game Events [Player 1]: Differential EMG and Differential EDA (Raw data minus base reading)

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              0.0221     -0.0284    -0.0001    -0.0171    -0.0003    1.8984     2.6328     2.0466     1.6404     -0.1973
Ship Groan         -0.0084    0.0006     -0.0001    0.0064     0.0045     2.3562     3.0143     3.081      0.1002     -0.1012
light explosion    -0.0165    0.0044     0          0.0217     0.0069     5.6902     7.6987     6.761      1.542      0.2755
Muffled Scream     0.0063     0.0031     0          -0.003     0.0002     5.6521     7.9505     6.8695     1.5046     0.3762
Man Whimper        0.0017     0.0059     0.0001     0.0153     -0.0009    5.3469     6.5467     5.7035     2.6455     0.1703
Pipe Banging       0.0046     0.0009     0          0.0101     0.0039     6.682      6.9129     6.8031     0.1847     0.1215
Breath             -0.0063    -0.0017    0          0.0301     0.0086     7.0864     8.7287     8.0128     2.3615     0.1398
Ship Voice         -0.002     -0.0009    0          0.0071     0.0026     7.0864     7.5614     7.3131     0.6752     0.1093
Scream             -0.0026    -0.0042    0          0.0258     0.0048     6.621      10.0486    8.0904     1.6681     0.7849
Animal Scream      0.0024     -0.0093    -0.0001    0.0033     -0.0016    8.8335     10.4377    9.7128     3.3923     -0.1149
Monster Roar       -0.0086    0.0122     -0.0001    0.0225     -0.0004    8.9937     9.8273     9.3007     2.1614     -0.0326
Door Slams         -0.0329    0.0415     -0.0001    0.0342     -0.0064    10.5578    11.2998    10.8144    2.0852     0.0559
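The differential values are described as raw data minus the base reading. As a minimal sketch of that subtraction (the function name `differential` and the four-decimal rounding are illustrative assumptions, not part of the original analysis), the Player 1 Music row for EDA can be recomputed from the raw Base and Music rows listed above:

```python
# Feature order follows the tables: Min, Max, Mean, Area, Slope.
FEATURES = ("Min", "Max", "Mean", "Area", "Slope")

# Player 1 raw EDA rows, copied from the values listed above.
base_eda = (50.5842, 51.2231, 50.9041, 0.5804, 0.0097)
music_eda = (52.4826, 53.8559, 52.9507, 2.2208, -0.1876)


def differential(raw, base, ndigits=4):
    """Subtract the base reading from a raw reading, feature by feature.

    Hypothetical helper: rounding to four decimals is an assumption made
    only to match the precision of the printed tables.
    """
    return tuple(round(r - b, ndigits) for r, b in zip(raw, base))


music_diff = differential(music_eda, base_eda)
for name, value in zip(FEATURES, music_diff):
    print(f"{name}: {value}")
```

Under these assumptions the result agrees with the Differential EDA entries for Music in the Player 1 data (1.8984, 2.6328, 2.0466, 1.6404, -0.1973).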
Game Events [Player 2]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0224    0.0174     -0.0018    0.0158     -0.0005    76.2100    77.1027    76.6353    1.2526     -0.0817
Music              -0.0265    0.0179     -0.0019    0.0336     0.0015     87.7991    90.4007    88.7199    2.5934     -0.3356
Ship Groan         -0.0141    0.0095     -0.0020    0.0126     0.0012     78.8193    80.5588    79.5282    1.6981     -0.1952
light explosion    -0.0274    0.0241     -0.0023    0.0218     -0.0009    80.9250    99.8306    88.7233    8.9310     3.6362
Muffled Scream     -0.0163    0.0123     -0.0016    0.0149     -0.0015    93.0176    103.1647   97.4664    7.6145     1.7663
Man Whimper        -0.0125    0.0100     -0.0019    0.0122     0.0004     91.9724    94.3985    93.2508    1.2360     -0.3783
Pipe Banging       -0.0201    0.0147     -0.0019    0.0123     0.0000     81.1768    84.5337    82.8489    2.9092     0.4439
Breath             -0.0190    0.0203     -0.0019    0.0172     0.0004     80.8029    83.3588    81.7038    3.8640     0.2559
Ship Voice         -0.0193    0.0176     -0.0019    0.0233     0.0023     82.2678    86.0443    84.1516    2.9711     0.6132
Scream             -0.0168    0.0201     -0.0019    0.0124     0.0002     82.8018    87.5473    84.5889    4.7252     0.8114
Animal Scream      -0.0157    0.0124     -0.0020    0.0130     0.0003     88.6612    98.3658    94.6106    9.2617     1.5405
Monster Roar       -0.0367    0.0164     -0.0022    0.0173     0.0013     88.0280    106.0791   95.6387    10.0439    3.5035
Door Slams         -0.0162    0.0152     -0.0022    0.0254     0.0007     101.8600   111.6257   107.7458   8.0343     1.5237

Game Events [Player 2]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0041    0.0005     -0.0001    0.0178     0.002      11.5891    13.298     12.0846    1.3408     -0.2539
Ship Groan         0.0083     -0.0079    -0.0002    -0.0032    0.0017     2.6093     3.4561     2.8929     0.4455     -0.1135
light explosion    -0.005     0.0067     -0.0005    0.006      -0.0004    4.715      22.7279    12.088     7.6784     3.7179
Muffled Scream     0.0061     -0.0051    0.0002     -0.0009    -0.001     16.8076    26.062     20.8311    6.3619     1.848
Man Whimper        0.0099     -0.0074    -0.0001    -0.0036    0.0009     15.7624    17.2958    16.6155    -0.0166    -0.2966
Pipe Banging       0.0023     -0.0027    -0.0001    -0.0035    0.0005     4.9668     7.431      6.2136     1.6566     0.5256
Breath             0.0034     0.0029     -0.0001    0.0014     0.0009     4.5929     6.2561     5.0685     2.6114     0.3376
Ship Voice         0.0031     0.0002     -0.0001    0.0075     0.0028     6.0578     8.9416     7.5163     1.7185     0.6949
Scream             0.0056     0.0027     -0.0001    -0.0034    0.0007     6.5918     10.4446    7.9536     3.4726     0.8931
Animal Scream      0.0067     -0.005     -0.0002    -0.0028    0.0008     12.4512    21.2631    17.9753    8.0091     1.6222
Monster Roar       -0.0143    -0.001     -0.0004    0.0015     0.0018     11.818     28.9764    19.0034    8.7913     3.5852
Door Slams         0.0062     -0.0022    -0.0004    0.0096     0.0012     25.65      34.523     31.1105    6.7817     1.6054
Game Events [Player 3]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0123    0.0074     -0.0019    0.0094     0.0001     71.8613    72.7463    72.3981    0.6572     -0.1357
Music              -0.0107    0.0067     -0.0020    0.0266     -0.0004    86.9217    88.8977    87.5987    3.7967     0.2242
Ship Groan         -0.0131    0.0054     -0.0019    0.0100     -0.0009    77.3468    78.4760    78.0117    0.6915     -0.1617
light explosion    -0.0362    0.0347     -0.0019    0.0140     -0.0006    71.4417    81.5125    74.7715    9.8683     1.8850
Muffled Scream     -0.0515    0.0526     -0.0019    0.0167     -0.0006    77.5681    81.6803    78.4347    6.9274     0.7443
Man Whimper        -0.0161    0.0096     -0.0019    0.0097     0.0004     80.6503    83.2443    82.3562    3.2633     0.2928
Pipe Banging       -0.0139    0.0117     -0.0020    0.0079     -0.0005    77.9724    79.0405    78.6306    1.0736     -0.0381
Breath             0.0195     0.0157     -0.0019    0.0236     0.0007     76.9959    77.6291    77.3368    0.5336     -0.0473
Ship Voice         -0.0666    0.0703     -0.0018    0.0157     -0.0002    76.0269    79.2313    77.1938    3.3532     0.3920
Scream             -0.0127    0.0069     -0.0019    0.0069     -0.0000    73.2574    75.5692    73.9776    2.0789     0.2593
Animal Scream      -0.0478    0.0486     -0.0018    0.0262     -0.0001    74.8062    77.8427    75.7080    4.2708     0.5155
Monster Roar       -0.0285    0.0281     -0.0019    0.0106     -0.0003    75.2792    78.5294    76.4297    2.9391     0.5689
Door Slams         -0.0219    0.0112     -0.0019    0.0105     0.0003     76.4008    77.6062    77.1548    1.4275     -0.0793

Game Events [Player 3]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              0.0016     -0.0007    -0.0001    0.0172     -0.0005    15.0604    16.1514    15.2006    3.1395     0.3599
Ship Groan         -0.0008    -0.002     0          0.0006     -0.001     5.4855     5.7297     5.6136     0.0343     -0.026
light explosion    -0.0239    0.0273     0          0.0046     -0.0007    -0.4196    8.7662     2.3734     9.2111     2.0207
Muffled Scream     -0.0392    0.0452     0          0.0073     -0.0007    5.7068     8.934      6.0366     6.2702     0.88
Man Whimper        -0.0038    0.0022     0          0.0003     0.0003     8.789      10.498     9.9581     2.6061     0.4285
Pipe Banging       -0.0016    0.0043     -0.0001    -0.0015    -0.0006    6.1111     6.2942     6.2325     0.4164     0.0976
Breath             0.0318     0.0083     0          0.0142     0.0006     5.1346     4.8828     4.9387     -0.1236    0.0884
Ship Voice         -0.0543    0.0629     0.0001     0.0063     -0.0003    4.1656     6.485      4.7957     2.696      0.5277
Scream             -0.0004    -0.0005    0          -0.0025    -0.0001    1.3961     2.8229     1.5795     1.4217     0.395
Animal Scream      -0.0355    0.0412     0.0001     0.0168     -0.0002    2.9449     5.0964     3.3099     3.6136     0.6512
Monster Roar       -0.0162    0.0207     0          0.0012     -0.0004    3.4179     5.7831     4.0316     2.2819     0.7046
Door Slams         -0.0096    0.0038     0          0.0011     0.0002     4.5395     4.8599     4.7567     0.7703     0.0564
Game Events [Player 4]: EMG (5 seconds) and EDA (5 seconds)

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0311    0.0200     -0.0019    0.0374     0.0012     87.9822    88.6230    88.2087    0.8693     -0.0756
Music              -0.0201    0.0130     -0.0020    0.0108     0.0001     88.2339    89.2410    88.6671    0.4459     -0.0482
Ship Groan         -0.0229    0.0159     -0.0019    0.0345     -0.0016    97.7707    99.0448    98.2892    0.7698     -0.0961
light explosion    -0.0222    0.0192     -0.0020    0.0246     -0.0018    96.2753    97.2824    96.7542    0.5601     0.1388
Muffled Scream     -0.0232    0.0166     -0.0019    0.0236     0.0014     97.7707    98.4497    98.3555    0.4747     -0.0610
Man Whimper        -0.0347    0.0235     -0.0019    0.0303     -0.0020    94.2841    95.2835    94.8273    0.8876     -0.1998
Pipe Banging       -0.0317    0.0266     -0.0020    0.0315     0.0004     94.5663    95.3674    94.8035    0.9285     -0.1525
Breath             -0.0253    0.0319     -0.0019    0.0410     0.0017     93.9255    94.9860    94.4250    0.7892     -0.0839
Ship Voice         -0.0273    0.0222     -0.0019    0.0464     0.0025     100.4791   103.8208   101.6025   3.8544     0.6161
Scream             -0.0539    0.0597     -0.0018    0.0390     -0.0005    96.8323    97.6410    97.1980    1.5182     0.0564
Animal Scream      -0.0637    0.1238     -0.0019    0.0314     -0.0022    99.7696    106.2164   104.6560   11.7090    1.0188
Monster Roar       -0.0250    0.0197     -0.0019    0.0215     0.0011     103.9505   110.2600   108.7247   11.2863    0.9684
Door Slams         -0.0366    0.0224     -0.0019    0.0334     -0.0008    106.4835   107.4142   106.8779   0.6411     0.0031

Game Events [Player 4]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              0.011      -0.007     -0.0001    -0.0266    -0.0011    0.2517     0.618      0.4584     -0.4234    0.0274
Ship Groan         0.0082     -0.0041    0          -0.0029    -0.0028    9.7885     10.4218    10.0805    -0.0995    -0.0205
light explosion    0.0089     -0.0008    -0.0001    -0.0128    -0.003     8.2931     8.6594     8.5455     -0.3092    0.2144
Muffled Scream     0.0079     -0.0034    0          -0.0138    0.0002     9.7885     9.8267     10.1468    -0.3946    0.0146
Man Whimper        -0.0036    0.0035     0          -0.0071    -0.0032    6.3019     6.6605     6.6186     0.0183     -0.1242
Pipe Banging       -0.0006    0.0066     -0.0001    -0.0059    -0.0008    6.5841     6.7444     6.5948     0.0592     -0.0769
Breath             0.0058     0.0119     0          0.0036     0.0005     5.9433     6.363      6.2163     -0.0801    -0.0083
Ship Voice         0.0038     0.0022     0          0.009      0.0013     12.4969    15.1978    13.3938    2.9851     0.6917
Scream             -0.0228    0.0397     0.0001     0.0016     -0.0017    8.8501     9.018      8.9893     0.6489     0.132
Animal Scream      -0.0326    0.1038     0          -0.006     -0.0034    11.7874    17.5934    16.4473    10.8397    1.0944
Monster Roar       0.0061     -0.0003    0          -0.0159    -0.0001    15.9683    21.637     20.516     10.417     1.044
Door Slams         -0.0055    0.0024     0          -0.004     -0.002     18.5013    18.7912    18.6692    -0.2282    0.0787
Game Events [Player 5]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0326    0.0208     -0.0017    0.0090     0.0000     89.3097    90.2634    89.6986    1.7971     -0.0529
Music              -0.0245    0.0101     -0.0020    0.0128     0.0001     141.1285   149.7879   145.3598   0.9853     -1.7312
Ship Groan         -0.0180    0.0125     -0.0020    0.0203     -0.0017    109.7565   112.2284   111.1564   0.7380     -0.4683
light explosion    -0.0217    0.0211     -0.0020    0.0225     -0.0010    103.8513   105.2933   104.4535   3.0377     0.0930
Muffled Scream     -0.0226    0.0190     -0.0020    0.0410     -0.0006    99.2279    100.0366   99.7385    1.1454     -0.0915
Man Whimper        -0.0310    0.0240     -0.0019    0.0281     -0.0012    100.8911   102.2949   101.6353   0.9453     -0.1922
Pipe Banging       -0.0323    0.0243     -0.0019    0.0351     0.0011     100.8301   102.1957   101.4239   1.3128     -0.1693
Breath             -0.0410    0.0429     -0.0020    0.0439     -0.0024    97.5494    98.8083    98.1875    0.6098     -0.2318
Ship Voice         -0.0455    0.0370     -0.0021    0.0363     0.0009     96.3593    97.4731    96.9964    1.2046     0.0320
Scream             -0.0439    0.0389     -0.0019    0.0740     0.0005     93.8644    94.9631    94.3241    0.8360     -0.0305
Animal Scream      -0.0470    0.0327     -0.0020    0.0493     0.0017     94.1772    95.6573    94.8259    0.8636     -0.0793
Monster Roar       -0.0403    0.0436     -0.0019    0.0533     -0.0012    99.7696    101.4175   100.3142   1.6661     0.2471
Door Slams         -0.0254    0.0215     -0.0017    0.0474     -0.0003    121.2387   126.1139   123.5397   0.9860     -0.9746

Game Events [Player 5]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0571    -0.0107    -0.0003    0.0038     0.0001     51.8188    59.5245    55.6612    -0.8118    -1.6783
Ship Groan         -0.0506    -0.0083    -0.0003    0.0113     -0.0017    20.4468    21.965     21.4578    -1.0591    -0.4154
light explosion    -0.0543    0.0003     -0.0003    0.0135     -0.001     14.5416    15.0299    14.7549    1.2406     0.1459
Muffled Scream     -0.0552    -0.0018    -0.0003    0.032      -0.0006    9.9182     9.7732     10.0399    -0.6517    -0.0386
Man Whimper        -0.0636    0.0032     -0.0002    0.0191     -0.0012    11.5814    12.0315    11.9367    -0.8518    -0.1393
Pipe Banging       -0.0649    0.0035     -0.0002    0.0261     0.0011     11.5204    11.9323    11.7253    -0.4843    -0.1164
Breath             -0.0736    0.0221     -0.0003    0.0349     -0.0024    8.2397     8.5449     8.4889     -1.1873    -0.1789
Ship Voice         -0.0781    0.0162     -0.0004    0.0273     0.0009     7.0496     7.2097     7.2978     -0.5925    0.0849
Scream             -0.0765    0.0181     -0.0002    0.065      0.0005     4.5547     4.6997     4.6255     -0.9611    0.0224
Animal Scream      -0.0796    0.0119     -0.0003    0.0403     0.0017     4.8675     5.3939     5.1273     -0.9335    -0.0264
Monster Roar       -0.0729    0.0228     -0.0002    0.0443     -0.0012    10.4599    11.1541    10.6156    -0.131     0.3
Door Slams         -0.058     0.0007     0          0.0384     -0.0003    31.929     35.8505    33.8411    -0.8111    -0.9217
Game Events [Player 6]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0189    0.0214     -0.0020    0.0196     -0.0003    70.3964    71.9833    71.4716    0.7335     -0.0371
Music              -0.0237    0.0172     -0.0020    0.0170     0.0001     75.4547    76.8509    76.2118    0.7173     -0.2744
Ship Groan         -0.0178    0.0172     -0.0019    0.0166     0.0009     72.6929    73.3414    73.0249    0.4000     0.0030
light explosion    -0.0427    0.0435     -0.0018    0.0166     -0.0007    73.6160    78.6667    75.3225    4.9132     0.8963
Muffled Scream     -0.0139    0.0117     -0.0019    0.0115     -0.0005    79.4830    80.6122    79.7991    1.3698     -0.1921
Man Whimper        -0.0155    0.0151     -0.0019    0.0141     0.0005     77.5299    77.9953    77.9112    0.6303     -0.0793
Pipe Banging       -0.0164    0.0133     -0.0019    0.0156     -0.0016    76.2405    76.4465    76.3076    0.2577     0.0320
Breath             -0.0207    0.0144     -0.0019    0.0174     0.0003     73.8144    75.7370    75.1703    0.6398     0.0046
Ship Voice         -0.0253    0.0183     -0.0019    0.0228     0.0007     74.4171    75.4852    75.0807    0.5971     0.1433
Scream             -0.0182    0.0114     -0.0019    0.0133     -0.0001    72.7310    73.4177    73.0817    0.6270     0.0244
Animal Scream      -0.0214    0.0125     -0.0020    0.0182     0.0019     72.8531    74.0814    73.3754    1.2888     0.1113
Monster Roar       -0.0298    0.0254     -0.0018    0.0222     0.0009     70.3975    76.9730    73.1911    5.6054     0.9481
Door Slams         -0.0265    0.0144     -0.0019    0.0161     -0.0003    83.3664    85.0449    84.2279    0.8060     -0.2226

Game Events [Player 6]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0048    -0.0042    0          -0.0026    0.0004     5.0583     4.8676     4.7402     -0.0162    -0.2373
Ship Groan         0.0011     -0.0042    0.0001     -0.003     0.0012     2.2965     1.3581     1.5533     -0.3335    0.0401
light explosion    -0.0238    0.0221     0.0002     -0.003     -0.0004    3.2196     6.6834     3.8509     4.1797     0.9334
Muffled Scream     0.005      -0.0097    0.0001     -0.0081    -0.0002    9.0866     8.6289     8.3275     0.6363     -0.155
Man Whimper        0.0034     -0.0063    0.0001     -0.0055    0.0008     7.1335     6.012      6.4396     -0.1032    -0.0422
Pipe Banging       0.0025     -0.0081    0.0001     -0.004     -0.0013    5.8441     4.4632     4.836      -0.4758    0.0691
Breath             -0.0018    -0.007     0.0001     -0.0022    0.0006     3.418      3.7537     3.6987     -0.0937    0.0417
Ship Voice         -0.0064    -0.0031    0.0001     0.0032     0.001      4.0207     3.5019     3.6091     -0.1364    0.1804
Scream             0.0007     -0.01      0.0001     -0.0063    0.0002     2.3346     1.4344     1.6101     -0.1065    0.0615
Animal Scream      -0.0025    -0.0089    0          -0.0014    0.0022     2.4567     2.0981     1.9038     0.5553     0.1484
Monster Roar       -0.0109    0.004      0.0002     0.0026     0.0012     0.0011     4.9897     1.7195     4.8719     0.9852
Door Slams         -0.0076    -0.007     0.0001     -0.0035    0          12.97      13.0616    12.7563    0.0725     -0.1855
Game Events [Player 7]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0121    0.0070     -0.0016    0.0128     -0.0007    156.7383   157.7606   157.2321   1.7053     -0.0587
Music              -0.0219    0.0210     -0.0019    0.0116     -0.0003    173.7518   182.1747   178.2915   2.1824     -1.5091
Ship Groan         -0.0265    0.0238     -0.0020    0.0388     -0.0015    159.2789   160.3012   159.7822   0.4737     -0.1556
light explosion    -0.0250    0.0189     -0.0019    0.0300     0.0035     159.0576   161.4380   160.2746   2.0963     0.2196
Muffled Scream     -0.0225    0.0188     -0.0020    0.0165     -0.0010    156.7764   163.8260   159.9262   5.6077     1.2294
Man Whimper        -0.0332    0.0257     -0.0020    0.0291     -0.0007    163.1775   168.3197   165.6311   4.9348     0.6650
Pipe Banging       -0.0365    0.0267     -0.0019    0.0357     -0.0041    163.1317   164.3143   163.7133   0.6453     -0.1724
Breath             -0.0330    0.0328     -0.0019    0.0251     0.0007     160.3775   166.4658   162.9351   4.6649     1.1058
Ship Voice         -0.0362    0.0259     -0.0018    0.0454     -0.0011    163.5513   164.7110   164.0633   0.5381     -0.2059
Scream             -0.0383    0.0336     -0.0020    0.0364     -0.0027    162.5137   167.0380   164.6206   3.7726     0.7215
Animal Scream      -0.0269    0.0341     -0.0019    0.0224     -0.0007    164.7034   167.0380   165.7209   2.8791     0.1205
Monster Roar       -0.0287    0.0191     -0.0020    0.0243     0.0013     160.2631   167.7628   163.4242   5.4854     1.4673
Door Slams         -0.0235    0.0246     -0.0019    0.0254     0.0007     171.5164   173.5153   172.5912   0.5099     -0.3447

Game Events [Player 7]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0098    0.014      -0.0003    -0.0012    0.0004     17.0135    24.4141    21.0594    0.4771     -1.4504
Ship Groan         -0.0144    0.0168     -0.0004    0.026      -0.0008    2.5406     2.5406     2.5501     -1.2316    -0.0969
light explosion    -0.0129    0.0119     -0.0003    0.0172     0.0042     2.3193     3.6774     3.0425     0.391      0.2783
Muffled Scream     -0.0104    0.0118     -0.0004    0.0037     -0.0003    0.0381     6.0654     2.6941     3.9024     1.2881
Man Whimper        -0.0211    0.0187     -0.0004    0.0163     0          6.4392     10.5591    8.399      3.2295     0.7237
Pipe Banging       -0.0244    0.0197     -0.0003    0.0229     -0.0034    6.3934     6.5537     6.4812     -1.06      -0.1137
Breath             -0.0209    0.0258     -0.0003    0.0123     0.0014     3.6392     8.7052     5.703      2.9596     1.1645
Ship Voice         -0.0241    0.0189     -0.0002    0.0326     -0.0004    6.813      6.9504     6.8312     -1.1672    -0.1472
Scream             -0.0262    0.0266     -0.0004    0.0236     -0.002     5.7754     9.2774     7.3885     2.0673     0.7802
Animal Scream      -0.0148    0.0271     -0.0003    0.0096     0          7.9651     9.2774     8.4888     1.1738     0.1792
Monster Roar       -0.0166    0.0121     -0.0004    0.0115     0.002      3.5248     10.0022    6.1921     3.7801     1.526
Door Slams         -0.0114    0.0176     -0.0003    0.0126     0.0014     14.7781    15.7547    15.3591    -1.1954    -0.286
Game Events [Player 8]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0167    0.0138     -0.0019    0.0203     -0.0007    70.9839    71.7087    71.2348    1.5503     -0.0696
Music              -0.0186    0.0113     -0.0019    0.0120     -0.0003    87.3261    95.2988    89.8498    17.0222    0.8191
Ship Groan         -0.0088    0.0074     -0.0019    0.0092     -0.0006    87.7304    90.2786    88.9035    0.9116     -0.4942
light explosion    -0.0121    0.0079     -0.0018    0.0109     0.0009     90.1184    97.3358    94.3570    4.6298     1.3702
Muffled Scream     -0.0094    0.0069     -0.0019    0.0125     0.0001     93.6203    96.5729    94.9007    3.6814     0.1571
Man Whimper        -0.0129    0.0061     -0.0019    0.0240     0.0002     86.8988    87.9593    87.4219    0.6638     -0.1739
Pipe Banging       -0.0290    0.0095     -0.0019    0.0202     -0.0014    77.5604    78.2928    77.9985    0.6499     -0.0717
Breath             -0.0135    0.0125     -0.0019    0.0175     0.0000     75.5234    76.4771    76.0337    1.4902     0.0214
Ship Voice         -0.0150    0.0101     -0.0019    0.0226     0.0008     75.0427    84.7626    78.0504    10.0379    1.8860
Scream             -0.0139    0.0142     -0.0019    0.0239     -0.0025    78.5980    85.0678    80.6701    7.3231     1.1256
Animal Scream      -0.0180    0.0170     -0.0019    0.0145     0.0002     82.9468    91.9800    86.2465    10.6474    1.3498
Monster Roar       -0.0196    0.0184     -0.0018    0.0178     -0.0003    70.9845    96.9238    81.2454    21.5723    5.1829
Door Slams         -0.0176    0.0110     -0.0019    0.0155     0.0004     109.0698   111.9461   109.8741   4.2236     -0.4909

Game Events [Player 8]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0019    -0.0025    0          -0.0083    0.0004     16.3422    23.5901    18.615     15.4719    0.8887
Ship Groan         0.0079     -0.0064    0          -0.0111    0.0001     16.7465    18.5699    17.6687    -0.6387    -0.4246
light explosion    0.0046     -0.0059    0.0001     -0.0094    0.0016     19.1345    25.6271    23.1222    3.0795     1.4398
Muffled Scream     0.0073     -0.0069    0          -0.0078    0.0008     22.6364    24.8642    23.6659    2.1311     0.2267
Man Whimper        0.0038     -0.0077    0          0.0037     0.0009     15.9149    16.2506    16.1871    -0.8865    -0.1043
Pipe Banging       -0.0123    -0.0043    0          -0.0001    -0.0007    6.5765     6.5841     6.7637     -0.9004    -0.0021
Breath             0.0032     -0.0013    0          -0.0028    0.0007     4.5395     4.7684     4.7989     -0.0601    0.091
Ship Voice         0.0017     -0.0037    0          0.0023     0.0015     4.0588     13.0539    6.8156     8.4876     1.9556
Scream             0.0028     0.0004     0          0.0036     -0.0018    7.6141     13.3591    9.4353     5.7728     1.1952
Animal Scream      -0.0013    0.0032     0          -0.0058    0.0009     11.9629    20.2713    15.0117    9.0971     1.4194
Monster Roar       -0.0029    0.0046     0.0001     -0.0025    0.0004     0.0006     25.2151    10.0106    20.022     5.2525
Door Slams         -0.0009    -0.0028    0          -0.0048    0.0011     38.0859    40.2374    38.6393    2.6733     -0.4213
Game Events [Player 9]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0133    0.0103     -0.0018    0.0127     0.0006     81.3141    81.9016    81.5234    1.1290     -0.0015
Music              -0.0594    0.0560     -0.0020    0.0208     0.0005     122.2382   131.0043   126.2175   13.8884    0.0702
Ship Groan         -0.0835    0.0720     -0.0019    0.0340     0.0012     93.5898    95.4590    94.5100    1.0352     -0.3458
light explosion    -0.0150    0.0118     -0.0019    0.0136     0.0009     83.5342    85.0143    84.2018    1.6124     -0.0503
Muffled Scream     -0.0125    0.0084     -0.0019    0.0107     -0.0004    81.3144    82.0160    81.6223    0.7460     -0.0351
Man Whimper        -0.0146    0.0130     -0.0019    0.0195     -0.0013    83.1985    84.5490    83.7858    1.8435     -0.1434
Pipe Banging       -0.0939    0.0886     -0.0018    0.0382     -0.0003    102.6611   106.9717   104.4599   2.4823     -0.8145
Breath             -0.0153    0.0121     -0.0019    0.0177     0.0021     98.3810    100.1358   99.2540    0.7428     -0.3160
Ship Voice         -0.0170    0.0145     -0.0019    0.0113     -0.0001    96.2448    97.2748    96.6125    0.7688     0.0213
Scream             -0.0161    0.0089     -0.0019    0.0163     -0.0013    86.7844    88.4476    87.4236    1.0698     -0.2569
Animal Scream      -0.0155    0.0130     -0.0019    0.0140     0.0011     86.8835    90.8585    89.4672    5.1539     0.7102
Monster Roar       -0.0122    0.0102     -0.0019    0.0100     0.0007     85.0449    86.6013    85.7147    1.6582     0.0504
Door Slams         -0.0179    0.0122     -0.0019    0.0131     -0.0010    92.0105    93.8187    92.9309    1.0532     -0.1939

Game Events [Player 9]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0461    0.0457     -0.0002    0.0081     -0.0001    40.9241    49.1027    44.6941    12.7594    0.0717
Ship Groan         -0.0702    0.0617     -0.0001    0.0213     0.0006     12.2757    13.5574    12.9866    -0.0938    -0.3443
light explosion    -0.0017    0.0015     -0.0001    0.0009     0.0003     2.2201     3.1127     2.6784     0.4834     -0.0488
Muffled Scream     0.0008     -0.0019    -0.0001    -0.002     -0.001     0.0003     0.1144     0.0989     -0.383     -0.0336
Man Whimper        -0.0013    0.0027     -0.0001    0.0068     -0.0019    1.8844     2.6474     2.2624     0.7145     -0.1419
Pipe Banging       -0.0806    0.0783     0          0.0255     -0.0009    21.347     25.0701    22.9365    1.3533     -0.813
Breath             -0.002     0.0018     -0.0001    0.005      0.0015     17.0669    18.2342    17.7306    -0.3862    -0.3145
Ship Voice         -0.0037    0.0042     -0.0001    -0.0014    -0.0007    14.9307    15.3732    15.0891    -0.3602    0.0228
Scream             -0.0028    -0.0014    -0.0001    0.0036     -0.0019    5.4703     6.546      5.9002     -0.0592    -0.2554
Animal Scream      -0.0022    0.0027     -0.0001    0.0013     0.0005     5.5694     8.9569     7.9438     4.0249     0.7117
Monster Roar       0.0011     -0.0001    -0.0001    -0.0027    0.0001     3.7308     4.6997     4.1913     0.5292     0.0519
Door Slams         -0.0046    0.0019     -0.0001    0.0004     -0.0016    10.6964    11.9171    11.4075    -0.0758    -0.1924
Game Events [Player 10]: Raw EMG and Raw EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.0265    0.0290     -0.0013    0.0270     -0.0006    70.5032    74.2340    71.9034    6.9073     -0.3848
Music              -0.0332    0.0176     -0.0019    0.0190     0.0003     84.0683    87.5015    86.2899    4.3308     0.5587
Ship Groan         -0.0340    0.0288     -0.0020    0.0339     0.0014     98.1140    100.5859   99.4435    4.9044     0.1250
light explosion    -0.0490    0.0405     -0.0019    0.0328     0.0017     87.4939    124.6490   119.4763   18.4357    1.1709
Muffled Scream     -0.0708    0.0682     -0.0019    0.0289     -0.0019    113.7848   120.2011   118.4553   5.6154     0.7327
Man Whimper        -0.0336    0.0286     -0.0019    0.0312     0.0004     110.7407   112.6480   111.4705   3.8608     -0.0061
Pipe Banging       -0.0410    0.0337     -0.0019    0.0387     0.0008     103.7674   116.5695   112.9415   8.4552     0.6312
Breath             -0.0414    0.0390     -0.0019    0.0439     -0.0022    80.8029    121.9788   113.5548   29.4184    0.9262
Ship Voice         -0.0968    0.0685     -0.0020    0.0456     0.0019     98.2819    126.5030   116.3370   35.5495    1.2227
Scream             -0.0438    0.0611     -0.0019    0.0484     -0.0017    117.5156   126.7014   120.7566   7.1936     1.8357
Animal Scream      -0.0474    0.0365     -0.0019    0.0420     0.0003     124.9084   129.4174   126.9136   6.1829     0.5251
Monster Roar       -0.1329    0.0893     -0.0020    0.0765     0.0067     83.4503    134.4223   119.7767   46.4461    7.7467
Door Slams         -0.0505    0.0518     -0.0019    0.0541     0.0031     123.2834   126.4648   125.5346   3.1489     -0.0991

Game Events [Player 10]: Differential EMG and Differential EDA

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Music              -0.0067    -0.0114    -0.0006    -0.008     0.0009     13.5651    13.2675    14.3865    -2.5765    0.9435
Ship Groan         -0.0075    -0.0002    -0.0007    0.0069     0.002      27.6108    26.3519    27.5401    -2.0029    0.5098
light explosion    -0.0225    0.0115     -0.0006    0.0058     0.0023     16.9907    50.415     47.5729    11.5284    1.5557
Muffled Scream     -0.0443    0.0392     -0.0006    0.0019     -0.0013    43.2816    45.9671    46.5519    -1.2919    1.1175
Man Whimper        -0.0071    -0.0004    -0.0006    0.0042     0.001      40.2375    38.414     39.5671    -3.0465    0.3787
Pipe Banging       -0.0145    0.0047     -0.0006    0.0117     0.0014     33.2642    42.3355    41.0381    1.5479     1.016
Breath             -0.0149    0.01       -0.0006    0.0169     -0.0016    10.2997    47.7448    41.6514    22.5111    1.311
Ship Voice         -0.0703    0.0395     -0.0007    0.0186     0.0025     27.7787    52.269     44.4336    28.6422    1.6075
Scream             -0.0173    0.0321     -0.0006    0.0214     -0.0011    47.0124    52.4674    48.8532    0.2863     2.2205
Animal Scream      -0.0209    0.0075     -0.0006    0.015      0.0009     54.4052    55.1834    55.0102    -0.7244    0.9099
Monster Roar       -0.1064    0.0603     -0.0007    0.0495     0.0073     12.9471    60.1883    47.8733    39.5388    8.1315
Door Slams         -0.024     0.0228     -0.0006    0.0271     0.0037     52.7802    52.2308    53.6312    -3.7584    0.2857
1.2 Average (Mean) Statistics: Raw Data Set

Group A Mean

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.02138   0.01674    -0.00178   0.01584    -0.00018   89.96152   90.779     90.35126   1.1738     -0.04782
Music              -0.0262    0.02072    -0.00198   0.018      -0.00026   115.30456  121.1441   118.08364  4.61472    -0.6267
Ship Groan         -0.03722   0.0305     -0.00196   0.02896    0.00014    98.58248   100.1404   99.48908   0.7238     -0.24458
light explosion    -0.0302    0.02576    -0.00192   0.02742    0.00176    94.83184   98.43598   96.2733    3.74742    0.4865
Muffled Scream     -0.02788   0.02802    -0.00194   0.02344    -0.00066   94.22456   97.3465    95.49906   3.3023     0.4466
Man Whimper        -0.02596   0.02328    -0.0019    0.0274     -0.00092   96.7697    99.23554   98.0032    2.84256    0.16044
Pipe Banging       -0.04172   0.03808    -0.0019    0.03246    -0.00016   100.3723   102.13164  101.18698  1.25582    -0.21262
Breath             -0.02254   0.028      -0.00192   0.03514    0.00176    98.19488   100.59816  99.32606   1.8986     0.13204
Ship Voice         -0.04078   0.037      -0.0019    0.03022    0.00024    97.97058   99.49494   98.61664   1.42406    0.07168
Scream             -0.03004   0.02446    -0.00192   0.03894    0.00008    94.72502   97.45792   95.86808   2.00116    0.2976
Animal Scream      -0.03428   0.03146    -0.00192   0.0301     -0.0001    95.9976    98.61146   97.26778   3.42802    0.23234
Monster Roar       -0.03098   0.03028    -0.00194   0.0312     -0.00016   95.98694   99.07228   97.21752   2.89812    0.46216
Door Slams         -0.03164   0.02984    -0.00188   0.03318    -0.00152   104.46168  106.7154   105.58702  1.32844    -0.30538

Group B Mean

Game Event         EMG Min    EMG Max    EMG Mean   EMG Area   EMG Slope  EDA Min    EDA Max    EDA Mean   EDA Area   EDA Slope
Base               -0.02312   0.02032    -0.00178   0.02402    -0.00018   75.21514   76.73034   75.89076   2.2626     -0.12976
Music              -0.02442   0.0154     -0.00194   0.01848    0.00034    84.57642   87.85858   85.9477    5.02192    0.14392
Ship Groan         -0.01952   0.01576    -0.00194   0.02136    0.00026    87.02546   88.7619    87.83786   1.73678    -0.1315
light explosion    -0.03068   0.02704    -0.00196   0.02134    -0.00016   85.68572   99.5529    94.92666   7.49396    1.44248
Muffled Scream     -0.02672   0.02314    -0.00184   0.01828    -0.00048   95.53528   99.80012   97.7954    3.75116    0.4806
Man Whimper        -0.02184   0.01666    -0.0019    0.02236    -0.0001    92.28518   93.65692   92.97634   1.4557     -0.16748
Pipe Banging       -0.02764   0.01956    -0.00192   0.02366    -0.00036   86.66228   90.24198   88.98      2.6401     0.17658
Breath             -0.02398   0.02362    -0.0019    0.0274     0.00004    80.97382   90.50754   88.17752   7.24032    0.22484
Ship Voice         -0.03674   0.0273     -0.00192   0.03214    0.00164    86.09772   95.32318   91.04444   10.602     0.89626
Scream             -0.02932   0.0333     -0.00188   0.0274     -0.00092   89.69574   94.07504   91.25906   4.27742    0.7707
Animal Scream      -0.03324   0.04044    -0.00194   0.02382    0.0001     93.82782   100.0122   97.16042   7.81796    0.9091
Monster Roar       -0.0488    0.03384    -0.00194   0.03106    0.00194    83.36182   104.93164  95.71532   18.9908    3.66992
Door Slams         -0.02948   0.02296    -0.00196   0.0289     0.00062    104.81262  108.49914  106.85206  3.37078    0.14284
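The tables do not state which players make up each group. As a hedged sketch, the Base-row means above are reproduced when Group A is taken to be Players 1, 3, 5, 7 and 9 and Group B to be Players 2, 4, 6, 8 and 10. This membership is an inference from the arithmetic only, and the names `GROUP_A`, `GROUP_B` and `group_mean` are hypothetical:

```python
from statistics import mean

# Base-row EMG Min and EDA Min per player, copied from the per-player raw data.
base_emg_min = {1: -0.0366, 2: -0.0224, 3: -0.0123, 4: -0.0311, 5: -0.0326,
                6: -0.0189, 7: -0.0121, 8: -0.0167, 9: -0.0133, 10: -0.0265}
base_eda_min = {1: 50.5842, 2: 76.2100, 3: 71.8613, 4: 87.9822, 5: 89.3097,
                6: 70.3964, 7: 156.7383, 8: 70.9839, 9: 81.3141, 10: 70.5032}

# Assumption: group membership is inferred from the figures, not stated in the source.
GROUP_A = (1, 3, 5, 7, 9)
GROUP_B = (2, 4, 6, 8, 10)


def group_mean(values, players, ndigits=5):
    """Arithmetic mean of one feature over the listed players."""
    return round(mean(values[p] for p in players), ndigits)


print("Group A Base EMG Min:", group_mean(base_emg_min, GROUP_A))
print("Group A Base EDA Min:", group_mean(base_eda_min, GROUP_A))
print("Group B Base EMG Min:", group_mean(base_emg_min, GROUP_B))
print("Group B Base EDA Min:", group_mean(base_eda_min, GROUP_B))
```

With this split the four values agree with the Base rows of the Group A and Group B tables (-0.02138, 89.96152, -0.02312 and 75.21514).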
1.3 Average (Mean) Statistics: Differential Data Set
Group A Mean
                 EMG                                                 EDA
                 Min       Max       Mean      Area      Slope       Min        Max        Mean       Area      Slope
Music            -0.01786  0.00398   -0.0002   0.00216   -0.00008    25.34304   30.3651    27.73238   3.44092   -0.57888
Ship Groan       -0.02888  0.01376   -0.00018  0.01312   0.00032     8.62096    9.3614     9.13782    -0.45     -0.19676
Light Explosion  -0.02186  0.00902   -0.00014  0.01158   0.00194     4.87032    7.65698    5.92204    2.57362   0.53432
Muffled Scream   -0.01954  0.01128   -0.00016  0.0076    -0.00048    4.26304    6.5675     5.1478     2.1285    0.49442
Man Whimper      -0.01762  0.00654   -0.00012  0.01156   -0.00074    6.80818    8.45654    7.65194    1.66876   0.20826
Pipe Banging     -0.03338  0.02134   -0.00012  0.01662   0.00002     10.41078   11.35264   10.83572   0.08202   -0.1648
Breath           -0.0142   0.01126   -0.00014  0.0193    0.00194     8.23336    9.81916    8.9748     0.7248    0.17986
Ship Voice       -0.03244  0.02026   -0.00012  0.01438   0.00042     8.00906    8.71594    8.26538    0.25026   0.1195
Scream           -0.0217   0.00772   -0.00014  0.0231    0.00026     4.7635     6.67892    5.51682    0.82736   0.34542
Animal Scream    -0.02594  0.01472   -0.00014  0.01426   0.00008     6.03608    7.83246    6.91652    2.25422   0.28016
Monster Roar     -0.02264  0.01354   -0.00016  0.01536   0.00002     6.02542    8.29328    6.86626    1.72432   0.50998
Door Slams       -0.0233   0.0131    -0.0001   0.01734   -0.00134    14.50016   15.9364    15.23576   0.15464   -0.25756

Group B
                 EMG                                                 EDA
                 Min       Max       Mean      Area      Slope       Min        Max        Mean       Area      Slope
Music            -0.0013   -0.00492  -0.00016  -0.00554  0.00052     9.36128    11.12824   10.05694   2.75932   0.27368
Ship Groan       0.0036    -0.00456  -0.00016  -0.00266  0.00044     11.81032   12.03156   11.9471    -0.52582  -0.00174
Light Explosion  -0.00756  0.00672   -0.00018  -0.00268  0.00002     10.47058   22.82256   19.0359    5.23136   1.57224
Muffled Scream   -0.0036   0.00282   -0.00006  -0.00574  -0.0003     20.32014   23.06978   21.90464   1.48856   0.61036
Man Whimper      0.00128   -0.00366  -0.00012  -0.00166  0.00008     17.07004   16.92658   17.08558   -0.8069   -0.03772
Pipe Banging     -0.00452  -0.00076  -0.00014  -0.00036  -0.00018    11.44714   13.51164   13.08924   0.3775    0.30634
Breath           -0.00086  0.0033    -0.00012  0.00338   0.00022     5.75868    13.7772    12.28676   4.97772   0.3546
Ship Voice       -0.01362  0.00702   -0.00014  0.00812   0.00182     10.88258   18.59284   15.15368   8.3394    1.02602
Scream           -0.0062   0.01298   -0.0001   0.00338   -0.00074    14.4806    17.3447    15.3683    2.01482   0.90046
Animal Scream    -0.01012  0.02012   -0.00016  -0.0002   0.00028     18.61268   23.28186   21.26966   5.55536   1.03886
Monster Roar     -0.02568  0.01352   -0.00016  0.00704   0.00212     8.14702    28.2013    19.82456   16.7282   3.79968
Door Slams       -0.00636  0.00264   -0.00018  0.00488   0.0008      29.59748   31.7688    30.9613    1.10818   0.2726
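The differential values in 1.3 appear to be each stimulus mean minus the corresponding Base mean from the raw data set; for example, the Group B Music EDA Min of 9.36128 equals 84.57642 - 75.21514. The sketch below reproduces that subtraction in Python. The function name `differential` and the dictionary layout are illustrative assumptions and not part of the original analysis; only the numbers are taken from the Group B tables.

```python
# Illustrative sketch: the helper name and data layout are assumptions.
EDA_FIELDS = ("min", "max", "mean", "area", "slope")

# Group B raw EDA means, copied from the raw data set tables.
base = dict(zip(EDA_FIELDS, (75.21514, 76.73034, 75.89076, 2.2626, -0.12976)))
music = dict(zip(EDA_FIELDS, (84.57642, 87.85858, 85.9477, 5.02192, 0.14392)))


def differential(stimulus, baseline, ndigits=5):
    """Return stimulus minus baseline for every measure in the baseline row."""
    return {key: round(stimulus[key] - baseline[key], ndigits) for key in baseline}


music_diff = differential(music, base)
for field in EDA_FIELDS:
    print(f"{field:>5}: {music_diff[field]}")
```

Under these assumptions the five results agree with the Group B Music EDA row of the differential table (9.36128, 11.12824, 10.05694, 2.75932, 0.27368).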
1.4 Questionnaire Data
Group A (control)
                                         Player 1      Player 3      Player 5      Player 7      Player 9
Overall intensity (1-10)                 5             8             8             4             7
Significant emotion                      Anticipation  Anticipation  Anticipation  Anticipation  Anticipation
Game Difficulty (1-10)                   3             5             3             3             3
Average hrs spent playing games (0-20)   8             18            13            8             20
Preferred platform                       Console       PC            Console       PC            PC
FPS experience (1-10)                    5             7             10            9             10
Horror game experience (1-10)            2             4             6             6             10
Gender                                   Male          Male          Male          Male          Male
Age                                      25            19            21            26            26
Interruption of flow/immersion           3             4             2             2             4
Comfort of sensors                       8             8             7             6             7
Description of most memorable & intense section of game:
  Player 1: Final countdown bangs
  Player 3: Alien in corridor and black screen
  Player 5: Alien in corridor and black screen
  Player 7: Female scream in control room
  Player 9: Following NPC into unknown

Group B (test group)
                                         Player 2      Player 4      Player 6      Player 8      Player 10
Overall intensity (1-10)                 8             6             7             7             7
Significant emotion                      Excitement    Anticipation  Anticipation  Surprise      Anticipation
Game Difficulty (1-10)                   3             4             3             3             3
Average hrs spent playing games (0-20)   13            4             4             8             8
Preferred platform                       Console       PC            Console       PC            Console
FPS experience (1-10)                    8             9             8             10            5
Horror game experience (1-10)            2             5             7             4             1
Gender                                   Male          Male          Male          Male          Female
Age                                      19            27            23            27            27
Interruption of flow/immersion           3             2             2             2             3
Comfort of sensors                       6             7             7             8             7
Description of most memorable & intense section of game:
  Player 2: Alien in corridor and black screen
  Player 4: Alien rushing past open door
  Player 6: Walk up to alien and Lights explode
  Player 8: Final countdown bangs
  Player 10: Bodies in control room
ii) Preliminary Web-based Assessment of Affective
Computer Game Audio
2.1 The Sounds of Fear: Raw data comparing untreated audio against DSP effects ( ) and online datasets (A1-10) against offline (B1-10) for user-defined measures of emotional activation on a nine-point scale
Footsteps
Music Radio
Window
Zombie
Voice Radio
Tree fall
Church
Monster
Scream
Manhole
Roar
Water
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
3
4
1
2
3
4
5
1
7
3
4
2
1
1
6
7
5
7
4
1
3
4
1
3
5
5
2
3
7
4
1
3
5
1
2
8
2
5
7
2
3
7
1
7
1
5
1
5
3
2
6
6
8
5
5
7
4
2
3
6
5
4
1
6
4
5
1
4
6
3
4
6
9
5
5
4
6
3
4
9
5
5
7
3
3
9
1
5
1
2
2
1
7
5
8
6
2
4
7
7
3
5
1
6
1
5
3
5
1
4
3
1
2
5
7
4
4
5
5
6
7
7
6
6
6
7
1
9
5
4
6
6
7
4
7
2
1
7
4
2
7
9
5
3
1
8
4
9
1
8
2
4
3
1
2
5
1
5
8
1
7
8
7
5
1
8
7
9
1
7
8
9
9
8
8
7
9
2
4
9
1
2
4
2
2
5
7
8
5
2
6
7
8
3
8
8
5
4
9
8
7
6
1
8
4
6
4
6
4
2
3
6
9
6
5
9
3
6
4
7
7
8
4
8
5
7
2
8
6
3
3
5
9
6
7
8
6
6
9
5
8
7
3
6
5
7
5
1
8
3
6
8
8
7
6
8
7
3
4
3
6
8
8
3
5
8
1
3
1
4
3
3
4
3
5
2
3
3
9
4
6
3
6
7
1
6
5
9
3
3
4
5
7
5
3
3
2
7
5
6
7
7
2
8
5
6
6
5
4
3
4
2
7
2
4
4
3
6
9
3
8
6
3
8
4
6
2
9
5
4
7
7
9
2
7
7
7
5
9
9
7
3
2
6
8
3
8
9
4
4
7
6
9
7
8
8
8
4
8
9
8
4
2
6
4
5
2
1
6
5
4
7
6
2
2
7
6
4
5
6
7
6
6
7
5
6
8
6
7
2
7
7
7
5
7
9
6
4
3
6
7
6
2
9
4
6
1
7
4
4
2
5
6
6
7
6
6
5
3
1
7
3
5
7
1
4
3
9
1
4
8
8
6
8
7
9
7
5
9
5
6
9
7
3
3
8
1
1
2
6
7
8
9
1
8
8
9
7
9
9
6
8
3
2
5
5
1
7
7
2
8
8
9
8
9
9
8
7
9
8
2.2 The Sounds of Fear: Raw data comparing untreated audio against DSP effects ( ) and offline datasets (A1-10) against online (B1-10) for user-defined measures of emotional valence on a nine-point scale
Footsteps
Music Radio
Window
Zombie
Voice Radio
Tree fall
Church
Monster
Scream
Manhole
Roar
Water
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
5
6
5
5
5
5
4
3
5
3
2
5
1
1
4
7
5
5
4
5
5
4
5
3
5
4
4
3
2
4
7
5
4
1
5
7
5
5
7
4
5
3
5
3
4
5
5
7
3
5
4
3
8
4
5
3
5
4
9
1
5
3
5
4
5
4
5
6
2
3
3
3
9
4
4
4
3
2
4
1
5
4
4
5
4
3
5
5
1
3
8
5
7
4
2
4
5
4
5
3
5
4
5
4
3
4
5
6
1
5
7
5
2
3
3
4
5
4
3
3
3
4
3
5
7
3
5
1
3
5
6
2
5
4
2
6
5
7
4
4
3
1
3
5
1
2
3
1
5
1
7
2
3
5
2
3
5
3
2
4
3
2
2
4
5
2
3
1
5
1
4
1
9
2
1
6
1
3
4
1
5
5
5
5
7
5
2
1
1
3
4
2
8
5
1
3
3
3
1
3
2
3
2
3
3
3
7
3
2
4
7
3
7
3
5
8
5
5
4
3
3
2
2
2
5
3
3
3
1
3
3
4
7
3
2
8
4
5
1
3
5
5
1
4
7
5
3
1
2
5
6
1
8
2
3
7
2
5
6
4
4
4
4
3
4
3
6
4
5
5
7
3
4
5
4
3
5
3
7
3
3
5
3
7
7
4
3
2
2
3
6
4
5
4
5
4
5
6
4
4
2
5
5
2
5
4
2
5
3
4
6
4
7
5
5
2
4
5
2
5
5
3
4
3
5
4
5
1
2
3
6
2
9
5
2
7
2
6
1
1
1
5
5
3
9
5
1
1
4
3
2
3
9
2
2
9
2
3
2
1
1
5
5
3
4
5
3
4
2
2
7
2
7
4
3
6
3
4
5
4
6
3
2
5
6
4
2
4
2
5
5
2
4
4
3
8
3
4
5
3
3
3
5
2
6
4
5
1
2
3
6
3
7
2
2
6
2
7
4
4
3
5
3
8
7
3
4
9
5
3
3
2
7
2
2
9
2
5
1
3
4
3
2
2
5
3
5
5
3
2
6
2
9
5
3
6
1
7
2
1
6
2
1
5
6
5
5
7
3
5
3
2
9
2
2
8
1
7
2
1
2.3 The Horror Sound Designer: Raw data comparing user input and online datasets (A1-10) against offline (B1-10)
User  Pref. Sound  Attack 1  d  Attack 2  d  Pitch 1  d  Pitch 2  d  Delay 1  d  Delay 2  d  Position 1  d  Position 2  d
A1    Slam         15        1  5         2  +7       3  -10      2  2        2  2        3  Centre      2  Centre      2
A2    Slam         15        3  15        3  +7       1  0        1  2        1  14       2  Centre      1  Centre      1
A3    Slam         5         3  15        4  +7       1  -10      2  14       4  14       3  L2R         2  L2R         2
A4    Thunder      15        1  15        4  -10      1  -10      1  2        1  14       1  R2L         1  Centre      1
A5    Thunder      15        3  15        2  -10      3  -10      3  14       4  8        3  R2L         3  R2L         3
A6    Thunder      5         6  15        5  +7       4  -10      3  2        2  2        2  L2R         3  L2R         3
A7    Thunder      5         3  15        1  -10      1  +7       2  8        2  8        2  L2R         3  Right       1
A8    Thunder      15        3  15        4  -10      5  -10      2  14       4  2        3  Left        3  R2L         3
A9    Thunder      15        1  5         1  +7       6  0        3  8        1  8        1  L2R         1  Right       1
A10   Thunder      15        2  15        1  -10      1  -10      1  2        1  2        1  R2L         1  R2L         1
B1    Thunder      5         4  0         4  0        4  +7       3  2        4  14       3  Centre      3  R2L         4
B2    Thunder      0         6  5         6  +7       5  +7       5  2        5  14       5  Centre      5  Centre      5
B3    Thunder      5         4  15        3  +7       3  -10      2  14       2  8        2  Left        2  Right       2
B4    Insect       5         6  15        6  +7       6  0        6  2        5  8        5  L2R         5  R2L         5
B5    Thunder      5         4  15        3  0        2  -10      1  8        1  2        1  Centre      1  Right       1
B6    SMG          0         4  15        3  +7       6  -10      4  14       4  14       4  Centre      4  Left        3
B7    Slam         5         2  5         1  0        4  +7       4  2        4  2        3  R2L         1  Centre      1
B8    Insect       0         4  15        3  0        1  -10      1  2        1  14       1  L2R         2  Centre      2
B9    Train        5         4  5         6  +7       5  +7       3  8        3  2        4  Centre      1  L2R         3
B10   Thunder      15        4  15        5  -10      5  0        1  8        4  2        2  L2R         3  R2L         3
2.4 The Horror Sound Designer: Complete online dataset (users 1-19)
User  Pref. Sound  Attack 1  d  Attack 2  d  Pitch 1  d  Pitch 2  d  Delay 1  d  Delay 2  d  Position 1  d  Position 2  d
1     Thunder      5         4  0         4  0        4  +7       3  2        4  14       3  Centre      3  R2L         4
2     Thunder      0         6  5         6  +7       5  +7       5  2        5  14       5  Centre      5  Centre      5
3     Thunder      5         4  15        3  +7       3  -10      2  14       2  8        2  Left        2  Right       2
4     Insect       5         6  15        6  +7       6  0        6  2        5  8        5  L2R         5  R2L         5
5     Thunder      5         4  15        3  0        2  -10      1  8        1  2        1  Centre      1  Right       1
6     SMG          0         4  15        3  +7       6  -10      4  14       4  14       4  Centre      4  Left        3
7     Slam         5         2  5         1  0        4  +7       4  2        4  2        3  R2L         1  Centre      1
8     Insect       0         4  15        3  0        1  -10      1  2        1  14       1  L2R         2  Centre      2
9     Train        5         4  5         6  +7       5  +7       3  8        3  2        4  Centre      1  L2R         3
10    Thunder      15        4  15        5  -10      5  0        1  8        4  2        2  L2R         3  R2L         3
11    Thunder      0         4  15        4  -10      4  -10      3  14       3  2        2  Left        1  L2R         1
12    Slam         15        4  15        4  -10      4  -10      3  8        2  14       2  Left        3  L2R         3
13    Thunder      5         3  15        1  -10      1  +7       2  8        2  8        2  L2R         3  Right       1
14    Thunder      15        3  15        4  0        3  0        1  14       4  8        3  L2R         1  Right       1
15    Slam         15        4  15        3  +7       3  0        3  8        3  8        3  L2R         2  L2R         2
16    Slam         15        3  15        4  0        3  -10      1  8        2  8        2  Centre      1  Left        4
17    Thunder      15        3  15        3  +7       4  0        3  14       3  8        3  R2L         1  Right       1
18    Insect       0         5  15        3  -10      1  -10      1  8        1  14       1  R2L         1  Right       1
19    Thunder      15        3  15        3  0        2  0        2  14       4  8        3  L2R         6  Left        5
2.5 The Horror Sound Designer: Complete online dataset (users 20-38)
User  Pref. Sound  Attack 1  d  Attack 2  d  Pitch 1  d  Pitch 2  d  Delay 1  d  Delay 2  d  Position 1  d  Position 2  d
20    Slam         0         3  15        2  +7       2  -10      3  8        3  8        2  Left        3  Centre      3
21    Slam         15        4  15        3  0        3  +7       2  2        2  2        2  Centre      1  Centre      2
22    Insect       0         4  0         5  0        3  0        3  14       2  8        2  R2L         2  Centre      2
23    Thunder      15        3  15        4  -10      5  -10      2  14       4  2        3  Left        3  R2L         3
24    Slam         5         3  15        4  0        5  0        3  2        3  2        3  Left        5  L2R         5
25    Slam         15        4  15        5  -10      3  0        2  2        5  14       4  L2R         2  L2R         3
26    Slam         5         4  0         4  -10      5  +7       4  14       3  2        4  L2R         3  Left        4
27    Slam         15        2  15        4  -10      4  +7       2  8        4  2        3  Centre      6  Centre      4
28    Thunder      15        3  0         5  +7       5  -10      2  8        6  2        4  L2R         4  L2R         4
29    Slam         15        5  0         4  -10      6  +7       1  2        4  14       5  Centre      2  Centre      3
30    Thunder      5         4  5         4  -10      5  0        4  14       6  2        3  Centre      4  Centre      4
31    Thunder      15        1  5         1  +7       6  0        3  8        1  8        1  L2R         1  Right       1
32    Thunder      15        2  15        1  -10      1  -10      1  2        1  2        1  R2L         1  R2L         1
33    Slam         15        1  5         2  +7       3  -10      2  2        2  2        3  Centre      2  Centre      2
34    Slam         15        3  15        3  +7       1  0        1  2        1  14       2  Centre      1  Centre      1
35    Slam         5         3  15        4  +7       1  -10      2  14       4  14       3  L2R         2  L2R         2
36    Thunder      15        1  15        4  -10      1  -10      1  2        1  14       1  R2L         1  Centre      1
37    Thunder      15        3  15        2  -10      3  -10      3  14       4  8        3  R2L         3  R2L         3
38    Thunder      5         6  15        5  +7       4  -10      3  2        2  2        2  L2R         3  L2R         3
2.6 The Sounds of Fear: Complete activation online dataset (users 1-14)
Footsteps
Music Radio
Window
Zombie
Voice Radio
Tree fall
Church
Monster
Scream
Manhole
Roar
Water
1
2
3
4
5
6
7
8
9
10
11
12
13
14
4
2
1
1
6
7
5
7
4
1
2
4
3
5
1
3
5
1
2
8
2
5
7
2
3
3
1
5
6
6
8
5
5
7
4
2
3
6
6
1
1
2
4
6
9
5
5
4
6
3
4
9
6
3
2
5
2
1
7
5
8
6
2
4
7
7
2
5
2
1
3
1
2
5
7
4
4
5
5
6
3
4
1
6
6
6
7
4
7
2
1
7
4
2
4
2
7
8
2
4
3
1
2
5
1
5
8
1
3
1
8
6
8
9
9
8
8
7
9
2
4
9
7
7
6
3
6
7
8
3
8
8
5
4
9
8
7
3
6
6
3
6
9
6
5
9
3
6
4
7
5
4
5
7
3
5
9
6
7
8
6
6
9
5
7
1
8
6
6
8
8
7
6
8
7
3
4
3
7
5
2
6
3
3
4
3
5
2
3
3
9
4
4
2
2
7
4
5
7
5
3
3
2
7
5
6
4
7
7
7
4
2
7
2
4
4
3
6
9
3
3
8
3
5
7
7
9
2
7
7
7
5
9
9
5
9
2
6
7
6
9
7
8
8
8
4
8
9
6
8
7
3
4
7
6
2
2
7
6
4
5
6
5
4
2
6
7
7
7
5
7
9
6
4
3
6
5
2
6
4
2
5
6
6
7
6
6
5
3
1
6
3
2
2
8
8
6
8
7
9
7
5
9
5
7
5
4
6
7
8
9
1
8
8
9
7
9
9
8
5
8
9
8
8
9
8
9
9
8
7
9
8
8
7
8
6
2.7 The Sounds of Fear: Complete activation online dataset (users 15-28)
15
Footsteps
Music Radio
Window
Zombie
Voice Radio
Tree fall
Church
Monster
Scream
Manhole
Roar
Water
16
17
18
19
20
21
22
23
24
25
26
27
28
3
3
2
2
5
1
7
3
3
4
1
2
3
4
3
2
1
3
2
3
7
4
3
4
1
3
5
5
5
6
4
6
1
5
3
2
3
7
1
7
1
5
3
3
4
5
1
4
6
3
5
4
1
6
4
5
2
3
3
3
1
5
1
2
5
5
7
3
3
9
2
4
2
5
3
5
1
4
3
5
1
6
1
5
4
4
3
3
1
9
5
4
7
7
6
6
6
7
4
1
2
3
4
9
1
8
7
9
5
3
1
8
8
6
8
8
7
9
1
7
7
8
7
5
1
8
3
1
7
6
7
8
5
2
1
2
4
2
2
5
2
5
3
7
4
6
4
2
7
6
1
8
4
6
2
6
3
4
2
8
6
3
7
8
4
8
5
7
5
7
3
7
5
1
8
3
8
7
3
6
5
7
3
3
2
3
1
3
1
4
6
8
8
3
5
8
6
3
4
4
5
9
3
3
6
3
6
7
1
6
3
4
5
3
6
5
4
3
7
7
2
8
5
6
7
4
7
7
2
9
5
4
8
6
3
8
4
6
7
8
6
7
8
9
4
4
7
3
2
6
8
3
3
7
5
8
2
1
6
5
8
4
2
6
4
5
5
7
5
6
8
6
7
2
7
6
6
7
5
6
2
3
3
4
1
7
4
4
7
6
2
9
4
6
7
3
4
6
3
9
1
4
7
3
5
7
1
4
7
8
6
8
1
1
2
6
6
9
7
3
3
8
8
9
7
4
1
7
7
2
6
8
3
2
5
5
2.8 The Sounds of Fear: Complete valence online dataset (users 1-14)
Footsteps
Music Radio
Window
Zombie
Voice Radio
Tree fall
Church
Monster
Scream
Manhole
Roar
Water
1
2
3
4
5
6
7
8
9
10
11
12
13
14
2
5
1
1
4
7
5
5
4
5
5
4
5
3
7
5
4
1
5
7
5
5
7
4
5
7
5
3
4
3
8
4
5
3
5
4
9
1
4
5
5
5
3
3
9
4
4
4
3
2
4
1
4
6
5
4
8
5
7
4
2
4
5
4
5
3
5
3
5
5
7
5
2
3
3
4
5
4
3
3
5
2
5
2
6
2
5
4
2
6
5
7
4
4
4
5
2
1
7
2
3
5
2
3
5
3
2
4
5
7
1
2
4
1
9
2
1
6
1
3
4
1
4
3
4
6
4
2
8
5
1
3
3
3
1
3
2
5
3
2
7
3
7
3
5
8
5
5
4
3
4
4
3
3
3
4
7
3
2
8
4
5
1
3
3
6
2
3
6
1
8
2
3
7
2
5
6
4
4
3
4
3
7
3
4
5
4
3
5
3
7
3
4
6
5
4
6
4
5
4
5
4
5
6
4
4
5
1
2
1
6
4
7
5
5
2
4
5
2
5
5
2
4
2
6
2
9
5
2
7
2
6
1
1
5
2
1
2
2
3
9
2
2
9
2
3
2
1
3
2
2
4
7
2
7
4
3
6
3
4
5
4
5
3
5
4
5
2
4
4
3
8
3
4
5
3
4
2
4
6
6
3
7
2
2
6
2
7
4
4
4
5
5
6
3
2
7
2
2
9
2
5
1
3
2
2
4
5
6
2
9
5
3
6
1
7
2
1
2
2
2
1
3
2
9
2
2
8
1
7
2
1
3
2
2
2
2.9 The Sounds of Fear: Complete valence online dataset (users 15-28)
Footsteps
Music Radio
Window
Zombie
Voice Radio
Tree fall
Church
Monster
Scream
Manhole
Roar
Water
15
16
17
18
19
20
21
22
23
24
25
26
27
28
5
2
5
5
4
3
5
3
5
6
5
5
5
5
5
2
5
5
4
3
2
4
5
4
5
3
5
4
4
7
4
4
5
7
3
5
5
3
5
3
4
5
4
6
4
4
5
6
2
3
5
3
5
4
5
4
5
4
5
4
5
5
1
3
5
4
4
5
4
3
4
6
5
5
5
6
1
5
5
4
5
4
3
4
4
5
4
4
5
1
3
5
3
4
3
5
7
3
5
2
5
3
3
1
5
1
3
1
3
5
1
2
2
5
2
2
3
1
5
1
3
2
2
4
5
2
5
1
2
3
2
1
1
3
5
5
5
5
7
5
5
6
5
3
7
3
2
4
2
3
2
3
3
3
4
7
5
4
3
3
1
3
3
2
2
2
5
3
4
7
4
3
3
1
2
5
5
5
1
4
7
5
5
3
4
5
6
4
5
5
4
4
4
3
4
3
4
5
4
4
3
2
2
3
3
5
3
7
7
4
4
4
4
5
2
5
3
4
2
5
5
2
5
4
3
3
2
2
5
1
2
3
5
3
4
3
5
4
3
8
3
2
1
1
4
3
1
5
5
3
9
5
5
7
4
3
3
4
2
2
1
5
5
3
4
5
4
7
4
4
2
4
2
5
6
3
2
5
6
4
5
5
5
4
5
1
2
3
3
3
5
2
6
4
3
7
5
2
4
9
5
3
3
5
3
8
7
3
3
8
3
6
5
5
3
2
4
3
2
2
5
3
3
8
3
5
5
7
3
5
6
2
1
5
6
5
iii) Preliminary Assessment of the Fear Value of
Preselected Sound Parameters in a Survival Horror
Game
3.1 Empirical & subjective datasets from three individual audio treatments and one untreated control group
Untreated Audio
Player  Treatment Order  Completion Time  RTI   Mean Reaction Time  Run function Activations  Post-Game Intensity  Level Difficulty  Significant Emotion
1       1234             106.132          2.20  3.506               1                         2.00                 1.00              Anticipation
2       4321             54.543           2.00  2.231               7                         3.00                 1.00              Anticipation
3       3124             56.403           1.60  1.504               2                         2.00                 1.00              Trust
4       3412             54.829           2.00  1.898               1                         3.00                 2.00              Anticipation
5       2314             457.136          3.60  1.808               2                         4.00                 3.00              Anticipation
6       1342             396.392          4.60  4.318               0                         4.00                 3.00              Fear
7       4123             88.801           2.00  2.774               4                         3.00                 1.00              Anticipation
8       2134             53.442           1.40  2.018               3                         2.00                 1.00              Anticipation
9       1423             434.804          3.60  1.875               2                         4.00                 3.00              Fear
10      2413             401.560          4.60  3.454               2                         5.00                 3.00              Fear
11      3142             443.655          3.20  2.102               1                         3.00                 2.00              Anticipation
12      4213             384.474          4.00  4.119               0                         4.00                 3.00              Surprise
Pitch
Player
Completion
Time
RTI
1
107.561
4.000
3.00
1.648
3.00
2.00
Surprise
2
62.399
5.000
2.20
2.452
2.00
2.00
Anticipation
3
57.609
5.000
1.00
1.253
1.00
1.00
Trust
4
49.545
4.000
3.20
1.740
5.00
3.00
Fear
5
237.412
2.000
3.40
2.452
4.00
3.00
Surprise
6
107.673
1.000
2.40
2.469
3.00
2.00
Anticipation
7
98.361
5.000
2.00
2.557
3.00
1.00
Anticipation
8
116.023
4.000
1.40
2.242
3.00
1.00
Anticipation
9
234.985
2.000
4.00
2.411
4.00
3.00
Anticipation
10
345.888
2.000
2.60
2.601
4.00
3.00
Surprise
11
100.620
1.000
2.60
2.546
3.00
2.00
Anticipation
12
122.745
1.000
2.20
2.294
3.00
2.00
Fear
3D Localisation
Player
Completion
Time
1
66.182
4.000
1.60
1.705
1.00
3.00
Boredom
2
168.815
5.000
2.20
2.528
2.00
3.00
Anticipation
3
62.264
5.000
2.00
1.438
2.00
2.00
Fear
4
80.296
4.000
2.80
1.911
3.00
3.00
Fear
5
229.760
2.000
3.40
2.607
4.00
3.00
Anticipation
6
706.558
1.000
4.40
2.040
3.00
4.00
Anticipation
7
156.169
5.000
2.00
2.504
4.00
1.00
Surprise
8
49.318
4.000
2.00
2.068
3.00
1.00
Anticipation
9
256.112
2.000
3.60
2.459
4.00
3.00
Anticipation
10
356.456
1.000
4.40
1.896
3.00
4.00
Surprise
11
203.008
2.000
3.40
2.607
4.00
3.00
Anticipation
12
281.020
1.000
4.40
2.143
4.00
4.00
Surprise
RTI
Mean
Reaction Time
Mean
Reaction Time
Run function
Activations
Run function
Activations
Post-Game
Intensity
Post-Game
Intensity
Level
Difficulty
Level
Difficulty
Significant
Emotion
Significant
Emotion
Loudness Boost
Player
Completion
Time
RTI
Mean
Reaction Time
Run function
Activations
Post-Game
Intensity
Level
Difficulty
Significant
Emotion
1
53.281
4.000
2.40
1.521
3.00
2.00
Surprise
2
94.193
5.000
2.00
2.288
2.00
3.00
Anticipation
3
78.351
5.000
1.80
1.545
2.00
3.00
Anticipation
4
95.390
4.000
2.40
1.910
4.00
3.00
Anticipation
5
211.718
2.000
3.40
1.556
4.00
3.00
Boredom
6
352.680
1.000
3.80
4.358
3.00
2.00
Excited
7
88.086
5.000
2.60
3.338
3.00
1.00
Anticipation
8
63.300
4.000
2.00
1.937
3.00
1.00
Anticipation
9
199.900
2.000
3.20
1.344
4.00
3.00
Boredom
10
294.315
1.000
3.60
3.654
4.00
2.00
Anticipation
11
241.654
2.000
3.60
1.659
3.00
3.00
Anticipation
12
364.778
1.000
4.00
3.981
4.00
2.00
Surprise
3.2 Overall experience & personal information
Player  Immersion  Flow  Most Intense Level  Experience (hrs p/w)  Age  Gender  Impairments  Nationality
1       2          3     Loud                13                    22   Male    None         British
2       3          3     Untreated           18                    21   Male    None         British
3       3          2     Loud                18                    24   Male    None         British
4       4          1     Pitch               14                    25   Female  None         British
5       3          3     Untreated           3                     26   Male    None         British
6       4          3     Untreated           0                     55   Female  None         British
7       3          1     Loud                17                    27   Male    None         British
8       3          2     Surround            14                    24   Male    None         British
9       3          3     Loud                12                    24   Male    None         British
10      4          3     Pitch               13                    21   Male    None         British
11      3          1     Untreated           15                    21   Male    None         British
12      2          3     Loud                15                    22   Male    None         British
3.3 Event log data & real-time intensity responses
Quantitative Datasets obtained in Real-time During Gameplay [Player 1]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       28.144              3                 42.866                 4.722
                 Twig Snap         57.362              1                 59.650                 2.288
                 Woman Scream      69.389              2                 72.363                 2.974
                 Monster Scream    78.733              3                 82.556                 3.829
                 Animal Scream     94.838              2                 98.555                 3.717
Pitch            Zombie Call       11.789              2                 13.991                 2.202
                 Twig Snap         20.337              3                 21.131                 0.794
                 Woman Scream      82.935              3                 84.161                 1.226
                 Monster Scream    89.901              3                 91.835                 1.934
                 Animal Scream     98.093              4                 100.177                2.084
3D Localisation  Zombie Call       6.229               2                 8.051                  1.822
                 Twig Snap         14.082              1                 15.172                 1.090
                 Woman Scream      39.437              1                 41.198                 1.761
                 Monster Scream    47.432              2                 49.639                 2.207
                 Animal Scream     59.020              2                 60.665                 1.645
Loudness Boost   Zombie Call       5.373               1                 6.574                  1.201
                 Twig Snap         15.429              1                 16.484                 1.055
                 Woman Scream      27.313              3                 28.997                 1.684
                 Monster Scream    36.854              3                 38.826                 1.972
                 Animal Scream     44.471              4                 46.166                 1.695
Quantitative Datasets obtained in Real-time During Gameplay [Player 2]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       6.105               2                 9.676                  3.571
                 Twig Snap         13.876              1                 14.314                 0.438
                 Woman Scream      24.919              2                 27.360                 2.441
                 Monster Scream    34.425              2                 36.836                 2.411
                 Animal Scream     47.899              3                 50.149                 2.250
Pitch            Zombie Call       8.007               2                 11.744                 3.737
                 Twig Snap         16.394              1                 18.050                 1.656
                 Woman Scream      27.873              2                 30.095                 2.222
                 Monster Scream    43.700              3                 46.145                 2.445
                 Animal Scream     55.322              3                 57.523                 2.201
3D Localisation  Zombie Call       20.755              2                 23.925                 3.170
                 Twig Snap         50.288              1                 51.886                 1.598
                 Woman Scream      101.267             2                 103.505                2.238
                 Monster Scream    110.761             3                 113.481                2.72
                 Animal Scream     159.710             3                 162.623                2.913
Loudness Boost   Zombie Call       13.003              2                 16.173                 3.170
                 Twig Snap         41.269              1                 42.901                 1.632
                 Woman Scream      55.463              2                 57.481                 2.018
                 Monster Scream    63.403              2                 65.856                 2.453
                 Animal Scream     74.969              3                 77.134                 2.165
Quantitative Datasets obtained in Real-time During Gameplay [Player 3]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       6.631               2                 7.762                  1.131
                 Twig Snap         13.129              1                 14.505                 1.376
                 Woman Scream      21.990              1                 23.881                 1.891
                 Monster Scream    30.198              2                 31.923                 1.725
                 Animal Scream     40.070              2                 41.466                 1.396
Pitch            Zombie Call       8.530               1                 9.397                  0.867
                 Twig Snap         14.844              1                 15.837                 0.993
                 Woman Scream      23.365              1                 24.946                 1.581
                 Monster Scream    31.476              1                 33.154                 1.678
                 Animal Scream     41.817              1                 42.964                 1.147
3D Localisation  Zombie Call       8.566               2                 10.368                 1.802
                 Twig Snap         24.068              1                 25.015                 0.947
                 Woman Scream      35.185              3                 37.194                 2.009
                 Monster Scream    45.977              2                 47.037                 1.060
                 Animal Scream     53.373              2                 54.745                 1.372
Loudness Boost   Zombie Call       6.351               1                 7.318                  0.967
                 Twig Snap         14.197              1                 15.260                 1.063
                 Woman Scream      26.748              3                 28.807                 2.059
                 Monster Scream    61.889              1                 63.842                 1.953
                 Animal Scream     69.134              3                 70.815                 1.681
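The reaction time column and the per-player summary figures in 3.1 can be derived from the event log. The sketch below does this in Python for Player 3's first (Original) block of five events. It assumes reaction time is the absolute difference between the sound event time and the audio response time (the logs list the two times in opposite order for different players), and that RTI is the mean of the five intensity ratings; the function and variable names are illustrative only.

```python
from statistics import mean

# Player 3, Original game type: (event name, event time, intensity, response time),
# copied from the event log above.
events = [
    ("Zombie Call", 6.631, 2, 7.762),
    ("Twig Snap", 13.129, 1, 14.505),
    ("Woman Scream", 21.990, 1, 23.881),
    ("Monster Scream", 30.198, 2, 31.923),
    ("Animal Scream", 40.070, 2, 41.466),
]


def reaction_times(rows):
    """Absolute gap between event and response, rounded to milliseconds.

    abs() is an assumption made so the same code works whichever of the
    two time columns is larger.
    """
    return [round(abs(response - event), 3) for _, event, _, response in rows]


rts = reaction_times(events)
mean_reaction_time = round(mean(rts), 3)
mean_intensity = round(mean(intensity for _, _, intensity, _ in events), 2)

print(rts)
print(mean_reaction_time, mean_intensity)
```

Under these assumptions the results agree with the Untreated Audio row for Player 3 in 3.1 (RTI 1.60, Mean Reaction Time 1.504).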
Quantitative Datasets obtained in Real-time During Gameplay [Player 4]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       6.597               1                 8.065                  1.468
                 Twig Snap         22.167              1                 24.114                 1.947
                 Woman Scream      29.403              2                 31.322                 1.919
                 Monster Scream    37.212              2                 39.306                 2.094
                 Animal Scream     44.406              4                 46.47                  2.064
Pitch            Zombie Call       8.368               2                 9.903                  1.535
                 Twig Snap         14.914              2                 16.476                 1.562
                 Woman Scream      23.39               3                 25.352                 1.962
                 Monster Scream    32.152              4                 34.361                 2.201
                 Animal Scream     42.763              5                 44.204                 1.441
3D Localisation  Zombie Call       15.100              3                 17.502                 2.402
                 Twig Snap         27.056              2                 28.647                 1.591
                 Woman Scream      33.713              3                 35.253                 1.54
                 Monster Scream    42.492              3                 44.529                 2.037
                 Animal Scream     49.586              3                 51.569                 1.983
Loudness Boost   Zombie Call       7.432               2                 9.701                  2.269
                 Twig Snap         15.497              1                 17.041                 1.544
                 Woman Scream      24.292              2                 26.918                 2.626
                 Monster Scream    46.402              3                 48.172                 1.77
                 Animal Scream     68.953              4                 70.294                 1.341
Quantitative Datasets obtained in Real-time During Gameplay [Player 5]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       18.263              3                 20.132                 1.869
                 Twig Snap         39.960              2                 41.529                 1.569
                 Woman Scream      152.882             4                 155.200                2.318
                 Monster Scream    329.045             4                 330.676                1.631
                 Animal Scream     426.920             5                 428.573                1.653
Pitch            Zombie Call       21.725              3                 25.061                 3.336
                 Twig Snap         37.764              1                 41.111                 3.347
                 Woman Scream      69.260              4                 71.274                 2.014
                 Monster Scream    156.490             4                 158.862                2.372
                 Animal Scream     196.107             5                 197.300                1.193
3D Localisation  Zombie Call       20.546              3                 22.214                 1.668
                 Twig Snap         58.828              2                 61.42                  2.592
                 Woman Scream      92.010              3                 93.752                 1.742
                 Monster Scream    112.593             4                 115.441                2.848
                 Animal Scream     205.047             5                 206.231                1.184
Loudness Boost   Zombie Call       19.882              3                 21.584                 1.702
                 Twig Snap         44.870              2                 46.576                 1.706
                 Woman Scream      71.932              4                 73.336                 1.404
                 Monster Scream    90.751              4                 92.321                 1.57
                 Animal Scream     183.45              4                 184.847                1.397
Quantitative Datasets obtained in Real-time During Gameplay [Player 6]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       39.809              4                 50.620                 10.811
                 Twig Snap         87.615              4                 91.561                 3.946
                 Woman Scream      243.115             5                 245.415                2.300
                 Monster Scream    321.840             5                 323.993                2.153
                 Animal Scream     373.000             5                 375.378                2.378
Pitch            Zombie Call       32.089              3                 35.058                 2.969
                 Twig Snap         44.235              3                 47.371                 3.136
                 Woman Scream      60.797              2                 62.853                 2.056
                 Monster Scream    72.897              2                 75.031                 2.134
                 Animal Scream     95.702              2                 97.754                 2.052
3D Localisation  Zombie Call       7.249               4                 9.418                  2.169
                 Twig Snap         24.319              4                 25.668                 1.349
                 Woman Scream      115.017             5                 116.992                1.975
                 Monster Scream    220.860             4                 222.898                2.038
                 Animal Scream     675.282             5                 677.953                2.671
Loudness Boost   Zombie Call       28.672              3                 31.208                 2.536
                 Twig Snap         41.071              3                 43.420                 2.349
                 Woman Scream      53.982              4                 55.165                 1.183
                 Monster Scream    208.380             4                 211.888                3.508
                 Animal Scream     323.467             5                 335.679                12.212
Quantitative Datasets obtained in Real-time During Gameplay [Player 7]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       11.099              1                 14.903                 3.804
                 Twig Snap         23.642              1                 25.714                 2.072
                 Woman Scream      38.999              2                 41.563                 2.564
                 Monster Scream    52.756              3                 55.627                 2.871
                 Animal Scream     68.198              3                 70.758                 2.560
Pitch            Zombie Call       10.864              1                 13.466                 2.602
                 Twig Snap         29.499              1                 31.751                 2.252
                 Woman Scream      47.142              2                 49.369                 2.227
                 Monster Scream    60.263              3                 63.426                 3.163
                 Animal Scream     75.655              3                 78.198                 2.543
3D Localisation  Zombie Call       13.065              1                 15.701                 2.636
                 Twig Snap         26.940              1                 29.582                 2.642
                 Woman Scream      49.243              2                 51.103                 1.860
                 Monster Scream    69.460              2                 71.891                 2.431
                 Animal Scream     129.070             4                 131.751                2.681
Loudness Boost   Zombie Call       13.346              2                 16.817                 3.471
                 Twig Snap         27.659              1                 30.864                 3.205
                 Woman Scream      44.122              3                 47.681                 3.559
                 Monster Scream    57.692              3                 61.061                 3.369
                 Animal Scream     67.985              4                 71.071                 3.086
Quantitative Datasets obtained in Real-time During Gameplay [Player 8]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       8.894               1                 11.153                 2.259
                 Twig Snap         17.437              1                 19.394                 1.957
                 Woman Scream      27.234              2                 29.004                 1.770
                 Monster Scream    34.424              2                 36.512                 2.088
                 Animal Scream     41.370              1                 43.385                 2.015
Pitch            Zombie Call       10.548              1                 12.165                 1.617
                 Twig Snap         17.220              1                 19.924                 2.704
                 Woman Scream      47.800              1                 49.456                 1.656
                 Monster Scream    56.897              1                 60.497                 3.600
                 Animal Scream     76.517              3                 78.148                 1.631
3D Localisation  Zombie Call       6.544               1                 9.014                  2.47
                 Twig Snap         12.694              1                 14.819                 2.125
                 Woman Scream      22.054              2                 23.728                 1.674
                 Monster Scream    30.808              3                 32.904                 2.096
                 Animal Scream     42.339              3                 44.316                 1.977
Loudness Boost   Zombie Call       6.881               1                 9.183                  2.302
                 Twig Snap         13.349              1                 15.056                 1.707
                 Woman Scream      24.874              3                 26.667                 1.793
                 Monster Scream    32.219              1                 34.308                 2.089
                 Animal Scream     42.723              4                 44.519                 1.796
Quantitative Datasets obtained in Real-time During Gameplay [Player 9]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       20.665              3                 19.718                 0.947
                 Twig Snap         101.331             3                 99.326                 2.005
                 Woman Scream      143.985             4                 142.184                1.801
                 Monster Scream    305.884             3                 303.772                2.112
                 Animal Scream     410.351             5                 407.841                2.510
Pitch            Zombie Call       5.687               3                 3.684                  2.003
                 Twig Snap         24.953              3                 22.152                 2.801
                 Woman Scream      86.64               5                 84.162                 2.478
                 Monster Scream    149.561             4                 146.893                2.668
                 Animal Scream     201.561             5                 199.456                2.105
3D Localisation  Zombie Call       23.347              3                 21.045                 2.302
                 Twig Snap         98.78               4                 96.666                 2.114
                 Woman Scream      126.648             4                 125.05                 1.598
                 Monster Scream    200.154             3                 197.208                2.946
                 Animal Scream     245.228             4                 241.893                3.335
Loudness Boost   Zombie Call       15.014              3                 13.222                 1.792
                 Twig Snap         90.15               3                 89.129                 1.021
                 Woman Scream      123.447             4                 122.215                1.232
                 Monster Scream    143.687             5                 142.02                 1.667
                 Animal Scream     187.186             5                 186.178                1.008
Quantitative Datasets obtained in Real-time During Gameplay [Player 10]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       12.758              4                 8.917                  3.841
                 Twig Snap         165.675             4                 162.218                3.457
                 Woman Scream      256.98              5                 253.875                3.105
                 Monster Scream    304.445             5                 301                    3.445
                 Animal Scream     381.875             5                 378.453                3.422
Pitch            Zombie Call       38.155              2                 35.698                 2.457
                 Twig Snap         184.765             1                 182.79                 1.975
                 Woman Scream      246.557             3                 244.085                2.472
                 Monster Scream    305.48              3                 302.694                2.786
                 Animal Scream     332.564             4                 329.249                3.315
3D Localisation  Zombie Call       10.547              4                 9.062                  1.485
                 Twig Snap         104.65              3                 102.654                1.996
                 Woman Scream      201.756             5                 200.056                1.7
                 Monster Scream    246.56              5                 243.798                2.762
                 Animal Scream     346.982             5                 345.445                1.537
Loudness Boost   Zombie Call       16.975              3                 12.96                  4.015
                 Twig Snap         83.954              3                 81.37                  2.584
                 Woman Scream      156.655             4                 153.501                3.154
                 Monster Scream    186.645             4                 182.749                3.896
                 Animal Scream     280.114             4                 275.493                4.621
Quantitative Datasets obtained in Real-time During Gameplay [Player 11]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       15.947              3                 14.3                   1.647
                 Twig Snap         46.471              3                 43.984                 2.487
                 Woman Scream      319.775             3                 317.298                2.477
                 Monster Scream    401.607             4                 400.117                1.49
                 Animal Scream     429.647             3                 427.238                2.409
Pitch            Zombie Call       9.995               2                 8.121                  1.874
                 Twig Snap         30.017              1                 28.063                 1.954
                 Woman Scream      56.782              3                 54.586                 2.196
                 Monster Scream    76.775              3                 73.094                 3.681
                 Animal Scream     94.504              4                 91.479                 3.025
3D Localisation  Zombie Call       23.881              3                 20.424                 3.457
                 Twig Snap         46.345              2                 44.86                  1.485
                 Woman Scream      91.015              4                 88.864                 2.151
                 Monster Scream    104.666             4                 101.721                2.945
                 Animal Scream     180.264             4                 177.267                2.997
Loudness Boost   Zombie Call       27.993              3                 26.948                 1.045
                 Twig Snap         40.677              2                 39.023                 1.654
                 Woman Scream      104.751             4                 102.906                1.845
                 Monster Scream    156.364             4                 154.52                 1.844
                 Animal Scream     210.784             5                 202.489                8.295
Quantitative Datasets obtained in Real-time During Gameplay [Player 12]
Game Type        Sound Event Name  Sound Event Time α  Intensity Rating  Audio Response Time β  Reaction Time α-β
Original         Zombie Call       56.751              3                 52.076                 4.675
                 Twig Snap         106.446             3                 102.989                3.457
                 Woman Scream      241.915             5                 237.891                4.024
                 Monster Scream    304.554             4                 301.453                3.101
                 Animal Scream     361.457             5                 356.119                5.338
Pitch            Zombie Call       20.115              2                 18.568                 1.547
                 Twig Snap         34.197              1                 30.74                  3.457
                 Woman Scream      65.485              3                 62.531                 2.954
                 Monster Scream    81.467              2                 79.82                  1.647
                 Animal Scream     102.45              3                 90.98                  11.47
3D Localisation  Zombie Call       63.473              4                 62.021                 1.452
                 Twig Snap         104.555             3                 102.56                 1.995
                 Woman Scream      156.778             5                 154.293                2.485
                 Monster Scream    201.468             5                 198.467                3.001
                 Animal Scream     267.648             5                 265.866                1.782
Loudness Boost   Zombie Call       19.998              4                 15.484                 4.514
                 Twig Snap         101.798             3                 98.541                 3.257
                 Woman Scream      243.765             4                 239.77                 3.995
                 Monster Scream    311.645             4                 306.691                4.954
                 Animal Scream     340.151             5                 331.856                3.185