Analysis Of EEG Data and Model Predictions
Aswathy Kishore
MSc Artificial Intelligence
Session 2010/2011
The candidate confirms that the work submitted is their own and the appropriate credit has been given
where reference has been made to the work of others.
I understand that failure to attribute material which is obtained from another source may be considered
as plagiarism.
(Signature of student)
Summary
The human brain is capable of a variety of functions such as cognition, sensory perception and
memory. One such function performed by the brain is visual attention. Visual attention can be defined
as the attention that is deployed to visual scenes around us as we observe them. Several studies have
been conducted to obtain an understanding of the underlying mechanisms of the phenomenon. One
important focus of these studies has been to understand how the brain represents the visual information that enters it through the eye. As a result of these studies, it can now be said with a fair amount
of confidence that the brain uses what is called a ‘compositional representation’ to store visual information efficiently. This means that different features of the visual scene, such as the colour and shape
of the different visual stimuli in the scene, are stored in different areas of the brain. These distributed
representations from the different areas are then combined or bound in some way to produce the overall representation. How, in fact, this binding together of information from different areas is done is an
issue that is being studied and researched extensively and this problem is called the ‘binding problem’.
This thesis studies a neural network model which puts forward a hypothesis about how the brain
solves the ‘binding problem’. A behavioural experiment based on attention was conducted at the
School of Psychology to verify this model and the hypothesis put forward by it. Brain activity during
this experiment was collected from the participants in the form of EEG signals. The aim of this thesis
is then, to analyse this EEG data and verify whether patterns of processing can be seen in the brain,
as suggested by the model. This exercise can then confirm the validity of the hypothesis. A detailed
study of the mechanism of visual attention, types of attention, the binding problem and the model is
also done as part of this thesis.
Acknowledgements
Firstly, I would like to thank my supervisor, Dr. Marc de Kamps for having provided me with the
guidance necessary for the project and for instilling confidence in me whenever things went wrong.
Secondly, I would like to extend my heartfelt gratitude to David G. Harrison, researcher at the Biosystems group of the University of Leeds, who in spite of being busy with his PhD, offered to help me
whenever I had trouble with the tools I needed for the project. I would also like to thank Melanie
Burke, Claudia Gonzalez and Jean-François Delvenne of the School of Psychology for having spared
the time to meet me and clarify my doubts regarding the experiment conducted at the School of Psychology and also for having provided me with the data required for the thesis. I am grateful to my
assessor, Professor Netta Cohen, for offering valuable suggestions and feedback during the progress
meeting. Despite the project not being his responsibility, my tutor, Dr. Hamish Carr offered valuable
advice on how to go about the project. I would therefore like to thank him from the depths of my
heart. I would like to thank my fellow students Elaine and Amal for reading through my report and
providing feedback. Besides this, I would like to thank all my classmates, friends and family for all
the support they have given me during the course of the project.
Contents

1 Introduction
  1.1 Overview
    1.1.1 Visual Attention and The Binding Problem
  1.2 Aim
  1.3 Objectives and Minimum Requirements
  1.4 List Of Deliverables
  1.5 Research Methodology
2 Background
  2.1 Visual Attention
    2.1.1 The Visual Pathways
    2.1.2 Bottom-up and Top-down Visual Processing
  2.2 Attention as a Performance Enhancement Mechanism
    2.2.1 Spatial Attention
    2.2.2 Non-spatial Attention
  2.3 Attention as a Representational Mechanism
3 Pre-processing and Analysis
  3.1 The Experiment
  3.2 Electroencephalography (EEG)
    3.2.1 Event-Related Potentials and Evoked Potentials
  3.3 The Dataset
  3.4 Pre-processing
    3.4.1 Stages of Pre-processing
    3.4.2 Iteration 1 of Pre-processing
    3.4.3 Iteration 2 of Pre-processing
    3.4.4 Iteration 3 of Pre-processing
  3.5 Analysis
    3.5.1 Independent Component Analysis (ICA)
4 The Model
  4.1 Overview
  4.2 The Feedforward Network
  4.3 The Feedback Network
  4.4 Dynamic Networks
  4.5 The Disinhibition Network
  4.6 Lateral Inhibition
  4.7 Implementation of a traversal program
5 Evaluation
6 Conclusion
  6.1 Further Work
Bibliography
A Personal Reflection
B Interim Report
C Schedule
D Tools and Resources used
Chapter 1
Introduction
1.1 Overview
For years, scientists have studied the human brain, trying to understand its organisation and how it
works. Attempts have been made to roughly define how the human brain performs various functions
such as cognition and sensory perception. An understanding of these aspects could help in the design
of intelligent agents in the future. However, we still have a long way to go in understanding the
organisation and information flow within the brain.
1.1.1 Visual Attention and The Binding Problem
Visual Attention is one of the functions of the brain which has been studied in great detail. Visual
attention refers to the attention deployed when we perceive things around us. In other words, it is
the attention deployed to visual stimuli. As we observe our surroundings, we attend to objects which
either stand out from the rest or hold some significance or relevance to us. For example, if we consider
a scene where everything is either black or white, an object in a colour such as bright yellow would
draw our attention to it. Similarly, suppose we are shown a scene containing one object in each of
seven different colours. If we are instructed to look at the yellow object in the scene, our
attention automatically shifts to the yellow object irrespective of what we were attending to earlier.
In the first case, the yellow object caught our attention because it stood out from its surroundings. In
the second scenario, the task to be performed caused our attention to shift. Several studies have been
conducted to understand the underlying mechanisms for these phenomena and some of these studies
are described in Chapter 2. The focus of this thesis is on the second scenario, where attention has to be
deployed according to a given task.
According to previous studies (for example, [1], [2]), when perceiving a visual stimulus such as
a simple coloured object, the brain stores information about it using what is called a ‘compositional
representation’. Accordingly, different features of the object such as shape, texture and colour will
be represented in different parts of the brain. Hence, in order to have a complete representation
for the object, these individual localised representations have to be bound together to form a global
representation of the object. In the area which stores information about a feature, different neurons
are selective to different values of the feature. Thus, if a red circle is presented in the scene, in the area
where colour information is stored, neurons which code for the colour ‘red’ will be active. Similarly,
in the area which stores shape information, neurons which code for the shape ‘circle’ will be active.
These two sets of neurons together code for the ‘red circle’. The idea of a compositional representation
also exists in the field of computer programming where we use modules or functions to solve a single
problem. In both cases, the representation provides an efficient storage mechanism. In the human
brain, however, it leads to another problem, as discussed below.
When multiple objects of different types are present in the visual scene, the situation changes
from that discussed above. If for example, we have a ‘red circle’ and a ‘blue square’ in a visual scene,
neurons representing the colours ‘red’ and ‘blue’ will be active and those representing the shapes
‘circle’ and ‘square’ will be active. When using the ‘compositional representation’ assumption, this
would lead to confusion since we cannot be sure whether to associate the colour ‘red’ to the shape
‘circle’ or to ‘square’. This uncertainty in associating a feature representation with its corresponding
object is referred to as the ‘binding problem’. However, if we are asked to identify the shape of
the red-coloured object, we can easily say that it is a circle. Hence, there has to be a mechanism
using which, our brain associates the colour ‘red’ with the shape ‘circle’ for the first object, and the
colour ‘blue’ with the shape ‘square’ for the second object. Some scientists believe that it is based
on the synchronous activity of the neurons i.e. the neurons which code for ‘red’ and ‘circle’ fire
synchronously and those for ‘blue’ and ‘square’ fire synchronously.
The model in [1], on which this thesis is based, puts forward a different hypothesis. It proposes
that binding of the different features to the appropriate object takes place using information about
the retinotopic position of the object. Retinotopic position is the position of the object as received
by the retina. A detailed study of the model is given in Chapter 4. In order to verify this model
and the hypothesis of binding by retinotopic position as suggested by it, a behavioural experiment
was conducted on human participants at the School of Psychology, University of Leeds. Responses
from the participants were collected in the form of EEG signals. Analysis of these EEG signals could
provide evidence for the hypothesis. The experiment and the analysis of the EEG signals have been
described in detail in Chapter 3. In this thesis, we analyse the results of this experiment and the
possible conclusions we could derive about visual processing in the brain from these results.
1.2 Aim
The overall goal of this thesis is to analyse the results of the experiment conducted at the School of
Psychology in order to verify the neural network model in [1] and hence, the principle of binding by
retinotopy as the mechanism used when visual attention is deployed in the human brain.
Originally, there was an additional goal to find a means of mapping the neural model to a model of
the human brain using a tool called ‘QTriplot’. In principle, this mapping could then help us generate
an EEG signal from the model much like the EEG signals recorded during the experiment. In order
to implement this mapping, a detailed study of the connectivity structure of the different visual areas
of the brain was also to be done. The thesis was thus initially composed of the following main
components, as indicated in figure 1.1:
• analysis of the EEG data obtained from the experiment at the School Of Psychology
• developing and implementing a method to derive a predicted EEG signal from the model
• comparing the predicted signal to the observations made on analysing EEG data
Figure 1.1: Initial aim of the project and its stages
However, as the project progressed, it was found that there were restrictions on the tool, QTriplot,
which was to be used to generate the predicted EEG signal from the model. It was pointed out that
the implementation of the mapping was quite complicated and that it would probably require more
time than is actually allotted to the entire project. Thus, I had to resort to a different approach. The
basic aim of the thesis and the scientific hypothesis on which it is based, however, still remained the
same. According to the new approach, the main focus of the project was thus shifted to analysis of the
EEG data obtained from the experiment using a signal processing approach. The conclusions drawn
from this can then be used to provide supporting evidence for the model or to shed some light on the
modifications to be made both to the model and the hypothesis. In addition to this, more emphasis was
placed on using CLAMVis, a tool built by David G. Harrison and Marc de Kamps, to visualise
the working of the model and to investigate how variations in the model could affect its working. This
enhanced emphasis was the result of the inability to use QTriplot for the project.
To summarise, the aim of the project was revised and the new aim of the project was to analyse
the EEG data from the experiment to provide supporting evidence for the model. Also, the stages of
prediction and comparison of the simulated EEG signal as indicated in the lower half of figure 1.1 had to
be shelved for the time being.
1.3 Objectives and Minimum Requirements
The main objectives and minimum requirements addressed by the project are to be able to:
• Develop an understanding of the problem of visual attention.
• Develop a thorough understanding of the neural model representing the functioning of the visual
cortex when attention is deployed.
• Explain the rationale for the psychology experiment in connection with the problem of visual
attention.
• Understand what EEGs are and provide a description of the process of pre-processing and
analysing EEG data obtained from the experiment using a specialized MATLAB tool for EEG
analysis - EEGLAB.
• Run the simulation program for the neural model and visualise the working of the model.
1.4 List Of Deliverables
Since the main aim of the project is analysis of data rather than development of a complete software
product, the deliverable is more in the form of a report of the analysis rather than a final executable
file. The following are the expected deliverables:
• A report containing the results of analysis of EEG data
• The pre-processed EEG data and plots generated for analysis
1.5 Research Methodology
Since the project is research-oriented, I thought that it was inappropriate to use a conventional software development model such as the Waterfall model. The Waterfall model is a sequential software
development model. The requirements are fixed at the start of the project and these requirements are
used to guide the implementation and testing phases which follow afterwards. In the case of a research
project, the initial set of requirements is very vague and incomplete. The requirements take a more
concrete form during the course of the project and hence may have to be revised very frequently. Besides, there may be a change in direction as the project progresses. Thus I felt that an incremental or
evolutionary model of development would be best suited to go about the project. This methodology
gave me the flexibility to make changes to my aims and requirements as needed and allowed me to
add new ones when I had achieved some of the previous aims.
In accordance with this methodology, an initial schedule for the project was developed. This
schedule is shown in appendix []. This was however a rough schedule and not a fine-grained plan.
Writing up was not done during this period although regular notes were taken of the background reading completed. Background reading could not be confined just to the initial period of the project. The
background reading done during this period only served to attain a basic understanding of the problem.
During subsequent stages in the project, more background reading had to be done whenever I could
not make progress owing to doubts. Till the submission of the interim report, the work progressed
as per the schedule. After this however, in about the last week of July, there was a major change of
direction in the project and this led to an updated schedule which is indicated in appendix []. This also
led me to revisit the aims of the project and make revisions. This was done under the careful guidance
of my supervisor. During the progress meeting, the assessor also provided valuable suggestions on
how to go about the rest of the project. Accordingly, some of the sub-tasks to be done to realise the
goal had to be dropped due to time constraints, and the focus hence shifted to writing up the tasks which
had been completed till then and to extending these sub-tasks in ways that were not constrained by the time
available.
Chapter 2
Background
2.1 Visual Attention
When we observe things around us, not all the visual stimuli captured by the eye are processed consciously in the brain. This is because the human brain has limited capacity and it cannot process all
the information that falls on the retina fast enough to elicit a timely response. The following experiment from [2] provides a better understanding of the problem of limited processing capacity of the
brain. Here, a human participant was instructed to identify the letters printed in a particular colour,
say, black, from an array of letters. The array contained letters printed in either black or white. This
array was flashed on the display for a very brief time interval. Figure 2.1 shows the display screens
for two such experimental trials. It was observed that the subject could pick out the black letter ‘N’
in condition b) faster than in condition a). The observation can be explained as follows. The black
letters are the target objects to be identified and the white objects are the non-targets. In b) there is
only one target object and it seems to stand out in the array. Hence the subject can easily identify it.
In a) however, there are three target objects. When attention is deployed to one of the target objects
in the scene, less attention is available to other target objects in the scene [2]. Thus if we increase the
number of target objects, the subject finds it more difficult to pick them out from the array.
The array of objects in the scenario discussed above is very simple compared to the visual scenes
we see around us. The visual scenes around us are complex, cluttered with noise and have many different types of objects. These information-rich scenes, combined with the limited processing capacity
of the brain, demand a mechanism by which only the relevant parts of the visual information are selected for further processing in the higher areas of the brain. This is where visual attention comes in.
Visual attention is the mechanism which enables the selection of the relevant visual information in
Figure 2.1: Experiment demonstrating limited processing capacity of brain (Image from [2])
order to generate an appropriate response. As suggested in [1], often, only a very small portion of all
the available information is found to be relevant to the task assigned.
2.1.1 The Visual Pathways
The visual information that enters the eye is sent to the brain for processing through the optic nerve.
Figure 2.2 shows the path from the eye to the brain.
Figure 2.2: Pathway from the eye to the brain (Image from [3])
The eye sends the optical signals to the Lateral Geniculate Nucleus (LGN) which in turn sends all
the information to the visual cortex where all the processing takes place.
Most studies on visual attention propose that processing in the visual cortex progresses along two
pathways.
• The ventral stream
This pathway is concerned with processing shape information. It starts at area V1 and ends at
the Anterior Inferotemporal Cortex (AIT) with intermediate areas V2, V4, and Posterior Inferotemporal Cortex (PIT) in that order. The receptive field sizes and complexity of the stimulus
to which the cells respond increase as we go from V1 to AIT. Hence the cells in V1 respond
to simple features like lines of a particular orientation and those in AIT respond to complex
shapes.
• The dorsal stream
It processes information on position of stimuli to control eye movements. It starts at V1 and
ends at the parietal cortex.
These two visual pathways are shown in figure 2.3.
Figure 2.3: The Dorsal and Ventral Streams (Image from [4])
2.1.2 Bottom-up and Top-down Visual Processing
Desimone and Duncan give a basic understanding of the two types of processing in [2]. Bottom-up
visual processing is task-independent and automatic. When a stimulus is present in the visual field,
neurons in the different visual areas are activated. Bottom-up visual processing depends heavily on the
saliency of the stimulus. If the stimulus is something that has never been encountered before or if it
stands out from the surroundings, it immediately captures our attention. This pop-out effect is often
explained by bottom-up visual processing. However, bottom-up processing alone is not sufficient for
the survival of the human species. Hence we have top-down visual processing.
Top-down visual processing is task-dependent and it creates the attentional template. Here, attention is used to enhance processing in particular locations of interest and produce an appropriate
response according to the task. Top-down processing can be categorised according to the task involved. Most often, top-down and bottom-up processing occur concurrently and there is
an interaction between the top-down and bottom-up activations in the visual hierarchy to produce a
response.
2.2 Attention as a Performance Enhancement Mechanism
Most studies regard attention as a performance enhancement mechanism. When visual stimuli are
presented, the neurons in the visual areas fire according to their selectivity to the stimuli. The idea
of attention as a performance enhancement mechanism suggests that focusing attention on a stimulus
enhances performance of the neuron by, for instance, a change in its activation levels or firing rates.
This means that when we deploy attention towards a particular stimulus, it receives increased processing by the brain. Attention as a performance enhancement mechanism can be broadly categorised into
two types on the basis of the criteria for directing attention i.e. the task involved. These two categories
are spatial attention and non-spatial attention.
2.2.1 Spatial Attention
Spatial attention or location-based attention is the mechanism used when we attend to particular locations of interest in the visual scene. Here the location where the attention has to be directed is known
already and enhanced processing by the brain occurs at that location.
Since location is predefined in spatial attention, it is widely suggested that feedback activation
from upper visual areas is sent back to only those lower visual areas corresponding to that location
(for example in [2]). This means that psychological or behavioural measures such as the reaction times
to tasks involving spatial attention are relatively low. There could also be higher baseline activation
for the same reason. As indicated in [2], a study by Luck et al [5] of the macaque brain observed that
when a cue regarding the location was given to the monkey, there was already an increased level of
neural activity even before the display of the final target array.
One famous study to demonstrate spatial attention was given by Moran and Desimone in [6], using
a single-cell recording experiment to record responses of cells in V4 and IT (Inferotemporal) cortex.
After choosing a neuron whose activity was to be recorded and calculating the size of its receptive field
(the receptive field of a neuron is the part of the visual field which affects its activity), an effective
stimulus and an ineffective stimulus were identified for the neuron. An effective stimulus causes the
neuron to fire vigorously while an ineffective stimulus produces very little response in the neuron.
In the experiment, a monkey was trained to do a simple match-to-sample task. Two stimuli were
displayed in the visual field and the monkey had to focus on the stimuli appearing at one location and
ignore the other. In order to perform the task, the monkey initially had to fixate on a spot in the visual
field. A ‘sample stimulus’ was then displayed at one of the two locations. This is an indication of the
location to which the monkey had to attend. After a delay period, another stimulus was presented at
the same location and the monkey had to respond by releasing a bar when the stimulus matched the
sample stimulus. The monkey was rewarded when it gave the correct response. Figure 2.4 gives a
pictorial description of the task.
Figure 2.4: Experiment demonstrating deployment of Spatial Attention (Image from [6])
In the case where both the effective and ineffective stimuli were present in the receptive field of
the recording neuron (condition A in figure 2.4 ), the following observation was made. When spatial
attention was deployed to the location containing the effective stimulus, the response was increased.
On the other hand, when the monkey attended to the ineffective stimulus, the response was reduced in
spite of the presence of the effective stimulus in the receptive field. Thus, the response was dependent
on whether the target stimulus was effective or ineffective. Moran and Desimone in [6] describe that
“the receptive field has contracted around the attended stimulus”. This phenomenon is called task-biased competition. The competition between the different stimuli for the receptive field is biased in
favour of the stimulus relevant to the task [2]. Task-biased competition is however, not specific to
spatial attention alone. It applies to non-spatial attention as well. In the context of spatial attention
however, the knowledge of location to be attended in advance biases the attention in favour of the
stimulus appearing at that location.
In condition B in figure 2.4, the effective stimulus was placed in the receptive field and the ineffective stimulus was placed outside. Here, regardless of whether the attention was directed to the
stimulus in the receptive field or not, the cell gave a good response. This means that there was no attentional modulation when one of the stimuli was outside the receptive field. This, again, is evidence
for task-biased competition. Here, since only one stimulus is within the receptive field, there is no
competition involved and hence, no attentional modulation [2]. This means that objects outside the
receptive field do not carry any significance when attending to something. This also underlines the
need to bring the object into the receptive field to focus attention, in other words, the need to foveate.
When attention is directed outside the receptive field, the response is probably due to bottom-up
processing.
2.2.2 Non-spatial Attention
Non-spatial attention refers to all forms of attention which come into play when we focus on other
aspects or features of the visual scene rather than a particular location of interest. These can be
coarsely divided into two:
• Feature-based attention
It is the mechanism involved when we try to attend to particular features in the visual stimuli
and it is independent of the location of the stimuli. Feature-based attention is most often the
basis for a perceptual task that is used in experiments - the visual search task.
• Object-based attention
It is used when the focus is on particular objects in the visual scene. It is however not possible
to distinguish clearly between feature-based and object-based attention. An object can be considered as being made up of a combination of many features. Hence, when directing attention to
an object, we are indirectly focusing on a set of different features. Thus it would not be wrong
to say that object-based attention is a form of feature-based attention.
In non-spatial attention, the location to which attention is to be directed is unknown. This means
that feedback activations from the upper visual areas should be fed back to the lower visual areas
corresponding to all locations i.e. non-spatial attention is pervasive. A feature matching procedure is
then performed at all locations. The object or feature of interest is then in that part of the visual field
which contains the maximum number of matching activations. This is how the location information
is obtained in non-spatial attention. This location information is then fed to the dorsal stream and the
necessary behavioural response is generated.
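
A minimal MATLAB sketch of this feature-matching idea is given below. The map sizes, the random bottom-up activations and the template vector are all assumptions introduced for illustration; they are not taken from the model in [1]:

% Hypothetical retinotopic feature maps: rows x columns x feature channels
bottomUp = rand(8, 8, 3);            % bottom-up activation at every location
template = [1 0 1];                  % top-down attentional template (cued features)
% Pervasive feedback: match the template against every location
weighted = bsxfun(@times, bottomUp, reshape(template, 1, 1, []));
match = sum(weighted, 3);            % matching activations summed per location
[~, idx] = max(match(:));            % the best-matching location 'wins'
[row, col] = ind2sub(size(match), idx);
fprintf('target location: row %d, column %d\n', row, col);
% This recovered location is what would be passed on to the dorsal stream.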
Two experiments which demonstrate non-spatial attention are described here. One experiment to
describe visual attention in monkeys was done by Chelazzi et al in [7]. Here, the monkey was required
to perform a visual search task. The monkey had to attend to a cue object on a display screen for a
few milliseconds, followed by a delay period when the screen was cleared. After the delay period, the
monkey was presented with an array of objects. The monkey was rewarded for making eye movements
(saccades) to the target (cue) object i.e. for identifying the cue object displayed earlier from among
the array of objects. Neuronal firing rates were recorded using electrodes inserted into the monkey
brain. Figure 2.5 gives a rough idea of the visual search task the monkey performs.
During the display of the cue, cells of the inferior temporal cortex selective to the cue are activated
and this activation is maintained throughout the delay period although reduced in strength. When the
Figure 2.5: The Visual Search Task (Image from [1])
array of objects is displayed, the cells selective to each of these stimuli are initially activated, but
ultimately, the activation corresponding to the target object is high while those of the ‘distractor’
objects is gradually reduced. Figure 2.6 is a pictorial representation of this.
Figure 2.6: Activations in Inferior Temporal Cortex during Visual Search (Image from [7])
Prior to the saccade to the target object, the neurons selective to the target object and those selective
to the non-target objects compete for attention until those selective to the target object ‘win’. The type
of attention deployed in the experiment is ‘object-based attention’ since the knowledge of the cue
object is essential to focus attention.
Feature-based attention is described in studies, conducted in [8], [9] and [10], of the responses of
MT neurons in the dorsal pathway. MT neurons are concerned with the perception of motion. An
experiment was conducted on monkeys to study feature-based attention, as follows. The monkey
was presented with two random dot patterns: one inside the receptive field of the neuron being
monitored, moving in the preferred direction of the neuron, and the other outside the receptive field,
moving in either the preferred or anti-preferred direction of the neuron. The monkey was asked to
direct attention to the stimulus outside the receptive field. It was observed that the response of the
neuron increased when the direction of motion of the attended dot pattern (i.e. the one outside the
receptive field) was closer to the preferred direction of the neuron. In other words, the gain in the
response of the neuron is greatest when the attended feature (direction of motion) is similar to the
preferred direction of motion of the neuron, and least when the attended stimulus moves in the
anti-preferred direction. The same trend was found in the neuronal responses even when the animal
was fixating and not attending to either of the two stimuli. This is the idea put forth by the ‘feature
similarity gain model’ in [8] and [9]. Figure 2.7 shows the experiment and the neuronal response
observed during the experiment.
Figure 2.7: The feature similarity gain model (Image from [10])
2.3 Attention as a Representational Mechanism
It is suggested in [1] that the brain stores visual information using a ‘compositional representation’ i.e.
different features are represented and processed by different sub-areas of the visual cortex. The entire
visual scene can then be represented as a composition or combination of the representations of the
different component features. As has already been discussed, the human brain has limited processing
capacity and hence cannot process all the visual information which enters the eye. The compositional
representation is therefore an advantage because it saves space and optimises the processing capacity
of the brain. This representation also provides for ‘systematicity’ and ‘combinatorial productivity’
as mentioned in [1]. If we consider separate classes to represent different features, an object can be
represented as a combination of its constituent features to form an overall feature representation. This
property is called productivity. The notion of systematicity allows us to obtain the constituent features
from the overall compositional representation. This property helps us to classify new objects into one
of the known classes and also helps us identify similarities between objects belonging to the same
class and differences between objects of different classes.
The alternative to a compositional representation is a ‘conjunctive representation’. Consider that
we have objects which are defined by two main features - shape and colour. Then we can define
objects as ‘red circle’, ‘blue circle’ and ‘red triangle’. If we were to use a conjunctive representation,
then we would have to create separate representations for each combination of these two features.
Here, we consider only objects with two simple features. In real life however, a visual scene has
many objects with many features. The total number of possible feature combinations in the scene
is then very large and this would lead to a shortage of space for the representations. Besides, since
the conjunctive representation does not have the property of systematicity it would not be possible to
separate the representation into its constituent features. Thus we cannot identify similarities between
objects belonging to the same class. Since there are separate conjunctive representations for a red
circle and a blue circle, we cannot classify them as both belonging to the same class (i.e. circles).
Another problem of this representation is that novel objects cannot be classified. Due to the reasons
mentioned above, it is not likely for the human brain to use a conjunctive representation to represent
objects.
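
The difference in storage cost can be made explicit with a small back-of-the-envelope calculation. The feature counts in the sketch below are arbitrary illustrative numbers, not figures from the literature:

% Assumed feature dimensions, for illustration only
nColours = 4; nShapes = 4; nOrientations = 8; nTextures = 10;
% Conjunctive: one dedicated representation per feature combination
conjunctive = nColours * nShapes * nOrientations * nTextures;   % 1280
% Compositional: one pool of selective neurons per feature value
compositional = nColours + nShapes + nOrientations + nTextures; % 26
fprintf('conjunctive: %d units, compositional: %d pools\n', ...
        conjunctive, compositional);

The conjunctive count grows multiplicatively with every added feature dimension, while the compositional count grows only additively.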
The compositional representation, although advantageous, does have problems, as mentioned in
Chapter 1. The binding problem is one of the serious issues caused by the compositional representation. Several mechanisms have been suggested to solve this problem. One of them is to use the
synchronous activity of the neurons. According to this theory, it is considered that neurons representing different features of the same object fire synchronously. Hence if we have a blue square and a
red circle in the visual field, the neurons selective for ‘blue’ and ‘square’, representing the first object fire synchronously. Similarly, the neurons selective for ‘red’ and ‘circle’ representing the second
object fire synchronously, but not with neurons representing ‘blue’ and ‘square’. The synchronicity
mechanism does not take into consideration the attentional modulation of neuronal responses.
Another approach to solving the binding problem is to use the retinotopic location to bind features
of the same object together. The lower visual areas such as V1 and V2 contain information about the
location of the target object or feature. This information is however, lost as we go up in the visual
hierarchy. Hence this location information is not present in higher visual areas such as V4 and AIT.
In order to obtain this location information, a feature-matching procedure will have to be invoked to
match the bottom-up activation in the hierarchy with the top-down activation. This will then provide
the exact location of the target stimulus. According to this theory, attention becomes a fundamental
representation mechanism instead of a performance enhancement mechanism. Here, spatial attention
is used to bind together, properties of a single object represented using a distributed, compositional
representation. This theory is supported by the Feature Integration Theory given by Treisman et al in
[11]. The feature integration theory suggests that attention provides the basis to integrate the different
features of the same object. It suggests that the features in the visual scenes are recognized in the
early stages of visual processing by bottom-up processing. However, the features are bound together
only when attention is focused.
In contrast to the Feature Integration Theory, the model in [1] is pervasive, since it sends feedback
activations to all locations. The Feature Integration Theory sends feedback activations to only one
single location and thus supports a spatial attention approach more than a feature-based attention approach.
By conducting the experiment and looking at the EEG data obtained from it, we aim to provide
evidence for this pervasive, feature-based attention and binding by retinotopic location. The analysis
of EEG data can confirm the hypothesis if we can get evidence for the information flow suggested by
the model from it.
Chapter 3
Pre-processing and Analysis
3.1 The Experiment
The experiment that was conducted at the School of Psychology is similar to the Chelazzi experiment
in [7]. There were 11 human participants in the experiment and each participant had to perform a
visual search task. Initially, the participant was asked to fixate on a display screen where a cue object
was displayed after a few milliseconds. This cue object was an indication of the task to be performed.
For example, a cue object would be a combination of an object (a circle, square, triangle or diamond
of a particular colour) and a letter (‘C’, ‘S’ or ‘L’). The objects used could be any of four different
colours - red, green, yellow or blue. An object with ‘C’ or ‘S’ used as a cue indicates to the subject
to look for an object with the same colour (C) or shape (S) as the object presented in the cue. The
letter ‘L’ appeared along with an arrow indicating to the participant to direct his eyes to that particular
location. The arrow could point in one of four different directions - upper left, upper right, lower left
or lower right.
The task to be performed was varied based on three different conditions - colour, shape or location.
The cue object was then cleared from the display and this was followed by a delay period. The delay
period was varied between 0.5 and 1.5 seconds. After the delay period, the array of objects was
displayed on the screen and the target object had to be chosen from this array by making a saccade
to it. The number of objects in the array was also varied in each task. The array could have 1, 2,
3 or 4 objects. This leads to a total of 12 different conditions - 3 conditions based on task type x 4
conditions based on the number of objects in the final object array. A single block was composed of 120
such trials and each subject had to go through 5 such blocks. The conditions for the tasks were varied
randomly and not according to any predefined order. During each trial, EEG signals were recorded
from the subjects using electrodes placed on the scalp. In addition, eye tracker data was also recorded
to obtain eye movements of the subjects. Figure 3.1(a) depicts the task and figure 3.1(b) shows the
experimental setup.
Figure 3.1: The experiment ((a) the task, (b) the setup)
In the task in figure 3.1(a) for example, the cue object consisted of a red triangle and the letter
‘C’. This meant that the subject had to select a target object of red colour from the array irrespective
of the shape and location of the target object. Thus the subject had to make an eye movement to the
red circle in the upper left corner in this trial.
The main idea of the experiment is to verify the hypothesis of binding by retinotopic location as
suggested by the model in [1]. If this hypothesis is correct, then we should see evidence for it in
the EEG signals recorded during the experiment. The model, based on the same hypothesis, has been
designed to simulate this experiment and the information flow indicated in the model will then be the
same as that indicated by the EEG data. By analysis of the EEG data, we can thus validate both the
hypothesis and the model. Besides, we can also verify whether there are any significant differences
between the three different conditions of the task - colour, shape and location. In particular, we can
look for differences between brain activity during spatial and non-spatial attention. This is because
the location condition (‘L’) basically demonstrates spatial attention while the other two conditions
demonstrate non-spatial attention.
3.2 Electroencephalography (EEG)
Electroencephalography or EEG is an electrophysiological method used to record the electrical activity of neurons in our brain. Most early studies on the brain were done using invasive techniques
(single or multi-cell recordings) of the monkey brain since it bears a striking resemblance with the
human brain. Here, the electrodes are inserted into the brain directly and the neuronal responses of
a single cell or a group of cells are recorded. EEG, on the other hand, is a non-invasive technique
in which the electrical activity is recorded by using electrodes placed on the scalp. When measuring
the electrical activity by recording the EEG signal, we actually obtain the activity of a population of
neurons rather than that of a single neuron in the brain. As suggested in [12], the activity detected at
one electrode may not be solely caused by neurons near the electrode. It could also have contributions from neurons far off from the electrode. EEG is especially useful in neuroscience experiments
because, when tasks like visual search are given to a subject, the changes in the brain activity of the
subject can be recorded. This can help us identify the regions of the brain which play important roles
in cognitive and visual processing. This is what we intend to achieve with the experiment and the
analysis of results obtained from it.
3.2.1 Event-Related Potentials and Evoked Potentials
Event-related potentials (ERP) are means of measuring responses in the brain to various stimuli and
events. They are averaged measures. Not all of the activity indicated by the EEG recording is however
related to the stimulus or the task presented. Some of the activity is due to background processes in
the body, for example, due to movement of muscles. Each of these processes contributes to the peaks
and troughs that are observed in the ERP waveform.
As [12] suggests, the important components of the ERP that are relevant to the task can be identified by analysis. The ERP components can be classified into two types - exogenous and endogenous.
Exogenous components are those which appear to arise from the physical stimuli i.e. from an external
source. The exogenous components are also called Evoked Potentials. Endogenous components on
the other hand are processes related to the task occurring within the brain. This could include cognitive processes such as memory, thought and emotion. When analysing the ERPs, the emphasis is not
so much on the peaks and their positivity or negativity, instead, the timing, frequency and amplitude
of the peaks are more important [12] since we are measuring activity relative to events occurring at
certain points in time.
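
Since ERPs are averaged measures, they are typically obtained by averaging the epoched EEG over trials. A minimal sketch in EEGLAB/MATLAB, assuming an already-epoched dataset EEG and that a channel labelled 'PZ' exists:

% EEG.data for an epoched dataset is channels x timepoints x trials
erp = mean(EEG.data, 3);                          % average across trials -> ERP
chan = find(strcmp({EEG.chanlocs.labels}, 'PZ')); % pick one channel by label
plot(EEG.times, erp(chan, :));                    % EEG.times is in milliseconds
xlabel('Time (ms)'); ylabel('Amplitude (\muV)');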
In order to record electrical activity from different regions in the brain, we place electrodes evenly
across the scalp, generally, according to the widely accepted 10-20 system of electrode placement.
Figure 3.2 shows how electrodes are arranged on the scalp according to the system.

Figure 3.2: Electrodes arranged according to the 10-20 system of electrode placement
The electrodes are named based on their location on the scalp. Each electrode name is a combination of one or more letters and a number. The electrodes on the left hemisphere of the brain
are numbered using odd numbers and those on the right hemisphere are numbered using even numbers. Those electrodes which are situated along the middle line are indicated by the letter ‘z’. Those
which occur in the frontal lobe, parietal lobe, occipital lobe, temporal lobe and central region take ‘F’,
‘P’, ‘O’, ‘T’ and ‘C’ respectively in their names. Thus the electrode ‘Cz’ in figure 3.2 indicates the
electrode located in the central region of the brain along the middle line between the two hemispheres.
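
The naming convention can be decoded mechanically. The small MATLAB sketch below (the label list is an arbitrary example) recovers the hemisphere from an electrode name:

labels = {'Cz', 'F3', 'P4', 'O1', 'T8'};   % example electrode names
for i = 1:numel(labels)
    lab = labels{i};
    digits = lab(isstrprop(lab, 'digit'));
    if isempty(digits)
        side = 'midline';                  % 'z' electrodes sit on the midline
    elseif mod(str2double(digits), 2) == 1
        side = 'left hemisphere';          % odd numbers: left
    else
        side = 'right hemisphere';         % even numbers: right
    end
    fprintf('%s lies on the %s\n', lab, side);
end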
The electrical activity at any electrode is actually measured as a difference of voltages, or electrode
potentials. The entire EEG recording is therefore represented as a set of channel signals. The method
by which a channel signal is derived is called a montage, and there are three main ways to do this (a
short sketch after the list illustrates all three):
• Bipolar montage : A channel signal is obtained by calculating the difference in voltage between
two adjacent electrodes.
• Referential montage : One of the electrodes is chosen as a reference electrode and the channel
signal for any electrode is calculated as a difference between the particular electrode and the
reference electrode.
• Average reference montage : An average of the electrical activity of all the electrodes is calculated and this is chosen as the reference value. The channel signal for each electrode is then
measured with respect to this value.
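
All three montages amount to simple arithmetic on the raw electrode potentials. A sketch in plain MATLAB, where the simulated data matrix and the reference index are assumptions:

data = randn(64, 1000);   % simulated potentials: 64 channels x 1000 samples
refIdx = 10;              % assumed index of the reference electrode
% Referential montage: every channel relative to one chosen electrode
referential = data - repmat(data(refIdx, :), size(data, 1), 1);
% Average reference montage: every channel relative to the mean of all channels
avgReference = data - repmat(mean(data, 1), size(data, 1), 1);
% Bipolar montage: differences between (here, index-) adjacent channels
bipolar = data(1:end-1, :) - data(2:end, :);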
3.3 The Dataset
The dataset from the experiment was obtained from Melanie Burke, Claudia Gonzalez and Jean-François Delvenne at the School Of Psychology, and David G. Harrison of the Biosystems group at
the University of Leeds. It contained the EEG recordings obtained from each participant in .cnt files.
The EEG recording was done using a referential montage with electrode ‘Cz’ chosen as the common reference. The .cnt file contains information about the electrode locations, electrode potentials
recorded in the time period for each electrode and the exact time of occurrence and type of each event
in this time period [13]. In this experiment, there are two types of events - the presentation of the cue
(event ‘5’) and the presentation of the object array (event ‘6’). The time period of one complete run
of the experiment during which these two events take place is called a trial. Each block had the signal
recordings for 120 such trials and these trials belonged to any one of the 12 conditions. These 120
trials were recorded without any break in between. The .cnt file contained data as continuous values,
recorded for one entire block. There were five such blocks and hence there were 5 .cnt files per subject. One of these 5 blocks had 128 trials, of which 8 were practice trials to get the subject used to the
experimental paradigm and these had to be removed. Each .cnt file was around 140 MB. There were
11 subjects in total and hence 55 .cnt files. There were thus nearly 8 GB of data to pre-process.
3.4 Pre-processing
The EEG signals in a .cnt file were in such a form that the signals for all the trials in a block were
concatenated together to form a single stretch of signal. The data at this point is very noisy and
contains a lot of irrelevant information which requires removing. This is what we intend to achieve
by pre-processing. Thus the significant portions of the EEG signal for the individual trials had to be
identified and extracted for further analysis. The pre-processing was performed using the MATLAB
tool, EEGLAB with the help of the tutorial in [14] and with the kind assistance from David G. Harrison
of the Biosystems Group. David did the EEG pre-processing for 5 subjects while I did the pre-processing for 6 subjects. The pre-processing stage includes various tasks such as removing the
neuronal responses created due to muscle artifacts and other background processes from the neuronal
recording (i.e. baseline removal) and handling of missing data.
Pre-processing of the data turned out to take up a lot of time during the course of the project since
the right parameters to be used had to be found out more or less experimentally. This stage alone had
to be performed three times since there were problems in identifying the relevant information and the
right parameters. A detailed explanation for the process is given below.
3.4.1 Stages of Pre-processing
Loading/Importing Data
The .cnt file is loaded into EEGLAB to perform the pre-processing. Here, we are prompted to make a
choice of the bit representation to be used for the EEG data. We can choose between a 16-bit and a 32-bit
representation. Initially, it was assumed that the bit representation affects only the precision of storage
of the EEG data. However, problems were noticed due to the wrong choice of bit representation as
will be explained in forthcoming sections.
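
In EEGLAB this step corresponds to pop_loadcnt, whose 'dataformat' option selects the bit representation; the file name below is an assumed placeholder:

% Load one block of one subject; 'int16' or 'int32' selects the bit representation
EEG = pop_loadcnt('subject01_block1.cnt', 'dataformat', 'int16');
EEG = eeg_checkset(EEG);   % let EEGLAB verify the dataset's consistency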
Re-sampling of data
The data may have been sampled at a high rate when recording was done. However, not all of the data
collected is always necessary and hence we may have to re-sample the data to a lower rate. Sampling
data at a high rate when recording is advantageous and guarantees low risk since we are collecting as
much data as possible. If this huge amount of data is not deemed necessary we can re-sample it any
time to a suitable lower rate. Lowering the sampling rate reduces the storage space required for the
files.
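
In EEGLAB, re-sampling is a one-line operation; the target rate of 250 Hz below matches the rate eventually used in iteration 2:

% Down-sample the continuous recording, e.g. from 1000 Hz to 250 Hz
EEG = pop_resample(EEG, 250);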
Baseline Removal
The neural activity indicated by the EEG not only includes the task-related activity but also, the
activity due to background processes in our body. The normal functioning of the body produces
a particular level of neural activity. This can be classified as baseline activity. To obtain a proper
measure of the task-related activity in the brain, we have to remove the baseline activity from the
EEG signal. This is what is achieved in this phase. If it is not removed, we cannot differentiate
between the normal activity in the brain and the activity due to the task. One can also choose the
baseline as activity during any particular period during the task. The neural activity is then calculated
relative to this baseline activity.
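
EEGLAB's pop_rmbase covers both variants just described, as sketched here; the window values are illustrative:

% For continuous data: remove each channel's mean over the whole recording
EEG = pop_rmbase(EEG, [], [1 EEG.pnts]);
% For epoched data one would instead subtract a pre-stimulus window (in ms):
% EEG = pop_rmbase(EEG, [-200 0]);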
Re-referencing the channel data
The EEG data may have been recorded using one particular electrode montage. For example, a referential montage may be used with one particular electrode, say ‘Cz’, as the reference. It is however,
possible to convert this recorded EEG data such that it is based upon a different electrode as reference, to provide for easier analysis. It is also possible to change the electrode montage. For example,
if a referential montage was used during recording, it can be changed to an average reference montage.
This would, however, not alter the data since, as suggested in [14], re-referencing the data only leads
to a simple linear transformation of the data.
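
A sketch of re-referencing with EEGLAB's pop_reref, converting to an average reference while excluding the eye channels; the label-based lookup is an assumption about how the channels are named:

% Indices of the eye-movement channels, found by label
eyeChans = find(ismember({EEG.chanlocs.labels}, {'HEOG', 'VEOG'}));
% An empty second argument means 'average reference'; exclude the eye channels
EEG = pop_reref(EEG, [], 'exclude', eyeChans);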
Filtering
The EEG data may contain artifacts due to activity in the muscles, cardiac activity, movement of
eyelids, and blinking of eyes. Blinking of eyes is one major source of artifacts in the EEG data. One
way to avoid this is to instruct the subjects not to blink, but this would not be practical, especially if
several trials of the task are done continuously over a long period of time. Besides, as indicated in
[12], it takes the concentration of the subject off the task. Hence it is easier to filter out these
artifacts later on. Another source of interference is artifacts from the power lines or in the connection wires of
the electrodes or the electrodes themselves.
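
A high-pass filter at 1 Hz, as used in the iterations below, removes such slow drifts; a sketch using EEGLAB's pop_eegfilt (later EEGLAB versions prefer pop_eegfiltnew):

% High-pass filter: locutoff = 1 Hz, hicutoff = 0 (no low-pass)
EEG = pop_eegfilt(EEG, 1, 0);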
Extraction of Epochs
Here, we extract the time periods of interest from the continuous stretch of signal. An epoch can be
defined as the signal of a particular time interval during the entire period of the trial, when an event
of interest takes place. These periods or epochs can then be extracted for further analysis. In this
experiment, there are two events of interest - the cue onset and the target onset. The cue onset is
the time point when the cue appears on the screen and the target onset is the point when the array of
objects is displayed. Hence we extract the epochs which contain these two events. We intend to look
at the effects of these two events on the processing of the brain in these intervals.
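
With EEGLAB, epochs are cut around named events using pop_epoch. The window limits below are placeholders; the actual windows used in each iteration are given later:

% Epochs around the cue onset (event '5') and the target onset (event '6')
EEGcue    = pop_epoch(EEG, {'5'}, [-0.2 0.5]);  % window in seconds
EEGtarget = pop_epoch(EEG, {'6'}, [-0.5 0.5]);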
Selection of Epochs or Events
Once the epochs have been extracted, we get a concatenated set of epochs for the different trials in a
block. We may select specific epochs for categorising them into different classes. For instance, if we
consider this experiment, there are twelve different conditions or classes for each subject - based on
type of cue (C, S or L) and the number of objects in the array (1, 2, 3 or 4). Hence we have conditions
- C1, C2, C3, C4, S1, S2, S3, S4, L1 and so on. Out of the extracted epochs, we have to select the
epochs which belong to each of these conditions and group them together. Here, we also make sure
that only those epochs in which the subjects chose the correct target are selected for further analysis.
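
If each epoch carries an event marking its condition, EEGLAB's pop_selectevent can pull out one condition at a time; how the condition codes are actually encoded in these files is an assumption here:

% Keep only the epochs belonging to one condition (code 'C1' is an assumed label)
EEG_C1 = pop_selectevent(EEG, 'type', {'C1'}, 'deleteepochs', 'on');
% Epochs with incorrect responses can be dropped in the same way.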
3.4.2 Iteration 1 of Pre-processing
When the EEG data was recorded during the experiment, it was sampled at a rate of 1000 Hz. In
iteration 1 of pre-processing, this sampling rate of 1000 Hz was preserved. The .cnt file was loaded
using a 16-bit representation. The EEG measures were calculated by choosing the electrode ‘Cz’ as
the reference electrode and all slow artifacts below the frequency 1 Hz were removed using a high
pass filter. This could be used to remove the artifacts which occur due to tiredness or boredom in
subjects after a long period or functioning of sweat glands. The next step was to extract epochs for
the two events. Here, a single epoch was extracted which contained both the events. The cue onset
event was labelled ‘event 5’ and the target onset was labelled ‘event 6’. In order to obtain epochs, the
period from 1 second before event 5 (indicated as -1 seconds relative to event 5) till 3 seconds after
event 5 (indicated as 3 seconds relative to event 5) was extracted. Baseline activity was also removed
from this stretch of signal with the activity from -1 seconds to -0.5 seconds chosen as baseline. This
is the pre-processed data. When this was done, the epochs for the different conditions were selected
and grouped together. Figure 3.3(a) shows a plot of the EEG signal before pre-processing and figure
3.3(b) shows the signal after pre-processing.
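
Putting the iteration-1 parameters together, the whole pass can be sketched as follows; the file name is an assumed placeholder:

% Iteration 1, end to end
EEG = pop_loadcnt('subject01_block1.cnt', 'dataformat', 'int16'); % 16-bit load
% (sampling rate of 1000 Hz kept; data already referenced to 'Cz' at recording)
EEG = pop_eegfilt(EEG, 1, 0);               % high-pass: remove artifacts below 1 Hz
EEG = pop_epoch(EEG, {'5'}, [-1 3]);        % one epoch spanning both events 5 and 6
EEG = pop_rmbase(EEG, [-1000 -500]);        % baseline: -1 s to -0.5 s, given in ms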
The y-axis indicates the potential in microvolts at each of the 64 electrodes and the x-axis indicates the time in seconds. The signals show the activations recorded at each of the corresponding
64 electrodes. The red lines marked ‘5’ indicate the cue onset in the different trials and the events
‘6’ marked in green indicate the target onset. This entire procedure was repeated for each of the 11
Figure 3.3: Plots before and after pre-processing: (a) before pre-processing, (b) after pre-processing
This entire procedure was repeated for each of the 11 subjects (a total of 55 files). As can be seen from the figure, the signals appear as blobs and cannot be clearly distinguished; this could be because the sampling rate was very high. Thus, a second round of pre-processing was done.
3.4.3 Iteration 2 of Pre-processing
The second round of pre-processing was done after consultation with the EEG analysts at the School of Psychology. In this iteration, after loading the data using a 16-bit representation, it was re-sampled to 250 Hz. This made it easier to analyse, as indicated in figures 3.4(a) and 3.4(b): the signals were clearer and looked more like EEG signals, rather than the blobs seen in the first iteration.
The next step was to remove the baseline activity from each channel. Here, the mean value of each channel was chosen as the baseline and was subtracted from the rest of the channel signal. Once the baseline was removed from each channel, the electrode locations were loaded and the recorded activity was re-referenced. Here, we used the average reference montage to represent the EEG measure. When calculating the average, however, we excluded the electrodes HEOG and VEOG, which record activity related to eye movements, since it did not seem sensible to include them when all we were interested in studying was the activity in the brain areas. Next, we filtered out the noise with frequency below 1 Hz, which removed the slow artifacts just as in the first iteration.
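For reference, in the average reference montage each channel is simply re-expressed relative to the mean of the included channels: writing $\mathcal{C}$ for the set of the $N$ scalp channels (HEOG and VEOG excluded) and $v_i$ for the recorded value at channel $i$,

$$ v_i^{\mathrm{avg}} = v_i - \frac{1}{N} \sum_{j \in \mathcal{C}} v_j . $$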
Figure 3.4: Plots before and after re-sampling: (a) raw signal, (b) after resampling
After filtering was done, we extracted epochs for the two events. Here, we extracted two separate epochs for the two events, event 5 and event 6, instead of just one epoch as in the first iteration. Besides, the time intervals chosen for the epochs were also different. In order to obtain epochs for event 5, we extracted the time interval from -0.2 seconds to 0.5 seconds relative to event 5. We also removed baseline activity from this stretch of signal, with the activity from -0.2 seconds to event 5 (0 seconds relative to event 5) chosen as the baseline. This is the condition before the stimulus has come on and hence provides a reasonable measure of the normal activity in the body. Similarly, to obtain epochs for event 6, we extracted the signal from -0.5 seconds to 0.5 seconds relative to event 6. The baseline period chosen in this case was from -0.5 seconds to -0.3 seconds relative to event 6. This is the period when the cue has been shown to the subject. Part of the initial neural activity in this period is due to the effects of visual processing of the stimulus; most of the activity, however, can be attributed to the memory retention of the cue object. Thus it seemed a reasonable baseline period for the participant's task when the array comes on. After the extraction of epochs, we selected the epochs for the different conditions as before. Figures 3.5(a) and 3.5(b) show the pre-processed data after extraction of the two sets of epochs. From the plots of the extracted epochs, we can observe that some activations in the lower parts of the plots show huge deflections. These are due to the electrodes HEOG and VEOG, which record the horizontal and vertical movements of the eye, respectively. This indicates that the participant moved the eyes during
this period and that these eye artifacts have to be removed. These artifacts are more pronounced in
figure 3.5(b) which shows activity before and after event 6, the target onset. It is during this period,
specifically, after event 6, that the saccade to the target object takes place. Hence there is a huge
deflection after event 6.
Figure 3.5: Plots for pre-processed, extracted epochs: (a) epochs for event 5, (b) epochs for event 6
However, when the pre-processed data generated from the second round of pre-processing was evaluated, it was found to be faulty. The timing between the two events in the EEG data seemed to have been doubled when compared with the eye-tracker data for the same two events. This is explained in detail in the evaluation section in Chapter 5. This mismatch of time between the EEG data and the eye-tracker data was found to be due to a wrong choice of bit representation. Thus a third round of pre-processing was done to confirm this. However, due to limitations in the time available, the pre-processing could not be repeated for all the subjects; hence only the data for 2 subjects was pre-processed, to make sure the right technique was used.
3.4.4 Iteration 3 of Pre-processing
In the third iteration of the pre-processing, the bit representation used was changed to a 32-bit representation unlike the 16-bit representation used in the first two iterations. The raw signal now looked
more reasonable in terms of similarity to an EEG signal as shown in figure 3.6. Besides, the timing
between events in the eye-tracker data seemed to match the timing in the signal plot.
Figure 3.6: Raw Signal using a 32-bit representation
The data was then re-sampled to 250 Hz (figure 3.7(a)) and the baseline was removed as in the second iteration (figure 3.7(b)).
Next, we re-referenced the EEG measure to an average reference. As the re-sampling, baseline removal and re-referencing of the EEG data are carried out, some electrode activations which were not present in the earlier stage start appearing. This could probably only be explained with the guidance of the EEG analysts. Next, we filtered out the noise and artifacts with frequency less than 1 Hz (figure 3.8).
Then, we extracted the epochs for analysis from the filtered EEG signals. The time intervals and the baseline periods chosen for the events were the same as in the second iteration. After extracting the epochs for the two events, the data looked as shown in figures 3.9(a) and 3.9(b). Although the filtered EEG signals look more or less smooth, after baseline removal and extraction of epochs we can clearly see variations in the activity. As before, the lower parts of the plots show huge deflections in the electrodes HEOG and VEOG due to eye movements; these have to be removed.
Next, we select the epochs for each of the 12 conditions and group them together. Each of these steps has to be done on each block for each of the 11 subjects. However, due to limited time, this could be done only for two subjects.
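Put together, the third-iteration pipeline described above corresponds roughly to the following EEGLAB sketch; the file name is hypothetical and the option strings are assumptions based on standard EEGLAB usage.

    % Iteration 3 (sketch): 32-bit import, resample to 250 Hz, per-channel mean
    % baseline, average reference excluding the eye channels, 1 Hz high-pass,
    % then separate epochs around the cue (event 5) and target (event 6).
    EEG = pop_loadcnt('subject01.cnt', 'dataformat', 'int32');
    EEG = pop_resample(EEG, 250);
    EEG.data = EEG.data - repmat(mean(EEG.data, 2), 1, size(EEG.data, 2));
    eog = find(ismember({EEG.chanlocs.labels}, {'HEOG', 'VEOG'}));
    EEG = pop_reref(EEG, [], 'exclude', eog);  % average reference montage
    EEG = pop_eegfilt(EEG, 1, 0);              % remove slow artifacts < 1 Hz
    EEGcue = pop_rmbase(pop_epoch(EEG, {'5'}, [-0.2 0.5]), [-200 0]);
    EEGtgt = pop_rmbase(pop_epoch(EEG, {'6'}, [-0.5 0.5]), [-500 -300]);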
Figure 3.7: Plots for data after re-sampling and baseline removal: (a) Resampled data, (b) After removing baseline
Figure 3.8: Re-referenced and filtered data: (a) Re-referenced Data, (b) Filtered Data
Figure 3.9: After Epoch Extraction: (a) Epoch for event 5, (b) Epoch for event 6
3.5 Analysis
In order to perform analysis of the EEG data, we use a technique called Independent Component Analysis (ICA), which identifies the component activations that contribute most to an EEG signal and the most likely source electrode of each component. There is an in-built tool in EEGLAB which helps in performing ICA on the data.
3.5.1 Independent Component Analysis (ICA)
Independent Component Analysis, as the name suggests, is a technique used to identify the various components present in a signal. It can be applied to different types of signals, such as sound or speech signals, or electrical signals such as EEG signals. As discussed in [15], a sound or electrical signal is most often composed of several underlying signals from various sources; the signal is thus a mixture of these 'source signals'. ICA splits this single signal into the different source signals. In order to do this, it makes the assumption that the different source signals which form the observed signal are independent of each other, i.e. the value of one source signal bears no relation to the value of any other source signal at any given point in time. This is explained clearly in [15] using an example. Consider a speech signal obtained from the combination of two speech signals from two different people, within the same time frame. The value of the speech signal of one person at a point in time is independent of the value of the other person's speech signal. ICA uses this information to split the observed speech
signal into the two speech signals from the two independent sources. The same principle can be applied to the EEG signal. The EEG signal is essentially a mixture of signals from several neuronal populations. If these populations are considered to be independent sources, the EEG signal can be split to generate the source signals from each of these populations. Thus the important components and their sources can be identified. In EEGLAB, these components can be visualised as scalp maps or as ERPs.
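In the standard ICA formulation the recorded channels x are modelled as an unknown linear mixture x = As of independent sources s, and the algorithm estimates the unmixing matrix. In EEGLAB this reduces to a couple of calls; the sketch below is illustrative and its parameter strings are assumptions.

    % Fit the ICA decomposition and inspect the leading component scalp maps.
    EEG = pop_runica(EEG, 'icatype', 'runica'); % stores weights in EEG.icaweights
    pop_topoplot(EEG, 0, 1:9);                  % 0 = plot components, not channels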
As suggested in [12], the signal-to-noise ratio of an EEG waveform is very low. This is because the electrical activity produced by the brain is very small compared to the background noise. In order to reduce the amount of noise, we generate a grand average waveform for each condition across all subjects and then perform ICA on the resulting EEG signal. Thus, the data of the different subjects for each condition was appended to form a single set and ICA was performed on this. In the second iteration, this was done by considering all 11 subjects. However, in the third iteration, due to limited time, the grand average waveform was generated using two subjects before ICA was performed.
In order to generate the grand average waveform, the data for each of the twelve conditions, irrespective of the subject, was merged together. For instance, if we take condition C1, the epochs for all subjects and for all blocks were merged into a single set. This was done for all 12 conditions to generate 12 different data sets. The next step was to group all the 'C' conditions (i.e. C1, C2, C3, C4) into a single condition 'C'. Similar data sets were created for 'L' and 'S'. Thus we now have three data sets, one for each of the three main conditions. ICA can then be performed on these data sets.
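Assuming the per-condition epoch sets have already been loaded into EEGLAB's ALLEEG structure (the dataset indices below are hypothetical), the grouping can be sketched as:

    % Merge the datasets for C1..C4 (assumed to sit at ALLEEG(1:4)) into one
    % condition-'C' set; repeat with the relevant indices for 'S' and 'L'.
    EEGC = pop_mergeset(ALLEEG, 1:4, 0);   % third argument: keep-ICA flag
    EEGC = eeg_checkset(EEGC);             % sanity-check the merged dataset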
Once ICA has been performed, we can generate plots and scalp maps which compare the component activations for the different conditions. This can give us a rough idea about the sources of brain activity in the different conditions: colour, shape and location. For example, figure 3.10 shows the component activations for a visual search task where the subject is asked to look for a target object which has the same shape as the cue object, from an array of objects. These are the most prominent components in the average EEG signal for this condition. This, along with the scalp maps for these different components, can be analysed by EEG experts to identify the sources of each of these components. The scalp map for the first component is given in figure 3.11.
Figure 3.12 shows the channel activations for the same condition. In the figure, the VEOG channel shows a huge deflection; a zoomed-in picture of the VEOG electrode in figure 3.13 shows this. This indicates the presence of eye artifacts, which have to be removed.
Figure 3.14 shows a comparison of component activations for two different conditions, colour and shape.
By looking at the plots generated, we can verify whether they suggest a pathway of processing of signals in the brain as proposed by the neural model. This can thus help verify the binding-by-retinotopy hypothesis put forward by the model. Similarly, by looking at the plots for each of the three different conditions, we can analyse how the brain activity differs in each condition. The analysis of these plots will, however, have to be done by the EEG experts. At this stage, little can be concluded, since the plots generated are based on the averaged data of only two participants, which is most likely to be very noisy. Thus, a proper analysis can only be done after the pre-processed data for all 11 participants is
averaged.
Figure 3.10: Component Activations for target selection based on shape
Figure 3.11: Scalp map for first component
Figure 3.12: Channel Activations for target selection based on shape
Figure 3.13: Channel Activations for VEOG electrode
Figure 3.14: Comparison of activations for shape and colour conditions
Chapter 4
The Model
4.1 Overview
A neural model for visual processing based on feature-based attention was put forward in [1] to demonstrate how the mechanism of deployment of visual attention is actually implemented in the visual cortex. It shows, specifically, how information flows along the ventral stream during a visual search task much like the task in the Chelazzi experiment described in Chapter 2. This model proposes that the binding of the different features of an object is done on the basis of retinotopic location. The model was developed using MIIND [16], a neuroscience framework made by Dr. Marc de Kamps and David G. Harrison for implementing artificial neural networks. CLAMVis, a software simulator for the neural models, was also implemented by them, and this tool was used to visualise the information flow in the model. Figure 4.1 gives an idea of the information flow in the model.
The ventral stream in the model is represented as a hierarchy of 5 layers of visual areas - V1, V2, V4, PIT and AIT, in that order. The dorsal stream consists of the layers V1, V2, V4 and the parietal cortex (indicated as PG in the figure). The ventral stream is concerned with the function of object recognition. When the array of objects is displayed, the stimulus activations are communicated up the hierarchy from V1. As we go up the layers in the hierarchy, the complexity of the stimuli identified by the cells increases. Thus, whereas the V1 layer is selective only to simple lines of particular orientations, V2 responds to a set of two lines forming a slightly more complex feature. At the topmost level, i.e. AIT, the individual objects are recognised; here, the target object is identified from the array of objects. However, we are then faced with another problem. The lower layers in the ventral stream are retinotopically organised, but this property is lost as we go up the hierarchy, and hence AIT does not have position information for the target object. It provides a translation-invariant recognition of objects, i.e. AIT responds to the object it is selective for, irrespective of its position in the visual field. The position information of the target object is, however, necessary to make an eye movement (saccade) to the target. This information is obtained by the interaction of the bottom-up activation caused in the model by the stimulus and the top-down activation caused by the attentional template in the ventral stream. The position information can then be transferred to the LIP (lateral intraparietal) area of the parietal cortex, where the necessary actions to make the saccade are taken. In order to model these bottom-up and top-down activations, a feedforward and a feedback network are used, respectively.
Figure 4.1: The model (Image from [17])
4.2 The Feedforward Network
The feedforward network used to model the bottom-up processing is shown in figure 4.2. The bottom-up processing comes from the activations caused by the presence of the stimulus in the visual field. For example, if a square, a cross, a circle and a triangle are present in the visual field, the square causes activations in some V1 neurons, which leads to the activation of some V2 neurons and so on, until the AIT neuron selective for squares is activated. Similarly, each of the other three objects produces its own activations in the model. The receptive field size increases as we go up the feedforward network. This is modelled by having one V2 neuron see a 2x2 submatrix of V1 neurons, a V4 neuron see a 4x4 submatrix of V1 neurons, and so on, as indicated in [17].
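A toy MATLAB illustration of this growth in receptive field size is given below; the pooling rule and map sizes are illustrative assumptions, not the MIIND implementation. Each V2 unit pools a 2x2 block of the V1 map, and each V4 unit a 4x4 block.

    % Toy receptive-field pooling over an 8x8 binary V1 activity map.
    V1 = double(rand(8) > 0.8);                % sparse random V1 activity
    pool = @(A, k) squeeze(max(max( ...        % max over each k-by-k block
        reshape(A, k, size(A,1)/k, k, size(A,2)/k), [], 1), [], 3));
    V2 = pool(V1, 2);                          % 4x4 map; 2x2 receptive fields
    V4 = pool(V1, 4);                          % 2x2 map; 4x4 receptive fields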
Figure 4.3 shows a simulation of the working of the feedforward network when four objects (i.e.
a square, a diamond, a cross and a diagonal cross) are presented on the display screen during a visual
search task. The simulation was generated using CLAMVis.
Figure 4.2: The Feedforward Network (Image from [18])
Figure 4.3: Simulation of The Feedforward Network
The four layers at the top of the figure are the V1 layers. There are four V1 layers, each of which is selective to lines of a particular orientation. For example, the first layer in the figure is selective to horizontal lines and the third layer is selective to vertical lines; the other two layers are selective to diagonal lines. The second layer from the top represents V2, the third layer represents V4, the fourth layer represents PIT and the last layer of 4 nodes represents the 4 AIT neurons selective to the four objects. The neurons with high positive activation (+1) are indicated in red colour and those with high
negative activation (-1) are indicated in green colour.
In the actual visual system, the information that enters the eye is taken via the optic nerve to the Lateral Geniculate Nucleus (LGN) and then to the visual cortex. However, the LGN is not part of the model. Thus, in order to indicate the presence of four stimuli or objects in the visual field, we activate the corresponding layers of V1. For instance, to indicate the presence of a square in the top-left corner of the display, we activate the neurons of layers 1 and 3 of V1 at their corresponding locations. Similarly, neurons to represent a diamond in the top-right corner, a cross in the bottom-right corner and a diagonal cross in the bottom-left corner are activated. As the simulation progresses, these neurons in turn activate other neurons in the layers V2, V4 and PIT. As a result of the stimulus activations due to the four objects, the four AIT neurons selective to these objects are activated, and hence they are indicated in red.
4.3 The Feedback Network
The feedback network is essentially the reverse of the feedforward network, i.e. it starts at AIT and ends at V1. There are feedback connections from the higher layers to the lower layers of the feedforward network; thus the connections in the feedback network have what is called a fan-out structure, as shown in figure 4.4. The purpose of the feedback network is to provide for the attentional template, to aid in choosing the target object from the array. The target object has been identified by the bottom-up activation in the feedforward network. By providing feedback connections, therefore, we are providing identity information about the target object to the lower visual areas like V1, which have a retinotopic organisation.
Figure 4.4: The Feedback Network (Image from [18])
Figure 4.5 indicates a simulation of the feedback network.
Since the structure of the feedback network is the inverse of that of the feedforward network, the first layer in figure 4.5 consists of the four AIT neurons. These are followed by PIT, V4, V2 and the four V1 layers.
Figure 4.5: Simulation of The Feedback Network
When the feedforward activations have been produced, the attentional template is indicated
in the feedback network by selecting the AIT neuron selective to the target object. This AIT neuron
is indicated in red in the first layer at the top of the image. As the simulation proceeds, this activation
produces activations in the layers PIT, V4, V2 and the four V1 layers. The activations in the other
layers indicated in red and green in the figure are thus due to the attentional template.
4.4 Dynamic Networks
As suggested in [18], modelling the nodes in the neural network as individual neurons is probably not biologically plausible. If the model is to explain how visual processing actually takes place in the brain, it should provide for quick processing and response, and low firing rates in the neurons, as observed in its biological counterparts. In accordance with this, we can consider a perceptron to represent the neural activity of a population of neurons rather than that of a single neuron [18]. This, however, may lead to spurious activity in the higher layers of the model, i.e. the higher layers may show activity even if a stimulus is not present.
Consider a node in a neural network. The output of a node is given by (4.1), where each $x_i$ is an input to the node and each $w_i$ is the weight of that connection. Here, the function $f$ is called the squashing function of the neural network; $f$ is usually the sigmoid function, given in (4.2). If this is replaced by a new squashing function of the same form, given in (4.3), spurious activity in the network can be done away with.

$$ \text{output} = f\Big( \sum_i w_i x_i - \theta \Big) \qquad (4.1) $$

$$ f(x) = \frac{1}{1 + e^{-\beta x}} \qquad (4.2) $$

$$ f(x) = \frac{2}{1 + e^{-\beta x}} - 1 \qquad (4.3) $$

Note that the new function satisfies $f(0) = 0$, so a node receiving no net input produces no output, whereas the standard sigmoid gives $f(0) = 1/2$ and hence non-zero activity even in the absence of a stimulus.
This new squashing function, however, causes a perceptron to produce negative activations at times. To solve this problem, we can replace each perceptron by a circuit, as shown in figure 4.6.
Figure 4.6: The Circuit (Image from [18])
The circuit is made up of excitatory populations P, N, Ep and En; Ip and In represent inhibitory populations. The connections between these populations are either inhibitory (indicated by black triangles) or excitatory (indicated by white triangles). An inhibitory connection tries to suppress the activity of the neuron to which it is connected, while an excitatory connection tries to raise its activity. If, for example, we consider the scenario where the input labelled Jp is active, the populations Ep and Ip are activated. This causes the population P to be activated, due to the excitatory connection from Ep to P, and the population N to be inhibited, due to the inhibitory connection from Ip to N. If there is no input present at Jn, this results in a net 'positive' activation from the entire circuit. Similarly, if input is present at Jn instead of Jp, it leads to a net 'negative' activation from N. The positive or negative activations, however, only serve to indicate the population which is dominant in the circuit and are basically represented as spike rates of populations. The output from the circuit at any point in time is either positive or negative, i.e. it cannot be both positive and negative at the same time.
Each of the nodes in the feedforward and feedback network is thus replaced by the circuit shown
in figure 4.6 and this produces what are called the dynamic feedforward and feedback networks.
4.5 The Disinhibition Network
The disinhibition network is where the interaction between the top-down activation due to the attentional template and the bottom-up activation due to the stimuli takes place. A detailed description of
the working of the disinhibition circuit is given in [19]. The disinhibition network is essentially not
an artificial neural network, but a simple circuit formed from the shaded neurons as shown in figure
4.7. The nodes labelled Pf and N f indicate the positive and negative activations from a single node
in the dynamic feedforward network and Pr and Nr represent the corresponding activations from the
feedback network. G p and Gn are gating nodes which inhibit the excitatory nodes E p and En . They
are in turn, however inhibited by the inhibitory nodes Ip and In . The nodes E p and En send their output
to the lateral intraparietal area (LIP).
Figure 4.7: The Disinhibition Network (Image from [19])
Consider the case when Pf is active. It drives the nodes Ep and Gp through the excitatory connections. Although Gp inhibits Ep, this inhibition comes into play only after some time, since there is an extra connection along this path. Thus Ep initially produces some activation and then dies down when it starts receiving inhibitory signals from Gp. The case when Nf becomes active is similar.
Now, consider the case where there is a match between the stimulus activation (bottom-up activation) and the attentional template (top-down activation) at a node. Then, either the nodes Pf and
Pr or the nodes Nf and Nr in the circuit will be active simultaneously. Take the case where Pf and Pr are active. Now, Pf drives Ep and Gp, which causes Ep to be active for a brief period of time before Gp inhibits its activity. However, Pr is now also active and drives the inhibitory node Ip, which in turn inhibits the gating node Gp. Thus Ep continues to be active and sends a positive activation to the LIP region. This is how feature-matching is implemented in the model. All those nodes which have matching top-down and bottom-up activity indicate the presence of matching features at that location. Thus the area which has the maximum number of matches is a strong indication of the location of the target object. Similar dynamics are observed in the circuit when Nf and Nr are active at the same time.
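The gating dynamics described above can be caricatured in a few lines of MATLAB. This is a toy discrete-time sketch with made-up update rules, not the circuit equations from [19]:

    % Toy gating: Ep is driven by the bottom-up input Pf but inhibited by the
    % gate Gp; the top-down input Pr drives Ip, which silences the gate.
    T = 20; Pf = 1; Pr = 1;                    % set Pr = 0 to model a non-match
    Ep = zeros(1,T); Gp = zeros(1,T); Ip = zeros(1,T);
    for t = 2:T
        Ip(t) = Pr;                            % feedback drives inhibition
        Gp(t) = max(Pf - Ip(t-1), 0);          % gate, suppressed by Ip
        Ep(t) = max(Pf - Gp(t-1), 0);          % output, inhibited by the gate
    end
    % With Pr = 1 the gate is silenced and Ep remains active after a brief
    % transient (a match); with Pr = 0 the gate ramps up and Ep dies away
    % after an initial burst, as described above.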
4.6 Lateral Inhibition
The phenomenon of task-biased competition, as explained in Chapter 2, is implemented in the model using lateral inhibition. The node labelled LI in figure 4.7 serves to provide lateral inhibition. When either Pf and Nr or Nf and Pr become active together at a node, we have a mismatch of features at that particular location or node. If, say, Pf and Nr are active together, the activation in Pf makes Ep active for a brief period, but after a while the inhibition from Gp makes it inactive. At the same time, the activation of Nr makes LI active, which in turn sends inhibitory signals to the neighbouring Ep and En nodes. Thus, in the case of a mismatch at a node, lateral inhibition ensures that the nodes in the vicinity of the mismatched node are made less active. This is done since the absence of a feature match at a node is an indication that the neighbouring nodes will also be different.
If the LI node were present alone, a feature match would also lead to lateral inhibition of the neighbouring nodes. In order to prevent this, a node labelled ILI is used. As indicated in figure 4.7, Ep is connected to ILI through an excitatory connection. When Pf and Pr are active together, a match occurs, resulting in Ep becoming active and activating ILI in turn. The ILI node has an inhibitory connection to the LI node and hence suppresses its activity. This makes sure that lateral inhibition of the activity of neighbouring nodes is not done when there is a match of features. A detailed account of how lateral inhibition is implemented in the model is given in [19].
As part of the project I had to study variations in the circuits in the model. One such variation was to study the effect of removing the circuits which implement lateral inhibition in the network. This is indicated in figure 4.8.
As shown in figure 4.8, on removing lateral inhibition, it was found that the output to the LIP layer could not clearly indicate the location of the target object. In figure 4.8, the set of figures in the first column indicates the activation in the feedforward network, representing the stimulus activations when the array of objects is displayed. The second column shows the activations in the feedback network due to the attentional template. The third column indicates the activity in the disinhibition network due to the feature-matching of activations in the feedforward and feedback networks. The last column indicates the output in the LIP. When there is no lateral inhibition, the activity is distributed in the LIP layer among the four different locations and we cannot state without doubt the location of the target object. This is, however, not the case in the figure on the left-hand side: here, there is activity clearly visible in the top-left corner, and thus we can clearly conclude that the target object is in the top-left corner.
Figure 4.8: Activations with and without Lateral Inhibition (Images from [19])
From this observation, we can conclude that lateral inhibition is necessary in the network. If we examine human behaviour in the visual search task, we can see that humans do not find it difficult to arrive at a decision about the location of the target object; they can clearly identify its location. In this respect, human behaviour is similar to the model implemented with lateral inhibition. However, whether this behaviour is achieved using a similar mechanism of inhibiting the activity of the neighbours needs to be verified, and this can be done by the analysis of the EEG data and by generating a simulated EEG signal prediction from the model. This, again, is one reason why the experiment is deemed necessary. Still, it is reasonable to conclude that lateral inhibition is necessary in the model to be able to explain human behaviour in the task; without lateral inhibition, the model cannot account for the observations made in humans.
4.7 Implementation of a traversal program
In order to map the model to the tool QTriplot, so as to generate an EEG signal prediction from the model, the following steps were involved:
1. Generate a program to read a simulation file and traverse the nodes in the feedforward, feedback and disinhibition networks.
2. Obtain relevant data regarding the connectivity of the various visual areas such as V1, V2 etc., and the receptive field sizes of the neurons in each of these areas.
3. Map the model.
Accordingly, a program was implemented in C++ to read in a simulation file and traverse the networks. The simulation file was generated by the tool CLAMVis; it is a '.root' file, and it had to be read into the program to perform the traversal. The ROOT package is an object-oriented framework which was developed to provide easier simulation and analysis of large amounts of data. The '.root' file generated by CLAMVis contains the simulation results. This means that the file contains the activity values of the different nodes in the three networks: the disinhibition, the feedforward and the feedback networks.
In order to implement the traversal program, the '.root' file was read into the program. The MIIND framework contains a class called SimulationResult, and the file is wrapped into an object of this type. The different networks of the model are implemented as C++ vector data types; the vector data type is basically a dynamic array. In order to traverse the elements of a vector one by one, a data structure called an iterator is used, and there are inbuilt C++ functions begin() and end() which return iterators to the start and end of the vector respectively. The networks are implemented as objects of a class DynamicSubNetwork, which provides iterators to traverse the network. There are two different types of iterators, ForwardSubNetworkIterator and ReverseSubNetworkIterator, which are used to traverse the two different types of networks, namely the feedforward and feedback networks. These are implemented as classes in MIIND. The ForwardSubNetworkIterator provides functions equivalent to the begin() and end() functions inbuilt in C++, which return iterators to traverse the feedforward network, while the ReverseSubNetworkIterator provides similar functions to traverse the feedforward network in the reverse order, in effect traversing the feedback network. The disinhibition network is basically a feedforward network and hence can be traversed using the ForwardSubNetworkIterator. The implementation of this program required a study of the complex hierarchical structure of the MIIND libraries. As a result of this exercise, it is now possible to extract data out of the simulation files.
Although a lot of reading was done on the connectivity data and receptive field sizes, not much relevant information was obtained from it. Besides, step 3 could not be implemented due to restrictions on time. It came to our notice that the implementation of the mapping alone would require far more time than was available, and hence the focus was shifted to studying the circuits in the model and the analysis of the pre-processed data.
Chapter 5
Evaluation
The main outcome of the project was the pre-processed data and the plots generated after analysis. Due to the absence of prior knowledge or experience in analysing the EEG signals and the scalp maps generated, expert advice had to be sought from the EEG analysts and experts at the School of Psychology. Evaluation of the pre-processed data was therefore done by conducting discussions with them.
When the pre-processed data from the first round of pre-processing was examined, the sampling rate was found to be too high, and the EEG experts advised that a lower sampling rate would be more appropriate and would make further analysis easier. Accordingly, the sampling rate was lowered from 1000 Hz to 250 Hz. It was under their advice that two separate epochs for the events 5 and 6 (cue and target onset) were extracted per subject. The epoch for event 5 (cue onset) would contain information about the neural activity before, during and after visual processing of the stimulus. Similarly, the epoch for event 6 would provide information about how the neural activity varies during and after the display of the array of objects (target onset). The two types of epochs could then be compared to identify the differences in neural activity.
After the second round of pre-processing, the data was verified by looking at the eye-tracker data to see if the various events matched in both cases. It was during this exercise that Claudia Gonzalez of the School of Psychology and I found out that there was a timing mismatch between the events in the EEG signal and the eye-tracker data. For example, consider the plot in figure 5.1. It indicates the raw signal re-sampled to 250 Hz for one of the subjects. According to the plot, the time between the two events 5 (indicated in red) and 6 (indicated in green) is 1.5 seconds. Figure 5.2 indicates a plot of the eye movements with time for the same subject and for the same trial.
Figure 5.1: EEG signal for a subject using 16-bit representation
Figure 5.2: Eye-tracker data for the same subject and trial
As shown in the figure, there is a huge deflection in the values of eye position and velocity along the x and y coordinates. This is when the eye movement, or saccade, takes place. The DISPLAY CUE and DISPLAY TARGET time points in the figure represent the cue and target onset respectively. The cue onset is chosen as zero, or the starting point, here. Ideally, the time between cue onset and target onset should be the same as that in the EEG signal, i.e. 1.5 seconds in this case. The eye-tracker data was looked up to verify this. The time difference between the two events was, however, found to be 0.751 seconds in the eye-tracker data. This was only half of the time difference in the EEG data, which meant that a period of 1 second in the EEG data actually corresponded to only 0.5 seconds of the original signal. The problem was then traced back to the bit representation used in EEGLAB when importing the data. The tool used a default representation of 16 bits and, due to lack of knowledge about the format used to store the data, this default value was assumed to be appropriate; presumably, each 32-bit sample was being read as two 16-bit samples, doubling the number of samples and hence the apparent duration. Figure 5.3 indicates the same trial for the same subject using a 32-bit representation. The time interval between the two events was found to be 0.75 seconds, hence matching the eye-tracker data. Although the data was sent to the EEG experts during the second round of pre-processing for verification during each stage, the problem could not be identified. This problem could have been
averted if the eye-tracker data had been checked earlier.
Figure 5.3: EEG signal for a subject using 32-bit representation
Besides this, the plots generated using the two bit representations were also compared. Figure 5.4 shows a comparison of the ERPs for the two conditions, 'L' and 'S', recorded at each of the 64 channels in the two representations. The window on the left shows the plot for the 16-bit representation and it indicates a lot of variation between the two conditions; this extent of variation in activity is, however, not expected. Similarly, a comparison of the component activations for the two conditions is shown in figure 5.5, where a similar result is observed.
Figure 5.4: A comparison of the channel ERPs for conditions ’L’ and ’S’ using the two bit representations
Figure 5.5: A comparison of the component ERPs for conditions ’L’ and ’S’ using the two bit representations
In order to evaluate the pre-processed data further, the pre-processing of all the subjects would
have to be done again and the plots and scalp maps would have to be generated for each condition.
The EEG experts can then look at the data and the maps and verify whether evidence for the visual
processing pathway suggested by the model is present in the EEG data.
Chapter 6
Conclusion
The aim of this project was to verify the hypothesis of binding by retinotopic location by analysing the EEG data obtained from the experiment and by mapping the model of binding in [1] to a tool called QTriplot to generate a prediction of the EEG signal from the model. The hypothesis could then be verified by analysing the predicted EEG signal and comparing it to the EEG signal observed in the experiment. However, due to several hurdles during the progress of the project and due to restrictions in the time available, this could not be done completely. It is nevertheless possible to draw the following conclusions from the work that has been done.
• It can now be concluded that the pre-processing stage is not as trivial as it seems. It cannot be approached as a black box: the parameters during each stage of pre-processing have to be carefully chosen. It has now also been understood that it is not possible to verify the correctness of the process until it has been completed entirely. For instance, it may not be possible to look at plots of the channel ERPs generated during the pre-processing and indicate whether the pre-processing is progressing correctly.
• It can be said with a fair amount of confidence that the appropriate parameters for pre-processing have finally been found. However, this will require another iteration for verification.
• The tool used for pre-processing and analysing EEG data, EEGLAB, is probably not suited to someone inexperienced in working with EEGs and is not very user friendly, since it makes assumptions regarding factors such as the bit representation which, although seemingly unimportant, have a huge impact on the process.
In spite of the hurdles, a thorough study of the model, the experiment and the EEG data involved could be done. A proper understanding of the rationale of the experiment has been obtained and the various theories about visual attention have been studied.
6.1 Further Work
Since all the work could not be completed, further work on the project would consist of the following
main tasks.
• The first step would be to complete the third iteration of pre-processing with the right parameters and to generate the plots required for analysis. Analysis of these plots can then be done to draw conclusions as to whether a pattern of visual processing as suggested by the neural model can be found. The brain activity in the different conditions can be compared. However, a proper strategy has to be found before comparisons between these conditions are made. For example, if we consider the task for the shape condition, there is a set of component activations ordered according to their prominence in the signal. Similarly, there is a set of component activations for the colour condition. It may not, however, be rational to compare the most prominent component of the shape condition with that of the colour condition, as they may be from different sources. Besides, where eye artifacts are clearly present, these have to be removed before analysis can be done. Hence, a detailed strategy has to be devised.
• The next step could be to conduct a further detailed study of the connectivity patterns of the visual areas.
• The mapping of the model to QTriplot can then be done to generate a predicted EEG signal
from the model.
• Another possible extension is to consider automating the pre-processing stage since it is very
tedious and time-consuming.
Bibliography
[1] Marc de Kamps and Frank van der Velde. Neural blackboard architectures: the realization of compositionality and systematicity in neural networks. Journal of Neural Engineering, 3:R1–R12, 2006.
[2] R. Desimone and J. Duncan. Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18:193–222, 1995.
[3] http://www-psych.stanford.edu/~lera/psych115s/notes/lecture3/figures.html.
[4] http://cueflash.com/decks/LESIONS_OF_THE_VISUAL_SYSTEM_-_50.
[5] S. Luck, L. Chelazzi, S. Hillyard, and R. Desimone. Effects of spatial attention on responses of V4 neurons in the macaque. Society for Neuroscience, 19, 1993.
[6] J. Moran and R. Desimone. Selective attention gates visual processing in the extrastriate cortex. Science, 229(4715):782–784, 1985.
[7] L. Chelazzi, E. K. Miller, J. Duncan, and R. Desimone. A neural basis for visual search in inferior temporal cortex. Nature, 363:345–347, 1993.
[8] J. C. Martinez-Trujillo and S. Treue. Feature-based attention increases the selectivity of population responses in primate visual cortex. Current Biology, 14(9):744–751, 2004.
[9] J. C. Martinez-Trujillo and S. Treue. Feature-based attention influences motion processing gain in macaque visual cortex. Nature, 399:575–579, 1999.
[10] J. H. R. Maunsell and S. Treue. Feature-based attention in visual cortex. Trends in Neurosciences, 29(6):317–322, 2006.
[11] Anne M. Treisman and Garry Gelade. A feature-integration theory of attention. Cognitive Psychology, 12:97–136, 1980.
[12] Jamie Ward. The Student's Guide to Cognitive Neuroscience. Psychology Press, 2006.
[13] Paul Bourke. Various EEG file formats and conventions. http://paulbourke.net/dataformats/eeg/.
[14] Arnaud Delorme, Hilit Serby, and Scott Makeig. EEGLAB Wikitorial. http://sccn.ucsd.edu/eeglab/eeglabtut.html.
[15] James V. Stone. Independent Component Analysis: A Tutorial Introduction. MIT Press, 2004.
[16] Marc de Kamps and V. Baier. Multiple interfacing instantiations of neuronal dynamics (MIIND): a library for rapid prototyping of models in cognitive neuroscience. International Joint Conference on Neural Networks, 19:2829–2834, 2007.
[17] Marc de Kamps and Frank van der Velde. From knowing what to knowing where: modeling object-based attention with feedback disinhibition of activation. Journal of Cognitive Neuroscience, 13(4):479–491, 2001.
[18] Marc de Kamps and Frank van der Velde. From artificial neural networks to spiking neuron populations and back again. Neural Networks, 14:941–953, 2001.
[19] Marc de Kamps and David G. Harrison. A dynamical model of feature-based attention with strong lateral inhibition to resolve competition among candidate feature locations. University of Leeds, 2011.
Appendix A
Personal Reflection
This project has been an enlightening experience and it has given me valuable lessons for life. I
have not had previous exposure to a project which is research-oriented. Therefore, this project, which
is strongly based on research, gave me a deep insight into how research is to be conducted. My
undergraduate project was a general software development project and it did not have the amount of
planning and background reading that was done in this project. Although dividing the project period
up into different stages was done during my previous projects, it was not planned to this level of
detail. Due to this inexperience, I initially had difficulties in planning my work. The situation was
no different in the case of background reading. I have never had to do such extensive reading in
the past, to have an understanding of the problem. I found it difficult finding articles relevant to my
problem and sometimes I even spent days reading irrelevant articles or trying to find relevant ones.
My supervisor helped me a lot in coping with these difficulties and he advised me to make use of tools
such as the Web of Science and Google Scholar. The level of detailed reading required was another
aspect I learnt from the project. The method of reading I am used to, was to read through every single
detail of an article. This became increasingly difficult as the amount of reading to be done grew. Thus,
I had to learn to use speed reading techniques such as skimming. I also learnt to use mind maps to
organise my ideas and this helped me a lot in writing up my report.
During the implementation phase (which mainly involved the pre-processing of EEG data) of the
project, I was constrained by the fact that I was heavily dependent on the EEG experts at School of
Psychology for feedback on the pre-processing stage. I could not proceed to analyse the EEG data
without their approval. Although they gave feedback on changes to be made after the first iteration of
pre-processing, they could not identify a major error in the second round of pre-processing which was
identified quite late during the project. It could thus be concluded that such errors cannot be detected
at such an early stage. Besides this, I did not have a choice in the tools I could use to solve the
problem. The tool to be used for pre-processing and analysing EEG data, EEGLAB, was suggested
by the EEG experts themselves. The implementation of a program to traverse the neural model using a
simulation file was one part where I could work independently. However, the tool to be used, MIIND,
was not documented enough and I had to struggle to understand the class hierarchy of the framework.
Similarly, working with the simulation tool, CLAMVis, was also not trivial. Merely setting up these
tools took a lot of time and effort. Here, however, I received a lot of help from my supervisor, Dr. Marc
de Kamps and from David G. Harrison of the Biosystems group. I can never thank them enough for all
the support they have given me. The initial aim of implementing a mapping in QTriplot could not be achieved due to restrictions in time. This was probably because a proper understanding of the capabilities
of the tool was not attained. Although this did worry me a little in the beginning, my supervisor
convinced me that this was not the end of the road.
To summarise, I would definitely say that I had a wonderful experience working on this project.
Not everything in the project worked out as planned and there were major setbacks in the implementation of the project. However, I have learnt a lot from this experience. I have learnt that research is
not always fruitful and that perseverance is the only solution to be successful, not only in research,
but also in life in general. When we face hurdles, we often have to come up with alternative solutions
which may not always be acceptable to us. Besides this, I have now realised that I need to work more
on my planning and organising skills. I am grateful to my supervisor, Dr. Marc de Kamps, for having
faith in my abilities whenever things went wrong in the project and for the unwavering support he has
given me all throughout. He has gone beyond his responsibility as a mere supervisor and has been
more of a mentor to me. I am extremely thankful to David G. Harrison for the countless times he came
to my assistance when I had trouble with the tools like CLAMVis, and EEGLAB. In spite of it not
being his obligation, right from the start of my project, he has been a great help every step of the way.
This project would not have been possible without their valuable guidance.
Appendix B
Interim Report
The Interim Report is submitted along with the Project Report as hard copy.
Appendix C
Schedule
The schedules created during the project are shown here. Figure C.1 shows the Initial Schedule and figure C.2 shows the Revised Schedule.
Figure C.1: Initial Schedule
Figure C.2: Revised Schedule
Appendix D
Tools and Resources used
The list of tools and resources required includes MATLAB and EEGLAB. The software tools necessary to run the model are MIIND and CLAMVis and their Linux dependencies. With the help of my supervisor, Dr. Marc de Kamps, and with assistance from David G. Harrison of the Biosystems Group, all the necessary tools were installed. The EEG data required for the project was obtained from Melanie Burke, Claudia Gonzalez and Jean-Francois Delvenne of the School of Psychology and from David G. Harrison of the School of Computing.