Analysis Of EEG Data and Model Predictions

Aswathy Kishore

MSc Artificial Intelligence

Session 2010/2011

The candidate confirms that the work submitted is their own and the appropriate credit has been given where reference has been made to the work of others. I understand that failure to attribute material which is obtained from another source may be considered as plagiarism.

(Signature of student)

Summary

The human brain is capable of a variety of functions such as cognition, sensory perception and memory. One such function performed by the brain is visual attention. Visual attention can be defined as the attention deployed to the visual scenes around us as we observe them. Several studies have been conducted to obtain an understanding of the underlying mechanisms of the phenomenon. One important focus of these studies has been to understand how the brain represents the visual information that enters it through the eye. As a result of these studies, it can now be said with a fair amount of confidence that the brain uses what is called a ‘compositional representation’ to store visual information efficiently. This means that different features of the visual scene, such as the colour and shape of the different visual stimuli in the scene, are stored in different areas of the brain. These distributed representations from the different areas are then combined or bound in some way to produce the overall representation. How, in fact, this binding together of information from different areas is done is an issue that is being studied and researched extensively, and this problem is called the ‘binding problem’.

This thesis studies a neural network model which puts forward a hypothesis about how the brain solves the ‘binding problem’. A behavioural experiment based on attention was conducted at the School of Psychology to verify this model and the hypothesis put forward by it. Brain activity during this experiment was collected from the participants in the form of EEG signals. The aim of this thesis is then to analyse this EEG data and verify whether patterns of processing can be seen in the brain, as suggested by the model. This exercise can then confirm the validity of the hypothesis. A detailed study of the mechanism of visual attention, types of attention, the binding problem and the model is also done as part of this thesis.

Acknowledgements

Firstly, I would like to thank my supervisor, Dr. Marc de Kamps, for having provided me with the guidance necessary for the project and for instilling confidence in me whenever things went wrong. Secondly, I would like to extend my heartfelt gratitude to David G. Harrison, researcher at the Biosystems group of the University of Leeds, who, in spite of being busy with his PhD, offered to help me whenever I had trouble with the tools I needed for the project. I would also like to thank Melanie Burke, Claudia Gonzalez and Jean-François Delvenne of the School of Psychology for having spared the time to meet me and clarify my doubts regarding the experiment conducted at the School of Psychology and also for having provided me with the data required for the thesis. I am grateful to my assessor, Professor Netta Cohen, for offering valuable suggestions and feedback during the progress meeting. Despite the project not being his responsibility, my tutor, Dr. Hamish Carr, offered valuable advice on how to go about the project. I would therefore like to thank him from the depths of my heart.
I would like to thank my fellow students Elaine and Amal for reading through my report and providing feedback. Besides this, I would like to thank all my classmates, friends and family for all the support they have given me during the course of the project.

Contents

1 Introduction
  1.1 Overview
    1.1.1 Visual Attention and The Binding Problem
  1.2 Aim
  1.3 Objectives and Minimum Requirements
  1.4 List Of Deliverables
  1.5 Research Methodology
2 Background
  2.1 Visual Attention
    2.1.1 The Visual Pathways
    2.1.2 Bottom-up and Top-down Visual Processing
  2.2 Attention as a Performance Enhancement Mechanism
    2.2.1 Spatial Attention
    2.2.2 Non-spatial Attention
  2.3 Attention as a Representational Mechanism
3 Pre-processing and Analysis
  3.1 The Experiment
  3.2 Electroencephalography (EEG)
    3.2.1 Event-Related Potentials and Evoked Potentials
  3.3 The Dataset
  3.4 Pre-processing
    3.4.1 Stages of Pre-processing
    3.4.2 Iteration 1 of Pre-processing
    3.4.3 Iteration 2 of Pre-processing
    3.4.4 Iteration 3 of Pre-processing
  3.5 Analysis
    3.5.1 Independent Component Analysis (ICA)
4 The Model
  4.1 Overview
  4.2 The Feedforward Network
  4.3 The Feedback Network
  4.4 Dynamic Networks
  4.5 The Disinhibition Network
  4.6 Lateral Inhibition
  4.7 Implementation of a traversal program
5 Evaluation
6 Conclusion
  6.1 Further Work
Bibliography
A Personal Reflection
B Interim Report
C Schedule
D Tools and Resources used

Chapter 1

Introduction

1.1 Overview

For years, scientists have studied the human brain, trying to understand its organisation and how it works. Attempts have been made to roughly define how the human brain performs various functions such as cognition and sensory perception. An understanding of these aspects could help in the design of intelligent agents in the future. However, we still have a long way to go in understanding the organisation and information flow within the brain.

1.1.1 Visual Attention and The Binding Problem

Visual attention is one of the functions of the brain which has been studied in great detail. Visual attention refers to the attention deployed when we perceive things around us. In other words, it is the attention deployed to visual stimuli. As we observe our surroundings, we attend to objects which either stand out from the rest or hold some significance or relevance to us. For example, if we consider a scene where everything is either black or white, an object in a colour such as bright yellow would draw our attention to it. Similarly, suppose we are shown a scene containing one object for each of seven different colours. If we are instructed to look at the yellow object in the scene, our attention automatically shifts to the yellow object irrespective of what we were attending to earlier. In the first case, the yellow object caught our attention because it stood out from its surroundings. In the second scenario, the task to be performed caused our attention to shift. Several studies have been conducted to understand the underlying mechanisms for these phenomena and some of these studies are described in Chapter 2. The focus of this thesis is on the second scenario, where attention has to be deployed according to a given task.

According to previous studies (for example, [1], [2]), when perceiving a visual stimulus such as a simple coloured object, the brain stores information about it using what is called a ‘compositional representation’. Accordingly, different features of the object such as shape, texture and colour will be represented in different parts of the brain. Hence, in order to have a complete representation for the object, these individual localised representations have to be bound together to form a global representation of the object. In the area which stores information about a feature, different neurons are selective to different values of the feature. Thus, if a red circle is presented in the scene, in the area where colour information is stored, neurons which code for the colour ‘red’ will be active. Similarly, in the area which stores shape information, neurons which code for the shape ‘circle’ will be active. These two sets of neurons together code for the ‘red circle’. The idea of a compositional representation also exists in the field of computer programming, where we use modules or functions to solve a single problem. This representation provides an efficient storage mechanism and the same is the case in the human brain. However, in the human brain, this leads to another problem, as discussed below.

When multiple objects of different types are present in the visual scene, the situation changes from that discussed above. If, for example, we have a ‘red circle’ and a ‘blue square’ in a visual scene, neurons representing the colours ‘red’ and ‘blue’ will be active and those representing the shapes ‘circle’ and ‘square’ will be active.
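The ambiguity this creates can be made concrete with a small MATLAB sketch (the feature labels below are purely illustrative and not taken from any model or experiment described in this thesis):

    % Hypothetical feature populations: which feature-selective units fire
    % for a scene containing a red circle and a blue square.
    active_colour = {'red', 'blue'};      % active colour-coding units
    active_shape  = {'circle', 'square'}; % active shape-coding units

    % Without extra binding information, every colour-shape pairing is
    % equally consistent with the active units.
    for c = 1:numel(active_colour)
        for s = 1:numel(active_shape)
            fprintf('%s %s?\n', active_colour{c}, active_shape{s});
        end
    end
    % Prints four candidate objects ('red circle', 'red square',
    % 'blue circle', 'blue square') although only two were shown:
    % this ambiguity is the binding problem.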
When using the ‘compositional representation’ assumption, this would lead to confusion since we cannot be sure whether to associate the colour ‘red’ with the shape ‘circle’ or with ‘square’. This uncertainty in associating a feature representation with its corresponding object is referred to as the ‘binding problem’. However, if we are asked to identify the shape of the red-coloured object, we can easily say that it is a circle. Hence, there has to be a mechanism by which our brain associates the colour ‘red’ with the shape ‘circle’ for the first object, and the colour ‘blue’ with the shape ‘square’ for the second object. Some scientists believe that this binding is based on the synchronous activity of the neurons, i.e. the neurons which code for ‘red’ and ‘circle’ fire synchronously and those for ‘blue’ and ‘square’ fire synchronously. The model in [1], on which this thesis is based, puts forward a different hypothesis. It proposes that binding of the different features to the appropriate object takes place using information about the retinotopic position of the object. Retinotopic position is the position of the object as received by the retina. A detailed study of the model is given in Chapter 4.

In order to verify this model and the hypothesis of binding by retinotopic position as suggested by it, a behavioural experiment was conducted on human participants at the School of Psychology, University of Leeds. Responses from the participants were collected in the form of EEG signals. Analysis of these EEG signals could provide evidence for the hypothesis. The experiment and the analysis of the EEG signals have been described in detail in Chapter 3. In this thesis, we analyse the results of this experiment and the possible conclusions we could derive about visual processing in the brain from these results.

1.2 Aim

The overall goal of this thesis is to analyse the results of the experiment conducted at the School of Psychology in order to verify the neural network model in [1] and hence the principle of binding by retinotopy as the mechanism used when visual attention is deployed in the human brain. Originally, there was an additional goal to find a means of mapping the neural model to a model of the human brain using a tool called ‘QTriplot’. In principle, this mapping could then help us generate an EEG signal from the model much like the EEG signals recorded during the experiment. In order to implement this mapping, a detailed study of the connectivity structure of the different visual areas of the brain was also to be done. The thesis was thus initially composed of the following main components, as indicated in figure 1.1:

• analysis of the EEG data obtained from the experiment at the School of Psychology
• developing and implementing a method to derive a predicted EEG signal from the model
• comparing the predicted signal to the observations made on analysing the EEG data

Figure 1.1: Initial aim of the project and its stages

However, as the project progressed, it was found that there were restrictions on the tool, QTriplot, which was to be used to generate the predicted EEG signal from the model. It was pointed out that the implementation of the mapping was quite complicated and that it would probably require more time than was actually allotted to the entire project. Thus, I had to resort to a different approach. The basic aim of the thesis and the scientific hypothesis on which it is based, however, still remained the same.
According to the new approach, the main focus of the project was thus shifted to analysis of the EEG data obtained from the experiment using a signal processing approach. The conclusions drawn from this can then be used to provide supporting evidence for the model or to shed some light on the modifications to be made both to the model and to the hypothesis. In addition to this, more emphasis was placed on using CLAMVis, a tool built by David G. Harrison and Marc de Kamps, to visualise the working of the model and to investigate how variations in the model could affect its working. This enhanced emphasis was the result of the inability to use QTriplot for the project. To summarise, the aim of the project was revised: the new aim was to analyse the EEG data from the experiment to provide supporting evidence for the model. Also, the stages of prediction and comparison of the simulated EEG signal, as indicated in the lower half of figure 1.1, had to be shelved for the time being.

1.3 Objectives and Minimum Requirements

The main objectives and minimum requirements addressed by the project are to be able to:

• Develop an understanding of the problem of visual attention.
• Develop a thorough understanding of the neural model representing the functioning of the visual cortex when attention is deployed.
• Explain the rationale for the psychology experiment in connection with the problem of visual attention.
• Understand what EEGs are and provide a description of the process of pre-processing and analysing the EEG data obtained from the experiment using a specialised MATLAB tool for EEG analysis - EEGLAB.
• Run the simulation program for the neural model and visualise the working of the model.

1.4 List Of Deliverables

Since the main aim of the project is analysis of data rather than development of a complete software product, the deliverable takes the form of a report of the analysis rather than a final executable file. The following are the expected deliverables:

• A report containing the results of the analysis of the EEG data
• The pre-processed EEG data and the plots generated for analysis

1.5 Research Methodology

Since the project is research-oriented, I thought that it was inappropriate to use a conventional software development model such as the Waterfall model. The Waterfall model is a sequential software development model: the requirements are fixed at the start of the project and these requirements are used to guide the implementation and testing phases which follow afterwards. In the case of a research project, the initial set of requirements is very vague and incomplete. The requirements take a more concrete form during the course of the project and hence may have to be revised very frequently. Besides, there may be a change in direction as the project progresses. Thus I felt that an incremental or evolutionary model of development would be best suited to the project. This methodology gave me the flexibility to make changes to my aims and requirements as needed and allowed me to add new ones when I had achieved some of the previous aims. In accordance with this methodology, an initial schedule for the project was developed. This schedule is shown in appendix []. This was, however, a rough schedule and not a fine-grained plan. Writing up was not done during this period, although regular notes were taken of the background reading completed. Background reading could not be confined just to the initial period of the project.
The background reading done during this period only served to attain a basic understanding of the problem. During subsequent stages of the project, more background reading had to be done whenever I could not make progress owing to doubts. Till the submission of the interim report, the work progressed as per the schedule. After this, however, in about the last week of July, there was a major change of direction in the project and this led to an updated schedule, which is indicated in appendix []. This also led me to revisit the aims of the project and make revisions. This was done under the careful guidance of my supervisor. During the progress meeting, the assessor also provided valuable suggestions on how to go about the rest of the project. Accordingly, some of the sub-tasks needed to realise the goal had to be dropped due to time constraints; more focus thus shifted to writing up the tasks which had been completed till then and to extending those sub-tasks in such a way that the work was not constrained by the time available.

Chapter 2

Background

2.1 Visual Attention

When we observe things around us, not all the visual stimuli captured by the eye are processed consciously in the brain. This is because the human brain has limited capacity and it cannot process all the information that falls on the retina fast enough to elicit a timely response. The following experiment from [2] provides a better understanding of the problem of the limited processing capacity of the brain. Here, a human participant was instructed to identify the letters printed in a particular colour, say, black, from an array of letters. The array contained letters printed in either black or white. This array was flashed on the display for a very brief time interval. Figure 2.1 shows the display screens for two such experimental trials.

Figure 2.1: Experiment demonstrating the limited processing capacity of the brain (Image from [2])

It was observed that the subject could pick out the black letter ‘N’ in condition b) faster than in condition a). The observation can be explained as follows. The black letters are the target objects to be identified and the white objects are the non-targets. In b) there is only one target object and it seems to stand out in the array. Hence the subject can easily identify it. In a), however, there are three target objects. When attention is deployed to one of the target objects in the scene, less attention is available to other target objects in the scene [2]. Thus if we increase the number of target objects, the subject finds it more difficult to pick them out from the array.

The array of objects in the scenario discussed above is very simple compared to the visual scenes we see around us. The visual scenes around us are complex, cluttered with noise and have many different types of objects. These information-rich scenes, combined with the limited processing capacity of the brain, demand a mechanism by which only the relevant parts of the visual information are selected for further processing in the higher areas of the brain. This is where visual attention comes in. Visual attention is the mechanism which enables the selection of the relevant visual information in order to generate an appropriate response. As suggested in [1], often only a very small portion of all the available information is found to be relevant to the task assigned.

2.1.1 The Visual Pathways

The visual information that enters the eye is sent to the brain for processing through the optic nerve.
Figure 2.2 shows the path from the eye to the brain.

Figure 2.2: Pathway from the eye to the brain (Image from [3])

The eye sends the optical signals to the Lateral Geniculate Nucleus (LGN), which in turn sends all the information to the visual cortex, where all the processing takes place. Most studies on visual attention propose that processing in the visual cortex progresses along two pathways.

• The ventral stream: This pathway is concerned with processing shape information. It starts at area V1 and ends at the Anterior Inferotemporal Cortex (AIT), with intermediate areas V2, V4 and the Posterior Inferotemporal Cortex (PIT) in that order. The receptive field sizes and the complexity of the stimulus to which the cells respond increase as we go from V1 to AIT. Hence the cells in V1 respond to simple features like lines of a particular orientation and those in AIT respond to complex shapes.

• The dorsal stream: It processes information on the position of stimuli to control eye movements. It starts at V1 and ends at the parietal cortex.

These two visual pathways are shown in figure 2.3.

Figure 2.3: The Dorsal and Ventral Streams (Image from [4])

2.1.2 Bottom-up and Top-down Visual Processing

Desimone and Duncan give a basic understanding of the two types of processing in [2]. Bottom-up visual processing is task-independent and automatic. When a stimulus is present in the visual field, neurons in the different visual areas are activated. Bottom-up visual processing depends heavily on the saliency of the stimulus. If the stimulus is something that has never been encountered before or if it stands out from the surroundings, it immediately captures our attention. This pop-out effect is often explained by bottom-up visual processing. However, bottom-up processing alone is not sufficient for the survival of the human species. Hence we have top-down visual processing. Top-down visual processing is task-dependent and it creates the attentional template. Here, attention is used to enhance processing in particular locations of interest and produce an appropriate response according to the task. Top-down processing can be categorised according to the task involved. Most often, top-down and bottom-up processing are done concurrently and there is an interaction between the top-down and bottom-up activations in the visual hierarchy to produce a response.

2.2 Attention as a Performance Enhancement Mechanism

Most studies regard attention as a performance enhancement mechanism. When visual stimuli are presented, the neurons in the visual areas fire according to their selectivity to the stimuli. The idea of attention as a performance enhancement mechanism suggests that focusing attention on a stimulus enhances the performance of the neuron by, for instance, a change in its activation levels or firing rates. This means that when we deploy attention towards a particular stimulus, it receives increased processing by the brain. Attention as a performance enhancement mechanism can be broadly categorised into two types on the basis of the criteria for directing attention, i.e. the task involved. These two categories are spatial attention and non-spatial attention.

2.2.1 Spatial Attention

Spatial attention or location-based attention is the mechanism used when we attend to particular locations of interest in the visual scene. Here the location where the attention has to be directed is known already and enhanced processing by the brain occurs at that location.
Since location is predefined in spatial attention, it is widely suggested that feedback activation from upper visual areas is sent back to only those lower visual areas corresponding to that location (for example in [2]). This means that psychological or behavioural measures such as the reaction times to tasks involving spatial attention are relatively low. There could also be higher baseline activation for the same reason. As indicated in [2], a study by Luck et al [5] of the macaque brain observed that when a cue regarding the location was given to the monkey, there was already an increased level of neural activity even before the display of the final target array.

One famous study to demonstrate spatial attention was given by Moran and Desimone in [6], using a single-cell recording experiment to record responses of cells in V4 and the IT (Inferotemporal) cortex. After choosing a neuron whose activity was to be recorded and calculating the size of its receptive field (the receptive field of a neuron is the part of the visual field which affects its activity), an effective stimulus and an ineffective stimulus were determined for the neuron. An effective stimulus causes the neuron to fire vigorously while an ineffective stimulus produces very little response in the neuron. In the experiment, a monkey was trained to do a simple match-to-sample task. Two stimuli were displayed in the visual field and the monkey had to focus on the stimuli appearing at one location and ignore the other. In order to perform the task, the monkey initially had to fixate on a spot in the visual field. A ‘sample stimulus’ was then displayed at one of the two locations. This is an indication of the location to which the monkey had to attend. After a delay period, another stimulus was presented at the same location and the monkey had to respond by releasing a bar when the stimulus matched the sample stimulus. The monkey was rewarded when it gave the correct response. Figure 2.4 gives a pictorial description of the task.

Figure 2.4: Experiment demonstrating deployment of Spatial Attention (Image from [6])

In the case where both the effective and ineffective stimuli were present in the receptive field of the recorded neuron (condition A in figure 2.4), the following observation was made. When spatial attention was deployed to the location containing the effective stimulus, the response was increased. On the other hand, when the monkey attended to the ineffective stimulus, the response was reduced in spite of the presence of the effective stimulus in the receptive field. Thus, the response was dependent on whether the target stimulus was effective or ineffective. Moran and Desimone in [6] describe that “the receptive field has contracted around the attended stimulus”. This phenomenon is called task-biased competition. The competition between the different stimuli for the receptive field is biased in favour of the stimulus relevant to the task [2]. Task-biased competition is, however, not specific to spatial attention alone. It applies to non-spatial attention as well. In the context of spatial attention, however, knowing the location to be attended in advance biases the attention in favour of the stimulus appearing at that location. In condition B in figure 2.4, the effective stimulus was placed in the receptive field and the ineffective stimulus was placed outside. Here, regardless of whether the attention was directed to the stimulus in the receptive field or not, the cell gave a good response.
This means that there was no attentional modulation when one of the stimuli was outside the receptive field. This, again, is evidence for task-biased competition. Here, since only one stimulus is within the receptive field, there is no competition involved and hence no attentional modulation [2]. This means that objects outside the receptive field do not carry any significance when attending to something. This also underlines the need to bring the object into the receptive field to focus attention, in other words, the need to foveate. When attention is directed outside the receptive field, the response is probably due to bottom-up processing.

2.2.2 Non-spatial Attention

Non-spatial attention refers to all forms of attention which come into play when we focus on aspects or features of the visual scene other than a particular location of interest. These can be coarsely divided into two:

• Feature-based attention: It is the mechanism involved when we try to attend to particular features in the visual stimuli and it is independent of the location of the stimuli. Feature-based attention is most often the basis for a perceptual task used in experiments: the visual search.

• Object-based attention: It is used when the focus is on particular objects in the visual scene.

It is, however, not possible to distinguish clearly between feature-based and object-based attention. An object can be considered as being made up of a combination of many features. Hence, when directing attention to an object, we are indirectly focusing on a set of different features. Thus it would not be wrong to say that object-based attention is a form of feature-based attention.

In non-spatial attention, the location to which attention is to be directed is unknown. This means that feedback activations from the upper visual areas should be fed back to the lower visual areas corresponding to all locations, i.e. non-spatial attention is pervasive. A feature matching procedure is then performed at all locations. The object or feature of interest is then in that part of the visual field which contains the maximum number of matching activations. This is how the location information is obtained in non-spatial attention. This location information is then fed to the dorsal stream and the necessary behavioural response is generated. Two experiments which demonstrate non-spatial attention are described here.

One experiment to describe visual attention in monkeys was done by Chelazzi et al in [7]. Here, the monkey was required to perform a visual search task. The monkey had to attend to a cue object on a display screen for a few milliseconds, followed by a delay period during which the screen was cleared. After the delay period, the monkey was presented with an array of objects. The monkey was rewarded for making eye movements (saccades) to the target (cue) object, i.e. for identifying the cue object displayed earlier from among the array of objects. Neuronal firing rates were recorded using electrodes inserted into the monkey brain. Figure 2.5 gives a rough idea of the visual search task the monkey performs. During the display of the cue, cells of the inferior temporal cortex selective to the cue are activated and this activation is maintained throughout the delay period, although reduced in strength.
Figure 2.5: The Visual Search Task (Image from [1])

When the array of objects is displayed, the cells selective to each of these stimuli are initially activated, but ultimately the activation corresponding to the target object is high while the activations corresponding to the ‘distractor’ objects are gradually reduced. Figure 2.6 is a pictorial representation of this.

Figure 2.6: Activations in Inferior Temporal Cortex during Visual Search (Image from [7])

Prior to the saccade to the target object, the neurons selective to the target object and those selective to the non-target objects compete for attention until those selective to the target object ‘win’. The type of attention deployed in the experiment is ‘object-based attention’ since the knowledge of the cue object is essential to focus attention.

Feature-based attention is described in studies in [8], [9] and [10] of the responses of MT neurons in the dorsal pathway. MT neurons are concerned with the perception of motion. An experiment was conducted on monkeys to study feature-based attention, as follows. The monkey was presented with two random dot patterns: one inside the receptive field of the neuron being monitored, moving in the neuron's preferred direction, and the other outside the receptive field, moving in either the preferred or the anti-preferred direction of the neuron. The monkey was required to direct attention to the stimulus outside the receptive field. It was observed that the response of the neuron was increased when the direction of motion of the attended dot pattern (i.e. the one outside the receptive field) was closer to the preferred direction of the neuron. In other words, the gain in response of the neuron is greatest when the attended feature (direction of motion) is similar to the preferred direction of motion of the neuron, and least when the attended stimulus moves in the anti-preferred direction. The same trend was found in the neuronal responses even when the animal was fixating and not attending to either of the two stimuli. This is the idea put forth by the ‘feature similarity gain model’ in [8] and [9]. Figure 2.7 shows the experiment and the neuronal response observed during the experiment.

Figure 2.7: The feature similarity gain model (Image from [10])

2.3 Attention as a Representational Mechanism

It is suggested in [1] that the brain stores visual information using a ‘compositional representation’, i.e. different features are represented and processed by different sub-areas of the visual cortex. The entire visual scene can then be represented as a composition or combination of the representations of the different component features. As has already been discussed, the human brain has limited processing capacity and hence cannot process all the visual information which enters the eye. The compositional representation is therefore an advantage because it saves space and optimises the processing capacity of the brain. This representation also provides for ‘systematicity’ and ‘combinatorial productivity’, as mentioned in [1]. If we consider separate classes to represent different features, an object can be represented as a combination of its constituent features to form an overall feature representation. This property is called productivity. The notion of systematicity allows us to obtain the constituent features from the overall compositional representation.
This property helps us to classify new objects into one of the known classes and also helps us identify similarities between objects belonging to the same class and differences between objects of different classes.

The alternative to a compositional representation is a ‘conjunctive representation’. Consider that we have objects which are defined by two main features - shape and colour. Then we can define objects as ‘red circle’, ‘blue circle’ and ‘red triangle’. If we were to use a conjunctive representation, then we would have to create separate representations for each combination of these two features. Here, we considered only objects with two simple features. In real life, however, a visual scene has many objects with many features. The total number of possible feature combinations in the scene is then very large and this would lead to a shortage of space for the representations. Besides, since the conjunctive representation does not have the property of systematicity, it would not be possible to separate the representation into its constituent features. Thus we cannot identify similarities between objects belonging to the same class. Since there are separate conjunctive representations for a red circle and a blue circle, we cannot classify them as both belonging to the same class (i.e. circles). Another problem of this representation is that novel objects cannot be classified. For the reasons mentioned above, it is not likely that the human brain uses a conjunctive representation to represent objects.

The compositional representation, although advantageous, does have problems, as mentioned in Chapter 1. The binding problem is one of the serious issues caused by the compositional representation. Several mechanisms have been suggested to solve this problem. One of them is to use the synchronous activity of the neurons. According to this theory, neurons representing different features of the same object fire synchronously. Hence if we have a blue square and a red circle in the visual field, the neurons selective for ‘blue’ and ‘square’, representing the first object, fire synchronously. Similarly, the neurons selective for ‘red’ and ‘circle’, representing the second object, fire synchronously, but not with the neurons representing ‘blue’ and ‘square’. The synchronicity mechanism does not take into consideration attentional modulation of neuronal responses.

Another approach to solving the binding problem is to use the retinotopic location to bind features of the same object together. The lower visual areas such as V1 and V2 contain information about the location of the target object or feature. This information is, however, lost as we go up the visual hierarchy. Hence this location information is not present in higher visual areas such as V4 and AIT. In order to obtain this location information, a feature-matching procedure has to be invoked to match the bottom-up activation in the hierarchy with the top-down activation. This then provides the exact location of the target stimulus. According to this theory, attention becomes a fundamental representational mechanism instead of a performance enhancement mechanism. Here, spatial attention is used to bind together properties of a single object represented using a distributed, compositional representation. This theory is supported by the Feature Integration Theory given by Treisman et al in [11].
The Feature Integration Theory suggests that attention provides the basis to integrate the different features of the same object. It suggests that the features in the visual scenes are recognised in the early stages of visual processing by bottom-up processing. However, the features are bound together only when attention is focused. In contrast to the Feature Integration Theory, the model in [1] is pervasive since it sends feedback activations to all locations, while the Feature Integration Theory sends feedback activations to only one single location and supports a spatial attention approach more than a feature-based attention approach. By conducting the experiment and looking at the EEG data obtained from it, we aim to provide evidence for this pervasive, feature-based attention and binding by retinotopic location. The analysis of the EEG data can confirm the hypothesis if we can get evidence from it for the information flow suggested by the model.

Chapter 3

Pre-processing and Analysis

3.1 The Experiment

The experiment that was conducted at the School of Psychology is similar to the Chelazzi experiment in [7]. There were 11 human participants in the experiment and each participant had to perform a visual search task. Initially, the participant was asked to fixate on a display screen, where a cue object was displayed after a few milliseconds. This cue object was an indication of the task to be performed. For example, a cue object would be a combination of an object (a circle, square, triangle or diamond of a particular colour) and a letter (‘C’, ‘S’ or ‘L’). The objects used could be any of four different colours - red, green, yellow or blue. An object with ‘C’ or ‘S’ used as a cue indicates to the subject to look for an object with the same colour (C) or shape (S) as the object presented in the cue. The letter ‘L’ appeared along with an arrow indicating to the participant to direct his eyes to that particular location. The arrow could point in one of four different directions - upper left, upper right, lower left or lower right. The task to be performed was thus varied based on three different conditions - colour, shape or location. The cue object was then cleared from the display and this was followed by a delay period. The delay period was varied between 0.5 and 1.5 seconds. After the delay period, the array of objects was displayed on the screen and the target object had to be chosen from this array by making a saccade to it. The number of objects in the array was also varied in each task. The array could have 1, 2, 3 or 4 objects. This leads to a total of 12 different conditions - 3 conditions based on task type x 4 conditions based on the number of objects in the final object array. A single block was composed of 120 such trials and each subject had to go through 5 such blocks. The conditions for the tasks were varied randomly and not according to any predefined order. During each trial, EEG signals were recorded from the subjects using electrodes placed on the scalp. In addition, eye tracker data was also recorded to obtain the eye movements of the subjects. Figure 3.1(a) depicts the task and figure 3.1(b) shows the experimental setup.

Figure 3.1: The experiment ((a) the task; (b) the setup)

In the task in figure 3.1(a), for example, the cue object consisted of a red triangle and the letter ‘C’. This meant that the subject had to select a target object of red colour from the array irrespective of the shape and location of the target object.
Thus the subject had to make an eye movement to the red circle in the upper left corner in this trial. The main idea of the experiment is to verify the hypothesis of binding by retinotopic location as suggested by the model in [1]. If this hypothesis is correct, then we should be able to see evidence for it in the EEG signals recorded during the experiment. The model, based on the same hypothesis, has been designed to simulate this experiment, and the information flow indicated in the model should then be the same as that indicated by the EEG data. By analysis of the EEG data, we can thus validate both the hypothesis and the model. Besides, we can also verify whether there are any significant differences between the three different conditions of the task - colour, shape and location. In particular, we can look for differences between brain activity during spatial and non-spatial attention. This is because the location condition (‘L’) basically demonstrates spatial attention while the other two conditions demonstrate non-spatial attention.

3.2 Electroencephalography (EEG)

Electroencephalography or EEG is an electrophysiological method used to record the electrical activity of neurons in our brain. Most early studies on the brain were done using invasive techniques (single or multi-cell recordings) of the monkey brain, since it bears a striking resemblance to the human brain. Here, the electrodes are inserted into the brain directly and the neuronal responses of a single cell or a group of cells are recorded. EEG, on the other hand, is a non-invasive technique in which the electrical activity is recorded using electrodes placed on the scalp. When measuring the electrical activity by recording the EEG signal, we actually obtain the activity of a population of neurons rather than that of a single neuron in the brain. As suggested in [12], the activity detected at one electrode may not be solely caused by neurons near the electrode. It could also have contributions from neurons far off from the electrode. EEG is especially useful in neuroscience experiments because, when tasks like visual search are given to a subject, the changes in the brain activity of the subject can be recorded. This can help us identify the regions of the brain which play important roles in cognitive and visual processing. This is what we intend to achieve with the experiment and the analysis of the results obtained from it.

3.2.1 Event-Related Potentials and Evoked Potentials

Event-related potentials (ERPs) are a means of measuring responses in the brain to various stimuli and events. They are averaged measures. Not all of the activity indicated by the EEG recording is, however, related to the stimulus or the task presented. Some of the activity is due to background processes in the body, for example, due to movement of muscles. Each of these processes contributes to the peaks and troughs that are observed in the ERP waveform. As [12] suggests, the important components of the ERP that are relevant to the task can be identified by analysis. The ERP components can be classified into two types - exogenous and endogenous. Exogenous components are those which appear to arise from the physical stimuli, i.e. from an external source. The exogenous components are also called Evoked Potentials. Endogenous components, on the other hand, are processes related to the task occurring within the brain. This could include cognitive processes such as memory, thought and emotion.
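Since an ERP is an averaged measure, it is obtained by averaging many time-locked trials, so that activity unrelated to the event tends to cancel out. A minimal MATLAB sketch of this averaging, assuming an EEGLAB dataset that has already been epoched around an event (the variable and channel choices are illustrative):

    % Assumes EEG is an epoched EEGLAB dataset: EEG.data has dimensions
    % channels x samples x trials, and EEG.times holds the epoch time
    % axis in milliseconds.
    erp = mean(EEG.data, 3);   % average over trials -> channels x samples

    % Plot the ERP at one (arbitrary) channel; the peaks and troughs of
    % this averaged waveform are the ERP components discussed here.
    figure;
    plot(EEG.times, erp(1, :));
    xlabel('Time relative to event (ms)');
    ylabel('Potential (\muV)');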
When analysing the ERPs, the emphasis is not so much on the peaks and their positivity or negativity; instead, the timing, frequency and amplitude of the peaks are more important [12], since we are measuring activity relative to events occurring at certain points in time.

In order to record electrical activity from different regions in the brain, we place electrodes evenly across the scalp, generally according to the widely accepted 10-20 system of electrode placement. Figure 3.2 shows how electrodes are arranged on the scalp according to the system.

Figure 3.2: Electrodes arranged according to the 10-20 system of electrode placement

The electrodes are named based on their location on the scalp. Each electrode name is a combination of one or more letters and a number. The electrodes on the left hemisphere of the brain are numbered using odd numbers and those on the right hemisphere are numbered using even numbers. Those electrodes which are situated along the middle line are indicated by the letter ‘z’. Those which occur in the frontal lobe, parietal lobe, occipital lobe, temporal lobe and central region take ‘F’, ‘P’, ‘O’, ‘T’ and ‘C’ respectively in their names. Thus the electrode ‘Cz’ in figure 3.2 indicates the electrode located in the central region of the brain along the middle line between the two hemispheres.

The electrical activity at any electrode is actually measured as a difference of voltages or electrode potentials. The entire EEG signal is represented as several channel signals. The method by which the channel signal is represented is called a montage and there are mainly three ways to do this:

• Bipolar montage: A channel signal is obtained by calculating the difference in voltage between two adjacent electrodes.

• Referential montage: One of the electrodes is chosen as a reference electrode and the channel signal for any electrode is calculated as the difference between the particular electrode and the reference electrode.

• Average reference montage: An average of the electrical activity of all the electrodes is calculated and this is chosen as the reference value. The channel signal for each electrode is then measured with respect to this value.

3.3 The Dataset

The dataset from the experiment was obtained from Melanie Burke, Claudia Gonzalez and Jean-François Delvenne at the School of Psychology, and David G. Harrison of the Biosystems group at the University of Leeds. It contained the EEG recordings obtained from each participant in .cnt files. The EEG recording was done using a referential montage with electrode ‘Cz’ chosen as the common reference. The .cnt file contains information about the electrode locations, the electrode potentials recorded in the time period for each electrode, and the exact time of occurrence and type of each event in this time period [13]. In this experiment, there are two types of events - the presentation of the cue (event ‘5’) and the presentation of the object array (event ‘6’). The time period of one complete run of the experiment during which these two events take place is called a trial. Each block had the signal recordings for 120 such trials and these trials belonged to any one of the 12 conditions. These 120 trials were recorded without any break in between. The .cnt file contained data as continuous values, recorded for one entire block. There were five such blocks and hence there were 5 .cnt files per subject.
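As an illustrative sketch, one such block can be imported into EEGLAB with pop_loadcnt, EEGLAB's routine for Neuroscan .cnt files (the file name below is hypothetical):

    % Load one block for one subject; the file name here is made up.
    % 'dataformat' selects the bit representation ('int16' or 'int32'),
    % the choice whose importance is discussed in section 3.4.
    EEG = pop_loadcnt('subject01_block1.cnt', 'dataformat', 'int32');
    EEG = eeg_checkset(EEG);

    % The markers for cue onset ('5') and target onset ('6') are stored
    % in the event structure; inspect the first few events.
    EEG.event(1:5)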
One of these 5 blocks had 128 trials, of which 8 were practice trials to get the subject used to the experimental paradigm; these had to be removed. Each .cnt file was around 140 MB. There were 11 subjects in total and hence 55 .cnt files. There was thus nearly 8 GB of data to pre-process.

3.4 Pre-processing

The EEG signals in a .cnt file were in such a form that the signals for all the trials in a block were concatenated together to form a single stretch of signal. The data at this point is very noisy and contains a lot of irrelevant information which must be removed. This is what we intend to achieve by pre-processing. The significant portions of the EEG signal for the individual trials had to be identified and extracted for further analysis. The pre-processing was performed using the MATLAB tool EEGLAB, with the help of the tutorial in [14] and with kind assistance from David G. Harrison of the Biosystems Group. David did the EEG pre-processing for 5 subjects while I did the pre-processing for 6 subjects. The pre-processing stage includes various tasks such as removing from the neuronal recording the responses created by muscle artifacts and other background processes (i.e. baseline removal) and handling missing data. Pre-processing of the data turned out to take up a lot of time during the course of the project, since the right parameters to be used had to be determined more or less experimentally. This stage alone had to be performed three times, since there were problems in identifying the relevant information and the right parameters. A detailed explanation of the process is given below.

3.4.1 Stages of Pre-processing

Loading/Importing Data

The .cnt file is loaded into EEGLAB to perform the pre-processing. Here, we are prompted to make a choice of the bit representation to be used for the EEG data. We can choose between a 16-bit and a 32-bit representation. Initially, it was assumed that the bit representation affects only the precision of storage of the EEG data. However, problems were noticed due to the wrong choice of bit representation, as will be explained in forthcoming sections.

Re-sampling of data

The data may have been sampled at a high rate when the recording was done. However, not all of the data collected is always necessary and hence we may have to re-sample the data to a lower rate. Sampling data at a high rate when recording is advantageous and guarantees low risk, since we are collecting as much data as possible. If this huge amount of data is not deemed necessary, we can re-sample it at any time to a suitable lower rate. Lowering the sampling rate reduces the storage space required for the files.

Baseline Removal

The neural activity indicated by the EEG includes not only the task-related activity but also the activity due to background processes in our body. The normal functioning of the body produces a particular level of neural activity. This can be classified as baseline activity. To obtain a proper measure of the task-related activity in the brain, we have to remove the baseline activity from the EEG signal. This is what is achieved in this phase. If it is not removed, we cannot differentiate between the normal activity in the brain and the activity due to the task. One can also choose as the baseline the activity during any particular period of the task. The neural activity is then calculated relative to this baseline activity.
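As a rough sketch of these first stages in EEGLAB (assuming a continuous dataset EEG loaded as in the previous section; the parameters are those eventually used in iterations 2 and 3 below):

    % Re-sample the continuous recording to 250 Hz.
    EEG = pop_resample(EEG, 250);

    % Remove the per-channel baseline by subtracting each channel's mean
    % from its whole signal; EEG.data is channels x samples for
    % continuous data.
    EEG.data = EEG.data - repmat(mean(EEG.data, 2), 1, size(EEG.data, 2));
    EEG = eeg_checkset(EEG);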
Re-referencing the channel data

The EEG data may have been recorded using one particular electrode montage. For example, a referential montage may be used with one particular electrode, say ‘Cz’, as the reference. It is, however, possible to convert the recorded EEG data so that it is based upon a different reference electrode, to provide for easier analysis. It is also possible to change the electrode montage. For example, if a referential montage was used during recording, it can be changed to an average reference montage. This does not fundamentally alter the data since, as suggested in [14], re-referencing only amounts to a simple linear transformation of the data.

Filtering

The EEG data may contain artifacts due to activity in the muscles, cardiac activity, movement of eyelids, and blinking of eyes. Blinking of eyes is one major source of artifacts in the EEG data. One way to avoid this is to instruct the subjects not to blink, but this would not be practical, especially if several trials of the task are done continuously over a long period of time. Besides, as indicated in [12], it takes the concentration of the subject off the task. Hence it is easier to filter out these artifacts later on. Another source of interference is artifacts from the power lines or in the connection wires of the electrodes or the electrodes themselves.

Extraction of Epochs

Here, we extract the time periods of interest from the continuous stretch of signal. An epoch can be defined as the portion of the signal in a particular time interval around an event of interest during the trial. These periods or epochs can then be extracted for further analysis. In this experiment, there are two events of interest - the cue onset and the target onset. The cue onset is the time point when the cue appears on the screen and the target onset is the point when the array of objects is displayed. Hence we extract the epochs which contain these two events. We intend to look at the effects of these two events on the processing of the brain in these intervals.

Selection of Epochs or Events

Once the epochs have been extracted, we get a concatenated set of epochs for the different trials in a block. We may select specific epochs in order to categorise them into different classes. For instance, in this experiment there are twelve different conditions or classes for each subject - based on the type of cue (C, S or L) and the number of objects in the array (1, 2, 3 or 4). Hence we have conditions C1, C2, C3, C4, S1, S2, S3, S4, L1 and so on. Out of the extracted epochs, we have to select the epochs which belong to each of these conditions and group them together. Here, we also make sure that only those epochs in which the subjects chose the correct target are selected for further analysis.
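The re-referencing, filtering and epoch-extraction stages can be sketched along the same lines (a sketch only, using the event-5 settings described in iteration 2 below; the channel labels and the use of pop_reref's 'exclude' option are assumptions based on the description in this chapter):

    % Average reference montage, excluding the eye-movement channels
    % from the computation of the average.
    eogIdx = find(ismember({EEG.chanlocs.labels}, {'HEOG', 'VEOG'}));
    EEG = pop_reref(EEG, [], 'exclude', eogIdx);

    % High-pass filter at 1 Hz to remove slow drifts and artifacts.
    EEG = pop_eegfilt(EEG, 1, 0);

    % Extract epochs around the cue onset (event type assumed to be
    % stored as the string '5'), from -0.2 s to 0.5 s, and remove the
    % pre-stimulus baseline (pop_rmbase takes its window in ms).
    EEG = pop_epoch(EEG, {'5'}, [-0.2 0.5]);
    EEG = pop_rmbase(EEG, [-200 0]);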
3.4.2 Iteration 1 of Pre-processing

When the EEG data was recorded during the experiment, it was sampled at a rate of 1000Hz. In iteration 1 of pre-processing, this sampling rate of 1000Hz was preserved. The .cnt file was loaded using a 16-bit representation. The EEG measures were calculated by choosing the electrode ‘Cz’ as the reference electrode, and all slow artifacts below the frequency of 1 Hz were removed using a high-pass filter. This removes artifacts which occur due to tiredness or boredom in subjects over a long session, or due to the functioning of sweat glands. The next step was to extract epochs for the two events. Here, a single epoch was extracted which contained both events. The cue onset event was labelled ‘event 5’ and the target onset was labelled ‘event 6’. In order to obtain epochs, the period from 1 second before event 5 (indicated as -1 seconds relative to event 5) till 3 seconds after event 5 (indicated as 3 seconds relative to event 5) was extracted. Baseline activity was also removed from this stretch of signal, with the activity from -1 seconds to -0.5 seconds chosen as the baseline. This is the pre-processed data. When this was done, the epochs for the different conditions were selected and grouped together. Figure 3.3(a) shows a plot of the EEG signal before pre-processing and figure 3.3(b) shows the signal after pre-processing. The y-axis indicates the potential in microvolts at each of the 64 electrodes and the x-axis indicates the time in seconds. The signals show the activations recorded at each of the corresponding 64 electrodes. The red lines marked ‘5’ indicate the cue onset in the different trials and the events marked ‘6’ in green indicate the target onset. This entire procedure was repeated for each of the 11 subjects (for a total of 55 files).

Figure 3.3: Plots before and after pre-processing ((a) before pre-processing; (b) after pre-processing)

As can be noticed from the figure, the signals appear as blobs and cannot be clearly identified, and this could be due to the fact that the sampling rate was very high. Thus, a second round of pre-processing was done.

3.4.3 Iteration 2 of Pre-processing

The second round of pre-processing was done after consultation with the EEG analysts at the School of Psychology. In the second iteration of pre-processing, after loading the data using a 16-bit representation, it was re-sampled to 250Hz. This made it easier to analyse, as indicated in figures 3.4(a) and 3.4(b). The signals were clearer and more similar to EEG signals, and less like the blobs seen in the first iteration. The next step was to remove the baseline activity from each channel. Here, the mean value of each channel was chosen as the baseline. The mean value of each channel was thus subtracted from the rest of the channel signal. Once the baseline was removed from each channel, the electrode locations were loaded and the recorded activity was re-referenced. Here, we used the average reference montage to represent the EEG measure. When calculating the average, however, we excluded the electrodes HEOG and VEOG, which record activity related to eye movements, since it did not seem sensible to include them when all we were interested in studying was the activity in the brain areas. Next, we filtered out the noise with frequency below 1 Hz.
[Figure 3.4: Plots before and after re-sampling]

As in the first iteration, this filtering removed the slow artifacts. After filtering was done, we extracted epochs for the two events. Here, we extracted two separate epochs for the two events, event 5 and event 6, instead of the single epoch used in the first iteration. The time intervals chosen for the epochs were also different. To obtain epochs for event 5, we extracted the time interval from -0.2 seconds to 0.5 seconds relative to event 5. We also removed baseline activity from this stretch of signal, with the activity from -0.2 seconds to event 5 (0 seconds relative to event 5) chosen as the baseline. This interval represents the condition before the stimulus has come on and hence provides a reasonable measure of the normal activity in the body. Similarly, to obtain epochs for event 6, we extracted the signal from -0.5 seconds to 0.5 seconds relative to event 6, with the baseline period chosen from -0.5 seconds to -0.3 seconds relative to event 6. This is the period when the cue has already been shown to the subject. Part of the initial neural activity in this period is due to the visual processing of the stimulus, but most of it can be attributed to the memory retention of the cue object. It therefore seemed a reasonable baseline period for the participant's task when the array comes on. After the extraction of epochs, we selected the epochs for the different conditions as before. Figures 3.5(a) and 3.5(b) show the pre-processed data after extraction of the two sets of epochs. From the plots of the extracted epochs, we can observe that some activations in the lower parts of the plots show huge deflections. These are due to the electrodes HEOG and VEOG, which record the horizontal and vertical movements of the eye, respectively. This indicates that the participant moved the eyes during this period and that these eye artifacts have to be removed. The artifacts are more pronounced in figure 3.5(b), which shows activity before and after event 6, the target onset: it is during this period, specifically after event 6, that the saccade to the target object takes place. Hence there is a huge deflection after event 6.
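The two separate epoch extractions, with the windows and baselines just described, might be sketched as follows (the variable names are illustrative; pop_rmbase takes its interval in milliseconds).

    EEGcue = pop_epoch(EEG, {'5'}, [-0.2 0.5]);     % epoch around the cue onset
    EEGcue = pop_rmbase(EEGcue, [-200 0]);          % baseline: -0.2 s to cue onset
    EEGtarget = pop_epoch(EEG, {'6'}, [-0.5 0.5]);  % epoch around the target onset
    EEGtarget = pop_rmbase(EEGtarget, [-500 -300]); % baseline: cue-retention period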
[Figure 3.5: Plots for pre-processed, extracted epochs]

However, when the pre-processed data generated from the second round of pre-processing was evaluated, it was found to be faulty. The timing between the two events in the EEG data appeared to be doubled when compared with the eye-tracker data for the same two events. This is explained in detail in the evaluation section in Chapter 5. This mismatch of time between the EEG data and the eye-tracker data was found to be due to a wrong choice of bit representation, and a third round of pre-processing was done to confirm this. Due to limitations in the time available, however, the pre-processing could not be repeated for all the subjects; only the data for 2 subjects was pre-processed, to make sure the right technique was used.

3.4.4 Iteration 3 of Pre-processing

In the third iteration of the pre-processing, the bit representation was changed to a 32-bit representation, unlike the 16-bit representation used in the first two iterations. The raw signal now looked more reasonable in terms of its similarity to an EEG signal, as shown in figure 3.6. Besides, the timing between events in the eye-tracker data now matched the timing in the signal plot.

[Figure 3.6: Raw signal using a 32-bit representation]

The data was then re-sampled to 250 Hz (figure 3.7(a)) and the baseline was removed as in the second iteration (figure 3.7(b)). Next, we re-referenced the EEG measure to an average reference. As the re-sampling, baseline removal and re-referencing were carried out, some electrode activations appeared which were not present at the earlier stage; this could probably only be explained with the guidance of the EEG analysts. Next, we filtered out the noise and artifacts with frequency below 1 Hz (figure 3.8). Then, we extracted the epochs for analysis from the filtered EEG signals. The time intervals and the baseline periods chosen for the events were the same as in the second iteration. After extracting the epochs for the two events, the data looked as in figures 3.9(a) and 3.9(b). Although the filtered EEG signals look more or less smooth, after baseline removal and extraction of epochs we can clearly see variations in the activity. As before, the lower parts of the plots show huge deflections in the electrodes HEOG and VEOG due to eye movements; these have to be removed. Next, we select the epochs for each of the 12 conditions and group them together. Each of these steps has to be done on each block for each of the 11 subjects. However, due to limited time, this could be done only for two subjects.
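In EEGLAB terms, the only change with respect to the earlier iterations is the data format passed to the loader (file name again hypothetical); the remaining steps are the same as in iteration 2.

    EEG = pop_loadcnt('subject01.cnt', 'dataformat', 'int32'); % 32-bit load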
[Figure 3.7: Plots for data after re-sampling and baseline removal]

[Figure 3.8: Re-referenced and filtered data]

[Figure 3.9: After epoch extraction]

3.5 Analysis

In order to analyse the EEG data, we use a technique called Independent Component Analysis (ICA), which identifies the component activations that contribute most to an EEG signal and the most likely source of each component. EEGLAB has an in-built tool for performing ICA on the data.

3.5.1 Independent Component Analysis (ICA)

Independent Component Analysis, as the name suggests, is a technique used to identify the various components present in a signal. It can be applied to different types of signals, such as sound or speech signals, or electrical signals such as EEG signals. As discussed in [15], a sound or electrical signal is most often composed of several underlying signals from various sources: the observed signal is a mixture of these 'source signals'. ICA splits this single signal into the different source signals. In order to do this, it assumes that the different source signals which form the observed signal are independent of each other, i.e. the value of one source signal bears no relation to the value of any other source signal at any given point in time. This is explained clearly in [15] using an example. Consider a speech signal obtained from the combination of two speech signals from two different people, within the same time frame. The value of one person's speech signal at a point in time is independent of the value of the other person's speech signal. ICA uses this independence to split the observed speech signal back into the two speech signals from the two independent sources.
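The principle can be illustrated with a toy example, sketched below under the assumption that EEGLAB's runica() function is on the MATLAB path; the two sources and the mixing matrix are invented purely for illustration.

    t  = 0:0.001:1;
    s1 = sin(2*pi*5*t);        % "speaker 1": a 5 Hz sinusoid
    s2 = sign(sin(2*pi*13*t)); % "speaker 2": a 13 Hz square wave
    A  = [0.7 0.3; 0.4 0.6];   % unknown mixing matrix
    X  = A * [s1; s2];         % the two observed mixture signals
    [w, sph] = runica(X);      % estimate the unmixing transform
    U  = w * sph * X;          % recovered sources (up to order and scale)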
The same principle can be applied to the EEG signal. The EEG signal is essentially a mixture of signals from several neuronal populations. If these populations are considered to be independent sources, the EEG signal can be split to recover the source signal from each of these populations, and thus the important components and their sources can be identified. In EEGLAB, these components can be visualised as scalp maps or as ERPs. As suggested in [12], the signal-to-noise ratio of an EEG waveform is very low, because the electrical activity produced by the brain is very small compared to the background noise. In order to reduce the amount of noise, we generate a grand average waveform for each condition across all subjects and then perform ICA on the resulting EEG signal. Thus, the data for the different subjects for each condition was appended to form a single set and ICA was performed on this. In the second iteration, this was done by considering all 11 subjects; in the third iteration, due to limited time, the grand average waveform was generated using two subjects before ICA was performed. To generate the grand average waveform, the data for each of the twelve conditions, irrespective of the subject, was merged together. For instance, for condition C1, the epochs for all subjects and all blocks were merged into a single set. This was done for all 12 conditions, generating 12 different data sets. The next step was to group all the 'C' conditions (i.e. C1, C2, C3, C4) into a single condition 'C'; similar data sets were created for 'L' and 'S'. We thus have one data set for each of the three main conditions, and ICA can be performed on these data sets. Once ICA has been performed, we can generate plots and scalp maps which compare the component activations for the different conditions. This can give us a rough idea about the sources of brain activity in the different conditions: colour, shape and location. For example, figure 3.10 shows the component activations for a visual search task where the subject is asked to look for a target object which has the same shape as the cue object, from an array of objects. These are the most prominent components in the average EEG signal for this condition. These, along with the scalp maps for the different components, can be analysed by EEG experts to identify the source of each component. The scalp map for the first component is given in figure 3.11. Figure 3.12 shows the channel activations for the same condition. In the figure, the VEOG channel shows a huge deflection; a zoomed-in picture of the VEOG electrode in figure 3.13 shows this. This indicates the presence of eye artifacts, which have to be removed. Figure 3.14 shows the comparison of component activations for two different conditions, colour and shape. By looking at the plots generated, we can verify whether they suggest a pathway of processing of signals in the brain as suggested by the neural model; this can thus help verify the binding-by-retinotopy hypothesis put forward by the model. Similarly, by looking at the plots for each of the three conditions, we can analyse how the brain activity differs in each condition. The analysis of these plots will, however, have to be done by the EEG experts. At this stage, nothing more can be done, since the plots generated are based on the averaged data of only two participants.
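For reference, the merging and decomposition steps described above might be sketched in EEGLAB as follows; this assumes the per-condition epoch sets have already been loaded into ALLEEG, and the dataset indices are hypothetical.

    EEG = pop_mergeset(ALLEEG, [1 2 3 4]);      % e.g. merge C1-C4 into one 'C' set
    EEG = pop_runica(EEG, 'icatype', 'runica'); % decompose into independent components
    pop_topoplot(EEG, 0, 1:8);                  % scalp maps of the first 8 components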
An average over only two participants is most likely to be very noisy; a proper analysis can only be done after the pre-processed data for all 11 participants has been averaged.

[Figure 3.10: Component activations for target selection based on shape]

[Figure 3.11: Scalp map for the first component]

[Figure 3.12: Channel activations for target selection based on shape]

[Figure 3.13: Channel activations for the VEOG electrode]

[Figure 3.14: Comparison of activations for the shape and colour conditions]

Chapter 4 The Model

4.1 Overview

A neural model for visual processing based on feature-based attention was put forward in [1] to demonstrate how the mechanism of deployment of visual attention is actually implemented in the visual cortex. It shows, specifically, how information flows along the ventral stream during a visual search task much like the task in the Chelazzi experiment described in Chapter 2. The model proposes that the binding of the different features of an object is done on the basis of retinotopic location. It was developed using MIIND [16], a neuroscience framework made by Dr. Marc de Kamps and David G. Harrison for implementing artificial neural networks. CLAMVis, a software simulator for the neural models, was also implemented by them, and this tool was used to visualise the information flow in the model. Figure 4.1 gives an idea of the information flow in the model. The ventral stream in the model is represented as a hierarchy of 5 layers of visual areas: V1, V2, V4, PIT and AIT, in that order. The dorsal stream consists of the layers V1, V2, V4 and the parietal cortex (indicated as PG in the figure). The ventral stream is concerned with the function of object recognition. When the array of objects is displayed, the stimulus activations are communicated up the hierarchy from V1. As we go up the layers in the hierarchy, the complexity of the stimuli identified by the cells increases. Thus, whereas the V1 layer is selective only to simple lines of particular orientations, V2 responds to a set of two lines forming a slightly more complex feature. At the topmost level, AIT, the individual objects are recognised; here, the target object is identified from the array of objects. However, this raises another problem. The lower layers in the ventral stream are retinotopically organised, but this property is lost as we go up the hierarchy, and hence AIT does not have position information for the target object. It provides a translation-invariant recognition of objects, i.e. AIT responds to the object it is selective for irrespective of its position in the visual field.

[Figure 4.1: The model (Image from [17])]
The position information of the target object is, however, necessary to make an eye movement (saccade) to the target. This information is obtained by the interaction of the bottom-up activation caused in the model by the stimulus and the top-down activation caused by the attentional template in the ventral stream. The position information can then be transferred to the LIP (lateral intraparietal) area of the parietal cortex, where the necessary actions to make the saccade are taken. In order to model these bottom-up and top-down activations, a feedforward and a feedback network are used, respectively.

4.2 The Feedforward Network

The feedforward network used to model the bottom-up processing is shown in figure 4.2. The bottom-up processing comes from the activations caused by the presence of the stimulus in the visual field. For example, if a square, a cross, a circle and a triangle are present in the visual field, the square causes activations in some V1 neurons, which leads to the activation of some V2 neurons, and so on until the AIT neuron selective for squares is activated. Similarly, each of the other three objects produces its own activations in the model. The receptive field size increases as we go up the feedforward network. This is modelled by having one V2 neuron see a 2x2 submatrix of V1 neurons, a V4 neuron see a 4x4 submatrix of V1 neurons, and so on, as indicated in [17].

[Figure 4.2: The Feedforward Network (Image from [18])]

Figure 4.3 shows a simulation of the working of the feedforward network when four objects (a square, a diamond, a cross and a diagonal cross) are presented on the display screen during a visual search task. The simulation was generated using CLAMVis. The four layers at the top of the figure are the V1 layers. There are four V1 layers, each of which is selective to lines of a particular orientation. For example, the first layer in the figure is selective to horizontal lines and the third layer is selective to vertical lines; the other two layers are selective to diagonal lines. The second layer from the top represents V2, the third layer represents V4, the fourth layer represents PIT, and the last layer of 4 nodes represents the 4 AIT neurons selective to the four objects. The neurons with high positive activation (+1) are indicated in red and those with high negative activation (-1) are indicated in green.

[Figure 4.3: Simulation of the Feedforward Network]

In the actual visual system, the information that enters the eye is taken via the optic nerve to the Lateral Geniculate Nucleus (LGN) and then to the visual cortex. However, the LGN is not part of the model. Thus, in order to indicate the presence of four stimuli or objects in the visual field, we activate the corresponding layers of V1. To indicate the presence of a square in the top-left corner of the display, we activate the neurons of layers 1 and 3 of V1 at the corresponding locations. Similarly, neurons representing a diamond in the top-right corner, a cross in the bottom-right corner and a diagonal cross in the bottom-left corner are activated. As the simulation progresses, these neurons in turn activate other neurons in the layers V2, V4 and PIT. As a result of the stimulus activations due to the four objects, the four AIT neurons selective to these objects are activated, and hence they are indicated in red.
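The growing receptive fields described in this section can be sketched schematically; max-pooling over non-overlapping blocks is an assumption standing in for the model's actual connectivity, and the stimulus is invented for illustration.

    v1 = zeros(16); % one orientation-selective V1 layer (vertical lines)
    v1(2:4, 3) = 1; % a short vertical line in the top-left corner
    % max over non-overlapping 2x2 blocks of the layer below
    p = @(a) max(max(a(1:2:end,1:2:end), a(2:2:end,1:2:end)), ...
                 max(a(1:2:end,2:2:end), a(2:2:end,2:2:end)));
    v2  = p(v1); % 8x8: each V2 unit sees a 2x2 submatrix of V1
    v4  = p(v2); % 4x4: each V4 unit sees a 4x4 submatrix of V1
    pit = p(v4); % 2x2: each PIT unit covers a quadrant of V1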
4.3 The Feedback Network

The feedback network is essentially the reverse of the feedforward network, i.e. it starts at AIT and ends at V1. There are feedback connections from the higher layers to the lower layers of the feedforward network; the connections in the feedback network thus have what is called a fan-out structure, as shown in figure 4.4. The purpose of the feedback network is to provide the attentional template that aids in choosing the target object from the array. The target object has been identified by the bottom-up activation in the feedforward network. By providing feedback connections, therefore, we provide identity information about the target object to the lower visual areas, such as V1, which have retinotopic organisation.

[Figure 4.4: The Feedback Network (Image from [18])]

Figure 4.5 shows a simulation of the feedback network. Since the structure of the feedback network is the inverse of the feedforward network, the first layer in figure 4.5 consists of the four AIT neurons. These are followed by PIT, V4, V2 and the four V1 layers. When the feedforward activations have been produced, the attentional template is indicated in the feedback network by selecting the AIT neuron selective to the target object. This AIT neuron is indicated in red in the first layer at the top of the image. As the simulation proceeds, this activation produces activations in the layers PIT, V4, V2 and the four V1 layers. The activations in the other layers, indicated in red and green in the figure, are thus due to the attentional template.

[Figure 4.5: Simulation of the Feedback Network]

4.4 Dynamic Networks

As suggested in [18], modelling the nodes in the neural network as individual neurons is probably not biologically plausible. If the model is to explain how visual processing actually takes place in the brain, it should provide for quick processing and response and for low firing rates in the neurons, as observed in its biological counterparts. In accordance with this, we can consider a perceptron to represent the neural activity of a population of neurons rather than that of a single neuron [18]. This, however, may lead to spurious activity in the higher layers of the model, i.e. the higher layers may show activity even when no stimulus is present. Consider a node in a neural network. The output of a node is given by equation (4.1), where each $x_i$ is an input to the node and each $w_i$ is the weight of the corresponding connection:

$$\mathrm{output} = f\Big(\sum_i w_i x_i - \theta\Big) \qquad (4.1)$$

Here, the function $f$ is called the squashing function of the neural network. $f$ is usually the sigmoid function, given in equation (4.2):

$$f(x) = \frac{1}{1 + e^{-\beta x}} \qquad (4.2)$$

If this is replaced by a new squashing function of the same form, given in equation (4.3), the spurious activity in the network can be done away with:

$$f(x) = \frac{2}{1 + e^{-\beta x}} - 1 \qquad (4.3)$$
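The difference between the two squashing functions can be seen in a short sketch (beta is set to 1 purely for display): the standard sigmoid gives f(0) = 0.5, so a node with zero net input still produces output, whereas the rescaled function gives f(0) = 0.

    beta = 1;
    x  = linspace(-5, 5, 201);
    f1 = 1 ./ (1 + exp(-beta*x));     % the sigmoid of equation (4.2)
    f2 = 2 ./ (1 + exp(-beta*x)) - 1; % the rescaled function of equation (4.3)
    plot(x, f1, x, f2);
    legend('sigmoid (4.2)', 'rescaled (4.3)');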
This new squashing function, however, causes a perceptron to produce negative activations at times. To solve this problem, we can replace each perceptron by a circuit, as shown in figure 4.6.

[Figure 4.6: The Circuit (Image from [18])]

The circuit is made up of the excitatory populations P, N, Ep and En; Ip and In represent inhibitory populations. The connections between these populations are either inhibitory (indicated by black triangles) or excitatory (indicated by white triangles). An inhibitory connection tries to suppress the activity of the neuron to which it is connected, while an excitatory connection tries to raise its activity. If, for example, we consider the scenario where the input labelled Jp is active, the populations Ep and Ip are activated. This causes the population P to be activated, due to the excitatory connection from Ep to P, and the population N to be inhibited, due to the inhibitory connection from Ip to N. If no input is present at Jn, this results in a net 'positive' activation from the entire circuit. Similarly, if input is present at Jn instead of Jp, it leads to a net 'negative' activation from N. The positive or negative activations, however, only serve to indicate which population is dominant in the circuit and are basically represented as spike rates of populations. The output from the circuit at any point in time is either positive or negative; it cannot be both at the same time. Each of the nodes in the feedforward and feedback networks is thus replaced by the circuit shown in figure 4.6, and this produces what are called the dynamic feedforward and feedback networks.

4.5 The Disinhibition Network

The disinhibition network is where the interaction between the top-down activation due to the attentional template and the bottom-up activation due to the stimuli takes place. A detailed description of the working of the disinhibition circuit is given in [19]. The disinhibition network is essentially not an artificial neural network, but a simple circuit formed from the shaded neurons shown in figure 4.7. The nodes labelled Pf and Nf indicate the positive and negative activations from a single node in the dynamic feedforward network, and Pr and Nr represent the corresponding activations from the feedback network. Gp and Gn are gating nodes which inhibit the excitatory nodes Ep and En; they are in turn inhibited by the inhibitory nodes Ip and In. The nodes Ep and En send their output to the lateral intraparietal area (LIP).

[Figure 4.7: The Disinhibition Network (Image from [19])]

Consider the case where Pf is active. It drives the nodes Ep and Gp through its excitatory connections. Although Gp inhibits Ep, this inhibition comes into play only after some time, since there is an extra connection along this path. Thus Ep initially produces some activation and then dies down once it starts receiving inhibitory signals from Gp. The case where Nf becomes active is similar. Now, consider the case where there is a match between the stimulus activation (bottom-up) and the attentional template (top-down) at a node. Then either the nodes Pf and Pr or the nodes Nf and Nr in the circuit will be active simultaneously. Take the case where Pf and Pr are active. As before, Pf drives Ep and Gp, which causes Ep to be active for a brief period of time before Gp inhibits its activity. However, now Pr is also active and drives the inhibitory node Ip, which in turn inhibits the gating node Gp. Thus Ep continues to be active and sends a positive activation to the LIP region. This is how feature matching is implemented in the model: all nodes which have matching top-down and bottom-up activity indicate the presence of matching features at that location, so the area with the maximum number of matches is a strong indication of the location of the target object. Similar dynamics are observed in the circuit when Nf and Nr are active at the same time.
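These gating dynamics can be caricatured with a discrete-time sketch; this is an illustrative toy, not the MIIND implementation, and the threshold-linear update rule and time base are assumptions. With bottom-up input alone, Ep fires transiently before Gp switches it off; when the feedback input Pr is also active, Ip suppresses Gp and Ep stays active.

    relu = @(a) max(a, 0);
    for Pr = [0 1]
        Pf = 1; Gp = 0; Ip = Pr; % Ip is driven directly by Pr
        fprintf('Pr = %d, Ep over time: ', Pr);
        for t = 1:5
            Ep = relu(Pf - Gp);  % Ep excited by Pf, gated by Gp
            Gp = relu(Pf - Ip);  % Gp excited by Pf, inhibited by Ip
            fprintf('%d ', Ep);
        end
        fprintf('\n');           % prints 1 0 0 0 0, then 1 1 1 1 1
    end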
4.6 Lateral Inhibition

The phenomenon of task-biased competition, as explained in Chapter 2, is implemented in the model using lateral inhibition. The node labelled LI in figure 4.7 serves to provide the lateral inhibition. When either Pf and Nr or Nf and Pr become active together at a node, we have a mismatch of features at that particular location. If, say, Pf and Nr are active together, the activation of Pf makes Ep active for a brief period, but after a while the inhibition from Gp makes it inactive. At the same time, the activation of Nr makes LI active, which in turn sends inhibitory signals to the neighbouring Ep and En nodes. Thus, in the case of a mismatch at a node, lateral inhibition ensures that the nodes in the vicinity of the mismatched node are made less active. This is done because the absence of a feature match at a node suggests that the neighbouring nodes will also be different. If the LI node were present alone, a feature match would also lead to lateral inhibition of the neighbouring nodes. In order to prevent this, a node labelled ILI is used. As indicated in figure 4.7, Ep is connected to ILI through an excitatory connection. When Pf and Pr are active together, a match occurs, resulting in Ep becoming active and activating ILI in turn. The ILI node has an inhibitory connection to the LI node and hence suppresses its activity. This makes sure that lateral inhibition of the neighbouring nodes does not occur when there is a match of features. A detailed account of how lateral inhibition is implemented in the model is given in [19]. As part of the project, I had to study variations of the circuits in the model. One such variation was to study the effect of removing the circuits which implement lateral inhibition in the network; this is indicated in figure 4.8. As shown in figure 4.8, on removing lateral inhibition, the output to the LIP layer could no longer indicate clearly the location of the target object. In figure 4.8, the figures in the first column indicate the activation in the feedforward network, representing the stimulus activations when the array of objects is displayed. The second column shows the activations in the feedback network due to the attentional template. The third column indicates the activity in the disinhibition network due to the feature matching of activations in the feedforward and feedback networks. The last column indicates the output in the LIP.

[Figure 4.8: Activations with and without Lateral Inhibition (Images from [19])]

When there is no lateral inhibition, the activity in the LIP layer is distributed among the four different locations and we cannot state without doubt the location of the target object. This is, however, not the case when lateral inhibition is present: here, there is activity clearly visible in the top-left corner, and we can clearly conclude that the target object is in the top-left corner. From this observation, we can conclude that lateral inhibition is necessary in the network. If we examine human behaviour in the visual search task, we can see that humans do not find it difficult to arrive at a decision about the location of the target object; they can clearly identify it. In this respect, human behaviour is similar to the model implemented with lateral inhibition. However, whether this behaviour is achieved using a similar mechanism of inhibiting the activity of the neighbours needs to be verified, and this can be done by analysing the EEG data and by generating a simulated EEG signal prediction from the model. This, again, is one reason why the experiment is deemed necessary.
Nevertheless, it is reasonable to conclude that lateral inhibition is necessary for the model to be able to explain human behaviour in the task: without lateral inhibition, the model cannot account for the observations made in humans.

4.7 Implementation of a traversal program

In order to map the model to the tool QTriplot, so as to generate an EEG signal prediction from the model, the following steps were involved:

1. Implement a program to read a simulation file and traverse the nodes in the feedforward, feedback and disinhibition networks.
2. Obtain relevant data regarding the connectivity of the various visual areas, such as V1 and V2, and the receptive field sizes of the neurons in each of these areas.
3. Map the model.

Accordingly, a program was implemented in C++ to read in a simulation file and traverse the networks. The simulation file, a '.root' file, was generated by the tool CLAMVis and had to be read into the program to perform the traversal. The ROOT package is an object-oriented framework which was developed to provide easier simulation and analysis of large amounts of data. The '.root' file generated by CLAMVis contains the simulation results, i.e. the activity values of the different nodes in the three networks: the disinhibition, feedforward and feedback networks. In order to implement the traversal program, the '.root' file was read into the program. The MIIND framework contains a class called SimulationResult, and the file is wrapped into an object of this type. The different networks of the model are implemented as C++ vector data types; the vector data type is basically a dynamic array. In order to traverse the elements of a vector one by one, a data structure called an iterator is used. C++ provides the in-built functions begin() and end(), which return iterators to the start and end of a vector, respectively. The networks are implemented as objects of a class DynamicSubNetwork, which provides iterators to traverse the network. There are two different types of iterators, ForwardSubNetworkIterator and ReverseSubNetworkIterator, which are used to traverse the two different types of networks, namely the feedforward and feedback networks. These are implemented as classes in MIIND. The ForwardSubNetworkIterator provides functions equivalent to the in-built C++ begin() and end() functions, returning iterators to traverse the feedforward network, while the ReverseSubNetworkIterator provides similar functions to traverse the feedforward network in reverse order, in effect traversing the feedback network. The disinhibition network is basically a feedforward network and hence can be traversed using the ForwardSubNetworkIterator. The implementation of this program required a study of the complex hierarchical structure of the MIIND libraries. As a result of this exercise, it is now possible to extract data out of the simulation files. Although a lot of reading was done on the connectivity data and receptive field sizes, not much relevant information was obtained from it. Besides, step 3 could not be implemented due to restrictions on time: it came to our notice that the implementation of the mapping alone would require much more time than was available, and hence the focus was shifted to studying the circuits in the model and to the analysis of the pre-processed data.

Chapter 5 Evaluation

The main outcome of the project was the pre-processed data and the plots generated after analysis.
Due to the absence of prior knowledge or experience in analysing the EEG signals and the scalp maps generated, expert advice had to be sought from the EEG analysts and experts at the School of Psychology. Evaluation of the pre-processed data was therefore done by conducting discussions with them. When the pre-processed data from the first round of pre-processing was examined, the sampling rate was found to be too high, and the EEG experts advised that a lower sampling rate would be more appropriate and would make further analysis easier. Accordingly, the sampling rate was lowered from 1000 Hz to 250 Hz. It was also under their advice that two separate epochs for the events 5 and 6 (cue and target onset) were extracted per subject. The epoch for event 5 (cue onset) would contain information about the neural activity before, during and after the visual processing of the stimulus. Similarly, the epoch for event 6 would provide information about how the neural activity varies when and after the array of objects is displayed (target onset). The two types of epochs could then be compared to identify the differences in neural activity. After the second round of pre-processing, the data was verified by looking at the eye-tracker data to see whether the various events matched in both cases. It was during this exercise that Claudia Gonzalez of the School of Psychology and I found that there was a timing mismatch between the events in the EEG signal and the eye-tracker data. For example, consider the plot in figure 5.1, which shows the raw signal re-sampled to 250 Hz for one of the subjects. According to the plot, the time between the two events 5 (indicated in red) and 6 (indicated in green) is 1.5 seconds. Figure 5.2 shows a plot of the eye movements over time for the same subject and the same trial.

[Figure 5.1: EEG signal for a subject using a 16-bit representation]

[Figure 5.2: EEG signal for a subject]

As shown in the figure, there is a huge deflection in the values of the eye position and velocity along the x and y coordinates. This is when the eye movement, or saccade, takes place. The DISPLAY CUE and DISPLAY TARGET time points in the figure represent the cue and target onset, respectively, with the cue onset chosen as the zero or starting point. Ideally, the time between cue onset and target onset should be the same as in the EEG signal, i.e. 1.5 seconds in this case. The eye-tracker data was looked up to verify this. The time difference between the two events was, however, found to be 0.751 seconds in the eye-tracker data, only half of the time difference in the EEG data. This meant that a period of 1 second in the EEG data actually corresponded to only 0.5 seconds of the original signal. The problem was then traced back to the bit representation used in EEGLAB when importing the data: the tool used a default representation of 16 bits and, due to a lack of knowledge about the format used to store the data, this default value was assumed to be appropriate. Figure 5.3 shows the same trial for the same subject using a 32-bit representation. The time interval between the two events was now found to be 0.75 seconds, matching the eye-tracker data.
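The effect of the wrong bit representation can be reproduced with a small, self-contained sketch (the file name and values are invented): data written as 32-bit integers but read back as 16-bit integers yields twice as many 'samples', so event latencies appear doubled, exactly as observed.

    fid = fopen('example.dat', 'w');
    fwrite(fid, int32(1:1000), 'int32'); % 1000 true 32-bit samples
    fclose(fid);
    fid = fopen('example.dat', 'r');
    wrong = fread(fid, Inf, 'int16');    % misread as 16-bit values
    fclose(fid);
    fprintf('true: 1000 samples, misread: %d\n', numel(wrong)); % prints 2000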
Although the data was sent to the EEG experts for verification during each stage of the second round of pre-processing, the problem could not be identified. It could have been averted if the eye-tracker data had been checked earlier.

[Figure 5.3: EEG signal for a subject using a 32-bit representation]

Besides this, the plots generated using the two bit representations were also compared. Figure 5.4 shows a comparison of the ERPs for the two conditions, 'L' and 'S', recorded for each of the 64 channels under the two representations. The window on the left shows the plot for the 16-bit representation, and it indicates a lot of variation between the two conditions; this extent of variation in activity is, however, not expected. Similarly, a plot comparing the component activations for the two conditions is shown in figure 5.5, where a similar result is observed.

[Figure 5.4: A comparison of the channel ERPs for conditions 'L' and 'S' using the two bit representations]

[Figure 5.5: A comparison of the component ERPs for conditions 'L' and 'S' using the two bit representations]

In order to evaluate the pre-processed data further, the pre-processing of all the subjects would have to be done again and the plots and scalp maps would have to be generated for each condition. The EEG experts can then look at the data and the maps and verify whether evidence for the visual processing pathway suggested by the model is present in the EEG data.

Chapter 6 Conclusion

The aim of this project was to verify the hypothesis of binding by retinotopic location by analysing the EEG data obtained from the experiment and by mapping the model of binding in [1] to a tool called QTriplot to generate a prediction of the EEG signal from the model. The hypothesis can then be verified by analysing the predicted EEG signal and comparing it to the EEG signal observed in the experiment. However, due to several hurdles during the progress of the project and due to restrictions in the time available, this could not be done completely. It is nonetheless possible to draw the following conclusions from the work that has been done.

• It can now be concluded that the pre-processing stage is not as trivial as it seems; it cannot be approached as a black box. The parameters for each stage of pre-processing have to be carefully chosen. It has also become clear that it is not possible to verify the correctness of the process until it has been completed entirely. For instance, it may not be possible to look at plots of the channel ERPs generated during the pre-processing and say whether the pre-processing is progressing correctly.

• It can be said with a fair amount of confidence that the appropriate parameters for pre-processing have finally been found. However, this will require another iteration for verification.

• The tool used for pre-processing and analysing the EEG data, EEGLAB, is probably not suited to someone inexperienced in working with EEG, and it is not very user-friendly, since it makes assumptions regarding factors such as the bit representation which, although seemingly unimportant, have a huge impact on the process.

In spite of the hurdles, a thorough study of the model, the experiment and the EEG data involved could be done.
A proper understanding of the rationale of the experiment has been obtained and the various theories about visual attention have been studied.

6.1 Further Work

Since not all of the work could be completed, further work on the project would consist of the following main tasks.

• The first step would be to complete the third iteration of pre-processing with the right parameters and to generate the plots required for analysis. These plots can then be analysed to determine whether a pattern of visual processing as suggested by the neural model can be found, and the brain activity in the different conditions can be compared. However, a proper strategy has to be worked out before comparisons between the conditions are made. For example, for the shape condition there is a set of component activations ordered according to their prominence in the signal, and similarly for the colour condition; it may not be rational to compare the most prominent component of the shape condition with that of the colour condition, as they may come from different sources. Besides, where eye artifacts are clearly present, these have to be removed before the analysis can be done. Hence, a detailed strategy has to be devised.

• The next step could be to conduct a further, detailed study of the connectivity patterns of the visual areas.

• The mapping of the model to QTriplot can then be done to generate a predicted EEG signal from the model.

• Another possible extension is to consider automating the pre-processing stage, since it is very tedious and time-consuming.

Bibliography

[1] Marc de Kamps and Frank van der Velde. Neural blackboard architectures: the realization of compositionality and systematicity in neural networks. Journal of Neural Engineering, 3:R1–R12, 2006.

[2] R. Desimone and J. Duncan. Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18:193–222, 1995.

[3] http://www-psych.stanford.edu/~lera/psych115s/notes/lecture3/figures.html.

[4] http://cueflash.com/decks/LESIONS_OF_THE_VISUAL_SYSTEM_-_50.

[5] S. Luck, L. Chelazzi, S. Hillyard, and R. Desimone. Effects of spatial attention on responses of V4 neurons in the macaque. Society of Neuroscience, 19, 1993.

[6] J. Moran and R. Desimone. Selective attention gates visual processing in the extrastriate cortex. Science, 229(4715):782–784, 1985.

[7] L. Chelazzi, E. K. Miller, J. Duncan, and R. Desimone. A neural basis for visual search in inferior temporal cortex. Nature, 363:345–347, 1993.

[8] J. C. Martinez-Trujillo and S. Treue. Feature-based attention increases the selectivity of population responses in primate visual cortex. Current Biology, 14(9):744–751, 2004.

[9] J. C. Martinez-Trujillo and S. Treue. Feature-based attention influences motion processing gain in macaque visual cortex. Nature, 399:575–579, 1999.

[10] J. H. R. Maunsell and S. Treue. Feature-based attention in visual cortex. Trends in Neurosciences, 29(6):317–322, 2006.

[11] Anne M. Treisman and Garry Gelade. A feature-integration theory of attention. Cognitive Psychology, 12:97–136, 1980.

[12] Jamie Ward. The Student's Guide to Cognitive Neuroscience. Psychology Press, 2006.

[13] Paul Bourke. Various EEG file formats and conventions. http://paulbourke.net/dataformats/eeg/.

[14] Arnaud Delorme, Hilit Serby, and Scott Makeig. EEGLAB Wikitorial. http://sccn.ucsd.edu/eeglab/eeglabtut.html.

[15] James V. Stone. Independent Component Analysis: A Tutorial Introduction. MIT Press, 2004.
[16] Marc de Kamps and V. Baier. Multiple interacting instantiations of neuronal dynamics (MIIND): a library for rapid prototyping of models in cognitive neuroscience. International Joint Conference on Neural Networks, 19:2829–2834, 2007.

[17] Marc de Kamps and Frank van der Velde. From knowing what to knowing where: modeling object-attention with feedback disinhibition of activation. Journal of Cognitive Neuroscience, 13(4):479–491, 2001.

[18] Marc de Kamps and Frank van der Velde. From artificial neural networks to spiking neuron populations and back again. Neural Networks, 14:941–953, 2001.

[19] Marc de Kamps and David G. Harrison. A dynamical model of feature-based attention with strong lateral inhibition to resolve competition among candidate feature locations. University of Leeds, 2011.

Appendix A Personal Reflection

This project has been an enlightening experience and it has given me valuable lessons for life. I have not had previous exposure to a research-oriented project; therefore this project, which is strongly based on research, gave me a deep insight into how research is to be conducted. My undergraduate project was a general software development project and it did not involve the amount of planning and background reading that went into this project. Although dividing the project period into different stages was done during my previous projects, it was never planned to this level of detail. Due to this inexperience, I initially had difficulties in planning my work. The situation was no different in the case of background reading. I have never had to do such extensive reading in the past to gain an understanding of a problem. I found it difficult to find articles relevant to my problem, and sometimes I even spent days reading irrelevant articles or trying to find relevant ones. My supervisor helped me a lot in coping with these difficulties and advised me to make use of tools such as the Web of Science and Google Scholar. The level of detailed reading required was another aspect I learnt from the project. The method of reading I was used to was to read through every single detail of an article; this became increasingly difficult as the amount of reading grew. Thus, I had to learn to use speed-reading techniques such as skimming. I also learnt to use mind maps to organise my ideas, and this helped me a lot in writing up my report. During the implementation phase of the project (which mainly consisted of pre-processing the EEG data), I was constrained by the fact that I was heavily dependent on the EEG experts at the School of Psychology for feedback on the pre-processing stage; I could not proceed to analyse the EEG data without their approval. Although they gave feedback on the changes to be made after the first iteration of pre-processing, they could not identify a major error in the second round, which was discovered quite late in the project. It could thus be concluded that such errors cannot be detected at such an early stage. Besides this, I did not have a choice in the tools I could use to solve the problem: the tool used for pre-processing and analysing the EEG data, EEGLAB, was suggested by the EEG experts themselves. The implementation of a program to traverse the neural model using a simulation file was one part where I could work independently. However, the tool to be used, MIIND, was not well documented, and I had to struggle to understand the class hierarchy of the framework.
Similarly, working with the simulation tool CLAMVis was also not trivial; merely setting up these tools took a lot of time and effort. Here, however, I received a lot of help from my supervisor, Dr. Marc de Kamps, and from David G. Harrison of the Biosystems group. I can never thank them enough for all the support they have given me. The initial aim of implementing a mapping in QTriplot could not be achieved due to restrictions in time. This was probably because a proper understanding of the capabilities of the tool was not attained. Although this did worry me a little in the beginning, my supervisor convinced me that this was not the end of the road. To summarise, I would definitely say that I had a wonderful experience working on this project. Not everything in the project worked out as planned and there were major setbacks in its implementation. However, I have learnt a lot from this experience. I have learnt that research is not always fruitful and that perseverance is the only path to success, not only in research but also in life in general. When we face hurdles, we often have to come up with alternative solutions which may not always be acceptable to us. Besides this, I have now realised that I need to work more on my planning and organising skills. I am grateful to my supervisor, Dr. Marc de Kamps, for having faith in my abilities whenever things went wrong in the project and for the unwavering support he has given me throughout. He has gone beyond his responsibility as a mere supervisor and has been more of a mentor to me. I am extremely thankful to David G. Harrison for the countless times he came to my assistance when I had trouble with tools like CLAMVis and EEGLAB. In spite of it not being his obligation, right from the start of my project he has been a great help every step of the way. This project would not have been possible without their valuable guidance.

Appendix B Interim Report

The Interim Report is submitted along with the Project Report as hard copy.

Appendix C Schedule

The schedules created during the project are shown here. Figure C.1 shows the Initial Schedule and figure C.2 shows the Revised Schedule.

[Figure C.1: Initial Schedule]

[Figure C.2: Revised Schedule]

Appendix D Tools and Resources used

The list of tools and resources required includes MATLAB and EEGLAB. The software tools necessary to run the model are MIIND and CLAMVis and their Linux dependencies. With the help of my supervisor, Dr. Marc de Kamps, and with assistance from David G. Harrison of the Biosystems Group, all the necessary tools were installed. The EEG data required for the project was obtained from Melanie Burke, Claudia Gonzalez and Jean-Francois Delvenne of the School of Psychology and from David G. Harrison of the School of Computing.