NAIST-IS-DD1061022

Doctoral Dissertation

Neural Decoding of Visual Dream Contents

Tomoyasu Horikawa

December 13, 2013

Department of Bioinformatics and Genomics
Graduate School of Information Science
Nara Institute of Science and Technology

A Doctoral Dissertation submitted to the Graduate School of Information Science, Nara Institute of Science and Technology in partial fulfillment of the requirements for the degree of Doctor of Science.

Thesis Committee:
Professor Kazushi Ikeda (Supervisor)
Professor Yuji Matsumoto (Co-supervisor)
Associate Professor Tomohiro Shibata (Co-supervisor)
Professor Mitsuo Kawato (Co-supervisor)
Professor Yukiyasu Kamitani (Co-supervisor)

Neural Decoding of Visual Dream Contents*

Tomoyasu Horikawa

Abstract

Dreaming is a subjective experience during sleep often accompanied by vivid visual contents. Previous research has attempted to link physiological states with dreaming but has not demonstrated how specific visual dream contents are represented in brain activity. The recent advent of machine learning-based analysis has allowed for the decoding of stimulus- and task-induced brain activity patterns to reveal visual contents. Here, we extend this approach to decode spontaneous brain activity associated with dreaming, with the assistance of lexical and image databases. We measured the brain activity of sleeping human subjects using fMRI while monitoring sleep stages by EEG. Subjects were awakened when a specific EEG pattern was observed during the sleep-onset (hypnagogic) period. They gave a verbal report on the visual experiences just before awakening and then returned to sleep. The words describing visual contents were extracted and grouped into 16-26 categories defined in the English lexical database WordNet for systematic labeling of dream contents. Decoders were trained on fMRI responses to natural images depicting each category, and then tested on sleep data. Pairwise and multilabel decoding revealed that accurate classification, detection, and identification of dream contents could be achieved with the higher visual cortex, with semantic preferences of individual areas mirroring known stimulus representations. Our results demonstrate that specific dream contents are represented in activity patterns of visual cortical areas, which are shared by stimulus perception. Our method uncovers contents represented by brain activity not induced by stimulus or task, which could provide insights into the functions of dreaming and spontaneous neural events.

Keywords: Neural decoding, fMRI, dream, multivariate pattern analysis

* Doctoral Dissertation, Department of Bioinformatics and Genomics, Graduate School of Information Science, Nara Institute of Science and Technology, NAIST-IS-DD1061022, December 13, 2013.

Contents

1. Introduction  1
2. Methods  6
   2.1 Subjects  6
   2.2 Prior instructions to subjects  6
   2.3 Sleep adaptation  6
   2.4 Sleep experiment  7
   2.5 MRI acquisition  9
   2.6 PSG recordings  9
   2.7 Offline EEG artifact removal and sleep-stage scoring  10
   2.8 Visual dream content labeling  10
   2.9 Visual stimulus experiment  14
   2.10 Localizer experiments  14
   2.11 MRI data preprocessing  16
   2.12 Region of interest (ROI) selection  17
   2.13 Decoding analysis  22
   2.14 Synset pair selection by within-dataset cross-validation  24
3. Results  25
   3.1 Behavioral results of sleep experiments  25
   3.2 Dream contents decoding  34
      3.2.1 Pairwise classification analysis  34
      3.2.2 Multilabel decoding analysis  51
4. Discussion  59
References  63
Acknowledgements  70
Appendix  71
A. Supplementary results  71

List of Figures

1 REM and sleep-onset sleeping  3
2 Experimental overview  5
3 Base synsets selection  12
4 Visual content vectors  13
5 Visual stimulus experiment design  15
6 Functionally defined regions of interest on the flattened cortex  19
7 Inflated view of the anatomically defined regions of interest  21
8 Schematic overview of the pairwise classification analysis  23
9 Schematic overview of the multilabel decoding analysis  23
10 Time course of theta power  26
11 Awakening statistics  28
12 Time course of sleep state proportion  29
13 Distributions of pairwise decoding accuracy for stimulus-to-dream decoding  35
14 Within dataset cross-validation decoding  37
15 Representational similarity analysis  39
16 Decoding with averaged vs. multivoxel activity  40
17 Mean accuracies for the pairs within and across meta-categories  42
18 Mean accuracies for the samples from each sleep state  43
19 Pairwise decoding accuracies across visual cortical areas  44
20 Time course of pairwise decoding accuracy  45
21 Stimulus-to-stimulus decoding accuracy on whole cortical areas for the three subjects  47
22 Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 1  48
23 Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 2  49
24 Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 3  50
25 ROC analysis for the three subjects  52
26 AUC averaged within meta-categories for different visual areas  54
27 Synset score time course  56
28 Identification analysis  58
29 Distribution of pairwise decoding accuracies  72
30 Stimulus-to-stimulus pairwise decoding  73
31 Dream-to-dream pairwise decoding  74
32 Decoding with averaged vs. multivoxel activity for individual subjects  75
33 Mean accuracies for the samples from each sleep state for individual subjects  76
34 Pairwise decoding accuracies across visual cortical areas  77
35 Time course of pairwise decoding accuracy  78
36 Examples for the time courses of synset scores  79
37 Time courses of averaged synset scores for each subject  80
38 Identification performance for individual subjects  81

List of Tables

1 Examples of Verbal Reports  27
2 List of Base Synsets for Subject 1  31
3 List of Base Synsets for Subject 2  32
4 List of Base Synsets for Subject 3  33

1. Introduction

Dreaming is a subjective experience during sleep often accompanied by vivid visual contents. Due to its fundamentally subjective nature, the objective study of dreaming has been challenging. However, since the discovery of rapid eye movement (REM) during sleep, scientific knowledge on the relationship between dreaming and physiological measures, including brain activity, has accumulated. Although dreaming has often been associated with the REM sleep stage, recent studies have shown that dreaming can be experienced during non-REM periods (Hobson and Stickgold, 1994; Solms, 1997; Takeuchi et al., 2001; Baylor and Cavallero, 2001; Hori et al., 1994; Cavallero et al., 1992; Foulkes and Vogel, 1965; Foulkes, 1962; Nir and Tononi, 2010), and much research has been conducted to link various aspects of dreaming with physiological and behavioral measures during sleep. Those studies have reported relations between dreaming and specific patterns of polysomnography (PSG; electroencephalography (EEG), electrooculogram (EOG), and electromyography (EMG)) (Aserinsky and Kleitman, 1953; Hori et al., 1998; German and Nielsen, 2001; Palagini et al., 2004), specific types of behavior observed during sleep (Gugger and Wagner, 2007), and changes of brain activity in several regions, including activations in multisensory areas (Hong et al., 2009), visual cortical areas (Maquet et al., 1996; Maquet, 2000; Miyauchi et al., 2009), and the hippocampus (Wilson and McNaughton, 1994). However, none has demonstrated how specific visual dream contents are represented in brain activity.

The advent of machine learning-based analysis allows for the decoding of stimulus- and task-induced brain activity patterns to reveal visual contents (Haxby et al., 2001; Cox and Savoy, 2003; Kamitani and Tong, 2005, 2006; Polyn et al., 2005; Miyawaki et al., 2008; Stokes et al., 2009; Reddy et al., 2010; Harrison and Tong, 2010; Albers et al., 2013).
Those studies have demonstrated that not only the visual contents explicitly presented to subjects (Haxby et al., 2001; Cox and Savoy, 2003; Kamitani and Tong, 2005, 2006; Miyawaki et al., 2008), but also subjective visual contents, such as attended visual features (Kamitani and Tong, 2005, 2006), visually imagined shapes (Stokes et al., 2009), imagined object categories (Reddy et al., 2010), and orientations maintained in working memory (Harrison and Tong, 2010; Albers et al., 2013), can be read out from brain activity patterns using decoders trained on stimulus-induced brain activity patterns. The results of these studies suggest that similar experiences may be represented by similar brain activity patterns. If this is true, because dreaming often contains vivid visual contents, we can expect that decoders trained on stimulus-induced brain activity can predict dream contents given brain activity patterns during sleep. However, dreaming is a phenomenon spontaneously generated by the brain during sleep, and it is not yet clear whether the neural representational similarity observed between perception and various forms of imagery during wakefulness (Kamitani and Tong, 2005, 2006; Polyn et al., 2005; Stokes et al., 2009; Reddy et al., 2010; Harrison and Tong, 2010; Albers et al., 2013) generalizes to the similarity between perception and visual phenomena during sleep.

Here, we extend the decoding approach to spontaneous brain activity during sleep, and examine whether we can read out visual dream contents from brain activity patterns in human visual cortex measured by functional magnetic resonance imaging (fMRI) during sleep. This approach provides a direct way to establish a link between dreaming and brain activity observed during sleep. There were two major challenges to this approach. First, although several previous studies have shown that dream contents can be affected by waking experience or conditioning (Stickgold et al., 2000; Wamsley et al., 2010), it is generally difficult to experimentally control dream contents. We instead let the subjects sleep and dream without pre-conditioning and freely describe the contents after awakening. Reports were analyzed with the assistance of a lexical database, WordNet (Fellbaum, 1998), which was used to create systematic descriptions of dream contents. The second challenge was the difficulty in collecting a large amount of dream data. Although dreaming has often been associated with the REM sleep stage (Aserinsky and Kleitman, 1953; Dement and Kleitman, 1957; Dement and Wolpert, 1958; Hobson, 2009), since it takes at least 1 hour to enter the first REM stage, REM dreams are not suitable for collecting sufficient data for quantitative evaluation with decoding analysis. Recent studies have demonstrated that dreaming is dissociable from REM sleep and can be experienced during non-REM periods (Hobson and Stickgold, 1994; Solms, 1997; Takeuchi et al., 2001; Baylor and Cavallero, 2001; Hori et al., 1994; Cavallero et al., 1992; Foulkes and Vogel, 1965; Foulkes, 1962; Nir and Tononi, 2010), and reports at awakenings in sleep-onset and REM periods share general features such as frequency, length, and contents, while differing in several aspects including the affective component (Foulkes and Vogel, 1965; Vogel et al., 1972; Oudiette, 2012).
Here, we focused on visual dreaming experienced during the sleep-onset (hypnagogic) period (sleep stage 1 or 2) (Hori et al., 1994; Stickgold et al., 2000; Foulkes and Vogel, 1965), which made it possible to collect many observations by repeating awakenings and recording subjects' verbal reports of visual experience (Fig. 1).

Figure 1. REM and sleep-onset sleeping. Sleep state measurements reveal about 90-minute cycles of REM and non-REM (NREM) sleep (the red/blue lines indicate periods of REM/NREM sleep). Since the discovery of REM sleep (Aserinsky and Kleitman, 1953), dreaming has often been associated with the REM sleep stage. However, reports of dreaming are also common from NREM sleep stages, including sleep stages 1 and 2. While there are differences in several aspects of reports obtained at awakenings in sleep-onset and REM periods, they share general features such as frequency, length, and contents.

In this thesis, we present a neural decoding approach in which machine learning models predict the contents of visual dreaming during the sleep-onset period given measured brain activity, by discovering links between human fMRI patterns and verbal reports with the assistance of lexical and image databases (Fig. 2). We hypothesized that contents of visual dreaming are represented at least partly by visual cortical activity patterns shared by stimulus representation. Thus we trained decoders on brain activity patterns induced by viewing natural images collected from web image databases, and tested them on brain activity during sleep. Decoding models trained on stimulus-induced brain activity in higher visual cortical areas achieved accurate classification, detection, and identification of contents. Our findings demonstrate that specific visual experience during sleep is represented by brain activity patterns shared by stimulus perception, providing a means to uncover subjective contents of dreaming using objective neural measurement.

Figure 2. Experimental overview. fMRI data were acquired from sleeping subjects simultaneously with PSG. Subjects were awakened during sleep stage 1 or 2 (red dashed line) and verbally reported their visual experience during sleep. fMRI data immediately before awakening (average of three volumes [= 9 s]) were used as the input for the main decoding analyses (sliding time windows were used for time course analyses). Words describing visual objects or scenes (red letters) were extracted. The visual contents were predicted using machine learning decoders trained on fMRI responses to natural images.

2. Methods

The study protocol was approved by the Ethics Committee of ATR.

2.1 Subjects

Potential subjects answered questionnaires regarding their sleep-wake habits. Usual sleep and wake times, regularity of the sleep habits, habits of taking a nap, sleep complaints, regularity of lifestyle (e.g., mealtime), physical and psychiatric health, and sleeping conditions were checked. Anyone who had physical or psychiatric diseases, was currently receiving medical treatment, or was suspected of having a sleep disorder was excluded.
People who had a habit of drinking alcoholic beverages before sleep or smoking were also excluded. Finally, three healthy subjects (Japanese-speaking male adults, aged 27-39 years) with normal visual acuity participated in the experiments. All subjects gave written informed consent for their participation in the experiment.

2.2 Prior instructions to subjects

From three days prior to each experiment, subjects were instructed to maintain their sleep-wake habits, i.e., daily wake/sleep time and sleep duration. They were also instructed to refrain from excessive alcohol consumption, unusual physical exercise, and taking naps from the day before each experiment. Their sleep-wake habits were monitored by a sleep log. No subject was chronically sleep deprived, and all slept over 6 hours on average the night before experiments.

2.3 Sleep adaptation

Subjects underwent two adaptation sleep experiments before the main fMRI sleep experiments to get used to sleeping in the experimental setting (Agnew et al., 1966; Tamaki et al., 2005). The adaptation experiments were conducted using the same procedures as the fMRI sleep experiments except that real fMRI scans were not performed. The experimental environment was simulated using a mock scanner consisting of the shell of a real scanner without the magnet. Echo-planar imaging (EPI) acoustic noise was also simulated and played to the subject via speakers.

2.4 Sleep experiment

Sleep (nap) experiments were carried out from 1:00 pm until 5:30 pm, and were scheduled to include the mid-afternoon dip (Monk et al., 1996). Subjects were instructed to sleep if they could, but not to try to sleep if they felt they could not. This instruction was given to reduce psychological pressure toward sleeping, because efforts to sleep may themselves cause difficulty in falling asleep. fMRI scans were conducted simultaneously with PSG recordings (electroencephalogram [EEG], electrooculogram [EOG], electromyogram [EMG], and electrocardiogram [ECG]). We performed multiple awakenings (see below for details) to collect verbal reports on visual experience during sleep. The multiple-awakening procedure was repeated while fMRI was performed continuously (interrupted upon the subject's request for a break; duration across all 55 runs, 88.99 ± 26.09 min [mean ± SD]). The experiment was repeated over 10, 7, and 7 days in Subject 1-3, respectively, until at least 200 awakenings with a visual report were obtained from each subject. Offline sleep stage scoring confirmed that in >90% of the awakenings followed by visual reports, the last 15-s epoch before awakening was classified as sleep stage 1 or 2. If the last 15-s epoch was classified as the waking stage, we excluded the data from the decoding analysis. As a result, 235, 198, and 186 awakenings were selected for Subject 1-3, respectively, constituting the sleep data samples for further analysis.

Multiple-awakening procedure

Once the fMRI scan began, the subject was allowed to fall asleep. The experimenter monitored noise-reduced EEG recordings in real time while performing EEG staging on every 6-s epoch. The experimenter awakened subjects by calling them by name via a microphone/headphone when a single epoch was detected with alpha-wave suppression and theta-wave (ripple) occurrence, EEG signatures of NREM sleep stage 1 known to be suitable for obtaining frequent visual reports upon awakening (Hori et al., 1994).
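The online staging itself was performed visually by the experimenter. Purely as an illustration of how such an awakening criterion could be approximated automatically, the following minimal Python sketch compares alpha (8-13 Hz) and theta (4-7 Hz) band power in a single 6-s epoch against a wake baseline; the sampling rate, function names, thresholds, and use of scipy are assumptions made for this example and are not part of the study's procedure.

import numpy as np
from scipy.signal import welch

FS = 500  # Hz; EEG sampling rate assumed for this sketch

def band_power(epoch, fs, lo, hi):
    # Mean power spectral density of a 1-D EEG epoch within [lo, hi] Hz.
    freqs, psd = welch(epoch, fs=fs, nperseg=min(len(epoch), 2 * fs))
    return psd[(freqs >= lo) & (freqs <= hi)].mean()

def looks_like_sleep_onset(epoch, alpha_baseline, theta_baseline, fs=FS):
    # Illustrative heuristic for the awakening criterion on one 6-s epoch:
    # alpha (8-13 Hz) suppressed and theta (4-7 Hz) elevated relative to a
    # wake baseline. Thresholds are arbitrary examples, not the study's rules.
    alpha = band_power(epoch, fs, 8, 13)
    theta = band_power(epoch, fs, 4, 7)
    return alpha < 0.5 * alpha_baseline and theta > 1.5 * theta_baseline

# Example usage with synthetic data (one 6-s epoch of noise):
# epoch = np.random.randn(6 * FS)
# print(looks_like_sleep_onset(epoch, alpha_baseline=1.0, theta_baseline=1.0))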
The subject was asked to verbally describe what they saw before awakening, along with other mental content, and then to go to sleep again. If the EEG signatures were detected before the elapsed time from the previous awakening reached 2 min, the experimenter waited until it reached 2-3 min. If the subject was already in NREM sleep stage 2 when 2 min had passed, and it was unlikely they would go back to NREM sleep stage 1, the subject was awakened. When the subject repeatedly entered NREM sleep stage 2 within 2 min, the subject was awakened at short intervals (less than 2 min) or was asked to remain awake with eyes closed for one or two minutes after the awakening to increase their arousal level. This multiple-awakening procedure was repeated during the fMRI session.

The subject was also asked to respond by button press when they detected a beep sound (250 Hz pure tone, 500 ms duration, 12-18-s inter-stimulus intervals). This auditory task was conducted for potential use in monitoring the sleep stage. However, in the present study, we did not use the data because we failed to record responses in some of the experiments owing to computer trouble. Our preliminary analysis using successfully recorded data (Subject 1) showed that the detection rates in each of the wake/sleep stages were similar to those of previous work (Ogilvie and Wilkinson, 1989). Even when sleep samples were limited to those in which a sound was played during the last 15-s epoch before awakening but not detected by the subject, the decoding results were similar. Subjects were informed that they could quit the experiment at any time they wished, and that they could refuse to report mental contents in cases where there were privacy concerns.

Acquisition of verbal reports

On each awakening, the subject was asked whether they had seen anything just before awakening, and then to freely describe it along with other mental contents. If a description of the contents was unclear, the subject was asked to report the contents in more detail. Most reports started immediately upon awakening and lasted for 34 ± 19 s (mean ± SD; three subjects pooled). After the free verbal report, the subject was asked to answer specific questions such as rating the vividness of the image and the subjective timing of the experience (from when and until when relative to awakening), but the reports obtained by these explicit questions were not used in the analyses in the current study. Free reports that contained at least one visual element were classified as visual reports. If no visual content was present, reports were classified as others, including thought (active thinking), forgot, non-visual report, and no report. The classification was first conducted in real time by the experimenter, and was later confirmed by other investigators. Examples of verbal reports are shown in Table 1. The subject's voice during the procedure was recorded by an optical microphone.

2.5 MRI acquisition

fMRI data were collected using a 3.0-Tesla scanner located at the ATR Brain Activity Imaging Center.
An interleaved T2*-weighted gradient-EPI scan was performed to acquire functional images covering the whole brain (sleep experiments, visual stimulus experiments, and higher visual area localizer experiments: TR, 3,000 ms; TE, 30 ms; flip angle, 80 deg; FOV, 192 × 192 mm; voxel size, 3 × 3 × 3 mm; slice gap, 0 mm; number of slices, 50) or the entire occipital lobe (retinotopy experiments: TR, 2,000 ms; TE, 30 ms; flip angle, 80 deg; FOV, 192 × 192 mm; voxel size, 3 × 3 × 3 mm; slice gap, 0 mm; number of slices, 30). T2-weighted turbo spin echo images were scanned to acquire high-resolution anatomical images of the same slices used for the EPI (sleep experiments, visual stimulus experiments, and higher visual area localizer experiments: TR, 7,020 ms; TE, 69 ms; flip angle, 160 deg; FOV, 192 × 192 mm; voxel size, 0.75 × 0.75 × 3.0 mm; retinotopy experiments: TR, 6,000 ms; TE, 57 ms; flip angle, 160 deg; FOV, 192 × 192 mm; voxel size, 0.75 × 0.75 × 3.0 mm). T1-weighted magnetization-prepared rapid acquisition gradient-echo (MP-RAGE) fine-structural images of the whole head were also acquired (TR, 2,250 ms; TE, 3.06 ms; TI, 900 ms; flip angle, 9 deg; FOV, 256 × 256 mm; voxel size, 1.0 × 1.0 × 1.0 mm).

2.6 PSG recordings

PSG was performed simultaneously with fMRI. PSG consisted of EEG, EOG, EMG, and ECG recordings. EEGs were recorded at 31 scalp sites in all experiments except one (Fp1, Fp2, F7, F3, Fz, F4, F8, FC5, FC1, FC2, FC6, T7, C3, Cz, C4, T8, TP9, CP5, CP1, CP2, CP6, TP10, P7, P3, Pz, P4, P8, POz, O1, Oz, O2) according to the 10% electrode placement system (Sharbrough et al., 1991). For one experiment, EEG was recorded at 25 scalp sites (Fp1, Fp2, F7, F3, Fz, F4, F8, T7, C3, Cz, C4, T8, P7, P3, Pz, P4, P8, PO7, PO3, POz, PO4, PO8, O1, Oz, O2). EOGs were recorded bipolarly from four electrodes placed at the outer canthi of both eyes (horizontal EOG) and above and below the right eye (vertical EOG). EMGs were recorded bipolarly from the mentum. ECGs were recorded from the lower shoulder blade. EEG and ECG recordings were referenced to FCz. All EEG electrodes in the cap had built-in 5-kΩ resistors, while the other electrodes had 15-kΩ resistors. To ensure EEG data quality, the impedance of the EEG electrodes was kept below 15 kΩ and that of the other electrodes below 25 kΩ. All data were recorded by an MRI-compatible amplifier at a sampling rate of 5,000 Hz using BrainVision Recorder. The artifacts derived from the T2*-weighted gradient-EPI scans and the ballistocardiogram (Bonmassar et al., 2002) were reduced in real time using RecView so that the experimenter was able to monitor the EEG patterns online. Because EEG recordings were referenced to FCz, EEGs recorded in the occipital area were better suited than those recorded in the central area for detecting subtle changes in EEG waves. Thus, O1 was used for online monitoring of EEG data. When the O1 channel was contaminated by artifacts, O2 was used instead.

2.7 Offline EEG artifact removal and sleep-stage scoring

Artifacts were removed offline from the EEG recordings after each experiment (using the FMRIB plug-in for EEGLAB, University of Oxford) for further analyses. EEG data were then down-sampled to 500 Hz, re-referenced to TP9 or TP10, depending on the derivation, and low-pass filtered with a 216-Hz cut-off frequency using a two-way least-squares FIR filter. EEG recordings were scored into sleep stages according to the standard criteria (Rechtschaffen and Kales, 1968) for every 15 s.
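A minimal Python sketch of these offline steps (excluding the gradient and ballistocardiogram artifact removal, which was done with the EEGLAB FMRIB plug-in) is given below; a least-squares FIR design applied forward and backward approximates a two-way least-squares FIR filter. The use of scipy, the channel layout, and the filter order are assumptions made for this example, not the study's implementation.

import numpy as np
from scipy.signal import decimate, firls, filtfilt

def preprocess_eeg(data, ch_names, ref="TP9", fs_in=5000, fs_out=500, cutoff=216.0):
    # data: channels x samples array of artifact-cleaned EEG (gradient and
    # ballistocardiogram artifacts already removed, e.g. with the FMRIB tools).
    # Down-sample 5,000 Hz -> 500 Hz with an anti-aliasing filter.
    x = decimate(np.asarray(data, dtype=float), fs_in // fs_out, axis=1, zero_phase=True)
    # Re-reference from FCz to TP9 (or TP10, depending on the derivation).
    x = x - x[ch_names.index(ref)]
    # Two-way (zero-phase) least-squares FIR low-pass filter, 216-Hz cut-off.
    taps = firls(101, [0, cutoff - 10, cutoff, fs_out / 2.0], [1, 1, 0, 0], fs=fs_out)
    return filtfilt(taps, [1.0], x, axis=1)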
2.8 Visual dream content labeling

The subjects' report at each awakening, given verbally in Japanese, was transcribed into text. The reports that contained at least one visual object or scene were classified as visual reports, and those without visual content were classified as others. Three labelers extracted words (nouns) that described visual objects or scenes from each visual report text and translated them into English (verified by a bilingual speaker). These words were mapped to WordNet, an English lexical database in which words with a similar meaning are grouped as synsets (an abbreviation of "synonym set") in a hierarchical structure (Fellbaum, 1998). Synset assignment was cross-checked by all three labelers. For each extracted word, hypernym synsets (superordinate categories defined on the WordNet tree) of the assigned synset were also identified in the WordNet hierarchy. To determine representative synsets that describe visual contents, we selected the synsets (synsets assigned to each word and their hypernyms) that were found in 10 or more visual reports without co-occurrence with at least one other synset. We then removed the synsets that were the hypernyms of others. These procedures produced base synsets, which were frequently reported while being semantically exclusive and specific (Fig. 3). The visual contents at each awakening were coded by a visual content vector, in which the presence/absence of each synset was denoted by 1/0 (Fig. 4).

Figure 3. Base synsets selection. Words describing visual objects or scenes (red) were mapped onto synsets of the WordNet tree. Synsets were grouped into base synsets (blue frames) located higher in the tree.

Figure 4. Visual content vectors. Visual reports are represented by visual content vectors, in which the presence/absence of the base synsets in the report at each awakening is indicated by white/black. The visual content vectors are shown for all subjects and awakenings with visual reports (excluding those with contamination of the wake stage detected by offline sleep staging). Each column denotes the presence/absence of the base synsets in the sleep sample. Note that there are several samples in which no synset is present. This is because the reported words in these samples were rare and not included in the base synsets.
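To illustrate the mapping from reported words to synsets, hypernyms, and visual content vectors, a minimal Python sketch using NLTK's WordNet interface is given below. In the study the synset assignment was done manually and cross-checked by the labelers; the automatic sense selection, the specific synset names in the comments, and the function names here are assumptions made only for this example.

import numpy as np
from nltk.corpus import wordnet as wn  # requires the NLTK WordNet corpus

def word_synsets(word):
    # First noun synset for a word plus all of its hypernyms up to the root.
    # (In the study, sense assignment was done manually by three labelers.)
    syn = wn.synsets(word, pos=wn.NOUN)[0]
    return {s for path in syn.hypernym_paths() for s in path}

def content_vector(report_words, base_synsets):
    # Binary visual content vector: 1 if a base synset is present among the
    # synsets (or hypernyms) of the reported words, 0 otherwise.
    present = set()
    for w in report_words:
        present |= word_synsets(w)
    return np.array([1 if s in present else 0 for s in base_synsets])

# Example (synset names are illustrative):
# base = [wn.synset("male.n.02"), wn.synset("street.n.01")]
# print(content_vector(["street", "house"], base))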
2.9 Visual stimulus experiment

We selected stimulus images for decoder construction using ImageNet (http://www.imagenet.org/; 2011 fall release) (Deng et al., 2009), an image database in which web images are grouped according to WordNet. Two hundred and forty images were collected for each base synset (each image corresponding to exclusively one synset). If images for a synset were not available in ImageNet, we collected images using Google Images (http://www.google.com/imghp). In the visual stimulus experiment, the selected images were resized so that the width and height of the images were within 16 deg (the original aspect ratio was preserved) and were presented at the center of the screen on a gray background (Fig. 5). Subjects were allowed to freely view the images without fixation. We measured stimulus-induced fMRI activity for each base synset by presenting these images (using the same fMRI setting as the sleep experiment). In a 9-s stimulus block, six images were randomly sampled without replacement from the 192-image (Subject 1) or 240-image (Subjects 2 and 3) set for one base synset, and each image was presented for 0.75 s with intervening blanks of 0.75 s (approximately 4 hours in total for the whole visual stimulus experiment). We presented multiple images for each base synset to attenuate influences from irrelevant features (e.g., age or view of a depicted person for the "male" synset). Each of the images for each base synset was presented only once during the experiment. The stimulus block (9 s) was followed by a 6-s rest period, and was repeated for all base synsets in each run. Extra 33-s and 6-s rest periods were added at the beginning and the end of each run, respectively. We repeated the run with different images to obtain 32, 40, and 40 blocks per base synset for Subject 1-3, respectively.

Figure 5. Visual stimulus experiment design. Subjects freely viewed six different exemplars of each base synset in each 9-s stimulus block.

2.10 Localizer experiments

Retinotopy

The retinotopy mapping session followed the conventional procedure (Engel et al., 1994; Sereno et al., 1995) using a rotating wedge and an expanding ring of a flickering checkerboard. The data were used to delineate the borders between each visual cortical area, and to identify the retinotopic map on the flattened cortical surfaces of individual subjects.

Localizers for higher visual areas

We performed functional localizer experiments to identify the lateral occipital complex (LOC), fusiform face area (FFA), and parahippocampal place area (PPA) for each individual subject (Epstein and Kanwisher, 1998; Kanwisher et al., 1997; Kourtzi and Kanwisher, 2000). The localizer experiment consisted of 4-8 runs, and each run contained 16 stimulus blocks. In this experiment, intact or scrambled images (12 × 12 deg) of face, object, house, and scene categories were presented at the center of the screen. Each of the eight stimulus types (four categories × two conditions) was presented twice per run. Each stimulus block consisted of a 15-s intact or scrambled stimulus presentation. The intact and scrambled stimulus blocks were presented successively (the order of the intact and scrambled stimulus blocks was random), followed by a 15-s rest period of uniform gray background. Extra 33-s and 6-s rest periods were added at the beginning and end of each run, respectively. In each stimulus block, twenty different images of the same type were presented with a duration of 0.3 s followed by intervening blanks of 0.4-s duration. Images for each category were collected from the following resources: face images from the Center for Vital Longevity (http://vitallongevity.utdallas.edu) (Minear and Park, 2004); object images from The Object Databank (http://stims.cnbc.cmu.edu/ImageDatabases/TarrLab/Objects/; stimulus images courtesy of Michael J.
Tarr, Center for the Neural Basis of Cognition and Department of Psychology, Carnegie Mellon University, http://www.tarrlab.org/); house and scene images from the SUN database (http://groups.csail.mit.edu/vision/SUN/) (Xiao et al., 2010).

2.11 MRI data preprocessing

The first 9-s scans for experiments with TR = 3 s (sleep, visual stimulus, and higher visual area localizer experiments) and the first 8-s scans for experiments with TR = 2 s (retinotopy experiments) of each run were discarded to avoid instability of the MRI scanner. The acquired fMRI data underwent three-dimensional motion correction by SPM5 (http://www.fil.ion.ucl.ac.uk/spm). The data were then coregistered to the within-session high-resolution anatomical image of the same slices used for EPI and subsequently to the whole-head high-resolution anatomical image. The coregistered data were then reinterpolated to 3 × 3 × 3 mm voxels.

For the sleep data, after a linear trend was removed within each run, voxel amplitude around awakening was normalized relative to the mean amplitude during the period 60-90 s prior to each awakening. The average proportions of wake, stage 1, and stage 2 during this period were 32.7%, 44.0%, and 23.3%, respectively (three subjects pooled). This period was used as the baseline because it tended to show relatively stable BOLD signals over time. We assumed that the occurrence of sleep-onset dreaming would be rare during this period, given the relatively low theta amplitudes (Hori et al., 1994). However, it cannot be ruled out that visual dreaming may have been experienced during this period, in which case using it as the baseline would make it difficult to detect visual contents relevant to dreaming in this period. The voxel values averaged across the three volumes (9 s) immediately before awakening served as a data sample for decoding analysis (the time window was shifted for time course analysis). For the visual stimulus data, after within-run linear trend removal, voxel amplitudes were normalized relative to the mean amplitude of the pre-rest period of each run and then averaged within each 9-s stimulus block (three volumes) shifted by 3 s (one volume) to compensate for hemodynamic delays. Voxels used for the decoding of a synset pair in each ROI were selected by t statistics comparing the mean responses to the images of the paired synsets (highest absolute t values; 400 voxels for individual areas, and 1,000 voxels for LVC and HVC). The voxel values in each data sample were z-transformed to remove potential mean-level differences between the sleep and visual stimulus experiments.
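A minimal numpy sketch of how one sleep data sample could be assembled under these conventions is shown below; the array layout, the division-by-baseline form of the normalization, and the function names are assumptions made for illustration, not the study's code.

import numpy as np

TR = 3.0  # s per volume in the sleep experiment

def sleep_sample(run_data, awakening_vol, n_avg=3):
    # run_data: volumes x voxels array, linear trend already removed per run.
    # awakening_vol: index of the first volume after the awakening call.
    # Baseline: 60-90 s before awakening (about 10 volumes at TR = 3 s).
    b0 = awakening_vol - int(round(90 / TR))
    b1 = awakening_vol - int(round(60 / TR))
    baseline = run_data[b0:b1].mean(axis=0)
    # Normalize voxel amplitudes relative to the baseline mean (one convention).
    normalized = run_data / baseline
    # Average the three volumes (9 s) immediately before awakening.
    sample = normalized[awakening_vol - n_avg:awakening_vol].mean(axis=0)
    # z-transform across voxels to remove mean-level differences.
    return (sample - sample.mean()) / sample.std()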
2.12 Region of interest (ROI) selection

Functionally localized areas

V1, V2, and V3 were delineated in the standard retinotopy experiment (Engel et al., 1994; Sereno et al., 1995), and the lateral occipital complex (LOC), the fusiform face area (FFA), and the parahippocampal place area (PPA) were identified using conventional functional localizers (Epstein and Kanwisher, 1998; Kanwisher et al., 1997; Kourtzi and Kanwisher, 2000). The retinotopy experiment data were transformed to the Talairach coordinates and the visual cortical borders were delineated on the flattened cortical surfaces using BrainVoyager QX (http://www.brainvoyager.com). The voxel coordinates around the gray-white matter boundary in V1-V3 were identified and transformed back into the original coordinates of the EPI images. The voxels from V1, V2, and V3 were combined as the lower visual cortex (LVC; 2,000 voxels in each subject). The localizer experiment data for higher visual areas were analyzed using SPM5. The voxels that showed significantly higher activation in response to objects, faces, or scenes than to scrambled images (t test, uncorrected P < 0.05 or 0.01) were identified and used as the ROIs for LOC, FFA, and PPA, respectively. We set relatively low thresholds for identifying the ROIs so that a larger number of voxels would be included to preserve broadly distributed patterns. A continuous region covering LOC, FFA, and PPA was manually delineated on the flattened cortical surfaces, and the region was defined as the higher visual cortex (HVC; 2,000 voxels in each subject). Voxels overlapping with LVC were excluded from HVC. After voxels of extremely low signal amplitudes were removed, approximately 2,000 voxels remained in LVC (2054, 2172, and 1935 voxels for Subject 1-3, respectively) and in HVC (1956, 1788, and 2235 voxels for Subject 1-3, respectively). For the analysis of individual subareas, the following numbers of voxels were identified for V1, V2, V3, LOC, FFA, and PPA, respectively: 885, 901, 728, 523, 537, and 353 voxels for Subject 1; 779, 949, 897, 329, 382, and 334 voxels for Subject 2; 710, 859, 765, 800, 432, and 316 voxels for Subject 3. LOC, FFA, and PPA identified by the localizer experiments may overlap: the voxels in the overlapping region were included in both ROIs (Fig. 6).

Figure 6. Functionally defined regions of interest on the flattened cortex. The individual areas of each subject are shown on the flattened cortex. A contiguous region covering LOC, FFA, and PPA was manually delineated on the flattened cortical surface, and the region was defined as the "higher visual cortex" (red line). The voxels overlapping with the lower visual cortical areas (V1-V3) were excluded from the ROI for the higher visual cortex. For individual ROIs, voxels near the area borders were included in both areas.

Anatomically delineated areas

The T1 image of each subject was analyzed using FreeSurfer (http://surfer.nmr.mgh.harvard.edu), and regions covering the whole cortical surface were anatomically identified on each subject's cortical surface (Fig. 7). Two types of parcellations provided by FreeSurfer were used to define a total of 108 cortical regions on one hemisphere (Desikan et al., 2006; Destrieux et al., 2010).

Figure 7. Inflated view of the anatomically defined regions of interest. The whole cortical surface was automatically delineated according to two types of parcellations provided by FreeSurfer. (A) The 34 anatomically delineated ROIs defined by a parcellation from Desikan et al. (2006). (B) The 74 anatomically delineated ROIs defined by a parcellation from Destrieux et al. (2010). Here, the ROIs are mapped on the left hemisphere of Subject 1.

2.13 Decoding analysis

For all pairs of the base synsets, a binary decoder consisting of a linear support vector machine (SVM; Vapnik, 1998), implemented with LIBSVM (Chang and Lin, 2011), was trained on the visual stimulus data of each ROI. The fMRI signals of the selected voxels and the synset labels were given to the decoder as training data.
The SVM provided a linear discriminant function for classification between synsets k and l given input voxel values x = [x_1, ..., x_D] (D, number of voxels),

f_{kl}(x) = \sum_{d=1}^{D} w_d x_d + w_0,

where w_d is the weight parameter for voxel d, and w_0 is the bias. The performance was evaluated by the correct classification rate for all sleep samples selected for each pair. In the pairwise decoding analysis, the stimulus-trained decoder was tested on the sleep data that contained exclusively one of the paired synsets (Fig. 8). Prediction was made on the basis of whether f_{kl}(x) was positive (synset k) or negative (synset l), given a sleep fMRI data sample. In the multilabel decoding, the discriminant functions comparing a base synset k and each of the other synsets (l ≠ k) were averaged after normalization by the norm of the weight vector w_{kl} to yield the linear detector function (Kamitani and Tong, 2005), which indicates how likely synset k is to be present,

f_k(x) = \frac{1}{N-1} \sum_{l \neq k} \frac{f_{kl}(x)}{\| w_{kl} \|},

where N is the number of base synsets. Given a sleep fMRI data sample, multilabel decoding produced a vector consisting of the output scores of the detector functions for all base synsets, [f_1(x), f_2(x), ..., f_N(x)] (Fig. 9).

Figure 8. Schematic overview of the pairwise classification analysis. A binary classifier for pairs of base synsets was constructed. A decoder was trained with fMRI responses to stimulus images of two base synsets, and sleep samples labeled with either of the two synsets exclusively were tested.

Figure 9. Schematic overview of the multilabel decoding analysis. The synset detectors for each base synset were constructed from a combination of the pairwise classifiers. Given an arbitrary sleep data sample, each detector outputs a continuous score indicating how likely the synset is to be present in each report.

2.14 Synset pair selection by within-dataset cross-validation

To select synset pairs with content-specific patterns in both the stimulus-induced and sleep datasets, we performed cross-validation decoding analysis for each pair in each dataset. For the stimulus-induced dataset, samples from one run were left for testing, and the rest were used for decoder training (repeated until all runs were tested; leave-one-run-out cross-validation) (Kamitani and Tong, 2005). For the sleep dataset, one sample was left for testing, and the rest were used for decoder training (leave-one-sample-out cross-validation). Note that since the frequency of synsets in the sleep reports generally differs between paired synsets, the available samples are usually unbalanced. To avoid possible biases in decoder training caused by the imbalance, we trained multiple decoders by randomly resampling a subset of the training data for the synset with more samples to match the synset with fewer samples (repeated 11 times). The discriminant functions calculated for all the resampled training datasets were averaged after normalization by the norm of the weight vector to yield the discriminant function (decoder) used for testing in each cross-validation step. We selected the synset pairs that showed high cross-validation performance in both datasets (one-tailed binomial test, uncorrected P < 0.05).
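The following Python sketch illustrates the pairwise decoder and the combination rule for the multilabel detector functions. It uses scikit-learn's LinearSVC as a stand-in for LIBSVM's linear SVM; the data layout, function names, and the storage of one model per ordered synset pair are assumptions made for this example, not the study's implementation.

import numpy as np
from sklearn.svm import LinearSVC

def train_pairwise(X_k, X_l):
    # Train a linear SVM separating stimulus samples of synset k (label +1)
    # from synset l (label -1); return the weight vector w and the bias w0.
    X = np.vstack([X_k, X_l])
    y = np.hstack([np.ones(len(X_k)), -np.ones(len(X_l))])
    svm = LinearSVC(C=1.0).fit(X, y)
    return svm.coef_.ravel(), float(svm.intercept_[0])

def detector_scores(x, pairwise_models, n_synsets):
    # Multilabel decoding: f_k(x) = mean over l != k of f_kl(x) / ||w_kl||.
    # pairwise_models[(k, l)] = (w, b), with positive output favouring synset k
    # (a model is assumed to be stored for every ordered pair of synsets).
    scores = np.zeros(n_synsets)
    for k in range(n_synsets):
        vals = [
            (w @ x + b) / np.linalg.norm(w)
            for (kk, ll), (w, b) in pairwise_models.items()
            if kk == k and ll != k
        ]
        scores[k] = np.mean(vals)
    return scores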
3. Results

3.1 Behavioral results of sleep experiments

Verbal report collection by the multiple-awakening procedure

Three subjects participated in the fMRI sleep experiments (Fig. 2), in which they were woken when an EEG signature was detected (Hori et al., 1994) (Fig. 10). The subjects were asked to give a verbal report freely describing their visual experience before awakening (Table 1; duration, 34 ± 19 s [mean ± SD]). We repeated this procedure to attain at least 200 awakenings with a visual report for each subject. On average, we awakened subjects every 342.0 s, and visual contents were reported in over 75% of the awakenings (Fig. 11), indicating the validity of our method for collecting verbal reports. Offline sleep stage scoring (Fig. 12) further selected awakenings to exclude contamination from the wake stage in the period immediately before awakening (235, 198, and 186 awakenings for Subject 1-3 were used for the decoding analyses).

Figure 10. Time course of theta power. The time course of theta power (4-7 Hz) during the 2 min before awakening is shown for each subject (error bars, 95% CI; averaged across awakenings). For each awakening, we shifted the 9-s window (equivalent to three fMRI volumes) by 3 s (equivalent to one fMRI volume) to calculate the theta power from the preprocessed EEG signals. The plotted points indicate the center of the 9-s window (slightly displaced to avoid overlaps between subjects). The power was normalized relative to the mean power during the time window of 60-90 s prior to each awakening (gray area). The result of this offline analysis is consistent with the awakening procedure, in which theta ripples were detected in online monitoring to determine the awakening timing.

Table 1. Examples of Verbal Reports

Subject 1, report #13
Report: Well, what was that? Two male persons, well, what was that? I cannot remember very well, but there were e-mail texts. There were also characters. E-mail address? Yes, there were a lot of e-mail addresses. And two male persons existed.
Words: male person; text; character
Base synsets: male (male person); writing (written material, piece of writing); character (grapheme, graphic symbol)

Subject 1, report #133
Report: Well, somewhere, in a place like a studio to make a TV program or something, well, a male person ran with short steps, or run, from the left side to the right side. Then, he tumbled. He stumbled over something, and stood up while laughing, and said something. He said something to persons on the left side. So, well, a person ran, and there were a lot of unknown people. I did not see female persons. There were a group of a lot of people, either male or female. The place was like a studio. Though there are a lot of variety for studios, the studio was a huge room. Since it was a huge room, it was indoor maybe. I saw such a scene in the huge room.
Words: studio; room; male person; people
Base synsets: workplace (work); room; male (male person); group (grouping)

Subject 2, report #54
Report: Yes, I had a dream. Something long. First at some shop. Ah, a bakery shop. I was choosing some merchandise. I took a roll in which a leaf of perilla was put. Then, I went out, and on the street, I saw a person who was doing something like taking a photograph.
Words: shop; bakery; merchandise; roll; leaf; perilla; street; person
Base synsets: point; mercantile establishment (retail store, sales outlet, outlet); commodity (trade good, good); food (solid food); street

Subject 2, report #130
Report: Yes, ah, yes, a female person, well, existed. The person served some foods, umm, like a flight attendant. Then, well, before that scene, I saw a scene in which I ate or saw yogurt, or I saw yogurt or a scene in which yogurt was served.
What appeared was the female person and an unknown thing like a refrigerator. Maybe indoor, with colors.
Words: food; flight attendant; yogurt; refrigerator; person; female person
Base synsets: food (solid food); female (female person); commodity (trade good, good)

Subject 3, report #114
Report: Well, from the sky, from the sky, well, what was it? I saw something like a bronze statue, a big bronze statue. The bronze statue existed on a small hill. Below the hill, there were houses, streets, and trees in an ordinary way.
Words: sky; statue; hill; house; street; tree
Base synsets: geological formation (formation); house; way; vascular plant (tracheophyte)

Subject 3, report #186
Report: Well, in the night, somewhere, well, in a restaurant in an office building covered with windowpanes, on the big table, someone important looked at a menu and chose dishes. There were both male and female persons. Then, there was a night scenery from the window.
Words: restaurant; office building; windowpane; table; someone; menu; male; female; scenery; window
Base synsets: table; communication; male (male person); female (female person)

Note. Originally, the reports and words were verbally reported in Japanese. They were transcribed into text and were translated into English (verified by a bilingual speaker). Note that not all reported visual words were assigned a base synset (e.g., person in report #133 of Subject 1, and perilla in report #54 of Subject 2). This is because 1) the word and its hypernyms did not appear in ten or more reports, or 2) the synset assigned to the word is a hypernym of other synsets (see Methods, "2.8 Visual dream content labeling"). On average, 47.7% of the reported visual words were assigned a base synset (49.5%, 50.0%, and 42.0% for Subject 1-3, respectively).

Figure 11. Awakening statistics. The numbers of awakenings with/without visual contents are shown for each subject (numbers of experiments in parentheses): Subject 1, 249/58 out of 307 total awakenings (10 experiments); Subject 2, 220/61 out of 281 (7); Subject 3, 203/63 out of 266 (7).

Figure 12. Time course of sleep state proportion. The proportion of wake/sleep states (Wake, Stage 1, and Stage 2) is shown for all the awakenings with and without visual content in each subject. Offline sleep stage scoring was conducted for every 15-s epoch using simultaneously recorded EEG data. The last 15-s epoch before awakening was classified as sleep stage 1 or 2 in over 90% of the awakenings with visual contents, but in fewer of the awakenings with no visual content. This result suggests that most visual reports are indeed associated with dreaming during sleep, not with imagery during wakefulness. The samples in which the last 15-s epoch before awakening was classified as Wake were not used for further analyses.

Reported visual dream contents

From the collected reports, words describing visual objects or scenes were manually extracted and mapped to WordNet, a lexical database in which semantically similar words are grouped as synsets in a hierarchical structure (Fig. 3; Fellbaum, 1998; Huth et al., 2012).
Using a semantic hierarchy, we grouped extracted visual words into base synsets that appeared in at least 10 reports from each subject (26, 18, and 16 synsets for Subject 1-3; Tables 2-4) The fMRI data obtained before each awakening were labeled with a visual content vector, each element of which indicated the presence/absence of a base synset in the subsequent report (Fig. 4). We also collected images depicting each base synset from ImageNet (Deng et al., 2009), an image database in which web images are grouped according to WordNet, or Google image, for decoder training. 30 Table 2. List of Base Synsets for Subject 1 Base synset ID Definition Reported word Count Meta-category male (male person) 09624168 a person who belongs to the sex that cannot have babies gentleman, boy, middle-aged man, old man, young man, male, dandy 127 Human character (grapheme, graphic symbol) 06818970 a written symbol that is used to represent speech character, letter 35 Others room 04105893 an area within a building enclosed by walls and floor and ceiling booth, conference room, room, toilet 20 Scene workplace (work) 04602044 a place where work is done laboratory, recording studio, studio, workplace 17 Scene external body part 05225090 any body part visible externally lip, hand, face 17 Others natural object 00019128 an object occurring naturally; not made by man leaf, branch, figure, beard, mustache, orange, coconut, moon, sun 13 Others building (edifice) 02913152 a structure that has a roof and walls and stands more or less permanently in one place bathhouse, building, house, restaurant, schoolhouse, school 12 Scene clothing (article of clothing, vesture, wear, wearable, habiliment) 03051540 a covering designed to be worn on a person's body clothes, baseball cap, clothing, coat, costume, tuxedo, silk hat, hat, T-shirt, kimono, muffler, polo shirt, suit, uniform 13 Object chair 03001627 a seat for one person, with a support for the back chair, folding chair, wheelchair 12 Object picture (image, icon, ikon) 03931044 a visual representation (of an object or scene or person or abstraction) produced on a surface graphic, picture, image, portrait 12 Others shape (form) 00027807 the spatial arrangement of something as distinct from its substance circle, square, quadrangle, box, node, point, dot, tree, tree diagram, hole 11 Others vertebrate (craniate) 01471682 animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium bird, ostrich, raptor, hawk, falcon, eagle, frog, snake, dog, leopard, horse, sheep, monkey, fish, skipjack tuna 11 Others implement 03563967 instrumentation (a piece of equipment or tool) used to effect an end trigger, hammer, ice pick, pan, pen, pencil, plunger, pole, pot, return key, stick, wok 10 Object way 04564698 any artifact consisting of a road or path affording passage from one place to another hallway, hall, passageway, pedestrian crossing, stairway, street 11 Scene window 04588739 (computer science) a rectangular part of a computer screen that contains a display different from the rest of the screen window 11 Object girl (miss, missy, young lady, young woman, fille) 10129825 a young woman girl, young woman 11 Human material (stuff) 14580897 the tangible substance that goes into the makeup of a physical object water, paper, sand, wood, sheet, leaf, page 11 Others cognition (knowledge, noesis) 00023271 the psychological result of perception and learning and reasoning symbol, profile, monster, character 10 Others group 
(grouping) 00031264 any number of entities (members) considered as a unit string, people, pair, pop group, band, calendar, line, forest 10 Others table 04379243 a piece of furniture having a smooth flat top that is usually supported by one or more vertical legs desk, stand, table 10 Object code (computer code) 06355894 (computer science) the symbolic arrangement of data or instructions in a computer program or the set of such instructions computer code, code 10 Others writing (written material, piece of writing) 06362953 the work of a writer; anything expressed in letters of the alphabet (especially when considered from the point of view of style and effect) draft, text, document, written document, clipping, line 10 Others line 06799897 a mark that is long relative to its width line 10 Others illustration 06999233 artwork that helps make something clear or attractive illustration, figure, Figure 10 Others geographical area (geographic area, geographical region, geographic region) 08574314 a demarcated area of the Earth tennis court, campus, playing field, ground, field, lawn, park, parking area, square, public square, town 10 Scene performer (performing artist) 10415638 an entertainer who performs a dramatic or musical work for an audience idol, singer, actor, actress, clown, comedian 10 Human Note. In this study, a representative instance provided by WordNet for each synset was used as the name of the base synset. Here, other instances were described in parentheses. 31 Table 3. List of Base Synsets for Subject 2 Base synset ID Definition Reported word Count Meta-category character (grapheme, graphic symbol) 06818970 a written symbol that is used to represent speech character 34 Others male (male person) 09624168 a person who belongs to the sex that cannot have babies boy, male, male person 27 Human street 04334599 a thoroughfare (usually including sidewalks) that is lined with buildings street 21 Scene car (auto, automobile, machine, motorcar) 02958343 a motor vehicle with four wheels; usually propelled by an internal combustion engine car, patrol car, police cruiser, used-car 17 Object food (solid food) 07555863 any solid substance (as opposed to liquid) that is used as a source of nourishment food, chocolate bar, apple pie, cake, cookie, bread, roll, noodle, tomato, cherry tomato, yogurt, yoghurt 19 Object building (edifice) 02913152 a structure that has a roof and walls and stands more or less permanently in one place apartment house, apartment building, building, coffee shop, house, library, school 18 Scene representation 04076846 a creation that is a visual or tangible rendering of someone or something map, model, photograph, photo, picture, snowman 14 Others furniture (piece of furniture, article of furniture) 03405725 furnishings that make a room or other area ready for occupancy bed, chair, counter, desk, furniture, hospital bed, sofa, couch, table 13 Object female (female person) 09619168 a person who belongs to the sex that can have babies girl, wife, female, female person 13 Human book (volume) 02870092 physical objects consisting of a number of pages bound together book, notebook 12 Object point 08620061 the precise location of something; a spatially limited location bakery, corner, crossing, intersection, crossroad, laboratory, level crossing, studio, bus stop, port, Kobe 12 Scene commodity (trade good, good) 03076708 articles of commerce hat, iron, jacket, T-shirt, Kimono, kimono, merchandise, refrigerator, shirt, stove 11 Object computer screen (computer display) 03085602 a 
screen used to display the output of a computer to the user computer screen, computer display 10 Object electronic equipment 03278248 equipment that involves the controlled conduction of electrons (especially in a gas or vacuum or semiconductor) amplifier,mobile phone, cell phone, cellular phone, printer, television, TV 11 Object mercantile establishment (retail store, sales outlet, outlet) 03748162 a place of business for retailing goods bakery, bookstore, booth, convenience store, department store, shopping center, shopping mall, shop, stall, supermarket 11 Scene region 08630985 a large indefinite location on the surface of the Earth garden, downtown, park, parking area, scenery, town, Kobe 10 Scene covering 03122748 an artifact that covers something else (usually to protect or shelter or conceal it) accessory, accessories, clothes, covering, flying carpet, hat, jacket, T-shirt, Kimono, kimono, shirt, slipper 10 Object dwelling (home, domicile, abode, habitation, dwelling house) 03259505 housing that someone is living in home,house 10 Scene 32 Table 4. List of Base Synsets for Subject 3 Base synset ID Definition Reported word Count Meta-category male (male person) 09624168 a person who belongs to the sex that cannot have babies old man, male 28 Human way 04564698 any artifact consisting of a road or path affording passage from one place to another entrance, hallway, footpath, penny arcade, pipe, sewer, staircase, stairway, street, tunnel 19 Scene room 04105893 an area within a building enclosed by walls and floor and ceiling lobby, hospital room, kitchen, trunk, operating room, room 18 Scene tract (piece of land, piece of ground, parcel of land, parcel) 08673395 an extended area of land garden, field, ground, athletic field, green, lawn, grassland, rice paddy, paddy field, park, parking area, savannah, savanna 18 Scene female (female person) 09619168 a person who belongs to the sex that can have babies female 14 Human communication 00033020 something that is communicated by or to or between people or groups subtitle, text, menu, theatre ticket, poster, character, traffic light, traffic signal, graph, mark 13 Others vertebrate (craniate) 01471682 animals having a bony or cartilaginous skeleton with a segmented spinal column and a large brain enclosed in a skull or cranium bird, hawk, eagle, frog, whale, dog, cheetah, horse, water buffalo, sheep, giraffe, fish 12 Others display (video display) 03211117 an electronic device that represents information in visual form computer screen, computer display, display, screen, monitor, window 11 Object surface 04362025 the outer boundary of an artifact or a material layer constituting or resembling such a boundary ceiling, floor, platform, screen, stage 11 Others material (stuff) 14580897 the tangible substance that goes into the makeup of a physical object gravel, cardboard, clay, earth, water, log, paper, playing card 12 Others car (auto, automobile, machine, motorcar) 02958343 a motor vehicle with four wheels; usually propelled by an internal combustion engine car, jeep, sport car, sports car, sport car 11 Object house 03544360 a dwelling that serves as living quarters for one or more families house 12 Scene external body part 05225090 any body part visible externally head, neck, chest, breast, leg, foot, hand, finger, face, human face, face 11 Others geological formation (formation) 09287968 (geology) the geological features of the earth bank, beach, gorge, hill, mountain, slope 11 Scene table 04379243 a piece of furniture having a smooth flat 
top that is usually supported by one or more vertical legs counter, desk, operating table, table 10 Object vascular plant (tracheophyte) 13083586 green plant having a vascular system: ferns, gymnosperms, angiosperms flower, rice, tree 10 Scene

3.2 Dream contents decoding

We constructed decoders by training linear support vector machines (SVMs) (Vapnik, 1998) on fMRI data measured while each subject viewed web images for each base synset. Multivoxel patterns in the higher visual cortex (HVC; the ventral region covering the lateral occipital complex [LOC], fusiform face area [FFA], and parahippocampal place area [PPA]; 1,000 voxels), the lower visual cortex (LVC; V1-V3 combined; 1,000 voxels), or the subareas (400 voxels for each area) were used as the input for the decoders (Fig. 6). To demonstrate dream contents decoding, we performed three types of analysis in the following: classification, detection, and identification.

3.2.1 Pairwise classification analysis

In the classification analysis, a binary classifier was first trained on the fMRI responses to stimulus images of two base synsets (three-volume averaged data corresponding to the 9-s stimulus block), and then tested on the sleep samples (three-volume [9-s] averaged data immediately before awakening) that contained exclusively one of the two synsets while ignoring other concurrent synsets (Fig. 8; stimulus-to-dream decoding analysis). We only used synset pairs in which one of the synsets appeared in at least 10 reports without co-occurrence with the other (201, 118, and 86 pairs for Subjects 1-3).
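This procedure can be illustrated with a short sketch. The Python snippet below is not the code used in this study; it is a minimal illustration with scikit-learn, and all names (stim_X, stim_y, sleep_X, sleep_synsets) are hypothetical placeholders for the three-volume-averaged stimulus and sleep samples described above.

# Minimal sketch of pairwise stimulus-to-dream decoding (illustrative only).
# Assumed, hypothetical inputs:
#   stim_X        : (n_stim_samples, n_voxels) three-volume-averaged stimulus responses
#   stim_y        : (n_stim_samples,) base-synset label of each stimulus block
#   sleep_X       : (n_awakenings, n_voxels) three-volume averages just before awakening
#   sleep_synsets : list of sets; base synsets reported at each awakening
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_accuracy(syn_a, syn_b, stim_X, stim_y, sleep_X, sleep_synsets):
    """Train a linear SVM on stimulus blocks of two synsets and test it on
    sleep samples that contain exactly one of the two synsets."""
    train_mask = np.isin(stim_y, [syn_a, syn_b])
    clf = LinearSVC(C=1.0)
    clf.fit(stim_X[train_mask], stim_y[train_mask])

    test_X, test_y = [], []
    for x, reported in zip(sleep_X, sleep_synsets):
        has_a, has_b = syn_a in reported, syn_b in reported
        if has_a != has_b:                       # exclusively one synset of the pair
            test_X.append(x)
            test_y.append(syn_a if has_a else syn_b)
    if not test_X:
        return np.nan                            # pair not testable in the sleep data
    return clf.score(np.vstack(test_X), np.array(test_y))

A label-shuffled null distribution, as used in the analyses below, can be obtained by refitting the same classifier after permuting stim_y within the training set.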
Pairwise stimulus-to-dream decoding analysis

The distribution of the pairwise stimulus-to-dream decoding accuracies for HVC is shown together with that from the decoders trained on the same stimulus-induced fMRI data with randomly shuffled synset labels (Fig. 13; fig. 29, individual subjects). The mean decoding accuracy was 60.0% (95% confidence interval, CI, [59.0, 61.0]; three subjects pooled), significantly higher than that of the label-shuffled decoders with both Wilcoxon rank-sum and permutation tests (P < 0.001). The synsets of a pair can have unbalanced numbers of samples, which could potentially lead to some bias. However, when the correct rates were calculated separately for each synset of a pair and then averaged, the averaged correct rates were highly correlated with the correct rates for all pooled samples (correlation coefficients for Subjects 1-3, 0.96, 0.97, and 0.97, respectively). Therefore, the bias, if any, is likely to be small.

Figure 13. Distributions of pairwise decoding accuracy for stimulus-to-dream decoding. Distributions of decoding accuracies with original and label-shuffled data are shown for all pairs (405 pairs; light blue and gray) and selected pairs (97 pairs; dark blue and black) (three subjects pooled; chance level = 50%).

Commonality of neural representation between perception and dreaming

To look into the commonality of brain activity between perception and sleep-onset dreaming, we focused on the synset pairs that produced content-specific patterns in each of the stimulus and sleep experiments (pairs with high cross-validation classification accuracy within each of the stimulus and sleep datasets; Fig. 14; figs. 30, 31, individual subjects). With the selected pairs, even higher accuracies were obtained (mean = 70.3%, CI [68.5, 72.1]; Fig. 13, dark blue; fig. 29, individual subjects; Tables S5-S7, lists of the selected pairs), indicating that content-specific patterns are highly consistent between perception and sleep-onset dreaming. The selection of synset pairs, which used knowledge of the test (sleep) data, does not bias the null distribution obtained from the label-shuffled decoders (Fig. 13, black), because content specificity in the sleep dataset alone does not imply commonality between the two datasets. Accurate stimulus-to-dream decoding requires that stimulus and dream data share similar content-specific patterns: in the case of binary classification, the class boundary in dream data should be similar to that in stimulus data.

Figure 14. Within-dataset cross-validation decoding. The cross-validation analysis of the stimulus-induced and sleep datasets yielded the distribution of accuracies for (A) stimulus-to-stimulus (SS) pairwise decoding and (B) dream-to-dream (DD) pairwise decoding (three subjects pooled, HVC; shown together with the distribution from label-shuffled decoders). The mean decoding accuracies for SS and DD decoding were 83.4% (95% CI [82.3, 84.6]) and 54.8% (95% CI [53.6, 56.0]), respectively (three subjects pooled). The fraction of pairs that showed significantly better decoding performance than chance level (one-tailed binomial test, uncorrected P < 0.05) was 96.0% (389/405) for SS decoding and 24.7% (100/405) for DD decoding (three subjects pooled). The performance was significantly better than that from label-shuffled decoders for both SS and DD decoding (Wilcoxon rank-sum test, P < 0.001). Note that although the dream-to-dream decoding shows marginally significant accuracies (depending on subjects, see Fig. 31), it is not as accurate as the stimulus-to-dream decoding. This is presumably because training samples from the sleep dataset were fewer and noisier than those from the stimulus-induced dataset, and thus decoders were not well trained with the sleep dataset.

Representational similarity between stimulus perception and dreaming

To further examine the commonality of representations for perception and sleep-onset dreaming, we performed representational similarity analysis (RSA) (Kriegeskorte et al., 2008) using the accuracies from multiple pairs of both stimulus-to-stimulus (SS) decoding and dream-to-dream (DD) decoding (Fig. 14). The RSA posits that the dissimilarities among categories calculated from different modalities (in our case, perception and dreaming) would be similar if they share a common representation. Thus, we calculated correlations between the pairwise accuracies obtained from SS and DD decoding, each serving as a measure of representational dissimilarity, and observed positive correlations in two of the three subjects when HVC activity was used to train the decoders (Fig. 15; r = 0.14, P < 0.05 for Subject 1; r = 0.18, P < 0.05 for Subject 2; r = -0.01, P > 0.05 for Subject 3). Although the RSA results of Subject 3 did not show a positive correlation between the accuracies of SS and DD decoding, this could be attributed to the low DD decoding accuracy, which may stem from the small number of training samples.
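In this setting, the RSA reduces to correlating the two profiles of pairwise accuracies. The sketch below is illustrative only; ss_acc and dd_acc are hypothetical dictionaries mapping each synset pair to its cross-validation accuracy (NaN for pairs that could not be evaluated).

# Minimal RSA sketch: correlate SS and DD pairwise accuracies (illustrative only).
import numpy as np
from scipy.stats import pearsonr

def rsa_correlation(ss_acc, dd_acc):
    """Correlate SS and DD pairwise accuracies over the synset pairs shared by both."""
    pairs = sorted(set(ss_acc) & set(dd_acc))
    ss = np.array([ss_acc[p] for p in pairs])
    dd = np.array([dd_acc[p] for p in pairs])
    valid = ~(np.isnan(ss) | np.isnan(dd))       # drop pairs that could not be evaluated
    return pearsonr(ss[valid], dd[valid])        # returns (r, P value)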
Figure 15. Representational similarity analysis. (A) Accuracy matrices of SS and DD decoding as a measure of neural dissimilarity (for DD decoding, three-volume [9-s] averaged data immediately before awakening in HVC were used). Rows and columns indicate the base synsets of each subject. Only the pairs with a sufficient number of samples were used for this analysis (cyan cells indicate uncalculated pairs). (B) Plot of accuracy for the DD decoding analysis against accuracy for the SS decoding analysis. Each dot indicates the accuracy of a pair. The blue line indicates the regression line. (C) Correlation coefficient between accuracies of multiple pairs for SS and DD decoding (asterisk, P < 0.05).

Contributions of multivoxel patterns to dream contents decoding

To quantitatively evaluate the advantage of using multivoxel patterns over analyzing fluctuations of the global signal level within each region, we performed the same pairwise decoding analysis but with the voxel values averaged within each data sample, and the decoding accuracy with averaged activity was compared with that from the multivoxel decoders (Fig. 13, the top distribution). The performance with averaged activity was close to chance level and significantly worse than that with multivoxel activity (Fig. 16; Wilcoxon signed-rank test, P < 0.001, all subjects; fig. 32, individual subjects), revealing that the multivoxel pattern, rather than the average activity level, was critical for decoding.

Figure 16. Decoding with averaged vs. multivoxel activity. (A) Histograms of pairwise decoding accuracy with averaged and multivoxel activity. (B) Mean pairwise decoding accuracies with averaged and multivoxel activity. The voxel values in each data sample from HVC were averaged (before the z-transformation in each sample), and the averaged activity was used as the input to the pairwise decoders (error bar, 95% CI; averaged across all pairs).

Variations of decoding performance and semantic differences between paired synsets

While the decoding performance was high on average, it varied considerably across pairs. To explain this variation, we hypothesized that the semantic difference between the paired synsets affects the performance. We therefore grouped the pairwise decoding accuracies into pairs within and across meta-categories (human, object, scene, and others; Tables 2-4) and compared the performance for synsets paired across meta-categories with that for synsets paired within meta-categories. The decoding accuracy for synsets paired across meta-categories was significantly higher than that for synsets paired within meta-categories, though the significance level varied across subjects (Fig. 17; Wilcoxon rank-sum test, P = 0.261 for Subject 1; P = 0.003 for Subject 2; P = 0.020 for Subject 3; P < 0.001 for the pooled data; error bar, 95% CI). However, even within a meta-category, the mean decoding accuracy significantly exceeded chance level, indicating specificity to fine object categories.
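This grouping amounts to splitting the pairwise accuracies by whether the two synsets share a meta-category. A minimal, illustrative sketch (meta and pair_acc are hypothetical dictionaries: the first maps each synset to its meta-category, the second maps each synset pair to its stimulus-to-dream accuracy):

# Minimal sketch: compare pairwise accuracies within vs. across meta-categories.
import numpy as np
from scipy.stats import ranksums

def compare_meta_categories(pair_acc, meta):
    """Compare accuracies for within- vs. across-meta-category synset pairs."""
    within, across = [], []
    for (a, b), acc in pair_acc.items():
        if meta[a] == "others" or meta[b] == "others":
            continue                             # "others" is semantically heterogeneous; excluded
        (within if meta[a] == meta[b] else across).append(acc)
    stat, p_value = ranksums(across, within)     # Wilcoxon rank-sum test
    return np.mean(within), np.mean(across), p_value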
Figure 17. Mean accuracies for the pairs within and across meta-categories. Pairwise decoding accuracies grouped into the pairs within and across meta-categories are shown (individual subjects and three subjects pooled). The numbers of available pairs are denoted in parentheses. Because the others category contains a large variety of synsets in terms of semantic similarity, the pairs of synsets in the others category were excluded from this analysis.

Pairwise decoding accuracy for each sleep state

To examine whether the pairwise decoding accuracy depends on sleep state, the samples from each sleep state (Fig. 12) were evaluated separately (three-volume [9-s] averaged data immediately before awakening). While the accuracy for the samples from the awake state tended to be slightly lower than that for the other states (sleep stages 1 and 2), no consistent difference in decoding accuracy between sleep stages 1 and 2 was observed across the three subjects (Fig. 18; fig. 33, individual subjects). The slightly lower performance for the awake state may be explained by the possibility that contents reported after awakenings whose preceding periods were judged as awake reflect dream contents viewed long before awakening, so that the information could no longer be decoded from the brain activity just before awakening. These results might indicate that the representations of objects or scenes are moderately stable and are not strongly affected by sleep state.

Figure 18. Mean accuracies for the samples from each sleep state. The decoding accuracy was separately evaluated for the samples from each sleep state judged from the last 15-s epoch before awakening (Fig. 12; stages 1 and 2 combined, 619 samples; awake, 53; stage 1, 433; stage 2, 186; three subjects pooled). The number of samples is denoted in parentheses.

Pairwise decoding accuracies across visual areas

To investigate the performance difference across visual areas, the mean decoding accuracy from each subarea was evaluated (Fig. 19; fig. 34, individual subjects). The LVC scored 54.3% (CI [53.4, 55.2]) for all pairs and 57.2% (CI [54.2, 60.2]) for selected pairs (three subjects pooled). The performance was significantly above chance level but worse than that for HVC. Individual areas (V1-V3, LOC, FFA, and PPA) showed a gradual increase in accuracy along the visual processing pathway, mirroring the progressively complex response properties from low-level image features to object-level features (Kobatake and Tanaka, 1994).

Figure 19. Pairwise decoding accuracies across visual cortical areas. The numbers of selected pairs for V1, V2, V3, LOC, FFA, PPA, LVC, and HVC were 45, 50, 55, 70, 48, 78, 55, and 97, respectively. Error bars indicate 95% CI, and black dashed lines denote chance level.

Time course of decoding accuracy

To specify the timing of dreaming, we examined the time course of the pairwise decoding accuracy calculated from the samples around the awakening. When the time window was shifted around the awakening timing, the decoding accuracy peaked around 0-10 s before awakening (Fig. 20 and fig. 35; no correction for hemodynamic delay). The high accuracies after awakening may be due to hemodynamic delay and the large time window. Thus, verbal reports are likely to reflect brain activity immediately before awakening. Note that fMRI signals after awakening may be contaminated with movement artifacts and brain activity associated with mental states during verbal reports. Mental states during verbal reports are unlikely to explain the high accuracy immediately after awakening, because the accuracy profile does not match the mean duration of verbal reports (34 ± 19 s, mean ± SD; three subjects pooled).
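The time-course analysis can be sketched as a three-volume window slid over each sleep run. The snippet below is illustrative only; run_X (a preprocessed volumes-by-voxels array), awake_idx (the awakening volume index), the 3-s TR, and the trained pairwise classifier clf are assumptions, not descriptions of the actual pipeline.

# Minimal sketch of the sliding-window time course (illustrative; hypothetical inputs).
import numpy as np

TR = 3.0        # assumed repetition time (s)
WINDOW = 3      # three volumes = 9 s, as in the main analysis

def windowed_sample(run_X, center):
    """Average WINDOW volumes centered on `center`, clipped to the run length."""
    lo = max(center - WINDOW // 2, 0)
    hi = min(center + WINDOW // 2 + 1, len(run_X))
    return run_X[lo:hi].mean(axis=0)

def decode_time_course(run_X, awake_idx, clf, offsets_s=range(-48, 25, 3)):
    """Predicted label at each offset (in seconds) relative to the awakening volume."""
    labels = {}
    for offset_s in offsets_s:
        center = awake_idx + int(round(offset_s / TR))
        if 0 <= center < len(run_X):
            x = windowed_sample(run_X, center)
            labels[offset_s] = clf.predict(x[None, :])[0]
    return labels                                # accuracies are then aggregated across awakenings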
Figure 20. Time course of pairwise decoding accuracy. The time course of pairwise decoding accuracy is shown (three subjects pooled; shades, 95% CI; averaged across all or selected pairs and subjects). Averages of three fMRI volumes (9 s; HVC or LVC) around each time point were used as inputs to the decoders. The performance is plotted at the center of the window. The gray region indicates the time window used for the main analyses (the arrow denotes the performance obtained from that window). No corrections for hemodynamic delay were conducted.

Pairwise decoding accuracy across whole cortical areas

To further investigate the potential for representing dream contents across whole cortical areas, the time course of the mean pairwise decoding accuracy was evaluated for local areas that collectively cover the whole cortical surface (Fig. 7), and the accuracies were mapped on the cortical surface (Fig. 21 for the stimulus-to-stimulus decoding analysis; Figs. 22-24 for the stimulus-to-dream decoding analysis). The stimulus-to-dream decoding accuracy maps from the three subjects consistently showed high decoding accuracy in visual areas around the timing of awakening, consistent with the previous analyses (Figs. 13, 19). Additionally, a tendency common to the three subjects was that the parietal areas showed relatively high accuracy not only before but also after the awakening, during a period that roughly corresponds to the duration of the report (34 ± 19 s, mean ± SD; three subjects pooled).
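The whole-cortex analysis repeats the same pairwise decoding within each local anatomical ROI and maps the mean accuracy per ROI at each time offset. A minimal sketch, under the assumption of a hypothetical dictionary roi_voxels (ROI name to voxel indices) and a hypothetical helper pairwise_accuracy_at(voxels, offset_s) that runs the stimulus-to-dream decoding restricted to those voxels at a given offset:

# Minimal sketch: per-ROI mean pairwise accuracy over time (illustrative only).
import numpy as np

def map_roi_accuracy(roi_voxels, pairwise_accuracy_at, offsets_s=np.arange(-48, 25, 3)):
    """Mean pairwise accuracy per ROI at each time offset relative to awakening."""
    return {
        roi: np.array([pairwise_accuracy_at(voxels, t) for t in offsets_s])
        for roi, voxels in roi_voxels.items()
    }
# The per-ROI accuracy time courses can then be projected onto the flattened cortical surface.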
Figure 21. Stimulus-to-stimulus decoding accuracy on whole cortical areas for the three subjects. The mean accuracies of the stimulus-to-stimulus decoding analysis from the anatomically defined ROIs were mapped on a flattened cortical surface. Note that the colorbar ranges for the three subjects are different.

Figure 22. Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 1. The mean accuracies of the stimulus-to-dream decoding analysis from the anatomically defined ROIs were mapped on a flattened cortical surface. Averages of three fMRI volumes (9 s) around each time point were used as inputs to the decoders. No corrections for hemodynamic delay were conducted.

Figure 23. Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 2. The format is the same as in Fig. 22.

Figure 24. Time course of stimulus-to-dream decoding accuracy on whole cortical areas for Subject 3. The format is the same as in Fig. 22.

3.2.2 Multilabel decoding analysis

While the pairwise classification analysis provided a benchmark performance by preselecting synsets and dream samples for binary classification, we next performed a multilabel decoding analysis to read out richer contents from arbitrary sleep data, in which the presence or absence of each base synset was predicted by a synset detector constructed from a combination of pairwise decoders (Fig. 9). The synset detector provided a continuous score indicating how likely the synset is to be present in each report.

Receiver operating characteristic (ROC) curves and the area under the curve (AUC) for each synset

We calculated receiver operating characteristic (ROC) curves for each base synset by shifting the detection threshold for the output score (Fig. 25; HVC; three-volume [9-s] averaged data immediately before awakening), and the detection performance was quantified by the area under the curve (AUC). Although the performance varied across synsets, 18 out of the total 60 synsets were accurately detected (Wilcoxon rank-sum test, uncorrected P < 0.05; 7/26 synsets for Subject 1, 8/18 for Subject 2, and 3/16 for Subject 3), greatly exceeding the number of synsets expected by chance (0.05 × 60 = 3). While we used only the samples with at least one visual report for this analysis, we also compared, for each synset, the scores for the samples in which the synset was reported with those for the samples with no visual report; the former were significantly higher in 15/60 synsets (Wilcoxon rank-sum test, P < 0.05; three subjects pooled; 4/26 synsets for Subject 1, 7/18 for Subject 2, and 4/16 for Subject 3), a result similar to that obtained with the visual-report samples, providing further support for our conclusions. In addition, while we used multilabel decoders constructed from combinations of pairwise decoders (one-vs-one classifiers), the same analysis with an alternative construction (one-vs-others classifiers) also showed similar performance (Wilcoxon rank-sum test, P < 0.05, 12/60 for three subjects pooled; 3/26 synsets for Subject 1, 8/18 for Subject 2, and 1/16 for Subject 3).
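One simple way to build such a synset detector from pairwise (one-vs-one) decoders is to average, over all pairings that involve the target synset, a signed decision value that is positive when the classifier favors the target. The sketch below is illustrative only; pair_clf and synsets are hypothetical names for the trained binary classifiers and the subject's base-synset list.

# Minimal sketch of a multilabel synset detector built from pairwise decoders.
# pair_clf : dict mapping (syn_a, syn_b) -> trained binary classifier whose positive
#            decision_function output favors syn_a (hypothetical)
# synsets  : list of all base synsets for this subject
import numpy as np

def synset_score(target, x, pair_clf, synsets):
    """Continuous score for `target` in one sleep sample x (shape: 1 x n_voxels)."""
    scores = []
    for other in synsets:
        if other == target:
            continue
        if (target, other) in pair_clf:
            scores.append(pair_clf[(target, other)].decision_function(x)[0])
        elif (other, target) in pair_clf:
            scores.append(-pair_clf[(other, target)].decision_function(x)[0])
    return np.mean(scores)                       # higher score = synset more likely present

Detection is obtained by thresholding this score; sweeping the threshold traces the ROC curve. The one-vs-others variant mentioned above would instead train a single classifier per synset against all remaining synsets.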
Figure 25. ROC analysis for the three subjects. The AUC for each synset is denoted next to the name of each base synset; asterisks denote synsets detected above chance level (Wilcoxon rank-sum test; uncorrected P < 0.05). The AUC values shown in the figure, by subject and meta-category, are as follows. Subject 1 (Human: girl 0.724, performer 0.707, male 0.477; Object: implement 0.552, chair 0.548, window 0.537, clothing 0.528, table 0.520; Scene: room 0.693, building 0.628, workplace 0.620, way 0.490; Others: shape 0.805, picture 0.658, character 0.644, group 0.486, illustration 0.470, material 0.455, code 0.444). Subject 2 (Human: female 0.647; Object: book 0.776, electronic equipment 0.700, commodity 0.596, furniture 0.562; Scene: region 0.794, street 0.774, mercantile establishment 0.760, building 0.664, point 0.647, dwelling 0.605; Others: character 0.767, representation 0.448). Subject 3 (Human: male 0.688; Object: car 0.624, display 0.506, table 0.475; Scene: room 0.600, house 0.545, way 0.539; Others: communication 0.673, vertebrate 0.632, material 0.486).

AUC averaged within meta-categories for different visual areas

Using the AUC, we compared the decoding performance for individual synsets grouped into meta-categories in different visual areas. Overall, the performance was better in HVC than in LVC, consistent with the pairwise decoding performance (Fig. 26; three subjects pooled; ANOVA, P = 0.003). While V1-V3 did not show different performances across meta-categories, the higher visual areas showed a marked dependence on meta-categories (Fig. 26; three subjects pooled). In particular, FFA showed better performance with human synsets, while PPA showed better performance with scene synsets (ANOVA [interaction], P = 0.001; three subjects pooled), consistent with the known response characteristics of these areas (Epstein and Kanwisher, 1998; Kanwisher et al., 1997). LOC and FFA showed similar results, presumably because our functional localizers selected partially overlapping voxels. Because the sample sizes are small in individual subjects, evaluation with statistical tests for each subject is difficult. However, tendencies similar to the pooled results are found in individual subjects: HVC tends to outperform LVC (ANOVA, P = 0.029 for Subject 1, P = 0.159 for Subject 2, and P = 0.139 for Subject 3), and FFA and PPA tend to show better performances with human and scene synsets, respectively (ANOVA [interaction], P = 0.099 for Subject 1; P = 0.028 for Subject 2; P = 0.044 for Subject 3).

Figure 26. AUC averaged within meta-categories for different visual areas. The mean AUC for the synsets within each meta-category is plotted for different visual areas (individual subjects and three subjects pooled; error bars, 95% CI). The numbers of available synsets are shown in parentheses.
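The AUC-based evaluation and the meta-category averaging can be sketched as follows. This is illustrative only; scores, present, and meta are hypothetical containers holding, respectively, the detector outputs per synset over sleep samples, the binary reported/unreported ground truth, and the synset-to-meta-category mapping.

# Minimal sketch: per-synset AUC, then mean AUC per meta-category (illustrative only).
import numpy as np
from collections import defaultdict
from sklearn.metrics import roc_auc_score

def mean_auc_by_meta(scores, present, meta):
    """Per-synset detection AUC, averaged within each meta-category."""
    auc = {syn: roc_auc_score(present[syn], scores[syn]) for syn in scores}
    by_meta = defaultdict(list)
    for syn, value in auc.items():
        by_meta[meta[syn]].append(value)
    return {category: float(np.mean(values)) for category, values in by_meta.items()}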
Time course of synset scores

The output scores for individual synsets showed diverse and dynamic profiles in each sleep sample (Fig. 27A and fig. 36 for other examples). These profiles may reflect a dynamic variation of visual contents, including those experienced even before the period near awakening. On average, there was a general tendency for the scores for reported synsets to increase toward the time of awakening (Fig. 27B and fig. 37, individual subjects). Interestingly, synsets that did not appear in reports showed greater scores if they had a high co-occurrence relationship with reported synsets (Fig. 27B; synsets with top 15% conditional probabilities given a reported synset, calculated from the whole content vectors in each subject). The effect of co-occurrence is rather independent of that of semantic similarity (Fig. 17), because both factors (high/low co-occurrence and within/across meta-categories) had highly significant effects on the scores of unreported synsets (three-volume [9-s] averaged data immediately before awakening; two-way ANOVA, P < 0.001, three subjects pooled) with moderate interaction (P = 0.016). The scores for reported synsets were significantly higher than those for unreported synsets even within the same meta-category (Wilcoxon rank-sum test, P < 0.001). Verbal reports are unlikely to describe the full details of visual experience during sleep, and it is possible that contents with high general co-occurrence (e.g., street and car) tend to be experienced together even when not all of them are reported. Therefore, high scores for the unreported synsets may indicate unreported but actual visual contents during sleep, and we may be able to detect implicit contents by scrutinizing the score time course.

Figure 27. Synset score time course. (A) Example time course of synset scores for a single dream sample (Subject 2, 118th awakening; verbal report: "What I was just looking at was some kind of characters..."; color legend as in Fig. 25; reported synset, character, in bold). (B) Time course of averaged synset scores for reported synsets (red) and unreported synsets with high/low (blue/gray) co-occurrence with reported synsets (averaged across awakenings and subjects). Scores are normalized by the mean magnitude in each subject.

Identification analysis

Finally, to explore the potential of multilabel decoding to distinguish numerous contents simultaneously, we performed identification analysis (Kay et al., 2008; Miyawaki et al., 2008). The output scores (score vector) were used to identify the true visual content vector among a variable number of candidates (the true vector plus random vectors with matched probabilities for each synset) by selecting the candidate most correlated with the score vector (Fig. 28A; repeated 100 times for each sleep sample to obtain the correct identification rate). The performance exceeded chance level across all set sizes (Fig. 28B; HVC; three subjects pooled; fig. 38, individual subjects), although the accuracies were not as high as those achieved using stimulus-induced brain activity in previous studies (Kay et al., 2008; Miyawaki et al., 2008). The same analysis was performed with extended visual content vectors in which unreported synsets having a high co-occurrence with reported synsets (top 15% conditional probability) were assumed to be present. The extended visual content vectors were better identified (Fig. 28B and fig. 39), suggesting that multilabel decoding outputs may represent both reported and unreported contents.
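The identification step amounts to comparing the multilabel score vector, by Pearson correlation, with the true content vector and with random binary vectors whose element-wise probabilities match each synset's overall report frequency. A minimal, illustrative sketch (score_vec, true_vec, and p_synset are hypothetical arrays: detector outputs for one awakening, the binary reported/unreported vector, and per-synset report probabilities):

# Minimal sketch of the identification analysis (illustrative; hypothetical inputs).
import numpy as np

def identify(score_vec, true_vec, p_synset, set_size, n_repeats=100, rng=None):
    """Return the identification accuracy (%) for one sleep sample."""
    rng = np.random.default_rng() if rng is None else rng
    correct = 0
    for _ in range(n_repeats):
        # Candidates: the true vector plus (set_size - 1) random vectors with matched probabilities.
        candidates = [true_vec] + [
            (rng.random(len(p_synset)) < p_synset).astype(int)
            for _ in range(set_size - 1)
        ]
        corrs = [np.corrcoef(score_vec, c)[0, 1] for c in candidates]
        # Vectors with identical elements yield undefined correlations (NaN); the study
        # excluded such samples, and nanargmax simply ignores them here.
        correct += int(np.nanargmax(corrs) == 0)  # index 0 is the true vector
    return 100.0 * correct / n_repeats

Averaging this value over awakenings and subjects at each set size would yield curves analogous to those in Fig. 28B.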
Figure 28. Identification analysis. (A) Schematic view of the identification analysis. The correlation coefficients between the score vector from multilabel decoding and each of the candidate vectors, consisting of the true visual content vector and a variable number of binary vectors generated randomly with matched probabilities, were calculated. The vector with the highest correlation coefficient was chosen as the one representing the visual contents. (B) Identification performance as a function of the candidate set size. Accuracies are plotted against candidate set size for original and extended visual content vectors (averaged across awakenings and subjects). Because Pearson's correlation coefficient could not be calculated for vectors with identical elements, such samples were excluded. The shades indicate 95% CI, and dashed lines denote chance level.

4. Discussion

We have shown that visual dream contents viewed during sleep-onset periods can be read out from fMRI signals of the human visual cortex. Our decoding analyses revealed that accurate classification, detection, and identification of dream contents can be achieved with the higher visual cortex. This is the first demonstration of a neural basis of dream contents, and our results can be interpreted as evidence against the theory that our dreams are made up when we wake up (Windt, 2013).

Commonality of neural representation between perception and dreaming

Because our decoders were trained using the fMRI responses induced by natural image viewing, the accurate performance of the pairwise and multilabel decoding analyses suggests that specific dream contents are represented in activity patterns that are shared with stimulus perception. Such representational commonality was also supported by the representational similarity analysis, though the results showed only marginal significance depending on the subject (Fig. 15). The common representational property between perception and dreaming was also observed in additional analyses, including the higher performance for synset pairs across meta-categories than for those within meta-categories (Fig. 17), the gradual increase in decoding accuracy along the visual hierarchy (Fig. 19), and the marked dependence of the synset detection performance (quantified by AUCs) on meta-categories in the higher visual areas (Fig. 26). The results suggest that the principle of perceptual equivalence (Finke, 1989), which postulates a common neural substrate for perception and imagery, generalizes to spontaneously generated visual experience during sleep. Although we have demonstrated semantic decoding with the higher visual cortex, this does not rule out the possibility of decoding low-level features with the lower visual cortex. Furthermore, while we have focused on semantic aspects of dreaming (nouns describing objects or scenes), dreaming consists of multiple aspects, such as actions (described by verbs) and emotions (described by adjectives). Further analyses focused on those aspects will help to reveal the whole picture of dreaming in comparison with our actual experiences.
Decoding from multivoxel patterns

Our approach extends previous research on the (re)activation of the brain during sleep (Braun et al., 1998; Maquet, 2000; Yotsumoto et al., 2009; Wilson and McNaughton, 1994) and on the relationship between dreaming and brain activity (Dresler et al., 2011; Marzano et al., 2011; Miyauchi et al., 2009), by discovering links between complex brain activity patterns and unstructured verbal reports using database-assisted machine learning decoders. A major difference between our approach and the previous studies is the use of multivoxel patterns to analyze brain activity during sleep; our analysis demonstrated the contribution of multivoxel patterns, over the average activity level, to reading out dream contents (Fig. 16). Multivoxel pattern analysis requires a large number of samples to obtain stable results, which has been a major difficulty in applying it to the study of dreaming. However, the multiple-awakening procedure and the active use of a lexical database to extract representative dream contents successfully overcame this difficulty. We expect the present study to serve as a demonstration of a new approach to studying dreaming in an objective manner.

Sleep-onset dreaming and REM dreaming

While we focused on dreaming during sleep-onset periods, analysis of dreaming during REM periods would be necessary to understand the nature of dreaming irrespective of sleep state. The similarity between REM and sleep-onset reports (Foulkes and Vogel, 1965; Vogel et al., 1972; Oudiette et al., 2012) and the visual cortical activation during REM sleep (Braun et al., 1998; Maquet, 2000; Miyauchi et al., 2009) suggest that the same approach could also be used to decode REM dreaming. Moreover, as our analysis did not detect any consistent difference in accuracy across sleep states (Fig. 18), dream contents might be represented in the same manner across sleep states. Our method may further work beyond the bounds of sleep stages to uncover the dynamics of spontaneous brain activity in association with stimulus representation. The decoding presented here is retrospective in nature: decoders were constructed after the sleep experiments based on the collected reports. However, because the reported synsets largely overlap between the first and last halves of the experiments (59/60 base synsets appeared in both), the same decoders may apply to future sleep data. An interesting direction for training dream decoders is to use a large dream-report database (DreamBank; http://www.dreambank.net) that collects reports of dreaming from various people. We may be able to utilize such a database to effectively select representative dream contents for training decoders.

Is dreaming more similar to perception or visual imagery?

Our analyses demonstrate the representational commonality between perception and dreaming in the higher visual areas at the time of the visual experience. However, the commonalities in other areas and at other timings, such as the time of dream generation, are not yet thoroughly explored. Our analysis showed high stimulus-to-dream decoding accuracy in the parietal areas as well as the higher visual areas during the periods around the awakening (Figs. 22-24), suggesting representational commonality in non-visual areas.
Since our technique can serve as a tool to detect the commonality between any type of visual experience and dreaming, decoders trained using fMRI responses induced by visual imagery during wakefulness could be used to reveal the commonality between visual imagery and dreaming. Thus, comparing the accuracy time course of dream decoding on whole cortical areas using percept- or imagery-trained decoders could provide valuable insights into how dream contents are generated and represented in various brain areas. Furthermore, because both dreaming and visual imagery occur without visual stimuli, imagery-trained decoders might provide higher accuracy than percept-trained decoders in higher areas.

Decoding spontaneous brain activity

In contrast to the previous studies demonstrating the decoding of stimulus- or task-induced brain activity (Cox and Savoy, 2003; Kamitani and Tong, 2005, 2006; Miyawaki et al., 2008; Stokes et al., 2009; Reddy et al., 2010; Harrison and Tong, 2009; Albers et al., 2013), the present study demonstrated decoding of spontaneous brain activity during sleep. As the results of the multilabel decoding analysis suggested (Figs. 27B, 28), one interesting possibility is that our approach can read out not only what subjects remember of their dreams but also what they forget or fail to report. Our method may thus detect contents implicitly represented in spontaneous brain activity. Several studies have reported representational overlaps between stimulus- or task-induced patterns and spontaneously emerging patterns during both wakefulness and sleep (Tsodyks et al., 1999; Kenet et al., 2003; Han et al., 2008; Luczak et al., 2009; Yotsumoto et al., 2009), but the functions of spontaneous brain activity are not yet clear. Our approach will be able to discover the contents represented in spontaneous brain activity, and will be a powerful tool to link spontaneous activity patterns with our behavioral, physiological, and cognitive states. We expect that this study will lead to a better understanding of the functions of not only dreaming but also other spontaneous neural events.

References

[1] H. W. Agnew, Jr., W. B. Webb, R. L. Williams, The first night effect: an EEG study of sleep. Psychophysiology 2, 263-266 (1966)
[2] E. Aserinsky, N. Kleitman, Regularly occurring periods of eye motility, and concomitant phenomena during sleep. Science 118, 273-274 (1953)
[3] G. W. Baylor, C. Cavallero, Memory Sources Associated with REM and NREM Dream Reports Throughout the Night: A New Look at the Data. Sleep 24, 165-170 (2001)
[4] G. Bonmassar et al., Motion and ballistocardiogram artifact removal for interleaved recording of EEG and EPs during MRI. Neuroimage 16, 1127-1141 (2002)
[5] A. R. Braun, T. J. Balkin, N. J. Wesensten, R. E. Carson, M. Varga, P. Baldwin, S. Selbie, G. Belenky, P. Herscovitch, Regional cerebral blood flow throughout the sleep-wake cycle. An H2(15)O PET study. Brain 120, 1173-1197 (1997)
[6] A. R. Braun, T. J. Balkin, N. J. Wesensten, F. Gwadry, R. E. Carson, M. Varga, P. Baldwin, G. Belenky, P. Herscovitch, Dissociated pattern of activity in visual cortices and their projections during human rapid eye movement sleep. Science 279, 91-95 (1998)
[7] C. Cavallero, P. Cicogna, V. Natale, M. Occhionero, A. Zito, Slow wave sleep dreaming. Sleep 15, 562-566 (1992)
[8] C. C. Chang, C. J. Lin, LIBSVM: a library for support vector machines.
ACM Transactions on Intelligent Systems and Technology 2, (2011) Software available at http://www.csie.ntu.edu.tw/ cjlin/libsvm [9] W. Dement, N. Kleitman, The relation of eye movements during sleep to dream activity: An objective method for the study of dreaming. J. Exp. Psychol. 53, 339-346 (1957) 63 [10] W. Dement, E. A. Wolpert. The relation of eye movements, body motility, and external stimuli to dream content. J. Exp. Psychol. 55, 543-553 (1958) [11] J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, L. Fei-Fei, Imagenet: A largescale hierarchical image database. IEEE CVPR (2009) [12] M. Dresler et al., Dreamed movement elicits activation in the sensorimotor cortex. Curr. Biol. 21, 1833-1837 (2011) [13] S. A. Engel et al., fMRI of human visual cortex. Nature 369, 525 (1994) [14] R. Epstein, N. Kanwisher, A cortical representation of the local visual environment. Nature 392, 598-601 (1998) [15] C. Fellbaum, Ed., WordNet: An Electronic Lexical Database. (MIT Press, Cambridge, MA, 1998) [16] R. A. Finke, Principles of Mental Imagery. (MIT Press, Cambridge, MA, 1989) [17] D. Foulkes, G. Vogel, Mental activity at sleep onset. J. Abnorm. Psychol. 70, 231-243 (1965) [18] W. D. Foulkes, Dream reports from different stages of sleep. J. Abnorm. Soc. Psychol. 65, 14-25 (1962) [19] A.Germain, T. A. Nielsen, EEG Power Associated with Early Sleep Onset Images Differing in Sensory Content. Sleep Research Online 4, 83-90 (2001) [20] J. J. Gugger, M. L. Wagner, Rapid eye movement sleep behaviour disorder. Ann. Pharmacother. 41, 18331841 (2007) [21] F. Han, N. Caporale, Y. Dan, Reverberation of recent visual experience in spontaneous cortical waves. Neuron 60, 321-327 (2008) [22] S. A. Harrison, F. Tong, Decoding reveals the contents of visual working memory in early visual areas. Nature 458, 632-635 (2009) [23] J. V. Haxby et al., Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425-2430 (2001) 64 [24] J. A. Hobson, REM sleep and dreaming: towards a theory of protoconsciousness. Nature Rev. Neurosci. 10, 803-813 (2009) [25] J. A. Hobson, R. Stickgold, Dreaming: A Neurocognitive Approach. Conscious Cogn. 3, 1-15 (1994) [26] T. Hori, M. Hayashi, T. Morikawa, in Sleep Onset: Normal and Abnormal Processes. R. D. Ogilvie, J. R. Harsh, Eds. (American Psychological Association, Washington, 1994), pp. 237-253. [27] C. C. Hong, J. C. Harris, G. D. Pearlson, J. S. Kim, V. D. Calhoun, J. H. Fallon, X. Golay, J. S. Gillen, D. J. Simmonds, P. C. van Zijl, D. S. Xee, J. J. Pekar, fMRI evidence for multisensory recruitment associated with rapid eye movements during sleep. Hum. Brain Mapp. 30 1705-1722 (2009) [28] A. G. Huth, S. Nishimoto, A. T. Vu, J. L. Gallant, A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron 76, 1210-1240 (2012) [29] N. Kajimura, M. Uchiyama, Y. Takayama, S. Uchida, T. Uema, M. Kato, M. Sekimoto, T. Watanabe, T. Nakajima, S. Horikoshi, K. Ogawa, M. Nishikawa, M. Hiroki, Y. Kudo, H. Matsuda, M. Okawa, K. Takahashi, Activity of midbrain reticular formation and neocortex during the progression of human non-rapid eye movement sleep. J. Neurosci. 19, 10065-10073 (1999) [30] Y. Kamitani, F. Tong, Decoding the visual and subjective contents of the human brain. Nature Neurosci. 8, 679-685 (2005) [31] Y. Kamitani, F. Tong, Decoding seen and attended motion directions from activity in the human visual cortex. Curr. Biol. 16, 1096-1102 (2006) [32] N. Kanwisher, J. 
McDermott, M. M. Chun, The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17, 4302-4311 (1997) [33] K. N. Kay, T. Naselaris, R. J. Prenger, J. L. Gallant, Identifying natural images from human brain activity. Nature 452, 352-355 (2008) 65 [34] T. Kenet, D. Bibitchkov, M. Tsodyks, A. Grinvald, A. Arieli,Spontaneously emerging cortical representations of visual attributes. Nature 425, 954-956 (2003) [35] E. Kobatake, K. Tanaka, Neuronal selectivities to complex object features in the ventral visual pathway of macaque cerebral cortex. J. Neurophysiol. 71, 856-867 (1994) [36] Z. Kourtzi, N. Kanwisher, Cortical regions involved in perceiving object shape. J. Neurosci. 20, 3310-3318 (2000) [37] N. Kriegeskorte, M. Mur, D. A. Ruff, R. Kiani, J. Bodurka, H. Esteky, K. Tanaka, P. A. Bandettini, Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron 60, 1126-1141 (2008) [38] A. Luczak, P. Bartho, K. D. Harris, Spontaneous events outline the realm of possible sensory responses in neocortical populations. Neuron 62, 413-425 (2009) [39] C. Marzano et al., Recalling and forgetting dreams: theta and alpha oscillations during sleep predict subsequent dream recall. J. Neurosci. 31, 6674-6683 (2011) [40] P. Maquet, J. Peters, J. Aerts, G. Delfiore, C. Degueldre, A. Luxen, G. Franck, Functional neuroanatomy of human rapid-eye-movement sleep and dreaming. Nature 383, 163-166 (1996) [41] P. Maquet, Functional neuroimaging of normal human sleep by positron emission tomography. J. Sleep Res. 9, 207-231 (2000) [42] M. Minear, D. C. Park, A lifespan database of adult facial stimuli. Behavior Research Methods, Instruments, and Computers. 36, 630-633 (2004) [43] S. Miyauchi, M. Misaki, S. Kan, T. Fukunaga, T. Koike, Human brain activity time-locked to rapid eye movements during REM sleep. Exp. Brain Res. 192, 657-667 (2009) 66 [44] Y. Miyawaki et al., Visual image reconstruction from human brain activity using a combination of multiscale local image decoders. Neuron 60, 915-929 (2008) [45] T. H. Monk, D. J. Buysse, C. F. Reynolds, 3rd, D. J. Kupfer, Circadian determinants of the postlunch dip in performance. Chronobiol. Int. 13, 123133 (1996). [46] Y. Nir, G. Tononi, Dreaming and the brain: from phenomenology to neurophysiology. Trends Cogn. Sci. 14, 88-100 (2010) [47] R. D. Ogilvie, R. T. Wilkinson, The detection of sleep onset: behavioral and physiological convergence. Psychophysiology 21, 510-520 (1989) [48] D. Oudiette et al., Dreaming without REM sleep. Counscious Cogn. 21, 1129-1140 (2012) [49] L. Palagini, A. Gemignani, I. Feinberg, M. Guazzelli, I. G. Campbell, Mental activity after early afternoon nap awakenings in healthy subjects. Brain Research Bulletin 63, 361-368 (2004) [50] S. M. Polyn, V. S. Natu, J. D. Cohen, K. A. Norman, Category-specific cortical activity precedes retrieval during memory search. Science 310, 19631966 (2005) [51] A. Rechtschaffen, A. Kales, A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Subjects. (U.S. Dept. of Health, Education, and Welfare, Public Health Services-National Institutes of Health, National Institute of Neurological Diseases and Blindness, Neurological Information Network, Bethesda, Md., 1968) [52] M. I. Sereno et al., Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268, 889-893 (1995). [53] F. 
Sharbrough et al., American electroencephalographic society guidelines for standard electrode position nomenclature. J. Clin. Neurophysiol. 8, 200202 (1991) 67 [54] M. Solms, Neuropsychology of dreams. Mahwah, NJ:Erbaum (1997) [55] R. Stickgold, A. Malia, D. Maguire, D. Roddenberry, M. O’Connor, Replaying the game: hypnagogic images in normals and amnesiacs. Science 290, 350-353 (2000) [56] M. Stokes, R. Thompson, R. Cusack, J. Duncan, Top-down activation of shape-specific population codes in visual cortex during mental imagery. J. Neurosci. 29, 1565-1572 (2009) [57] T. Takeuchi, A. Miyashita, M. Inugami, Y. Yamamoto, Intrinsic dreams are not produced without REM sleep mechanisms: evidence through elicitation of sleep onset REM periods. J. Sleep. Res. 10, 43-52 (2001) [58] M. Tamaki, H. Nittono, M. Hayashi, T. Hori, Examination of the first-night effect during the sleep-onset period. Sleep 28, 195-202 (2005) [59] M. Tsodyks, T. Kenet, A. Grinvald, A. Arieli, Linking spontaneous activity of single cortical neurons and the underlying functional architecture. Science 286 1943-1946 (1999) [60] V. N. Vapnik, Statistical Learning Theory. (Wiley, New York, 1998) [61] G. W. Vogel, B. Barrowclough, D. D. Giesler, Limited discriminability of REM and sleep onset reports and its psychiatric implications. Arch. Gen. Psychiatry 26, 449-455 (1972) [62] E. J. Wamsley, K. Perry, I. Djonlagic, L. B. Reaven, R. Stickgold, Cognitive replay of visuomotor learning at sleep onset: temporal dynamics and relationship to task performance. Sleep 33, 59-68 (2010) [63] M. A. Wilson, B. L. McNaughton, Reactivation of hippocampal ensemble memories during sleep. Science 265, 676-679 (1994) [64] J.M. Windt, Reporting dream experience: Why (not) to be skeptical about dream reports. Front. Hum. Neurosci. 7, 708 (2013) [65] J. Xiao, J. Hays, K. Ehinger, A. Oliva, A. Torralba, SUN Database Large scale Scene Recognition from Abbey to Zoo. IEEE CVPR (2010) 68 [66] Y. Yotsumoto et al., Location-specific cortical activation changes during sleep after training for perceptual learning. Curr. Biol. 19, 1278-1282 (2009) 69 Acknowledgements はじめに ATR 脳情報研究所 神経情報学研究室 室長 神谷之康教授に深く感謝を いたします.修士の学生として指導を受け始めてから現在に至るまで,プロの研 究者としての姿勢,実験・解析に関するアドバイスなど,数々のご指導をいただ きましたことを心より感謝いたします.神谷先生のもとで博士課程を修了するこ とは,今後の研究者人生における最大の財産です.ありがとうございます.川人 光男教授には,研究に対する指導のみならず,素晴らしい環境のもとで研究をす る機会をいただきましたこと深く感謝いたします.数理情報学講座の池田和司教 授,柴田智広准教授には,セミナーでの発表に対して,有益なアドバイスやコメ ントをいただきましたこと感謝いたします.連携講座に所属する自分にとって, 先生方からのサポートは大きな助けとなりました.松本裕治教授には,神経科学 分野とは離れた立場から客観的なコメントをいただいたこと,本論文の審査を引 き受けていただいたことを感謝いたします.また,現在はブラウン大学に所属さ れている玉置應子研究員,電気通信大学に所属されている宮脇陽一准教授には, 本プロジェクトの進行にあたり多大な協力をいただけましたこと,心より感謝い たします.本論文で使用した睡眠データの取得は,玉置應子研究員の力なくして はなし得ないものでした.本研究に対するその甚大なる貢献に深く感謝いたしま す.解析方法に関する宮脇陽一准教授との数多くの議論は,今回のプロジェクト だけでなく,今後の研究生活においても有用な技術的土台を身につけるための非 常に有意義な経験でした.深く感謝いたします.また,本研究プロジェクトに関 わり,さまざまな点で研究の進展に尽力してくれた,大貫良幸くん,藤原祐介さ ん,Taylor Beck さん,久保孝富さんに感謝いたします.本論文の成果は,皆さ んの試行錯誤あってこそのものです.中野奈月子さん,大島由香さんには,実験 のスケジューリングに関してお世話になりましたことを感謝いたします.脳活動 イメージングセンタの皆様には,実験に協力していただきましたことを感謝いた します.ATR 脳情報研究所神経情報学研究室の皆様には,研究の内容をはじめ, 論文構成に関しても,たくさんの有益なアドバイスをいただきましたことを心か ら感謝いたします. 70 Appendix A. Supplementary results 71 Chance Subject 1 HVC Decoding accuracy (%) 20 Unshuffled 50 80 Shuffled All (201 pairs) Selected (39 pairs) Subject 2 HVC 20 50 80 All (118 pairs) Selected (47 pairs) Subject 3 HVC 20 50 80 All (86 pairs) Selected (11 pairs) Figure 29. Distribution of pairwise decoding accuracies. 
The distribution of the stimulus-to-dream pairwise decoding accuracies for the higher visual cortex (HVC) is shown together with that from label-shuffled decoders. The format is the same as in Fig. 13. The mean decoding accuracies for all pairs were 56.0% (95% CI [54.7, 57.3]) for Subject 1, 66.9% (CI [65.1, 66.8]) for Subject 2, and 59.8% (CI [57.9, 61.7]) for Subject 3, and those for selected pairs were 65.3% (CI [62.4, 68.2]) for Subject 1, 75.4% (CI [73.3, 77.4]) for Subject 2, and 66.1% (CI [61.2, 71.0]) for Subject 3. The fraction of pairs that showed significantly better decoding performance than chance level (one-tailed binomial test, uncorrected P < 0.05) was 22.4% (45/201) for Subject 1, 57.6% (68/118) for Subject 2, and 27.9% (24/86) for Subject 3. The performance was significantly better than that from label-shuffled decoders for all subjects for both of all and selected pairs (Wilcoxon rank-sum test, P < 0.001). 72 Unshuffled Chance Subject 1 HVC Decoding accuracy (%) 20 50 80 Shuffled 50 80 50 80 Subject 2 HVC 20 Subject 3 HVC 20 Figure 30. Stimulus-to-stimulus pairwise decoding. The cross-validation analysis of the stimulus-induced dataset (see Methods ”2.14. Synset pair selection by within-dataset cross-validation”) yielded the distribution of accuracies for stimulus-to-stimulus pairwise decoding (Subject 1-3, HVC; shown together with the distribution from label-shuffled decoders). The mean decoding accuracies were 85.6% (95% CI [84.1, 87.0]) for Subject 1, 87.0% (CI [85.4, 88.7]) for Subject 2, and 81.4% (CI [79.4, 83.4]) for Subject 3. The fraction of pairs that showed significantly better decoding performance than chance level (one-tailed binomial test, uncorrected P < 0.05) was 94.5% (190/201) for Subject 1, 99.8% (117/118) for Subject 2, and 95.3% (82/86) for Subjet 3. The performance was significantly better than that from label-shuffled decoders for all subjects (Wilcoxon rank-sum test, P < 0.001 for all three subjects). 73 Chance Subject 1 HVC Decoding accuracy (%) 20 Unshuffled 50 80 Shuffled 50 80 50 80 Subject 2 HVC 20 Subject 3 HVC 20 Figure 31. Dream-to-dream pairwise decoding. The cross-validation analysis of the sleep dataset (see Methods ”2.14. Synset pair selection by within-dataset crossvalidation”) yielded the distribution of accuracies for dream-to-dream pairwise decoding (Subject 1-3, HVC; shown with the distribution from label-shuffled decoders). The mean decoding accuracies were 52.9% (95% CI [51.2, 54.7]) for Subject 1, 60.1% (CI [58.1, 62.1]) for Subject 2, and 51.3% (CI [49.1, 53.5]) for Subject 3. The fraction of pairs that showed significantly better decoding performance than chance level (onetailed binomial test, uncorrected P < 0.05) was 20.4% (41/201) for Subject 1, 40.7% (48/118) for Subject 2, and 12.8% (11/86) for Subject 3. The results showed that the performance was significantly better than that from label-shuffled decoders for two of three subjects (Wilcoxon rank-sum test, P = 0.002 for Subject 1; P < 0.001 for Subject 2; P = 0.302 for Subject 3). Note that although the dream-to-dream decoding shows marginally significant accuracies (depending on subjects), it is not as accurate as the stimulus-to-dream decoding. This is presumably because training samples from the sleep dataset were fewer and noisier than those from the stimulus-induced dataset, and thus decoders were not well trained with the sleep dataset. 
74 20 Subject 1 HVC Decoding accuracy (%) 100 10 0 10 50 Multivoxel Averaged 0 M ul t Av ivo er xel ag ed 20 Decoding accuracy (%) Proportion (%) 100 Subject 2 HVC 20 10 0 10 50 0 M ul t Av ivo er xel ag ed 20 Subject 3 HVC 100 Decoding accuracy (%) 20 10 0 10 20 50 20 40 60 80 Decoding accuracy (%) 100 M 0 ul t Av ivo er xel ag ed 0 Figure 32. Decoding with averaged vs. multivoxel activity for individual subject. (A) Histograms of pairwise decoding accuracy with averaged and multivoxel activity. (B) Mean pairwise decoding accuracies with averaged and multivoxel activity. The voxel values in each data sample from HVC were averaged (before the z-transformation in each sample), and the averaged activity was used as the input to the pairwise decoders (error bar, 95% CI; averaged across all pairs). 75 Subject 1 HVC 70 60 50 40 Subject 2 HVC 70 60 50 40 Aw (1 ak 98) e St (2 ag 2) e 1 ( St ag 178 ) e 2 (2 0) 30 St ag e 1, 2 Decoding accuracy (%) St ag e 1, 2 Aw (2 ak 35) St e ( ag 14 ) e 1 St (9 ag 1) e 2 (1 44 ) 30 Subject 3 HVC 70 60 50 St ag e 1, 2 30 Aw (1 ak 86) e St (1 ag 7) e 1 (1 St ag 64 ) e 2 (2 2) 40 Sleep state Figure 33. Mean accuracies for the samples from each sleep state for individual subjects. The decoding accuracy was separately evaluated for the samples from each sleep state judged from the last 15-s epoch before awakening (Fig. 12). The number of samples is denoted in parentheses. 76 80 Subject 1 All Selected Decoding accuracy (%) 50 80 Subject 2 50 80 Subject 3 50 LVC HVC V1 V2 V3 Area LOC FFA PPA Figure 34. Pairwise decoding accuracies across visual cortical areas. The decoding accuracies with different visual areas are shown for individual subjects (error bars, 95% CI). Selected pairs were determined on the basis of the cross-validation analysis in each area (the numbers of selected pairs for V1, V2, V3, LOC, FFA, PPA, LVC, and HVC: 24, 22, 29, 38, 36, 22, 27, and 39 pairs for Subject 1; 15, 23, 21, 24, 7, 47, 20, and 47 pairs for Subject 2; 6, 5, 5, 8, 5, 9, 8, and 11 pairs for Subject 3, respectively). 77 HVC LVC Subject 1 All Selected Decoding accuracy (%) 50 Subject 2 50 Subject 3 50 36 0 36 0 Time to awakening (s) Figure 35. Time course of pairwise decoding accuracy. The time course of pairwise decoding accuracy is shown for individual subjects (shades, 95% CI; averaged across all or selected pairs). Averages of three fMRI volumes (9 s; HVC or LVC) around each time point were used as inputs to the decoders. The performance is plotted at the center of the window. The gray region indicates the time window used for the main analyses (the arrow denotes the performance obtained from the time window). No corrections for hemodynamic delay were conducted. Note that fMRI signals after awakening may be contaminated with movement artifacts and brain activity associated with mental states during verbal reports. Mental states during verbal reports are unlikely to explain the high accuracy immediately after awakening, because the accuracy profile does not match the mean duration of verbal reports (34 ± 19 s, mean ± SD; three subjects pooled). 78 Subject 1 202th awakening 10 male 0 Subject 2 144th awakening female male Score 10 0 Subject 3 34th awakening display 10 0 48 36 24 12 0 Time to awakening (s) Figure 36. Examples for the time courses of synset scores. The time courses of synset scores from multilabel decoding analyses are shown for four individual dream examples. 
The plots represent the scores for the reported synset(s) (bold line with synset name) and the unreported synsets using the colors in the legend for Fig. 26. 79 Subject 1 Reported Unreported (High/Low) 0 Score Subject 2 0 Subject 3 0 48 36 24 12 0 Time to awakening (s) Figure 37. Time courses of averaged synset scores for each subject. Synset scores were averaged across awakenings for reported (red) and unreported synsets with high (blue) and low (gray) co-occurrence in individual subjects (shades, 95% CI). 80 80 Subject 1 Original Extended 60 40 20 Identification accuracy (%) 0 80 Subject 2 60 40 20 0 80 Subject 3 60 40 20 0 2 4 8 16 Candidate set size 32 Figure 38. Identification performance for individual subjects. The identification performance with original and extended visual content vectors is shown for individual subjects (shades, 95% CI; dashed line, chance level). 81